# __Using the DataRobot API to streamline model building and deployment workflow__

## Steps:
- Step 1 - Importing the libraries 
- Step 2 - Get our API Token 
- Step 3 - Load the dataset 
- Step 4 - Setup the project 
- Step 5 - Run the Automated Time Series Process across multiple stores
- Step 6 - Deploy the best model 
- Step 7 - Make predictions 

Before we begin, let me walk through the problem that we are trying to solve...

# Step 1 - Importing the libraries

In [1]:
#!pip install datarobot 
import datarobot as dr
import pandas as pd
import datetime
import numpy as np
import os 

# Step 2 - Get our API Token 


In [2]:
token = "yourapitoken"
endpoint = "https://app.datarobot.com/api/v2"
dr.Client(token = token, endpoint = endpoint)

<datarobot.rest.RESTClientObject at 0x7ff85b835590>
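
Hard-coding the token in a notebook makes it easy to leak when the notebook is shared. Below is a minimal sketch of reading it from the environment instead. The helper name `load_credentials` and the variable names `DATAROBOT_API_TOKEN` and `DATAROBOT_ENDPOINT` are assumptions made for this illustration; they are not defined anywhere in this notebook.

```python
import os

# Same endpoint as used in the cell above
DEFAULT_ENDPOINT = "https://app.datarobot.com/api/v2"


def load_credentials(env=None):
    """Return (token, endpoint) read from an environment-style mapping.

    The variable names are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    token = env.get("DATAROBOT_API_TOKEN")
    if not token:
        raise RuntimeError("Set DATAROBOT_API_TOKEN before connecting")
    # Fall back to the public endpoint when none is configured
    endpoint = env.get("DATAROBOT_ENDPOINT", DEFAULT_ENDPOINT)
    return token, endpoint
```

The result can then be passed to the client in the same way as above: `token, endpoint = load_credentials()` followed by `dr.Client(token=token, endpoint=endpoint)`.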

# Step 3 - Load the dataset

In [17]:
# The main dataset
filename = '2frag_trend_daily_2020.csv'
project_name = 'BlueScope-NorthStar-TS-Project-2'

In [None]:
# create the calendar
#calendar = dr.CalendarFile.create('event_calendar.csv')

We can check the full list of projects in DataRobot  
```
dr.Project.list()
```
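
The full list can get long, so it may help to filter it by name. This is a small sketch, assuming each returned project object exposes a `project_name` attribute; the helper `find_projects_by_name` is hypothetical and not part of the DataRobot client.

```python
def find_projects_by_name(projects, name):
    """Return the projects whose project_name matches `name` exactly.

    Objects without a project_name attribute are skipped.
    """
    return [p for p in projects if getattr(p, "project_name", None) == name]
```

For example, `find_projects_by_name(dr.Project.list(), project_name)` returns an empty list if no project with that name exists yet, and possibly several entries if the name has been reused.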

# Step 4 - Setup the project 
## Create project in DataRobot 

If the project already exists, we can retrieve it with this command:
```
project_id = '###'  # avoid shadowing Python's built-in id()
proj = dr.Project.get(project_id)
```

Or delete an existing project:
```
project_id = '###'  # avoid shadowing Python's built-in id()
proj = dr.Project.get(project_id)
proj.delete()
```

## Identify Known-In-Advance Features

In [None]:
#known_in_advance = ['Marketing', 'TouristEvent']
#feature_settings = [dr.FeatureSettings(feat_name, known_in_advance=True) for feat_name in known_in_advance]

In [15]:
#feature_settings


In [21]:
# Create the partition specification 
time_partition = dr.DatetimePartitioningSpecification(
    datetime_partition_column='Date',
    use_time_series=True,
    feature_settings=None,
    calendar_id=None,
    gap_duration='P1D',
)
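
The value `'P1D'` passed as `gap_duration` is an ISO 8601 duration meaning one day. As a rough sketch, such strings can be built from a `datetime.timedelta`. The helper `to_iso_duration` is hypothetical, and whether DataRobot accepts every form it produces has not been verified here.

```python
from datetime import timedelta


def to_iso_duration(delta):
    """Format a non-negative timedelta as an ISO 8601 duration, e.g. 'P1D'.

    Fractions of a second are truncated.
    """
    total = int(delta.total_seconds())
    if total < 0:
        raise ValueError("duration must be non-negative")
    days, rem = divmod(total, 86400)      # whole days
    hours, rem = divmod(rem, 3600)        # whole hours left over
    minutes, seconds = divmod(rem, 60)    # minutes and seconds left over
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    # Emit the day component unless only a time part is present
    date_part = f"{days}D" if days or not time_part else ""
    return "P" + date_part + ("T" + time_part if time_part else "")
```

With this helper, `to_iso_duration(timedelta(days=1))` gives the same `'P1D'` string used in the specification above.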


# Step 5 - Run the Automated Time Series Process across multiple stores

In [5]:
# Let's read the data with pandas
raw_data = pd.read_csv(filename)

In [7]:
raw_data['#2Frag Cu_calc'][1]

0.214951445

In [10]:
TrainingDataSet = pd.read_csv(filename)
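
Before uploading a daily series, it can be worth checking that no days are missing. The sketch below uses only the standard library and assumes the dates are strings in `YYYY-MM-DD` form, which has not been confirmed for this dataset. The helper `find_missing_dates` is hypothetical.

```python
from datetime import date, timedelta


def find_missing_dates(date_strings):
    """Return the calendar days absent between the first and last date seen."""
    seen = {date.fromisoformat(s) for s in date_strings}
    if not seen:
        return []
    current, last = min(seen), max(seen)
    missing = []
    # Walk day by day from the earliest to the latest date
    while current < last:
        current += timedelta(days=1)
        if current not in seen:
            missing.append(current)
    return missing
```

A possible call, assuming the `Date` column holds such strings, is `find_missing_dates(raw_data['Date'].astype(str))`; an empty list means the daily series has no gaps.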

In [22]:
# Record the start time so we can measure how long Autopilot takes
start = datetime.datetime.now()
project = dr.Project.start(
    sourcedata=TrainingDataSet,
    project_name=project_name,
    target='#2Frag Cu_calc',
    partitioning_method=time_partition,
    worker_count=-1,  # -1 requests all available workers
)


## Whilst we wait for this to finish, let's check out a few things: 
- Check out the public documentation and examples [here](https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.21.3/examples/index.html)

- Check out our community blog [here](https://community.datarobot.com/t5/guided-ai-learning/introduction-to-a-model-factory/ta-p/1599) to see other plots such as ROC curves and Feature effects

- Check out the DataRobot main console to see what's happening 

- Now let's have a look at a custom model deployment

In [24]:

# Wait for Autopilot to finish

project.wait_for_autopilot()

end = datetime.datetime.now()
delta = end - start
# total_seconds() covers the whole interval; .seconds alone drops any full days
print(delta.total_seconds() / 60, 'minutes')

In progress: 1, queued: 0 (waited: 0s)
In progress: 1, queued: 0 (waited: 2s)
In progress: 1, queued: 0 (waited: 4s)
In progress: 1, queued: 0 (waited: 5s)
In progress: 0, queued: 0 (waited: 7s)
In progress: 0, queued: 0 (waited: 10s)
In progress: 0, queued: 0 (waited: 15s)
In progress: 0, queued: 0 (waited: 22s)
In progress: 0, queued: 0 (waited: 36s)
In progress: 0, queued: 0 (waited: 57s)
In progress: 1, queued: 0 (waited: 79s)
In progress: 1, queued: 0 (waited: 100s)
In progress: 1, queued: 0 (waited: 121s)
In progress: 1, queued: 0 (waited: 142s)
In progress: 0, queued: 0 (waited: 164s)
In progress: 1, queued: 0 (waited: 186s)
In progress: 0, queued: 0 (waited: 208s)
In progress: 1, queued: 0 (waited: 229s)
In progress: 1, queued: 0 (waited: 250s)
In progress: 1, queued: 0 (waited: 271s)
In progress: 1, queued: 0 (waited: 293s)
In progress: 1, queued: 0 (waited: 315s)
In progress: 0, queued: 0 (waited: 336s)
In progress: 0, queued: 0 (waited: 357s)
In progress: 0, queued: 0 (waite

# Step 6 - Deploy the best model 


In [25]:
# Quick check of the first model on the leaderboard, which is the one we will deploy

best_model = project.get_models()[0]
print('--------------------------------')
print(best_model)
# Prints None when no cross-validation score is available for this model
print(best_model.metrics['RMSE']['crossValidation'])
print('--------------------------------')

--------------------------------
Model('Ridge Regressor with Forecast Distance Modeling')
None
--------------------------------
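
The `None` above shows that the `crossValidation` entry is empty for this model. A minimal sketch of falling back to whichever partition does have a score follows. The helper `first_available_score` is hypothetical, and the partition keys other than `crossValidation` (`backtesting`, `validation`, `holdout`) are assumptions about the layout of the `metrics` dictionary.

```python
def first_available_score(
    metrics,
    metric="RMSE",
    partitions=("backtesting", "validation", "crossValidation", "holdout"),
):
    """Return (partition, score) for the first partition with a non-None score.

    Returns (None, None) when the metric is missing or every score is None.
    """
    scores = metrics.get(metric) or {}
    for partition in partitions:
        value = scores.get(partition)
        if value is not None:
            return partition, value
    return None, None
```

It could be called as `first_available_score(project.get_models()[0].metrics)` to see which partition, if any, carries an RMSE value.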


In [27]:

# Take the first model from the leaderboard

model = project.get_models()[0]

# Create the deployment
deployment = dr.Deployment.create_from_learning_model(
    model.id,
    label='DEMO_DEPLOYMENT_BLUESCOPE',
    default_prediction_server_id=dr.PredictionServer.list()[0].id,
)
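
Indexing `dr.PredictionServer.list()[0]` raises an `IndexError` if the account has no prediction servers. A slightly more defensive selection could look like the sketch below; the helper `pick_prediction_server` is hypothetical, and it assumes each server object exposes a `url` attribute.

```python
def pick_prediction_server(servers, url_contains=None):
    """Return the first server, or the first whose url contains `url_contains`.

    Raises LookupError with a readable message when nothing matches.
    """
    if not servers:
        raise LookupError("No prediction servers are available")
    if url_contains is None:
        return servers[0]
    for server in servers:
        if url_contains in getattr(server, "url", ""):
            return server
    raise LookupError(f"No prediction server url contains {url_contains!r}")
```

In the deployment call this would be used as `default_prediction_server_id=pick_prediction_server(dr.PredictionServer.list()).id`.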



Let's have a look at the deployments page whilst we wait for the deployment to be created 

# Step 7 - Make predictions


In [33]:
pred_input = pd.read_csv("2frag_trend_daily_2020_predictions.csv")
print(deployment)

Deployment(DEMO_DEPLOYMENT_BLUESCOPE)


In [34]:

deployment_id = deployment.id
# Check that we are sending the scoring data to the right deployment
print(deployment_id)

dr.BatchPredictionJob.score_to_file(
    deployment_id,
    pred_input,
    'Output_2frag_trend_daily_2020_predictions.csv',
)



601b5739534c868107822b5c


BatchPredictionJob(batchPredictions, '601b60b393557bf44ee0a956', status=INITIALIZING)

This is only a small teaser.

We can, of course, integrate easily with other systems, which is DataRobot's biggest strength. See this example, where a time series model is retrained periodically and the predictions are written back to a proper database.