# Automated Time Series Modeling
## with DataRobot Python API

<pre>raul.arrabales@datarobot.com</pre>

<img src="https://www.datarobot.com/wp-content/uploads/2019/10/Automated-Time-Series.jpg" width=400>

<hr>

- Installation instructions: https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.25.1/setup/getting_started.html#installation

Additional resources:

- API Client Documentation: https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.26.0/ 
- API Reference: https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.25.1/autodoc/api_reference.html
- Time Series Modeling Example: https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.14.0/examples/time_series/Time_Series_Modeling.html 

### Imports

In [8]:
from datetime import date
from datetime import datetime
import pandas as pd

import datarobot as dr

### Interactive shell

In [1]:
# Set interactive shell in Jupyter
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

### DataRobot Python Client

In [3]:
# Create and configure the client from a local YAML config file
dr.Client(config_path='drconfig.yaml')

<datarobot.rest.RESTClientObject at 0x19114876d30>
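As an alternative to a config file, the connection settings can be read from environment variables. The sketch below only assembles the keyword arguments; it assumes the variable names `DATAROBOT_API_TOKEN` and `DATAROBOT_ENDPOINT` and a default endpoint of `https://app.datarobot.com/api/v2`. The helper `client_kwargs_from_env` is hypothetical and not part of the DataRobot client.

```python
import os


def client_kwargs_from_env(env=os.environ):
    """Collect connection settings from environment variables (hypothetical helper)."""
    token = env.get('DATAROBOT_API_TOKEN')
    if not token:
        raise KeyError('DATAROBOT_API_TOKEN is not set')
    # Assumed default endpoint; adjust for a self-managed installation.
    endpoint = env.get('DATAROBOT_ENDPOINT', 'https://app.datarobot.com/api/v2')
    return {'token': token, 'endpoint': endpoint}


# Stand-in environment for illustration; the token value is a placeholder.
example_env = {'DATAROBOT_API_TOKEN': 'YOUR_API_TOKEN'}
kwargs = client_kwargs_from_env(example_env)
print(kwargs['endpoint'])
```

In the notebook, the result could then be passed on as `dr.Client(**kwargs)`, which keeps the token out of files that might be shared along with the notebook.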

### Create the Time Series Project

In [9]:
# Raw string so that backslashes in the Windows path are not treated as escapes
filename = r'D:\Dropbox-Array2001\Dropbox\DataSets\DataRobot\TimeSeries\DR_Demo_Sales_Multiseries_training.xlsx'

now = datetime.now().strftime('%Y-%m-%dT%H:%M')

project_name = 'DR_PyAPI_Demo_Sales_Multiseries_{}'.format(now)

proj = dr.Project.create(sourcedata=filename,
                         project_name=project_name,
                         max_wait=3600)

print('Project ID: {}'.format(proj.id))

Project ID: 6192a87a11ec111ba3ad3a9f
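The path and project-name handling above can be sketched without touching the API. This is a minimal illustration using only the standard library; `make_project_name` is a hypothetical helper, and the fixed timestamp is used only to make the output reproducible.

```python
from datetime import datetime
from pathlib import PureWindowsPath

# Build the file path from a raw-string directory so backslashes stay literal.
data_dir = PureWindowsPath(r'D:\Dropbox-Array2001\Dropbox\DataSets\DataRobot\TimeSeries')
training_file = data_dir / 'DR_Demo_Sales_Multiseries_training.xlsx'


def make_project_name(prefix, when):
    """Append a minute-resolution timestamp to a project name prefix."""
    return '{}_{}'.format(prefix, when.strftime('%Y-%m-%dT%H:%M'))


name = make_project_name('DR_PyAPI_Demo_Sales_Multiseries',
                         datetime(2021, 11, 15, 19, 35))
print(training_file.name)
print(name)
```

Timestamping the name makes repeated runs of the notebook easy to tell apart in the project list.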


### Setting known-in-advance features

In [10]:
known_in_advance = ['Marketing', 
                    'Near_Xmas', 
                    'Near_BlackFriday',
                    'Holiday', 
                    'DestinationEvent']

feature_settings = [dr.FeatureSettings(feat_name,
                                       known_in_advance=True)
                    for feat_name in known_in_advance]
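Before creating the feature settings, it can help to confirm that every known-in-advance name actually appears in the training data. The sketch below is a standalone check; `missing_columns` is a hypothetical helper, and the `header` list is an assumed stand-in for the real column names of the training file.

```python
def missing_columns(required, available):
    """Return the required column names that are absent from the available ones."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


known_in_advance = ['Marketing', 'Near_Xmas', 'Near_BlackFriday',
                    'Holiday', 'DestinationEvent']

# Assumed header for illustration; in the notebook this could come from
# the columns of the loaded training data.
header = ['Date', 'Store', 'Sales', 'Marketing', 'Near_Xmas',
          'Near_BlackFriday', 'Holiday', 'DestinationEvent']

print(missing_columns(known_in_advance, header))
```

A non-empty result points to a misspelled or missing column before any project settings are sent to the server.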

### Multi Time Series settings
- One time series per store (`Store` is the series ID column)

In [11]:
time_partition = dr.DatetimePartitioningSpecification(
    datetime_partition_column='Date',
    multiseries_id_columns=['Store'],
    use_time_series=True,
    feature_settings=feature_settings,
)

### Set target and start autopilot

In [13]:
proj.set_target(
    target='Sales',
    partitioning_method=time_partition,
    max_wait=3600,
    worker_count=-1
)

print("Project GUI: " + proj.get_leaderboard_ui_permalink())

proj.wait_for_autopilot()

Project(DR_PyAPI_Demo_Sales_Multiseries_2021-11-15T19:35)

Project GUI: https://app.datarobot.com/projects/6192a87a11ec111ba3ad3a9f/models
In progress: 20, queued: 3 (waited: 0s)
In progress: 20, queued: 3 (waited: 1s)
In progress: 20, queued: 3 (waited: 2s)
In progress: 20, queued: 3 (waited: 3s)
In progress: 20, queued: 3 (waited: 5s)
In progress: 20, queued: 3 (waited: 7s)
In progress: 20, queued: 3 (waited: 11s)
In progress: 20, queued: 3 (waited: 19s)
In progress: 20, queued: 3 (waited: 33s)
In progress: 20, queued: 3 (waited: 54s)
In progress: 20, queued: 2 (waited: 75s)
In progress: 16, queued: 0 (waited: 96s)
In progress: 7, queued: 0 (waited: 116s)
In progress: 6, queued: 0 (waited: 137s)
In progress: 4, queued: 0 (waited: 158s)
In progress: 2, queued: 0 (waited: 178s)
In progress: 4, queued: 0 (waited: 199s)
In progress: 4, queued: 0 (waited: 220s)
In progress: 4, queued: 0 (waited: 241s)
In progress: 4, queued: 0 (waited: 261s)
In progress: 1, queued: 0 (waited: 282s)
In progress: 0, queued: 0 (waited: 303s)
In progress: 0, queued: 

### Choose the best model

In [23]:
proj.get_models()[:10]

[Model('Ridge Regressor with Forecast Distance Modeling and Series Scaling'),
 Model('Ridge Regressor with Forecast Distance Modeling and Series Scaling'),
 Model('AVG Blender'),
 Model('Ridge Regressor with Forecast Distance Modeling and Series Scaling'),
 Model('Ridge Regressor with Forecast Distance Modeling and Series Scaling'),
 Model('Ridge Regressor with Forecast Distance Modeling and Series Scaling'),
 Model('eXtreme Gradient Boosted Trees Regressor with Early Stopping (learning rate =0.3)'),
 Model('Temporal Hierarchical Model with Elastic Net and XGBoost'),
 Model('Temporal Hierarchical Model with Elastic Net and XGBoost'),
 Model('eXtreme Gradient Boosted Trees Regressor with Early Stopping (learning rate =0.3)')]

In [15]:
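# Fetch the full leaderboard
lb = proj.get_models()

# Keep only models that have a cross-validation score for the project metric
# (compare against None so that a legitimate score of 0 is not dropped)
valid_models = [m for m in lb if
                m.metrics[proj.metric]['crossValidation'] is not None]

# Pick the lowest score; this assumes a lower-is-better metric
best_model = min(valid_models,
                 key=lambda m: m.metrics[proj.metric]['crossValidation'])

print(best_model.model_type)
print(best_model.get_leaderboard_ui_permalink())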

Ridge Regressor with Forecast Distance Modeling and Series Scaling
https://app.datarobot.com/projects/6192a87a11ec111ba3ad3a9f/models/6192ad548dcbcd9587ee0fd7
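The selection logic can be illustrated with plain dictionaries standing in for leaderboard models. Everything here is made up for the illustration: the metric name `RMSE`, the scores, and the `pick_best` helper. It shows why filtering on `is not None` is safer than filtering on truthiness, which would also discard a score of exactly 0.

```python
def pick_best(models, metric, partition='crossValidation'):
    """Return the model with the lowest available score (lower-is-better metric)."""
    scored = [m for m in models
              if m['metrics'][metric][partition] is not None]
    return min(scored, key=lambda m: m['metrics'][metric][partition])


# Made-up stand-ins for leaderboard entries.
models = [
    {'name': 'A', 'metrics': {'RMSE': {'crossValidation': None}}},
    {'name': 'B', 'metrics': {'RMSE': {'crossValidation': 12.5}}},
    {'name': 'C', 'metrics': {'RMSE': {'crossValidation': 0.0}}},
]

best = pick_best(models, 'RMSE')
print(best['name'])
```

Model `A` has no score and is skipped, while model `C` is kept even though its score is falsy.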


### Unlock holdout

In [20]:
proj.unlock_holdout()

Project(DR_PyAPI_Demo_Sales_Multiseries_2021-11-15T19:35)

In [26]:
# Retraining is commented out here; the previously retrained model
# is fetched by its ID instead.
# job = best_model.request_frozen_datetime_model()
# retrained_model = job.get_result_when_complete()

retrained_model_id = '6192ae23c4c1fc8b9c4b6f4d'
retrained_model = dr.Model.get(project=proj, model_id=retrained_model_id)

print(retrained_model.get_leaderboard_ui_permalink())


https://app.datarobot.com/projects/6192a87a11ec111ba3ad3a9f/models/6192ae23c4c1fc8b9c4b6f4d


### Make Predictions

In [24]:
# Use the last date in the training data as the forecast point
d = pd.read_excel(r'D:\Dropbox-Array2001\Dropbox\DataSets\DataRobot\TimeSeries\DR_Demo_Sales_Multiseries_training.xlsx')
last_train_date = pd.to_datetime(d['Date']).max()

# Upload the prediction dataset (raw string keeps the backslashes literal)
dataset = proj.upload_dataset(
    r'D:\Dropbox-Array2001\Dropbox\DataSets\DataRobot\TimeSeries\DR_Demo_Sales_Multiseries_prediction.xlsx',
    forecast_point=last_train_date
)

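# Request predictions from the selected model and wait for the result
# (retrained_model could be substituted here to use the retrained model)
pred_job = best_model.request_predictions(dataset_id=dataset.id)
preds = pred_job.get_result_when_complete()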

In [25]:
preds.head()

| | forecast_distance | forecast_point | prediction | row_id | series_id | timestamp |
|---|---|---|---|---|---|---|
| 0 | 1 | 2014-06-14T00:00:00.000000Z | 127432.876904 | 714 | Louisville | 2014-06-15T00:00:00.000000Z |
| 1 | 2 | 2014-06-14T00:00:00.000000Z | 126751.622044 | 715 | Louisville | 2014-06-16T00:00:00.000000Z |
| 2 | 3 | 2014-06-14T00:00:00.000000Z | 133469.223391 | 716 | Louisville | 2014-06-17T00:00:00.000000Z |
| 3 | 4 | 2014-06-14T00:00:00.000000Z | 129163.909011 | 717 | Louisville | 2014-06-18T00:00:00.000000Z |
| 4 | 5 | 2014-06-14T00:00:00.000000Z | 149470.124649 | 718 | Louisville | 2014-06-19T00:00:00.000000Z |
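In the output above, each `timestamp` equals the `forecast_point` plus `forecast_distance` time steps. The sketch below reproduces that arithmetic for the first five rows, assuming a daily time step (as the output suggests); `forecast_timestamp` is a hypothetical helper, not part of the DataRobot client.

```python
from datetime import datetime, timedelta


def forecast_timestamp(forecast_point, forecast_distance, step=timedelta(days=1)):
    """Timestamp being predicted, given the forecast point and distance in steps."""
    return forecast_point + forecast_distance * step


forecast_point = datetime(2014, 6, 14)
for distance in range(1, 6):
    print(distance, forecast_timestamp(forecast_point, distance).date())
```

This makes it easy to sanity-check that the uploaded prediction rows line up with the dates the model is expected to forecast.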


In [27]:
# Save the predictions without the DataFrame index column
preds.to_csv(r'D:\Dropbox-Array2001\Dropbox\DataSets\DataRobot\TimeSeries\DR_Demo_Sales_Multiseries_prediction_output.csv', index=False)