# Driverless AI - Time Series Recipes with Rolling Window

The purpose of this notebook is to show an example of using Driverless AI to forecast on dates outside of the forecast horizon.  This can be accomplished in two ways: 

1. **Re-Training**: Trigger a Driverless AI experiment to be trained once the forecast horizon ends
2. **Test Time Augmentation**: Use the same Driverless AI experiment and use Test Time Augmentation to update historical features


Let's take our example of forecasting sales for the [Walmart Kaggle Competition](https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting). 
With Re-Training, we would launch a new Driverless AI experiment every week with the latest data and use the resulting model to forecast for the next week. With Test Time Augmentation, we would continue using the same Driverless AI experiment outside of the initial forecast horizon.

Both options have advantages and disadvantages.  By re-training an experiment with the latest data, Driverless AI can potentially improve the model by changing the features used, choosing a different algorithm, or selecting different parameters.  For example, as the data changes over time, Driverless AI may find that the best algorithm for this use case has changed.

Test Time Augmentation lets us keep using the same experiment over a longer period of time, so there is no need to repeat the model review process for every new forecast.  The model may become out of date, however, and the MOJO scoring pipeline is not supported for this approach.

For different use cases, there may be clear advantages for Re-Training or Test Time Augmentation.  In this notebook, we will show how to implement either option and evaluate how the model performs over time.

**Note**: This notebook was tested and run on Driverless AI 1.8.1.


## Workflow

1. Import data into Python
2. Format data for Time Series
3. Perform Re-Training
    * create function that slices data by date
    * for each slice of data: 
        * import data into Driverless AI
        * train an experiment
        * combine test predictions
4. Perform Test Time Augmentation
    * train an experiment on historical data
    * forecast sales over future data using Test Time Augmentation
5. Evaluate the performance of Re-Training vs Test Time Augmentation

In [1]:
import pandas as pd
from collections import OrderedDict
from h2oai_client import Client

## Step 1: Import Data

We will begin by importing our data using pandas. 

In [2]:
sales_data = pd.read_csv("./walmart_train.csv")
sales_data.head()

Unnamed: 0,Store,Dept,Date,Weekly_Sales,Temperature,Fuel_Price,MarkDown1,MarkDown2,MarkDown3,MarkDown4,MarkDown5,CPI,Unemployment,IsHoliday,sample_weight
0,1,1,2010-02-05,24924.5,42.31,2.572,-1.0,-1.0,-1.0,-1.0,-1.0,211.096358,8.106,0,1
1,1,2,2010-02-05,50605.27,42.31,2.572,-1.0,-1.0,-1.0,-1.0,-1.0,211.096358,8.106,0,1
2,1,3,2010-02-05,13740.12,42.31,2.572,-1.0,-1.0,-1.0,-1.0,-1.0,211.096358,8.106,0,1
3,1,4,2010-02-05,39954.04,42.31,2.572,-1.0,-1.0,-1.0,-1.0,-1.0,211.096358,8.106,0,1
4,1,5,2010-02-05,32229.38,42.31,2.572,-1.0,-1.0,-1.0,-1.0,-1.0,211.096358,8.106,0,1


In [3]:
# Convert Date column to datetime
sales_data["Date"] = pd.to_datetime(sales_data["Date"], format="%Y-%m-%d")

## Step 2: Format Data for Time Series

The data has one record per Store, Department, and Week.  Our goal for this use case will be to forecast the total sales for the next week.

We should only use predictors that will be available at the time of scoring.  Features like Temperature, Fuel Price, and Unemployment will not be known in advance.  Therefore, before we start our Driverless AI experiments, we will replace Temperature, Fuel Price, Unemployment, and CPI with their values from the previous week, which we will know at the time of scoring.

In [4]:
lag_variables = ["Temperature", "Fuel_Price", "CPI", "Unemployment"]
dai_data = sales_data.set_index(["Date", "Store", "Dept"])
lagged_data = dai_data.loc[:, lag_variables].groupby(level=["Store", "Dept"]).shift(1)

In [5]:
# Join lagged predictor variables to training data
dai_data = dai_data.join(lagged_data.rename(columns=lambda x: x +"_lag"))

In [6]:
# Drop original predictor variables - we do not want to use these in the model 
dai_data = dai_data.drop(columns=lag_variables)
dai_data = dai_data.reset_index()

In [7]:
dai_data.head()

Unnamed: 0,Date,Store,Dept,Weekly_Sales,MarkDown1,MarkDown2,MarkDown3,MarkDown4,MarkDown5,IsHoliday,sample_weight,Temperature_lag,Fuel_Price_lag,CPI_lag,Unemployment_lag
0,2010-02-05,1,1,24924.5,-1.0,-1.0,-1.0,-1.0,-1.0,0,1,,,,
1,2010-02-05,1,2,50605.27,-1.0,-1.0,-1.0,-1.0,-1.0,0,1,,,,
2,2010-02-05,1,3,13740.12,-1.0,-1.0,-1.0,-1.0,-1.0,0,1,,,,
3,2010-02-05,1,4,39954.04,-1.0,-1.0,-1.0,-1.0,-1.0,0,1,,,,
4,2010-02-05,1,5,32229.38,-1.0,-1.0,-1.0,-1.0,-1.0,0,1,,,,
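As a quick sanity check of the lagging step, here is a minimal sketch on a toy frame. The values are made up for illustration and only one grouping column is used. It sorts by date within each group before calling `shift(1)`, because the shift works on row order and means "previous week" only when rows are in date order.

```python
import pandas as pd

# Toy data (made up for illustration): two stores, three weeks each
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-19"] * 2),
    "Store": [1, 1, 1, 2, 2, 2],
    "Temperature": [42.3, 38.5, 39.9, 40.2, 35.0, 37.1],
})

# Sort by date within each group so that shift(1) picks up the previous week
toy = toy.sort_values(["Store", "Date"]).reset_index(drop=True)
toy["Temperature_lag"] = toy.groupby("Store")["Temperature"].shift(1)

# The first week of every store has no previous value, so its lag is NaN
first_week = toy.groupby("Store").head(1)
print(toy)
```

The missing lag values in the first week match the empty `*_lag` columns in the `dai_data.head()` output above.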


## Step 3: Perform Re-Training

For Re-Training, we will create a function that slices the data by date.  For each window, we will train a Driverless AI model on 139 weeks of data and forecast the next week.

### Get Moving Windows

We will create a function that can split our data by time to create multiple experiments.  Our function will split the data into training and testing based on the training length and testing length specified by the user.  

In [8]:
def get_moving_windows(dataset, train_len, test_len, date_col):
    
    # Calculate windows for the training and testing data based on the train_len and test_len arguments
    unique_dates = dataset[date_col].unique()
    unique_dates.sort()
    num_dates = len(unique_dates)
    num_windows = (num_dates - train_len) // test_len
    print("Number of Training Windows: ", num_windows)
    
    windows = []
    for i in range(num_windows):
        train_start_date = unique_dates[i]
        train_end_date = unique_dates[(i + train_len - 1)]
        test_start_date = unique_dates[(i + train_len)]
        test_end_date = unique_dates[(i + train_len + test_len - 1)]
        
        window = {'train_start_date': train_start_date, 
                  'train_end_date': train_end_date, 
                  'test_start_date': test_start_date,
                  'test_end_date': test_end_date}
        windows.append(window)
        
    return windows    

In [9]:
pd.DataFrame([OrderedDict(x) for x in get_moving_windows(dai_data, 139, 1, "Date")])

Number of Training Windows:  4


Unnamed: 0,train_start_date,train_end_date,test_start_date,test_end_date
0,2010-02-05,2012-09-28,2012-10-05,2012-10-05
1,2010-02-12,2012-10-05,2012-10-12,2012-10-12
2,2010-02-19,2012-10-12,2012-10-19,2012-10-19
3,2010-02-26,2012-10-19,2012-10-26,2012-10-26
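The window count above follows directly from the formula in `get_moving_windows`. The sketch below reproduces the arithmetic on a synthetic calendar. It assumes 143 consecutive weekly dates starting on 2010-02-05, which is inferred from the output above (4 windows with `train_len = 139` and `test_len = 1`) rather than read from the data.

```python
import pandas as pd

# Assumption: 143 consecutive Fridays starting 2010-02-05 (inferred from the output above)
dates = pd.date_range("2010-02-05", periods=143, freq="W-FRI")
train_len, test_len = 139, 1

# Same formula as in get_moving_windows
num_windows = (len(dates) - train_len) // test_len

# Test dates of the first and the last window, using the same index arithmetic
first_test = dates[0 + train_len]
last_test = dates[(num_windows - 1) + train_len + test_len - 1]
print(num_windows, first_test.date(), last_test.date())
```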


### Train Model per Moving Window

Our next function trains the experiment for each subset of data and saves the forecast for the test data.

In [10]:
def dai_get_forecast(train_data, test_data, predictors, target, date_col, time_group_cols, 
                     accuracy, time, interpretability):
    
    # Save dataset
    train_path = "./train_data.csv"
    test_path = "./test_data.csv"
    keep_cols = predictors + [target, date_col] + time_group_cols
    keep_cols = list(set(keep_cols))
    train_data[keep_cols].to_csv(train_path, index = False)
    test_data[keep_cols].to_csv(test_path, index = False)
    
    # Add datasets to Driverless AI
    train_dai = h2oai.upload_dataset_sync(train_path)
    test_dai = h2oai.upload_dataset_sync(test_path)
    
    # Run Driverless AI Experiment
    experiment = h2oai.start_experiment_sync(dataset_key = train_dai.key,
                                             target_col = target,
                                             cols_to_drop = [],
                                             is_classification = False,
                                             accuracy = accuracy,
                                             time = time,
                                             interpretability = interpretability,
                                             scorer = "RMSE",
                                             time_col = date_col,
                                             time_groups_columns = time_group_cols,
                                             num_prediction_periods = test_data[date_col].nunique(),
                                             num_gap_periods = 0)
    
    # Predict on the Test Data
    pred_job = h2oai.make_prediction_sync(experiment.key, test_dai.key,
                                          output_margin = False, pred_contribs = False)
    test_predictions_path = h2oai.download(pred_job.predictions_csv_path, "./")
    test_predictions = pd.read_csv(test_predictions_path)
    test_predictions.columns = ["Prediction"]
    
    # Add predictions to original test data
    keep_cols = [target, date_col] + time_group_cols
    test_predictions = pd.concat([test_data[keep_cols].reset_index(drop=True), test_predictions], axis = 1)
    
    return test_predictions

### Run Re-Training

Now that we have our helper functions, we can run Re-Training.

For each 139-week window of data, we will: 

* train a Driverless AI model
* forecast the next week's sales

This will give us 4 weeks of predictions.

In [11]:
address = 'http://ip_where_driverless_is_running:12345'
username = 'username'
password = 'password'
h2oai = Client(address = address, username = username, password = password)
# make sure to use the same user name and password when signing in through the GUI

In [12]:
windows = get_moving_windows(dai_data, train_len = 139, test_len = 1, date_col = "Date")    

Number of Training Windows:  4


In [16]:
forecast_predictions = pd.DataFrame([])

In [14]:
predictors = ["MarkDown1", "MarkDown2", "MarkDown3", "MarkDown4", "MarkDown5", "IsHoliday",
              "Temperature_lag", "Fuel_Price_lag", "CPI_lag", "Unemployment_lag"]

In [15]:
for window in windows:
    train_data = dai_data[(dai_data["Date"] >= window.get("train_start_date")) & 
                          (dai_data["Date"] <= window.get("train_end_date"))]

    test_data = dai_data[(dai_data["Date"] >= window.get("test_start_date")) & 
                         (dai_data["Date"] <= window.get("test_end_date"))]

    # Get the Driverless AI forecast predictions
    preds = dai_get_forecast(train_data, test_data, 
                             predictors, 
                             target = "Weekly_Sales", 
                             date_col = "Date", 
                             time_group_cols = ["Store", "Dept"], 
                             accuracy = 1, time = 1, interpretability = 10)
    forecast_predictions = pd.concat([forecast_predictions, preds], axis = 0)

In [17]:
forecast_predictions.head()

## Step 4: Perform Test Time Augmentation

For Test Time Augmentation, we will train a single Driverless AI experiment on the first 139 weeks of data.  We will then use it to forecast the next 4 weeks.

Note: Our model is only being told it will need to forecast the next week.  To trigger Test Time Augmentation, we must provide information on everything that happens after the training data ends, up until the week we want to forecast.  For example, let's say the training data ends on 2012-09-28 and the forecast horizon ends on 2012-10-05 (one week later).  To forecast further out, we must provide what actually happened on 2012-10-05.  This allows Driverless AI to update any features that use historical information (such as the previous week's sales).

When formatting to predict for 2012-10-12, I would create the test data to look like the following: 

| Date | Store | Dept | Predictors | Weekly_Sales |
|------|-------|------|------------|--------------|
| 2012-10-05 | 1 | 1 | ... | 21904.47 | 
| 2012-10-12 | 1 | 1 | ... | NA | 


When formatting to predict for 2012-10-19, I would create the test data to look like the following: 

| Date | Store | Dept | Predictors | Weekly_Sales |
|------|-------|------|------------|--------------|
| 2012-10-05 | 1 | 1 | ... | 21904.47 | 
| 2012-10-12 | 1 | 1 | ... | 22764.01 | 
| 2012-10-19 | 1 | 1 | ... | NA | 
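The layout described above can be sketched with plain pandas. The helper name `make_tta_frame` is hypothetical (it is not part of the Driverless AI client), and the toy frame holds a single Store/Dept series. The first two sales values come from the tables above; the third is a placeholder that gets blanked anyway.

```python
import pandas as pd

def make_tta_frame(test_data, forecast_date, date_col="Date", target_col="Weekly_Sales"):
    """Keep rows up to forecast_date and blank the target on forecast_date only."""
    frame = test_data[test_data[date_col] <= forecast_date].copy()
    frame.loc[frame[date_col] == forecast_date, target_col] = float("nan")
    return frame

# Toy test data for one Store/Dept; the last sales value is a placeholder
toy_test = pd.DataFrame({
    "Date": pd.to_datetime(["2012-10-05", "2012-10-12", "2012-10-19"]),
    "Store": [1, 1, 1],
    "Dept": [1, 1, 1],
    "Weekly_Sales": [21904.47, 22764.01, 0.0],
})

# Frame for forecasting the third week: two weeks of actuals, then NA
tta_wk3 = make_tta_frame(toy_test, pd.Timestamp("2012-10-19"))
print(tta_wk3)
```

Copying the slice before blanking the target leaves the original test data untouched, so the actuals stay available for evaluation later.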

### Train Driverless AI Experiment

Let's begin by training our Driverless AI model.  We will train on data up until 2012-09-28. We will use the following time series parameters:

    Time Group Columns: [Store, Dept]
    Number of Prediction Periods: 1 (a.k.a., horizon)
    Number of Gap Periods: 0

Note that the period size is unknown to the Python client. To address this, you can also pass the optional `time_period_in_seconds` parameter, which lets you specify the horizon in real time units. If this parameter is omitted, Driverless AI automatically detects the period size during the experiment, and the horizon is interpreted in units of that detected period. For example, `num_prediction_periods=14` gives a 14-week horizon only if the period really is one week; if Driverless AI detects a different period, the model may not behave as intended.
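To make the units concrete, here is a small sketch that computes a one-week period in seconds and collects the time series settings from this section in a dictionary. It does not call Driverless AI. Passing `time_period_in_seconds` as an extra keyword to `start_experiment_sync` is an assumption based on the note above, so check it against your client version.

```python
# One weekly period expressed in seconds
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Time series settings from this section, collected as keyword arguments.
# Assumption: time_period_in_seconds is accepted alongside the other
# time series keywords, as the note above describes.
time_series_params = {
    "time_col": "Date",
    "time_groups_columns": ["Store", "Dept"],
    "num_prediction_periods": 1,
    "num_gap_periods": 0,
    "time_period_in_seconds": SECONDS_PER_WEEK,
}

# Horizon in days implied by these settings
horizon_days = (time_series_params["num_prediction_periods"]
                * time_series_params["time_period_in_seconds"] / 86400)
print(SECONDS_PER_WEEK, horizon_days)
```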

In [18]:
train_data = dai_data[dai_data["Date"] <= "2012-09-28"]
test_data = dai_data[dai_data["Date"] >= "2012-10-05"]

In [19]:
train_path = "./train_data.csv"
test_path = "./test_data.csv"

keep_cols = predictors + ["Weekly_Sales", "Date", "Dept", "Store"]
keep_cols = list(set(keep_cols))

train_data[keep_cols].to_csv(train_path, index = False)

In [20]:
# Add datasets to Driverless AI
train_dai = h2oai.upload_dataset_sync(train_path)

In [21]:
experiment = h2oai.start_experiment_sync(dataset_key = train_dai.key,
                                         target_col = "Weekly_Sales",
                                         cols_to_drop = [],
                                         is_classification = False,
                                         accuracy = 1,
                                         time = 1,
                                         interpretability = 10,
                                         scorer = "RMSE",
                                         time_col = "Date",
                                         time_groups_columns = ["Store", "Dept"],
                                         num_prediction_periods = 1,
                                         num_gap_periods = 0)

### Get Forecast

Now that we have trained our single Driverless AI model, we will use it to forecast the next 4 weeks of sales.

In order to use the same experiment to forecast past the forecast horizon, we need to use Test Time Augmentation.  Test Time Augmentation means that we want Driverless AI to augment our test data once it's outside of the forecast horizon.

For example, we may find that an important feature in the model is the `Weekly_Sales` from the previous week for a Store and Department.  Once we are outside of the forecast horizon, Driverless AI no longer has the information about what the previous week's `Weekly_Sales` were.  

If we provide Driverless AI with test data that consists of the previous weeks' target value, then Driverless AI is able to update these historical features before calculating the forecast.

In order to trigger Test Time Augmentation (TTA), we must format our forecast data such that: 

* the date that we want to forecast for has NA's in the target value
* all target values for previous dates are included in the test data

Here is an example of performing TTA for the second week in our test data:

In [22]:
# Forecast for the second week

tta_test_data_wk2 = test_data[test_data["Date"] <= "2012-10-12"].copy()
tta_test_data_wk2.loc[tta_test_data_wk2["Date"] == "2012-10-12", "Weekly_Sales"] = None

tta_test_data_wk2.loc[(tta_test_data_wk2["Store"] == 1) & (tta_test_data_wk2["Dept"] == 1), ["Date", "Store", "Dept", "Weekly_Sales"]]

Unnamed: 0,Date,Store,Dept,Weekly_Sales
409695,2012-10-05,1,1,21904.47
412671,2012-10-12,1,1,


The data we will send to Driverless AI to score consists of the data for the first week after training with the target value and the data for the second week after training without the target value.

We want Driverless AI to give us a prediction for the second week after training, so we leave `Weekly_Sales` as NA.  We also provide information about what happened one week prior.  We need this information to update the historical lags created by the Driverless AI experiment.  
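Before uploading a frame for scoring, it can help to verify that it follows the two formatting rules. The sketch below uses a hypothetical helper, `check_tta_format`, and assumes the history itself has no genuinely missing targets. If your data does have gaps, the second check would need to be relaxed.

```python
import pandas as pd

def check_tta_format(frame, date_col, target_col):
    """Return True if the latest date has only NA targets and earlier dates have none."""
    last_date = frame[date_col].max()
    is_last = frame[date_col] == last_date
    forecast_rows_blank = frame.loc[is_last, target_col].isna().all()
    history_complete = frame.loc[~is_last, target_col].notna().all()
    return bool(forecast_rows_blank and history_complete)

# Toy frames (sales values other than 21904.47 are made up for illustration)
good = pd.DataFrame({
    "Date": pd.to_datetime(["2012-10-05", "2012-10-12"]),
    "Weekly_Sales": [21904.47, float("nan")],
})
bad = pd.DataFrame({
    "Date": pd.to_datetime(["2012-10-05", "2012-10-12"]),
    "Weekly_Sales": [21904.47, 20000.0],  # target not blanked on the forecast date
})

print(check_tta_format(good, "Date", "Weekly_Sales"))
print(check_tta_format(bad, "Date", "Weekly_Sales"))
```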

In [23]:
# Upload test data to Driverless AI
test_path = "./tta_test_data.csv"
tta_test_data_wk2.to_csv(test_path, index = False)
tta_test_data_wk2_dai = h2oai.upload_dataset_sync(test_path)

In [24]:
# Make Predictions
pred_job = h2oai.make_prediction_sync(experiment.key, tta_test_data_wk2_dai.key, 
                                      output_margin = False, pred_contribs = False)
pred_path = h2oai.download(pred_job.predictions_csv_path, ".")

In [25]:
# Save Predictions
preds = pd.read_csv(pred_path)

actual = tta_test_data_wk2[["Date", "Store", "Dept"]]
actual = actual.reset_index(drop = True)

preds = pd.concat([actual, preds], axis = 1)
preds = preds.loc[preds["Date"] == "2012-10-12"]
preds.head()

Unnamed: 0,Date,Store,Dept,Weekly_Sales
2976,2012-10-12,1,1,603.870605
2977,2012-10-12,1,2,14193.673828
2978,2012-10-12,1,3,13296.996094
2979,2012-10-12,1,4,296.672363
2980,2012-10-12,1,5,1307.100586


We create a helper function that is able to format the data correctly for Test Time Augmentation for each of the 4 weeks into the future.

In [26]:
def get_tta_predictions(test_data, experiment_key, time_col, time_group_cols, target_col):
    
    test_preds = pd.DataFrame()
    dates = test_data[time_col].unique()
    
    # For each date, calculate the prediction from Driverless AI
    for i in dates:
        
        # Format data for testing
        tta_test_data = test_data[test_data[time_col] <= i].copy()
        tta_test_data.loc[tta_test_data[time_col] == i, target_col] = None
        
        # Upload test data to Driverless AI
        test_path = "./tta_test_data.csv"
        tta_test_data.to_csv(test_path, index = False)
        test_dai = h2oai.upload_dataset_sync(test_path)
        
        # Make Predictions
        pred_job = h2oai.make_prediction_sync(experiment_key, test_dai.key, 
                                              output_margin = False, pred_contribs = False)
        pred_path = h2oai.download(pred_job.predictions_csv_path, ".")
        
        # Save Predictions
        preds = pd.read_csv(pred_path)
        preds.columns = ["Prediction"]
        
        # Add Date and Time Group columns to the predictions
        actuals = tta_test_data[[time_col] + time_group_cols]
        actuals = actuals.reset_index(drop = True)
        preds = pd.concat([actuals, preds], axis = 1)
        preds = preds.loc[preds[time_col] == i]
        
        # Add actual target to the predictions
        preds = pd.merge(preds, test_data[[time_col, target_col] + time_group_cols], 
                         on = [time_col] + time_group_cols)
        test_preds = pd.concat([test_preds, preds], axis = 0)
        
    return test_preds

In [27]:
forecast_predictions_tta = get_tta_predictions(test_data, experiment.key, "Date", ["Store", "Dept"], "Weekly_Sales")

In [28]:
forecast_predictions_tta.head()

Unnamed: 0,Date,Store,Dept,Prediction,Weekly_Sales
0,2012-10-05,1,1,18676.783203,21904.47
1,2012-10-05,1,2,45474.324219,48577.08
2,2012-10-05,1,3,17708.449219,11676.98
3,2012-10-05,1,4,35674.765625,39311.93
4,2012-10-05,1,5,20880.146484,25508.81


## Step 5: Evaluate the Performance

Now that we have the forecasted predictions when using Re-Training vs Test Time Augmentation, we can compare the performance.

In [30]:
# Calculate some error metric
rmse_retraining = ((forecast_predictions["Weekly_Sales"] - forecast_predictions["Prediction"])**2).mean()**(0.5)
print("Re-Training - RMSE: ${:,.2f}".format(rmse_retraining))

rmse_tta = ((forecast_predictions_tta["Weekly_Sales"] - forecast_predictions_tta["Prediction"])**2).mean()**(0.5)
print("Test Time Augmentation - RMSE: ${:,.2f}".format(rmse_tta))

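KeyError: 'Weekly_Sales'

**Note**: This error is raised when `forecast_predictions` is empty.  The execution counts above suggest that the cell initializing `forecast_predictions` (In [16]) was run after the Re-Training loop (In [15]), which discards the collected predictions.  Running the cells in the order shown avoids it.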
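Beyond a single overall number, it can be useful to see how the error develops week by week. The sketch below is one way to do that. The helper name `rmse_by_date` is hypothetical, the toy values are made up, and the column names follow the prediction frames built earlier. It also fails early with a readable message if an expected column is missing.

```python
import numpy as np
import pandas as pd

def rmse_by_date(preds, date_col="Date", actual_col="Weekly_Sales", pred_col="Prediction"):
    """Return the RMSE for each forecast date as a Series indexed by date."""
    missing = {date_col, actual_col, pred_col} - set(preds.columns)
    if missing:
        raise ValueError("predictions frame is missing columns: {}".format(sorted(missing)))
    squared_error = (preds[actual_col] - preds[pred_col]) ** 2
    return np.sqrt(squared_error.groupby(preds[date_col]).mean())

# Toy predictions (made up for illustration)
toy_preds = pd.DataFrame({
    "Date": pd.to_datetime(["2012-10-05", "2012-10-05", "2012-10-12"]),
    "Weekly_Sales": [10.0, 20.0, 5.0],
    "Prediction": [13.0, 17.0, 5.0],
})

weekly_rmse = rmse_by_date(toy_preds)
print(weekly_rmse)
```

Applying the same function to `forecast_predictions` and `forecast_predictions_tta` would show whether the Test Time Augmentation error grows as the forecast date moves further from the end of training.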