# Creating a Deep Learning Forecasting model with PyTorch
In this notebook, you will create a Recurrent Neural Network (RNN) that forecasts the daily battery cycles used, based on time series battery data.

In [None]:
import uuid
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler

np.random.seed(1) # ensure repeatability

import torch
import torch.nn as nn
import torch.optim as optim

pd.set_option('display.max_columns', 10)

torch.manual_seed(0) # ensure repeatability

### Download the data

The following cell will download the data set containing the daily battery time series from the Azure ML Datastore.

In [None]:
from azureml.core import Workspace

subscription_id = '<your-subscription-id>'
resource_group = 'MCW-Machine-Learning'
workspace_name = 'mcwmachinelearning'

# Get our Azure ML Workspace
workspace = Workspace(subscription_id, resource_group, workspace_name)

# Get the workspace's default datastore
datastore = workspace.get_default_datastore()

In [None]:
# Download the processed dataset locally
datastore.download('./data', 'daily-battery-time-series-v3-processed.csv', overwrite=True)

### Load the data

The previously downloaded CSV file is loaded into a pandas DataFrame, and its first few rows are inspected.

In [None]:
# Load the dataset
df = pd.read_csv('./data/daily-battery-time-series-v3-processed.csv', delimiter=',')
df = df[['Date','Battery_ID','Battery_Age_Days','Number_Of_Trips','Daily_Trip_Duration','Daily_Cycles_Used', 'Lifetime_Cycles_Used', 'Battery_Rated_Cycles']]

# Inspect the data frame
df.head()

The time series for one specific Battery_ID is isolated and its shape checked. To keep the model simple and make it easier to understand, only one column is used: Daily_Cycles_Used.

If the dataset contains several time series, training and prediction must be repeated for each individual series.

In [None]:
# Isolate the time series related to one Battery_ID
df_source = df[df['Battery_ID'] == 0][['Daily_Cycles_Used']]
df_source.shape
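
If the dataset holds several batteries, the isolation step above can be repeated per ```Battery_ID```. The following is a minimal sketch, assuming a small hypothetical frame (```toy_df```) that reuses the two relevant column names; the values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data that reuses the column names of the real dataset
toy_df = pd.DataFrame({
    'Battery_ID': [0, 0, 0, 1, 1],
    'Daily_Cycles_Used': [0.5, 0.7, 0.6, 1.1, 0.9],
})

# Build one 1-D NumPy array per battery
series_by_battery = {
    battery_id: group['Daily_Cycles_Used'].to_numpy()
    for battery_id, group in toy_df.groupby('Battery_ID')
}

for battery_id, series in series_by_battery.items():
    print(battery_id, series.shape)
```

Each array in ```series_by_battery``` could then go through the same preparation, training, and prediction steps as the single series used in this notebook.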

Reshape our input into a one-dimensional array. This makes the operations that follow easier to understand.

In [None]:
# Flatten the single-column frame into a 1-D array
source = df_source.values.ravel()

### Prepare the data

Our time series prediction model uses a special kind of RNN (Recurrent Neural Network) built from LSTM (Long Short-Term Memory) cells. LSTMs tend to struggle with very long sequences, so we cap each sample at 250 steps (```sample_size```). From this value we then calculate the maximum number of non-overlapping samples we can extract from our original time series (```num_samples```).

We then consolidate these samples into two matrices, ```input``` and ```output```. They are built so that for every element Xn of every sample in ```input```, the corresponding element of ```output``` is the one that follows it (Xn+1). The fundamental idea is to train a model that can predict Xn+1 based on Xn.

In [None]:
from sklearn.preprocessing import scale

# Scale the input data first, to increase the network's performance
source = scale(source)

sample_size = 250
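# The slices below stop 2 positions before the end of the series,
# so reserve 2 elements to keep every sample exactly sample_size long
num_samples = (source.shape[0] - 2) // sample_size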

input = np.zeros((num_samples, sample_size))
output = np.zeros((num_samples, sample_size))

for i in range(num_samples):
  input[i] = source[-(i+1) * sample_size - 2 : -i * sample_size - 2]
  output[i] = source[-(i+1) * sample_size - 1 : -i * sample_size - 1]
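
The slicing above can be hard to read, so here is a toy version with a 12-element series and a sample size of 5. The ```toy_``` names are hypothetical, and the values are chosen so that each element equals its own index. Two trailing elements are reserved because the slices stop 2 positions before the end of the series.

```python
import numpy as np

# Toy series 0..11, so each value equals its own index
toy_source = np.arange(12, dtype=float)
toy_sample_size = 5

# Reserve the 2 trailing elements that the slices never reach
toy_num_samples = (toy_source.shape[0] - 2) // toy_sample_size

toy_input = np.zeros((toy_num_samples, toy_sample_size))
toy_output = np.zeros((toy_num_samples, toy_sample_size))

# Same slicing pattern as the notebook, counting windows from the end
for i in range(toy_num_samples):
    toy_input[i] = toy_source[-(i + 1) * toy_sample_size - 2 : -i * toy_sample_size - 2]
    toy_output[i] = toy_source[-(i + 1) * toy_sample_size - 1 : -i * toy_sample_size - 1]

print(toy_input)
print(toy_output)
```

Here ```toy_input[0]``` holds the values 5 to 9 and ```toy_output[0]``` holds 6 to 10, so every target is the input shifted by one step.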

Since we are using PyTorch, we convert ```input``` and ```output``` into tensors. The \_t suffix identifies a variable that is a tensor.

In [None]:
input_t = torch.from_numpy(input)
target_t = torch.from_numpy(output)
print(input_t.shape)
print(target_t.shape)

### Build and train the model

Next, we define our model as a class derived from the base class ```nn.Module```. Our model contains one hidden LSTM layer of size ```hidden_layer_size```, implemented with ```nn.LSTMCell```. The output of the LSTM layer is fed into a linear layer that combines all components into a single output value.

The hidden layer needs a pair of variables to hold its internal state (```h_t``` and ```c_t```). The LSTM cell uses them to keep track of its "memory" while the ```forward``` method steps through the sequence. Notice how the internal state is reset to zeros at the beginning of each call to ```forward```, which during training happens once per epoch.

Also notice the ```future``` parameter, which controls whether the model also predicts beyond the end of the input. Its value is 0 during training and is set to the number of days to forecast when the trained model is called to make a prediction.

In [None]:
# The number of nodes in the hidden layer
hidden_layer_size = 150


class LSTMPredictor(nn.Module):
    def __init__(self):
        super(LSTMPredictor, self).__init__()
        self.lstm = nn.LSTMCell(1, hidden_layer_size)
        self.linear = nn.Linear(hidden_layer_size, 1)

    def forward(self, input, future = 0):
        outputs = []
        h_t = torch.zeros(input.size(0), hidden_layer_size, dtype=torch.double)
        c_t = torch.zeros(input.size(0), hidden_layer_size, dtype=torch.double)

        for i, input_t in enumerate(input.chunk(input.size(1), dim=1)):
            h_t, c_t = self.lstm(input_t, (h_t, c_t))
            output = self.linear(h_t)
            outputs += [output]
        for i in range(future):  # predict the future by feeding each prediction back in as the next input
            h_t, c_t = self.lstm(output, (h_t, c_t))
            output = self.linear(h_t)
            outputs += [output]
        outputs = torch.stack(outputs, 1).squeeze(2)
        return outputs
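
The ```forward``` method above first walks through the observed inputs, then feeds its own latest prediction back in for ```future``` extra steps. Below is a framework-free sketch of that pattern. The names ```rollout``` and ```toy_step``` are hypothetical, and the simple smoothing step only stands in for the LSTM cell and linear layer; it is not what the model computes.

```python
def toy_step(x, state):
    """Hypothetical one-step predictor: blends the input with the state."""
    new_state = 0.5 * state + 0.5 * x
    return new_state, new_state  # (output, updated state)


def rollout(step, inputs, future=0):
    """Run `step` over the inputs, then feed predictions back `future` times."""
    outputs = []
    state = 0.0  # the state is reset on every call, like h_t and c_t
    output = None
    for x in inputs:
        output, state = step(x, state)
        outputs.append(output)
    for _ in range(future):
        # No observed input is left, so the last prediction becomes the input
        output, state = step(output, state)
        outputs.append(output)
    return outputs


result = rollout(toy_step, [1.0, 1.0, 1.0], future=2)
print(len(result))
```

The result has one output per observed input plus one per future step, which is why the prediction code later slices off the last ```days_to_predict``` values.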

Set the number of epochs, the learning rate, the loss function, and the optimizer that updates the network's weights during training.

In [None]:
# Increase the number of epochs for better results
epochs = 300
learning_rate = 0.04

# build the model
pred = LSTMPredictor()
pred.double()  # convert all parameters to float64, matching the dtype of input_t
criteria = nn.MSELoss()
optimizer = torch.optim.Adam(pred.parameters(), lr=learning_rate)

Perform the actual training on the model.

For each epoch, we will perform the following steps:

- Make a prediction using the ```input_t``` input tensor
- Calculate how far the predicted result is from the expected result (stored in the ```target_t``` tensor). The distance is given by the value of the loss function, which we also save.
- Zero out the gradients (PyTorch accumulates them, so we reset them on each epoch)
- Trigger backpropagation, which computes the gradient of the loss with respect to each weight of the network
- Step the optimizer, which uses those gradients to update the weights

In [None]:
losses = []

for epoch in np.arange(1, epochs + 1):
    
    if epoch%10 == 1:
        print('Starting epoch %s...' % (epoch))
    
    # Feed the input through the network
    out = pred(input_t)

    # Calculate loss tensor
    loss = criteria(out, target_t)
    losses += [loss.item()]
    if epoch%10 == 1:
        print('Current loss: %s' % (loss.item()))
    
    optimizer.zero_grad()
    
    # Trigger backpropagation
    loss.backward()
    # Update the weights using the computed gradients
    optimizer.step()

Display the evolution of the loss function. We expect the curve to flatten out after a few initial spikes.

In [None]:
fig = plt.figure(figsize=(30,10))
plt.title('The evolution of the loss function during training', fontsize=10)
plt.xlabel('Epoch', fontsize=20)
plt.ylabel('Loss (MSE)', fontsize=20)
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)

# Epochs are numbered from 1, matching the training loop
plt.plot(np.arange(1, epochs + 1), losses, 'r', linewidth=1.0)

display(fig)
plt.close()

### Predict the future

Once the training process is finished, we use the trained model to predict the values for the next 30 days. Since our sample size is ```sample_size```, we take the last ```sample_size``` elements from the original time series and feed them to the model.

Notice the ```with torch.no_grad()``` context manager, which tells PyTorch that this is not part of any training process, so there is no need to track gradients on the tensors involved.

In [None]:
# The model is trained, predict the next 30 days
days_to_predict = 30

# Get the tensor with the last sample_size values
final_input = torch.from_numpy(source[-sample_size:].reshape(1, sample_size))

# No need to track gradient anymore
with torch.no_grad():
    y_t = pred(final_input, future=days_to_predict)
    y = y_t.detach().numpy()
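
To see what ```torch.no_grad()``` changes, the short sketch below runs the same computation with and without it. The tensor ```w``` and the variable names are hypothetical and unrelated to the trained model.

```python
import torch

# A tensor that would normally take part in gradient tracking
w = torch.ones(3, requires_grad=True)

# Outside no_grad, the result is attached to the autograd graph
tracked = (w * 2).sum()

# Inside no_grad, the same computation is not tracked
with torch.no_grad():
    untracked = (w * 2).sum()

print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
```

Because nothing is tracked inside the block, the call to ```.detach()``` in the cell above is harmless but not strictly required.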

The result of the prediction will contain the predicted output corresponding to the input plus a number of elements equal to the number of future days we need prediction for.

We'll just take a look at the future values predicted.

In [None]:
future_predictions = y[0, - days_to_predict:]
print(future_predictions)
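
Because ```source``` was standardized with ```scale``` before training, the predicted values are in standardized units rather than in battery cycles. The following is a minimal sketch of mapping them back, assuming the mean and standard deviation of the unscaled series were kept; the notebook as written overwrites ```source``` and does not keep them. All names and values here are hypothetical.

```python
import numpy as np

# Hypothetical raw series in original units (cycles per day)
raw = np.array([0.8, 1.0, 1.2, 0.9, 1.1])

# Standardize by hand; sklearn.preprocessing.scale uses the same
# formula by default, with the population standard deviation (ddof=0)
mean, std = raw.mean(), raw.std()
scaled = (raw - mean) / std

# Pretend these are model outputs in standardized units
scaled_predictions = np.array([0.0, 0.5, -0.5])

# Map the predictions back to the original units
predictions = scaled_predictions * std + mean
print(predictions)
```

A standardized prediction of 0 maps back to the mean of the raw series, which is a quick sanity check for the inverse transform.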

Plot the last ```sample_size``` elements from the original time series in green and the predicted values for the next 30 days in red.

Note that the training data is synthetic and the target values were randomly generated around a mean, so you will observe that the predictions stay close to the mean of the dataset.

In [None]:
fig = plt.figure(figsize=(30,10))
plt.title('Predict future values \n(Red values are predicted values)', fontsize=10)
plt.xlabel('Day', fontsize=20)
plt.ylabel('Daily cycles used (scaled)', fontsize=20)
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)

plt.plot(np.arange(sample_size), source[-sample_size:], 'g', linewidth=1.0)
plt.plot(np.arange(sample_size, sample_size + days_to_predict), future_predictions, 'r', linewidth=1.0)

display(fig)
plt.close()

### Register model

In [None]:
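# Persist the whole model object to disk
torch.save(pred, './model.pt')

# Load model from disk
# Note: PyTorch 2.6 and later default to weights_only=True, in which case
# loading a fully pickled model needs torch.load('./model.pt', weights_only=False)
persisted_model = torch.load('./model.pt')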

# Check if persisted model works
with torch.no_grad():
    y_t_2 = persisted_model(final_input, future=days_to_predict)
    y_2 = y_t_2.detach().numpy()

y_2[0, - days_to_predict:]

In [None]:
from azureml.core import Model
from azureml.core.resource_configuration import ResourceConfiguration

# Register model
model = Model.register(workspace=workspace,
                       model_name='ForecastingModel',                # Name of the registered model in your workspace.
                       model_path='./model.pt',                      # Local file to upload and register as a model.
                       resource_configuration=ResourceConfiguration(cpu=1, memory_in_gb=0.5),
                       description='Car battery cycles forecaster',
                       model_framework=Model.Framework.PYTORCH,
                       tags={'type': 'forecasting'})

print('Name:', model.name)
print('Version:', model.version)

In [None]:
# Persist sample data as json

import json

with open('data.json', 'w', encoding='utf-8') as f:
    sample_data = source[-sample_size:].reshape(1, sample_size).tolist()
    json.dump(sample_data, f)

json.dumps(sample_data)
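
The JSON written above is a nested list with one row of ```sample_size``` values. The sketch below shows how a consumer might parse such a payload back into an array with the shape the model expects. The short ```sample``` list is a hypothetical stand-in for the real data.

```python
import json

import numpy as np

# Hypothetical short sample standing in for the last sample_size values
sample = [[0.1, -0.3, 0.25, 0.0]]

payload = json.dumps(sample)

# Parse the string back into a nested list, then into a float64 array
parsed = json.loads(payload)
batch = np.array(parsed, dtype=np.float64)

print(batch.shape)
```

The leading dimension of 1 is the batch size, matching the ```reshape(1, sample_size)``` used when the sample data was written.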