<h1><center>Recurrent Neural Network in PyTorch</center></h1>

Table of Contents: <a id=100></a>

1. [Packages](#1)
2. [Data definition](#2)
    - 2.1 [Declaring a tensor `x`](#3)
    - 2.2 [Creating a tensor `y` as a sin function of `x`](#4)
    - 2.3 [Plotting `y`](#5)
3. [Batching the data](#6)
    - 3.1 [Splitting the data in train/test set](#7)
    - 3.2 [Creating the batches of data](#8)
4. [Defining the model](#9)
    - 4.1 [Model class](#10)
    - 4.2 [Model instantiation](#11)
    - 4.3 [Training](#12)
5. [Alcohol Sales dataset](#13)
    - 5.1 [Loading and plotting](#14)
    - 5.2 [Prepare and normalize](#15)
    - 5.3 [Modelling](#16)
    - 5.4 [Predictions](#17)

Recurrent Neural Networks (RNNs) are a class of neural networks designed for sequential data. They can be used for text data, speech data, classification problems and generative models. Unlike feed-forward ANNs, an RNN bases each prediction on its previous output as well as the current input. RNNs are networks with loops in them, which allows information to persist from one step to the next.

Each node of an **RNN** takes 2 inputs:
1. Memory unit
2. Event unit

`M(t-1)` is the memory unit, i.e. the output of the previous step. `E(t)` is the current event, i.e. the information provided at the present time. `M(t)` is the output of the current node, i.e. the output at the present time in the sequence.
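
As a rough illustration of the recurrence described above, the sketch below updates a scalar memory from the previous memory and the current event. The `tanh` activation, the weights `w_m` and `w_e`, and the bias `b` are assumptions chosen for illustration; they are not taken from the LSTM trained later in this notebook.

```python
import math

def rnn_step(m_prev, e_t, w_m=0.5, w_e=1.0, b=0.0):
    """One step of a scalar vanilla RNN: M(t) = tanh(w_m*M(t-1) + w_e*E(t) + b)."""
    return math.tanh(w_m * m_prev + w_e * e_t + b)

def run_rnn(events, m0=0.0):
    """Feed a sequence of events through the cell, carrying the memory forward."""
    memory = m0
    outputs = []
    for e_t in events:
        memory = rnn_step(memory, e_t)   # the new memory depends on the old one
        outputs.append(memory)
    return outputs

# Only the first event is non-zero, yet later outputs are still non-zero
# because the memory from the first step is carried forward.
outputs = run_rnn([1.0, 0.0, 0.0])
print(outputs)
```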

### 1. Packages <a id=1></a>
[back to top](#100)

In [None]:
import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import matplotlib.pyplot as plt
%matplotlib inline

### 2. Data definition <a id=2></a>
[back to top](#100)

In this notebook, I'm going to train a very simple LSTM model, a type of RNN architecture, to do time series prediction. Given some input data, it should be able to generate a prediction for the next step. I'll be using a **sine** wave as an example, as its behaviour is very easy to visualise.


#### 2.1 Declaring a tensor `x` <a id=3></a>

In [None]:
x = torch.linspace(0,799,800)

#### 2.2 Creating a tensor `y` as a sin function of `x` <a id=4></a>

In [None]:
y = torch.sin(x*2*3.1416/40)

#### 2.3 Plotting `y` <a id=5></a>

In [None]:
plt.figure(figsize=(12,4))
plt.xlim(-10,801)
plt.grid(True)
plt.xlabel("x")
plt.ylabel("sin")
plt.title("Sin plot")
plt.plot(y.numpy(),color='#8000ff')
plt.show()

### 3. Batching the data <a id=6></a>
[back to top](#100)

#### 3.1 Splitting the data in train/test set <a id=7></a>

In [None]:
test_size = 40
train_set = y[:-test_size]
test_set = y[-test_size:]

##### 3.1.1 Plotting the training/testing set

In [None]:
plt.figure(figsize=(12,4))
plt.xlim(-10,801)
plt.grid(True)
plt.xlabel("x")
plt.ylabel("sin")
plt.title("Sin plot")
plt.plot(train_set.numpy(),color='#8000ff')
plt.plot(range(760,800),test_set.numpy(),color="#ff8000")
plt.show()

#### 3.2 Creating the batches of data <a id=8></a>

While working with LSTM models, we divide the training sequence into a series of overlapping windows. The label used for comparison is the next value in the sequence.

For example, if we have a series of 12 records and a window size of 3, we feed [x1, x2, x3] into the model and compare the prediction to `x4`. Then we backpropagate, update the parameters, feed [x2, x3, x4] into the model, and compare the prediction to `x5`. To ease this process, I'm defining a function `input_data(seq,ws)` that creates a list of (seq,labels) tuples. If `ws` is the window size, then the total number of (seq,labels) tuples will be `len(seq)-ws`.

In [None]:
def input_data(seq,ws):
    out = []
    L = len(seq)
    
    for i in range(L-ws):
        window = seq[i:i+ws]
        label = seq[i+ws:i+ws+1]
        out.append((window,label))
    
    return out
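
The 12-record example above can be checked with plain Python lists. The helper `make_windows` below is a hypothetical list-based stand-in that mirrors the slicing in `input_data`, and the integer series is made up for illustration.

```python
def make_windows(series, ws):
    """Return (window, label) pairs; the label is the single value after the window."""
    pairs = []
    for i in range(len(series) - ws):
        window = series[i:i + ws]
        label = series[i + ws:i + ws + 1]   # a one-element slice, as in input_data
        pairs.append((window, label))
    return pairs

series = list(range(1, 13))   # stand-in for x1..x12
pairs = make_windows(series, 3)

print(len(pairs))   # 12 - 3 = 9
print(pairs[0])     # ([1, 2, 3], [4])
print(pairs[-1])    # ([9, 10, 11], [12])
```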

##### 3.2.1 Calling the `input_data` function
The length of `x` = 800

The length of `train_set` = 800 - 40 = 760

The length of `train_data` = 760 - 40 = 720

In [None]:
window_size = 40
train_data = input_data(train_set, window_size)
len(train_data)

##### 3.2.2 Checking the 1st value from train_data

In [None]:
train_data[0]

### 4. Defining the model <a id=9></a>
[back to top](#100)

#### 4.1 Model Class <a id=10></a>

In [None]:
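class LSTM(nn.Module):
    
    def __init__(self,input_size = 1, hidden_size = 50, out_size = 1):
        super().__init__()
        self.hidden_size = hidden_size
        # LSTM layer followed by a linear layer mapping the hidden state to the output
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.linear = nn.Linear(hidden_size,out_size)
        # (h0, c0), each of shape (num_layers, batch, hidden_size)
        self.hidden = (torch.zeros(1,1,hidden_size),torch.zeros(1,1,hidden_size))
    
    def forward(self,seq):
        # reshape to (seq_len, batch=1, input_size), the layout nn.LSTM expects
        lstm_out, self.hidden = self.lstm(seq.view(len(seq),1,-1), self.hidden)
        pred = self.linear(lstm_out.view(len(seq),-1))
        # keep only the prediction for the last time step
        return pred[-1]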
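
The reshaping in `forward` follows from the default input layout of `nn.LSTM`, which is `(seq_len, batch, input_size)` when `batch_first` is left at `False`. The sketch below traces the tensor shapes for a window of 40 values, using the same layer sizes as the class above; the random input is only a placeholder for a real window.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

lstm = nn.LSTM(input_size=1, hidden_size=50)
linear = nn.Linear(50, 1)

seq = torch.randn(40)                        # placeholder window of 40 values
hidden = (torch.zeros(1, 1, 50),             # h0: (num_layers, batch, hidden_size)
          torch.zeros(1, 1, 50))             # c0: same shape as h0

lstm_in = seq.view(len(seq), 1, -1)          # (40, 1, 1): seq_len, batch, input_size
lstm_out, (h_n, c_n) = lstm(lstm_in, hidden)
pred = linear(lstm_out.view(len(seq), -1))   # one prediction per time step

print(lstm_in.shape)    # torch.Size([40, 1, 1])
print(lstm_out.shape)   # torch.Size([40, 1, 50])
print(pred.shape)       # torch.Size([40, 1])
print(pred[-1].shape)   # torch.Size([1]), the value returned by forward
```

Only `pred[-1]` is compared against the label, since the label is the single value that follows the window.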

#### 4.2 Model Instantiation <a id = 11></a>

In [None]:
torch.manual_seed(42)
model = LSTM()
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

##### 4.2.1 Printing the model

In [None]:
model

#### 4.3 Training <a id = 12></a>

During training, I'm visualising the predictions for the test range after every epoch. This gives a better understanding of how the model improves as training progresses. The original sequence is represented in <span style="color:#8000ff">purple</span> while the predicted sequence is represented in <span style="color:#ff8000">orange</span>.

In [None]:
epochs = 10
future = 40

for i in range(epochs):
    
    for seq, y_train in train_data:
        optimizer.zero_grad()
        model.hidden = (torch.zeros(1,1,model.hidden_size),
                       torch.zeros(1,1,model.hidden_size))
        
        y_pred = model(seq)
        loss = criterion(y_pred, y_train)
        loss.backward()
        optimizer.step()
        
    print(f"Epoch {i} Loss: {loss.item()}")
    
    preds = train_set[-window_size:].tolist()
    for f in range(future):
        seq = torch.FloatTensor(preds[-window_size:])
        with torch.no_grad():
            model.hidden = (torch.zeros(1,1,model.hidden_size),
                           torch.zeros(1,1,model.hidden_size))
            preds.append(model(seq).item())
        
    loss = criterion(torch.tensor(preds[-window_size:]), y[760:])
    print(f"Performance on test range: {loss}")
    
    plt.figure(figsize=(12,4))
    plt.xlim(700,801)
    plt.grid(True)
    plt.plot(y.numpy(),color='#8000ff')
    plt.plot(range(760,800),preds[window_size:],color='#ff8000')
    plt.show()
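
The inner forecasting loop feeds each prediction back in as input for the next step. A minimal sketch of that pattern follows, using a hypothetical `rolling_forecast` helper and a toy predictor (`linear_trend`, an assumption for illustration) in place of the trained LSTM.

```python
def rolling_forecast(predict, history, window_size, steps):
    """Forecast `steps` values, feeding each prediction back as input."""
    values = list(history[-window_size:])   # seed with the last known window
    for _ in range(steps):
        window = values[-window_size:]      # most recent window, may include predictions
        values.append(predict(window))
    return values[window_size:]             # drop the seed, keep only the forecasts

# Toy predictor (an assumption for illustration): continue a linear trend.
def linear_trend(window):
    return window[-1] + (window[-1] - window[-2])

forecast = rolling_forecast(linear_trend, [1.0, 2.0, 3.0, 4.0], window_size=3, steps=3)
print(forecast)   # [5.0, 6.0, 7.0]
```

Because later windows contain earlier predictions, any error made early on is carried into the following steps, which is one reason forecasts tend to drift the further ahead they go.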

### 5. Alcohol Sales dataset <a id=13></a>
[back to top](#100)

#### 5.1 Loading and plotting <a id=14></a>

##### 5.1.1 Importing the data

In [None]:
df = pd.read_csv("/kaggle/input/for-simple-exercises-time-series-forecasting/Alcohol_Sales.csv", index_col = 0, parse_dates = True)
df.head()

##### 5.1.2 Dropping the empty rows

In [None]:
df.dropna(inplace=True)
len(df)

##### 5.1.3 Plotting the Time Series Data

In [None]:
plt.figure(figsize = (12,4))
plt.title('Alcohol Sales')
plt.ylabel('Sales in million dollars')
plt.grid(True)
plt.autoscale(axis='x',tight=True)
plt.plot(df['S4248SM144NCEN'],color='#8000ff')
plt.show()

#### 5.2 Prepare and normalize <a id=15></a>

##### 5.2.1 Preparing the data

In [None]:
#extracting the time series values
y = df['S4248SM144NCEN'].values.astype(float) 

#defining a test size
test_size = 12

#create train and test splits
train_set = y[:-test_size]
test_set = y[-test_size:]
test_set

##### 5.2.2 Normalize the data

In [None]:
from sklearn.preprocessing import MinMaxScaler

# instantiate a scaler
scaler = MinMaxScaler(feature_range=(-1, 1))

# normalize the training set
train_norm = scaler.fit_transform(train_set.reshape(-1, 1))
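
With `feature_range=(-1, 1)`, min-max scaling maps the training minimum to -1 and the training maximum to 1. The sketch below reproduces that formula in plain Python on made-up numbers; the helper names are hypothetical, and it is meant to show the arithmetic rather than to replace `MinMaxScaler`.

```python
def fit_min_max(values):
    """Record the minimum and maximum of the training values."""
    return min(values), max(values)

def scale(values, lo, hi):
    """Map the interval [lo, hi] linearly onto [-1, 1]."""
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]

def unscale(scaled, lo, hi):
    """Invert `scale`, recovering the original units."""
    return [(s + 1) / 2 * (hi - lo) + lo for s in scaled]

train = [100.0, 150.0, 200.0]      # made-up training values
lo, hi = fit_min_max(train)        # statistics come from the training data only

print(scale(train, lo, hi))                # [-1.0, 0.0, 1.0]
print(scale([250.0], lo, hi))              # [2.0], unseen values can fall outside [-1, 1]
print(unscale([-1.0, 0.0, 1.0], lo, hi))   # [100.0, 150.0, 200.0]
```

Fitting on the training set alone keeps information about the test range out of the scaling step, which is why only `train_set` is passed to `fit_transform` above.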

##### 5.2.3 Prepare data for LSTM model

In [None]:
# convert train_norm to a tensor
train_norm = torch.FloatTensor(train_norm).view(-1)

# define a window size
window_size = 12
# define a function to create sequence/label tuples
def input_data(seq,ws):
    out = []
    L = len(seq)
    for i in range(L-ws):
        window = seq[i:i+ws]
        label = seq[i+ws:i+ws+1]
        out.append((window,label))
    return out

# apply input_data to train_norm
train_data = input_data(train_norm, window_size)
len(train_data)

##### 5.2.4 Printing the first tuple

In [None]:
train_data[0]

#### 5.3 Modelling <a id=16></a>

##### 5.3.1 Model definition

In [None]:
class LSTMnetwork(nn.Module):
    def __init__(self,input_size=1,hidden_size=100,output_size=1):
        super().__init__()
        self.hidden_size = hidden_size
        
        # add an LSTM layer:
        self.lstm = nn.LSTM(input_size,hidden_size)
        
        # add a fully-connected layer:
        self.linear = nn.Linear(hidden_size,output_size)
        
        # initializing h0 and c0:
        self.hidden = (torch.zeros(1,1,self.hidden_size),
                       torch.zeros(1,1,self.hidden_size))

    def forward(self,seq):
        lstm_out, self.hidden = self.lstm(
            seq.view(len(seq),1,-1), self.hidden)
        pred = self.linear(lstm_out.view(len(seq),-1))
        return pred[-1]

##### 5.3.2 Instantiation, loss and optimizer

In [None]:
torch.manual_seed(42)

# instantiate
model = LSTMnetwork()

# loss
criterion = nn.MSELoss()

#optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

model

##### 5.3.3 Training

In [None]:
epochs = 100

import time
start_time = time.time()

for epoch in range(epochs):
    for seq, y_train in train_data:
        optimizer.zero_grad()
        model.hidden = (torch.zeros(1,1,model.hidden_size),
                        torch.zeros(1,1,model.hidden_size))
        
        y_pred = model(seq)
        
        loss = criterion(y_pred, y_train)
        loss.backward()
        optimizer.step()
        
    print(f'Epoch: {epoch+1:2} Loss: {loss.item():10.8f}')
    
print(f'\nDuration: {time.time() - start_time:.0f} seconds')

#### 5.4 Predictions <a id=17></a>

##### 5.4.1 Test set predictions

In [None]:
future = 12

preds = train_norm[-window_size:].tolist()

model.eval()

for i in range(future):
    seq = torch.FloatTensor(preds[-window_size:])
    with torch.no_grad():
        model.hidden = (torch.zeros(1,1,model.hidden_size),
                        torch.zeros(1,1,model.hidden_size))
        preds.append(model(seq).item())
preds[window_size:]

##### 5.4.2 Original test set

In [None]:
df['S4248SM144NCEN'][-12:]

##### 5.4.3 Inverting the normalised values

In [None]:
true_predictions = scaler.inverse_transform(np.array(preds[window_size:]).reshape(-1, 1))
true_predictions

##### 5.4.4 Plotting

In [None]:
x = np.arange('2018-02-01', '2019-02-01', dtype='datetime64[M]').astype('datetime64[D]')
plt.figure(figsize=(12,4))
plt.title('Alcohol Sales')
plt.ylabel('Sales in million dollars')
plt.grid(True)
plt.autoscale(axis='x',tight=True)
plt.plot(df['S4248SM144NCEN'], color='#8000ff')
plt.plot(x,true_predictions, color='#ff8000')
plt.show()

##### 5.4.5 Zooming the test predictions

In [None]:
fig = plt.figure(figsize=(12,4))
plt.title('Alcohol Sales')
plt.ylabel('Sales in million dollars')
plt.grid(True)
plt.autoscale(axis='x',tight=True)
fig.autofmt_xdate()

plt.plot(df['S4248SM144NCEN']['2017-01-01':], color='#8000ff')
plt.plot(x,true_predictions, color='#ff8000')
plt.show()

[back to top](#100)