Hi there, and welcome to this kernel. It focuses on implementing Long Short-Term Memory (LSTM) networks, which are a type of Recurrent Neural Network (RNN). If you're new to RNNs and LSTMs, we suggest reading these first:

RNN -> https://en.wikipedia.org/wiki/Recurrent_neural_network

LSTM -> https://en.wikipedia.org/wiki/Long_short-term_memory

For now, let's begin by loading the libraries we'll need to carry out our tasks.

In [None]:
import numpy as np
import pandas as pd
from pandas.plotting import autocorrelation_plot as acp
import matplotlib.pyplot as plt
import plotly_express as px
%matplotlib inline
from sklearn.preprocessing import MinMaxScaler
import sklearn.metrics as mt
import math
import keras
from keras.layers import Dense,LSTM,Dropout
from keras.models import Sequential
df = pd.read_csv("../input/portland-oregon-average-monthly-.csv")
df.head()

# Data Cleaning
Our first step is to clean the data: correcting the data types of the columns and removing irrelevant items from the dataframe. We also rename the columns to make them easier to access.

In [None]:
df.columns = ['Month','Avg Ridership']

In [None]:
df.info()

On inspection, the Average Ridership column has dtype object when it should be integer. Before converting it, let's check whether the column contains any non-numeric entries.

In [None]:
df['Avg Ridership'].unique()

Listing the unique values of the Average Ridership column shows that ' n=114' is a non-numeric string, which needs to be removed.

In [None]:
df['Avg Ridership'] = df['Avg Ridership'].replace(' n=114',np.nan)
df = df.dropna()

We replace ' n=114' with NaN and simply drop that row from the dataframe. To make sure it was removed correctly, we list the unique values of the column again.

In [None]:
df = df.dropna()
df['Avg Ridership'].unique()

In [None]:
df['Avg Ridership'] = pd.to_numeric(df['Avg Ridership'])

With the stray entry gone, the column converts cleanly to a numeric type, which `df.info()` confirms.

In [None]:
df.info()
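The replace-then-drop steps above can also be written more compactly. The following is a minimal sketch on a toy dataframe, not the notebook's actual data: the `Month` values and the `toy` name are made up for illustration. It relies on `pd.to_numeric(..., errors="coerce")`, which turns any unparseable entry into NaN, so the offending string does not need to be named explicitly.

```python
import pandas as pd

# Toy frame standing in for the real data (values are illustrative only).
toy = pd.DataFrame({
    "Month": ["1960-01", "1960-02", "footer row"],
    "Avg Ridership": ["648", "646", " n=114"],
})

# Coerce non-numeric strings to NaN, then drop the rows that failed to parse.
toy["Avg Ridership"] = pd.to_numeric(toy["Avg Ridership"], errors="coerce")
toy = toy.dropna(subset=["Avg Ridership"])

# With no NaN left, the column can be cast to an integer type.
toy["Avg Ridership"] = toy["Avg Ridership"].astype(int)
print(toy)
```

One trade-off of this approach is that it silently discards every unparseable value, so it is worth inspecting `unique()` first, as done above, to know what is being dropped.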

# The Time Series

In [None]:
px.line(df,x='Month',y='Avg Ridership').show()

**Graph Description:** The plot above shows the time series. Plotly, which is used here, is a data visualization library built for making interactive charts. Feel free to hover over the line to see the individual observations.

# LSTM forecasting
Let's begin the forecasting process, which involves the following preprocessing steps:

1) Setting 'Month' Column as index as it's a time series dataset

2) Scaling the Average Ridership column with MinMaxScaler

3) Splitting the entire dataset into training and test sets: the training set is used to fit the LSTM model, and the test set is used to evaluate the model's performance.

In [None]:
df = df.set_index('Month')

In [None]:
s = MinMaxScaler(feature_range=(0,1))
DF = s.fit_transform(df)
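To make the scaling step concrete, here is a small sketch of the arithmetic that min-max scaling to the range (0, 1) performs, written in plain NumPy. The input values and variable names are hypothetical; the notebook itself uses `MinMaxScaler`, and this is only meant to illustrate the formula and its inverse.

```python
import numpy as np

# Hypothetical ridership values, one column like the notebook's dataframe.
values = np.array([[100.0], [150.0], [200.0]])

lo = values.min(axis=0)   # per-column minimum
hi = values.max(axis=0)   # per-column maximum

# Forward transform: map [lo, hi] onto [0, 1].
scaled = (values - lo) / (hi - lo)

# Inverse transform: map [0, 1] back to the original units.
restored = scaled * (hi - lo) + lo

print(scaled.ravel())
print(restored.ravel())
```

The inverse step matters later, when predictions are converted back to the original ridership units before the errors are computed.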

The data is split into training and test sets in a 66:34 ratio.

In [None]:
train_size = int(len(DF) * 0.66)
test_size = len(DF) - train_size
train, test = DF[0:train_size,:], DF[train_size:len(DF),:]
print(f'Training Size = {len(train)}, Testing Size = {len(test)}')

In [None]:
def create_dataset(S, look_back=1):
    dataX, dataY = [], []
    for i in range(len(S)-look_back-1):
        a = S[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(S[i + look_back, 0])
    return np.array(dataX), np.array(dataY)

`look_back` is the number of previous time steps used as input variables to predict the next time period.

In [None]:
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
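To see what `create_dataset` produces, it can help to run it on a tiny series. The sketch below copies the function as defined above and applies it to a made-up array of the values 0 to 5 with `look_back=2`; the `toy`, `X`, and `y` names are illustrative only.

```python
import numpy as np

def create_dataset(S, look_back=1):
    # Same logic as the notebook's function: sliding windows over column 0.
    dataX, dataY = [], []
    for i in range(len(S) - look_back - 1):
        dataX.append(S[i:(i + look_back), 0])   # look_back past values
        dataY.append(S[i + look_back, 0])       # the value that follows them
    return np.array(dataX), np.array(dataY)

# Toy single-column series: 0, 1, 2, 3, 4, 5
toy = np.arange(6, dtype=float).reshape(-1, 1)
X, y = create_dataset(toy, look_back=2)

print(X)  # each row is one input window
print(y)  # the target that follows each window
```

Because of the extra `- 1` in the loop bound, the last possible window (here `[3, 4]` with target `5`) is not generated. This is why the plotting offsets later in the kernel also carry a `+1` and a `-1`.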

In [None]:
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
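The reshape above is needed because Keras LSTM layers expect 3D input of the form (samples, time steps, features). This kernel treats each window as a single time step with `look_back` features. A minimal shape check on a made-up array (the `windows` name is hypothetical) looks like this:

```python
import numpy as np

# Three hypothetical windows, each holding look_back = 2 past values.
windows = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
print(windows.shape)  # 2D: (samples, look_back)

# Same reshape as in the notebook: (samples, 1 time step, look_back features).
reshaped = np.reshape(windows, (windows.shape[0], 1, windows.shape[1]))
print(reshaped.shape)
```

The resulting shape is what `input_shape=(1, look_back)` in the model definition refers to, with the samples dimension left implicit.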

# The Model Formation and Learning
The model consists of an LSTM layer with 128 units (with look_back set to 1), a Dropout layer with a rate of 0.2, and finally a Dense layer with a single node, since we're dealing with a regression problem. The loss is Mean Squared Error (MSE), with Adam as the optimizer. The test set is also passed as validation data during training.

In [None]:
model = Sequential()
model.add(LSTM(128, input_shape=(1, look_back)))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(trainX, trainY, epochs=100, batch_size=2,validation_data=(testX,testY), verbose=2)
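As a rough check on the model's size, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard Keras LSTM layout (four gates, each with input weights, recurrent weights, and a bias) and `look_back = 1`; the variable names are illustrative, and the result is best compared against `model.summary()` rather than taken as authoritative.

```python
units = 128      # LSTM units, as in the model above
input_dim = 1    # features per time step (look_back = 1)

# Each of the 4 gates has: input weights, recurrent weights, and a bias.
lstm_params = 4 * (units * (input_dim + units) + units)

# Dense(1): one weight per LSTM unit plus one bias. Dropout adds no parameters.
dense_params = units * 1 + 1

total_params = lstm_params + dense_params
print(lstm_params, dense_params, total_params)
```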

In [None]:
plt.plot(history.history['loss'], label='train')
plt.plot(history.history['val_loss'], label='test')
plt.legend()
plt.show()

**Graph Description:** Training and test loss per epoch, as shown in the plot above.

# Measuring Model's Performance
To measure the model's performance, we use Mean Squared Error (MSE) and Root Mean Squared Error (RMSE).

In [None]:
# Pass the arrays directly; wrapping them in a list is unnecessary
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# Convert predictions and targets back to their original units
trainPredict = s.inverse_transform(trainPredict)
trainY = s.inverse_transform([trainY])
testPredict = s.inverse_transform(testPredict)
testY = s.inverse_transform([testY])

trainScore = math.sqrt(mt.mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score = %.2f MSE' % mt.mean_squared_error(trainY[0],trainPredict[:,0]))
print('Train Score =  %.2f RMSE' % (trainScore))
testScore = math.sqrt(mt.mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score = %.2f MSE' % mt.mean_squared_error(testY[0],testPredict[:,0]))
print('Test Score = %.2f RMSE' % (testScore))
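For reference, both metrics are simple to compute by hand. The following sketch uses made-up true and predicted values (not results from this model) to show the arithmetic behind `mean_squared_error` and its square root.

```python
import math
import numpy as np

# Hypothetical values in original units, for illustration only.
y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 320.0])

# MSE: the mean of the squared differences.
mse = float(np.mean((y_true - y_pred) ** 2))

# RMSE: the square root of MSE, expressed in the same units as the data.
rmse = math.sqrt(mse)

print(f"MSE = {mse:.2f}, RMSE = {rmse:.2f}")
```

RMSE is usually the easier number to interpret, because it is in the same units as the ridership values.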

In [None]:
trainPredictPlot = np.empty_like(DF)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = np.empty_like(DF)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(DF)-1, :] = testPredict
# plot baseline and predictions
plt.plot(s.inverse_transform(DF))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

**Graph Description:** Blue represents the original series, orange represents the predictions on the training set, and green represents the predictions on the test set.
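The index offsets used to align the predictions with the original series can be checked on small numbers. This sketch assumes a hypothetical series of 10 points with a training size of 6 and `look_back = 1`, and fills in placeholder values instead of real predictions; all names here are illustrative.

```python
import numpy as np

n_total, train_size, look_back = 10, 6, 1

# create_dataset yields len(S) - look_back - 1 samples for each split.
n_train_pred = train_size - look_back - 1
n_test_pred = (n_total - train_size) - look_back - 1

# Training predictions start at index look_back (the first predicted point).
train_plot = np.full(n_total, np.nan)
train_plot[look_back:n_train_pred + look_back] = 1.0

# Test predictions start look_back points into the test split.
test_plot = np.full(n_total, np.nan)
test_start = n_train_pred + (look_back * 2) + 1
test_plot[test_start:n_total - 1] = 2.0

print(np.flatnonzero(~np.isnan(train_plot)))  # positions of train predictions
print(np.flatnonzero(~np.isnan(test_plot)))   # positions of test predictions
```

Here `test_start` works out to `train_size + look_back`, and the slice length matches the number of test predictions, which is why the assignment in the plotting cell does not raise a shape error.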

We hope you liked this kernel and that it helped you understand the concept of LSTM and its application in time series forecasting.