In [23]:
#Data Preprocessing
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
os.chdir('D:\\DL Colab Changes\\DL Colab Changes\\Recurrent_Neural_Networks 4')

In [24]:
#Import dataset
dataset_train = pd.read_csv('Google_Stock_Price_Train.csv')
training_set = dataset_train.iloc[:, 1:2].values  #Single column with opening stock prices

In [26]:
training_set.shape

(1258, 1)

In [27]:
#Feature Scaler
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler()
training_set_scaled = sc.fit_transform(training_set)

In [28]:
training_set_scaled.shape[0]

1258

In [29]:
#Create a data structure with 60 timesteps and one output
X_train =[]
y_train =[]
for i in range(60, training_set_scaled.shape[0]):
    X_train.append(training_set_scaled[i-60:i,0 ])
    y_train.append(training_set_scaled[i, 0])
    
X_train, y_train = np.array(X_train), np.array(y_train)

In [31]:
X_train.shape, y_train.shape

((1198, 60), (1198,))

In [34]:
#Reshaping to 3D: Keras recurrent layers expect (samples, timesteps, features)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

In [35]:
X_train.shape

(1198, 60, 1)

In [36]:
X_train.shape[1]

60

In [None]:
#Build RNN
from keras.models import Sequential 
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout

#Initialize RNN
regressor = Sequential()

#Adding the first layer and some dropout regularization
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))

#Adding the second layer and some dropout regularization
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

#Adding the third layer and some dropout regularization
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))


#Adding the fourth layer and some dropout regularization
regressor.add(LSTM(units = 50))  #Default return_sequences = False for the last LSTM layer
regressor.add(Dropout(0.2))

#Adding the output layer (fully connected network)
regressor.add(Dense(units = 1))

#Compiling RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

#Fitting RNN to training set
regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100

In [None]:
#Making predictions

#Getting the real stock price of 2017
dataset_test = pd.read_csv('Google_Stock_Price_Test.csv')
real_stock_price = dataset_test.iloc[:, 1:2].values

#Getting the predicted stock price
dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis = 0)  #pd.concat takes a sequence of objects
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60 :].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)

X_test = []
for i in range(60, len(inputs)):  #One 60-step window per test day
    X_test.append(inputs[i-60:i,0])
    
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))

predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)


In [None]:
#Visualizing results
plt.plot(real_stock_price, color = 'red', label = 'Real Google Stock Price')
plt.plot(predicted_stock_price, color = 'blue', label='Predicted Google Stock Price')
plt.title('Google stock price prediction')
plt.xlabel('Time')
plt.ylabel('Google Stock Price')
plt.legend()
plt.show()
                  

## Evaluating the RNN
Hi guys,

as seen in the practical lectures, the RNN we built was a regressor. Indeed, we were dealing with Regression because we were trying to predict a continuous outcome (the Google Stock Price). For Regression, the way to evaluate the model performance is with a metric called RMSE (Root Mean Squared Error). It is calculated as the root of the mean of the squared differences between the predictions and the real values.
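
Written out, the definition above reads as follows. The symbols are chosen here for illustration: $y_i$ is the real price on day $i$, $\hat{y}_i$ the prediction, and $n$ the number of test days.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}
```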

However, for our specific Stock Price Prediction problem, evaluating the model with the RMSE does not make much sense, since we are more interested in the directions taken by our predictions than in the closeness of their values to the real stock price. We want to check whether our predictions follow the same directions as the real stock price, and we don’t really care whether they are close to it. The predictions could indeed be close to the real stock price while often moving in the opposite direction.
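
One possible way to put a number on "following the same directions" is to compare the sign of each day-to-day move in the two series. This is only a sketch of one definition, not something from the lectures: the helper name `directional_accuracy` and the toy values are made up for illustration.

```python
import numpy as np

def directional_accuracy(real, predicted):
    """Fraction of day-to-day moves where both series move the same way."""
    real = np.asarray(real, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    # Sign of each consecutive change: +1 up, -1 down, 0 flat
    real_moves = np.sign(np.diff(real))
    predicted_moves = np.sign(np.diff(predicted))
    return float(np.mean(real_moves == predicted_moves))

# Toy values for illustration only (not taken from the dataset)
real_demo = [100.0, 102.0, 101.0, 105.0, 107.0]
predicted_demo = [99.0, 100.5, 101.0, 103.0, 102.0]
print(directional_accuracy(real_demo, predicted_demo))  # 0.5
```

With the notebook's variables, you would call it as `directional_accuracy(real_stock_price, predicted_stock_price)`.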

Nevertheless if you are interested in the code that computes the RMSE for our Stock Price Prediction problem, please find it just below:

import math
from sklearn.metrics import mean_squared_error
rmse = math.sqrt(mean_squared_error(real_stock_price, predicted_stock_price))

Then consider dividing this RMSE by the typical level of the Google Stock Price values in January 2017 (around 800) to get a relative error, as opposed to an absolute error. This is more relevant: for example, an RMSE of 50 would be very big if the stock price values were around 100, but very small if they were around 10000.
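
As a rough sketch of that division, the snippet below computes the RMSE with NumPy and divides it by a reference price level. The helper name `relative_rmse`, the `scale` argument and the toy prices are assumptions made here for illustration.

```python
import math
import numpy as np

def relative_rmse(real, predicted, scale):
    """RMSE divided by a reference price level, giving a unitless error."""
    real = np.asarray(real, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    rmse = math.sqrt(np.mean((predicted - real) ** 2))
    return rmse / scale

# Toy prices around 800, each prediction off by exactly 10 (illustration only)
real_demo = np.array([800.0, 810.0, 790.0, 805.0])
predicted_demo = real_demo + np.array([10.0, -10.0, 10.0, -10.0])
print(relative_rmse(real_demo, predicted_demo, scale=800.0))  # 0.0125
```

An absolute error of 10 on prices around 800 is therefore a relative error of about 1.25%.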

Enjoy Deep Learning!

## Improving the RNN
Hi guys,

here are different ways to improve the RNN model:

- Getting more training data: we trained our model on the past 5 years of the Google Stock Price but it would be even better to train it on the past 10 years.
- Increasing the number of timesteps: the model remembered the stock prices from the 60 previous financial days to predict the stock price of the next day. That’s because we chose a number of 60 timesteps (3 months). You could try to increase the number of timesteps, by choosing for example 120 timesteps (6 months).
- Adding some other indicators: if you have the financial instinct that the stock price of some other companies might be correlated to the one of Google, you could add this other stock price as a new indicator in the training data.
- Adding more LSTM layers: we built an RNN with four LSTM layers but you could try with even more.
- Adding more neurones in the LSTM layers: we highlighted the fact that we needed a high number of neurones in the LSTM layers to respond better to the complexity of the problem and we chose to include 50 neurones in each of our 4 LSTM layers. You could try an architecture with even more neurones in each of the 4 (or more) LSTM layers.

Enjoy Deep Learning!
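
To experiment with more timesteps or extra indicators, it can help to turn the windowing loop from the notebook into a function. The following is a minimal sketch under assumptions: the helper name `make_windows`, its `target_col` parameter and the random demo data are hypothetical and not part of the course code.

```python
import numpy as np

def make_windows(scaled, timesteps, target_col=0):
    """Build (samples, timesteps, features) inputs and next-step targets."""
    scaled = np.asarray(scaled, dtype=float)
    if scaled.ndim == 1:
        # A single indicator is treated as one feature column
        scaled = scaled.reshape(-1, 1)
    X, y = [], []
    for i in range(timesteps, scaled.shape[0]):
        X.append(scaled[i - timesteps:i, :])   # the previous `timesteps` rows
        y.append(scaled[i, target_col])        # the value to predict
    return np.array(X), np.array(y)

# Random stand-in for 300 days of two already-scaled indicators
demo = np.random.default_rng(0).random((300, 2))
X_demo, y_demo = make_windows(demo, timesteps=120)
print(X_demo.shape, y_demo.shape)  # (180, 120, 2) (180,)
```

With more than one indicator, the `input_shape` of the first LSTM layer would presumably become `(X_train.shape[1], X_train.shape[2])`. A scaler fitted on several columns can also no longer inverse-transform the single-column predictions directly, so a separate scaler for the target column is one option.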

## Tuning the RNN
Hi guys,

you can do some Parameter Tuning on the RNN model we implemented.

Remember, this time we are dealing with a Regression problem because we predict a continuous outcome (the Google Stock Price).

Parameter Tuning for Regression is the same as Parameter Tuning for Classification, which you learned in Part 1 - Artificial Neural Networks. The only difference is that you have to replace:

scoring = 'accuracy'  

by:

scoring = 'neg_mean_squared_error' 

in the GridSearchCV class parameters.
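
The sketch below shows where that argument goes. To keep it runnable without Keras, it tunes a scikit-learn `Ridge` regressor on synthetic data as a stand-in. With the RNN you would pass a scikit-learn-compatible wrapper around the model instead, as in Part 1. The `TimeSeriesSplit` cross-validation is an extra choice made here so that each fold validates on later samples than it trains on. It is not something the lectures prescribe.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic regression data for illustration only
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=200)

grid_search = GridSearchCV(
    estimator=Ridge(),                       # stand-in for the wrapped RNN
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
    scoring="neg_mean_squared_error",        # regression scoring instead of 'accuracy'
    cv=TimeSeriesSplit(n_splits=3),
)
grid_search.fit(X_demo, y_demo)

# The score is the negated MSE, so flip the sign before taking the root
best_rmse = np.sqrt(-grid_search.best_score_)
print(grid_search.best_params_, best_rmse)
```

Because scikit-learn always maximizes the score, the MSE is reported with a negative sign: a `best_score_` closer to zero means a better model.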

Enjoy Deep Learning!