# Recurrent neural network for Google Stock Price
### Importing the libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

from sklearn.preprocessing import MinMaxScaler

from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout

## Part 1 - Data Preprocessing
### Importing the training set

We import only the training set here to stress that the model is trained solely on this data. During training, the model has no access to the test set; as far as the model is concerned, the test set does not exist.

Once training is complete, we will introduce the test set to assess the model's performance by predicting stock prices it has never seen.

In [2]:
dataset_train = pd.read_csv('./data/Google_Stock_Price_Train.csv')


In [3]:
dataset_train.head()

       Date    Open    High     Low   Close    Volume
0  1/3/2012  325.25  332.83  324.97  663.59   7380500
1  1/4/2012  331.27  333.87  329.08  666.45   5749400
2  1/5/2012  329.83  330.75  326.89  657.21   6590300
3  1/6/2012  328.34  328.77  323.68  648.24   5405900
4  1/9/2012  322.04  322.29  309.46  620.76  11688800


Check for missing values in each column

In [4]:
missing_values = dataset_train.isnull().sum()
print("Missing values per column:")
print(missing_values)

Missing values per column:
Date      0
Open      0
High      0
Low       0
Close     0
Volume    0
dtype: int64


We don't have any missing values.

Now we define the input data for our model (the training set) by selecting the required column (Open) and converting it into a NumPy array.

In [5]:
training_set = dataset_train[['Open']].values

In [6]:
training_set

array([[325.25],
       [331.27],
       [329.83],
       ...,
       [793.7 ],
       [783.33],
       [782.75]])
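
The double brackets in `dataset_train[['Open']]` matter: they select a one-column DataFrame, so `.values` yields a 2D array of shape (n, 1), which is the layout scikit-learn scalers expect. A minimal sketch on a hypothetical three-row frame (the name `toy` is illustrative; its values are the first three Open prices shown above):

```python
import pandas as pd

# Hypothetical miniature of the training data (first three Open prices above).
toy = pd.DataFrame({"Open": [325.25, 331.27, 329.83]})

as_series = toy["Open"].values    # single brackets -> 1D array
as_frame = toy[["Open"]].values   # double brackets -> 2D column array

print(as_series.shape)  # (3,)
print(as_frame.shape)   # (3, 1)
```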

### Feature Scaling

Now, we are going to apply the appropriate feature scaling to our data to optimize the training process.

We have two possibilities:
- Standardization
- Normalization

I have chosen normalization because it is better suited to this context. Normalization is generally recommended when building an RNN, especially one that relies on sigmoid activations, as the gates of the LSTM layers used below do.

Normalization brings the inputs into a fixed range, here [0, 1], instead of raw prices in the hundreds. This matters for activation functions like sigmoid: inputs of large magnitude push the function into its flat regions, where gradients shrink toward zero (vanishing gradients) and learning through backpropagation slows down. When several input features are used, scaling also ensures that no feature dominates simply because of its larger scale.

In [7]:
scaler = MinMaxScaler(feature_range=(0,1))

In [8]:
training_set_scaled = scaler.fit_transform(training_set)

In [9]:
print(training_set_scaled)

[[0.08581368]
 [0.09701243]
 [0.09433366]
 ...
 [0.95725128]
 [0.93796041]
 [0.93688146]]
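
For reference, min-max normalization with the default range (0, 1) computes (x - min) / (max - min) per column. The sketch below reproduces that formula by hand on a toy array (the values are illustrative, not the real training set) and shows the inverse mapping, which is useful whenever scaled predictions have to be turned back into prices:

```python
import numpy as np

# Toy prices; illustrative values only, not the real training set.
prices = np.array([[10.0], [15.0], [20.0]])

x_min, x_max = prices.min(axis=0), prices.max(axis=0)

# Min-max normalization: maps the column minimum to 0 and the maximum to 1.
scaled = (prices - x_min) / (x_max - x_min)

# Inverse mapping: recovers the original prices from the scaled values.
restored = scaled * (x_max - x_min) + x_min

print(scaled.ravel())    # [0.  0.5 1. ]
print(restored.ravel())  # [10. 15. 20.]
```

With scikit-learn, the same two steps correspond to `scaler.fit_transform(...)` and `scaler.inverse_transform(...)`.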


### Create a specific data structure
Now we define the data structure that specifies what the RNN needs to remember when predicting the next stock price. Its key parameter is the number of timesteps, which determines how much past context the RNN considers for each prediction.

In this case, we have 60 timesteps and one output. This means that at each time t, the RNN looks at the 60 stock prices preceding t (the 60 previous financial days) and tries to predict the price at time t.

- X_train: The input for the RNN, consisting of the 60 previous stock prices.
- y_train: The output, representing the stock price for the next financial day.

In [10]:
X_train = []
y_train = []

nb_timesteps = 60

for i in range(nb_timesteps, len(training_set_scaled)):
    X_train.append(training_set_scaled[i-nb_timesteps:i, 0])
    y_train.append(training_set_scaled[i,0])

X_train, y_train = np.array(X_train), np.array(y_train) 

In [11]:
print(X_train)

[[0.08581368 0.09701243 0.09433366 ... 0.07846566 0.08034452 0.08497656]
 [0.09701243 0.09433366 0.09156187 ... 0.08034452 0.08497656 0.08627874]
 [0.09433366 0.09156187 0.07984225 ... 0.08497656 0.08627874 0.08471612]
 ...
 [0.92106928 0.92438053 0.93048218 ... 0.95475854 0.95204256 0.95163331]
 [0.92438053 0.93048218 0.9299055  ... 0.95204256 0.95163331 0.95725128]
 [0.93048218 0.9299055  0.93113327 ... 0.95163331 0.95725128 0.93796041]]


In [12]:
print(y_train)

[0.08627874 0.08471612 0.07454052 ... 0.95725128 0.93796041 0.93688146]
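
To make the windowing concrete, here is the same loop applied to a toy series of ten values with a window of 3 instead of 60. The names `series` and `window` are illustrative stand-ins for `training_set_scaled` and `nb_timesteps`:

```python
import numpy as np

series = np.arange(10, dtype=float)   # toy series: 0.0, 1.0, ..., 9.0
window = 3                            # stands in for nb_timesteps = 60

X, y = [], []
for i in range(window, len(series)):
    X.append(series[i - window:i])    # the `window` values before position i
    y.append(series[i])               # the value at position i is the target

X, y = np.array(X), np.array(y)

print(X.shape, y.shape)  # (7, 3) (7,)
print(X[0], y[0])        # [0. 1. 2.] 3.0
```

A series of length n therefore yields n - window training pairs, and consecutive rows of X overlap in all but one value, as in the printed `X_train` above.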


### Reshaping 
We now reshape the data to add a third dimension, the number of indicators per timestep, which also makes it possible to include more indicators later if desired.

Keras recurrent layers expect a 3D input tensor with dimensions (batch_size, timesteps, input_dim). Here the first dimension corresponds to the number of observations (windows) in the training set, not to the mini-batch size of 32 used later when fitting.


In [13]:
batch_size, timesteps = X_train.shape
input_dim = 1

X_train = np.reshape(X_train, (batch_size, timesteps, input_dim))

Now we have the right structure expected for our RNN.
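
The effect of the reshape is easiest to see on a toy array. The sketch below also shows how a second indicator could be added along the last axis; the arrays `open_windows` and `close_windows` are hypothetical placeholders, not data from this notebook:

```python
import numpy as np

X = np.zeros((7, 3))                       # toy: 7 windows of 3 timesteps
X_3d = np.reshape(X, (X.shape[0], X.shape[1], 1))
print(X_3d.shape)                          # (7, 3, 1)

# Equivalent way to append a trailing axis of size 1.
same = X[..., np.newaxis]
print(np.array_equal(X_3d, same))          # True

# Hypothetical: two indicators per timestep, stacked on the last axis.
open_windows = np.zeros((7, 3))
close_windows = np.ones((7, 3))
X_two = np.stack([open_windows, close_windows], axis=-1)
print(X_two.shape)                         # (7, 3, 2)
```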

## Part 2 - Building the RNN

Rather than a single LSTM layer, we will build a stacked LSTM with dropout regularization to reduce overfitting.

### Initialising the RNN
Initialising the RNN as a sequence of layers.

In [14]:
regressor = Sequential()

### Adding the first LSTM layer and some Dropout regularisation
As mentioned before, we will use dropout regularization. But what is it?

This is a technique used in neural network training to prevent overfitting and improve the model's generalization performance. During the training process, dropout randomly sets a fraction (rate) of the neurons in a layer to zero, effectively 'dropping out' those units. This means that the model trains on a reduced network for each batch, as different neurons are dropped out in each training iteration.

By doing this, dropout helps prevent the neural network from relying too heavily on a specific set of neurons and encourages the network to learn more robust and generalizable features. It essentially forces the model to learn redundant representations of information, reducing the risk of overfitting to the training data.
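
The mechanism can be sketched in a few lines of NumPy. This is an illustration of "inverted" dropout, not the Keras source: like the Keras `Dropout` layer, it rescales the surviving units by 1 / (1 - rate) during training so that the expected activation is unchanged, and it would simply be skipped at prediction time. The function name `dropout_train` is hypothetical:

```python
import numpy as np

def dropout_train(activations, rate, rng):
    """Zero a random fraction `rate` of the units and rescale the survivors."""
    keep_mask = rng.random(activations.shape) >= rate   # True = unit is kept
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones(10_000)
dropped = dropout_train(activations, rate=0.2, rng=rng)

print(np.mean(dropped == 0.0))  # fraction of dropped units, close to 0.2
print(dropped.mean())           # close to 1.0: expected activation preserved
```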

When stacking LSTM layers, every LSTM layer that feeds another LSTM layer must set 'return_sequences' to True, so that it passes on its output at every timestep rather than only at the last one. 'Units' is the number of LSTM units (memory cells) in this first LSTM layer.

In [15]:
regressor.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1],1)))
regressor.add(Dropout(rate=0.2))

### Adding other LSTM layers with Dropout regularisation

In [16]:
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(rate=0.2))

In [17]:
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(rate=0.2))

In [18]:
regressor.add(LSTM(units=50))
regressor.add(Dropout(rate=0.2))

### Adding the output layer

In [19]:
regressor.add(Dense(units=1))
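
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard LSTM parameterisation (four weight blocks, each with input weights, recurrent weights and a bias) and the default settings of the layers above; the helper names are hypothetical. The first three LSTM layers output a sequence of shape (batch, 60, 50), the last one outputs (batch, 50), and Dropout layers add no parameters.

```python
def lstm_params(input_dim, units):
    # 4 blocks (3 gates + candidate), each with input weights,
    # recurrent weights and one bias per unit.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # One weight per input per unit, plus one bias per unit.
    return input_dim * units + units

layer_params = [
    lstm_params(1, 50),    # first LSTM layer: 1 indicator per timestep
    lstm_params(50, 50),   # second LSTM layer: fed by 50 units
    lstm_params(50, 50),   # third LSTM layer
    lstm_params(50, 50),   # fourth LSTM layer
    dense_params(50, 1),   # output layer
]

print(layer_params)       # [10400, 20200, 20200, 20200, 51]
print(sum(layer_params))  # 71051
```

Calling `regressor.summary()` is the usual way to compare these figures against the model actually built.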

### Compiling the RNN

In [20]:
regressor.compile(optimizer="adam", loss="mean_squared_error")

### Fitting the RNN on the training set
An epoch is one full pass over the training set. With 85 epochs, the model sees the entire training set 85 times.

Batch size is the number of samples processed in a single training iteration. With a batch size of 32, the weights are updated once per group of 32 samples rather than after each individual sample, which makes training more efficient.
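
The two settings together determine how many weight updates take place. A small worked example, using a hypothetical sample count (in this notebook the real value would be `X_train.shape[0]`):

```python
import math

n_samples = 1000   # hypothetical; the real value would be X_train.shape[0]
batch_size = 32
epochs = 85

# The last batch of an epoch may hold fewer than 32 samples, hence the ceiling.
steps_per_epoch = math.ceil(n_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 32
print(total_updates)    # 2720
```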

In [21]:
regressor.fit(x=X_train, y=y_train, epochs=85, batch_size=32)

Epoch 1/85
...
Epoch 85/85

<keras.src.callbacks.History at 0x20e7da25fd0>

In this run, the fitting process appears to converge around the 80th epoch: the training loss levels off, suggesting that further epochs would not significantly improve the fit.
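
Rather than judging convergence by eye, the per-epoch losses can be checked programmatically. In Keras they are available as `history.history["loss"]` when the return value of `fit` is kept (for example `history = regressor.fit(...)`). The sketch below uses a synthetic loss curve and a hypothetical helper, `stopping_epoch`, that mimics the idea behind early stopping: stop once the loss has failed to improve by more than `min_delta` for `patience` consecutive epochs.

```python
def stopping_epoch(losses, min_delta=1e-4, patience=5):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            best = loss          # meaningful improvement: reset the counter
            wait = 0
        else:
            wait += 1            # no meaningful improvement this epoch
            if wait >= patience:
                return epoch
    return None

# Synthetic curve (not the notebook's losses): improves for 10 epochs, then flat.
losses = [1.0 / e for e in range(1, 11)] + [0.1] * 10
print(stopping_epoch(losses))  # 15
```

Keras offers the same idea as a ready-made callback, `keras.callbacks.EarlyStopping`, which can be passed to `fit` through its `callbacks` argument.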

## Part 3 - Making the predictions and visualising the results