# Recurrent Neural Network



## Part 1 - Data Preprocessing

### Importing the libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

### Importing the training set

In [2]:
dataset_train = pd.read_csv('Google_Stock_Price_Train.csv')
training_set = dataset_train.iloc[:, 1:2].values

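Only the training set is imported here, and the RNN is trained on it alone.
The test set is imported after training.
Keras neural networks take NumPy arrays as input, which is why we call .values on the selection.

To select the column we cannot simply pass the index 1, because that would return a one-dimensional vector. We want a two-dimensional NumPy array with a single column, so the trick is to use the range 1:2 (the upper bound is excluded) together with .values.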
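The difference between the two selections is easiest to see from the resulting shapes. The snippet below is a small sketch on a made-up DataFrame (the column names and prices are placeholders, not values from the CSV):

```python
import pandas as pd

# Hypothetical stand-in for the stock CSV: made-up dates and prices.
df = pd.DataFrame({
    "Date": ["day 1", "day 2", "day 3"],
    "Open": [100.0, 101.5, 99.8],
})

as_vector = df.iloc[:, 1].values    # single index -> 1-D array
as_matrix = df.iloc[:, 1:2].values  # range 1:2    -> 2-D array with one column

print(as_vector.shape)  # (3,)
print(as_matrix.shape)  # (3, 1)
```

The (n, 1) shape is the one scikit-learn scalers accept, which is why the range form is used before feature scaling.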

### Feature Scaling

In [None]:
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)

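When building an RNN, normalization (min-max scaling) is the recommended form of feature scaling.
feature_range = (0, 1) because the normalization formula, (x - min) / (max - min), maps every value into the interval [0, 1].
It is recommended to keep the original dataset, so the scaled values are stored in a new variable, training_set_scaled.
fit_transform finds the min and max of the training set and then computes the scaled value of each observation.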
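As a rough illustration of what the scaler computes, here is the normalization formula written out by hand in NumPy on made-up prices. It is a sketch of the arithmetic, not scikit-learn's implementation:

```python
import numpy as np

prices = np.array([[10.0], [20.0], [40.0]])  # made-up values, shape (3, 1)

# "fit": find the min and max of each column
col_min = prices.min(axis=0)
col_max = prices.max(axis=0)

# "transform": apply (x - min) / (max - min)
scaled = (prices - col_min) / (col_max - col_min)

# the inverse mapping recovers the original prices
restored = scaled * (col_max - col_min) + col_min

print(scaled.ravel())    # smallest value maps to 0, largest to 1
print(restored.ravel())
```

The inverse mapping matters later in the notebook, where predictions made on the scaled data are converted back to prices.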

### Creating a data structure with 60 timesteps and 1 output

In [None]:
X_train = []
y_train = []
for i in range(60, 1258):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)

60 timesteps means that at each time T, the RNN looks at the 60 most recent stock prices, that is, the prices of the 60 days up to and including time T.
Based on the trends it captured during these 60 timesteps, it tries to predict the next output, the stock price at time T plus one.

In the loop above, each row of X_train holds the 60 prices at indices i-60 to i-1, from which the RNN tries to learn correlations and trends, and the matching entry of y_train is the price at index i that it should predict.
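The same windowing can be tried on a toy series to see exactly which values end up in each row. This sketch uses a window of 3 in place of 60 and the numbers 0 to 9 in place of scaled prices:

```python
import numpy as np

series = np.arange(10, dtype=float).reshape(-1, 1)  # toy "scaled prices": 0..9
window = 3                                          # stands in for the 60 timesteps

X, y = [], []
for i in range(window, len(series)):
    X.append(series[i - window:i, 0])  # the `window` values before index i
    y.append(series[i, 0])             # the value to predict
X, y = np.array(X), np.array(y)

print(X.shape, y.shape)  # (7, 3) (7,)
print(X[0], y[0])        # [0. 1. 2.] 3.0
```

A series of length n yields n - window samples, because the first `window` values have no complete history behind them.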

### Reshaping

In [None]:
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

Now we add a new dimension to this data structure, because the RNN we build in the second part expects a 3-D input: observations, timesteps, and indicators. Here there is a single indicator, so the last dimension is 1.
The reshape therefore serves two purposes: it leaves room for more indicators later, and it makes the data compatible with the input shape of the RNN.
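A quick way to check the effect of the reshape is to run it on a small placeholder array. The sketch below uses 7 observations and 3 timesteps, and also shows an equivalent spelling with np.newaxis:

```python
import numpy as np

X = np.zeros((7, 3))  # placeholder data: 7 observations, 3 timesteps

# same call as in the notebook: append a trailing axis of size 1 (one indicator)
X3 = np.reshape(X, (X.shape[0], X.shape[1], 1))

# equivalent alternative
X3_alt = X[..., np.newaxis]

print(X3.shape)      # (7, 3, 1)
print(X3_alt.shape)  # (7, 3, 1)
```

The values are unchanged; only the array's shape gains a trailing axis.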

## Part 2 - Building and Training the RNN

### Importing the Keras libraries and packages

In [None]:
from keras.models import Sequential
# Sequential creates a neural network object representing a sequence of layers

from keras.layers import Dense
# Dense class to add the output layer

from keras.layers import LSTM
# LSTM class to add the LSTM layers

from keras.layers import Dropout
# Dropout class to add some dropout regularization

In this part, we build the whole architecture of the neural network.
It is a robust architecture, because we do not stop at a single LSTM layer.
We build a stacked LSTM with dropout regularization to prevent overfitting.
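To give an idea of what a dropout layer does during training, here is a minimal NumPy sketch of inverted dropout with a rate of 0.2. The array of ones and the fixed seed are assumptions chosen for illustration; this is not the Keras implementation:

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed, only to make the sketch repeatable
rate = 0.2                        # fraction of units to drop
activations = np.ones((1000, 50)) # placeholder layer outputs

# keep each unit with probability 1 - rate
keep_mask = rng.random(activations.shape) >= rate

# zero the dropped units and rescale the kept ones so the expected sum is unchanged
dropped = activations * keep_mask / (1.0 - rate)

print(keep_mask.mean())  # close to 0.8
print(dropped.mean())    # close to 1.0
```

At prediction time no units are dropped, so the layer passes its input through unchanged.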

### Initialising the RNN

In [None]:
regressor = Sequential()

### Adding the first LSTM layer and some Dropout regularisation

In [None]:
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))

units - the number of LSTM cells (memory units, or neurons) in the layer.

return_sequences - set to True when building a stacked LSTM, that is, whenever another LSTM layer follows this one.
Leave it False (the default) when no further LSTM layer is added.

input_shape - the input is 3-D, corresponding to the observations, the timesteps, and the indicators.
In this third argument of the LSTM class we include only the last two, because
the first one is taken into account automatically.
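The effect of return_sequences on the data flowing through the four-layer stack built in this part can be traced with a few lines of plain Python. The helper below is hypothetical, written only for this illustration (it is not a Keras function), and it encodes the shape rule: with return_sequences = True the layer outputs one vector per timestep, otherwise only the last one.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Shape rule for an LSTM layer (hypothetical helper, not part of Keras)."""
    batch, timesteps, _features = input_shape
    if return_sequences:
        return (batch, timesteps, units)  # one output vector per timestep
    return (batch, units)                 # only the output of the last timestep

shape = (None, 60, 1)  # None stands for the number of observations
for return_sequences in (True, True, True, False):
    shape = lstm_output_shape(shape, units=50, return_sequences=return_sequences)
    print(shape)
# (None, 60, 50) three times, then (None, 50)
```

The Dropout layers in between do not change the shape. The last LSTM layer returns a 2-D output, which is what the Dense output layer then maps to a single value.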

### Adding a second LSTM layer and some Dropout regularisation

In [None]:
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

### Adding a third LSTM layer and some Dropout regularisation

In [None]:
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

### Adding a fourth LSTM layer and some Dropout regularisation

In [None]:
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

### Adding the output layer

In [None]:
regressor.add(Dense(units = 1))

### Compiling the RNN

In [None]:
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

### Fitting the RNN to the Training set

In [None]:
regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)

## Part 3 - Making the predictions and visualising the results

### Getting the real stock price of 2017

In [None]:
dataset_test = pd.read_csv('Google_Stock_Price_Test.csv')
real_stock_price = dataset_test.iloc[:, 1:2].values

### Getting the predicted stock price of 2017

In [None]:
dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values

inputs = inputs.reshape(-1,1)
# This gives the stock prices from three months before January 3rd
# up to the last stock price, arranged in rows with a single column.

inputs = sc.transform(inputs)
# transform, not fit_transform: our sc object was already fitted to the training set,
# and the scaling applied to these inputs must be the same scaling
# that was applied to the training set.

# Creating the 3D structure expected by the RNN
X_test = []
for i in range(60, 80):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
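The slicing arithmetic used to build inputs is easier to follow on toy numbers. In this sketch the training series has 10 values, the test series has 3, and the window is 4 (all three are made-up stand-ins for the real sizes):

```python
import numpy as np

train = np.arange(10.0)       # toy training prices: 0..9
test = np.arange(10.0, 13.0)  # toy test prices: 10, 11, 12
window = 4                    # stands in for the 60 timesteps

total = np.concatenate((train, test))

# keep the test period plus the `window` values that precede it
inputs = total[len(total) - len(test) - window:]

# one window per test day
X = np.array([inputs[i - window:i] for i in range(window, window + len(test))])

print(inputs)  # [ 6.  7.  8.  9. 10. 11. 12.]
print(X)       # rows: [6 7 8 9], [7 8 9 10], [8 9 10 11]
```

Each row holds the values immediately before the test day it predicts, so the first test day is predicted from training data only. With a window of 60 and 20 test days, the loop bounds become range(60, 80) as in the cell above.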

In [None]:
print(predicted_stock_price)

### Visualising the results

In [None]:
plt.plot(real_stock_price, color = 'red', label = 'Real Google Stock Price')
plt.plot(predicted_stock_price, color = 'blue', label = 'Predicted Google Stock Price')
plt.title('Google Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Google Stock Price')
plt.legend()
plt.show()