## GOOGLE STOCK PREDICTOR USING  (RNNs) for Stock Price Prediction

Predicting stock prices is a challenging task due to the dynamic and non-linear nature of financial markets. Traditional methods, such as statistical models, often struggle to capture the intricate patterns and dependencies in time series data. However, advancements in machine learning, particularly Recurrent Neural Networks (RNNs), have opened new avenues for making more accurate predictions by leveraging sequential data.

## What are Recurrent Neural Networks (RNNs)?

Recurrent Neural Networks (RNNs) are a class of artificial neural networks designed to recognize patterns in sequences of data, such as time series. Unlike traditional feedforward neural networks, RNNs have connections that form directed cycles, enabling them to maintain a memory of previous inputs. This characteristic makes RNNs particularly well-suited for tasks where context and temporal dynamics are crucial, such as stock price prediction.

## How RNNs Work

In an RNN, the hidden state is updated at each time step, allowing information to persist. This is achieved through the following process:
1. **Input Layer:** Receives the input data at each time step.
2. **Hidden Layer:** Maintains a hidden state that captures information from previous time steps. The hidden state is updated based on the current input and the previous hidden state.
3. **Output Layer:** Generates predictions based on the hidden state.

## Advantages of RNNs for Stock Price Prediction

RNNs offer several advantages for stock price prediction:
- **Temporal Dependency:** They can capture temporal dependencies in stock price movements, recognizing patterns over time.
- **Memory Retention:** RNNs maintain information from previous time steps, allowing them to understand the context of recent price trends.
- **Flexibility:** They can be adapted to various types of sequential data, including daily, hourly, or minute-by-minute stock prices.



In [2]:
#import libraries

In [1]:
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## data preprocessing

In [2]:
#load the data

In [6]:
train_df = pd.read_csv("training.csv")
train_df.head()

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume
0,2019-07-15,57.342999,57.541,56.970001,57.516998,57.451622,18076000
1,2019-07-16,57.299999,57.929001,57.25,57.679001,57.613438,24776000
2,2019-07-17,57.5485,57.917999,57.288502,57.317501,57.25235,23400000
3,2019-07-18,57.087002,57.380249,56.636501,57.316502,57.25135,25814000
4,2019-07-19,57.4095,57.556999,56.480999,56.505001,56.440773,32944000


In [8]:
train_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1006 entries, 0 to 1005
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Date       1006 non-null   object 
 1   Open       1006 non-null   float64
 2   High       1006 non-null   float64
 3   Low        1006 non-null   float64
 4   Close      1006 non-null   float64
 5   Adj Close  1006 non-null   float64
 6   Volume     1006 non-null   int64  
dtypes: float64(5), int64(1), object(1)
memory usage: 55.1+ KB


In [9]:
test_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 252 entries, 0 to 251
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Date       252 non-null    object 
 1   Open       252 non-null    float64
 2   High       252 non-null    float64
 3   Low        252 non-null    float64
 4   Close      252 non-null    float64
 5   Adj Close  252 non-null    float64
 6   Volume     252 non-null    int64  
dtypes: float64(5), int64(1), object(1)
memory usage: 13.9+ KB


## Feature Scaling

In [10]:
#picking the open column

In [12]:
training_set = train_df.iloc[:,1:2].values

In [13]:
training_set.shape, train_df.shape

((1006, 1), (1006, 7))

In [22]:
from sklearn.preprocessing import MinMaxScaler

## Feature scaling

### Normalization

In [14]:
from sklearn.preprocessing import MinMaxScaler

In [18]:
my_sc = MinMaxScaler(feature_range=(0,1))

In [19]:
scaled_train_set = my_sc.fit_transform(training_set)

In [20]:
scaled_train_set

array([[0.0456138 ],
       [0.04517962],
       [0.04768877],
       ...,
       [0.66887966],
       [0.64555529],
       [0.67120203]])

In [24]:
scaled = MinMaxScaler(feature_range=(0,1))

In [25]:
scaled_trainset = scaled.fit_transform(training_set)

In [27]:
training_set

array([[ 57.342999],
       [ 57.299999],
       [ 57.5485  ],
       ...,
       [119.07    ],
       [116.760002],
       [119.300003]])

In [28]:
#create a data structure with 70 timestamps

In [29]:
X_train =[]
y_train = []
for i in range(70, 1006):
    X_train.append(scaled_trainset[i-70:i,0])
    y_train.append(scaled_trainset[1,0])

In [30]:
#convert X_train and y_train into np array

In [31]:
X_train,y_train = np.array(X_train),np.array(y_train)

In [32]:
X_train

array([[0.0456138 , 0.04517962, 0.04768877, ..., 0.09815426, 0.09943154,
        0.0988257 ],
       [0.04517962, 0.04768877, 0.04302896, ..., 0.09943154, 0.0988257 ,
        0.09659931],
       [0.04768877, 0.04302896, 0.04628527, ..., 0.0988257 , 0.09659931,
        0.09382763],
       ...,
       [0.5066187 , 0.50379151, 0.49086719, ..., 0.6788758 , 0.68473215,
        0.68725643],
       [0.50379151, 0.49086719, 0.49359338, ..., 0.68473215, 0.68725643,
        0.66887966],
       [0.49086719, 0.49359338, 0.50328662, ..., 0.68725643, 0.66887966,
        0.64555529]])

In [34]:
X_train.shape

(936, 70)

In [35]:
#change shape
X_train = X_train.reshape(936,70,1)

### LSTM building

In [36]:
#initializing RNN

In [38]:
model = tf.keras.models.Sequential()

In [39]:
#first layer lstm

In [40]:
model.add(tf.keras.layers.LSTM(units=70, activation = 'relu', return_sequences=True,input_shape=(70,1)))

  super().__init__(**kwargs)


In [41]:
#drop layer.dropping 0.2 %neurons

In [42]:
model.add(tf.keras.layers.Dropout(0.2))

In [43]:
#second lstm layer

In [46]:
model.add(tf.keras.layers.LSTM(units = 70, activation='relu', return_sequences=True, input_shape=(70,1)))

In [47]:
#drop layer

In [48]:
model.add(tf.keras.layers.Dropout(0.2))

In [49]:
#third lstm layer

In [50]:
model.add(tf.keras.layers.LSTM(units = 90, activation='relu', return_sequences=True, input_shape=(70,1)))

In [51]:
#drop layer

In [52]:
model.add(tf.keras.layers.Dropout(0.2))

In [53]:
#fourth lstm layer

In [54]:
model.add(tf.keras.layers.LSTM(units = 120, activation='relu', input_shape=(70,1)))

In [55]:
model.add(tf.keras.layers.Dropout(0.2))

In [56]:
#output layer

In [57]:
model.add(tf.keras.layers.Dense(units=1))

In [58]:
model.summary()

In [59]:
#compiling the model.ie reconfigure the learning process
#use adam for optimization(stochastic gradient descent)

In [60]:
model.compile(optimizer='adam',loss = 'mean_squared_error')

In [61]:
#training the model

In [63]:
model.fit(X_train,y_train, batch_size=32,epochs=100)

Epoch 1/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 188ms/step - loss: 1.3831e-05
Epoch 2/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 182ms/step - loss: 1.1255e-05
Epoch 3/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 181ms/step - loss: 8.1151e-06
Epoch 4/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 186ms/step - loss: 7.0730e-06
Epoch 5/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 178ms/step - loss: 6.0589e-06
Epoch 6/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 184ms/step - loss: 6.0581e-06
Epoch 7/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 186ms/step - loss: 4.6327e-06
Epoch 8/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 186ms/step - loss: 4.3703e-06
Epoch 9/100
[1m30/30[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 181ms/step - loss: 4.9465e-06
Epoch 10/100
[1m30/30[0m [32m━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x15edb6455d0>

## Making predictions

In [64]:
test_df = pd.read_csv("testing.csv")
test_df.head()

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume
0,2023-07-13,121.540001,125.334999,121.059998,124.830002,124.68811,31535900
1,2023-07-14,125.129997,127.089996,124.900002,125.699997,125.557121,20482800
2,2023-07-17,126.059998,127.279999,124.5,125.059998,124.917847,20675300
3,2023-07-18,124.904999,124.989998,123.300003,124.080002,123.938965,21071200
4,2023-07-19,124.790001,125.470001,122.470001,122.779999,122.640442,22313800


In [65]:
test_df.shape

(252, 7)

In [67]:
test_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 252 entries, 0 to 251
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Date       252 non-null    object 
 1   Open       252 non-null    float64
 2   High       252 non-null    float64
 3   Low        252 non-null    float64
 4   Close      252 non-null    float64
 5   Adj Close  252 non-null    float64
 6   Volume     252 non-null    int64  
dtypes: float64(5), int64(1), object(1)
memory usage: 13.9+ KB


In [68]:
#select open column
stock_test = test_df.iloc[:, 1:2].values

In [69]:
stock_test.shape

(252, 1)

In [72]:
train_df.head(2)

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume
0,2019-07-15,57.342999,57.541,56.970001,57.516998,57.451622,18076000
1,2019-07-16,57.299999,57.929001,57.25,57.679001,57.613438,24776000


In [74]:
test_df.head(2)

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume
0,2023-07-13,121.540001,125.334999,121.059998,124.830002,124.68811,31535900
1,2023-07-14,125.129997,127.089996,124.900002,125.699997,125.557121,20482800


In [77]:
dataset = pd.concat([train_df["Open"], test_df["Open"]], axis=0)

In [78]:
#stock prices of previous 70 days

In [81]:
input = dataset[len(dataset)-len(test_df)-70:].values

  input = dataset[len(dataset)-len(test_df)-60:].values


In [82]:
#reshape into numpy array

In [83]:
inputs = input.reshape(-1,1)

In [84]:
#feature scaling

In [85]:
inputs = scaled.transform(inputs)

In [86]:
#test set

In [89]:
X_test = []
for i in range(70, 90):
    X_test.append(inputs[i-70:i, 0])
    

In [90]:
#convert to np array

In [91]:
X_test = np.array(X_test)

In [92]:
#conver to 3d

In [93]:
X_test = np.reshape(X_test,(X_test.shape[0],X_test.shape[1],1))

In [94]:
#getting predicted stock prices

In [95]:
#variable predicted prices

In [97]:
predicted_stocks = model.predict(X_test)
predicted_stocks = scaled.inverse_transform(predicted_stocks)

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 163ms/step


In [98]:
print(predicted_stocks[0])

[57.299965]
