<a href="https://colab.research.google.com/github/tejatanush/Stock_price_prediction/blob/main/Stock_price_prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Model Description:**

This notebook develops a stock price prediction model that uses time series data to capture temporal dependencies and forecast future prices. A deep learning network built from LSTM layers is trained on historical Netflix (NFLX) stock prices to predict the next Open, High, Low, and Close values from the previous 60 time steps.

# 1. Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

# 2. Import Dataset

This dataset was downloaded from Kaggle. It contains historical Netflix (NFLX) stock prices. This notebook uses the Date, Open, High, Low, and Close columns, and the model predicts the four price columns.

**Reference:**
https://www.kaggle.com/datasets/jainilcoder/netflix-stock-price-prediction

In [None]:
df=pd.read_csv("NFLX.csv")
df['Date']=pd.to_datetime(df['Date'])

Select the columns used in this notebook and convert them to a NumPy array.

In [None]:
data=df[['Date','Open','High','Low','Close']].values

# 3. Split into Training and Test set

In [None]:
train_size=int(len(data)*0.8)
train_data,test_data=data[:train_size],data[train_size:]
test_dates=df['Date'][train_size:]

# 4. Feature Scaling

In [None]:
from sklearn.preprocessing import MinMaxScaler
sc=MinMaxScaler(feature_range=(0,1))
scaled_train_data=sc.fit_transform(train_data[:,1:])
scaled_test_data=sc.transform(test_data[:,1:])

# 5. Create Dataset
The data must be arranged into sequences so that it can serve as input to a recurrent neural network. Each sample contains 60 consecutive time steps, and its target is the row that follows them.

In [None]:
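def create_dataset(data,time_step=1):
  # Build sliding windows: each X sample holds `time_step` consecutive rows,
  # and the matching Y is the row immediately after that window.
  X,Y=[],[]
  for i in range(len(data)-time_step-1):
    X.append(data[i:(i+time_step)])
    Y.append(data[i+time_step])
  return np.array(X),np.array(Y)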

time_step=60
X_train,Y_train=create_dataset(scaled_train_data,time_step)
X_test,Y_test=create_dataset(scaled_test_data,time_step)
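
To make the windowing concrete, the following sketch applies the same sliding-window logic to a toy array. The `toy` data and the window length of 3 are illustrative assumptions, not values from the notebook. Each input sample holds `time_step` consecutive rows, and its target is the row immediately after the window, i.e. row `i + time_step` of the original array.

```python
import numpy as np

def create_dataset(data, time_step=1):
    """Same windowing logic as the notebook cell above."""
    X, Y = [], []
    for i in range(len(data) - time_step - 1):
        X.append(data[i:i + time_step])   # rows i .. i+time_step-1
        Y.append(data[i + time_step])     # the row right after the window
    return np.array(X), np.array(Y)

# Toy data (assumption): 10 rows, 4 features; every value in row r equals r,
# so a target's value reveals which original row it came from.
toy = np.repeat(np.arange(10.0).reshape(-1, 1), 4, axis=1)

X_toy, Y_toy = create_dataset(toy, time_step=3)
print(X_toy.shape)   # (samples, time_step, features)
print(Y_toy.shape)   # (samples, features)
print(Y_toy[:, 0])   # original row index of each target
```

With 10 rows and a window of 3 this gives 6 samples whose targets come from rows 3 to 8. The `- 1` in the loop bound leaves the final row (row 9) unused as a target.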

# 6. Reshape X values in correct format

In [None]:
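# LSTM layers expect input of shape (samples, time_steps, features).
# X_train and X_test already have this shape, so these calls leave them unchanged.
X_train=X_train.reshape(X_train.shape[0],X_train.shape[1],X_train.shape[2])
X_test=X_test.reshape(X_test.shape[0],X_test.shape[1],X_test.shape[2])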

# 7. Build a Model

**Create model:** Stock prices form a sequence, so a recurrent neural network is a natural choice. This model uses LSTM (Long Short-Term Memory) layers, which are designed to retain information from earlier time steps. A final Dense layer with four units outputs the predicted Open, High, Low, and Close prices.

In [None]:
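from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,LSTM,Input
model=Sequential()
# Each sample has `time_step` rows of 4 features (Open, High, Low, Close)
model.add(Input(shape=(time_step,4)))
model.add(LSTM(units=50,return_sequences=True))
model.add(LSTM(units=50,return_sequences=False))
# One output per predicted price column
model.add(Dense(4))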


**Compile Model:** Use Adam as the optimizer and mean_squared_error as the loss function.

In [None]:
model.compile(optimizer='adam',loss='mean_squared_error')

**Fit Model:** Fit the model on X_train and Y_train, using X_test and Y_test as validation data. Train for 10 epochs with a batch size of 1.

In [None]:
model.fit(X_train,Y_train,validation_data=(X_test,Y_test),epochs=10,batch_size=1)

Epoch 1/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m14s[0m 10ms/step - loss: 0.0199 - val_loss: 0.0118
Epoch 2/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m14s[0m 7ms/step - loss: 0.0031 - val_loss: 0.0096
Epoch 3/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 7ms/step - loss: 0.0023 - val_loss: 0.0046
Epoch 4/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 6ms/step - loss: 0.0015 - val_loss: 0.0077
Epoch 5/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 8ms/step - loss: 0.0012 - val_loss: 0.0036
Epoch 6/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 6ms/step - loss: 0.0010 - val_loss: 0.0021
Epoch 7/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 8ms/step - loss: 9.1783e-04 - val_loss: 0.0023
Epoch 8/10
[1m746/746[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m9s[0m 7ms/step - loss: 9.3708e-04 - val_loss: 0.0031
Epoch 9/10
[1m746/746[0m [

<keras.src.callbacks.history.History at 0x7a814168c580>

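Training loss falls from 0.0199 in epoch 1 to roughly 0.0009 by epoch 8. Validation loss also falls, from 0.0118 to between 0.0021 and 0.0031 in the later epochs, although it fluctuates from epoch to epoch.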

# 8. Prediction

Predict on X_test and store the results in test_predict. The predictions are in normalized (scaled) form rather than real prices, so they must be inverse-transformed back to the original scale.

In [None]:
test_predict=model.predict(X_test)
test_predict=sc.inverse_transform(test_predict)

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step


Y_test is also in normalized form, so inverse-transform it to obtain the actual prices, stored as y_test.

In [None]:
y_test=sc.inverse_transform(Y_test)
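
As a rough illustration of why the inverse transform is needed, the sketch below reproduces min-max scaling to the range [0, 1] by hand with NumPy. The prices and the helper names `scale` and `unscale` are assumptions made up for illustration. Scaling maps each column with (x - min) / (max - min) using the training minimum and maximum, and the inverse maps scaled values back to price units.

```python
import numpy as np

# Hypothetical training and test prices (two columns, e.g. Open and Close).
train = np.array([[100.0, 102.0],
                  [110.0, 108.0],
                  [120.0, 118.0]])
test = np.array([[115.0, 110.0],
                 [130.0, 126.0]])

# Column-wise statistics come from the training data only.
col_min = train.min(axis=0)
col_range = train.max(axis=0) - col_min

def scale(x):
    """Min-max scale to [0, 1] using the training statistics."""
    return (x - col_min) / col_range

def unscale(x_scaled):
    """Map scaled values back to the original price units."""
    return x_scaled * col_range + col_min

scaled_test = scale(test)
restored = unscale(scaled_test)

print(scaled_test)                  # test values can exceed 1 if above the training max
print(np.allclose(restored, test))  # the round trip recovers the original prices
```

Because the statistics come from the training data, test prices above the training maximum scale to values greater than 1, and the inverse step still returns them to the correct price.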

In [None]:
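# Target i is row i+time_step of the test set, so the matching dates start at position time_step
test_dates=test_dates.iloc[time_step:time_step+len(y_test)].reset_index(drop=True)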

# 9. Accuracy

In [None]:
from sklearn.metrics import r2_score
r2_score(y_test,test_predict)

0.9369780949639537

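The R² score is about 0.937, so the model explains roughly 93.7% of the variance in the test-set prices. R² is a goodness-of-fit measure for regression, not a classification accuracy.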
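
For reference, R² can be computed directly from its definition, 1 - SS_res / SS_tot, for each output column and then averaged. The sketch below does this with NumPy on small made-up arrays (the values and the helper name `r2_per_column` are assumptions for illustration). The uniform average over columns is assumed here to correspond to the default multi-output behaviour of `r2_score`.

```python
import numpy as np

def r2_per_column(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, computed separately for each column."""
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted prices (two output columns).
y_true = np.array([[10.0, 20.0],
                   [12.0, 24.0],
                   [14.0, 28.0]])
y_pred = np.array([[11.0, 20.0],
                   [12.0, 24.0],
                   [13.0, 28.0]])

per_col = r2_per_column(y_true, y_pred)
print(per_col)         # one score per output column
print(per_col.mean())  # uniform average across columns
```

Looking at the per-column scores as well as the average can show whether one of the four prices is predicted noticeably worse than the others.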

# 10. Final Dataframe
Make a dataframe to compare all predicted and actual prices

In [None]:
result_df=pd.DataFrame({'Date':test_dates,'Actual_open':y_test[:,0],
                        'Actual_high':y_test[:,1],
                        'Actual_low':y_test[:,2],
                        'Actual_close':y_test[:,3],
                        'Predicted_open':test_predict[:,0],
                        'Predicted_high':test_predict[:,1],
                        'Predicted_low':test_predict[:,2],
                        'Predicted_close':test_predict[:,3]})
print(result_df)

          Date  Actual_open  Actual_high  Actual_low  Actual_close  \
0   2021-07-19   541.809998   544.059998  527.049988    530.309998   
1   2021-07-20   526.049988   534.909973  522.239990    532.280029   
2   2021-07-21   526.070007   536.640015  520.299988    531.049988   
3   2021-07-22   526.130005   530.989990  505.609985    513.630005   
4   2021-07-23   510.209991   513.679993  507.000000    511.769989   
..         ...          ...          ...         ...           ...   
136 2022-01-31   386.760010   387.000000  372.079987    384.359985   
137 2022-02-01   401.970001   427.700012  398.200012    427.140015   
138 2022-02-02   432.959991   458.480011  425.540009    457.130005   
139 2022-02-03   448.250000   451.980011  426.480011    429.480011   
140 2022-02-04   421.440002   429.260010  404.279999    405.600006   

     Predicted_open  Predicted_high  Predicted_low  Predicted_close  
0        550.912964      551.247192     538.386658       551.241699  
1        538.518738

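The predicted values track the actual values, though not exactly. In the first row, for example, the predicted open (550.91) is about 9 above the actual open (541.81), and the predicted close (551.24) is about 21 above the actual close (530.31).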
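
One way to quantify how close the predictions are is an error measure per column. The sketch below computes the mean absolute error and the mean absolute percentage error with NumPy, using only the first row of the table above, since that is the only row whose predicted values are fully visible. The variable names are assumptions for illustration.

```python
import numpy as np

columns = ["open", "high", "low", "close"]

# First row of the comparison table above (the only row fully visible there).
actual = np.array([[541.809998, 544.059998, 527.049988, 530.309998]])
predicted = np.array([[550.912964, 551.247192, 538.386658, 551.241699]])

# Mean absolute error per column, in price units.
mae = np.abs(predicted - actual).mean(axis=0)
# Mean absolute percentage error per column.
mape = (np.abs(predicted - actual) / actual).mean(axis=0) * 100

for name, a, p in zip(columns, mae, mape):
    print(f"{name}: MAE={a:.2f}, MAPE={p:.2f}%")
```

In the notebook, the same expressions could be applied to the full `y_test` and `test_predict` arrays to summarize the error over the whole test period.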