![RNN.gif](attachment:4692c423-aa44-4a4e-83c0-77d12414c333.gif)

<p>The data here is presented in CSV format as follows: Date, Open, High, Low, Close, Volume, OpenInt.</p>

<p> What are we going to do in this notebook? </p>

<p>We will look at where RNNs are used and how to implement one for a stock price prediction task. </p>

<p> So, let's get started. </p>
    
<h2>Content :</h2>

<ul>
    <li> Load and Check Data </li>
    <li> Recurrent Neural Networks (RNN)
        <ul>
            <li> Why Recurrent Neural Networks </li>
            <li> What are the types of RNN
                <ul>
                    <li> One to One </li>
                    <li> One to Many </li>
                    <li> Many to One </li>
                    <li> Many to Many </li>
                </ul>
            </li>
        </ul>
    </li>
    <li> Data Preprocessing
        <ul>
            <li> Split the data as train and test </li>
            <li> Normalize Data
                <ul>
                    <li> Why do we normalize data? </li>
                </ul>
            </li>
            <li> X_train - y_train
                <ul>
                    <li> What is the steps logic? </li>
                </ul>
            </li>
            <li> Reshape
                <ul>
                    <li> Why do we reshape? </li>
                </ul>
            </li>
        </ul>
    </li>
    <li> Implementing with Keras
        <ul>
            <li> Create Model </li>
            <li> Compile Model </li>
            <li> Epochs and Batch Size </li>
            <li> Fit the model </li>
        </ul>
    </li>
    <li> Predict </li>
    <li> Evaluate the model </li>
</ul>


<h2 >Import Libraries </h2>

In [2]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

<h2>Load and Check Data </h2>

In [4]:
data = pd.read_csv("data.txt")

In [5]:
#Let's examine a few examples from our data.
data.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume,OpenInt
0,2010-07-21,24.333,24.333,23.946,23.946,43321,0
1,2010-07-22,24.644,24.644,24.362,24.487,18031,0
2,2010-07-23,24.759,24.759,24.314,24.507,8897,0
3,2010-07-26,24.624,24.624,24.449,24.595,19443,0
4,2010-07-27,24.477,24.517,24.431,24.517,8456,0


In [6]:
print("Data Shape -->", data.shape)

Data Shape --> (1565, 7)


In [7]:
data.describe()

Unnamed: 0,Open,High,Low,Close,Volume,OpenInt
count,1565.0,1565.0,1565.0,1565.0,1565.0,1565.0
mean,36.01455,36.13712,35.855319,35.987517,6452.979553,0.0
std,6.957747,7.002548,6.878264,6.933814,12047.101114,0.0
min,23.936,23.946,23.867,23.946,2.0,0.0
25%,29.829,29.966,29.819,29.862,529.0,0.0
50%,36.512,36.571,36.322,36.464,1559.0,0.0
75%,38.957,39.123,38.787,38.838,5993.0,0.0
max,58.62,58.72,57.7,58.43,106139.0,0.0


In [8]:
print("null column? \n", data.isna().sum())

null column? 
 Date       0
Open       0
High       0
Low        0
Close      0
Volume     0
OpenInt    0
dtype: int64


<h2 >Recurrent Neural Networks (RNN) </h2>


<h3>Why Recurrent Neural Networks </h3>

<p>A feedforward neural network: </p>

<ul> 
    <li>Cannot handle sequential data.</li>
    <li>Takes only the current input into account.</li>
    <li>Does not consider previous inputs, so it has no memory of them.</li>
</ul>

<p> RNNs address these problems. An RNN processes sequential data by combining the current input with its own output from the previous timestep. This internal memory lets it remember earlier inputs. </p>
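
<p> The recurrence described above can be sketched in plain NumPy. This is only an illustration of the idea, not the Keras implementation: the function name <code>simple_rnn_forward</code>, the weight names and the layer sizes are all assumptions chosen for the example. </p>

```python
import numpy as np

def simple_rnn_forward(inputs, W_x, W_h, b):
    """Run a basic RNN over one sequence.

    inputs has shape (timesteps, features).
    Returns the hidden state at every timestep, shape (timesteps, units).
    """
    units = W_h.shape[0]
    h = np.zeros(units)  # initial hidden state: nothing remembered yet
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous state.
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    return np.array(states)

rng = np.random.default_rng(0)
timesteps, features, units = 5, 1, 3  # illustrative sizes only

W_x = rng.normal(size=(features, units))  # input-to-hidden weights
W_h = rng.normal(size=(units, units))     # hidden-to-hidden weights
b = np.zeros(units)

sequence = rng.normal(size=(timesteps, features))
hidden_states = simple_rnn_forward(sequence, W_x, W_h, b)
print(hidden_states.shape)  # (5, 3)
```

<p> Because <code>h</code> is carried from one loop iteration to the next, the state at the last timestep depends on every earlier input. A feedforward layer has no such carried state. </p>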

<h3>What are the types of RNN </h3>

<ol> 
    <li> One to One </li>
    <li> One to Many </li>
    <li> Many to One </li>
    <li> Many to Many </li>
</ol>

<h4> One to One </h4>

![RNNEX.png](attachment:b3836f81-4faa-4c65-b922-8f88da75da70.png)

<p> This type can be called a vanilla neural network. It has a single input and a single output. It is often used for standard machine learning problems with a fixed-size input and output. </p>

<h4> One to Many </h4>

![OnetoMany.png](attachment:78913137-2b2c-48de-a6c8-248348238b10.png)

<p> This neural network has one input and multiple outputs. </p>

<p> Example : Image captioning. A picture is given as input and a sentence describing it is produced as output. </p>

<h4> Many to One </h4>

![manytoone.png](attachment:ea318661-d352-4401-a8ba-ecdf2dd6ca1b.png)

<p> This RNN takes multiple inputs and yields a single output. </p>

<p> Example : Sentiment analysis. The network takes a sentence as input and outputs the emotion it expresses, as in the picture above. </p>

<h4> Many to Many  </h4>

![manytomany.png](attachment:84edcfa2-e8d1-443b-b996-9694f8f2d7d8.png)

<p> This type of RNN takes many inputs and produces many outputs. </p>

<p> Example : An example is machine translation from one language to another. </p>

<h2> Data Preprocessing</h2>

<p> In this section, we will prepare the data for the RNN. </p>
<ul>
     <li >Split the data as train and test</li>
     <li >Normalize data.</li>
     <li >X_train - y_train </li>
    <li >Reshape</li>
</ul>

<h3> Split the data as train and test</h3>

<ul>
    <li>In this section, we will split the data into a training set and a test set. </li>
</ul>

In [None]:
training_size = int(len(data)*0.80)
data_len = len(data)

train, test = data[0:training_size],data[training_size:data_len]

In [None]:
print("total length of data --> ", data_len)
print("Train length --> ", len(train))
print("Test length --> ", len(test))

<h3> Normalize data </h3>

<ul>
    <li> In this section, we will normalize the data we have.</li>
</ul>

<a id ='15' ></a>
<p> Why do we normalize data? </p>

<ul>
    <li>Normalization is important in deep learning in general. </li>
    <li>Normalization puts the features on a consistent scale, which helps the model train more stably and predict more accurately. </li>
    <li>In this notebook, we train the model only on the "Open" feature, so it is sufficient to normalize that column. </li>
</ul>
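
<p> As a rough illustration of what min-max normalization to the range (0, 1) does, the sketch below applies the formula by hand to a few made-up prices. The values and variable names are assumptions for the example only; the notebook itself uses <code>MinMaxScaler</code>. </p>

```python
import numpy as np

# Made-up prices, used only to illustrate the formula.
prices = np.array([24.0, 30.0, 36.0, 58.0])

p_min, p_max = prices.min(), prices.max()

# Min-max scaling: the smallest value maps to 0 and the largest to 1.
scaled = (prices - p_min) / (p_max - p_min)

# Inverting the formula recovers the original prices.
restored = scaled * (p_max - p_min) + p_min

print(scaled)
print(restored)
```

<p> The minimum and maximum are taken from the training data only, and the same two numbers are reused later to scale the test inputs and to convert predictions back to prices. </p>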

In [None]:
# the part of data that we will use as training.
train = train.loc[:, ["Open"]].values

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)

In [None]:
train

<h3> X_train - y_train </h3>

<ul>
    <li>We will separate the normalized data into X_train and y_train. </li>
    <li>We will use windows of 40 steps: the model sees 40 consecutive values and predicts the 41st. The picture below visualizes this. </li>
</ul>

<p > What is the steps logic? </p>

![Steps.png](attachment:f039f066-9cc8-4f01-98ad-8b672763020e.png)

<ul>
    <li>Let's think of the number of steps as 3, not 40. </li>
    <li>We take the first 3 numbers as one training input, since our step count is 3. </li>
    <li>We try to predict the number that comes after these 3 numbers, so we store it in y_train. </li>
    <li>Then we slide the window forward by one position and repeat, as you can see in the picture. The real data is handled in exactly the same way. </li>
</ul>
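
<p> The 3-step case described above can be tried on a toy sequence. This is a minimal sketch: the helper name <code>make_windows</code> and the toy numbers are assumptions made for the illustration. </p>

```python
import numpy as np

def make_windows(series, steps):
    """Split a 1D series into (input window, next value) pairs."""
    X, y = [], []
    for i in range(steps, len(series)):
        X.append(series[i - steps:i])  # the `steps` values before position i
        y.append(series[i])            # the value to predict
    return np.array(X), np.array(y)

toy_series = np.array([1, 2, 3, 4, 5, 6])
X_toy, y_toy = make_windows(toy_series, steps=3)

print(X_toy)
# [[1 2 3]
#  [2 3 4]
#  [3 4 5]]
print(y_toy)
# [4 5 6]
```

<p> A series of length n yields n - steps pairs, because the first <code>steps</code> values have no complete window before them. </p>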

In [None]:
end_len = len(train_scaled)
X_train = []
y_train = []
timesteps = 40

for i in range(timesteps, end_len):
    X_train.append(train_scaled[i - timesteps:i, 0])
    y_train.append(train_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)

<h3> Reshape </h3>

<p > Why do we reshape? </p>

<ul>
    <li > Keras recurrent layers expect their input to have 3 dimensions. </li>
    <li >These 3 dimensions are: <ul>
         <li>Number of samples</li>
         <li>Number of steps</li>
         <li>Number of features</li>
         </ul> </li>
</ul>
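
<p> A small sketch of the reshape on toy data, assuming 4 samples of 3 steps each and a single feature. The array name and sizes are made up for the illustration. </p>

```python
import numpy as np

# 4 samples, each a window of 3 steps (toy values).
windows = np.arange(12, dtype=float).reshape(4, 3)

# Add a trailing axis of size 1: one feature per step.
windows_3d = windows.reshape(windows.shape[0], windows.shape[1], 1)

print(windows.shape, "->", windows_3d.shape)  # (4, 3) -> (4, 3, 1)
```

<p> The values themselves do not change. Only a feature axis of length 1 is added, because each step carries a single number (the opening price). </p>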

In [None]:
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
print("X_train --> ", X_train.shape)
print("y_train shape --> ", y_train.shape)

<h2>Implementing with Keras </h2>

<p> In this section, we create and fit the RNN model. </p>
<ul>
    <li>Create Model</li>
    <li>Compile Model</li>
    <li>Epochs and Batch Size</li>
    <li>Fit the model</li>
</ul>

<h3 > Create Model</h3>

<ul>
    <li>We are importing the libraries we will use for our model.</li>
        <li>Later, we will create our RNN model.</li>
</ul>

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import SimpleRNN
from tensorflow.keras.layers import Dropout

<ul>
    <li >units --> # of neurons, dimensionality of the output space.</li>
        <li>activation --> Activation function to use. Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (ie. "linear" activation: a(x) = x). </li>
            <li >return_sequences --> Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False. </li>
                <li  >inputs --> A 3D tensor, with shape [batch, timesteps, feature]. </li>
</ul>

<p > For a detailed reference --> <a href = "https://keras.io/api/layers/recurrent_layers/simple_rnn/" >https://keras.io/api/layers/recurrent_layers/simple_rnn/ </a> </p>

In [None]:
regressor = Sequential()

regressor.add(SimpleRNN(units = 50, activation = "tanh", return_sequences = True, input_shape = (X_train.shape[1],X_train.shape[2])))
regressor.add(Dropout(0.2))

regressor.add(SimpleRNN(units = 50, activation = "tanh", return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(SimpleRNN(units = 50, activation = "tanh", return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(SimpleRNN(units = 50))
regressor.add(Dropout(0.2))

regressor.add(Dense(units = 1))
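
<p> To get a feel for the size of this stack, the trainable parameters can be counted by hand. The sketch below assumes the layers keep their default bias terms; the helper names are made up for the illustration, and the result can be compared against <code>regressor.summary()</code>. </p>

```python
def simple_rnn_params(input_dim, units):
    # input weights + recurrent weights + biases
    return units * input_dim + units * units + units

def dense_params(input_dim, units):
    # weights + biases
    return input_dim * units + units

layer_counts = [
    simple_rnn_params(input_dim=1, units=50),   # first SimpleRNN sees 1 feature
    simple_rnn_params(input_dim=50, units=50),  # second SimpleRNN
    simple_rnn_params(input_dim=50, units=50),  # third SimpleRNN
    simple_rnn_params(input_dim=50, units=50),  # fourth SimpleRNN
    dense_params(input_dim=50, units=1),        # output layer
]
total_params = sum(layer_counts)

print(layer_counts)  # [2600, 5050, 5050, 5050, 51]
print(total_params)  # 17801
```

<p> Dropout layers add no trainable parameters. They only zero out a fraction of activations during training. </p>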

<h3 > Compile Model</h3>

<p > Now we need to compile our model.  </p>

<ul>
    <li>optimizer --> The optimizer updates the model's parameters during training in order to reduce the loss. Several optimizers exist, and you should choose the one best suited to the model. </li>
    <li>loss --> A number that measures how far the model's predictions are from the true values. The closer it is to 0, the smaller the error.</li>
</ul>
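
<p> As a small illustration of the loss, the sketch below computes the mean squared error by hand for two sets of made-up predictions. The function and the numbers are assumptions for the example, not output from the model. </p>

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

y_true = [0.2, 0.5, 0.8]        # made-up scaled targets
far_off = [0.6, 0.1, 0.4]       # every prediction is off by 0.4
close_by = [0.25, 0.45, 0.8]    # predictions near the targets

mse_far = mean_squared_error(y_true, far_off)
mse_close = mean_squared_error(y_true, close_by)

print(mse_far, mse_close)
```

<p> Squaring the differences penalizes large errors much more than small ones, so the first set of predictions gets a far higher loss than the second. </p>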

In [None]:
regressor.compile(optimizer= "adam", loss = "mean_squared_error")

<h3 > Epochs and Batch Size </h3> 

<ul>
    <li >Epochs : The number of full passes (forward and backward) over the entire training set.</li>
    <li>Batch Size : The number of training samples processed before the model's weights are updated. </li>
</ul>
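
<p> The two settings together determine how many weight updates happen during training. The arithmetic below is a sketch that assumes 1212 training samples (1252 training rows minus 40 timesteps) with a batch size of 20 and 20 epochs; the function name is made up for the illustration. </p>

```python
import math

def updates_per_epoch(n_samples, batch_size):
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(n_samples / batch_size)

n_samples = 1212  # assumption: 1252 training rows - 40 timesteps
batch_size = 20
epochs = 20

steps = updates_per_epoch(n_samples, batch_size)
total_updates = steps * epochs

print(steps)          # 61
print(total_updates)  # 1220
```

<p> A smaller batch size gives more updates per epoch, each computed from fewer samples. A larger batch size gives fewer updates per epoch. </p>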

In [None]:
epochs = 20 
batch_size = 20

<h3> Fit the model </h3> 

<ul>
    <li>We train the model we created above using our data. </li>
</ul>

In [None]:
regressor.fit(X_train, y_train, epochs = epochs, batch_size = batch_size)

<h3> Predict </h3> 

<ul>
    <li >We will make predictions using the model we have created.</li>
</ul>

In [None]:
test.head()

<ul>
    <li style = "color:darkblue;font-family:Segoe Print;font-weight:bold" > <p style = "color:black;font-family:Segoe Print;font-weight:bold" > We use the data we separated above as our test data. </p> </li>
</ul>

In [None]:
real_price = test.loc[:, ["Open"]].values
print("Real Price Shape --> ", real_price.shape)

<ul>
    <li >Since we use the "open" feature while training the model, we will use the same feature while testing. </li>
</ul>

In [None]:
# Take the last `timesteps` training values followed by all test values,
# so that the first test prediction has a full window of preceding prices.
inputs = data["Open"].iloc[len(data) - len(test) - timesteps:].values.reshape(-1, 1)  # 40 + 313 = 353 rows
inputs = scaler.transform(inputs)

<ul>
    <li >Now we build the 40-step input windows for the test data.</li>
</ul>

In [None]:
X_test = []

for i in range(timesteps, len(inputs)):
    X_test.append(inputs[i-timesteps:i, 0])
X_test = np.array(X_test)

print("X_test shape --> ", X_test.shape)

<ul>
    <li>We trained our model on windows with this number of steps, so we build the test data with the same number of steps. </li>
</ul>

In [None]:
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
predict = regressor.predict(X_test)
predict = scaler.inverse_transform(predict)

<ul>
    <li>With the test data prepared, the model can now make its predictions. </li>
    <li>inverse_transform --> Before training our model, we normalized our data. This call converts the predicted values back to the original price scale. </li>
</ul>

<h3  > Evaluate the model ❔</h3> 

<ul>
    <li> Finally, let's take a look at our results by comparing our predictions with the real data. </li>
</ul>

In [None]:
plt.plot(real_price, color = "red", label = "Real Stock Price")
plt.plot(predict, color = "black", label = "Predicted Stock Price")
plt.title("Stock Price Prediction")
plt.xlabel("Time")
plt.ylabel("Tesla Stock Price")
plt.legend()
plt.show()

<ul>
    <li >Our model made good predictions up to a certain point, but after that the gap between the predictions and the real data began to grow. </li>
</ul>
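
<p> One way to put a number on how the gap changes over time is to compute the mean absolute error separately for consecutive segments of the test period. The sketch below uses made-up arrays; the helper name <code>mae_by_segment</code> and the choice of 4 segments are assumptions for the illustration. </p>

```python
import numpy as np

def mae_by_segment(actual, predicted, n_segments=4):
    """Mean absolute error for consecutive chunks of the series."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    abs_err = np.abs(actual - predicted)
    return [float(seg.mean()) for seg in np.array_split(abs_err, n_segments)]

# Made-up prices in which the predictions drift away over time.
actual = np.array([30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0])
predicted = np.array([30.0, 31.0, 31.5, 32.5, 33.0, 34.0, 34.0, 35.0])

segment_mae = mae_by_segment(actual, predicted)
print(segment_mae)  # [0.0, 0.5, 1.0, 2.0]
```

<p> In the notebook, the same function could be called as <code>mae_by_segment(real_price, predict)</code>. Values that rise from the first segment to the last would confirm the pattern seen in the plot. </p>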