<h1 align = center>Implementation of an RNN Using TensorFlow/Keras</h1>

#### What Is an RNN?

A Recurrent Neural Network (RNN) is a type of artificial neural network designed to process sequential data, such as text, audio, and video. RNNs have been widely used in fields including natural language processing, speech recognition, and image and video recognition. This notebook trains a small RNN to predict the next word of a sentence.



<h2 align = center> Importing Necessary Libraries </h2>

In [50]:
import pandas as pd
import numpy as np
import math
import tensorflow

from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error

import matplotlib.pyplot as plt

<h2 align =center> Importing Data </h2>

In [51]:
text = ["Maybe life is random, but I doubt it.","It's takin' whatever comes your way, the good AND the bad, that give life flavor. It's all the stuff rolled together that makes life worth livin'.","The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' (I found it!) but 'That's funny ...'","My guess is that well over 80 percent of the human race goes without having a single original thought."]

<h2 align = center> Data Preprocessing </h2>

### Tokenization

In [52]:

tokenizer = Tokenizer()
tokenizer.fit_on_texts(text)
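To make the tokenization step concrete, here is a minimal pure-Python sketch that approximates what `Tokenizer` does with its default settings: lowercase the text, replace punctuation with spaces (the apostrophe is kept, so `it's` stays one token), and number the words from 1 in order of decreasing frequency. The helper `build_word_index` and the toy sentences are hypothetical names introduced only for illustration; the real class handles more options.

```python
from collections import Counter

# Characters the Keras Tokenizer filters out by default; the apostrophe is not among them.
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def build_word_index(texts):
    """Map each word to an integer, most frequent first, starting at 1."""
    table = str.maketrans({ch: " " for ch in FILTERS})
    counts = Counter()
    for sentence in texts:
        counts.update(sentence.lower().translate(table).split())
    # most_common() keeps first-seen order among words with equal counts
    ranked = [word for word, _ in counts.most_common()]
    return {word: i for i, word in enumerate(ranked, start=1)}


toy_texts = ["the cat sat", "the cat ran", "the dog"]
toy_index = build_word_index(toy_texts)
print(toy_index)
# {'the': 1, 'cat': 2, 'sat': 3, 'ran': 4, 'dog': 5}
```

Index 0 is never assigned to a word, which is why it can later serve as the padding value.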

### Creating Input Sequence

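#### What Is an Input Sequence in an RNN?

An input sequence is an ordered series of data points fed into a network such as a recurrent neural network (RNN), one element per time step. At each time step t, the RNN combines the current input with the hidden state it computed at step t-1, so the state produced at step t carries information forward to step t+1. In this notebook, each input sequence is a prefix of a sentence, expressed as token indices.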
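The recurrence can be sketched with a single-unit cell in plain Python. This is an illustrative toy, not the Keras implementation: the function name `simple_rnn_forward` and the weights are hypothetical, and `tanh` is used because it is the default activation of `SimpleRNN`.

```python
import math


def simple_rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit recurrent cell over a sequence and return every hidden state."""
    h = h0
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x_t + w_h * h + b)
        states.append(h)
    return states


# Hypothetical weights, chosen only for illustration.
states = simple_rnn_forward([1.0, 0.0, 0.0], w_x=0.5, w_h=0.8, b=0.0)
print(states)
```

Only the first input is non-zero, yet the later states stay above zero: the cell remembers the first input through its hidden state, with the memory fading at each step.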



In [53]:

sequences = tokenizer.texts_to_sequences(text)

# Generate every prefix of length >= 2 from each tokenized sentence;
# the last token of each prefix later becomes the label
input_seq = []
for seq in sequences:
    for i in range(1, len(seq)):
        n_gram_sequence = seq[:i+1]
        input_seq.append(n_gram_sequence)

input_seq


[[10, 3],
 [10, 3, 4],
 [10, 3, 4, 11],
 [10, 3, 4, 11, 5],
 [10, 3, 4, 11, 5, 6],
 [10, 3, 4, 11, 5, 6, 12],
 [10, 3, 4, 11, 5, 6, 12, 7],
 [8, 13],
 [8, 13, 14],
 [8, 13, 14, 15],
 [8, 13, 14, 15, 16],
 [8, 13, 14, 15, 16, 17],
 [8, 13, 14, 15, 16, 17, 1],
 [8, 13, 14, 15, 16, 17, 1, 18],
 [8, 13, 14, 15, 16, 17, 1, 18, 19],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21, 3],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21, 3, 22],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21, 3, 22, 8],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21, 3, 22, 8, 23],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21, 3, 22, 8, 23, 1],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21, 3, 22, 8, 23, 1, 24],
 [8, 13, 14, 15, 16, 17, 1, 18, 19, 1, 20, 2, 21, 3, 22, 8, 23, 1, 24, 25],
 [8, 13, 14, 15, 16, 17, 1, 18, 19,

### Padding Sequences

#### Why Is Padding Important for Sequences?

Padding brings all sequences to the same length. Keras trains on batches stored as rectangular arrays, so every sequence in a batch must have the same number of time steps; any sequence shorter than the maximum length is filled with zeros. With `padding='pre'` the zeros are added at the front, which keeps the last real token in the final column, where the input/output split below expects it.

In [54]:
max_seq_len = max(len(x) for x in input_seq)

# pad_sequences already returns a NumPy array
input_seq = pad_sequences(input_seq, maxlen=max_seq_len, padding='pre')

input_seq[0]

array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0, 10,  3], dtype=int32)
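As a rough illustration of what pre-padding does, the following pure-Python sketch left-pads short sequences with zeros and keeps only the last `maxlen` tokens of long ones, mirroring the default `pad_sequences` behaviour used above. The helper `pad_pre` is a hypothetical stand-in that returns plain lists rather than a NumPy array.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad (or left-truncate) each sequence to exactly maxlen items (maxlen > 0)."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep the last maxlen tokens
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


toy = [[10, 3], [10, 3, 4], [8, 13, 14, 15]]
print(pad_pre(toy, maxlen=4))
# [[0, 0, 10, 3], [0, 10, 3, 4], [8, 13, 14, 15]]
```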

### Splitting Data into Input and Output

We split each padded sequence into an input (X) containing every token except the last, and an output (y) containing only the last token, so that the model is trained to predict the next token from the preceding ones.

In [55]:
X = input_seq[:,:-1]
y = input_seq[:,-1]
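The same split on a tiny hand-written example may make the slicing easier to follow. The rows below are hypothetical pre-padded prefixes; list comprehensions stand in for the NumPy column slices used in the cell above.

```python
# Three pre-padded prefixes of one tokenized sentence (toy values)
padded = [
    [0, 0, 10, 3],
    [0, 10, 3, 4],
    [10, 3, 4, 11],
]

X_toy = [row[:-1] for row in padded]  # everything except the last token
y_toy = [row[-1] for row in padded]   # the last token is the label

print(X_toy)  # [[0, 0, 10], [0, 10, 3], [10, 3, 4]]
print(y_toy)  # [3, 4, 11]
```

Each label is the token that follows its input row in the original sentence.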


### Converting y to one-hot encoding

In [None]:
# Convert output labels to categorical format
num_classes = len(tokenizer.word_index) + 1
y = tensorflow.keras.utils.to_categorical(y, num_classes=num_classes)
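For intuition, one-hot encoding can be sketched in a few lines of plain Python. The helper `one_hot` is hypothetical and only mimics the shape of the result, with lists in place of a NumPy array. The `+ 1` in `num_classes` is there because word indices start at 1 and index 0 is taken by padding.

```python
def one_hot(labels, num_classes):
    """Turn integer labels into rows that are 1.0 at the label position and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


encoded = one_hot([3, 0, 1], num_classes=4)
print(encoded)
# [[0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
```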



### Define RNN Model

In [57]:
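model = tensorflow.keras.models.Sequential([
    # input_dim must cover every token index (vocabulary size + 1), not the sequence length
    keras.layers.Embedding(input_dim=num_classes, output_dim=100),
    # return_sequences=True passes the full sequence of states to the next recurrent layer
    keras.layers.SimpleRNN(units=300, activation='relu', return_sequences=True),
    keras.layers.SimpleRNN(units=100, activation='relu'),
    # One probability per vocabulary entry
    keras.layers.Dense(units=num_classes, activation='softmax')
])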

### Model Compilation

In [58]:
model.compile(optimizer='adamW', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
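Once the model has been built, the parameter counts reported by `model.summary()` can be checked by hand. The sketch below applies the standard formulas for embedding, simple recurrent, and dense layers to the layer sizes used here. The helper names are hypothetical, and `vocab_rows = 60` is a placeholder: substitute the `input_dim` of the embedding and `num_classes` from the notebook.

```python
def embedding_params(num_rows, dim):
    # One vector of size dim per row of the lookup table
    return num_rows * dim


def simple_rnn_params(input_dim, units):
    # Input kernel + recurrent kernel + bias
    return input_dim * units + units * units + units


def dense_params(input_dim, units):
    # Weight matrix + bias
    return input_dim * units + units


vocab_rows = 60  # hypothetical placeholder value
counts = {
    "embedding": embedding_params(vocab_rows, 100),
    "simple_rnn_1": simple_rnn_params(100, 300),
    "simple_rnn_2": simple_rnn_params(300, 100),
    "dense": dense_params(100, vocab_rows),
}
total = sum(counts.values())
print(counts, total)
```

The two recurrent layers do not depend on the vocabulary size, so their counts (120,300 and 40,100) should match the summary regardless of the text used.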

### Train the model


In [59]:
history = model.fit(X, y, epochs=40, batch_size=32)

Epoch 1/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 4s 436ms/step - accuracy: 0.0068 - loss: 4.1360
Epoch 2/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - accuracy: 0.0427 - loss: 4.1034
Epoch 3/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.0679 - loss: 4.0785
Epoch 4/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.0466 - loss: 4.0488
Epoch 5/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.0768 - loss: 3.9873
Epoch 6/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.0903 - loss: 3.9242
Epoch 7/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.1194 - loss: 3.8457
Epoch 8/40
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.0864 - loss: 3.6774
Epoch 9/40
3/3 ━━━━━━━━━━━━━━━━━━━━

### Making A Prediction From The Passage

In [67]:
number_of_words = 6


prediction = "My guess is that well over 80 percent of the human race goes"
seq = tokenizer.texts_to_sequences([prediction])[0]
predicted_words = []
for i in range(number_of_words):
    # Pad to the same length as the training inputs (max_seq_len - 1 tokens)
    padded_seq = pad_sequences([seq], maxlen=max_seq_len-1, padding='pre')
    predicted = model.predict(padded_seq)
    predicted_word_index = np.argmax(predicted)
    predicted_word = tokenizer.index_word.get(predicted_word_index,'Unknown')

    predicted_words.append(predicted_word)

    seq.append(predicted_word_index)

predicted_sentence = ' '.join(predicted_words)

print(f"Next {number_of_words} Words After '{prediction}' are '{predicted_sentence}'")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step
Next 6 Words After 'My guess is that well over 80 percent of the human race goes' are 'without having a single original thought'
