# RNN Implementation

In [22]:
!pip install --upgrade keras
!pip install --upgrade tensorflow

Collecting keras<2.16,>=2.15.0 (from tensorflow)
  Downloading keras-2.15.0-py3-none-any.whl (1.7 MB)
Installing collected packages: keras
  Attempting uninstall: keras
    Found existing installation: keras 3.0.5
    Uninstalling keras-3.0.5:
      Successfully uninstalled keras-3.0.5
Successfully installed keras-2.15.0


In [1]:
import numpy as np
import keras
from keras.datasets import reuters

## Load Dataset



1. Reuters - This dataset is a collection of newswire articles and their corresponding topics.
2.  The dataset is loaded with *reuters.load_data*, a utility function that Keras provides specifically for the Reuters dataset.
  *   The function returns two tuples: *(X_train, y_train)* and *(X_test, y_test)*.
  *   Each element of **X** is a list of word indices representing a document (newswire article), and each element of **y** is an integer representing the category or topic of the corresponding document.
  *   The **num_words** parameter in *reuters.load_data(num_words=None, test_split=0.2)* caps the vocabulary at the most frequent words; rarer words are replaced by an out-of-vocabulary index. With *num_words=None*, every word is kept.
  *   The **test_split** parameter sets the fraction of the data reserved for the test set (here 20%).
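
As a rough illustration of what **num_words** does, the sketch below mimics the capping on a toy list of index sequences. The helper name `cap_vocabulary` and the toy data are assumptions made for illustration, not part of Keras; the out-of-vocabulary marker of 2 matches the default `oov_char` of *reuters.load_data*.

```python
def cap_vocabulary(sequences, num_words=None, oov_char=2):
    """Replace every index >= num_words with oov_char (hypothetical helper)."""
    if num_words is None:
        # No cap: keep every index as it is
        return [list(seq) for seq in sequences]
    return [[w if w < num_words else oov_char for w in seq] for seq in sequences]

# Toy index sequences (invented for this example)
toy_sequences = [[1, 27595, 8, 43], [1, 5, 3095, 111]]
capped = cap_vocabulary(toy_sequences, num_words=100)
print(capped)  # [[1, 2, 8, 43], [1, 5, 2, 2]]
```

With *num_words=None*, as used in this notebook, the sequences come back unchanged.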





In [2]:
(X_train, y_train), (X_test, y_test) = reuters.load_data(num_words = None, test_split = 0.2)

## Visualizing the dataset

In [3]:
print(f"The shape of X_train is {X_train.shape} and the shape of X_test is {X_test.shape}")
print(f"The shape of y_train is {y_train.shape} and the shape of y_test is {y_test.shape}")

The shape of X_train is (8982,) and the shape of X_test is (2246,)
The shape of y_train is (8982,) and the shape of y_test is (2246,)


#### Explanation of output of X_train[0]

The output of *print(X_train[0])* is the first newswire article in the training set, represented as a list of word indices. Each index corresponds to a specific word in the vocabulary.

In [4]:
print(X_train[0])

[1, 27595, 28842, 8, 43, 10, 447, 5, 25, 207, 270, 5, 3095, 111, 16, 369, 186, 90, 67, 7, 89, 5, 19, 102, 6, 19, 124, 15, 90, 67, 84, 22, 482, 26, 7, 48, 4, 49, 8, 864, 39, 209, 154, 6, 151, 6, 83, 11, 15, 22, 155, 11, 15, 7, 48, 9, 4579, 1005, 504, 6, 258, 6, 272, 11, 15, 22, 134, 44, 11, 15, 16, 8, 197, 1245, 90, 67, 52, 29, 209, 30, 32, 132, 6, 109, 15, 17, 12]


#### Checking of word indices of the Reuters dataset


1.   word_indices is a dict with an example of output as {'mdbl': 10996, 'fawc': 16260, 'degussa': 12089, 'woods': 8803} and so on
2.   reverse_word_indices is a dict with an example of output as {10996: 'mdbl', 16260: 'fawc', 12089: 'degussa', 8803: 'woods'} and so on



In [5]:
word_indices = reuters.get_word_index()
# word_indices is the dictionary of word -> word_index pairs.
# Example of words and their respective indices in the Reuters dataset
for word, index in list(word_indices.items())[:10]:
    print(f"For Word: {word} we have Index: {index}")

For Word: mdbl we have Index: 10996
For Word: fawc we have Index: 16260
For Word: degussa we have Index: 12089
For Word: woods we have Index: 8803
For Word: hanging we have Index: 13796
For Word: localized we have Index: 20672
For Word: sation we have Index: 20673
For Word: chanthaburi we have Index: 20675
For Word: refunding we have Index: 10997
For Word: hermann we have Index: 8804


In [6]:
def decode_article_from_indices(sequence, word_index):
    # Create reverse word index dictionary (index -> word)
    reverse_word_indices = {value: key for key, value in word_index.items()}
    # Decode the article. Indices are offset by 3 because 0, 1 and 2 are
    # reserved for padding, start-of-sequence and unknown words.
    decoded_article = ' '.join([reverse_word_indices.get(i - 3, '?') for i in sequence])

    return decoded_article

In [7]:
decoded_article = decode_article_from_indices(X_train[0], word_indices)
print(f"The Decoded article - {decoded_article}")

The Decoded article - ? mcgrath rentcorp said as a result of its december acquisition of space co it expects earnings per share in 1987 of 1 15 to 1 30 dlrs per share up from 70 cts in 1986 the company said pretax net should rise to nine to 10 mln dlrs from six mln dlrs in 1986 and rental operation revenues to 19 to 22 mln dlrs from 12 5 mln dlrs it said cash flow per share this year should be 2 50 to three dlrs reuter 3
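
The `i - 3` offset in the decoder exists because *reuters.load_data* shifts every word index by 3 (its default `index_from=3`) and reserves the low indices for padding, start-of-sequence and out-of-vocabulary markers. That is also why the decoded article starts with `?`: the leading 1 is the start marker, not a word. A minimal sketch with a made-up three-word vocabulary (the `toy_word_index` dictionary is hypothetical) shows the mechanics:

```python
# Hypothetical word index in the style of reuters.get_word_index()
toy_word_index = {"the": 1, "of": 2, "said": 3}

# Encoded article: 1 is the start marker, real words are shifted by 3
encoded = [1, 4, 6, 5]

# Invert the mapping, then undo the offset while decoding
reverse_index = {index: word for word, index in toy_word_index.items()}
decoded = " ".join(reverse_index.get(i - 3, "?") for i in encoded)
print(decoded)  # ? the said of
```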


## Data preprocessing

### Text Data Preprocessing with Tokenization and Binary Vectorization using Keras Tokenizer

**Steps:**
1.   Importing `Tokenizer` from `tensorflow.keras.preprocessing.text`, a text preprocessing utility in Keras that is used to tokenize text (split it into words or subwords).
2.   **Initialize Tokenizer**: An instance of Tokenizer is created with num_words=100, so only word indices below 100 (the most frequent words) are represented in the output.
3. **Convert Sequences to Binary Matrix:**
      *   The *sequences_to_matrix method* of the Tokenizer object is used to convert the sequences of word indices *(X_train and X_test)* into binary matrices.
      *    In this mode *(mode='binary')*, the matrix representation is such that if a word from the vocabulary exists in a sequence, the corresponding entry in the matrix is set to 1; otherwise, it is set to 0.

By performing these steps, each document (news article) is represented as a fixed-length binary vector indicating the presence or absence of each of the 100 vocabulary words. This gives every document the same input size regardless of its length, at the cost of discarding word order and word counts.
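
To make the binary representation concrete, here is a small NumPy sketch of the same idea on toy data. It is a re-implementation for illustration only (the function name `binary_matrix` and the toy sequences are assumptions), not the Keras source.

```python
import numpy as np

def binary_matrix(sequences, num_words):
    """Mark which indices below num_words occur in each sequence."""
    matrix = np.zeros((len(sequences), num_words))
    for row, seq in enumerate(sequences):
        for index in seq:
            if index < num_words:  # indices outside the vocabulary are skipped
                matrix[row, index] = 1.0
    return matrix

# Toy index sequences (invented): note the repeated 3 and the out-of-range 9
toy_sequences = [[1, 3, 3, 7], [2, 9, 4]]
toy_matrix = binary_matrix(toy_sequences, num_words=8)
print(toy_matrix)
# [[0. 1. 0. 1. 0. 0. 0. 1.]
#  [0. 0. 1. 0. 1. 0. 0. 0.]]
```

The repeated index 3 in the first sequence still yields a single 1, and index 9 in the second sequence is dropped because it falls outside the 8-word vocabulary.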

In [8]:
from tensorflow.keras.preprocessing.text import Tokenizer

max_words = 100
tokenizer = Tokenizer(num_words=max_words)

If your dataset consists of raw text instead of word indices (unlike this dataset), you need to tokenize it first, for example with the tokenizer's *fit_on_texts* and *texts_to_sequences* methods, and then apply *sequences_to_matrix*.
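
For raw text, the extra step is building a word index and mapping each document to indices before vectorizing. The pure-Python sketch below only imitates that idea (frequency-ranked indices starting at 1) on two invented sentences; it is not the Keras implementation, and the variable names are assumptions.

```python
from collections import Counter

texts = ["oil prices rise", "oil output falls as prices rise"]  # invented examples

# Count words across all documents (lowercased, split on whitespace)
counts = Counter(word for text in texts for word in text.lower().split())

# Most frequent word gets index 1; ties keep first-seen order (sorted is stable)
ranked = sorted(counts.items(), key=lambda item: -item[1])
word_to_index = {word: rank for rank, (word, _) in enumerate(ranked, start=1)}

# Map every document to its list of word indices
sequences = [[word_to_index[w] for w in text.lower().split()] for text in texts]
print(word_to_index)
print(sequences)  # [[1, 2, 3], [1, 4, 5, 6, 2, 3]]
```

The resulting index lists have the same form as the Reuters data and could then be turned into a binary matrix.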

In [9]:
X_Train = tokenizer.sequences_to_matrix(X_train, mode = 'binary')
X_Test = tokenizer.sequences_to_matrix(X_test, mode = 'binary')

#### Visualizing the Tokenized Dataset

In [10]:
print(f"The shape of the Train tokenized dataset is {X_Train.shape} \nwhereas the shape of the non-tokenized dataset Train is {X_train.shape}\n")
print(f"The shape of the Test tokenized dataset is {X_Test.shape} \nwhereas the shape of the non-tokenized dataset Test is {X_test.shape}\n")

The shape of the Train tokenized dataset is (8982, 100) 
whereas the shape of the non-tokenized dataset Train is (8982,)

The shape of the Test tokenized dataset is (2246, 100) 
whereas the shape of the non-tokenized dataset Test is (2246,)



In [11]:
print(f"The non-tokenized train dataset is {X_train[0]}\n")
print(f"The type of the non-tokenized train dataset is {type(X_train[0])}\n")
print(f"The shape of the non-tokenized train dataset is {len(X_train[0])}\n")
print(f"The tokenized train dataset is {X_Train[0]}")
print(f"The type of the tokenized train dataset is {type(X_Train[0])}\n")
print(f"The shape of the tokenized train dataset is {X_Train[0].shape}\n")

The non-tokenized train dataset is [1, 27595, 28842, 8, 43, 10, 447, 5, 25, 207, 270, 5, 3095, 111, 16, 369, 186, 90, 67, 7, 89, 5, 19, 102, 6, 19, 124, 15, 90, 67, 84, 22, 482, 26, 7, 48, 4, 49, 8, 864, 39, 209, 154, 6, 151, 6, 83, 11, 15, 22, 155, 11, 15, 7, 48, 9, 4579, 1005, 504, 6, 258, 6, 272, 11, 15, 22, 134, 44, 11, 15, 16, 8, 197, 1245, 90, 67, 52, 29, 209, 30, 32, 132, 6, 109, 15, 17, 12]

The type of the non-tokenized train dataset is <class 'list'>

The shape of the non-tokenized train dataset is 87

The tokenized train dataset is [0. 1. 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 0. 1. 1. 1. 0. 1. 0. 0. 1. 0.
 0. 1. 1. 0. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 1. 0. 0. 0.
 1. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0.
 0. 0. 0. 0.]
The type of the tokenized train dataset is <class 'numpy.ndarray'>

The shape of the tokenized train dataset is (100,)



### Converting Class Labels to One-Hot Encoded Vectors for Multi-Class Classification using Keras

1. **Importing Necessary Function**: The code imports the `to_categorical` function from `keras.utils`. This function is used to convert class labels (integers) into one-hot encoded vectors.

2. **Defining Number of Classes**: `num_classes` is set to 46 because the Reuters dataset contains 46 different topics (classes).

3. **Converting Class Labels to One-Hot Encoded Vectors**: The `to_categorical` function is then used to convert the class labels (`y_train` and `y_test`) into one-hot encoded vectors. This means that each class label is converted into a binary vector where the index corresponding to the class is set to 1 and all other indices are set to 0.
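
The one-hot conversion can be sketched in plain NumPy as follows. The `one_hot` helper and the toy labels are assumptions for illustration; the notebook itself uses `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Row i of the identity matrix is the one-hot vector for class i."""
    return np.eye(num_classes)[np.asarray(labels)]

toy_labels = [3, 0, 2]
encoded = one_hot(toy_labels, num_classes=4)
print(encoded)
# [[0. 0. 0. 1.]
#  [1. 0. 0. 0.]
#  [0. 0. 1. 0.]]

# argmax along the class axis recovers the integer labels
recovered = np.argmax(encoded, axis=1)
print(recovered)  # [3 0 2]
```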

In [12]:
from keras.utils import to_categorical
num_classes = 46

In [13]:
y_Train = to_categorical(y_train,num_classes)
y_Test = to_categorical(y_test,num_classes)

#### Visualizing the categorical dataset

In [14]:
print(f"The non-categorized train dataset is {y_train[0]}\n")
print(f"The type of the non-categorized train dataset is {type(y_train[0])}\n")
print(f"The categorized train dataset is {y_Train[0]}\n")
print(f"The type of the categorized train dataset is {type(y_Train[0])}\n")
print(f"The shape of the total categorized train dataset is {y_Train.shape}\n")
print(f"The shape of the categorized train dataset is {y_Train[0].shape}\n")

The non-categorized train dataset is 3

The type of the non-categorized train dataset is <class 'numpy.int64'>

The categorized train dataset is [0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

The type of the categorized train dataset is <class 'numpy.ndarray'>

The shape of the total categorized train dataset is (8982, 46)

The shape of the categorized train dataset is (46,)



## RNN Proper

#### Importing Necessary Modules
   - `Sequential`: This is a linear stack of layers in the neural network model.
   - `Dense`: This is a fully connected layer.
   - `Dropout`: This is a regularization technique that randomly drops a fraction of input units during training to prevent overfitting.
   - `Activation`: This specifies the activation function to be applied to the output of the layers.
   - `SimpleRNN`: This is a fully connected recurrent layer whose output at each timestep is fed back as input to the next timestep.
   - `optimizers`: This provides optimization algorithms to train the neural network.
   - `pad_sequences`: This is used to pad sequences to the same length.

In [15]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, SimpleRNN
from keras import optimizers
from keras.preprocessing.sequence import pad_sequences

#### Padding Sequences
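   - `pad_sequences`: This function pads sequences so that they all have the same length. Here padding is added at the end of each sequence (`padding='post'`). Since every row of `X_Train` and `X_Test` already has length 100 after binary vectorization, no padding is actually added at this point.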

In [16]:
X_Train = pad_sequences(X_Train, padding='post')
X_Test = pad_sequences(X_Test, padding='post')
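
To see what post-padding does when sequences differ in length, here is a small pure-Python sketch. The `pad_post` helper and the ragged toy input are hypothetical and only imitate the behaviour of `padding='post'` with a fill value of 0.

```python
def pad_post(sequences, value=0):
    """Right-pad every sequence with `value` up to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [value] * (max_len - len(seq)) for seq in sequences]

ragged = [[5, 8], [3], [7, 1, 4]]  # invented sequences of unequal length
padded = pad_post(ragged)
print(padded)  # [[5, 8, 0], [3, 0, 0], [7, 1, 4]]
```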

#### Reshaping the data
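   The input data is reshaped from `(samples, 100)` to `(samples, 100, 1)`. RNN layers in Keras expect 3D input of shape `(batch, timesteps, features)`, so each of the 100 binary entries becomes one timestep with a single feature.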



In [17]:
X_Train = np.array(X_Train).reshape(X_Train.shape[0], X_Train.shape[1], 1)
X_Test = np.array(X_Test).reshape(X_Test.shape[0], X_Test.shape[1], 1)
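
The effect of this reshape is easy to check on a tiny array. The toy matrix below is invented; it stands in for the binary document vectors.

```python
import numpy as np

toy = np.array([[0., 1., 1., 0.],
                [1., 0., 0., 1.]])  # shape (2, 4): 2 samples, 4 entries each

# Add a trailing feature axis: (samples, timesteps, features)
toy_3d = toy.reshape(toy.shape[0], toy.shape[1], 1)
print(toy_3d.shape)  # (2, 4, 1)

# Indexing with np.newaxis gives the same result
assert np.array_equal(toy_3d, toy[..., np.newaxis])
```

The values are unchanged; only the array gains a last axis of size 1.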

#### Defining the RNN Model
   - The `basic_rnn` function defines a vanilla RNN model.
   - `Sequential()` initializes the model.
   - `model.add(SimpleRNN(50, input_shape=(X_Train.shape[1], X_Train.shape[2]), return_sequences=False))`: Adds a SimpleRNN layer with 50 units. The input shape is `(timesteps, features)`, here `(100, 1)`. `return_sequences=False` indicates that only the output at the final timestep is returned.
   - `model.add(Dense(num_classes))`: Adds a fully connected layer with `num_classes` units.
   - `model.add(Activation('softmax'))`: Applies the softmax activation function to the output layer, so the 46 outputs form a probability distribution over the classes.
   - `adam = optimizers.Adam(learning_rate=0.001)`: Initializes the Adam optimizer with a learning rate of 0.001.
   - `model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])`: Compiles the model with categorical cross-entropy loss function and the Adam optimizer. Accuracy is also specified as a metric.


In [23]:
def basic_rnn():
  model = Sequential()
  model.add(SimpleRNN(50, input_shape=(X_Train.shape[1], X_Train.shape[2]), return_sequences=False))
  model.add(Dense(num_classes))
  model.add(Activation('softmax'))

  adam = optimizers.Adam(learning_rate=0.001)
  model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
  return model
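
To show what this architecture computes, the following NumPy sketch runs a forward pass with the same layer sizes: a recurrent layer with a tanh activation (the SimpleRNN default), then a dense layer and softmax. The weights and inputs are random placeholders, and the helper names are assumptions; it illustrates the recurrence, not the trained Keras model.

```python
import numpy as np

def simple_rnn_last_state(x, W, U, b):
    """x has shape (batch, timesteps, features). Returns the final hidden state."""
    batch, timesteps, _ = x.shape
    h = np.zeros((batch, U.shape[0]))
    for t in range(timesteps):
        # h_t = tanh(x_t W + h_{t-1} U + b)
        h = np.tanh(x[:, t, :] @ W + h @ U + b)
    return h

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
units, features, classes = 50, 1, 46

# Random placeholder weights (a trained model would learn these)
W = rng.normal(scale=0.1, size=(features, units))  # input -> hidden
U = rng.normal(scale=0.1, size=(units, units))     # hidden -> hidden
b = np.zeros(units)
V = rng.normal(scale=0.1, size=(units, classes))   # hidden -> output
c = np.zeros(classes)

x = rng.integers(0, 2, size=(4, 100, 1)).astype(float)  # 4 fake documents
probs = softmax(simple_rnn_last_state(x, W, U, b) @ V + c)
print(probs.shape)  # (4, 46)

# Trainable parameters: recurrent layer plus dense layer
n_params = (W.size + U.size + b.size) + (V.size + c.size)
print(n_params)  # 4946
```

Each row of `probs` sums to 1, which is what `categorical_crossentropy` expects from the output layer.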

### Training Keras Models with scikit-learn Integration

1. **Importing `KerasClassifier`**: This code imports the `KerasClassifier` class from the `scikeras.wrappers` module. `KerasClassifier` allows using Keras models as scikit-learn estimators, enabling integration of Keras models with scikit-learn functionality such as grid search and cross-validation.

2. **Creating a KerasClassifier Instance**:
   - `model = KerasClassifier(build_fn=basic_rnn, epochs=50, batch_size=50)`: This line creates a `KerasClassifier` instance. It takes several arguments:
     - `build_fn`: A function that constructs and returns a Keras model. Here it is `basic_rnn`, defined above.
     - `epochs`: The number of epochs (iterations over the entire training dataset) to train the model.
     - `batch_size`: The number of samples to use in each training batch.

3. **Splitting off a Validation Set**:
   - `train_test_split(X_Train, y_Train, test_size=0.2, random_state=42)`: This line holds out 20% of the training data for validation; `random_state` fixes the shuffle so the split is reproducible. Note that it reassigns `X_train` and `y_train`, which previously held the raw index lists and integer labels.

4. **Training the Model**:
   - `model.fit(X_train, y_train, validation_data=(X_val, y_val))`: This line trains the model on the training split and evaluates it on the validation split after each epoch. The `fit` method adjusts the parameters of the model to minimize the specified loss function, here using the Adam optimizer.

In [47]:
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import train_test_split
# Create a KerasClassifier instance
model = KerasClassifier(build_fn=basic_rnn, epochs=50, batch_size=50)

# Split data into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X_Train, y_Train, test_size=0.2, random_state=42)

# Train the model
model.fit(X_train, y_train, validation_data=(X_val, y_val))


Epoch 1/50
...
Epoch 50/50
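
As a back-of-the-envelope check on how much training this run involves, the sketch below works out the number of gradient updates from the values used above. It assumes that `train_test_split` rounds the held-out share up to a whole number of rows, which is my understanding of scikit-learn's behaviour; the variable names are illustrative.

```python
import math

n_samples = 8982              # rows in X_Train
test_size = 0.2               # share held out for validation
epochs, batch_size = 50, 50

n_val = math.ceil(test_size * n_samples)   # validation rows (assumed rounded up)
n_fit = n_samples - n_val                  # rows actually used for training
steps_per_epoch = math.ceil(n_fit / batch_size)
total_steps = steps_per_epoch * epochs
print(n_fit, n_val, steps_per_epoch, total_steps)  # 7185 1797 144 7200
```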


## Accuracy Calculation of the model

In [48]:
from sklearn.metrics import accuracy_score

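# Predict on the held-out test set
y_pred = model.predict(X_Test)
# Integer class labels recovered from the one-hot test targets
y_test_ = np.argmax(y_Test,axis=1)

# accuracy_score expects (y_true, y_pred)
print(accuracy_score(y_Test, y_pred))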


0.5440783615316117
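
For reference, multi-class accuracy can also be computed by hand by comparing the predicted class with the true class for each sample. The sketch below does this on invented data: `y_true_onehot` and `y_prob` are toy arrays, not outputs of the model above.

```python
import numpy as np

# Toy one-hot targets and predicted class probabilities (invented)
y_true_onehot = np.array([[0, 0, 1],
                          [1, 0, 0],
                          [0, 1, 0],
                          [0, 0, 1]])
y_prob = np.array([[0.1, 0.2, 0.7],
                   [0.3, 0.6, 0.1],
                   [0.2, 0.5, 0.3],
                   [0.6, 0.3, 0.1]])

true_classes = np.argmax(y_true_onehot, axis=1)  # [2 0 1 2]
pred_classes = np.argmax(y_prob, axis=1)         # [2 1 1 0]

# Fraction of samples whose predicted class matches the true class
accuracy = float(np.mean(true_classes == pred_classes))
print(accuracy)  # 0.5
```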
