<a href="https://colab.research.google.com/github/mesushan/Recurrent-Neural-Network-for-Text-Classification/blob/master/Building_a_Recurrent_Neural_Network_in_TensorFlow_2_0_with_imdb_dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

For more information about the data, see the [IMDB Dataset of 50K Movie Reviews](https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews) on Kaggle.

## Step 1: Installing the dependencies and setting up a GPU environment

In [1]:
%tensorflow_version 2.x

TensorFlow 2.x selected.


## Step 2: Importing the libraries

In [0]:
import tensorflow as tf
from tensorflow.keras.datasets import imdb

In [3]:
tf.__version__

'2.1.0'

## Step 3: Data Preprocessing

### Setting up the dataset parameters

In [0]:
number_of_words = 20000
max_len = 100

### Loading the IMDB dataset

In [6]:
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=number_of_words)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz


### Padding all sequences to the same length

In [0]:
X_train = tf.keras.preprocessing.sequence.pad_sequences(X_train, maxlen=max_len)

In [0]:
X_test = tf.keras.preprocessing.sequence.pad_sequences(X_test, maxlen=max_len)
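
By default, `pad_sequences` pads with zeros at the front and, for sequences longer than `maxlen`, keeps only the last `maxlen` tokens. The plain-Python sketch below mimics that default behavior on toy data; `pad_pre` and `toy_reviews` are hypothetical names used for illustration, not part of TensorFlow.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad or left-truncate each sequence to exactly maxlen items (maxlen > 0)."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep only the last maxlen tokens
        padded.append([value] * (maxlen - len(seq)) + seq)  # zeros go in front
    return padded

# Toy "reviews" already encoded as word indices (made-up values)
toy_reviews = [[5, 8, 2], [7, 1, 9, 4, 3, 6]]
print(pad_pre(toy_reviews, maxlen=4))  # [[0, 5, 8, 2], [9, 4, 3, 6]]
```

With `max_len = 100`, every review therefore ends up as exactly 100 integers, which is why the embedding layer can use a fixed input shape.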

## Step 4: Building a Recurrent Neural Network

### Defining the model

In [0]:
model = tf.keras.Sequential()

### Adding the embedding layer

In [0]:
model.add(tf.keras.layers.Embedding(input_dim=number_of_words, output_dim=128, input_shape=(X_train.shape[1],)))
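
Conceptually, an embedding layer is a trainable lookup table: each word index selects one row of a `(input_dim, output_dim)` matrix. The sketch below shows that lookup in plain Python. The toy sizes, the random initial values, and the names `table` and `embed` are assumptions for illustration; this is not how Keras implements the layer.

```python
import random

vocab_size, embedding_dim = 6, 3  # toy sizes; the notebook uses 20000 and 128
random.seed(0)

# One row of small random numbers per word index (values are illustrative)
table = [[random.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
         for _ in range(vocab_size)]

def embed(sequence):
    """Replace every word index with its row from the lookup table."""
    return [table[token] for token in sequence]

vectors = embed([0, 4, 4, 1])
print(len(vectors), len(vectors[0]))  # 4 3
print(vectors[1] == vectors[2])       # True: the same index gives the same vector
```

Applied to a padded review of 100 indices, the same lookup yields a 100 x 128 matrix, which matches the `(None, 100, 128)` output shape reported by `model.summary()`.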

### Adding the LSTM layer

- units: 128
- activation: tanh

In [0]:
model.add(tf.keras.layers.LSTM(units=128, activation='tanh'))

### Adding the output layer

- units: 1
- activation: sigmoid

In [0]:
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

### Compiling the model

In [0]:
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
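
The `binary_crossentropy` loss penalizes confident wrong predictions much more than confident right ones. A minimal plain-Python sketch of the formula follows; the clipping constant `eps` and the toy inputs are assumptions for illustration, and Keras computes the loss on tensors rather than lists.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all examples."""
    losses = []
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1 - p)))
    return sum(losses) / len(losses)

good = binary_crossentropy([1, 0], [0.9, 0.1])  # confident and correct
bad = binary_crossentropy([1, 0], [0.1, 0.9])   # confident and wrong
print(good, bad)
```

The first value is small and the second is large, so minimizing this loss pushes the sigmoid output toward the true label.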

In [14]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 100, 128)          2560000   
_________________________________________________________________
lstm (LSTM)                  (None, 128)               131584    
_________________________________________________________________
dense (Dense)                (None, 1)                 129       
=================================================================
Total params: 2,691,713
Trainable params: 2,691,713
Non-trainable params: 0
_________________________________________________________________
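
The parameter counts in the summary can be checked by hand. The sketch below reproduces them with plain arithmetic, assuming the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias; the variable names are chosen here for illustration.

```python
vocab_size = 20000    # number_of_words
embedding_dim = 128   # output_dim of the embedding layer
lstm_units = 128

# Embedding: one 128-dimensional vector per word index
embedding_params = vocab_size * embedding_dim

# LSTM: 4 gates, each with (input + recurrent) weights plus a bias per unit
lstm_params = 4 * (lstm_units * (embedding_dim + lstm_units) + lstm_units)

# Dense: one weight per LSTM unit plus a single bias
dense_params = lstm_units * 1 + 1

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
# 2560000 131584 129 2691713
```

Almost all of the parameters sit in the embedding table, so reducing `number_of_words` or `output_dim` is the most direct way to shrink the model.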


### Training the model

In [15]:
model.fit(X_train, y_train, epochs=3, batch_size=128)

Train on 25000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7f8fd2460f98>

### Evaluating the model

In [16]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)

In [17]:
print("Test accuracy: {}".format(test_accuracy))

Test accuracy: 0.8468800187110901
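
The `accuracy` metric for a sigmoid output compares each predicted probability against a 0.5 threshold and counts matches with the true labels. A minimal plain-Python sketch of that calculation, using made-up labels and probabilities (`binary_accuracy`, `toy_labels`, and `toy_probs` are hypothetical names):

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction equals the label."""
    predictions = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(p == t) for p, t in zip(predictions, y_true))
    return correct / len(y_true)

toy_labels = [1, 0, 1, 1]             # 1 = positive review, 0 = negative
toy_probs = [0.91, 0.35, 0.48, 0.77]  # illustrative sigmoid outputs
print(binary_accuracy(toy_labels, toy_probs))  # 0.75
```

A test accuracy of about 0.847 therefore means roughly 85 out of every 100 test reviews were classified correctly.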
