<h1>Sentiment Analysis Using Bi-directional RNN (LSTM)</h1>
<p>Copyright : Paritosh Morparia</p>
<p>Indiana University</p>

<h4>Data Source</h4>
<p>The data used here is the [IMDB movie reviews](https://keras.io/datasets/) dataset provided by Keras, in which each review is labeled as either positive or negative.</p>
<p>The data can be imported using the function:</p>
<b>keras.datasets.imdb.load_data()</b><br>

<p>The reasons for using this dataset are:</p>
<ul>
    <li>It has 50,000 reviews (25,000 for training and 25,000 for testing)</li>
    <li>It is easy to use because each review is already encoded as a sequence of integer word indices, delivered in a NumPy array</li>
    <li>Very little preprocessing is required</li>
</ul>


<p>There is a really good example of [Movie reviews using LSTM](https://github.com/keras-team/keras/edit/master/examples/imdb_lstm.py), which I used as a reference for this assignment.</p>

In [1]:
from __future__ import print_function
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding
from keras.layers import LSTM
from keras.datasets import imdb


Using TensorFlow backend.


<h4> Defining Features of data</h4>

In [2]:
max_features = 10000 # Vocabulary size: only the most frequent words are kept
maxlen = 200         # Reviews are padded or truncated to this many tokens
batch_size = 64      # Number of reviews per training batch

<h4>Fetching the data from Keras</h4>
<ul><li><p>The data is returned as NumPy arrays</p></li></ul>
<h4>Padding the data after fetching it</h4>

In [3]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)

x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
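
<p>To make the padding step concrete, the following is a minimal pure-Python sketch of what <code>pad_sequences</code> does with its default settings (zeros added at the front, long sequences cut from the front). The helper name <code>pad_pre</code> and the toy input are made up for illustration; this is not the Keras implementation.</p>

```python
def pad_pre(seqs, maxlen, value=0):
    """Illustrative stand-in for pad_sequences with 'pre' padding and truncation.

    Assumes maxlen >= 1.
    """
    out = []
    for s in seqs:
        s = list(s)[-maxlen:]                         # keep only the last maxlen tokens
        out.append([value] * (maxlen - len(s)) + s)   # left-pad short sequences
    return out

toy = [[5, 6], [1, 2, 3, 4, 5, 6, 7]]   # two toy "reviews" as word indices
padded = pad_pre(toy, maxlen=4)
print(padded)
```

<p>With <code>maxlen = 200</code>, this means a long review keeps only its final 200 tokens, while a short review is filled with leading zeros.</p>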



<h4>Converting the class labels to one-hot form for training</h4>

In [4]:
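import numpy as np

# One row per review, one column per class (0 = negative, 1 = positive)
y_train2 = np.zeros((len(y_train), 2), dtype='int')
y_test2 = np.zeros((len(y_test), 2), dtype='int')

# Set the column that matches each label to 1
for i, x in enumerate(y_train):
    y_train2[i][x] = 1
for i, x in enumerate(y_test):
    y_test2[i][x] = 1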
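
<p>The same encoding can also be produced without a loop. The sketch below uses a small made-up label array in place of <code>y_train</code> and checks that indexing an identity matrix gives the same result as the loop.</p>

```python
import numpy as np

labels = np.array([0, 1, 1, 0])   # toy stand-in for y_train

# Loop version, sized from the data
loop_onehot = np.zeros((len(labels), 2), dtype='int')
for i, x in enumerate(labels):
    loop_onehot[i][x] = 1

# Vectorized version: row k of the identity matrix is the one-hot code for class k
eye_onehot = np.eye(2, dtype='int')[labels]

print(np.array_equal(loop_onehot, eye_onehot))
```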

<h4>Defining the architecture of the model and setting it to train</h4>
<p>The architecture consists of the following layers:
    <ol>
        <li>Embedding layer - 128-dimensional word vectors</li>
        <li>Bi-directional LSTM layer - 84 units in each direction</li>
        <li>Dense layer - 2 nodes (one per class)</li>
    </ol>
</p>
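
<p>As a rough sanity check on the size of this network, the trainable parameters can be counted by hand. The sketch below takes its sizes from the code cell that builds the model (vocabulary of 10000, 128-dimensional embeddings, <code>LSTM(84)</code> wrapped in <code>Bidirectional</code>) and uses the standard LSTM parameter formula; it assumes the two directions are concatenated before the dense layer.</p>

```python
max_features = 10000   # vocabulary size, from the notebook
embed_dim = 128        # Embedding output size, from the model code
units = 84             # LSTM units per direction, from the model code
n_classes = 2

embedding_params = max_features * embed_dim

# One LSTM has 4 gates, each with input weights, recurrent weights and a bias
lstm_params = 4 * (units * (embed_dim + units) + units)
bilstm_params = 2 * lstm_params   # one forward and one backward LSTM

# The dense layer sees both directions concatenated: 2 * units inputs
dense_params = (2 * units) * n_classes + n_classes

total = embedding_params + bilstm_params + dense_params
print(embedding_params, bilstm_params, dense_params, total)
```

<p>Most of the parameters sit in the embedding layer, so the vocabulary size has the largest effect on model size.</p>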
<p>Other hyperparameters and settings chosen for this network are:
    <ul>
        <li>Dropout rate</li>
        <li>Number of words in the vocabulary</li>
        <li>Loss function = binary cross-entropy</li>
        <li>Optimizer     = Adam</li>
    </ul>
</p>
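
<p>For reference, binary cross-entropy can be written out by hand. The sketch below is a plain-Python illustration for a single sample with two sigmoid outputs and a one-hot target; the function name and the clipping constant <code>eps</code> are assumptions made for this example, not values read from Keras.</p>

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over the output units of one sample."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0); eps is an assumed value
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# One-hot target [0, 1] means "positive review"
confident = binary_crossentropy([0, 1], [0.1, 0.9])   # confident and correct
unsure = binary_crossentropy([0, 1], [0.5, 0.5])      # no preference either way
print(confident, unsure)
```

<p>A confident correct prediction gives a loss close to zero, while an undecided prediction gives a loss of ln 2, roughly 0.69.</p>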

In [5]:
from keras.layers import Bidirectional
from keras.callbacks import ModelCheckpoint

checkpointer = ModelCheckpoint(filepath='LSSTM_MODEL.hdf5',verbose=1, save_best_only=True)

model = Sequential()
model.add(Embedding(max_features, 128))
model.add(Bidirectional(LSTM(84, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(2, activation='sigmoid'))

# try using different optimizers and different optimizer configs
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print('Train...')
model.fit(x_train, y_train2,
          batch_size=batch_size,
          epochs=4,
          validation_data=(x_test, y_test2),
          callbacks=[checkpointer])
score, acc = model.evaluate(x_test, y_test2,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)


Train...
Train on 25000 samples, validate on 25000 samples
Epoch 1/4

Epoch 00001: val_loss improved from inf to 0.38748, saving model to LSSTM_MODEL.hdf5
Epoch 2/4

Epoch 00002: val_loss improved from 0.38748 to 0.36218, saving model to LSSTM_MODEL.hdf5
Epoch 3/4

Epoch 00003: val_loss improved from 0.36218 to 0.34378, saving model to LSSTM_MODEL.hdf5
Epoch 4/4

Epoch 00004: val_loss did not improve
Test score: 0.369127940636
Test accuracy: 0.84156
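
<p>The log above shows the effect of <code>save_best_only=True</code>: the checkpoint file is only rewritten when the validation loss improves. The sketch below replays that rule in plain Python. The first three values are the logged validation losses; the log does not print the epoch 4 value, so the reported test score is used in its place, which is an assumption based on the validation data being the test set.</p>

```python
# Epochs 1-3 come from the log; the epoch 4 value is assumed equal to the test score
val_losses = [0.38748, 0.36218, 0.34378, 0.369127940636]

best = float('inf')
saved_epochs = []
for epoch, loss in enumerate(val_losses, start=1):
    if loss < best:            # same rule as "val_loss improved"
        best = loss
        saved_epochs.append(epoch)

print(saved_epochs, best)
```

<p>Under this reading, the saved file holds the epoch 3 weights, while <code>model.evaluate</code> above scores the in-memory model after epoch 4. Loading the checkpoint before evaluating would presumably report the epoch 3 model instead.</p>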


<h2>Results and analysis</h2>

<p><b>Accuracy:</b> 84% on the test set</p>
<h4>Comparison with Unidirectional LSTM</h4>

<p>In the previous assignment, we used a unidirectional LSTM to classify the sentiment of the movie reviews with 85% accuracy.</p>
<p>In this assignment, I tried a bidirectional LSTM, which reads each review both forwards and backwards in time.</p>
<p>However, in this experiment the bidirectional LSTM did not improve classification accuracy (84% versus 85%). It may be more useful in other settings, such as generative models, where information from future time steps is valuable.</p>
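
<p>To illustrate what "both ways in time" means, the following NumPy sketch runs one recurrent pass forwards and a second, independent pass backwards over the same sequence, then concatenates the two final states. It uses a plain tanh RNN with random weights as a simplified stand-in for the LSTM, and the helper names are made up for this example; only the sizes (128-dimensional embeddings, 84 units per direction) come from the notebook.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
timesteps, embed_dim, units = 5, 128, 84   # timesteps is a toy value

def run_rnn(x, W, U, b):
    """Run a plain tanh RNN over x (timesteps, embed_dim); return the final state."""
    h = np.zeros(U.shape[0])
    for x_t in x:
        h = np.tanh(x_t @ W + h @ U + b)
    return h

def make_weights():
    """Random illustrative weights: input, recurrent, bias."""
    return (rng.normal(scale=0.1, size=(embed_dim, units)),
            rng.normal(scale=0.1, size=(units, units)),
            np.zeros(units))

x = rng.normal(size=(timesteps, embed_dim))      # one embedded review (toy data)
h_forward = run_rnn(x, *make_weights())          # reads the review start to end
h_backward = run_rnn(x[::-1], *make_weights())   # separate weights, reads end to start

features = np.concatenate([h_forward, h_backward])
print(features.shape)
```

<p>The concatenated vector has twice as many entries as a single direction, which is what the final dense layer receives in the bidirectional model.</p>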
