This kernel was created by [Lilia Simeonova](https://github.com/lilia-simeonova) and is based on previous work by [Peter Nagy](https://www.kaggle.com/ngyptr/lstm-sentiment-analysis-keras/notebook).


We are going to review the well-known problem of Sentiment Analysis, but this time we will use the relatively new approach of Deep Learning.

Even though some of the most famous applications of deep neural networks are related to image processing, much recent research shows impressive results on natural language processing problems.

Recurrent neural networks (which we are about to explore) are a subclass of neural networks designed for sequence recognition or prediction. They accept a variable number of inputs and allow cyclical connections between their neurons. This means that they are able to remember previous information and connect it to the current task.
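To make the idea of "remembering previous information" concrete, here is a minimal NumPy sketch of a plain recurrent layer. It is only an illustration: the weight names (`W_x`, `W_h`, `b`), the sizes, and the random values are assumptions made for this example and are not part of the model trained later.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a plain recurrent layer over a sequence and return every hidden state."""
    hidden = np.zeros(W_h.shape[0])      # initial state: nothing remembered yet
    states = []
    for x_t in inputs:                   # the loop works for any sequence length
        # The new state mixes the current input with the previous state,
        # which is the "cyclical connection" described above.
        hidden = np.tanh(W_x @ x_t + W_h @ hidden + b)
        states.append(hidden)
    return np.stack(states)

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4             # illustrative sizes
W_x = rng.normal(size=(hidden_dim, input_dim))
W_h = rng.normal(size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

short_seq = rng.normal(size=(2, input_dim))
long_seq = rng.normal(size=(7, input_dim))
short_states = rnn_forward(short_seq, W_x, W_h, b)
long_states = rnn_forward(long_seq, W_x, W_h, b)
print(short_states.shape)  # (2, 4)
print(long_states.shape)   # (7, 4)
```

The same weights handle both the 2-step and the 7-step sequence, which is what allows a recurrent network to take inputs of different lengths.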

Long Short-Term Memory networks (LSTMs) are a subclass of RNNs that specialize in remembering information over long periods of time. Moreover, bidirectional LSTMs keep contextual information in both directions.
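The sketch below shows, in plain NumPy, one common formulation of an LSTM layer and how a bidirectional wrapper combines a forward and a backward pass. It is a simplified illustration with random weights, not the Keras implementation; the function names and sizes are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(inputs, W, U, b):
    """Run one LSTM layer over a sequence; return the hidden state at every step.

    W: (4*units, input_dim), U: (4*units, units), b: (4*units,)
    The four blocks hold the input, forget, candidate and output gates.
    """
    units = U.shape[1]
    h = np.zeros(units)   # short-term (hidden) state
    c = np.zeros(units)   # long-term (cell) state
    outputs = []
    for x_t in inputs:
        z = W @ x_t + U @ h + b
        i = sigmoid(z[:units])               # how much new information to write
        f = sigmoid(z[units:2 * units])      # how much old memory to keep
        g = np.tanh(z[2 * units:3 * units])  # candidate values to write
        o = sigmoid(z[3 * units:])           # how much memory to expose
        c = f * c + i * g
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

def bidirectional(inputs, fwd_params, bwd_params):
    """Read the sequence in both directions and concatenate the two outputs."""
    forward = lstm_forward(inputs, *fwd_params)
    # Process the reversed sequence, then flip back so time steps line up.
    backward = lstm_forward(inputs[::-1], *bwd_params)[::-1]
    return np.concatenate([forward, backward], axis=-1)

def random_params(rng, input_dim, units):
    return (rng.normal(size=(4 * units, input_dim)),
            rng.normal(size=(4 * units, units)),
            np.zeros(4 * units))

rng = np.random.default_rng(0)
input_dim, units, steps = 3, 5, 6            # illustrative sizes
sequence = rng.normal(size=(steps, input_dim))
output = bidirectional(sequence,
                       random_params(rng, input_dim, units),
                       random_params(rng, input_dim, units))
print(output.shape)  # (6, 10)
```

Each time step ends up with twice as many features as the LSTM has units, because the forward and backward outputs are concatenated.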

Here you can find a detailed explanation of how LSTMs work.

In this post we will concentrate on the application part.

# Tools
Before we start we need to make sure we have the following tools installed:

1. Python
2. TensorFlow - Google’s open-source numerical computation library
3. Keras - neural network framework, which can run on top of TensorFlow
4. NumPy - package for scientific computing
5. Pandas - package providing easy-to-use data structures and data analysis tools

# Approach
The diagram below shows how we will design our deep neural network.

![Architecture of our Bidirectional LSTM](http://thelillysblog.com/images/architecture-nn2.jpg)


### For this kernel I’ve used [Sentiment data](https://www.kaggle.com/sonaam1234/sentimentdata/data) by Sonam Srivastava.

# Preprocessing
To make our life easier, we can merge the two files (one with positive and one with negative examples) into a single CSV file with one column called text and another called sentiment: 1 for positive examples and 0 for negative ones. We should also shuffle the rows.
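A minimal pandas sketch of this merge-and-shuffle step could look like the following. The two small DataFrames are stand-ins invented for illustration; in practice they would be loaded from the positive and negative files with `pd.read_csv`.

```python
import pandas as pd

# Stand-ins for the two source files (hypothetical example rows).
positive = pd.DataFrame({"text": ["great movie", "loved it"]})
negative = pd.DataFrame({"text": ["terrible plot", "waste of time"]})

# Label the examples: 1 for positive, 0 for negative.
positive["sentiment"] = 1
negative["sentiment"] = 0

# Stack both sets, then shuffle the rows with a fixed seed for reproducibility.
merged = pd.concat([positive, negative], ignore_index=True)
shuffled = merged.sample(frac=1, random_state=42).reset_index(drop=True)

# With no path, to_csv returns the CSV text; pass a file name to write to disk.
csv_text = shuffled.to_csv(index=False)
print(csv_text.splitlines()[0])  # text,sentiment
```

Shuffling matters because otherwise all positive examples would come first, and a later train/test split or mini-batch could end up containing only one class.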

The next thing to do is to prepare our data for the neural network.

We can simply use Keras preprocessing methods.

In [7]:
#!pip install keras
import numpy as np
import pandas as pd
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential, load_model
from keras.layers import Dense, Embedding, LSTM, Bidirectional, Dropout
# sklearn.cross_validation was removed from scikit-learn; use model_selection instead
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
import re

data = pd.read_csv('https://raw.githubusercontent.com/rasbt/pattern_classification/master/data/50k_imdb_movie_reviews.csv')
#data = pd.read_csv("shuffled_movie_data.csv")

tokenizer = Tokenizer(num_words=2000, split=' ')

tokenizer.fit_on_texts(data['review'])
X = tokenizer.texts_to_sequences(data['review'])
X = pad_sequences(X)
Y = data['sentiment']

# We can then create our train and test sets:

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state = 42)
print('X_train.shape: ', X_train.shape)
print('Y_train.shape: ', Y_train.shape)
print('X_test.shape: ', X_test.shape)
print('Y_test.shape: ', Y_test.shape)

X_train.shape:  (40000, 1939)
Y_train.shape:  (40000,)
X_test.shape:  (10000, 1939)
Y_test.shape:  (10000,)
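To see what the tokenizing and padding steps above produce, here is a pure-Python sketch that mimics their default behaviour on three tiny reviews. It is a simplification written for illustration (for example, it does not strip punctuation as the Keras `Tokenizer` does), and the helper names are made up for this example.

```python
from collections import Counter

def fit_word_index(texts):
    """Rank words by frequency; the most frequent word gets index 1 (0 is kept for padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: rank for rank, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_sequences(texts, word_index, num_words):
    """Replace each word by its index, dropping words outside the most frequent ones."""
    return [[word_index[w] for w in text.lower().split()
             if w in word_index and word_index[w] < num_words]
            for text in texts]

def pad_left(sequences):
    """Left-pad with zeros so every sequence is as long as the longest one."""
    width = max(len(seq) for seq in sequences)
    return [[0] * (width - len(seq)) + seq for seq in sequences]

reviews = ["good good movie", "bad movie", "good"]
word_index = fit_word_index(reviews)
padded = pad_left(texts_to_sequences(reviews, word_index, num_words=2000))

print(word_index)  # {'good': 1, 'movie': 2, 'bad': 3}
print(padded)      # [[1, 1, 2], [0, 3, 2], [0, 0, 1]]
```

Because no maximum length is given, every row is padded to the length of the longest review, which is why the second dimension of `X_train` and `X_test` above is 1939.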


In Keras, we can define our deep network as a sequence of layers. As described in the image above, we need three kinds of layers, with dropout applied between them:

1. Embedding Layer - turns the integer representation of words into dense vectors
2. Bidirectional LSTM Layer - connects two hidden layers of opposite directions to the same output
3. Dense Layer - output layer with sigmoid activation

In [14]:
model = Sequential()

# Embedding in Keras 2 no longer accepts `dropout`; the Dropout layer below takes its place
model.add( Embedding(2000, 32, input_length = X.shape[1]))

model.add(Dropout(0.2))
model.add( Bidirectional( LSTM(100, return_sequences=True)))

model.add(Dropout(0.2))
model.add( Bidirectional( LSTM(100)))

model.add(Dropout(0.2))
model.add( Dense(1, activation = 'sigmoid'))

model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

print(model.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_8 (Embedding)      (None, 1939, 32)          64000     
_________________________________________________________________
dropout_14 (Dropout)         (None, 1939, 32)          0         
_________________________________________________________________
bidirectional_11 (Bidirectio (None, 1939, 200)         106400    
_________________________________________________________________
dropout_15 (Dropout)         (None, 1939, 200)         0         
_________________________________________________________________
bidirectional_12 (Bidirectio (None, 200)               240800    
_________________________________________________________________
dropout_16 (Dropout)         (None, 200)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 201       
Total para
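The per-layer parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for embedding, LSTM, and dense layers; the helper function names are made up for this example.

```python
def embedding_params(vocab_size, embed_dim):
    # one dense vector per word in the vocabulary
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # four gates, each with input weights, recurrent weights and a bias
    return 4 * (units * (input_dim + units) + units)

def bidirectional_lstm_params(input_dim, units):
    # one LSTM per direction
    return 2 * lstm_params(input_dim, units)

def dense_params(input_dim, units):
    # weights plus one bias per output unit
    return input_dim * units + units

counts = [
    embedding_params(2000, 32),           # Embedding(2000, 32)
    bidirectional_lstm_params(32, 100),   # first Bidirectional(LSTM(100))
    bidirectional_lstm_params(200, 100),  # second one sees 200 features per step
    dense_params(200, 1),                 # Dense(1)
]
print(counts)       # [64000, 106400, 240800, 201]
print(sum(counts))  # 411401
```

The Dropout layers add no parameters, and the second bidirectional layer is larger than the first only because its input has 200 features per step instead of 32.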

In [0]:
model.fit(X_train, Y_train, epochs = 3, batch_size = 64, verbose = 2)

Epoch 1/3


In [0]:
# Final evaluation of the model
scores = model.evaluate(X_test, Y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

This deep neural network gives us an accuracy of 73%.
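For reference, accuracy for a single sigmoid output is usually computed by thresholding the predicted probability at 0.5 and counting matches with the true labels. The sketch below shows that calculation in NumPy; the labels and probabilities are invented for illustration and are not outputs of the model above.

```python
import numpy as np

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Share of examples whose thresholded probability matches the true label."""
    predictions = (np.asarray(y_prob) > threshold).astype(int)
    return float(np.mean(predictions == np.asarray(y_true)))

y_true = [1, 0, 1, 1]            # hypothetical labels
y_prob = [0.9, 0.4, 0.3, 0.8]    # hypothetical sigmoid outputs
accuracy = binary_accuracy(y_true, y_prob)
print("Accuracy: %.2f%%" % (accuracy * 100))  # Accuracy: 75.00%
```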

Note: Feel free to experiment with the hyperparameters as much as you like.