# Overview

In this notebook I try to implement sentiment analysis using the Keras library.

# Data

All experiments use the IMDB dataset provided with the Keras library. It is a dataset of 25,000 movie reviews from IMDb, labeled by sentiment (positive/negative). The reviews have been preprocessed, and each review is encoded as a list of word indexes (integers).
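
To make the encoding concrete, here is a minimal sketch of how a list of word indexes maps back to text. The vocabulary `toy_word_index` and the helper `decode_review` are made up for illustration; with the real dataset the mapping would come from `imdb.get_word_index()`. The sketch assumes the defaults of `imdb.load_data`, where indexes 0, 1 and 2 are reserved for padding, start-of-sequence and out-of-vocabulary words, and word ranks are shifted by `index_from=3`.

```python
# Hypothetical toy vocabulary: word -> frequency rank (1 = most frequent),
# shaped like the dict returned by imdb.get_word_index().
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4, "boring": 5}

# Assumed defaults of imdb.load_data: 0 = padding, 1 = start, 2 = unknown,
# and every word rank is shifted by index_from = 3.
INDEX_FROM = 3
SPECIAL_TOKENS = {0: "<pad>", 1: "<start>", 2: "<unk>"}


def decode_review(encoded, word_index):
    """Turn a list of integer word indexes back into a readable string."""
    index_to_word = {rank + INDEX_FROM: word for word, rank in word_index.items()}
    index_to_word.update(SPECIAL_TOKENS)
    # Any index missing from the mapping is shown as unknown.
    return " ".join(index_to_word.get(i, "<unk>") for i in encoded)


encoded_review = [1, 4, 5, 6, 7]
decoded = decode_review(encoded_review, toy_word_index)
print(decoded)
```

Because the ranks follow word frequency, loading with `num_words=top_words` keeps only the most frequent words and replaces the rest with the out-of-vocabulary index.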

## Imports

In [1]:
import numpy

from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout, Conv1D, MaxPooling1D
from tensorflow.keras.layers import Embedding, Flatten
from tensorflow.keras.preprocessing import sequence

## Parameters

In [2]:
top_words = 5000
max_review_length = 500
embedding_vector_length = 32

## Loading Data

In [3]:
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
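
The padding step can be illustrated without Keras. The sketch below imitates what `pad_sequences` does with its default settings (`padding='pre'`, `truncating='pre'`, `value=0`): short reviews get zeros on the left, long reviews lose their first words. The helper `pad_pre` and the toy reviews are illustrative only, and it returns nested lists rather than a NumPy array.

```python
def pad_pre(sequences, maxlen, value=0):
    """Imitate pad_sequences defaults: pad and truncate at the start."""
    padded = []
    for seq in sequences:
        trimmed = list(seq[-maxlen:])  # keep only the last maxlen indexes
        # Fill the missing positions on the left with the padding value.
        padded.append([value] * (maxlen - len(trimmed)) + trimmed)
    return padded


toy_reviews = [[1, 14, 22], [1, 7, 8, 9, 10, 11]]
padded = pad_pre(toy_reviews, maxlen=5)
for row in padded:
    print(row)
```

After this step every review has the same length, which is what lets the `Embedding` layer work with a fixed `input_length` of `max_review_length`.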

## Sentiment analysis using Embedding + LSTM layer

In [4]:
%%time

model = Sequential()
model.add(Embedding(top_words, embedding_vector_length,
                   input_length=max_review_length))
model.add(Dropout(0.2))
model.add(LSTM(100))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))

print(model.summary())

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3, batch_size=64)

scores = model.evaluate(X_test, y_test, verbose=0)
print('Accuracy: %.4f' % (scores[1]))
print()

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, 500, 32)           160000    
_________________________________________________________________
dropout (Dropout)            (None, 500, 32)           0         
_________________________________________________________________
lstm (LSTM)                  (None, 100)               53200     
_________________________________________________________________
dropout_1 (Dropout)          (None, 100)               0         
_________________________________________________________________
dense (Dense)                (None, 1)                 101       
Total params: 213,301
Trainable params: 213,301
Non-tra

## Sentiment analysis using Embedding + Conv1D + LSTM layer

In [5]:
%%time

model = Sequential()
model.add(Embedding(top_words, embedding_vector_length,
                   input_length=max_review_length))
model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))

print(model.summary())

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3, batch_size=64)

scores = model.evaluate(X_test, y_test, verbose=0)
print('Accuracy: %.4f' % (scores[1]))
print()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, 500, 32)           160000    
_________________________________________________________________
conv1d (Conv1D)              (None, 500, 32)           2080      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 250, 32)           0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 100)               53200     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 101       
Total params: 215,381
Trainable params: 215,381
Non-trainable params: 0
_________________________________________________________________
None
Train on 25000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
Accuracy: 0.8823

CPU times: user 12min 4s, sys: 2min 57

## Sentiment analysis using Embedding + Conv1D + Dense without LSTM layer

In [6]:
%%time

model = Sequential()
model.add(Embedding(top_words, embedding_vector_length,
                   input_length=max_review_length))
model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(128, activation='sigmoid'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))

print(model.summary())

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3, batch_size=64)

scores = model.evaluate(X_test, y_test, verbose=0)
print('Accuracy: %.4f' % (scores[1]))
print()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, 500, 32)           160000    
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 500, 32)           2080      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 250, 32)           0         
_________________________________________________________________
flatten (Flatten)            (None, 8000)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 128)               1024128   
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                

## Sentiment analysis using Embedding + Conv1D without LSTM layer

In [7]:
%%time

model = Sequential()
model.add(Embedding(top_words, embedding_vector_length,
                   input_length=max_review_length))
model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))

print(model.summary())

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3, batch_size=64)

scores = model.evaluate(X_test, y_test, verbose=0)
print('Accuracy: %.4f' % (scores[1]))
print()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_3 (Embedding)      (None, 500, 32)           160000    
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 500, 32)           2080      
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 250, 32)           0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 8000)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 8001      
Total params: 170,081
Trainable params: 170,081
Non-trainable params: 0
_________________________________________________________________
None
Train on 25000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
Accuracy: 0.8759

CPU times: user 45.1 s, sys: 8.35 s, t

# Conclusion

I tried four different approaches. The results are shown below.

|Name                      |CPU time (total)|Total params|Accuracy|
|--------------------------|----------------|------------|--------|
|Embedding + LSTM          |29min 10s       |213,301     |0.8550  |
|Embedding + Conv1D + LSTM |15min 1s        |215,381     |0.8823  |
|Embedding + Conv1D + Dense|1min 10s        |1,186,337   |0.8803  |
|Embedding + Conv1D        |53.4 s          |170,081     |0.8759  |

All in all, the best accuracy was achieved by the **Embedding + Conv1D + LSTM** architecture, but training took 15 minutes, which is a significant amount of time for me. However, **Embedding + Conv1D** achieved comparatively good results using only 170,081 parameters, and what is more, training took only 53 seconds.
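
The parameter totals in the table can be checked by hand. The sketch below uses the usual parameter-count formulas for each layer type (with biases enabled) and the hyperparameters from this notebook. The helper function names are my own and not part of Keras.

```python
def embedding_params(vocab_size, embedding_dim):
    # One embedding vector per word in the vocabulary.
    return vocab_size * embedding_dim


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * (input_dim + units + 1) * units


def conv1d_params(in_channels, filters, kernel_size):
    # One kernel of size kernel_size x in_channels per filter, plus a bias.
    return kernel_size * in_channels * filters + filters


def dense_params(input_dim, units):
    # Weight matrix plus one bias per unit.
    return input_dim * units + units


top_words, embedding_dim, review_length = 5000, 32, 500
emb = embedding_params(top_words, embedding_dim)
conv = conv1d_params(embedding_dim, filters=32, kernel_size=2)
# padding='same' keeps 500 steps, MaxPooling1D(pool_size=2) halves them,
# and Flatten turns the 250 x 32 feature map into one vector.
flat = (review_length // 2) * 32

totals = {
    "Embedding + LSTM": emb + lstm_params(32, 100) + dense_params(100, 1),
    "Embedding + Conv1D + LSTM": emb + conv + lstm_params(32, 100) + dense_params(100, 1),
    "Embedding + Conv1D + Dense": emb + conv + dense_params(flat, 128) + dense_params(128, 1),
    "Embedding + Conv1D": emb + conv + dense_params(flat, 1),
}
for name, total in totals.items():
    print(f"{name}: {total:,}")
```

The calculation shows where the 1,186,337 parameters of **Embedding + Conv1D + Dense** come from: the first `Dense` layer alone connects 8,000 flattened features to 128 units.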