In [1]:
# One-dimensional convolutional model for IMDB movie review sentiment
# Source: https://machinelearningmastery.com/predict-sentiment-movie-reviews-using-deep-learning/
# Please refer to the other notebook on loading & preparing the IMDB dataset to view the data
# Load the Keras modules and the IMDB dataset
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
# Import from keras.layers directly; the convolutional/embeddings submodule paths are not available in newer Keras releases
from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from keras.layers import Embedding
from keras.preprocessing import sequence

Using TensorFlow backend.


In [2]:
# Load the dataset, keeping only the top n most frequent words;
# rarer words are replaced by the out-of-vocabulary index (oov_char), not dropped
# Create train & test sets
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# Pad or truncate every review to 500 tokens, which turns each split into a 2D NumPy array
max_words = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_words)
X_test = sequence.pad_sequences(X_test, maxlen=max_words)
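The padding step can be illustrated without Keras. This is a minimal pure-Python sketch of what pad_sequences does with its default settings (padding='pre', truncating='pre', value=0); the helper name pad_pre and the toy token ids are made up for illustration and are not part of the notebook.

```python
def pad_pre(seq, maxlen, value=0):
    """Left-pad or left-truncate one sequence to exactly maxlen items (maxlen > 0)."""
    seq = list(seq)[-maxlen:]                      # too long: keep only the last maxlen tokens
    return [value] * (maxlen - len(seq)) + seq     # too short: fill the front with the pad value


short_review = [11, 12, 13]                # hypothetical token ids
long_review = [1, 2, 3, 4, 5, 6, 7]        # hypothetical token ids

print(pad_pre(short_review, 5))   # [0, 0, 11, 12, 13]
print(pad_pre(long_review, 5))    # [3, 4, 5, 6, 7]
```

With these defaults, a long review loses its beginning rather than its end, and the zeros used as padding sit in front of the real tokens.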

In [None]:
'''We can now define our convolutional neural network model. 
This time, after the Embedding input layer, we insert a Conv1D layer. 
This convolutional layer has 32 feature maps and reads the embedded 
review three consecutive word vectors at a time (kernel_size=3).
The convolutional layer is followed by a 1D max pooling layer with a 
length and stride of 2 that halves the length of the feature maps from 
the convolutional layer. The rest of the network (Flatten, then two Dense 
layers) is the same as the neural network above.'''
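To make the convolution and pooling steps concrete, here is a small pure-Python sketch of a single Conv1D filter with 'same' padding and ReLU, followed by max pooling with a length and stride of 2. The function names, the toy input and the kernel weights are all illustrative assumptions; Keras learns 32 such filters and runs them far more efficiently.

```python
def conv1d_same(x, kernel, bias=0.0):
    """One Conv1D filter with 'same' zero padding and ReLU.

    x:      list of T word vectors, each of length C
    kernel: list of k weight vectors, each of length C
    Returns a list of T activations (one per word position).
    """
    k, channels = len(kernel), len(x[0])
    left = (k - 1) // 2
    zero = [0.0] * channels
    padded = [zero] * left + list(x) + [zero] * (k - 1 - left)
    out = []
    for t in range(len(x)):
        window = padded[t:t + k]                   # k consecutive word vectors
        total = bias + sum(w * v
                           for wv, xv in zip(kernel, window)
                           for w, v in zip(wv, xv))
        out.append(max(0.0, total))                # ReLU
    return out


def max_pool1d(y, pool_size=2):
    """Non-overlapping max pooling (stride equal to pool_size)."""
    return [max(y[i:i + pool_size])
            for i in range(0, len(y) - pool_size + 1, pool_size)]


# Toy "review": 6 word positions, 2 embedding dimensions each.
x = [[1, 5], [2, 5], [3, 5], [4, 5], [5, 5], [6, 5]]
# Toy filter: sums the first embedding dimension over a 3-word window.
kernel = [[1, 0], [1, 0], [1, 0]]

features = conv1d_same(x, kernel)
pooled = max_pool1d(features)
print(features)   # [3.0, 6.0, 9.0, 12.0, 15.0, 11.0]
print(pooled)     # [6.0, 12.0, 15.0]
```

The convolution keeps one activation per word position (6 in, 6 out), and pooling halves that length to 3, mirroring the 500 to 250 reduction in the model.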

In [3]:
# create the model
model = Sequential()
model.add(Embedding(top_words, 32, input_length=max_words))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(250, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 500, 32)           160000    
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 500, 32)           3104      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 250, 32)           0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 8000)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 250)               2000250   
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 251       
=================================================================
Total params: 2,163,605
Trainable params: 2,163,605
Non-trainable params: 0
_________________________________________________________________
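The parameter counts in the summary can be checked by hand. The sketch below recomputes them from the hyperparameters used in the model definition; the variable names are chosen here for readability and do not come from Keras.

```python
top_words, embed_dim, max_words = 5000, 32, 500
filters, kernel_size, pool_size, hidden = 32, 3, 2, 250

# Embedding: one 32-dimensional vector per vocabulary index.
embedding_params = top_words * embed_dim
# Conv1D: each filter spans kernel_size positions x embed_dim channels, plus one bias.
conv_params = kernel_size * embed_dim * filters + filters
# Pooling and Flatten have no weights; Flatten yields (500 / 2) * 32 values.
flat_units = (max_words // pool_size) * filters
# Dense layers: one weight per input-output pair, plus one bias per output.
dense1_params = flat_units * hidden + hidden
dense2_params = hidden * 1 + 1

total = embedding_params + conv_params + dense1_params + dense2_params
print(embedding_params, conv_params, flat_units, dense1_params, dense2_params)
# 160000 3104 8000 2000250 251
print(total)   # 2163605
```

Almost all of the weights sit in the first Dense layer, because it is fully connected to the 8,000 flattened features.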

In [4]:
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=128, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))



Train on 25000 samples, validate on 25000 samples
Epoch 1/2
 - 40s - loss: 0.4484 - accuracy: 0.7603 - val_loss: 0.2747 - val_accuracy: 0.8845
Epoch 2/2
 - 38s - loss: 0.2172 - accuracy: 0.9141 - val_loss: 0.2793 - val_accuracy: 0.8840
Accuracy: 88.40%


In [None]:
'''Running the example, we are first presented with a summary of the network structure. 
We can see that our convolutional layer preserves the output shape of the Embedding 
layer: 500 word positions, each with 32 values. The pooling layer compresses 
this representation by halving its length to 250.

Running the example offers a small but welcome improvement over the neural network model above 
with an accuracy of about 88.4%.

Note: Your specific results may vary given the stochastic nature of the learning algorithm. 
Consider running the example a few times and comparing the average performance.'''
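One way to compare average performance is to wrap a full train-and-evaluate run in a function and summarize several calls. This is only a sketch: summarize_runs and placeholder_run are hypothetical names, and the placeholder returns made-up numbers purely so the code runs. In practice, run_once would rebuild the model, fit it, and return scores[1] from model.evaluate.

```python
from statistics import mean, stdev


def summarize_runs(run_once, n_runs=5):
    """Call run_once(seed) n_runs times; return (mean, sample std) of the results."""
    accuracies = [run_once(seed) for seed in range(n_runs)]
    spread = stdev(accuracies) if n_runs > 1 else 0.0
    return mean(accuracies), spread


def placeholder_run(seed):
    """Stand-in for a real training run; NOT real accuracies.

    A real version would seed the random number generators with `seed`,
    build and fit a fresh model, and return the test accuracy.
    """
    return seed / 10.0


avg, spread = summarize_runs(placeholder_run, n_runs=5)
print("mean=%.3f std=%.3f" % (avg, spread))
```

Reporting the spread alongside the mean shows whether a difference between two models is larger than the run-to-run variation.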