The initial configuration produced the following result >
Test score: 0.6876810162913799
Test accuracy: 0.83668

As instructed, an additional dense hidden layer is now added directly after the embedding layer

In [2]:
from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.preprocessing import sequence
from keras.models import Model
from keras.layers import Dense, Activation, Embedding, Flatten, Input, PReLU
from keras.datasets import imdb

max_features = 20000
maxlen = 80  # cut texts after this number of words (among top max_features most common words)
batch_size = 32

print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

print('Pad sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')


inputs = Input(shape=(maxlen,))
x = inputs
x = Embedding(max_features, 128, dropout=0.2)(x)
x = Dense(64)(x)
x = PReLU()(x) # Non-linearity
x = Flatten()(x)
x = Dense(1)(x)
predictions = Activation("sigmoid")(x)


model = Model(input=inputs, output=predictions)
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.summary()

print('Train...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=15,
          verbose=0, validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
X_train shape: (25000, 80)
X_test shape: (25000, 80)
Build model...
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 80)                0         
_________________________________________________________________
embedding_2 (Embedding)      (None, 80, 128)           2560000   
_________________________________________________________________
dense_2 (Dense)              (None, 80, 64)            8256      
_________________________________________________________________
p_re_lu_1 (PReLU)            (None, 80, 64)            5120      
_________________________________________________________________
flatten_1 (Flatten)          (None, 5120)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 5121      




Test score: 1.1136203831624985
Test accuracy: 0.82076
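
The parameter counts in the summary above can be checked by hand. The following is a minimal sketch in plain Python (no Keras needed). The helper names `dense_params` and `prelu_params` are made up for illustration, and the PReLU formula assumes one learned slope per element of the non-batch output shape (no shared axes), which is consistent with the 5120 reported above.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


def prelu_params(*feature_shape):
    """One learned slope per element of the non-batch output shape."""
    count = 1
    for dim in feature_shape:
        count *= dim
    return count


max_features, embed_dim, maxlen = 20000, 128, 80

embedding = max_features * embed_dim      # one 128-d vector per word index
hidden = dense_params(embed_dim, 64)      # applied to the last axis of (80, 128)
prelu = prelu_params(maxlen, 64)          # output shape (80, 64)
output = dense_params(maxlen * 64, 1)     # after Flatten: 80 * 64 = 5120 inputs

print(embedding, hidden, prelu, output)
```

Because the dense layer acts only on the last axis, its weights are shared across all 80 time steps, which is why it has far fewer parameters than the PReLU layer that follows it.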


The accuracy has dropped slightly, by about 1.6 percentage points (from 0.83668 to 0.82076).
The model is run again with dropout added after the new dense layer
(from here on, only the changed layer code is shown, for brevity) >

In [None]:
from keras.layers import Dropout  # not in the first cell's import list
inputs = Input(shape=(maxlen,))
x = inputs
x = Embedding(max_features, 128, dropout=0.2)(x)
x = Dense(64)(x)
x = PReLU()(x) # Non-linearity
x = Dropout(rate=0.2)(x)
x = Flatten()(x)
x = Dense(1)(x)
predictions = Activation("sigmoid")(x)

Test score: 1.2429044671200962
Test accuracy: 0.80916

The accuracy has dropped slightly again, by about 1.2 percentage points (from 0.82076 to 0.80916).
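
The changes quoted so far can be recomputed from the reported test accuracies. The sketch below expresses each change in percentage points relative to the previous run. The run labels and the helper `point_changes` are illustrative names, not part of the original notebook.

```python
# Test accuracies as reported above, in the order the runs were made.
runs = [
    ("initial configuration", 0.83668),
    ("dense + PReLU", 0.82076),
    ("dense + PReLU + dropout", 0.80916),
]


def point_changes(results):
    """Change in accuracy versus the previous run, in percentage points."""
    return [
        (name, round((acc - prev_acc) * 100, 2))
        for (_, prev_acc), (name, acc) in zip(results, results[1:])
    ]


changes = point_changes(runs)
for name, delta in changes:
    print(name, delta)
```

Quoting the change in percentage points avoids ambiguity with a relative change, which would be computed against the previous accuracy instead.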

The model is run again with the initial dense layer (and its dropout) replaced by a 1-D convolutional layer followed by 1-D max pooling
(only the changed layer code is shown, for brevity) >

In [None]:
from keras.layers import Conv1D, MaxPooling1D  # not in the first cell's import list
inputs = Input(shape=(maxlen,))
x = inputs
x = Embedding(max_features, 128, dropout=0.2)(x)
x = Conv1D(128, kernel_size=4, activation='relu')(x)
x = MaxPooling1D(pool_size=2)(x)
x = Flatten()(x)
x = Dense(1)(x)
predictions = Activation("sigmoid")(x)

In [None]:
Layer (type)                 Output Shape              Param #   
=================================================================
input_14 (InputLayer)        (None, 80)                0         
_________________________________________________________________
embedding_4 (Embedding)      (None, 80, 128)           2560000   
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 77, 128)           65664     
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 38, 128)           0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 4864)              0         
_________________________________________________________________
dense_34 (Dense)             (None, 1)                 4865      
_________________________________________________________________
activation_3 (Activation)    (None, 1)                 0         
=================================================================
Total params: 2,630,529
Trainable params: 2,630,529
Non-trainable params: 0

Test score: 1.2012744303126632
Test accuracy: 0.82928
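
The output shapes and parameter counts in the convolutional summary follow from simple arithmetic. The sketch below reproduces them, assuming the Keras defaults of `'valid'` padding and stride 1 for `Conv1D`, and a pooling stride equal to `pool_size` for `MaxPooling1D`. The helper names are hypothetical.

```python
def conv1d_output_length(length, kernel_size, stride=1):
    """Output length of a 1-D convolution with 'valid' (no) padding."""
    return (length - kernel_size) // stride + 1


def conv1d_params(in_channels, filters, kernel_size):
    """One kernel of size (kernel_size, in_channels) plus a bias per filter."""
    return kernel_size * in_channels * filters + filters


def pool_output_length(length, pool_size):
    """Non-overlapping pooling; a trailing partial window is dropped."""
    return length // pool_size


maxlen, embed_dim, filters = 80, 128, 128

conv_len = conv1d_output_length(maxlen, kernel_size=4)   # 80 - 4 + 1
pooled_len = pool_output_length(conv_len, pool_size=2)   # 77 // 2
flat = pooled_len * filters                              # Flatten output size

conv_weights = conv1d_params(embed_dim, filters, kernel_size=4)
dense_weights = flat + 1                                 # Dense(1): weights + bias
total = 20000 * embed_dim + conv_weights + dense_weights

print(conv_len, pooled_len, flat, conv_weights, dense_weights, total)
```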

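This has improved the accuracy by around 2 percentage points over the previous run (from 0.80916 to 0.82928), although it remains slightly below the initial configuration (0.83668).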

The convolutional and pooling layers are now replaced with a recurrent LSTM layer
(only the changed layer code is shown, for brevity) >

In [None]:
from keras.layers import LSTM  # not in the first cell's import list
inputs = Input(shape=(maxlen,))
x = inputs
x = Embedding(max_features, 128, dropout=0.2)(x)
x = LSTM(128, dropout_W=0.2, dropout_U=0.2)(x)
x = Dense(1)(x)
predictions = Activation("sigmoid")(x)

In [None]:
Layer (type)                 Output Shape              Param #   
=================================================================
input_15 (InputLayer)        (None, 80)                0         
_________________________________________________________________
embedding_5 (Embedding)      (None, 80, 128)           2560000   
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               131584    
_________________________________________________________________
dense_35 (Dense)             (None, 1)                 129       
_________________________________________________________________
activation_4 (Activation)    (None, 1)                 0         
=================================================================
Total params: 2,691,713
Trainable params: 2,691,713
Non-trainable params: 0

Test score: 1.1564051804107427
Test accuracy: 0.81024
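
The LSTM parameter count in the summary can be verified in the same way. This sketch assumes the standard LSTM formulation with four gates, each having an input kernel, a recurrent kernel and a bias vector. `lstm_params` is an illustrative helper, not a Keras function.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    per_gate = input_dim * units + units * units + units
    return 4 * per_gate


embed_dim, units = 128, 128

lstm_weights = lstm_params(embed_dim, units)
dense_weights = units + 1                       # Dense(1) on the final hidden state
total = 20000 * embed_dim + lstm_weights + dense_weights

print(lstm_weights, dense_weights, total)
```

Because the LSTM returns only its final hidden state (shape `(None, 128)`), no `Flatten` layer is needed and the output layer is much smaller than in the dense and convolutional variants.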