Tasks:

You used two hidden layers. Try using one or three hidden layers, and see how doing so affects validation and test accuracy.

Try using layers with more hidden units or fewer hidden units: 32 units, 64 units, and so on.

Try using the mse loss function instead of binary_crossentropy.

Try using the tanh activation (an activation that was popular in the early days of neural networks) instead of relu.

Download a dataset

In [3]:
from keras.datasets import imdb

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(
    num_words=10000)

Downloading data from https://s3.amazonaws.com/text-datasets/imdb.npz


In [4]:
train_data[0][:15]

[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65, 458, 4468, 66, 3941, 4]

Vectorizing data

In [5]:
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results

x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
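To see what the encoding produces, here is a small sketch that runs the same function on made-up input. The two toy "reviews" and the 8-word vocabulary are assumptions chosen for illustration, not part of the IMDB data.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    # Same multi-hot encoding as above: one row per review, one column per word index.
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results

# Two made-up "reviews" over a hypothetical 8-word vocabulary.
toy_sequences = [[1, 3, 3, 5], [0, 7]]
toy_vectors = vectorize_sequences(toy_sequences, dimension=8)
print(toy_vectors)
```

Each row has a 1 in every column whose word index occurs in the review. A repeated index (3 in the first toy review) is still recorded only once, so word counts and word order are discarded.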

In [6]:
y_train = np.asarray(train_labels).astype('float32')
y_test = np.asarray(test_labels).astype('float32')

Building a network

1 hidden layer

In [7]:
from keras import models
from keras import layers

model_one = models.Sequential()
model_one.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
model_one.add(layers.Dense(1, activation='sigmoid'))


Validation

In [12]:
x_val = x_train[:5000]
partial_x_train = x_train[5000:]
y_val = y_train[:5000]
partial_y_train = y_train[5000:]

In [13]:
from keras import optimizers
from keras import losses
from keras import metrics

model_one.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])

In [14]:
history_one = model_one.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [15]:
results_one = model_one.evaluate(x_test, y_test)



In [16]:
print('Test accuracy: {}'.format(results_one[1]))

Test accuracy: 0.85428


2 hidden layers

In [17]:
model_two = models.Sequential()
model_two.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
model_two.add(layers.Dense(16, activation='relu'))
model_two.add(layers.Dense(1, activation='sigmoid'))

model_two.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])

history_two = model_two.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_two = model_two.evaluate(x_test, y_test)

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [18]:
print('Test accuracy: {}'.format(results_two[1]))

Test accuracy: 0.84988


3 hidden layers

In [19]:
model_three = models.Sequential()
model_three.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
model_three.add(layers.Dense(16, activation='relu'))
model_three.add(layers.Dense(16, activation='relu'))
model_three.add(layers.Dense(1, activation='sigmoid'))

model_three.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])

history_three = model_three.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_three = model_three.evaluate(x_test, y_test)

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [20]:
print('Test accuracy: {}'.format(results_three[1]))

Test accuracy: 0.85052


The model with 1 hidden layer is the most accurate (0.85428, vs. 0.84988 with 2 layers and 0.85052 with 3)
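The comparison can be made explicit with a few lines of plain Python. The accuracies are copied from the three runs above; the dictionary keys are just labels picked for this sketch.

```python
# Test accuracies copied from the three runs above.
test_accuracy = {
    "1 hidden layer": 0.85428,
    "2 hidden layers": 0.84988,
    "3 hidden layers": 0.85052,
}

# Pick the configuration with the highest test accuracy.
best = max(test_accuracy, key=test_accuracy.get)
# Gap between the best and the worst configuration.
spread = max(test_accuracy.values()) - min(test_accuracy.values())
print(best, round(spread, 5))
```

The gap between best and worst is under half a percentage point, and each configuration was trained only once, so the ranking may well change between runs.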

Number of hidden units

1 layer, 32 units

In [21]:
model_one = models.Sequential()
model_one.add(layers.Dense(32, activation='relu', input_shape=(10000,)))
model_one.add(layers.Dense(1, activation='sigmoid'))

model_one.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])

history_one = model_one.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_one = model_one.evaluate(x_test, y_test)
print('Test accuracy: {}'.format(results_one[1]))

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test accuracy: 0.85132


1 layer, 64 units

In [22]:
model_one = models.Sequential()
model_one.add(layers.Dense(64, activation='relu', input_shape=(10000,)))
model_one.add(layers.Dense(1, activation='sigmoid'))

model_one.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])

history_one = model_one.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_one = model_one.evaluate(x_test, y_test)
print('Test accuracy: {}'.format(results_one[1]))

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test accuracy: 0.84976


With 1 layer, 32 units work better than 64 (0.85132 vs. 0.84976), though neither beats the 16-unit model (0.85428)

3 layers, 32 units

In [23]:
model_three = models.Sequential()
model_three.add(layers.Dense(32, activation='relu', input_shape=(10000,)))
model_three.add(layers.Dense(32, activation='relu'))
model_three.add(layers.Dense(32, activation='relu'))
model_three.add(layers.Dense(1, activation='sigmoid'))

model_three.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])

history_three = model_three.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_three = model_three.evaluate(x_test, y_test)
print('Test accuracy: {}'.format(results_three[1]))

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test accuracy: 0.85284


3 layers, 64 units

In [24]:
model_three = models.Sequential()
model_three.add(layers.Dense(64, activation='relu', input_shape=(10000,)))
model_three.add(layers.Dense(64, activation='relu'))
model_three.add(layers.Dense(64, activation='relu'))
model_three.add(layers.Dense(1, activation='sigmoid'))

model_three.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])

history_three = model_three.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_three = model_three.evaluate(x_test, y_test)
print('Test accuracy: {}'.format(results_three[1]))

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test accuracy: 0.84632


Again, the model with 32 units works better than the one with 64 units (0.85284 vs. 0.84632)
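To put the unit counts in perspective, the following sketch counts the trainable parameters of the architectures tried above. It assumes plain fully connected layers with a bias term, which is what `layers.Dense` uses by default; the helper names are made up for this example.

```python
def dense_params(n_inputs, n_units):
    # A fully connected layer has one weight per input-unit pair plus one bias per unit.
    return n_inputs * n_units + n_units

def count_params(input_dim, hidden_units, n_hidden_layers, n_outputs=1):
    # Sum the parameters of all hidden layers, then add the output layer.
    total = 0
    n_inputs = input_dim
    for _ in range(n_hidden_layers):
        total += dense_params(n_inputs, hidden_units)
        n_inputs = hidden_units
    return total + dense_params(n_inputs, n_outputs)

for units in (16, 32, 64):
    for depth in (1, 3):
        print(units, depth, count_params(10000, units, depth))
```

Almost all parameters sit in the first layer (10000 inputs times the number of units), so doubling the units roughly doubles the model size, while adding hidden layers changes it very little.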

mse loss function, 1 layer with 16 units

In [25]:
model_one = models.Sequential()
model_one.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
model_one.add(layers.Dense(1, activation='sigmoid'))

model_one.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.mse,
              metrics=[metrics.binary_accuracy])

history_one = model_one.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_one = model_one.evaluate(x_test, y_test)
print('Test accuracy: {}'.format(results_one[1]))

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test accuracy: 0.85456


Accuracy rises slightly with mse (0.85456 vs. 0.85428 with binary_crossentropy)
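For intuition on how the two losses differ, here is a per-sample sketch in plain Python. It is a simplified illustration, not the Keras implementation (Keras averages over the batch and clips probabilities away from 0 and 1), and the predicted probabilities are made-up values.

```python
import math

def binary_crossentropy(y_true, p):
    # Negative log-likelihood of the true label under predicted probability p (0 < p < 1).
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def mse(y_true, p):
    # Squared distance between the label and the predicted probability.
    return (y_true - p) ** 2

# True label is 1; the prediction gets progressively worse.
for p in (0.9, 0.5, 0.1, 0.01):
    print(p, round(binary_crossentropy(1.0, p), 4), round(mse(1.0, p), 4))
```

For a label in {0, 1}, mse can never exceed 1, while binary_crossentropy keeps growing as a confident prediction gets more wrong, so the two losses penalize large errors very differently.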

Try a different activation (tanh), again 1 layer with 16 units and the mse loss

In [26]:
model_one = models.Sequential()
model_one.add(layers.Dense(16, activation='tanh', input_shape=(10000,)))
model_one.add(layers.Dense(1, activation='sigmoid'))

model_one.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.mse,
              metrics=[metrics.binary_accuracy])

history_one = model_one.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))

results_one = model_one.evaluate(x_test, y_test)
print('Test accuracy: {}'.format(results_one[1]))

Train on 20000 samples, validate on 5000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test accuracy: 0.8506


relu works slightly better than tanh (0.85456 vs. 0.8506, both with the mse loss)
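As a reminder of what the two activations compute, here is a minimal sketch in plain Python; the sample inputs are arbitrary.

```python
import math

def relu(x):
    # Passes positive values through unchanged and clamps negative values to 0.
    return max(0.0, x)

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # math.tanh squashes any input into the open interval (-1, 1).
    print(x, relu(x), round(math.tanh(x), 4))
```

relu is unbounded for positive inputs and exactly zero for negative ones, whereas tanh saturates toward -1 and 1 at both ends.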

Try using larger or smaller layers: 32 units, 128 units, and so on.

You used two hidden layers. Now try using a single hidden layer, or three hidden layers.

In [27]:
from keras.datasets import reuters

(train_data, train_labels), (test_data, test_labels) = reuters.load_data(
    num_words=10000)

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results

x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)

# to_categorical is exported from keras.utils; the np_utils submodule is gone in newer Keras
from keras.utils import to_categorical

one_hot_train_labels = to_categorical(train_labels)
one_hot_test_labels = to_categorical(test_labels)

Downloading data from https://s3.amazonaws.com/text-datasets/reuters.npz
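What `to_categorical` does to the labels can be sketched by hand. The function below is a simplified stand-in written for illustration, not the Keras source, and the toy labels are made up.

```python
import numpy as np

def to_one_hot(labels, num_classes):
    # One row per label, with a single 1 in the column given by the class index.
    results = np.zeros((len(labels), num_classes))
    for i, label in enumerate(labels):
        results[i, label] = 1.
    return results

# Three toy labels over a hypothetical 4-class problem.
print(to_one_hot([0, 2, 3], num_classes=4))
```

For Reuters the same idea yields vectors of length 46, one position per topic.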


In [28]:
x_val = x_train[:5000]
partial_x_train = x_train[5000:]

y_val = one_hot_train_labels[:5000]
partial_y_train = one_hot_train_labels[5000:]

In [29]:
model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))

model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.categorical_crossentropy,
              metrics=[metrics.binary_accuracy])

history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=9,
                    batch_size=512,
                    validation_data=(x_val, y_val)
                   )

results = model.evaluate(x_test, one_hot_test_labels)
results[1]

Train on 3982 samples, validate on 5000 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9


0.989962844474976

Using 32 units

In [30]:
model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))

model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.categorical_crossentropy,
              metrics=[metrics.binary_accuracy])

history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=9,
                    batch_size=512,
                    validation_data=(x_val, y_val)
                   )

results = model.evaluate(x_test, one_hot_test_labels)
results[1]

Train on 3982 samples, validate on 5000 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9


0.9894595354238155

Using 128 units

In [31]:
model = models.Sequential()
model.add(layers.Dense(128, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))

model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.categorical_crossentropy,
              metrics=[metrics.binary_accuracy])

history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=9,
                    batch_size=512,
                    validation_data=(x_val, y_val)
                   )

results = model.evaluate(x_test, one_hot_test_labels)
results[1]

Train on 3982 samples, validate on 5000 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9


0.9901370675247477

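The model with 128 units shows the best result (0.99014, vs. 0.98996 with 64 units and 0.98946 with 32), but the differences are tiny, so let's continue with the 32-unit model. Note that these scores are binary_accuracy values; for a 46-class softmax output, metrics.categorical_accuracy would be the per-sample measure.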
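The Reuters cells above pass `metrics.binary_accuracy` to a 46-class softmax model, which helps explain why every score is close to 0.99. The sketch below mimics both metrics in plain Python for a single sample; the function names, the 0.5 threshold, and the example prediction are assumptions for illustration, not the Keras implementation.

```python
NUM_CLASSES = 46

def one_hot(index, size=NUM_CLASSES):
    return [1.0 if i == index else 0.0 for i in range(size)]

def binary_accuracy(y_true, y_pred, threshold=0.5):
    # Fraction of individual outputs that land on the correct side of the threshold.
    hits = sum((p > threshold) == (t > threshold) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

def categorical_accuracy(y_true, y_pred):
    # 1.0 if the highest-scoring class is the true class, else 0.0.
    return float(y_true.index(max(y_true)) == y_pred.index(max(y_pred)))

y_true = one_hot(3)
# Hypothetical confident but wrong prediction: 0.9 on class 7, the rest spread evenly.
y_pred = [0.9 if i == 7 else 0.1 / (NUM_CLASSES - 1) for i in range(NUM_CLASSES)]

print(round(binary_accuracy(y_true, y_pred), 4))
print(categorical_accuracy(y_true, y_pred))
```

Even a completely wrong class prediction matches the label on 44 of the 46 outputs, so the element-wise score stays high while the per-sample score is 0.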

For 1 layer

In [32]:
model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(46, activation='softmax'))

model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.categorical_crossentropy,
              metrics=[metrics.binary_accuracy])

history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=9,
                    batch_size=512,
                    validation_data=(x_val, y_val)
                   )

results = model.evaluate(x_test, one_hot_test_labels)
results[1]

Train on 3982 samples, validate on 5000 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9


0.9895079328561615

For 3 layers

In [33]:
model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))

model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss=losses.categorical_crossentropy,
              metrics=[metrics.binary_accuracy])

history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=9,
                    batch_size=512,
                    validation_data=(x_val, y_val)
                   )

results = model.evaluate(x_test, one_hot_test_labels)
results[1]

Train on 3982 samples, validate on 5000 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9


0.9894788942481404