# Problem 3

Use this notebook to write your code for problem 3.

In [13]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

## 3D - Convolutional network

As in problem 2, we have provided code that loads and preprocesses the MNIST data and handles its quirks for you.

In [14]:
# load MNIST data into Keras format
import keras
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [15]:
# look at the shapes
print(x_train.shape)
print(x_test.shape)

(60000, 28, 28)
(10000, 28, 28)


In [16]:
# we'll need to one-hot encode the labels
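# (keras.utils.to_categorical is the public entry point; np_utils is an internal module)
y_train = keras.utils.to_categorical(y_train, num_classes=10)
y_test = keras.utils.to_categorical(y_test, num_classes=10)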
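
To make the one-hot step concrete, here is a minimal NumPy sketch of the same idea. The helper `one_hot` and the example labels are hypothetical and only for illustration; this is not how Keras implements `to_categorical` internally.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # For row i, set the column given by labels[i] to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example_labels = np.array([5, 0, 4])        # made-up digit labels
example_encoded = one_hot(example_labels, num_classes=10)
print(example_encoded.shape)                 # (3, 10)
print(example_encoded.argmax(axis=1))        # [5 0 4]
```

Taking `argmax` along the class axis recovers the integer labels, which is handy when comparing predictions against the encoded targets.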

In [17]:
# don't forget to NORMALIZE: scale pixel values from [0, 255] to [0, 1]
x_train = np.divide(x_train, 255)
x_test = np.divide(x_test, 255)

In [18]:
# we must reshape the X data (add a trailing channel dimension)
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

In [19]:
# look at the shapes
print(x_train.shape)
print(x_test.shape)

(60000, 28, 28, 1)
(10000, 28, 28, 1)


In [29]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers import Conv2D, MaxPooling2D, Flatten, BatchNormalization
from keras import regularizers

# sample model
# note: what is the difference between 'same' and 'valid' padding?
# Take a look at the outputs to understand the difference, or read the Keras documentation!
model = Sequential()
model.add(Conv2D(24, (3, 3), padding='same',
                 input_shape=(28, 28, 1)))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(Conv2D(24, (3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))

model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))
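
The comment in the cell above asks about `'same'` versus `'valid'` padding. As a rough sketch, the output size along one spatial dimension can be computed as follows. The helper `conv_output_size` is hypothetical and follows the usual TensorFlow-style convention; it is not a Keras function.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension of a conv or pooling layer."""
    if padding == "same":
        # The input is zero-padded so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")

print(conv_output_size(28, 3, padding="same"))             # 28
print(conv_output_size(28, 3, padding="valid"))            # 26
print(conv_output_size(28, 2, stride=2, padding="valid"))  # 14 (the 2x2 max pool)
```

With `'same'` padding both 3x3 convolutions keep the 28x28 spatial size, so only the 2x2 pooling layer halves it to 14x14.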

In [30]:
# why don't we take a look at the layers and outputs
# note: `None` in the first dimension means it can take any batch_size!
for layer in model.layers:
    print(layer)
    print(layer.output_shape)

<keras.layers.convolutional.Conv2D object at 0x000001859C0F2400>
(None, 28, 28, 24)
<keras.layers.normalization.BatchNormalization object at 0x000001859BFC5B00>
(None, 28, 28, 24)
<keras.layers.core.Activation object at 0x000001859C0FDEB8>
(None, 28, 28, 24)
<keras.layers.convolutional.Conv2D object at 0x000001859C0FDF60>
(None, 28, 28, 24)
<keras.layers.normalization.BatchNormalization object at 0x000001859BFC5B70>
(None, 28, 28, 24)
<keras.layers.core.Activation object at 0x000001859C1239E8>
(None, 28, 28, 24)
<keras.layers.pooling.MaxPooling2D object at 0x000001859C1A8940>
(None, 14, 14, 24)
<keras.layers.core.Dropout object at 0x000001859C0F2630>
(None, 14, 14, 24)
<keras.layers.core.Flatten object at 0x000001859C202DD8>
(None, 4704)
<keras.layers.core.Dense object at 0x000001859C1F9CF8>
(None, 64)
<keras.layers.core.Activation object at 0x000001859C2B66A0>
(None, 64)
<keras.layers.core.Dense object at 0x000001859C2C7E10>
(None, 10)
<keras.layers.core.Activation object at 0x0000018

In [31]:
# total number of parameters in the model (trainable and non-trainable):
model.count_params()

307410
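
The total can be checked by hand from the layer shapes printed earlier. The sketch below assumes the default Keras settings for these layers: biases enabled, and four parameters per channel in each `BatchNormalization` layer (gamma, beta, moving mean, moving variance). The variable names are only for illustration.

```python
# Conv2D: kernel_height * kernel_width * in_channels * filters + one bias per filter
conv1 = 3 * 3 * 1 * 24 + 24
conv2 = 3 * 3 * 24 * 24 + 24

# BatchNormalization: gamma, beta, moving mean, moving variance per channel
bn1 = 4 * 24
bn2 = 4 * 24

# Dense: inputs * units + one bias per unit (Flatten gives 14 * 14 * 24 = 4704 inputs)
dense1 = 14 * 14 * 24 * 64 + 64
dense2 = 64 * 10 + 10

total = conv1 + bn1 + conv2 + bn2 + dense1 + dense2
print(total)  # 307410
```

Almost all of the parameters sit in the first dense layer, which is why the size of the flattened feature map matters so much for model size.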

In [33]:
# For a multi-class classification problem
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

In [35]:
# Train the model, iterating on the data in batches of 32 samples
history = model.fit(x_train, y_train, epochs=10, batch_size=32,
                    validation_data=(x_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


During training, Keras reports the training loss and accuracy as well as the validation loss and accuracy (here, the validation set is the TEST set). To confirm these values, we can explicitly evaluate the model on the training and test sets.

In [36]:
# note that our model outputs two eval params:
# 1. loss (categorical cross-entropy)
# 2. accuracy
model.metrics_names

['loss', 'acc']

In [37]:
model.evaluate(x=x_train, y=y_train)



[0.018079070207305464, 0.99485]

In [38]:
model.evaluate(x=x_test, y=y_test)



[0.04054048867377223, 0.99]
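
For intuition about the two numbers returned by `evaluate`, here is a small NumPy sketch of categorical cross-entropy and accuracy on made-up predictions. The helper functions, the toy arrays, and the clipping constant `eps` are assumptions for illustration, not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum_k y_true[k] * log(y_prob[k])."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    """Fraction of samples whose most probable class matches the label."""
    return float(np.mean(y_true.argmax(axis=1) == y_prob.argmax(axis=1)))

# Four made-up samples with three classes each.
toy_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
toy_prob = np.array([[0.8, 0.1, 0.1],
                     [0.2, 0.7, 0.1],
                     [0.3, 0.3, 0.4],
                     [0.6, 0.3, 0.1]])

toy_loss = categorical_crossentropy(toy_true, toy_prob)
toy_acc = accuracy(toy_true, toy_prob)
print(toy_loss)
print(toy_acc)  # 0.75
```

The last sample is misclassified, so accuracy is 3/4, while the loss also penalizes the low-confidence correct prediction in the third sample.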

**Set the probabilities of your dropout layers to 10 equally-spaced values p ∈ [0, 1], train for 1 epoch, and
report the final model accuracies for each**
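
Before sweeping the dropout rate, it may help to see what a dropout layer does to its input during training. The sketch below is a minimal NumPy version of inverted dropout, in which kept units are scaled by 1 / (1 - rate), as the Keras `Dropout` documentation describes. The helper `inverted_dropout` and the toy input are hypothetical.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero each unit with probability `rate` and rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_mask = rng.random(activations.shape) >= rate
    # Dividing by (1 - rate) keeps the expected activation unchanged.
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
toy_activations = np.ones(100_000)           # made-up layer output
dropped = inverted_dropout(toy_activations, rate=0.3, rng=rng)

zero_fraction = float(np.mean(dropped == 0.0))
mean_activation = float(dropped.mean())
print(zero_fraction)    # close to the dropout rate
print(mean_activation)  # close to the original mean of 1
```

At test time dropout is disabled, so larger rates mainly change how noisy training is; this sketch only covers the training-time behavior.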

In [40]:

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers import Conv2D, MaxPooling2D, Flatten, BatchNormalization
from keras import regularizers

dropouts = np.arange(0, 1, 0.1)

for d in dropouts:
    # same architecture as the sample model above; only the dropout rate d changes
    model = Sequential()
    model.add(Conv2D(24, (3, 3), padding='same',
                     input_shape=(28, 28, 1)))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(Conv2D(24, (3, 3), padding='same'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(d))

    model.add(Flatten())
    model.add(Dense(64))
    model.add(Activation('relu'))
    model.add(Dense(10))
    model.add(Activation('softmax'))
    
    # For a multi-class classification problem
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    
    # Train the model for a single epoch, iterating on the data in batches of 32 samples
    history = model.fit(x_train, y_train, epochs=1, batch_size=32,
                        validation_data=(x_test, y_test))
    print('Dropout rate: {}'.format(d))
    print(model.evaluate(x=x_test, y=y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.0
[0.05165868423196953, 0.9834]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.1
[0.068304227259435, 0.9818]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.2
[0.05782625220362097, 0.982]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.30000000000000004
[0.06246391470642411, 0.9805]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.4
[0.05893735353499651, 0.9824]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.5
[0.05952274547966663, 0.981]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.6000000000000001
[0.06562029218955431, 0.9797]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.7000000000000001
[0.06480238866824074, 0.9805]
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
Dropout rate: 0.8
[0.0772786666626