# Techforum : Deep Learning (part 3/3)

## Convolutional Neural Networks (Convnets) in Keras

Objective:
- Use Deep Learning (a convolutional network) to get better results at classifying MNIST

Note : this toy example runs fast on a GPU... and much, much slower on a CPU (~45 minutes on a laptop !)

Notebook inspired by : https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py
    

In [1]:
import tensorflow as tf

import timeit

# Use the TensorFlow tutorials' helper to load/prepare the MNIST dataset
# (this helper ships with TensorFlow 1.x ; it was removed in TensorFlow 2.x)
from tensorflow.examples.tutorials.mnist import input_data

# Import Keras and the Deep Learning / Convolutional Neural Network building blocks
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D


Using TensorFlow backend.


### Load the MNIST dataset

In [2]:
# Import data (Thanks to helpers provided in Tensorflow tutorials !)
mnist = input_data.read_data_sets('./', one_hot=True)

Extracting ./train-images-idx3-ubyte.gz
Extracting ./train-labels-idx1-ubyte.gz
Extracting ./t10k-images-idx3-ubyte.gz
Extracting ./t10k-labels-idx1-ubyte.gz
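
With `one_hot=True`, each label is stored as a vector of 10 values with a single 1 at the index of the digit, rather than as a plain integer. The sketch below illustrates that encoding with NumPy ; `to_one_hot` is a hypothetical helper written for this illustration, not part of the TensorFlow helper.

```python
import numpy as np

def to_one_hot(labels, num_classes=10):
    """Hypothetical helper: turn integer class labels into one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Put a 1 in column `label` of each row
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

digits = [3, 0, 9]
one_hot = to_one_hot(digits)
print(one_hot.shape)           # (3, 10)
print(one_hot.argmax(axis=1))  # argmax recovers the original digits
```

Taking the `argmax` of a one-hot row gives back the integer class, which is also how a predicted probability vector is turned into a predicted digit.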


In [3]:
# For convenience, define a few (Python/NumPy) variables to handle the dataset
x_train = mnist.train.images
y_train = mnist.train.labels
x_test = mnist.test.images
y_test = mnist.test.labels

# Reshape the dataset to fit with the convolutional layers
# which expect at their input a 4D tensor with shape
# (samples, rows, cols, channels)
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
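
The reshape above can be checked on a small stand-in array. This sketch assumes, as the reshape in the cell implies, that each image arrives as a flat row of 784 = 28 x 28 pixel values ; `flat_images` is fake data made up for the illustration, not MNIST.

```python
import numpy as np

# Stand-in for mnist.train.images: 5 fake flattened images of 784 values each
flat_images = np.arange(5 * 784, dtype=np.float32).reshape(5, 784)

# Same operation as in the notebook: (samples, 784) -> (samples, rows, cols, channels)
images_4d = flat_images.reshape(flat_images.shape[0], 28, 28, 1)

print(flat_images.shape)  # (5, 784)
print(images_4d.shape)    # (5, 28, 28, 1)

# Pixel k of a flat row lands at row k // 28, column k % 28, channel 0
k = 100
print(images_4d[0, k // 28, k % 28, 0] == flat_images[0, k])  # True
```

The reshape does not change any pixel value ; it only tells the convolutional layers where the rows, columns and (single, grayscale) channel are.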

### Define some Hyperparameters for the network

In [4]:
# How fast the network learns, i.e. how large each weight update is during training
#    too low, and the network will take too long to learn
#    too high, and the network might never converge to a solution
# Note : this value is not passed to the optimizer in this notebook, which uses Adadelta's defaults
learning_rate = 0.001

# Number of training epochs (in Keras : one epoch ~ one loop over the full training dataset)
epoch = 12

# Number of images to process per batch iteration
batch_size = 128

# Path to the directory holding the TensorBoard logs and training checkpoints
logs_path = "./logs/mnist/Keras/ConvNet"

# Not important : just a counter to separate the log directories of successive training experiments
experiments = 1
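
To get a feel for what `epoch` and `batch_size` imply, here is a small back-of-the-envelope computation. It assumes the 55000 training samples that Keras reports in the training log of this notebook ; the variable names other than `epoch` and `batch_size` are introduced only for this sketch.

```python
import math

train_samples = 55000   # training set size reported in the Keras training log
batch_size = 128
epoch = 12

# One weight update per batch; the last batch of an epoch may be smaller
steps_per_epoch = math.ceil(train_samples / batch_size)
last_batch_size = train_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epoch

print(steps_per_epoch)   # 430
print(last_batch_size)   # 88
print(total_updates)     # 5160
```

A larger `batch_size` means fewer (but more expensive) updates per epoch ; a smaller one means more, noisier updates.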

### Build the model : just by stacking the layers

In [5]:
# The Sequential model is a linear stack of layers
model = Sequential()

# 2D convolution layer (spatial convolution over images).
# This layer creates a convolution kernel that is convolved with the 
# layer input to produce a tensor of outputs.
model.add(Conv2D(filters=32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(28,28,1)))

# Apply a second layer of convolution
model.add(Conv2D(filters=64, kernel_size=(3, 3), 
                 activation='relu'))

# Sub-sampling : max pooling operation for spatial data.
model.add(MaxPooling2D(pool_size=(2, 2)))

# Add regularization : Dropout randomly sets a fraction (rate) of the input units
# to 0 at each update during training, which helps prevent overfitting
model.add(Dropout(rate=0.25))

# Flattens the input. Does not affect the batch size.
model.add(Flatten())

# Just a regular densely-connected NN layer, here with 128 neurons
# relu (rectified linear unit) applies a non-linearity
model.add(Dense(units=128, activation='relu'))

# Add regularization again : this time half of the input units (rate=0.5)
# are randomly set to 0 at each update during training
model.add(Dropout(rate=0.5))

# Just a regular densely-connected NN layer for the output of the network, here with 10 neurons (one per digit)
model.add(Dense(units=10, activation='softmax'))
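
The dropout comments above can be made concrete with a few lines of NumPy. This is only an illustrative sketch of the idea (so-called inverted dropout, where the surviving units are scaled up by 1 / (1 - rate) so that the expected sum is unchanged), not the Keras implementation ; `dropout_train` is a hypothetical name.

```python
import numpy as np

def dropout_train(x, rate, rng):
    """Illustrative inverted dropout, as applied at training time only."""
    # Keep each unit with probability (1 - rate)
    keep_mask = rng.random_sample(x.shape) >= rate
    # Scale the kept units so the expected value of the output matches the input
    return x * keep_mask / (1.0 - rate)

rng = np.random.RandomState(0)
activations = np.ones(1000)
dropped = dropout_train(activations, rate=0.5, rng=rng)

print(np.unique(dropped))       # only 0.0 and 2.0 appear
print((dropped == 0).mean())    # roughly half of the units are zeroed
print(dropped.mean())           # stays close to 1.0 on average
```

At test time no unit is dropped, which is why the scaling is done during training.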



In [6]:
# Cool tool to display information about the model we have built
model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 9216)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1179776   
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
__________
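
The output shapes and parameter counts in the summary can be re-derived by hand. The sketch below assumes the Keras defaults used in this notebook (no padding, stride 1 for the convolutions) ; the helper functions are hypothetical and written only for this check. The value for the final 10-unit layer is computed from the model definition, since its row is not visible in the summary above.

```python
def conv2d_valid(height, width, in_channels, filters, kernel=3):
    """Output shape and parameter count of an unpadded, stride-1 convolution."""
    out_h = height - kernel + 1
    out_w = width - kernel + 1
    # One kernel x kernel x in_channels weight block plus one bias per filter
    params = (kernel * kernel * in_channels + 1) * filters
    return (out_h, out_w, filters), params

def dense_params(inputs, units):
    """Parameter count of a fully connected layer: weights plus biases."""
    return (inputs + 1) * units

shape1, p_conv1 = conv2d_valid(28, 28, 1, 32)
shape2, p_conv2 = conv2d_valid(shape1[0], shape1[1], shape1[2], 64)
pooled = (shape2[0] // 2, shape2[1] // 2, shape2[2])   # 2x2 max pooling
flat = pooled[0] * pooled[1] * pooled[2]               # Flatten
p_dense1 = dense_params(flat, 128)
p_dense2 = dense_params(128, 10)

print(shape1, p_conv1)   # (26, 26, 32) 320
print(shape2, p_conv2)   # (24, 24, 64) 18496
print(pooled, flat)      # (12, 12, 64) 9216
print(p_dense1)          # 1179776
print(p_dense2)          # 1290
print(p_conv1 + p_conv2 + p_dense1 + p_dense2)  # 1199882
```

Almost all the parameters sit in the first dense layer ; the pooling, dropout and flatten layers have none.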

### Training in Keras : compile, fit (train), evaluate

In [7]:
# TensorBoard callback : writes the training logs (and the model graph) to logs_path
tbCallBack = keras.callbacks.TensorBoard(log_dir=logs_path, histogram_freq=0,
                                         write_graph=True, write_images=False)

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

# Monitor execution time
start_time = timeit.default_timer()

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epoch,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[tbCallBack])

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

#### Training is done !

print("Execution time= %4f sec" % (timeit.default_timer() - start_time)) 

Train on 55000 samples, validate on 10000 samples
Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12
Test loss: 0.0285407811684
Test accuracy: 0.9906
Execution time= 118.637069 sec
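
To read the two numbers printed above, here is a plain-Python sketch of what the loss and the accuracy metric measure. It is an illustration on two made-up predictions, not the Keras code ; the function names and the `eps` guard against log(0) are assumptions of this sketch.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy of one sample: -sum(true * log(predicted))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# Two made-up samples with 3 classes: one-hot labels and softmax-like outputs
y_true = [[0, 1, 0], [1, 0, 0]]
y_pred = [[0.1, 0.8, 0.1], [0.3, 0.6, 0.1]]

losses = [categorical_crossentropy(t, p) for t, p in zip(y_true, y_pred)]
mean_loss = sum(losses) / len(losses)
accuracy = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred)) / len(y_true)

print(round(mean_loss, 4))  # 0.7136
print(accuracy)             # 0.5

# Reading the notebook's result: 0.9906 accuracy on 10000 test images
misclassified = round((1 - 0.9906) * 10000)
print(misclassified)        # 94
```

The loss rewards confident correct predictions, while accuracy only counts whether the most probable class is the right one ; a test accuracy of 0.9906 corresponds to 94 misclassified digits out of 10000.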
