# A Keras Convolutional Neural Network on CIFAR-10

CIFAR-10 and CIFAR-100 are benchmark image datasets for computer vision. They consist of 32x32 color images from 10 and 100 different classes, respectively. In this tutorial, you will build a basic convolutional neural network (CNN) on the CIFAR-10 dataset from scratch. Along the way you will get more familiar with Keras and learn how to manipulate network weights, layers and models. These are essential skills when you are designing or debugging your own networks.

For the complete code, refer to the Keras example file on GitHub: https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py

## Step 1. Data acquisition and hyperparameter setting

In [2]:
import tensorflow as tf 
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
%matplotlib inline

### Keras dependencies 
When using Keras for machine learning, you often want to import some essential Keras models, layers and utilities (for data loading, model saving and retrieving). You are not required to do so as long as you import the keras module, but explicitly importing those dependencies spares you the repetitive "keras." prefix and keeps your code clean and organized.

In [3]:
import keras
# Keras has built-in functions for acquiring the cifar10 and cifar100 datasets
from keras.datasets import cifar10 
# Keras model and utilities  
from keras.models import Sequential 
from keras.models import model_from_json
# Keras layers 
from keras.layers import Dense, Activation, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D

Using TensorFlow backend.


### Hyperparameters
It's common practice to set your hyperparameters before building the model. Typical hyperparameters are the number of epochs, the batch size, the number of class labels (for classification) and the dropout rates (if you are using any dropout layers in the network).

In [4]:
batch_size = 32
num_classes = 10
epochs = 1

### CIFAR-10 data
The "load_data" function is in Keras's datasets/cifar10.py file. It downloads the image dataset, preprocesses it (mainly merging the archived data files) and returns a tuple containing a ready-to-use training set and test set.

In [5]:
# The data, shuffled and split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples


By default, the image class labels are encoded as single integers. Use the "keras.utils.to_categorical" function to convert them to a one-hot encoding.

In [6]:
# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

print("y_train shape:", y_train.shape)

y_train shape: (50000, 10)
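
To make the conversion concrete, here is a minimal NumPy sketch of what one-hot encoding does. It is not the Keras implementation; the helper name `one_hot` and the three sample labels are made up for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an (n_samples, num_classes) float32 matrix with a single 1 per row."""
    # CIFAR-10 labels arrive with shape (n, 1), so flatten them first
    labels = np.asarray(labels, dtype=int).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # put a 1 in the column given by each integer label
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample_labels = np.array([[3], [0], [9]])  # hypothetical labels, same layout as y_train
sample_one_hot = one_hot(sample_labels, 10)
print(sample_one_hot.shape)  # (3, 10)
print(sample_one_hot[0])     # 1.0 at index 3, zeros elsewhere
```

Each row sums to 1, which is the target format that the softmax output layer and the categorical cross-entropy loss used later expect.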


## Step 2. Build the model 

### Instantiate a model
The first step in building a Keras neural network is always to instantiate a model (think of it as a wrapper/container for the network layers). Here we are using a Sequential model.

In [7]:
model = Sequential()

### Add layers 
Now we can add layers to the model. Remember that we do this by calling the add method of our instantiated model with a layer instance as the argument.

***
Our model architecture should consist of the following:
- ***1st Convolutional Block***
    - 2D Convolutional layer (with zero padding, 32 3x3 filters)
    - relu activation 
    - 2D Convolutional layer (without padding, 32 3x3 filters)
    - relu activation
    - max pooling layer (2x2 pool size)
    - dropout layer (0.25 dropout rate)
- ***2nd Convolutional Block*** 
    - 2D Convolutional layer (with zero padding, 64 3x3 filters)
    - relu activation 
    - 2D Convolutional layer (without padding, 64 3x3 filters)
    - relu activation
    - max pooling layer (2x2 pool size)
    - dropout layer (0.25 dropout rate)
- ***Output Block***
    - flatten layer
    - fully-connected layer (512 output size)
    - relu activation
    - dropout layer (0.5 dropout rate)
    - fully-connected layer (10 output size)
    - softmax activation / output layer
***
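
Before writing any Keras code, it can help to trace how the spatial size of a 32x32 image changes through the two convolutional blocks listed above. The sketch below is plain Python arithmetic, not Keras code. It assumes stride-1 convolutions and non-overlapping 2x2 pooling, and the helper names are made up for illustration.

```python
def conv_output_size(size, kernel, padding):
    """Spatial size after a stride-1 convolution."""
    if padding == "same":
        return size           # zero padding keeps the size unchanged at stride 1
    return size - kernel + 1  # "valid": no padding, the border is lost

def pool_output_size(size, pool):
    """Spatial size after non-overlapping max pooling (stride equals pool size)."""
    return size // pool

size = 32  # CIFAR-10 images are 32x32
trace = []
for filters in (32, 64):  # 1st and 2nd convolutional block
    size = conv_output_size(size, 3, "same")
    trace.append(("conv same", size, filters))
    size = conv_output_size(size, 3, "valid")
    trace.append(("conv valid", size, filters))
    size = pool_output_size(size, 2)
    trace.append(("max pool", size, filters))

flattened = size * size * 64  # length of the vector produced by the flatten layer
for name, s, f in trace:
    print(f"{name:10s} -> ({s}, {s}, {f})")
print("flatten    ->", flattened)
```

Activation and dropout layers are left out of the trace because they do not change the shape of their input.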
    
Use the "***model.summary()***" function to check your model architecture at the end.

In [8]:
model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

In [9]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 32, 32, 32)        896       
_________________________________________________________________
activation_1 (Activation)    (None, 32, 32, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 30, 30, 32)        9248      
_________________________________________________________________
activation_2 (Activation)    (None, 30, 30, 32)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 15, 15, 32)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 15, 15, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 64)        18496     
__________
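
The "Param #" column can be checked by hand. A convolutional layer holds one kernel of size height x width x input channels per filter, plus one bias per filter. The sketch below recomputes the three counts visible in the summary above; the helper name `conv2d_params` is made up for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a 2D convolution: kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

conv_counts = [
    conv2d_params(3, 3, 3, 32),   # conv2d_1: RGB input, 32 filters
    conv2d_params(3, 3, 32, 32),  # conv2d_2: 32 input channels, 32 filters
    conv2d_params(3, 3, 32, 64),  # conv2d_3: 32 input channels, 64 filters
]
print(conv_counts)  # [896, 9248, 18496]
```

Activation, pooling and dropout layers report 0 because they have no trainable weights.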

## Step 3. Train the model

Model compilation builds the computational graph and model fitting trains the model with the given data batches.

### Compile model

Use RMSprop as the model optimizer, with a $10^{-4}$ learning rate and a $10^{-6}$ learning-rate decay.

***Note:*** You can customize your own optimizer (along with its learning rate and other optimization parameters) for model compilation. Refer to https://keras.io/optimizers/ for other optimization methods in Keras.

In [10]:
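# instantiate the RMSprop optimizer via its class name
# (newer Keras releases call the first argument learning_rate instead of lr)
opt = keras.optimizers.RMSprop(lr=0.0001, decay=1e-6)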
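
To get a feel for what these two numbers control, here is a rough NumPy sketch of a textbook RMSprop update on a toy loss. It is not the Keras source code. The time-based decay formula, `rho=0.9` and `eps=1e-7` are assumptions chosen to resemble common defaults, and the function names are made up for illustration.

```python
import numpy as np

def decayed_lr(base_lr, decay, iteration):
    """Time-based decay (assumed form): the step size shrinks slowly with each update."""
    return base_lr / (1.0 + decay * iteration)

def rmsprop_step(weights, grads, cache, lr, rho=0.9, eps=1e-7):
    """One RMSprop update: scale each gradient by a running RMS of its history."""
    cache = rho * cache + (1.0 - rho) * grads ** 2
    weights = weights - lr * grads / (np.sqrt(cache) + eps)
    return weights, cache

w = np.array([1.0, -2.0])  # toy weights
cache = np.zeros_like(w)   # running average of squared gradients
for step in range(3):
    grads = 2.0 * w        # gradient of the toy loss sum(w**2)
    lr = decayed_lr(1e-4, 1e-6, step)
    w, cache = rmsprop_step(w, grads, cache, lr)
print(w)  # both entries have moved slightly toward zero
```

With a decay of $10^{-6}$ the learning rate barely changes over a single epoch, so the base learning rate dominates the early behavior.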

Use categorical cross-entropy as the loss function and accuracy as the training metric. Refer to ... for other variants.

In [11]:
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
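
As a rough illustration of what these two quantities measure, the NumPy sketch below computes categorical cross-entropy and accuracy for two made-up predictions over three classes. It is a simplified stand-in, not the Keras implementation; the clipping constant `eps` is an assumption.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: minus the log of the probability given to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    return np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1))

# hypothetical one-hot labels and softmax outputs
y_true = np.array([[0, 1, 0], [1, 0, 0]], dtype="float32")
y_pred = np.array([[0.1, 0.8, 0.1], [0.3, 0.6, 0.1]], dtype="float32")

losses = categorical_crossentropy(y_true, y_pred)
print(losses.mean(), accuracy(y_true, y_pred))
```

The first sample is classified correctly with high confidence and gets a small loss. The second is misclassified, so its loss is larger and the accuracy over the pair is 0.5.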

### Start training 

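Rescale your image data (training and testing) by dividing each pixel value by 255, its maximum possible value, so that all inputs lie in the range [0, 1].

***Note:*** Full standardization/normalization means zero centering plus unit rescaling. The cell below only rescales; it does not subtract the mean.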

In [12]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
x_train.shape

(50000, 32, 32, 3)
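
The cell above only divides by 255. For comparison, the sketch below shows what full per-channel standardization (zero centering plus unit rescaling) could look like. It runs on randomly generated stand-in images rather than CIFAR-10, and all variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-in for image data: 100 random 32x32 RGB images with values in 0..255
fake_images = rng.integers(0, 256, size=(100, 32, 32, 3)).astype("float32")

# rescaling only: values end up in [0, 1]
rescaled = fake_images / 255.0

# standardization: subtract the per-channel mean, divide by the per-channel std
channel_mean = rescaled.mean(axis=(0, 1, 2))
channel_std = rescaled.std(axis=(0, 1, 2))
standardized = (rescaled - channel_mean) / channel_std

print(rescaled.min(), rescaled.max())
print(standardized.mean(axis=(0, 1, 2)), standardized.std(axis=(0, 1, 2)))
```

If you standardize real data this way, compute the mean and standard deviation on the training set only and reuse the same values for the test set.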

Fit the model on the first 20000 training samples, use the test data for validation, and enable data shuffling.

In [11]:
model.fit(x_train[:20000], y_train[:20000],
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test),
          shuffle=True)

Not using data augmentation.
Train on 20000 samples, validate on 10000 samples
Epoch 1/1


<keras.callbacks.History at 0x11d025c88>
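
To see how the batch size and shuffling interact, the sketch below mimics the batching of one epoch with plain NumPy index arrays. It is only an illustration of the idea, not how Keras schedules batches internally.

```python
import math
import numpy as np

num_samples, batch_size = 20000, 32
# number of weight updates in one epoch
steps_per_epoch = math.ceil(num_samples / batch_size)

rng = np.random.default_rng(0)
order = rng.permutation(num_samples)  # shuffling: visit the samples in random order
batches = [order[i:i + batch_size] for i in range(0, num_samples, batch_size)]

print(steps_per_epoch, len(batches), len(batches[-1]))  # 625 625 32
```

Because 20000 is an exact multiple of 32, every batch is full. With a sample count that is not a multiple of the batch size, the last batch would be smaller.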

## Step 4. Retrieve layers & weights

### Get model layers 
Retrieve all the layers in the model and print them out.

In [28]:
layers = model.layers

for layer in layers:
    print("Layer name: " + layer.name)
    print(str(layer) + "\n") 

Layer name: conv2d_1
<keras.layers.convolutional.Conv2D object at 0x122e18ba8>

Layer name: activation_1
<keras.layers.core.Activation object at 0x122e18b70>

Layer name: conv2d_2
<keras.layers.convolutional.Conv2D object at 0x122e18f28>

Layer name: activation_2
<keras.layers.core.Activation object at 0x122e18f60>

Layer name: max_pooling2d_1
<keras.layers.pooling.MaxPooling2D object at 0x1150f1198>

Layer name: dropout_1
<keras.layers.core.Dropout object at 0x1150f1d68>

Layer name: conv2d_3
<keras.layers.convolutional.Conv2D object at 0x122e527b8>

Layer name: activation_3
<keras.layers.core.Activation object at 0x122e523c8>

Layer name: conv2d_4
<keras.layers.convolutional.Conv2D object at 0x122e80a90>

Layer name: activation_4
<keras.layers.core.Activation object at 0x122e806a0>

Layer name: max_pooling2d_2
<keras.layers.pooling.MaxPooling2D object at 0x122e96d30>

Layer name: dropout_2
<keras.layers.core.Dropout object at 0x122e96ac8>

Layer name: flatten_1
<keras.layers.core.Fla

Retrieve an individual layer (for example, a convolutional layer) and print it out.

It's a good practice to build a dictionary of model layers indexed by their names. Build one and print it out.

In [35]:
# Retrieving individual layers
conv_layer_example = layers[0]
print(conv_layer_example.name + " layer at " + str(conv_layer_example) + "\n")

# layer dictionary
layer_dict = {layer.name:layer for layer in layers}

for layer_name in layer_dict:
    print(layer_name + " : " + str(layer_dict[layer_name]) + "\n")

conv2d_1 layer at <keras.layers.convolutional.Conv2D object at 0x122e18ba8>

flatten_1 : <keras.layers.core.Flatten object at 0x124ff74e0>

dropout_1 : <keras.layers.core.Dropout object at 0x1150f1d68>

conv2d_3 : <keras.layers.convolutional.Conv2D object at 0x122e527b8>

dense_1 : <keras.layers.core.Dense object at 0x125001940>

activation_1 : <keras.layers.core.Activation object at 0x122e18b70>

activation_3 : <keras.layers.core.Activation object at 0x122e523c8>

conv2d_4 : <keras.layers.convolutional.Conv2D object at 0x122e80a90>

activation_5 : <keras.layers.core.Activation object at 0x125001d30>

conv2d_1 : <keras.layers.convolutional.Conv2D object at 0x122e18ba8>

activation_2 : <keras.layers.core.Activation object at 0x122e18f60>

dropout_3 : <keras.layers.core.Dropout object at 0x125029f98>

max_pooling2d_1 : <keras.layers.pooling.MaxPooling2D object at 0x1150f1198>

max_pooling2d_2 : <keras.layers.pooling.MaxPooling2D object at 0x122e96d30>

activation_4 : <keras.layers.core.A

### Get layer weights 
Retrieve the learnt layer parameters and verify their shapes.

In [37]:
# use the get_weights() method from layer objects to retrieve learnt layer parameters
conv1_params = layers[0].get_weights()
print(len(conv1_params))

conv1_weights = conv1_params[0]
conv1_biases = conv1_params[1]
print("conv1 weight shape: " + str(conv1_weights.shape))
print("conv1 bias shape: " + str(conv1_biases.shape))

2
conv1 weight shape: (3, 3, 3, 32)
conv1 bias shape: (32,)
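
The weight shape reads as (kernel height, kernel width, input channels, number of filters). The sketch below uses random stand-in arrays of the same shapes, not the trained weights, to show how one 3x3 RGB window is turned into 32 filter responses.

```python
import numpy as np

rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3, 3, 32)).astype("float32")  # stand-in for conv1_weights
bias = np.zeros(32, dtype="float32")                           # stand-in for conv1_biases
patch = rng.standard_normal((3, 3, 3)).astype("float32")       # one 3x3 window of an RGB image

# contract height, width and channel axes; one number per filter remains
response = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2])) + bias
print(response.shape)  # (32,)
```

Sliding this computation over every window position of the input produces the 32 output channels of the first convolutional layer, before the activation is applied.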


## Step 5. Save the model & trained weights 

Typically you will want to save the model architecture and trained weights so that you can reuse them next time (either reusing only the model architecture with new weights or reusing the whole trained model). A Keras model can be saved in several ways:
- save model + weights to HDF5 file(s)
- save model architecture to JSON file(s)
- save model parameters/weights to HDF5 file(s)  

***Note:*** HDF5 is an efficient file format that stores data in a hierarchical structure and is especially suited to large amounts of scientific data. Please refer to https://support.hdfgroup.org/HDF5/whatishdf5.html for more details.

In [None]:
# save the whole model (model architecture + trained weights)
model.save("whole_model.h5")

In [16]:
# serialize model to JSON
model_json = model.to_json()
with open("model_architecture.json", "w") as json_file:
    json_file.write(model_json)
    
# serialize weights to HDF5
model.save_weights("model_weights.h5")

print("Saved model to disk")

Saved model to disk


## Step 6. Reload model & weights 
Depending on how you saved your model and weights, there are different procedures to reload your trained model:
- reload the entire model 
- load model architecture 
- load model weights 

In [None]:
# reload the entire model (architecture + weights) saved with model.save()
from keras.models import load_model
whole_model = load_model('whole_model.h5')

In [None]:
# load the JSON architecture and create the model from it
with open('model_architecture.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)

# load weights into new model
loaded_model.load_weights("model_weights.h5")

print("Loaded model from disk")
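
After reloading, it is worth checking that the restored weights really match the originals. The helper below is a small sketch for that purpose. The name `weights_match` is made up, and the demonstration uses stand-in NumPy arrays so that it runs without a trained model.

```python
import numpy as np

def weights_match(weights_a, weights_b, atol=1e-6):
    """Compare two lists of weight arrays, such as those returned by get_weights()."""
    if len(weights_a) != len(weights_b):
        return False
    return all(a.shape == b.shape and np.allclose(a, b, atol=atol)
               for a, b in zip(weights_a, weights_b))

# stand-in arrays with the shapes of the first convolutional layer
original = [np.ones((3, 3, 3, 32), dtype="float32"), np.zeros(32, dtype="float32")]
reloaded = [w.copy() for w in original]
print(weights_match(original, reloaded))
```

With the objects from this tutorial, the call would be `weights_match(model.get_weights(), loaded_model.get_weights())`. Keep in mind that a model rebuilt from JSON is not compiled, so call `compile` on it before evaluating or training.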