# 2. Develop a CNN model for the CIFAR-10 classification problem

* The CIFAR-10 dataset consists of 60,000 color images of size 32×32 in 10 classes, with 6,000 images per class.
<img src="./CIFAR10.jpeg" width="442" />

* Ref : https://appliedmachinelearning.blog/2018/03/24/achieving-90-accuracy-in-object-recognition-task-on-cifar-10-dataset-with-keras-convolutional-neural-networks/

## EDA (Exploratory Data Analysis)

In [6]:
from keras.datasets import cifar10
import pandas as pd

In [7]:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [8]:
print(x_train.shape)
print(x_test.shape)

(50000, 32, 32, 3)
(10000, 32, 32, 3)


In [9]:
d = pd.DataFrame(y_train)
d[0].value_counts()

9    5000
8    5000
7    5000
6    5000
5    5000
4    5000
3    5000
2    5000
1    5000
0    5000
Name: 0, dtype: int64

## Objective : Classify the given image into one of 10 classes
* <h4>Class labels</h4>

     * airplane : 0
     * automobile : 1
     * bird : 2
     * cat : 3
     * deer : 4
     * dog : 5
     * frog : 6
     * horse : 7
     * ship : 8
     * truck : 9
* <h4>Image</h4> 
        32 × 32 pixels, each with 3 values, one per color channel (RGB),
        shape - (32,32,3)
* <h4>There are 50,000 training images and 10,000 test images.</h4>  
* <h4>Train Data</h4>
        Images in 10 classes, with 5000 images per class. 
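
To turn an integer label back into a readable class name, a small lookup table is enough. The sketch below follows the label order listed above; the names `CIFAR10_CLASSES` and `label_name` are made up for illustration and are not part of Keras.

```python
# Class names in label order, as listed above (list index == class id).
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]


def label_name(class_id):
    """Return the class name for an integer label in the range 0..9."""
    return CIFAR10_CLASSES[int(class_id)]


print(label_name(3))  # cat
```

Since `y_train` has shape (50000, 1), a call such as `label_name(y_train[0][0])` would give the name of the first training image's class.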

## Keras Implementation

In [10]:
import keras
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D
from keras import backend as bk
from keras.callbacks import ModelCheckpoint

import os.path

In [11]:
from keras import regularizers
from keras.layers import Activation,BatchNormalization

In [12]:
# Path to saved model weights (as HDF5)
resume_weights = "./models/cifar10-cnn-best.hdf5"

# Hyper-parameters
batch_size = 128
num_classes = 10
epochs = 5

# input image dimensions
img_rows, img_cols = 32, 32
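
With `batch_size = 128` and the 50,000 training images, each epoch performs a fixed number of weight updates, and the last batch is smaller than the rest. A minimal sketch of that arithmetic follows; the helper `steps_per_epoch` is a hypothetical name, not a Keras function.

```python
def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (ceiling division)."""
    return (num_samples + batch_size - 1) // batch_size


num_train = 50000   # training images in CIFAR-10
batch = 128         # same value as batch_size above

steps = steps_per_epoch(num_train, batch)
last_batch = num_train - (steps - 1) * batch

print(steps)       # 391
print(last_batch)  # 80
```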

In [13]:
# Reshape strategy according to backend
if bk.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 3, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 3, img_rows, img_cols)
    # 3 x 32 x 32 [number_of_channels (colors) x height x width]
    input_shape = (3, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 3)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 3)
    # 32 x 32 x 3 [height x width x number_of_channels (colors)]
    input_shape = (img_rows, img_cols, 3)
    
print(x_train.shape)
print(input_shape)

(50000, 32, 32, 3)
(32, 32, 3)
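
The two data formats hold the same pixel values in a different axis order. The pure-Python sketch below shows the difference on a toy 2 × 2 image; `to_channels_first` is an illustrative helper, and with NumPy arrays the same reordering of a batch is an axis transpose such as `x.transpose(0, 3, 1, 2)`.

```python
def to_channels_first(image):
    """Convert one image from [height][width][channel] to [channel][height][width]."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [[[image[y][x][c] for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


# A toy 2 x 2 image in channels-last layout; each pixel is (R, G, B).
tiny = [[(1, 2, 3), (4, 5, 6)],
        [(7, 8, 9), (10, 11, 12)]]

chw = to_channels_first(tiny)
print(len(chw))  # 3 (one plane per color channel)
print(chw[0])    # [[1, 4], [7, 10]] (the red plane)
```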


In [14]:
print(x_test.max())
print(x_test.min())

255
0


In [15]:
# Convert to float32 and scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

In [16]:
print(x_test.max())
print(x_test.min())

1.0
0.0


In [17]:
# Dataset info
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
print(input_shape)

('x_train shape:', (50000, 32, 32, 3))
(50000, 'train samples')
(10000, 'test samples')
(32, 32, 3)


In [18]:
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print(y_test.shape)

(10000, 10)
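
Each row of the resulting matrix is a one-hot vector: all zeros except a 1 at the index of the class. The sketch below illustrates the idea for a single label in plain Python; `one_hot` and `decode` are hypothetical helpers, not the Keras implementation.

```python
def one_hot(label, num_classes=10):
    """Return a list of length num_classes with 1.0 at position `label`."""
    row = [0.0] * num_classes
    row[label] = 1.0
    return row


def decode(row):
    """Recover the integer label from a one-hot (or probability) row."""
    return row.index(max(row))


encoded = one_hot(3)
print(encoded)          # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(encoded))  # 3
```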


In [19]:
# MODEL
# Conv(32,3,3)[ReLU] -> Conv(64,3,3)[ReLU] -> MaxPool(2,2)[Dropout 0.25] ->FC(_, 128)[ReLU][Dropout 0.5] -> FC(128, 10)[Softmax]
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 30, 30, 32)        896       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 28, 28, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 14, 14, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 12544)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1605760   
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
__________
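
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes the Keras defaults used in the model cell (3 × 3 kernels, stride 1, `'valid'` padding, and non-overlapping 2 × 2 pooling); the helper functions are illustrative and not Keras APIs.

```python
def conv2d_valid(in_shape, filters, kernel=3):
    """Output shape and parameter count of a stride-1, 'valid' convolution."""
    h, w, c = in_shape
    out_shape = (h - kernel + 1, w - kernel + 1, filters)
    params = (kernel * kernel * c + 1) * filters  # weights + one bias per filter
    return out_shape, params


def max_pool(in_shape, pool=2):
    """Output shape of non-overlapping max pooling (no parameters)."""
    h, w, c = in_shape
    return (h // pool, w // pool, c)


def dense_params(in_units, out_units):
    """Parameter count of a fully connected layer (weights + biases)."""
    return (in_units + 1) * out_units


shape, p_conv1 = conv2d_valid((32, 32, 3), 32)   # (30, 30, 32), 896
shape, p_conv2 = conv2d_valid(shape, 64)         # (28, 28, 64), 18496
shape = max_pool(shape)                          # (14, 14, 64)
flat = shape[0] * shape[1] * shape[2]            # 12544
p_dense1 = dense_params(flat, 128)               # 1605760
p_dense2 = dense_params(128, 10)                 # 1290

print(p_conv1 + p_conv2 + p_dense1 + p_dense2)   # 1626442
```

The figures for the final `Dense` layer and the total are computed from the layer definitions rather than read from the summary output.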

In [20]:
# Write the model graph to a TensorBoard log directory
import tensorflow as tf

writer = tf.summary.FileWriter('./cifar10/tb1',bk.get_session().graph)
writer.close()

* TF graph
<img src="./cifar10/cifar10_tfgraph.png"/>

In [139]:
# If a best-model checkpoint exists, load its weights
if os.path.isfile(resume_weights):
    print("Resumed model's weights from {}".format(resume_weights))
    # load weights
    model.load_weights(resume_weights)

# Categorical cross-entropy (CEE) loss, Adam optimizer
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])

Resumed model's weights from ./models/cifar10-cnn-best.hdf5
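
For a one-hot target, categorical cross-entropy reduces to the negative log of the probability the model assigns to the true class. The following is a simplified plain-Python sketch for a single sample, not the Keras implementation; the `eps` floor is an assumption added here to avoid `log(0)`.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


target = [0.0] * 10
target[3] = 1.0                       # true class: 3

uniform = [0.1] * 10                  # model with no preference among classes
confident = [0.1 / 9] * 10
confident[3] = 0.9                    # model that puts 0.9 on the true class

print(round(categorical_crossentropy(target, uniform), 4))    # 2.3026
print(round(categorical_crossentropy(target, confident), 4))  # 0.1054
```

The uniform case gives ln(10) ≈ 2.30, which is a handy baseline when reading loss values for a 10-class problem.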


In [140]:
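# Save checkpoints to the same path that weights are resumed from
filepath = resume_weights

# Keep only a single checkpoint: the one with the best validation accuracy
# (the test set is used as validation data in this notebook).
checkpoint = ModelCheckpoint(filepath,monitor='val_acc',verbose=1,save_best_only=True,mode='max')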

In [141]:
# Train
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[checkpoint])

# Eval
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Train on 50000 samples, validate on 10000 samples
Epoch 1/5

Epoch 00001: val_acc improved from -inf to 0.59880, saving model to ./models/cifar10-cnn-best.hdf5
Epoch 2/5

Epoch 00002: val_acc improved from 0.59880 to 0.61380, saving model to ./models/cifar10-cnn-best.hdf5
Epoch 3/5

Epoch 00003: val_acc improved from 0.61380 to 0.65570, saving model to ./models/cifar10-cnn-best.hdf5
Epoch 4/5

Epoch 00004: val_acc improved from 0.65570 to 0.66280, saving model to ./models/cifar10-cnn-best.hdf5
Epoch 5/5

Epoch 00005: val_acc improved from 0.66280 to 0.67710, saving model to ./models/cifar10-cnn-best.hdf5
('Test loss:', 0.9317553981781006)
('Test accuracy:', 0.67710000000000004)
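
The log shows a save after every epoch because `val_acc` improved each time. The rule behind `save_best_only=True` with `mode='max'` can be sketched roughly as below; `epochs_that_save` is a hypothetical helper that only mimics the comparison, and the values are the `val_acc` figures from the log above.

```python
def epochs_that_save(values):
    """Return (epochs where the monitored value improved, best value seen)."""
    best = float("-inf")       # matches "improved from -inf" in the first epoch
    saved = []
    for epoch, value in enumerate(values, start=1):
        if value > best:       # mode='max': larger is better
            best = value
            saved.append(epoch)
    return saved, best


val_acc = [0.59880, 0.61380, 0.65570, 0.66280, 0.67710]
saved_epochs, best_acc = epochs_that_save(val_acc)

print(saved_epochs)  # [1, 2, 3, 4, 5]
print(best_acc)      # 0.6771
```

If an epoch had scored lower than an earlier one, it would be skipped and the file on disk would keep the earlier, better weights.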
