# CIFAR-10

Link: https://www.cs.toronto.edu/~kriz/cifar.html

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. 

The classes are completely mutually exclusive. There is no overlap between automobiles and trucks. "Automobile" includes sedans, SUVs, things of that sort. "Truck" includes only big trucks. Neither includes pickup trucks.

In [1]:
# all the imports
import tensorflow.keras as keras
import numpy as np
import glob

In [2]:
# variables
num_classes = 10

## Downloading and Preparing the Dataset
The archive contains the files data_batch_1, data_batch_2, ..., data_batch_5, as well as test_batch. Each of these files is a Python "pickled" object produced with Pickle.

Loaded in this way, each of the batch files contains a dictionary with the following elements:
- data -- a 10000x3072 numpy array of uint8s. Each row of the array stores a 32x32 colour image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so that the first 32 entries of the array are the red channel values of the first row of the image.
- labels -- a list of 10000 numbers in the range 0-9. The number at index i indicates the label of the ith image in the array data.

In [3]:
# get all the dataset file names
filenames_train = np.array(glob.glob("./cifar-10-batches-py/data_batch_*"), dtype=str)
filename_test = "./cifar-10-batches-py/test_batch"
filename_label = "./cifar-10-batches-py/batches.meta"

In [4]:
filenames_train, filename_test, filename_label

(array(['./cifar-10-batches-py/data_batch_2',
        './cifar-10-batches-py/data_batch_1',
        './cifar-10-batches-py/data_batch_5',
        './cifar-10-batches-py/data_batch_4',
        './cifar-10-batches-py/data_batch_3'], dtype='<U34'),
 './cifar-10-batches-py/test_batch',
 './cifar-10-batches-py/batches.meta')

In [5]:
# unpickle the data and return the dictionary containing the data and label
def unpickle(file):
    import pickle
    with open(file, 'rb') as f:
        dict = pickle.load(f, encoding='bytes')
    return dict

In [6]:
train_images = []
labels = []

In [7]:
for file in filenames_train:
    cifar = unpickle(file)
    train_images.extend(cifar[b'data'])
    labels += cifar[b'labels']

In [8]:
X_train = np.array(train_images)
X_train = X_train.reshape(-1, 32, 32, 3)
X_train = X_train / 255.0
X_train.shape

(50000, 32, 32, 3)

In [9]:
y_train = keras.utils.to_categorical(labels, num_classes)

In [10]:
X_train[94], y_train[94]

(array([[[0.17254902, 0.19215686, 0.20784314],
         [0.21176471, 0.20392157, 0.20392157],
         [0.20392157, 0.20392157, 0.2       ],
         ...,
         [0.99607843, 0.99607843, 0.99607843],
         [0.99607843, 0.99607843, 0.99607843],
         [0.99607843, 0.99607843, 0.99607843]],
 
        [[0.21960784, 0.23137255, 0.23921569],
         [0.24705882, 0.25098039, 0.24705882],
         [0.24705882, 0.23137255, 0.25490196],
         ...,
         [0.91764706, 0.99215686, 1.        ],
         [1.        , 0.99607843, 0.99215686],
         [0.99215686, 1.        , 1.        ]],
 
        [[0.04705882, 0.05490196, 0.05882353],
         [0.07058824, 0.09019608, 0.09803922],
         [0.12156863, 0.16078431, 0.2       ],
         ...,
         [0.09019608, 0.08627451, 0.06666667],
         [0.07058824, 0.11764706, 0.25098039],
         [0.47058824, 0.69019608, 0.84313725]],
 
        ...,
 
        [[0.08235294, 0.05490196, 0.03137255],
         [0.03529412, 0.10980392, 0.14117

In [11]:
test_images = []
test_labels = []

In [12]:
cifar = unpickle(filename_test)
test_images = cifar[b'data']
test_labels = cifar[b'labels']

In [13]:
X_test = np.array(test_images, dtype=float)
X_test = X_test.reshape(-1, 32, 32, 3)
X_test = X_test / 255.0
X_test.shape

(10000, 32, 32, 3)

In [14]:
y_test = keras.utils.to_categorical(test_labels, num_classes)

In [15]:
X_test[94], y_test[94]

(array([[[0.24705882, 0.22745098, 0.27843137],
         [0.3372549 , 0.3372549 , 0.34117647],
         [0.34509804, 0.34117647, 0.36470588],
         ...,
         [0.34509804, 0.27058824, 0.30196078],
         [0.28627451, 0.29019608, 0.30980392],
         [0.31372549, 0.3254902 , 0.31372549]],
 
        [[0.30588235, 0.28627451, 0.34117647],
         [0.35294118, 0.43137255, 0.43529412],
         [0.36470588, 0.34901961, 0.36862745],
         ...,
         [0.50588235, 0.38823529, 0.38823529],
         [0.31372549, 0.32156863, 0.35686275],
         [0.30980392, 0.30196078, 0.3254902 ]],
 
        [[0.35294118, 0.3254902 , 0.30588235],
         [0.38039216, 0.40784314, 0.41960784],
         [0.40784314, 0.43137255, 0.37254902],
         ...,
         [0.49411765, 0.37254902, 0.32156863],
         [0.30196078, 0.25098039, 0.30980392],
         [0.28627451, 0.32156863, 0.36470588]],
 
        ...,
 
        [[0.17254902, 0.16078431, 0.20392157],
         [0.2       , 0.21568627, 0.20392

## Convolution Neural Network Model
We'll implement a Deep CNN Model with regularization techniques

In [16]:
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Input
from tensorflow.keras.models import Model

In [17]:
# define the model
x = Input(shape=(32, 32, 3, ))
conv1 = Conv2D(64, (3,3), padding='same', activation='relu')(x)
conv2 = Conv2D(64, (3,3), padding='same', activation='relu')(conv1)
conv2_maxpool = MaxPooling2D(pool_size=(2,2))(conv2)
conv2_maxpool_dropout = Dropout(0.25)(conv2_maxpool)

conv3 = Conv2D(128, (3,3), padding='same', activation='relu')(conv2_maxpool_dropout)
conv4 = Conv2D(128, (3,3), padding='same', activation='relu')(conv3)
conv4_maxpool = MaxPooling2D(pool_size=(2,2))(conv4)
conv4_maxpool_dropout = Dropout(0.25)(conv4_maxpool)

conv5 = Conv2D(256, (3,3), padding='same', activation='relu')(conv4_maxpool_dropout)
conv6 = Conv2D(256, (3,3), padding='same', activation='relu')(conv5)
conv6_maxpool = MaxPooling2D(pool_size=(2,2))(conv6)
conv6_maxpool_dropout = Dropout(0.25)(conv6_maxpool)

flatten = Flatten()(conv6_maxpool_dropout)
fc1 = Dense(4096, activation='relu')(flatten)
fc1_dropout = Dropout(0.5)(fc1)
fc2 = Dense(1024, activation='relu')(fc1_dropout)
fc3 = Dense(num_classes, activation='softmax')(fc2)

model = Model(inputs=x, outputs=fc3)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
conv2d (Conv2D)              (None, 32, 32, 64)        1792      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 64)        36928     
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 64)        0         
_________________________________________________________________
dropout (Dropout)            (None, 16, 16, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 16, 128)       73856     
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 16, 16, 128)       147584    
__________

## Compile Model
Compile the model with
1. loss function as categorical crossentropy
2. optimizer as Adam
3. metric as accuracy

In [18]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

### Setup Callbacks

In [19]:
n_epochs = 50
batch_size = 512

In [20]:
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard

filepath="/media/amol/WINUX/CIFAR-10-{epoch:02d}-{val_loss:.4f}.h5"
tensorboard = TensorBoard(log_dir='/tmp/cifar-10-final', batch_size=batch_size, write_graph=True, histogram_freq=0)
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, mode='min')
callbacks_list = [checkpoint, tensorboard]

Finally, it's time to train our model

In [21]:
model.fit(X_train, y_train, epochs=n_epochs, verbose=1, batch_size=batch_size, callbacks=callbacks_list, shuffle=True, validation_data=(X_test, y_test))

Train on 50000 samples, validate on 10000 samples
Epoch 1/50

Epoch 00001: saving model to /media/amol/WINUX/CIFAR-10-01-1.7260.h5
Epoch 2/50

Epoch 00002: saving model to /media/amol/WINUX/CIFAR-10-02-1.4844.h5
Epoch 3/50

Epoch 00003: saving model to /media/amol/WINUX/CIFAR-10-03-1.4085.h5
Epoch 4/50

Epoch 00004: saving model to /media/amol/WINUX/CIFAR-10-04-1.2891.h5
Epoch 5/50

Epoch 00005: saving model to /media/amol/WINUX/CIFAR-10-05-1.1728.h5
Epoch 6/50

Epoch 00006: saving model to /media/amol/WINUX/CIFAR-10-06-1.0984.h5
Epoch 7/50

Epoch 00007: saving model to /media/amol/WINUX/CIFAR-10-07-1.0955.h5
Epoch 8/50

Epoch 00008: saving model to /media/amol/WINUX/CIFAR-10-08-1.0438.h5
Epoch 9/50

Epoch 00009: saving model to /media/amol/WINUX/CIFAR-10-09-1.0564.h5
Epoch 10/50

Epoch 00010: saving model to /media/amol/WINUX/CIFAR-10-10-0.9580.h5
Epoch 11/50

Epoch 00011: saving model to /media/amol/WINUX/CIFAR-10-11-1.0024.h5
Epoch 12/50

Epoch 00012: saving model to /media/amol/WIN

Epoch 40/50

Epoch 00040: saving model to /media/amol/WINUX/CIFAR-10-40-0.9323.h5
Epoch 41/50

Epoch 00041: saving model to /media/amol/WINUX/CIFAR-10-41-0.9532.h5
Epoch 42/50

Epoch 00042: saving model to /media/amol/WINUX/CIFAR-10-42-0.9758.h5
Epoch 43/50

Epoch 00043: saving model to /media/amol/WINUX/CIFAR-10-43-0.9919.h5
Epoch 44/50

Epoch 00044: saving model to /media/amol/WINUX/CIFAR-10-44-0.9706.h5
Epoch 45/50

Epoch 00045: saving model to /media/amol/WINUX/CIFAR-10-45-0.9597.h5
Epoch 46/50

Epoch 00046: saving model to /media/amol/WINUX/CIFAR-10-46-0.9561.h5
Epoch 47/50

Epoch 00047: saving model to /media/amol/WINUX/CIFAR-10-47-0.9814.h5
Epoch 48/50

Epoch 00048: saving model to /media/amol/WINUX/CIFAR-10-48-0.9892.h5
Epoch 49/50

Epoch 00049: saving model to /media/amol/WINUX/CIFAR-10-49-1.0137.h5
Epoch 50/50

Epoch 00050: saving model to /media/amol/WINUX/CIFAR-10-50-1.0150.h5


<tensorflow.python.keras.callbacks.History at 0x7f848534e780>

Accuracy = 73%