[View in Colaboratory](https://colab.research.google.com/github/mathemakitten/keras-workshop/blob/master/Intro_Keras_PartII_CIFAR10.ipynb)

# Intro to Keras - Classifying Very Small Images

The CIFAR-10 dataset consists of 60,000 tiny (32x32) colour images in 10 classes, with 6,000 images per class. It is a labelled subset of the 80 Million Tiny Images dataset, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton, prominent researchers in the artificial intelligence field.

For more info on the CIFAR-10 dataset, please see Alex Krizhevsky's reference: https://www.cs.toronto.edu/~kriz/cifar.html

We are going to train a deep convolutional neural network to classify each image into one of 10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, or truck. There is also a CIFAR-100 dataset (with, as you may have guessed, 100 categories).

This workshop is based on work created and shared by the Keras team at Google, and used according to terms described in The MIT License (MIT).

Source: https://github.com/keras-team/keras/tree/master/examples

In [None]:
# Ensure that we have the right version of Keras due to dependencies
# (-y skips the interactive confirmation prompt, which would otherwise stall the cell)
!pip uninstall -y keras
!pip install keras==2.1.3

In [1]:
# Check the Keras version
import keras 
keras.__version__

Using TensorFlow backend.


'2.1.3'

In [2]:
# Our model will get to 75% validation accuracy in 25 epochs, and 79% after 50 epochs. (it's still underfitting then)

from __future__ import print_function
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
import os

# Set hyperparameters -- you can change these! 
batch_size = 32
num_classes = 10
epochs = 100
data_augmentation = True
num_predictions = 20
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'keras_cifar10_trained_model.h5'

# Training data preparation

We always want to split our dataset into train and test subsets so that we can tell whether the model is overfitting to random noise within the training data. The test data is never used to fit the model; it is used only for measuring accuracy.

In [3]:
# The data, split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples


# One-hot encoding y-labels

We use Keras's built-in `to_categorical` utility to convert an array of integer class labels (from 0 to num_classes-1) into one-hot vectors: each label becomes a vector of zeros with a single 1 at the index of its class.

In [4]:
# Convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
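
To make the conversion concrete, here is a minimal NumPy sketch of what one-hot encoding does. The helper `one_hot` and the toy labels are illustrative assumptions, not part of Keras; in the workshop itself `keras.utils.to_categorical` does this job.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    # CIFAR-10 labels arrive with shape (N, 1), so flatten them first
    labels = np.asarray(labels, dtype=int).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # For row i, set the column given by labels[i] to 1
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical labels: 3 = cat, 0 = airplane, 9 = truck
toy_labels = np.array([[3], [0], [9]])
toy_one_hot = one_hot(toy_labels, 10)
print(toy_one_hot.shape)
print(toy_one_hot[0])
```

Taking `argmax` along each row recovers the original integer label, which is how predictions are usually turned back into classes.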

# Putting together a model architecture

The beauty of Keras is in its simplicity of syntax. Keras was designed to be human-readable and intuitive, as evidenced below. We are going to build a sequential model with several convolutional layers, ReLU activation functions, 2x2 max-pooling layers, and dropout to reduce overfitting.

In [9]:
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same',
                 input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

# initiate RMSprop optimizer: http://ruder.io/optimizing-gradient-descent/
opt = keras.optimizers.rmsprop(lr=0.0001, decay=1e-6)

# Let's train the model using RMSprop
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

# Scale pixel values from the integer range [0, 255] to floats in [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
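
It can help to trace how the spatial size of an image changes as it passes through the layers above. The following is a rough sketch in plain Python, assuming stride-1 convolutions and the Keras defaults used in the model (`'valid'` padding unless `padding='same'` is given, and non-overlapping 2x2 pooling). The helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel, padding):
    """Spatial output size of a stride-1 convolution."""
    if padding == "same":
        return size            # 'same' pads so the size is unchanged
    return size - kernel + 1   # 'valid' (no padding) trims the border

def pool_out(size, pool):
    """Spatial output size of non-overlapping max pooling."""
    return size // pool

size = 32                          # CIFAR-10 images are 32x32
size = conv_out(size, 3, "same")   # 32
size = conv_out(size, 3, "valid")  # 30
size = pool_out(size, 2)           # 15
size = conv_out(size, 3, "same")   # 15
size = conv_out(size, 3, "valid")  # 13
size = pool_out(size, 2)           # 6

flattened = size * size * 64              # 64 filters in the last conv layer
dense_params = flattened * 512 + 512      # weights plus biases of Dense(512)
print(size, flattened, dense_params)
```

Under these assumptions, most of the model's parameters sit in the first dense layer, which is one reason it gets the heaviest dropout.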

# Data augmentation

We can help our network generalise by randomly shifting or flipping each image in real time during training, so that every epoch sees slightly different versions of the training observations. Rotation is also supported, but it is switched off below (`rotation_range=0`).
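
As a rough illustration of the idea, the NumPy sketch below flips and shifts a single image at random. It is a simplification and not how `ImageDataGenerator` is implemented: the function `random_flip_and_shift` is hypothetical, and `np.roll` wraps pixels around the edge, whereas Keras fills the exposed border instead.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable

def random_flip_and_shift(image, max_shift_fraction=0.1, rng=rng):
    """Return a randomly flipped and shifted copy of an (H, W, C) image."""
    height, width = image.shape[:2]
    if rng.rand() < 0.5:
        image = image[:, ::-1, :]  # mirror left-to-right
    max_dy = int(height * max_shift_fraction)
    max_dx = int(width * max_shift_fraction)
    dy = rng.randint(-max_dy, max_dy + 1)  # vertical shift in pixels
    dx = rng.randint(-max_dx, max_dx + 1)  # horizontal shift in pixels
    # Simplification: np.roll wraps pixels around rather than filling the gap
    return np.roll(image, shift=(dy, dx), axis=(0, 1))

toy_image = rng.rand(32, 32, 3).astype("float32")  # stand-in for one CIFAR-10 image
augmented = random_flip_and_shift(toy_image)
print(augmented.shape)
```

The label of the image is unchanged by these transformations, which is what makes them safe to apply: a shifted or mirrored truck is still a truck.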

In [None]:
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # randomly flip images vertically

    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(x_train, y_train,
                                     batch_size=batch_size),
                        epochs=epochs,
                        validation_data=(x_test, y_test),
                        workers=4)

Using real-time data augmentation.
Epoch 1/100
  74/1563 [>.............................] - ETA: 5:38 - loss: 2.3027 - acc: 0.0992

# Save model to disk 

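After we've fit a model as above, we can save it to disk (architecture, weights, and optimizer state in a single HDF5 file) so that it can be reloaded later to make predictions. We then score the trained model on the test set.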

In [None]:
# Save model and weights
if not os.path.isdir(save_dir):
    os.makedirs(save_dir)
model_path = os.path.join(save_dir, model_name)
model.save(model_path)
print('Saved trained model at %s ' % model_path)

# Score trained model
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
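
Once a saved model has been reloaded (for example with `keras.models.load_model(model_path)`), calling `model.predict` on a batch of images returns one row of softmax probabilities per image. The sketch below shows one way to turn such rows into class names. The probabilities are made up for illustration, and the helper `labels_from_probabilities` is hypothetical rather than part of Keras.

```python
import numpy as np

# CIFAR-10 class names, in label order (0 to 9)
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

def labels_from_probabilities(probabilities):
    """Map each row of class probabilities to the name of its most likely class."""
    return [class_names[i] for i in np.argmax(probabilities, axis=1)]

# Made-up stand-in for the output of model.predict on two images
toy_probabilities = np.array([
    [0.05, 0.02, 0.03, 0.60, 0.05, 0.15, 0.04, 0.03, 0.02, 0.01],
    [0.01, 0.10, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.03, 0.80],
])
predicted = labels_from_probabilities(toy_probabilities)
print(predicted)
```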