## Classifying CIFAR-10 with Data Augmentation

In this exercise, we revisit CIFAR-10 and the networks we previously built.  We will use real-time data augmentation to try to improve our results.

When you are done going through the notebook, experiment with different data augmentation parameters and see if they help (or hurt!) the performance of your classifier.

In [3]:
from __future__ import print_function
import tensorflow.keras as keras
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D

import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# The data, shuffled and split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples


In [4]:
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test  = keras.utils.to_categorical(y_test, num_classes)
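
To make the effect of `to_categorical` concrete, here is a minimal pure-Python sketch of one-hot encoding. The helper name `one_hot` is hypothetical and is not part of Keras; it only mirrors the idea, assuming integer labels in the range `[0, num_classes)`.

```python
def one_hot(labels, num_classes):
    """Return one row per label: 1.0 at the label's index, 0.0 elsewhere."""
    return [[1.0 if c == label else 0.0 for c in range(num_classes)]
            for label in labels]

# Two example labels (class 3 and class 0) out of 10 classes.
encoded = one_hot([3, 0], num_classes=10)
print(encoded[0])
```

Each row sums to 1, which is the target format that the `categorical_crossentropy` loss expects.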

In [None]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

In Exercise 6, we built two models: a smaller one (181K parameters) and a larger one (1.25M parameters).  Below we use the smaller model and train it with data augmentation.

In [5]:
# Let's build a CNN using Keras' Sequential capabilities
model_1 = Sequential()

model_1.add(Conv2D(32, (5, 5), strides=(2, 2), padding='same', input_shape=x_train.shape[1:]))
model_1.add(Activation('relu'))

model_1.add(Conv2D(32, (5, 5), strides=(2, 2)))
model_1.add(Activation('relu'))
            
model_1.add(MaxPooling2D(pool_size=(2, 2)))
model_1.add(Dropout(0.25))

model_1.add(Flatten())
model_1.add(Dense(512))
model_1.add(Activation('relu'))
model_1.add(Dropout(0.5))
model_1.add(Dense(num_classes))
model_1.add(Activation('softmax'))




We still have 181K parameters, even though this is a "small" model.
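
The 181K figure can be checked by hand. The sketch below counts weights and biases layer by layer for `model_1`; the helper names `conv2d_params` and `dense_params` are hypothetical, and the spatial sizes in the comment follow from the strides and padding used in the cell above.

```python
def conv2d_params(kernel, in_channels, filters):
    # kernel * kernel * in_channels weights per filter, plus one bias per filter
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_units, out_units):
    # one weight per input/output pair, plus one bias per output unit
    return in_units * out_units + out_units

# Spatial size: 32 -> 16 (5x5, stride 2, 'same') -> 6 (5x5, stride 2, 'valid') -> 3 (2x2 pool)
flat_units = 3 * 3 * 32

layer_params = {
    "conv_1": conv2d_params(5, 3, 32),
    "conv_2": conv2d_params(5, 32, 32),
    "dense_1": dense_params(flat_units, 512),
    "dense_2": dense_params(512, 10),
}
total_params = sum(layer_params.values())
print(layer_params, total_params)
```

Most of the parameters sit in the first dense layer, which is why the size of the flattened feature map matters so much for the overall model size.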

In [7]:
batch_size = 32

opt = keras.optimizers.RMSprop(learning_rate=.0005, decay=1e-6)  # `lr` is a deprecated alias of `learning_rate`

model_1.compile(loss='categorical_crossentropy',
               optimizer=opt,
               metrics=['accuracy'])

Here we define the `ImageDataGenerator` that we will use to serve images to our model during the training process.  Currently, it is configured to do some shifting and horizontal flipping.
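
To build some intuition for what shifting and horizontal flipping do to a single image, here is a rough NumPy sketch. It is not how `ImageDataGenerator` is implemented: the function name `random_shift_flip` is hypothetical, shifts are restricted to whole pixels, and edge padding is used as an approximation of the generator's default `fill_mode='nearest'`.

```python
import numpy as np

def random_shift_flip(image, max_shift=0.1, rng=None):
    """Shift an HxWxC image by up to max_shift of its size and flip it half the time."""
    rng = np.random.default_rng() if rng is None else rng
    height, width, _ = image.shape
    max_dy, max_dx = int(height * max_shift), int(width * max_shift)
    dy = int(rng.integers(-max_dy, max_dy + 1))  # vertical shift in pixels
    dx = int(rng.integers(-max_dx, max_dx + 1))  # horizontal shift in pixels

    # Pad with edge pixels, then crop a window offset by (dy, dx).
    padded = np.pad(image, ((max_dy, max_dy), (max_dx, max_dx), (0, 0)), mode="edge")
    top, left = max_dy - dy, max_dx - dx
    shifted = padded[top:top + height, left:left + width]

    # Mirror left-to-right with probability 0.5.
    if rng.random() < 0.5:
        shifted = shifted[:, ::-1, :]
    return shifted

# A random stand-in for one normalized 32x32 RGB image.
image = np.random.default_rng(0).random((32, 32, 3)).astype("float32")
augmented = random_shift_flip(image, rng=np.random.default_rng(1))
print(augmented.shape)
```

The output has the same shape as the input, so augmented images can be fed to the network without any change to the model.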

In [8]:
datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    zca_whitening=False,
    rotation_range=0,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    vertical_flip=False
)

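# fit() computes the dataset statistics used by featurewise centering/normalization and ZCA whitening.
# All three are disabled above, so this call is optional here.
datagen.fit(x_train)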

model_1.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                      steps_per_epoch=x_train.shape[0] // batch_size,
                      epochs=15,
                      validation_data=(x_test, y_test)
)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x22b18c38ba8>

How does the performance compare with the non-augmented training?
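
One detail of the training call above is worth a quick check: `steps_per_epoch` is computed with integer division, so it rounds down. The short sketch below works through the arithmetic for the values used in this notebook; the variable names are only illustrative.

```python
import math

num_samples = 50000   # training images in CIFAR-10
batch_size = 32

steps_floor = num_samples // batch_size            # value used in the notebook
steps_ceil = math.ceil(num_samples / batch_size)   # batches needed to see every image once
leftover = num_samples - steps_floor * batch_size  # images not covered by steps_floor batches

print(steps_floor, steps_ceil, leftover)
```

With floor division, an epoch as counted by Keras is slightly shorter than one full pass over the data. Since the generator keeps cycling, the leftover images should simply show up in later batches rather than being dropped.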

## Exercise
### Your Turn

1. Experiment above with different settings of the data augmentation parameters.  Can you make the model do better?  Can you make it do worse?

2. As in Exercise 6, build a more complicated model with the following pattern:
    - Conv -> Conv -> MaxPool -> Conv -> Conv -> MaxPool -> (Flatten) -> Dense -> Final Classification
    - Use strides of 1 for all convolutional layers.

3. Use data augmentation to train this model.  Can you get better performance?
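
Before building the larger model, it can help to trace how the spatial size changes through the Conv -> Conv -> MaxPool pattern with strides of 1. The sketch below does this for a 32x32 input. The helper `conv_out` is hypothetical, and the 3x3 kernels, the alternating `'same'`/`'valid'` padding, and the 64 filters in the last block are assumptions that happen to match the solution cell further down; the exercise itself does not prescribe them.

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Output width/height of a convolution or pooling layer along one dimension."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # 'valid': no padding

size = 32
trace = []
# (kernel, stride, padding) for Conv, Conv, MaxPool, Conv, Conv, MaxPool
for kernel, stride, padding in [(3, 1, "same"), (3, 1, "valid"), (2, 2, "valid"),
                                (3, 1, "same"), (3, 1, "valid"), (2, 2, "valid")]:
    size = conv_out(size, kernel, stride, padding)
    trace.append(size)

flat_units = size * size * 64   # assumes 64 filters in the last conv layer
print(trace, flat_units)
```

Because the convolutions no longer downsample, the flattened feature map is much larger than in the small model, and the first dense layer grows accordingly.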

In [9]:
# Let's build a CNN using Keras' Sequential capabilities
model_2 = Sequential()

model_2.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model_2.add(Activation('relu'))

model_2.add(Conv2D(32, (3, 3)))
model_2.add(Activation('relu'))
            
model_2.add(MaxPooling2D(pool_size=(2, 2)))
model_2.add(Dropout(0.25))

model_2.add(Conv2D(64, (3, 3), padding='same'))
model_2.add(Activation('relu'))

model_2.add(Conv2D(64, (3, 3)))
model_2.add(Activation('relu'))

model_2.add(MaxPooling2D(pool_size=(2, 2)))
model_2.add(Dropout(0.25))

model_2.add(Flatten())
model_2.add(Dense(512))
model_2.add(Activation('relu'))
model_2.add(Dropout(0.5))
model_2.add(Dense(num_classes))
model_2.add(Activation('softmax'))

In [12]:
## Check number of parameters
model_2.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 32, 32, 32)        896       
_________________________________________________________________
activation_4 (Activation)    (None, 32, 32, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 30, 30, 32)        9248      
_________________________________________________________________
activation_5 (Activation)    (None, 30, 30, 32)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 15, 15, 32)        0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 15, 15, 32)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 15, 15, 64)       

In [13]:
# initiate RMSprop optimizer
batch_size = 32

opt_2 = keras.optimizers.RMSprop(learning_rate=.0005)  # `lr` is a deprecated alias of `learning_rate`

model_2.compile(loss='categorical_crossentropy',
               optimizer=opt_2,
               metrics=['accuracy'])

In [14]:
datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    zca_whitening=False,
    rotation_range=0,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    vertical_flip=False
)

# Optional here: fit() only matters when featurewise statistics or ZCA whitening are enabled.
datagen.fit(x_train)

model_2.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                      steps_per_epoch=x_train.shape[0] // batch_size,
                      epochs=15,
                      validation_data=(x_test, y_test)
)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x22b41859940>