## CIFAR Example

The basic architecture is taken from:

https://appliedmachinelearning.blog/2018/03/24/achieving-90-accuracy-in-object-recognition-task-on-cifar-10-dataset-with-keras-convolutional-neural-networks/

The notebook is intended to run with TensorFlow 2.0.0-alpha0 and GPU support. The following cell assumes that you are using Colaboratory, where TensorFlow 2.0.0-alpha0 has to be installed first.

In [1]:
!pip install -q tensorflow-gpu==2.0.0-alpha0
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
print("GPU available", tf.test.is_gpu_available())

import numpy as np
print("Numpy version:", np.version.version)

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
  Could not find a version that satisfies the requirement tensorflow-gpu==2.0.0-alpha0 (from versions: 0.12.0rc1, 0.12.0, 0.12.1, 1.0.0, 1.0.1, 1.1.0rc0, 1.1.0rc1, 1.1.0rc2, 1.1.0)
No matching distribution found for tensorflow-gpu==2.0.0-alpha0
TensorFlow version: 2.0.0-alpha0
GPU available False
Numpy version: 1.15.4


Download and normalize data:

In [3]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

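# Per-channel mean and standard deviation, computed on the training set only
mean = np.mean(x_train, axis=(0, 1, 2))
std = np.std(x_train, axis=(0, 1, 2))
# Standardize both splits with the training statistics
x_train = (x_train - mean) / std
x_test = (x_test - mean) / std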

print("Mean of the three channels on the training data:", x_train.mean(axis=(0,1,2)))
print("Standard deviation of the three channels on the training data:", x_train.std(axis=(0,1,2)))

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
Mean of the three channels on the training data: [1.90680804e-17 9.16847154e-17 1.50768287e-17]
Standard deviation of the three channels on the training data: [1. 1. 1.]
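
The per-channel standardization above can be checked on a small synthetic array. The following is a minimal sketch and not part of the original notebook: random data stands in for CIFAR-10, and the `standardize` helper and the array sizes are assumptions made for illustration.

```python
import numpy as np

def standardize(train, test):
    """Standardize both arrays with the per-channel statistics of `train` only."""
    # Reducing over axes (0, 1, 2) leaves one value per colour channel.
    mean = train.mean(axis=(0, 1, 2))
    std = train.std(axis=(0, 1, 2))
    train_n = ((train - mean) / std).astype("float32")
    test_n = ((test - mean) / std).astype("float32")
    return train_n, test_n, mean, std

# Random uint8 images in the CIFAR-10 layout (N, 32, 32, 3); sizes are arbitrary.
rng = np.random.RandomState(0)
fake_train = rng.randint(0, 256, size=(100, 32, 32, 3)).astype("uint8")
fake_test = rng.randint(0, 256, size=(20, 32, 32, 3)).astype("uint8")

train_n, test_n, mean, std = standardize(fake_train, fake_test)
print("Channel means after standardization:", train_n.mean(axis=(0, 1, 2)))
print("Channel stds after standardization:", train_n.std(axis=(0, 1, 2)))
```

Because the test split is scaled with the training statistics, its channel means end up close to zero but generally not exactly zero.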


Build model, train, and evaluate:

In [None]:
iiv = 0.01  # Standard deviation and offset for initializing bias parameters in hidden layers

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(32, (3,3), activation="elu", padding='same', 
  input_shape=x_train.shape[1:],
  bias_initializer=tf.initializers.TruncatedNormal(mean=iiv, stddev=iiv)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(32, (3,3), activation="elu", padding='same', 
  bias_initializer=tf.initializers.TruncatedNormal(mean=iiv, stddev=iiv)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2,2)))
 
model.add(tf.keras.layers.Conv2D(64, (3,3), activation="elu", padding='same', 
  bias_initializer=tf.initializers.TruncatedNormal(mean=iiv, stddev=iiv)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(64, (3,3), activation="elu", padding='same', 
  bias_initializer=tf.initializers.TruncatedNormal(mean=iiv, stddev=iiv)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2,2)))
 
model.add(tf.keras.layers.Conv2D(128, (3,3), activation="elu", padding='same', 
  bias_initializer=tf.initializers.TruncatedNormal(mean=iiv, stddev=iiv)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(128, (3,3), activation="elu", padding='same', 
  bias_initializer=tf.initializers.TruncatedNormal(mean=iiv, stddev=iiv)))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2,2)))
 
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='softmax'))
 
model.summary()
 
# Data augmentation
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    )
datagen.fit(x_train)
 
# Training
batch_size = 64
   
# The decay parameter controls a schedule that reduces the learning rate over time
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    steps_per_epoch=x_train.shape[0] // batch_size, epochs=125,
                    verbose=1, validation_data=(x_test,y_test))

# Testing
scores = model.evaluate(x_test, y_test, batch_size=128, verbose=1)
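
To give a feel for the `decay` argument passed to the optimizer: Keras optimizers have historically applied an inverse-time schedule, lr_t = lr_0 / (1 + decay * t), where t counts batch updates. The sketch below assumes that formula (the exact behavior may depend on the installed TensorFlow version) and uses the CIFAR-10 training-set size of 50,000 images. The helper name `decayed_lr` is hypothetical.

```python
def decayed_lr(initial_lr, decay, iteration):
    """Inverse-time decay (assumed schedule): lr_0 / (1 + decay * t)."""
    return initial_lr / (1.0 + decay * iteration)

num_train = 50000   # size of the CIFAR-10 training set
batch_size = 64
epochs = 125
steps_per_epoch = num_train // batch_size   # same integer division as in the notebook
total_steps = steps_per_epoch * epochs

# Print the assumed learning rate at a few points during training.
for epoch in (0, 25, 50, 100, 125):
    step = epoch * steps_per_epoch
    print("epoch %3d  step %6d  lr %.6f" % (epoch, step, decayed_lr(0.001, 1e-6, step)))
```

Under this assumption the learning rate drops by roughly 9% over the full 125 epochs, so the schedule is a mild one.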

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 32, 32, 32)        896       
_________________________________________________________________
batch_normalization_v2 (Batc (None, 32, 32, 32)        128       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 32)        9248      
_________________________________________________________________
batch_normalization_v2_1 (Ba (None, 32, 32, 32)        128       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 16, 64)        18496     
_________________________________________________________________
batch_normalization_v2_2 (Ba (None, 16, 16, 64)        2