# CSE5ML Lab 8: Convolutional Neural Network


### Developing a simple CNN

When we build a convolutional neural network (CNN) model, we need convolutional layers, max pooling layers and dense layers. To reduce overfitting and improve performance on unseen data, we also include dropout. Below is a simple CNN model for the CIFAR-10 dataset.

Dropout is a regularization method proposed by Srivastava et al. in 2014. It is a simple yet effective way to prevent neural networks from overfitting. Dropout randomly selects a given percentage of neurons and ignores them during training. This means that their contribution to the activation of downstream neurons is temporarily removed on the forward pass, and no weight updates are applied to those neurons on the backward pass.
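
To make the mechanism concrete, here is a minimal NumPy sketch of the forward pass of dropout. It assumes the common "inverted dropout" variant, in which the surviving activations are scaled by 1 / (1 - rate) during training so that their expected sum stays the same. The function name `dropout_forward` and its parameters are illustrative and are not part of Keras.

```python
import numpy as np

def dropout_forward(activations, rate, rng, training=True):
    """Zero out roughly `rate` of the activations and rescale the rest."""
    if not training or rate == 0.0:
        # At test time dropout is switched off and the input passes through.
        return activations
    keep_prob = 1.0 - rate
    # Boolean mask: True for neurons that are kept in this training step.
    mask = rng.random(activations.shape) < keep_prob
    # Scale the kept activations so the expected total is unchanged.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((4, 5), dtype="float32")  # made-up layer output
dropped = dropout_forward(activations, rate=0.2, rng=rng)
print(dropped)
```

Each entry of `dropped` is either 0 (the neuron was ignored) or 1.25 (the neuron was kept and rescaled). A new mask is drawn on every call, so different neurons are ignored in different training steps.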

### Load dataset and Preprocess data

In [1]:
# packages
import numpy as np
import tensorflow as tf
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.constraints import maxnorm
from keras.optimizers import SGD
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

# load data
(Inputs, Labels), (Test_Data, Test_Label) = cifar10.load_data() # cifar10 is provided by keras.datasets (see the imports above)

# normalize inputs from 0-255 to 0.0-1.0
# Neural networks process inputs using small weight values, and inputs with large integer values can disrupt or slow down the learning process. As such it is good practice to normalize the pixel values so that each pixel value has a value between 0 and 1.
Inputs = Inputs.astype('float32')
Test_Data = Test_Data.astype('float32')
Inputs = Inputs / 255.0
Test_Data = Test_Data / 255.0

# Encode the outputs with one-hot encoding
Labels = np_utils.to_categorical(Labels) # converts a class vector (integers) to a binary class matrix
Test_Label = np_utils.to_categorical(Test_Label)
num_classes = Test_Label.shape[1]

Using TensorFlow backend.
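
As a rough illustration of what the two preprocessing steps do, the sketch below applies them to a tiny made-up array instead of CIFAR-10. The helper `one_hot` is a hypothetical stand-in written with NumPy only. It is meant to mirror the behaviour of `to_categorical` for integer labels, not to reproduce its implementation.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Turn integer class labels into a binary class matrix."""
    labels = np.asarray(labels, dtype="int64").ravel()
    if num_classes is None:
        # Infer the number of classes from the largest label.
        num_classes = int(labels.max()) + 1
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Put a 1 in the column that matches each label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Scale made-up pixel values from 0-255 to 0.0-1.0.
pixels = np.array([[0, 128, 255]], dtype="uint8")
scaled = pixels.astype("float32") / 255.0

# CIFAR-10 labels arrive as a column of integers with shape (n, 1).
labels = np.array([[2], [0], [1]])
encoded = one_hot(labels)

print(scaled)
print(encoded)
```

Each row of `encoded` contains a single 1 in the column of the true class, which is the format expected by the `categorical_crossentropy` loss used later in this lab.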


### Build a convolutional neural network model

More information about the parameter settings of Conv2D can be found here: https://keras.io/api/layers/convolution_layers/convolution2d/

In [2]:
# Build the model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), padding='same', activation='relu', kernel_constraint=maxnorm(3)))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', kernel_constraint=maxnorm(3)))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(512, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))




### Compile the model
Define the loss function, optimizer and additional evaluation metrics.

#### Some additional information about optimizers

To understand the concept of optimizers, one usually begins with the most basic and popular one, gradient descent (used in the example below). The key to understanding gradient descent (and optimizers in general) is the gradient, which indicates what a small change in a given parameter (here, a weight) would do to the loss function. Gradients are a measure of change: they are the connection between the loss function and the weights. In simple terms, they tell us what adjustment should be made to each weight (examples: add 2.1, subtract 0.07, etc.) in order to reduce the loss (which usually improves the accuracy).
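
The idea can be sketched on a toy problem with a single weight. The loss function `(w - 3) ** 2`, the starting value and the learning rate below are made up for illustration and have nothing to do with the CNN in this lab. The update rule itself, `w = w - learning_rate * gradient`, is plain gradient descent.

```python
def loss(w):
    # Toy loss with its minimum at w = 3.
    return (w - 3.0) ** 2

def gradient(w):
    # Derivative of the toy loss with respect to w.
    return 2.0 * (w - 3.0)

w = 0.0              # initial weight
learning_rate = 0.1  # step size
history = [loss(w)]

for step in range(50):
    # Move the weight a small step against the gradient.
    w = w - learning_rate * gradient(w)
    history.append(loss(w))

print("final weight:", w)
print("final loss:", history[-1])
```

When the gradient is negative the weight is increased, and when it is positive the weight is decreased, so the loss shrinks at every step and `w` approaches 3.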

In [3]:
# Define optimizer
lrate = 0.002
epochs = 5
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.7, decay=decay, nesterov=False) #Stochastic gradient descent optimizer

# Compile model
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
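
The `decay` argument is easy to misread. In the Keras 2.x versions of `SGD` that accept `decay`, the learning rate is reduced after every batch update (not after every epoch) according to `lr = lrate / (1 + decay * iterations)`. The sketch below evaluates that formula in plain Python. The 50,000 training images and the batch size of 60 are taken from the training cell later in this lab, and the helper name `decayed_lr` is illustrative.

```python
import math

def decayed_lr(initial_lr, decay, iterations):
    """Time-based decay: the rate shrinks as the update count grows."""
    return initial_lr / (1.0 + decay * iterations)

lrate = 0.002
epochs = 5
decay = lrate / epochs

# Number of batch updates in one pass over the training set.
steps_per_epoch = math.ceil(50000 / 60)

for epoch in range(epochs + 1):
    iterations = epoch * steps_per_epoch
    print(epoch, decayed_lr(lrate, decay, iterations))
```

Because `iterations` counts batches, the learning rate drops noticeably within a few epochs even though `decay` itself is a small number.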

### Summarize the model
Printing the model summary helps us understand the model structure, the output shape of each layer and the number of parameters in the model.

In [4]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 32, 32, 32)        896       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 32)        9248      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 32)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 8192)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               4194816   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)               
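
The parameter counts in the summary can be checked by hand. A convolutional layer has (kernel height × kernel width × input channels + 1) × filters parameters, and a dense layer has (inputs + 1) × units parameters, where the +1 accounts for the bias. The helper names below are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One weight per kernel cell and input channel, plus one bias per filter.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    # One weight per input, plus one bias, for every unit.
    return (inputs + 1) * units

# 2x2 max pooling halves 32x32 to 16x16; flattening gives one long vector.
flattened = 16 * 16 * 32

print(conv2d_params(3, 3, 3, 32))    # conv2d_1: 896
print(conv2d_params(3, 3, 32, 32))   # conv2d_2: 9248
print(flattened)                     # flatten_1: 8192
print(dense_params(flattened, 512))  # dense_1: 4194816
print(dense_params(512, 10))         # dense_2: 5130
```

The value for dense_2 is computed from the formula rather than read from the summary. Note that the first dense layer holds almost all of the parameters in this model, while the pooling, flatten and dropout layers have none.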

### Train the model

In [7]:
tf.set_random_seed(1)
np.random.seed(1)

epochs = 5
# Fit the model
model.fit(Inputs, Labels, validation_data=(Test_Data, Test_Label), epochs=epochs, batch_size=60, verbose=1)

Train on 50000 samples, validate on 10000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.callbacks.History at 0x1de0ac59408>

### Evaluate the trained model with the testing dataset

In [6]:
# Final evaluation of the model
scores = model.evaluate(Test_Data, Test_Label, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

Accuracy: 48.97%
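
The accuracy reported by `model.evaluate` is the fraction of test images whose highest-probability class matches the one-hot label. The sketch below shows that computation with NumPy on a few made-up probability rows. The arrays are invented for illustration and are not outputs of the model above.

```python
import numpy as np

# Made-up softmax outputs for four samples and three classes.
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
    [0.4, 0.5, 0.1],
])
# Made-up one-hot labels for the same four samples.
one_hot_labels = np.array([
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
])

predicted = probs.argmax(axis=1)        # most probable class per sample
actual = one_hot_labels.argmax(axis=1)  # position of the 1 in each label
accuracy = float((predicted == actual).mean())

print("Accuracy: %.2f%%" % (accuracy * 100))  # Accuracy: 75.00%
```

Three of the four predictions match their labels, which gives 75%. With ten balanced classes, as in CIFAR-10, random guessing would score about 10%.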


### Deeper CNN network and optimization
We can add more layers to obtain a more complex model. Below is an example of a deeper CNN model for the CIFAR-10 dataset.


In [4]:
# Packages
import numpy as np
import tensorflow as tf
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.constraints import maxnorm
from keras.optimizers import SGD
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

tf.set_random_seed(1)
np.random.seed(1)

# load data
(Inputs, Labels), (Test_Data, Test_Label) = cifar10.load_data()
# normalize inputs (so all pixel values are transformed from [0, 255] to [0.0, 1.0])
Inputs = Inputs.astype('float32')
Test_Data = Test_Data.astype('float32')
Inputs = Inputs / 255.0
Test_Data = Test_Data / 255.0
# Encode outputs
Labels = np_utils.to_categorical(Labels)
Test_Label = np_utils.to_categorical(Test_Label)
num_classes = Test_Label.shape[1]

# Build a deeper CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), activation='relu', padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(1024, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dense(512, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
# Compile model
epochs = 20
lrate = 0.001
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
model.summary()
# Fit the model
model.fit(Inputs, Labels, validation_data=(Test_Data, Test_Label), epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(Test_Data, Test_Label, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_3 (Conv2D)            (None, 32, 32, 32)        896       
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 32, 32, 32)        9248      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 16, 16, 64)        18496     
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 16, 16, 64)        36928     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 8, 8, 128)        