<a href="https://colab.research.google.com/github/gshah8/UCF/blob/master/Machine_Learning/HW2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# HW 2

The goal of this homework is to create a convolutional neural network for the CIFAR10 data set. 
See [this colab notebook](https://colab.research.google.com/drive/1LZZviWOzvchcXRdZi2IBx3KOpQOzLalf) for how to load the CIFAR data in Keras.

You should not use any pretrained convnets that come with Keras. You have to create and train your own convnets with Keras from scratch.

## Simple hold-out validation

Make sure that the data is divided into: 

- training set (80%)
- validation set (20%)
- test set. 

Use the training set to train your neural networks. Evaluate their performance on the validation data set. 

After trying several different architectures, choose the one that performs
best on the validation set. Try at least four different architectures, for example by adding data augmentation or dropout, or by varying the number of layers or the number of filters.

Train this final architecture on the data from the training set and validation set and evaluate its performance on 
the test set.
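
The 80/20 split described above can be sketched with NumPy index shuffling. This is only a minimal sketch: the helper name `holdout_split`, its `val_fraction` parameter, and the fixed seed are assumptions made for illustration and are not part of the assignment.

```python
import numpy as np

def holdout_split(n_samples, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) for a shuffled hold-out split (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_val = int(round(n_samples * val_fraction))
    # The first n_val shuffled indices form the validation set, the rest the training set.
    return idx[n_val:], idx[:n_val]

# The CIFAR10 training split has 50,000 images, so 20% is 10,000 for validation.
train_idx, val_idx = holdout_split(50000)
print(len(train_idx), len(val_idx))
```

The returned index arrays can be used to slice both the image array and the label array, so that images and labels stay aligned.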

## k-fold validation

Reevaluate your best architecture using k-fold validation with k=5, that is, the size of the validation fold is 20%. Does the accuracy/loss obtained by k-fold validation differ from the accuracy/loss obtained by simple hold-out validation?
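
One way to produce the five folds is sketched below with NumPy only. The generator name `kfold_indices` and the fixed seed are assumptions for illustration; a library splitter could be swapped in instead.

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs, one per fold (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val_idx = folds[i]
        # Every fold except the i-th one is used for training.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

fold_sizes = [(len(tr), len(va)) for tr, va in kfold_indices(50000, k=5)]
print(fold_sizes)
```

In each fold, a freshly built model would be trained on `train_idx` and evaluated on `val_idx`; averaging the five validation scores gives the k-fold estimate to compare with the hold-out result.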

### Loading the CIFAR10 data set



In [1]:
from keras.datasets import cifar10
import numpy as np
import matplotlib.pyplot as plt
from keras.utils import to_categorical
from keras import models
from keras import layers


(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

Using TensorFlow backend.


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


### Exploring the format of the CIFAR10 data set

In [2]:
train_images.shape

(50000, 32, 32, 3)

In [3]:
train_images.ndim

4

In [4]:
train_labels.shape

(50000, 1)

In [5]:
train_labels.ndim

2

### Using simple hold-out validation for the models below

In [6]:
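# Divide the training data into training and validation sets.
# Note: this holds out 2,000 of the 50,000 images (4%), not the 20% the
# assignment asks for; rand_idx[:10000] / rand_idx[10000:] would give 80/20.
rand_idx = np.random.permutation(len(train_images))
val_idx = rand_idx[0:2000]
train_idx = rand_idx[2000:]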

train_images_new, train_labels_new = train_images[train_idx] , train_labels[train_idx]
val_images, val_labels = train_images[val_idx] , train_labels[val_idx]

#normalize training and validation data
train_images_norm = train_images_new/255.0
val_images_norm = val_images/255.0

train_labels_new[0]

array([0], dtype=uint8)

### Preprocess test data

In [7]:
# Normalize the test images the same way as the training and validation images
test_images_norm = test_images/255.0

#one-hot encoding
train_labels_norm = to_categorical(train_labels_new)
val_labels_norm = to_categorical(val_labels)
test_labels_norm = to_categorical(test_labels)

train_labels_norm[0]

array([1., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)
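
To make the one-hot step concrete, here is a NumPy-only sketch of the encoding that `to_categorical` produces for these labels. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """One-hot encode integer labels of shape (n,) or (n, 1) (hypothetical helper)."""
    flat = np.asarray(labels).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[flat]

# Labels shaped (n, 1) with dtype uint8, like train_labels in this notebook
encoded = one_hot(np.array([[0], [3]], dtype=np.uint8))
print(encoded[0])
```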

### Build a basic model (without data augmentation and dropout)

In [69]:

# set up the layers

model = models.Sequential()
#conv layers   
#1
model.add(layers.Conv2D(32, (3, 3), activation='relu', padding = 'same', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
#2
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
#3
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
#dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_98 (Conv2D)           (None, 32, 32, 32)        896       
_________________________________________________________________
max_pooling2d_84 (MaxPooling (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_99 (Conv2D)           (None, 16, 16, 64)        18496     
_________________________________________________________________
max_pooling2d_85 (MaxPooling (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_100 (Conv2D)          (None, 8, 8, 64)          36928     
_________________________________________________________________
max_pooling2d_86 (MaxPooling (None, 4, 4, 64)          0         
_________________________________________________________________
flatten_22 (Flatten)         (None, 1024)              0         
__________
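
The parameter counts and output shapes in the summary can be checked by hand. The sketch below assumes standard convolutions with one bias per filter; the helper name `conv2d_params` is introduced only for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

# (in_channels, filters) for the three 3x3 conv layers of the basic model
layer_specs = [(3, 32), (32, 64), (64, 64)]
param_counts = [conv2d_params(3, 3, c_in, f) for c_in, f in layer_specs]
print(param_counts)  # [896, 18496, 36928]

# 'same' padding keeps height and width; each 2x2 max-pool halves them.
side = 32
for _ in range(3):
    side //= 2
flatten_size = side * side * 64
print(flatten_size)  # 1024
```

These values match the `Param #` column and the `(None, 1024)` flatten shape listed in the summary.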

### Compile the Model

In [0]:
model.compile(optimizer='adam',
             loss='categorical_crossentropy',
             metrics=['accuracy'])

### Train the Model

In [71]:
#epochs = 20
#history = model.fit(train_images_norm, 
#                      train_labels_norm, 
#                      epochs=epochs,  
#                      validation_data=(val_images_norm, val_labels))

epochs = 20
history = model.fit(train_images_norm, 
                      train_labels_norm, 
                      epochs=epochs, 
                      batch_size=64, 
                      validation_data=(val_images_norm, val_labels_norm))

Train on 48000 samples, validate on 2000 samples
Epoch 1/20
...
Epoch 20/20


The model is clearly overfitting: the training accuracy keeps increasing with the number of epochs, while the validation accuracy does not.
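
A simple way to quantify this from the training curves is sketched below. The accuracy values are hypothetical and are NOT results from this notebook; in practice the two lists would come from `history.history` (under the keys `acc`/`val_acc` or `accuracy`/`val_accuracy`, depending on the Keras version). The helper name `overfitting_report` is also an assumption.

```python
import numpy as np

def overfitting_report(train_acc, val_acc):
    """Return the 1-based epoch with the best validation accuracy and the final train/val gap."""
    train_acc = np.asarray(train_acc, dtype=float)
    val_acc = np.asarray(val_acc, dtype=float)
    best_epoch = int(np.argmax(val_acc)) + 1
    final_gap = float(train_acc[-1] - val_acc[-1])
    return best_epoch, final_gap

# Hypothetical curves for illustration only (not measured in this notebook)
train_acc = [0.50, 0.65, 0.75, 0.85, 0.93]
val_acc = [0.48, 0.60, 0.68, 0.69, 0.68]
best_epoch, final_gap = overfitting_report(train_acc, val_acc)
print(best_epoch, round(final_gap, 2))
```

A large final gap together with a best epoch well before the last one is the pattern described above: training accuracy keeps improving after validation accuracy has stalled.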

Now, we will try to improve the model with the following four changes:
1. Increasing the number of convolutional layers
2. Adding more filters
3. Data augmentation
4. Dropout

We will add these changes one by one and observe the training and validation accuracy.
As a final check, we will evaluate the model on the test data.
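
To illustrate what data augmentation and dropout do to the data, here is a NumPy-only sketch. It is not the Keras API: the function names `random_horizontal_flip` and `inverted_dropout` are hypothetical, and a horizontal flip is just one of many possible augmentations.

```python
import numpy as np

def random_horizontal_flip(images, rng, p=0.5):
    """Flip each image in an (n, h, w, c) batch left-right with probability p."""
    flip = rng.random(len(images)) < p
    out = images.copy()
    out[flip] = out[flip][:, :, ::-1, :]  # reverse the width axis
    return out

def inverted_dropout(activations, rate, rng):
    """Zero roughly a fraction `rate` of units and rescale the rest (training-time behaviour)."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
batch = rng.random((8, 32, 32, 3))  # stand-in for a batch of normalized CIFAR10 images
augmented = random_horizontal_flip(batch, rng)
dropped = inverted_dropout(np.ones((4, 64)), rate=0.5, rng=rng)
print(augmented.shape, np.unique(dropped))
```

Augmentation shows the network slightly different versions of the same images, and dropout randomly silences units during training. Both make it harder for the model to memorize the training set.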

#### 1. Increasing the number of convolutional layers

In [63]:

# set up the layers

model = models.Sequential()
#conv layers   
#1
model.add(layers.Conv2D(32, (3, 3), activation='relu', padding = 'same', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
#2
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
#3
model.add(layers.Conv2D(128, (3, 3), activation='relu',padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
#4
model.add(layers.Conv2D(128, (3, 3), activation='relu', padding = 'same'))
model.add(layers.MaxPooling2D((2, 2)))
#dense layers
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_91 (Conv2D)           (None, 32, 32, 32)        896       
_________________________________________________________________
max_pooling2d_77 (MaxPooling (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_92 (Conv2D)           (None, 16, 16, 64)        18496     
_________________________________________________________________
max_pooling2d_78 (MaxPooling (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_93 (Conv2D)           (None, 8, 8, 128)         73856     
_________________________________________________________________
max_pooling2d_79 (MaxPooling (None, 4, 4, 128)         0         
_________________________________________________________________
conv2d_94 (Conv2D)           (None, 4, 4, 128)         147584    
__________

In [0]:
model.compile(optimizer='adam',
             loss='categorical_crossentropy',
             metrics=['accuracy'])

In [65]:
epochs = 40
history = model.fit(train_images_norm, 
                      train_labels_norm, 
                      epochs=epochs, 
                      batch_size=64, 
                      validation_data=(val_images_norm, val_labels_norm))

Train on 48000 samples, validate on 2000 samples
Epoch 1/40
...
Epoch 40/40
