## Build a CNN for Image Recognition
### Name: Joe Arul Susai Prakash (U34351756)
### For class ISM6930 (Tech Foundation of AI) at the University of South Florida

### Summary of Results


- VGG16 architecture (__15 layers, 75 training epochs__) achieved __90.05%__ accuracy on the test data with a loss of 0.317.
- ResNet architecture (__48 layers, 35 training epochs__) achieved __82%__ accuracy on the training data before I ran out of compute units and lost access to the runtime and the model.

## Data preparation

In [None]:
import numpy as np
import tensorflow as tf

### Load data

In [None]:
from keras.datasets import cifar10


(x_train, y_train), (x_test, y_test) = cifar10.load_data()

print('shape of x_train: ' + str(x_train.shape))
print('shape of y_train: ' + str(y_train.shape))
print('shape of x_test: ' + str(x_test.shape))
print('shape of y_test: ' + str(y_test.shape))
print('number of classes: ' + str(np.max(y_train) - np.min(y_train) + 1))

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
shape of x_train: (50000, 32, 32, 3)
shape of y_train: (50000, 1)
shape of x_test: (10000, 32, 32, 3)
shape of y_test: (10000, 1)
number of classes: 10


### One-hot encode the labels

In [None]:
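def one_hot_encode(labels):
  # labels: integer class ids of shape (n, 1); returns an (n, num_labels) matrix of 0s and 1s
  num_labels = len(np.unique(labels))
  one_hot_labels = np.zeros((len(labels), num_labels))
  for i, label in enumerate(labels):
    one_hot_labels[i, label] = 1  # mark the column of the true class
  return one_hot_labels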

In [None]:
y_train_vec = one_hot_encode(y_train)
y_test_vec = one_hot_encode(y_test)

print('Shape of y_train_vec: ' + str(y_train_vec.shape))
print('Shape of y_test_vec: ' + str(y_test_vec.shape))

print(y_train[0])
print(y_train_vec[0])

Shape of y_train_vec: (50000, 10)
Shape of y_test_vec: (10000, 10)
[6]
[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
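
The same encoding can be produced without a Python-level loop. This is a minimal sketch, assuming the labels are integers in `0..num_classes-1` with shape `(n, 1)` as in CIFAR-10; the helper name `one_hot_encode_vectorized` is hypothetical and not part of the original notebook.

```python
import numpy as np

def one_hot_encode_vectorized(labels, num_classes=10):
    # Hypothetical helper: pick rows of the identity matrix by class id.
    flat = np.asarray(labels).reshape(-1)   # (n, 1) -> (n,)
    return np.eye(num_classes)[flat]        # shape (n, num_classes)

demo = one_hot_encode_vectorized(np.array([[6], [0], [9]]))
print(demo.shape)
print(demo[0])
```

Passing `num_classes` explicitly avoids relying on every class being present in the array that is being encoded.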


### Randomly partition the training set into training and validation sets

In [None]:
rand_indices = np.random.permutation(50000)
train_indices = rand_indices[0:40000]
valid_indices = rand_indices[40000:50000]

x_val = x_train[valid_indices, :]
y_val = y_train_vec[valid_indices, :]

x_tr = x_train[train_indices, :]
y_tr = y_train_vec[train_indices, :]

print('Shape of x_tr: ' + str(x_tr.shape))
print('Shape of y_tr: ' + str(y_tr.shape))
print('Shape of x_val: ' + str(x_val.shape))
print('Shape of y_val: ' + str(y_val.shape))

Shape of x_tr: (40000, 32, 32, 3)
Shape of y_tr: (40000, 10)
Shape of x_val: (10000, 32, 32, 3)
Shape of y_val: (10000, 10)
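
Because the permutation is unseeded, each run produces a different split, which makes validation numbers harder to compare across sessions. A minimal sketch of a reproducible split follows; the seed value and the variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)      # the seed value is an arbitrary assumption
n_total, n_train = 50000, 40000
perm = rng.permutation(n_total)
train_idx, valid_idx = perm[:n_train], perm[n_train:]

# The two index sets are disjoint and together cover every sample exactly once.
overlap = np.intersect1d(train_idx, valid_idx)
print(len(train_idx), len(valid_idx), len(overlap))
```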


## Build a CNN and tune its hyper-parameters

- Build a convolutional neural network model
- Use the validation data to tune the hyper-parameters
- Try to achieve a validation accuracy as high as possible

#### Model 1: Baseline model / LeNet Architecture (__10 epochs__)

- 2 Convolution and Pooling layers
- 1 Fully connected layer
- RMSProp optimizer with learning rate = 0.00001
- Batch size of train and val = 32

__Validation accuracy at last epoch : 48.7%__

__Validation Loss at last epoch : 1.61__

In [None]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential



model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))


model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 32, 32, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 16, 16, 32)        0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 8, 8, 64)          0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 4096)              0         
                                                                 
 dense (Dense)               (None, 128)               5

In [None]:
from keras import optimizers

learning_rate = 1E-5 # to be tuned!

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=learning_rate),
              metrics=['accuracy'])
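
To put the reported losses in context, it may help to see what categorical cross-entropy gives for a model that guesses uniformly over the 10 classes. The sketch below is a plain NumPy version of the loss, not the Keras implementation; the function name and the clipping constant are assumptions.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over samples of -sum_k y_true[k] * log(y_pred[k]).
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.eye(10)[[6]]             # one sample whose true class is 6
uniform = np.full((1, 10), 0.1)      # a model that assigns 0.1 to every class
chance_loss = categorical_crossentropy(y_true, uniform)
print(round(chance_loss, 3))
```

The chance-level loss is ln(10), roughly 2.303, so a validation loss such as 1.61 is better than guessing but still far from a confident classifier.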

In [None]:
history = model.fit(x_tr, y_tr, batch_size=32, epochs=10, validation_data=(x_val, y_val))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


#### Model 2: Baseline model with Data Augmentation (__10 epochs__)

- 2 Convolution and Pooling layers
- 1 Fully connected layer
- __Data Augmentation on training data__
- RMSProp optimizer with learning rate = 0.0001
- Batch size of train and val = 32

__Validation accuracy at last epoch : 54%__

__Validation Loss at last epoch : 1.29__

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)
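
Two of the transformations configured above are easy to reproduce by hand. The following NumPy sketch illustrates rescaling and a horizontal flip on a random stand-in image; it is not how `ImageDataGenerator` is implemented, and the array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # stand-in for one CIFAR-10 image

rescaled = image.astype(np.float32) / 255.0   # effect of rescale=1./255 on the pixel values
flipped = rescaled[:, ::-1, :]                # horizontal flip: reverse the width axis

print(rescaled.min() >= 0.0, rescaled.max() <= 1.0, flipped.shape)
```

The validation generator only rescales, so validation metrics are measured on unmodified images while training sees randomly perturbed ones.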

In [None]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential

model2 = Sequential()
model2.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)))
model2.add(MaxPooling2D((2, 2)))
model2.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model2.add(MaxPooling2D((2, 2)))
model2.add(Flatten())
model2.add(Dense(128, activation='relu'))
model2.add(Dense(10, activation='softmax'))


model2.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_4 (Conv2D)           (None, 32, 32, 32)        896       
                                                                 
 max_pooling2d_4 (MaxPoolin  (None, 16, 16, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_5 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 max_pooling2d_5 (MaxPoolin  (None, 8, 8, 64)          0         
 g2D)                                                            
                                                                 
 flatten_2 (Flatten)         (None, 4096)              0         
                                                                 
 dense_4 (Dense)             (None, 128)              

In [None]:
from keras import optimizers

learning_rate = 1e-4

model2.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
train_datagen.fit(x_tr)
val_datagen.fit(x_val)

In [None]:
history = model2.fit(train_datagen.flow(x_tr, y_tr, batch_size=32), validation_data=val_datagen.flow(x_val, y_val, batch_size=32), epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


#### Model 3: Baseline Model with Data Augmentation and Normalization layers (__10 epochs__)

- 2 Convolution and Pooling layers
- 1 Fully connected layer
- Data Augmentation on training data
- __Batch Normalization between hidden and activation layers__
- RMSProp optimizer with learning rate = 0.00001
- Batch size of train = 32 and val = 8

__Validation accuracy at last epoch : 50.5%__

__Validation Loss at last epoch : 1.417__
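
As a rough illustration of what the batch normalization layers compute during training, here is a NumPy sketch of the forward pass for a channels-last tensor. The function name, the epsilon value, and the random input are assumptions; the Keras layer additionally tracks moving statistics for inference.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    # Normalize each channel over the batch and spatial axes, then scale and shift.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(8, 16, 16, 4))   # (batch, height, width, channels)
out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=(0, 1, 2)).round(3), out.std(axis=(0, 1, 2)).round(3))
```

Each channel carries four values (scale, shift, moving mean, moving variance), which is why the summary reports 128 parameters for the layer that follows the 32-filter convolution.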

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)

In [None]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential
from keras import layers

model_3 = Sequential()
model_3.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model_3.add(layers.BatchNormalization())
model_3.add(layers.Activation('relu'))
model_3.add(MaxPooling2D((2, 2)))
model_3.add(Conv2D(64, (3, 3),padding='same'))
model_3.add(layers.BatchNormalization())
model_3.add(layers.Activation('relu'))
model_3.add(MaxPooling2D((2, 2)))
model_3.add(Flatten())
model_3.add(Dense(128))
model_3.add(layers.BatchNormalization())
model_3.add(layers.Activation('relu'))
model_3.add(Dense(10, activation='softmax'))


model_3.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_6 (Conv2D)           (None, 32, 32, 32)        896       
                                                                 
 batch_normalization (Batch  (None, 32, 32, 32)        128       
 Normalization)                                                  
                                                                 
 activation (Activation)     (None, 32, 32, 32)        0         
                                                                 
 max_pooling2d_6 (MaxPoolin  (None, 16, 16, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_7 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 batch_normalization_1 (Bat  (None, 16, 16, 64)       

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-5

model_3.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
train_datagen.fit(x_tr)
val_datagen.fit(x_val)

In [None]:
history = model_3.fit(train_datagen.flow(x_tr, y_tr, batch_size=32), validation_data=val_datagen.flow(x_val, y_val, batch_size=8), epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


#### Model 4: Baseline Model with Data Augmentation, Normalization layers and Dropout layer (__25 epochs__)
- 2 Convolution and Pooling layers
- 1 Fully connected layer
- Data Augmentation on training data
- Batch Normalization between hidden and activation layers
- __Dropout 50% before the dense layer (25 epochs) to reduce overfitting__
- RMSProp optimizer with learning rate = 0.0001
- Batch size of train and val = 32

__Validation accuracy at last epoch : 56.3%__

__Validation Loss at last epoch : 1.254__
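
The effect of a 50% dropout layer during training can be sketched in a few lines of NumPy. This is an illustrative version of inverted dropout, with a hypothetical function name and input; at inference time the layer passes its input through unchanged.

```python
import numpy as np

def dropout_train(x, rate, rng):
    # Inverted dropout: zero a fraction `rate` of the units and rescale the rest by 1/(1-rate).
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((1000, 128))
dropped = dropout_train(activations, rate=0.5, rng=rng)
print(round(float((dropped == 0).mean()), 2), round(float(dropped.mean()), 2))
```

About half of the units are zeroed, and the surviving ones are doubled so that the expected activation stays the same.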

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)

In [None]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential
from keras import layers

model_4 = Sequential()
model_4.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model_4.add(layers.BatchNormalization())
model_4.add(layers.Activation('relu'))
model_4.add(MaxPooling2D((2, 2)))
model_4.add(Conv2D(64, (3, 3),padding='same'))
model_4.add(layers.BatchNormalization())
model_4.add(layers.Activation('relu'))
model_4.add(MaxPooling2D((2, 2)))
model_4.add(Flatten())
model_4.add(layers.Dropout(0.5))
model_4.add(Dense(128))
model_4.add(layers.BatchNormalization())
model_4.add(layers.Activation('relu'))
model_4.add(Dense(10, activation='softmax'))


model_4.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_8 (Conv2D)           (None, 32, 32, 32)        896       
                                                                 
 batch_normalization_3 (Bat  (None, 32, 32, 32)        128       
 chNormalization)                                                
                                                                 
 activation_3 (Activation)   (None, 32, 32, 32)        0         
                                                                 
 max_pooling2d_8 (MaxPoolin  (None, 16, 16, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_9 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 batch_normalization_4 (Bat  (None, 16, 16, 64)       

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-4

model_4.compile(loss='categorical_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
train_datagen.fit(x_tr)
val_datagen.fit(x_val)

In [None]:
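# The batch size belongs to the generators; fit() should not receive batch_size for generator input.
history = model_4.fit(train_datagen.flow(x_tr, y_tr, batch_size=32), validation_data=val_datagen.flow(x_val, y_val, batch_size=32), epochs=25)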

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


#### Model 5: __AlexNet Architecture (from scratch / 25 epochs)__
Reference: https://www.kaggle.com/code/vortexkol/alexnet-cnn-architecture-on-tensorflow-beginner

- 5 Convolution layers (2 with pooling)
- 2 Fully connected layers
- Data Augmentation on training data
- Batch Normalization between hidden and activation layers
- Dropout 50% before both dense layers (25 epochs) to avoid overfitting
- Adam optimizer with learning rate = 0.001
- Batch size of train = 32 and val = 8

__Validation accuracy at last epoch : 72.1%__

__Validation Loss at last epoch : 0.815__
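
The Model 5 definition uses pool sizes of 2, 3 and 2, so the 32x32 input shrinks quickly. The sketch below traces the spatial size through the three pooling layers, assuming the Keras defaults of `padding='valid'` and a stride equal to the pool size; the helper name is hypothetical.

```python
def pooled_size(size, pool):
    # Output length of max pooling with padding='valid' and stride equal to the pool size.
    return (size - pool) // pool + 1

size = 32
for pool in (2, 3, 2):               # pool sizes used in the Model 5 definition
    size = pooled_size(size, pool)
    print(size)

flatten_units = size * size * 256    # the last convolution has 256 filters
print(flatten_units)
```

The feature map goes from 32 to 16, 5 and finally 2 pixels per side, so the flatten layer feeds 1024 values into the first dense layer.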

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)

In [None]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential
from keras import layers

model_5 = Sequential()
# Convolution 1 with pooling
model_5.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model_5.add(layers.BatchNormalization())
model_5.add(layers.Activation('relu'))
model_5.add(MaxPooling2D((2, 2)))

# Convolution 2 with pooling
model_5.add(Conv2D(64, (3, 3),padding='same'))
model_5.add(layers.BatchNormalization())
model_5.add(layers.Activation('relu'))
model_5.add(MaxPooling2D((3, 3)))

# Convolution 3
model_5.add(Conv2D(128, (3, 3),padding='same'))
model_5.add(layers.BatchNormalization())
model_5.add(layers.Activation('relu'))

#Convolution 4
model_5.add(Conv2D(256, (3, 3),padding='same'))
model_5.add(layers.BatchNormalization())
model_5.add(layers.Activation('relu'))

# Convolution 5
model_5.add(Conv2D(256, (3, 3),padding='same'))
model_5.add(layers.BatchNormalization())
model_5.add(layers.Activation('relu'))
model_5.add(MaxPooling2D((2, 2)))

model_5.add(Flatten())

# Fully connected layer 1
model_5.add(layers.Dropout(0.5))
model_5.add(Dense(1024))
model_5.add(layers.BatchNormalization())
model_5.add(layers.Activation('relu'))

# Fully connected layer 2
model_5.add(layers.Dropout(0.5))
model_5.add(Dense(128))
model_5.add(layers.BatchNormalization())
model_5.add(layers.Activation('relu'))

model_5.add(Dense(10, activation='softmax'))


model_5.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_10 (Conv2D)          (None, 32, 32, 32)        896       
                                                                 
 batch_normalization_6 (Bat  (None, 32, 32, 32)        128       
 chNormalization)                                                
                                                                 
 activation_6 (Activation)   (None, 32, 32, 32)        0         
                                                                 
 max_pooling2d_10 (MaxPooli  (None, 16, 16, 32)        0         
 ng2D)                                                           
                                                                 
 conv2d_11 (Conv2D)          (None, 16, 16, 64)        18496     
                                                                 
 batch_normalization_7 (Bat  (None, 16, 16, 64)       

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-3

model_5.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
train_datagen.fit(x_tr)
val_datagen.fit(x_val)

In [None]:
history = model_5.fit(train_datagen.flow(x_tr, y_tr, batch_size=32), validation_data=val_datagen.flow(x_val, y_val, batch_size=8),epochs=25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


In [None]:
model_5.save('A2_Alexnet.keras')

In [None]:
model_to_use = tf.keras.models.load_model('A2_Alexnet.keras')

#### Model 6: __VGG16 Architecture (from scratch / 25 epochs)__
Reference: https://builtin.com/machine-learning/vgg16

- 13 Convolution layers (5 with pooling)
- 2 Fully connected layers
- Data Augmentation on training data
- Batch Normalization between hidden and activation layers
- Dropout 50% before both dense layers (25 epochs) to avoid overfitting
- Adam optimizer with learning rate = 0.001
- Batch size of train = 32 and val = 8

__Validation accuracy at last epoch : 77%__

__Validation Loss at last epoch : 0.707__
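
The parameter counts of the 13 convolution layers follow directly from the filter widths in the Model 6 definition. The sketch below recomputes them; the helper name `conv_params` is hypothetical, and batch normalization and dense layers are left out.

```python
def conv_params(in_channels, out_channels, kernel=3):
    # Weights (kernel * kernel * in * out) plus one bias per output channel.
    return kernel * kernel * in_channels * out_channels + out_channels

filters = [32, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]
counts = []
in_channels = 3                      # RGB input
for out_channels in filters:
    counts.append(conv_params(in_channels, out_channels))
    in_channels = out_channels

print(counts[:2], sum(counts))
```

The first two values, 896 and 18496, match the `Param #` column that `model_6.summary()` prints for `conv2d` and `conv2d_1`.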

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)

In [None]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential
from keras import layers

model_6 = Sequential()
# Convolution 1 (no pooling; pooling follows convolution 2)
model_6.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Convolution 2 with pooling
model_6.add(Conv2D(64, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))
model_6.add(MaxPooling2D((2, 2))) # pooling after conv 1,2

# Convolution 3
model_6.add(Conv2D(128, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

#Convolution 4
model_6.add(Conv2D(128, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))
model_6.add(MaxPooling2D((2, 2))) # pooling after conv 3,4

# Convolution 5
model_6.add(Conv2D(256, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Convolution 6
model_6.add(Conv2D(256, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Convolution 7
model_6.add(Conv2D(256, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))
model_6.add(MaxPooling2D((2, 2))) # pooling after conv 5,6,7

# Convolution 8
model_6.add(Conv2D(512, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Convolution 9
model_6.add(Conv2D(512, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Convolution 10
model_6.add(Conv2D(512, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))
model_6.add(MaxPooling2D((2, 2))) # pooling after conv 8,9,10

# Convolution 11
model_6.add(Conv2D(512, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Convolution 12
model_6.add(Conv2D(512, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Convolution 13
model_6.add(Conv2D(512, (3, 3),padding='same'))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))
model_6.add(MaxPooling2D((2, 2))) # pooling after conv 11,12,13

model_6.add(Flatten())

# Fully connected layer 1
model_6.add(layers.Dropout(0.5))
model_6.add(Dense(1024))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

# Fully connected layer 2
model_6.add(layers.Dropout(0.5))
model_6.add(Dense(1024))
model_6.add(layers.BatchNormalization())
model_6.add(layers.Activation('relu'))

model_6.add(Dense(10, activation='softmax'))


model_6.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 32, 32, 32)        896       
                                                                 
 batch_normalization (Batch  (None, 32, 32, 32)        128       
 Normalization)                                                  
                                                                 
 activation (Activation)     (None, 32, 32, 32)        0         
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 64)        18496     
                                                                 
 batch_normalization_1 (Bat  (None, 32, 32, 64)        256       
 chNormalization)                                                
                                                                 
 activation_1 (Activation)   (None, 32, 32, 64)        0

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-3

model_6.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
train_datagen.fit(x_tr)
val_datagen.fit(x_val)

In [None]:
history = model_6.fit(train_datagen.flow(x_tr, y_tr, batch_size=32), validation_data=val_datagen.flow(x_val, y_val, batch_size=8),epochs=25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


In [None]:
model_6.save('A2_VGG16_U34351756.keras')

#### __Model 7: ResNet (from scratch / 10 epochs)__
Reference : https://www.kaggle.com/code/mishki/resnet-keras-code-from-scratch-train-on-gpu
- __48 layers__ in total (including 4 convolutions for identity)
  - 3 layers in each identity block
  - 4 layers in each convolution block
  - 14 blocks stacked one by one
  - No pooling after each convolution
  - 1 FC layer with dropout
- Data Augmentation on training data
- Batch Normalization between hidden and activation layers
- Dropout 50% before the dense layer to avoid overfitting
- Adam optimizer with learning rate = 0.001
- Batch size of train = 32 and val = 8

__Validation accuracy at last epoch : 63.4%__

__Validation Loss at last epoch : 1.065__
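
One way to arrive at the figure of 48 layers is sketched below. The block counts come from the model definition in this section (4 convolution blocks and 10 identity blocks); how the stem and the dense layers are counted is an assumption.

```python
conv_blocks, identity_blocks = 4, 10      # 14 blocks in total
layers_per_conv_block = 4                 # 3 main-path convolutions + 1 shortcut convolution
layers_per_identity_block = 3

stem = 1                                  # the initial 3x3 convolution
fully_connected = 1                       # the Dense(1024) layer; the softmax output is not counted

total = (stem
         + conv_blocks * layers_per_conv_block
         + identity_blocks * layers_per_identity_block
         + fully_connected)
print(total)
```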

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)

In [None]:
train_datagen.fit(x_tr)
val_datagen.fit(x_val)

In [None]:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, concatenate, GlobalAveragePooling2D, Flatten, Dense
from tensorflow.keras.models import Model, Sequential

In [None]:
def identity_block(input_tensor, filter1, filter2, filter3):
    # layer 1 of block
    x = Conv2D(filter1, (1, 1), padding="same")(input_tensor)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    # layer 2 of block
    x = Conv2D(filter2, (3, 3), padding="same")(x)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    # layer 3 of block
    x = Conv2D(filter3, (1, 1), padding="same")(x)
    x = BatchNormalization()(x)

    # Merge the shortcut by channel-wise concatenation
    # (note: a standard ResNet block uses element-wise addition instead)
    x = concatenate([x, input_tensor])
    x = ReLU()(x)
    return x

In [None]:
def convolution_block(input_tensor, filter1, filter2, filter3):
    # layer 1 of block
    x = Conv2D(filter1, (1, 1), padding="same")(input_tensor)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    # layer 2 of block
    x = Conv2D(filter2, (3, 3), padding="same")(x)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    # layer 3 of block
    x = Conv2D(filter3, (1, 1), padding="same")(x)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    # 1x1 convolution on the shortcut path, then channel-wise concatenation with the main path
    residual = Conv2D(filter3, (1, 1), padding="same")(input_tensor)
    residual = BatchNormalization()(residual)
    x = concatenate([x, residual])
    x = ReLU()(x)
    return x


In [None]:

# Model initialization
model_7 = Sequential()

# Convolution 1 (no pooling)
model_7.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model_7.add(BatchNormalization())
model_7.add(ReLU())

# Level 1 (3 blocks)
x = convolution_block(model_7.layers[-1].output, 32, 32, 64)
x = identity_block(x, 64, 64, 128)
x = identity_block(x, 64, 64, 128)

# Level 2 (3 blocks)
x = convolution_block(x, 128, 128, 256)
x = identity_block(x, 128, 128, 256)
x = identity_block(x, 256, 256, 512)

# Level 3 (5 blocks)
x = convolution_block(x, 256, 256, 512)
x = identity_block(x, 256, 256, 512)
x = identity_block(x, 512, 512, 1024)
x = identity_block(x, 512, 512, 1024)
x = identity_block(x, 512, 512, 1024)

# Level 4 (3 blocks)
x = convolution_block(x, 512, 512, 1024)
x = identity_block(x, 512, 512, 1024)
x = identity_block(x, 512, 512, 1024)

# Final pooling
x = GlobalAveragePooling2D()(x)
x = Flatten()(x)  # no-op here: GlobalAveragePooling2D already returns shape (batch, channels)

# Fully connected layer 1
x = tf.keras.layers.Dropout(0.5)(x)
x = Dense(1024)(x)
x = BatchNormalization()(x)
x = ReLU()(x)

# Output Layer
output = Dense(10, activation='softmax')(x)

# Define the model
model_7 = Model(inputs=model_7.inputs, outputs=output)
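
Because the blocks merge the shortcut with `concatenate` rather than an element-wise addition, the channel count grows from block to block. The sketch below traces it through the 14 blocks as they are stacked above; the helper names are hypothetical, and only channel counts are modeled (the spatial size stays 32x32 because there is no pooling or striding).

```python
def conv_block_channels(in_channels, filter3):
    # Main path (filter3 channels) concatenated with the projected shortcut (filter3 channels).
    return 2 * filter3

def identity_block_channels(in_channels, filter3):
    # Main path (filter3 channels) concatenated with the unchanged block input.
    return filter3 + in_channels

# (block type, filter3) in the order the blocks are stacked in model_7
stack = [("conv", 64), ("id", 128), ("id", 128),
         ("conv", 256), ("id", 256), ("id", 512),
         ("conv", 512), ("id", 512), ("id", 1024), ("id", 1024), ("id", 1024),
         ("conv", 1024), ("id", 1024), ("id", 1024)]

channels = 32   # output of the initial convolution
trace = []
for kind, filter3 in stack:
    fn = conv_block_channels if kind == "conv" else identity_block_channels
    channels = fn(channels, filter3)
    trace.append(channels)

print(trace)
```

The final block outputs 4096 channels at full 32x32 resolution, which may help explain why each epoch is slow. With element-wise addition, an identity block would instead require `filter3` to equal the number of input channels.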


In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-3

model_7.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
history = model_7.fit(train_datagen.flow(x_tr, y_tr, batch_size=32), validation_data=val_datagen.flow(x_val, y_val, batch_size=8),epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [None]:
model_7.save('A2_ResNet_U34351756.keras')

## Train (again) and evaluate the model

### Train the model on the entire training set

#### __Based on the previous training runs, the VGG16 and ResNet architectures seem to perform better than the other versions. I will train both of these models on the entire training set (changing one hyperparameter, the number of epochs) and evaluate them on the test set.__

#### VGG16 (model 6) Re-training with same architecture (epochs = 50)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# entire training set
train_datagen.fit(x_train)


In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-3

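# Note: compile() creates a new optimizer but does not reset the weights, so unless
# model_6 is rebuilt first, this run continues from the weights it already has.
model_6.compile(loss='categorical_crossentropy',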
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
history = model_6.fit(train_datagen.flow(x_train, y_train_vec, batch_size=32),epochs=50)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [None]:
model_6.save('A2_VGG16_U34351756_Final1.keras')

#### ResNet (model 7) Re-training with same architecture (epochs = 25)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# entire training set
train_datagen.fit(x_train)

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-3

model_7.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
history = model_7.fit(train_datagen.flow(x_train, y_train_vec, batch_size=32),epochs=25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


##### Model training is progressing very slowly. A better learning rate might have improved the training process, as discussed here:
https://pyimagesearch.com/2019/09/23/keras-starting-stopping-and-resuming-training/
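
One common alternative to a fixed learning rate is to lower it on a schedule as training progresses. The sketch below is a hypothetical step-decay schedule, not something used in this notebook or taken from the linked article; the function name and all constants are illustrative assumptions.

```python
def step_decay(epoch, initial_lr=1e-3, drop=0.5, epochs_per_drop=10):
    # Halve the learning rate every `epochs_per_drop` epochs (all values are illustrative).
    return initial_lr * (drop ** (epoch // epochs_per_drop))

for epoch in (0, 9, 10, 25):
    print(epoch, step_decay(epoch))
```

A schedule like this would typically be attached to training through a learning-rate scheduler callback; that wiring is left out here.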



#### ResNet (model 7) Re-training with same architecture but a different learning rate (epochs = 50)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# entire training set
train_datagen.fit(x_train)

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 0.01

model_7.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
history = model_7.fit(train_datagen.flow(x_train, y_train_vec, batch_size=32),epochs=50)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50

##### __Lost connectivity__ :)
I exhausted my monthly compute units for the NVIDIA A100 GPUs. Over the last few epochs the accuracy gain had slowed to around 0.4%, so many more epochs would be needed to reach even 90%.
A Tesla T4 GPU takes 42 minutes per epoch, compared with just 4 minutes per epoch on the A100, so I am stopping training for this model here.
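
The decision to stop can be checked with a quick estimate. The sketch below uses only the figures quoted above (37 of 50 epochs completed, about 4 minutes per epoch on the A100 and 42 minutes on the T4) and assumes the per-epoch time stays constant. The helper name `remaining_hours` is hypothetical.

```python
def remaining_hours(epochs_left, minutes_per_epoch):
    """Wall-clock hours needed for `epochs_left` epochs at a fixed pace."""
    return epochs_left * minutes_per_epoch / 60


epochs_left = 50 - 37  # the run stopped after epoch 37 of 50

for gpu, minutes in (("A100", 4), ("T4", 42)):
    hours = remaining_hours(epochs_left, minutes)
    print(f"{gpu}: {epochs_left} epochs x {minutes} min = {hours:.1f} h")
```

Under these assumptions, finishing the remaining 13 epochs would take roughly 9 hours on the T4, against under an hour on the A100.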

#### VGG16 (model 6) Re-training with same architecture (epochs = 75)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

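# Use the entire training set (x_train, no validation split).
# Note: fit() is only required for featurewise normalization or ZCA whitening,
# neither of which is enabled here.
train_datagen.fit(x_train)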

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-3

model_6.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
history = model_6.fit(train_datagen.flow(x_train, y_train_vec, batch_size=32), epochs=75)

Epoch 1/75
Epoch 2/75
Epoch 3/75
Epoch 4/75
Epoch 5/75
Epoch 6/75
Epoch 7/75
Epoch 8/75
Epoch 9/75
Epoch 10/75
Epoch 11/75
Epoch 12/75
Epoch 13/75
Epoch 14/75
Epoch 15/75
Epoch 16/75
Epoch 17/75
Epoch 18/75
Epoch 19/75
Epoch 20/75
Epoch 21/75
Epoch 22/75
Epoch 23/75
Epoch 24/75
Epoch 25/75
Epoch 26/75
Epoch 27/75
Epoch 28/75
Epoch 29/75
Epoch 30/75
Epoch 31/75
Epoch 32/75
Epoch 33/75
Epoch 34/75
Epoch 35/75
Epoch 36/75
Epoch 37/75
Epoch 38/75
Epoch 39/75
Epoch 40/75
Epoch 41/75
Epoch 42/75
Epoch 43/75
Epoch 44/75
Epoch 45/75
Epoch 46/75
Epoch 47/75
Epoch 48/75
Epoch 49/75
Epoch 50/75
Epoch 51/75
Epoch 52/75
Epoch 53/75
Epoch 54/75
Epoch 55/75
Epoch 56/75
Epoch 57/75
Epoch 58/75
Epoch 59/75
Epoch 60/75
Epoch 61/75
Epoch 62/75
Epoch 63/75
Epoch 64/75
Epoch 65/75
Epoch 66/75
Epoch 67/75
Epoch 68/75
Epoch 69/75
Epoch 70/75
Epoch 71/75
Epoch 72/75
Epoch 73/75
Epoch 74/75
Epoch 75/75


In [None]:
model_6.save('A2_VGG16_U34351756_Final2.keras')

#### VGG16 (model 6) Re-training with same architecture (epochs = 125)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

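# Use the entire training set (x_train, no validation split).
# Note: fit() is only required for featurewise normalization or ZCA whitening,
# neither of which is enabled here.
train_datagen.fit(x_train)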

In [None]:
from keras import optimizers

#learning_rate = 1e-4

learning_rate = 1e-3

model_6.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

In [None]:
history = model_6.fit(train_datagen.flow(x_train, y_train_vec, batch_size=32), epochs=125)

Epoch 1/125
Epoch 2/125
Epoch 3/125
Epoch 4/125
Epoch 5/125
Epoch 6/125
Epoch 7/125
Epoch 8/125
Epoch 9/125
Epoch 10/125
Epoch 11/125
Epoch 12/125
Epoch 13/125
Epoch 14/125
Epoch 15/125
Epoch 16/125
Epoch 17/125
Epoch 18/125
Epoch 19/125
Epoch 20/125
Epoch 21/125
Epoch 22/125
Epoch 23/125
Epoch 24/125
Epoch 25/125
Epoch 26/125
Epoch 27/125
Epoch 28/125
Epoch 29/125
Epoch 30/125
Epoch 31/125
Epoch 32/125
Epoch 33/125
Epoch 34/125
Epoch 35/125
Epoch 36/125
Epoch 37/125
Epoch 38/125
Epoch 39/125
Epoch 40/125
Epoch 41/125
Epoch 42/125
Epoch 43/125
Epoch 44/125
Epoch 45/125
Epoch 46/125
Epoch 47/125
Epoch 48/125
Epoch 49/125
Epoch 50/125
Epoch 51/125
Epoch 52/125
Epoch 53/125
Epoch 54/125
Epoch 55/125
Epoch 56/125
Epoch 57/125
Epoch 58/125
Epoch 59/125
Epoch 60/125
Epoch 61/125
Epoch 62/125
Epoch 63/125
Epoch 64/125
Epoch 65/125
Epoch 66/125
Epoch 67/125
Epoch 68/125
Epoch 69/125
Epoch 70/125
Epoch 71/125
Epoch 72/125
Epoch 73/125
Epoch 74/125
Epoch 75/125
Epoch 76/125
Epoch 77/125
Epoch 78

##### This model (125 epochs) does __not__ produce the highest accuracy on the test set (89%). The 75-epoch version does better (90%).
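
Since longer training did not help here, one option is to monitor a held-out metric and keep the best epoch rather than fixing the epoch count in advance. Keras provides an `EarlyStopping` callback for this purpose. The pure-Python sketch below only illustrates the patience rule behind that idea. The function name `best_epoch_with_patience` and the validation losses are made up for illustration and are not results from this notebook.

```python
def best_epoch_with_patience(val_losses, patience=3):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    Training is considered stopped once `patience` consecutive epochs
    pass without a new lowest validation loss, or when the list ends.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


# Hypothetical validation losses: improvement stalls after epoch 4.
losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.41, 0.43, 0.44]
best, stopped = best_epoch_with_patience(losses, patience=3)
print(f"best epoch = {best}, stopped at epoch {stopped}")
```

Applying this to the runs above would require a validation set passed to `fit`, since those runs train on all of `x_train` without one.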

### Evaluate the model on the test set

#### VGG16 architecture - 75 epochs (model 6 performance on test set)
#### __Accuracy = 90.05%, Loss = 0.317__

In [None]:
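# Only rescale the test images; no augmentation is applied at evaluation time.
test_datagen = ImageDataGenerator(rescale=1./255)

# Not strictly required: fit() only computes statistics for featurewise
# normalization or ZCA whitening, which are not enabled here.
test_datagen.fit(x_test)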

In [None]:
loss_and_acc = model_6.evaluate(test_datagen.flow(x_test, y_test_vec))
print('loss = ' + str(loss_and_acc[0]))
print('accuracy = ' + str(loss_and_acc[1]))

loss = 0.31722286343574524
accuracy = 0.9004999995231628
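
To relate the raw `evaluate` output to the headline figure, the accuracy fraction can be converted to a percentage and to a count of correctly classified images. The snippet below hard-codes the two values printed above and the 10,000-image size of `x_test` reported when the data was loaded. The variable names are illustrative.

```python
loss = 0.31722286343574524
accuracy = 0.9004999995231628
num_test_images = 10000  # x_test has shape (10000, 32, 32, 3)

# Number of test images classified correctly, rounded to the nearest integer.
correct = round(accuracy * num_test_images)

print(f"accuracy = {accuracy:.2%}")
print(f"correct  = {correct} of {num_test_images}")
print(f"loss     = {loss:.3f}")
```

The printed accuracy is a float32 value converted to a Python float, which is why it appears as 0.9004999995... rather than exactly 0.9005.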
