# Checkpointing and Transfer Learning

### Welcome to the 7th Lab of 42028: Deep Learning and CNN!

In this Lab/Tutorial session you will learn how to checkpoint a model and restart training from an existing checkpoint. In the second session you will implement transfer learning/fine-tuning.

The design of the AlexNet architecture will also be discussed.

So let's get started!

## Tutorial:
1. Checkpointing or saving trained model
2. Transfer learning
3. Classic CNNs

## Tasks for this week:

1. Implement a CNN for Dogs and Cats classification using the Keras API.
2. Save a snapshot of the trained model using a checkpoint.
3. Load the weights of the trained model and start training again.
4. Use out-of-the-box models for classification.
5. Transfer learning/fine-tuning from an already trained model.
6. Implement AlexNet.


## Task-1: Implementing CNN for Dogs and Cats classification using Keras API

### Step 1: Import required packages

We will need TensorFlow, NumPy, os and Keras.


In [0]:
import os
import tensorflow as tf
import zipfile
import matplotlib.pyplot as plt
from tensorflow.keras import layers
from tensorflow.keras import Model
from sklearn.preprocessing import LabelBinarizer
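# Import from tensorflow.keras throughout, so that the standalone keras
# package is not mixed with tf.keras in the same model
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.datasets import cifar10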
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator


### Step 2: Download the Cats & Dogs dataset

In [0]:
!wget --no-check-certificate \
    https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip \
    -O /tmp/cats_and_dogs_filtered.zip

In [0]:
local_zip = '/tmp/cats_and_dogs_filtered.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp')
zip_ref.close()

base_dir = '/tmp/cats_and_dogs_filtered'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

# Directory with our training cat pictures
train_cats_dir = os.path.join(train_dir, 'cats')

# Directory with our training dog pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')

# Directory with our validation cat pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')

# Directory with our validation dog pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')

### Step 3:  Design the CNN Architecture 

Design the following CNN architecture:


<img src='http://drive.google.com/uc?export=view&id=1EAWFwp7T92q3Lm1ZrX9A2-wnvhfAfzSF' alt='Conv'>

Input: $150 \times 150 \times 3$ image

No. of filters:
- Conv1 : 32, 3x3
- Conv2 : 64, 3x3
- Conv3 : 128, 3x3
- Conv4 : 128, 3x3

Activation function in CONV layers: ReLU

Pool: MaxPooling, 2x2

FC Layer: 512 nodes, Activation: ReLU

Activation function in Output layer: sigmoid

**Hint:** Use Conv2D(), MaxPooling2D(), Flatten(), and Dense()
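
Before writing the layers, it can help to work out the feature-map sizes and parameter counts that the specification above implies, so that `model.summary()` can be checked against them. The sketch below is plain Python arithmetic, not Keras code. It assumes that every convolution uses 'valid' padding with stride 1 and is followed by one 2x2 max-pooling layer; the diagram remains the authority on that. The helper name `trace_cnn` is hypothetical.

```python
def trace_cnn(input_size, in_channels, conv_filters, kernel=3, pool=2, fc_units=512):
    """Trace feature-map sizes and parameter counts for a conv/pool stack.

    Assumes 'valid' padding, stride-1 convolutions, and one max-pool after
    each convolution (an assumption; check it against the diagram).
    """
    size, channels, total = input_size, in_channels, 0
    rows = []
    for filters in conv_filters:
        size = size - kernel + 1                              # 'valid' convolution
        params = (kernel * kernel * channels + 1) * filters   # weights + biases
        total += params
        rows.append((f"conv {filters}", size, params))
        size = size // pool                                   # pooling has no weights
        rows.append(("pool", size, 0))
        channels = filters
    flat = size * size * channels          # length of the vector after Flatten()
    fc_params = (flat + 1) * fc_units      # fully connected layer with 512 nodes
    out_params = fc_units + 1              # single sigmoid output node
    total += fc_params + out_params
    return rows, flat, total


rows, flat, total = trace_cnn(150, 3, [32, 64, 128, 128])
for name, size, params in rows:
    print(f"{name:10s} -> {size:3d} x {size:<3d} params={params}")
print("flattened length:", flat)
print("total parameters:", total)
```

If the sizes reported by `model.summary()` differ from these, the padding or the placement of the pooling layers in your model probably differs from the assumption made here.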

In [0]:
model = tf.keras.models.Sequential([
    
    ## Start Your Code Here ###
    
    
])
## End Your Code Here ###

### Step 4. Model Compilation

In [0]:
## Start Your Code Here ##
## Hint: Loss is 'binary_crossentropy',
## optimizer: RMSprop(lr=1e-4),
## metrics: 'acc'
model.compile()  ## Complete the missing parameters
## End Your Code Here ###

### Step 5. Using an image generator to load images

The image generator loads the images, generates the labels automatically from the directory names, and resizes the images.

In [0]:
# All images will be rescaled by 1./255
## Start Your Code Here ##

train_datagen = 
test_datagen = 

# Flow training images in batches of 20 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
        train_dir,  # This is the source directory for training images
        target_size=(),  # Complete the parameters: all images will be resized to 150x150
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

# Flow validation images in batches of 20 using test_datagen generator
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(),  # Complete the parameters
        batch_size=20,
        class_mode='binary')

## End Your Code Here ##
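
`flow_from_directory` infers the labels from the sub-directory names: by default the class names are sorted alphanumerically and numbered from 0, so `cats` maps to 0 and `dogs` to 1 for this dataset. The sketch below reproduces only that naming rule in plain Python on a throwaway directory tree. The helper `infer_class_indices` is hypothetical and is not part of Keras; on the real generator the mapping can be read from `train_generator.class_indices`.

```python
import os
import tempfile


def infer_class_indices(directory):
    """Map each sub-directory name to an integer label, in sorted order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Build a tiny stand-in for the train directory with one folder per class
with tempfile.TemporaryDirectory() as demo_train_dir:
    for class_name in ("dogs", "cats"):
        os.makedirs(os.path.join(demo_train_dir, class_name))
    class_indices = infer_class_indices(demo_train_dir)

print(class_indices)
```

With `class_mode='binary'`, a sigmoid output close to 1 therefore corresponds to the class with index 1.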

## Task-2: Saving the snapshot of model as checkpoint

### Step 6: Checkpointing

In [0]:
#checkpoint = ModelCheckpoint('weights.{epoch:02d}-{val_loss:.2f}.hdf5', monitor='val_loss', save_best_only=True, verbose=1, period=3)
filepath='/tmp/weights.{epoch:02d}-{val_loss:.2f}.hdf5'
checkpoint=tf.keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)
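
The placeholders in `filepath` are filled in with Python's `str.format`, using the epoch number (counted from 1) and the metrics logged for that epoch. The sketch below shows which file names the template produces; the loss values are made up for illustration only.

```python
filepath = '/tmp/weights.{epoch:02d}-{val_loss:.2f}.hdf5'

# Hypothetical validation losses for three epochs (illustrative values only)
example_val_losses = [0.6931, 0.6127, 0.5584]

saved_names = [
    filepath.format(epoch=epoch, val_loss=val_loss)
    for epoch, val_loss in enumerate(example_val_losses, start=1)
]
for name in saved_names:
    print(name)
```

Because the epoch number and `val_loss` are part of the name, each saved epoch gets its own file instead of overwriting the previous one.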


### Step 7. Training the model

In [0]:
history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,  # 2000 images = batch_size * steps
      epochs=10,
      validation_data=validation_generator,
      validation_steps=50,  # 1000 images = batch_size * steps
      callbacks = [checkpoint],
      verbose=2)
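
The step counts follow from the dataset size and the batch size: one pass should cover every image once. A minimal check of that arithmetic, using the image counts quoted in the comments above and the batch size of 20 set in the generators:

```python
import math

batch_size = 20
num_train_images = 2000       # count quoted in the training comment
num_validation_images = 1000  # count quoted in the validation comment

# Round up so that a final, smaller batch is still counted as a step
steps_per_epoch = math.ceil(num_train_images / batch_size)
validation_steps = math.ceil(num_validation_images / batch_size)

print('steps_per_epoch :', steps_per_epoch)
print('validation_steps:', validation_steps)
```

Covering all 1000 validation images at this batch size takes 50 steps; a smaller value evaluates on only part of the validation set.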

## Task 3: Loading the weights of the trained model and starting training again

### Step 8. Retraining from saved model

In [0]:
## Go to the /tmp folder and copy the name of the last saved model

## Start Your Code Here ##
model_modify=tf.keras.models.load_model() # Complete the code

## End Your Code Here ##
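
Instead of copying the file name by hand, the most recent checkpoint can also be located programmatically. The sketch below is one possible way to do that with the standard library; the helper `latest_checkpoint` and the demo file names are hypothetical, and the demo runs on a temporary directory rather than on `/tmp`.

```python
import glob
import os
import tempfile


def latest_checkpoint(directory, pattern='weights.*.hdf5'):
    """Return the most recently modified file matching pattern, or None."""
    candidates = glob.glob(os.path.join(directory, pattern))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


# Demo on empty stand-in files with explicit modification times
with tempfile.TemporaryDirectory() as demo_dir:
    demo_names = ['weights.01-0.69.hdf5', 'weights.02-0.61.hdf5']
    for epoch, name in enumerate(demo_names, start=1):
        path = os.path.join(demo_dir, name)
        open(path, 'w').close()
        os.utime(path, (epoch, epoch))  # force distinct modification times
    newest = os.path.basename(latest_checkpoint(demo_dir))

print(newest)
```

In the lab, the path returned by `latest_checkpoint('/tmp')` could be passed to `tf.keras.models.load_model`.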

### Step 9. Compile the modified model 

In [0]:
## Start Your Code Here ##
## Hint: Loss is 'binary_crossentropy',
## optimizer: RMSprop(lr=1e-4),
## metrics: 'acc'
model_modify.compile()  ## Complete the missing parameters
## End Your Code Here ###

model_modify.summary()


In [0]:
# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

## Start Your Code Here ###

# Flow training images in batches of 20 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
        train_dir,  # This is the source directory for training images
        target_size=(), ##Complete the code # All images will be resized to 150x150
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

# Flow validation images in batches of 20 using test_datagen generator
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(), ## Complete the code
        batch_size=20,
        class_mode='binary')

## End Your Code Here ###

In [0]:
## Adding checkpoints after every 2-epochs 
filepath='/tmp/weights_modified.{epoch:02d}-{val_loss:.2f}.hdf5'
checkpoint=tf.keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=2)


In [0]:
# Train the model
history = model_modify.fit_generator(
      train_generator,
      steps_per_epoch=100,  # 2000 images = batch_size * steps
      epochs=10,
      validation_data=validation_generator,
      validation_steps=50,  # 1000 images = batch_size * steps
      callbacks = [checkpoint],
      verbose=2)

## Task 4: Using out-of-the-box models for classification

## Task 5: Transfer Learning

[How to use pretrained networks for out of the box classification](https://keras.io/applications/)

### Step 1. Mount the google drive.

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

In [0]:
cd /content/gdrive/My Drive/42028-DL-CNN-2020/Week7-Lab7/images

### Step 2. Using ResNet50 pretrained model for classification 

In [0]:
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

# Load the ResNet50 model with pretrained weights
model = ResNet50(weights='imagenet')

img_path = '/content/gdrive/My Drive/42028-DL-CNN/Week7-Lab7/images/cat.jpg'
img = image.load_img(img_path, target_size=(224, 224))
plt.imshow(img)
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Predict the ImageNet class probabilities for the image
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])

### Step 3: Transfer learning

We will use the VGG16 CNN architecture as the base model and adapt/re-train the FC layers for the Dogs and Cats classification task.

The VGG16 CNN architecture is given below:

![alt text](https://neurohive.io/wp-content/uploads/2018/11/vgg16.png)

In [0]:
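from tensorflow.keras.applications import VGG16

# Load VGG16 pretrained on ImageNet, without its fully connected (top) layers
conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))

from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers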

# Load the CONV layers of VGG16 model and add the FC layers

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))



In [0]:
conv_base.summary()
for layer in conv_base.layers[:-4]:
    layer.trainable = False
 
# Check the trainable status of the individual layers
for layer in conv_base.layers:
    print(layer, layer.trainable)
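
The slice `conv_base.layers[:-4]` selects every layer except the last four, so only the final convolutional block stays trainable. The sketch below shows the effect on a plain list of names. The names follow the usual VGG16 naming in Keras but are written out by hand here as an assumption, so compare them with the output of `conv_base.summary()`.

```python
# Layer names of VGG16 without the top, written out by hand (assumed;
# the name of the input layer in particular can differ between versions).
vgg16_layer_names = ['input']
for block, n_convs in enumerate([2, 2, 3, 3, 3], start=1):
    vgg16_layer_names += [f'block{block}_conv{i}' for i in range(1, n_convs + 1)]
    vgg16_layer_names.append(f'block{block}_pool')

frozen = vgg16_layer_names[:-4]     # same slice as conv_base.layers[:-4]
trainable = vgg16_layer_names[-4:]  # what is left to fine-tune

print(len(vgg16_layer_names), 'layers,', len(frozen), 'frozen')
print('still trainable:', trainable)
```

The pooling layer in that last group has no weights, so in practice three convolutional layers are fine-tuned together with the new FC layers.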

In [0]:
model.summary()

### Step 4:  Training CNN with ImageDataGenerator

In [0]:
# Updated to do image augmentation
train_datagen = ImageDataGenerator(
      rescale=1./255,
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')

test_datagen = ImageDataGenerator(rescale=1./255)

## Start Your Code Here ###

# Flow training images in batches of 20 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
        train_dir,  # This is the source directory for training images
        target_size=(),  #Complete the parameters # All images will be resized to 150x150
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

# Flow validation images in batches of 20 using test_datagen generator
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(), #Complete the parameters
        batch_size=20,
        class_mode='binary')

## End Your Code Here ###

In [0]:
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

In [0]:
# Train the model
history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,  # 2000 images = batch_size * steps
      epochs=10,
      validation_data=validation_generator,
      validation_steps=50,  # 1000 images = batch_size * steps
      verbose=2)

### Step 5:  Visualization of results 

In [0]:
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()  # show the labels of the two accuracy curves

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

## Task 6: AlexNet implementation

The AlexNet CNN architecture is shown in the diagram below:

![alt text](https://www.oreilly.com/library/view/tensorflow-for-deep/9781491980446/assets/tfdl_0106.png)

### Create the AlexNet architecture

In [0]:
model = tf.keras.models.Sequential([
    #Conv_1          #original model was built for input shape of 224X224
    tf.keras.layers.Conv2D(96, (11,11),strides=4, padding='valid', activation='relu', input_shape=(224, 224, 3)),
    # Pooling_1
    tf.keras.layers.MaxPooling2D((2, 2), strides=(2,2),padding='valid'),
    # Batch Normalisation_1
    tf.keras.layers.BatchNormalization(),
    # Conv_2
    tf.keras.layers.Conv2D(256, (11,11),strides=1, padding='valid', activation='relu'),
    # Pooling_2
    tf.keras.layers.MaxPooling2D((2, 2), strides=(2,2),padding='valid'),
    #Batch Normalisation_2
    tf.keras.layers.BatchNormalization(),
    # Conv_3
    tf.keras.layers.Conv2D(384, (3,3),strides=1, padding='valid', activation='relu'),
    # Batch Normalisation_3
    tf.keras.layers.BatchNormalization(),
    # Conv_4
    tf.keras.layers.Conv2D(384, (3,3),strides=1, padding='valid', activation='relu'),
    # Batch Normalisation_4
    tf.keras.layers.BatchNormalization(),
    # Conv_5
    tf.keras.layers.Conv2D(256, (3,3),strides=1, padding='valid', activation='relu'),
    # Pooling_3
    tf.keras.layers.MaxPooling2D((2, 2), strides=(2,2),padding='valid'),
    # Batch Normalisation_5
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Flatten(),
    #Dense layer_1
    tf.keras.layers.Dense(4096, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.BatchNormalization(),
    #Dense layer_2
    tf.keras.layers.Dense(4096, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.BatchNormalization(),
    #Dense layer_3
    tf.keras.layers.Dense(1000, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1, activation='sigmoid')
    ])
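
To see why this network expects inputs of about 224x224, it helps to trace the spatial size through the layers. The sketch below applies the standard output-size formula for 'valid' padding to the kernel sizes and strides used in the model above. It is plain Python arithmetic, and the helper name `out_size` is hypothetical.

```python
def out_size(size, kernel, stride):
    """Output width/height for 'valid' padding: floor((size - kernel) / stride) + 1."""
    return (size - kernel) // stride + 1


# (name, kernel, stride) for every layer above that changes the spatial size
alexnet_layers = [
    ('conv_1', 11, 4), ('pool_1', 2, 2),
    ('conv_2', 11, 1), ('pool_2', 2, 2),
    ('conv_3', 3, 1), ('conv_4', 3, 1),
    ('conv_5', 3, 1), ('pool_3', 2, 2),
]

size = 224
sizes = []
for name, kernel, stride in alexnet_layers:
    size = out_size(size, kernel, stride)
    sizes.append(size)
    print(f'{name}: {size} x {size}')

flattened = size * size * 256  # conv_5 has 256 filters
print('Flatten() length:', flattened)
```

With 'valid' padding throughout, the feature map shrinks to 1 x 1 before the dense layers. A noticeably smaller input, such as 150 x 150, already drops to 1 x 1 after conv_3, leaving nothing for conv_4 to slide over.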


### Create the dataset by resizing the images to 224x224 for training AlexNet

In [0]:
# Updated to do image augmentation
train_datagen = ImageDataGenerator(
      rescale=1./255,
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')

test_datagen = ImageDataGenerator(rescale=1./255)

## Start Your Code Here ###

# Flow training images in batches of 20 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
        train_dir,  # This is the source directory for training images
        target_size=(),  # Complete the code, All images will be resized to 224x224
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

# Flow validation images in batches of 20 using test_datagen generator
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(), # Complete the code, All images will be resized to 224x224
        batch_size=20,
        class_mode='binary')

## End Your Code Here ###

In [0]:
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(lr=1e-4),
              metrics=['acc'])

In [0]:
# Train the model
history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,  # 2000 images = batch_size * steps
      epochs=20,
      validation_data=validation_generator,
      validation_steps=50,  # 1000 images = batch_size * steps
      #callbacks = [checkpoint],
      verbose=2)

### Visualization of results

This is for illustration only. The actual accuracy may vary with the number of epochs.

In [0]:
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()  # show the labels of the two accuracy curves

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()