![title](https://i.ibb.co/f2W87Fg/logo2020.png)

---


<table  class="tfo-notebook-buttons" align="left"><tr><td>
    
<a href="https://colab.research.google.com/github/adf-telkomuniv/CV2020_Exercises/blob/main/CV2020 - 10 - Transfer Learning.ipynb" source="blank" ><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
</td><td>
<a href="https://github.com/adf-telkomuniv/CV2020_Exercises/blob/main/CV2020 - 10 - Transfer Learning.ipynb" source="blank" ><img src="https://i.ibb.co/6NxqGSF/pinpng-com-github-logo-png-small.png"></a>
    
</td></tr></table>



# Task 10 - Transfer Learning


In this assignment you will practice using the pretrained models available in Keras and apply Transfer Learning to fine tune them on a new dataset

The goals of this assignment are as follows:

    * train and fine tune both full vgg and mini vgg on CIFAR-10
      * train from scratch
      * fine tune all layers
      * train classifier head
      * fine tune selected layers


Write down your Name and Student ID

In [None]:
## --- start your code here ----

NIM = ??
Nama = ??

## --- end your code here ----

---
---
# [Part 0] Import Libraries and Load Data

---
## 1 - Import Libraries
Import required libraries

In [None]:
import time
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow.keras import backend as K
from tensorflow.keras import Model
from tensorflow.keras.models import Sequential

from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.applications.vgg16 import VGG16

from tensorflow.keras.utils import to_categorical
from tensorflow.keras.utils import plot_model

%matplotlib inline
np.set_printoptions(precision=7)
%load_ext autoreload
%autoreload 2

---
## 2 - Load CIFAR-10

In [None]:
(X_train_ori, y_train), (X_test_ori, y_test) = tf.keras.datasets.cifar10.load_data()

class_names = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


---
## 3 - Split Validation Data

In [None]:
X_val_ori   = X_train_ori[-10000:,:]
y_val       = y_train[-10000:]

X_train_ori = X_train_ori[:-10000, :]
y_train     = y_train[:-10000]

---
## 4 - Normalize and Reshape Data

In [None]:
X_train    = X_train_ori.astype('float32')
X_val      = X_val_ori.astype('float32')
X_test     = X_test_ori.astype('float32')

mean_image = X_train.mean(axis=(0, 1, 2), keepdims=True)
std_image  = X_train.std(axis=(0, 1, 2), keepdims=True)

X_train    = (X_train - mean_image) /std_image
X_val      = (X_val - mean_image) /std_image
X_test     = (X_test - mean_image) /std_image

X_train    = X_train.astype('float32')
X_val      = X_val.astype('float32')
X_test     = X_test.astype('float32')

print('X_train.shape =',X_train.shape)
print('X_val.shape   =',X_val.shape)
print('X_test.shape  =',X_test.shape)

y_train = y_train.ravel()
y_val   = y_val.ravel()
y_test  = y_test.ravel()

print('\ny_train.shape =',y_train.shape)
print('y_val.shape   =',y_val.shape)
print('y_test.shape  =',y_test.shape)
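
The normalization above relies on NumPy broadcasting: reducing over the batch, height, and width axes with `keepdims=True` leaves one mean and one standard deviation per color channel. Below is a minimal sketch of the same idea on a tiny random array; the array `batch` and its size are made up for illustration and are not CIFAR-10 data.

```python
import numpy as np

# Tiny stand-in for an image batch: 4 images, 2x2 pixels, 3 channels (illustrative only)
rng = np.random.default_rng(0)
batch = rng.uniform(0, 255, size=(4, 2, 2, 3)).astype('float32')

# Reduce over batch, height and width; keepdims leaves shape (1, 1, 1, 3)
mean = batch.mean(axis=(0, 1, 2), keepdims=True)
std = batch.std(axis=(0, 1, 2), keepdims=True)

# Broadcasting applies the per-channel statistics to every pixel
normalized = (batch - mean) / std

print(mean.shape)        # (1, 1, 1, 3)
print(normalized.shape)  # (4, 2, 2, 3)
```

After this step each channel has roughly zero mean and unit standard deviation. The validation and test sets reuse the statistics of the training set, so all three splits are scaled in the same way.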

One-hot encode the labels

In [None]:
y_train_hot = to_categorical(y_train, 10)
y_val_hot   = to_categorical(y_val, 10)
y_test_hot  = to_categorical(y_test, 10)

print('y_train_hot.shape =',y_train_hot.shape)
print('y_val_hot.shape   =',y_val_hot.shape)
print('y_test_hot.shape  =',y_test_hot.shape)
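
To see what one-hot encoding produces, here is a small NumPy-only sketch of the same kind of encoding. It is an illustration, not the implementation of `to_categorical`; the `labels` array is a made-up example.

```python
import numpy as np

labels = np.array([0, 3, 9, 3])   # integer class ids, like y_train after ravel()
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot.shape)            # (4, 10)
print(one_hot.argmax(axis=1))   # [0 3 9 3]
```

Each row contains a single 1 at the position of the class id, which is the target format expected by the `categorical_crossentropy` loss.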

---
## 5 - Define Hyperparameter

In [None]:
batch_size = 128
epochs = 10
history = {}
scores = {}

---
---
# [Part 1] Helper Function

---
## 1 - Print Model Parameters

Function to display the list of layers and their trainable status. <br> It also shows the total number of parameters

A simplified version of &nbsp;`model.summary()`

In [None]:
def print_params(model):  
    def count_params(weights):
        """Count the total number of scalars composing the weights.
        # Arguments
            weights: An iterable containing the weights on which to compute params
        # Returns
            The total number of scalars composing the weights
        """
        weight_ids = set()
        total = 0
        for w in weights:
            if id(w) not in weight_ids:
                weight_ids.add(id(w))
                total += int(K.count_params(w))
        return total
    
    trainable_count = count_params(model.trainable_weights)
    non_trainable_count = count_params(model.non_trainable_weights)
    
    print('id\ttrainable : layer name')
    print('-------------------------------')
    for i, layer in enumerate(model.layers):
        print(i,'\t',layer.trainable,'\t  :',layer.name)
    print('-------------------------------')

    print('Total params: {:,}'.format(trainable_count + non_trainable_count))
    print('Trainable params: {:,}'.format(trainable_count))
    print('Non-trainable params: {:,}'.format(non_trainable_count))

---
## 2 - Model Builder

Function to build the VGG model to train on CIFAR-10


In [None]:
def build_model(name, weights='imagenet', cut_at=-1, unfreeze_from=0, opt='adam'):
    
    # load model
    model = VGG16(weights=weights, include_top=False, input_shape=(32,32,3))
    
    # freeze all layer
    for layer in model.layers:
        layer.trainable = False
    
    # select layer output
    if cut_at==-1:
        x = model.output
    else:
        x = model.layers[cut_at].output
        
    # add new classifier head
    x = GlobalAveragePooling2D()(x)
    x = Dense(512, activation='relu')(x)
    x = Dense(512, activation='relu')(x)
    predictions = Dense(10, activation='softmax')(x)

    # instantiate new model
    myModel = Model(inputs=model.input, outputs=predictions, name=name)
        
    # unfreeze selected layer
    for layer in myModel.layers[unfreeze_from:]:
        layer.trainable = True
          
    # compile model
    myModel.compile(
          loss='categorical_crossentropy',
          optimizer=opt, 
          metrics=['accuracy']
    )
          
    # print parameters
    print_params(myModel)
    
    return myModel

---
## 3 - Plot Training History

In [None]:
def plot_history(history):
    plt.rcParams['figure.figsize'] = [12, 4]
    plt.subplots_adjust(wspace=0.2)

    plt.subplot(121)
    # Plot training & validation accuracy values
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('Model accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Val'])

    plt.subplot(122)
    # Plot training & validation loss values
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('Model loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Val'])
    plt.show()

---
## 4 - Plot All Training Histories

In [None]:
def plot_all(history, labels=None):  
    if labels is None:
        hist = history
    else:
        hist = {k: history[k] for k in labels}
    
    plt.rcParams['figure.figsize'] = [14, 5]
    plt.subplots_adjust(wspace=0.2)
    
    plt.subplot(121)
    for key, h in hist.items():
        plt.plot(h.history['accuracy'], label=key)
    plt.title('Train accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend()

    plt.subplot(122)
    for key, h in hist.items():
        plt.plot(h.history['val_accuracy'], label=key)
    plt.title('Validation Accuracy')
    plt.xlabel('Epoch')
    plt.legend()
    
    plt.tight_layout()
    plt.show()  
  

---
## 5 - Plot Accuracy Bar

In [None]:
def plot_bar(scores):  
    scores_1={k: scores[k] for k in ('full_scratch', 'full_all', 'full_classifier', 'full_10')}
    scores_2={k: scores[k] for k in ('mini_scratch', 'mini_all', 'mini_classifier', 'mini_10')}
    
    plt.rcParams['figure.figsize'] = [8, 6]
    acc_1, acc_2 = [], []
    labels = ['scratch', 'all', 'classifier', '10']
    x = np.arange(len(labels))
    width = 0.35
    
    for key, s in scores_1.items():
        acc_1.append(s[1])    
        
    for key, s in scores_2.items():
        acc_2.append(s[1])
        
    fig, ax = plt.subplots()
    plt.bar(x - width/2, acc_1, width, label='full')
    plt.bar(x + width/2, acc_2, width, label='mini')
    
    plt.xticks(x,labels)
    plt.title('Test Accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Model')
    plt.ylim(0,.9)
    plt.legend()
    plt.tight_layout()
    plt.show()  
  

---
---
# [Part 2] Train Full VGG


As you've tried in the previous exercise, we're now going to train the Full VGG on the CIFAR-10 dataset

The Full VGG architecture takes all 13 Conv layers; we then add a Global Average Pooling layer followed by three FC layers

We'll compare the performance when we

    a. train the architecture from scratch (random initialization),
    b. fine tune all layers starting from the ImageNet pretrained weights, 
    c. train the classifier head only, and
    d. fine tune from layer 10 to the classifier head

---
## 1 - Train From Scratch

Let's define our first model

In [None]:
model_1S = build_model('full_scratch', weights=None)

Now train the model for 10 epochs with a batch size of 128

In [None]:
history[model_1S.name] = model_1S.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should plateau around 2.3 
with the training accuracy plateauing around 10% 
</pre>

In [None]:
plot_history(history[model_1S.name])

scores[model_1S.name] = model_1S.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_1S.name][1]*100))
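
The plateau near 2.3 in the expected output is what a classifier would score if it assigned equal probability to all 10 classes. A short worked check, as a sketch under that uniform-prediction assumption:

```python
import numpy as np

num_classes = 10

# A model that has learned nothing predicts the same probability for every class
probs = np.full(num_classes, 1.0 / num_classes)

# Cross-entropy against a one-hot target is -log(probability of the true class)
chance_loss = -np.log(probs[0])

print(round(float(chance_loss), 4))  # 2.3026
```

A loss stuck at about $\ln(10) \approx 2.3$ together with $10\%$ accuracy therefore indicates that the network is doing no better than guessing.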

### NOTE
* You should see that the training gets stuck at $10\%$ accuracy

* One possible cause is that the model is too deep, with too many parameters to train. 

* Another possible cause, as has been explained before, is that using all 5 blocks of VGG yields a $[1\times1]$ output feature map,<br> which is far too little to classify<br> *see &nbsp;`model.summary()`*
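
The $[1\times1]$ output can be worked out by hand. The convolutions in VGG16 use `same` padding, so only the $2\times2$ pooling layers shrink the feature map. The sketch below assumes stride-2 pooling and a $32\times32$ input; the helper name `spatial_size_after_pools` is made up for this illustration.

```python
def spatial_size_after_pools(input_size, num_pools):
    """Side length of the feature map after `num_pools` 2x2, stride-2 poolings."""
    size = input_size
    for _ in range(num_pools):
        size //= 2
    return size

# Full VGG16 base: all 5 pooling layers are applied
full_size = spatial_size_after_pools(32, 5)

# A base cut at layer index 13 (block4_conv3): only 3 pooling layers come before it
mini_size = spatial_size_after_pools(32, 3)

print(full_size, mini_size)  # 1 4
```

So the full base hands a single spatial position per channel to the classifier head, while a base cut before the fourth pooling layer still keeps a $4\times4$ map.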

---
## 2 - Fine Tune ImageNet Pretrain

Now let's try to fine tune the network using weights taken from ImageNet Pretraining

In [None]:
model_1A = build_model('full_all', weights='imagenet')

In [None]:
history[model_1A.name] = model_1A.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should start around 1.9 and end around 0.35 
with the training accuracy starting around 20% and ending around 88%
</pre>

In [None]:
plot_history(history[model_1A.name])

scores[model_1A.name] = model_1A.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_1A.name][1]*100))

### NOTE
* You should see that the pretrained weights greatly help the training, as the training accuracy jumps straight from $20\%$ to $80\%$

* This happens because the weights already form a feature extractor from being trained on ImageNet

* And since the CIFAR-10 classes are similar to object categories found in ImageNet, the model doesn't need to learn much in the Conv layers and can go almost straight to training the classifier head

* As we set all layers trainable, the weights in all Conv layers are also updated during training to fit the current dataset

---
## 3 - Train Classifier Head

Based on the previous explanation, we conclude that the Conv layers are already trained, so we can train just the classifier head

For that, this time we freeze all layers except the last three FC layers
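
To get a feel for how much smaller the training problem becomes once the Conv layers are frozen, the parameter counts can be worked out by hand. The sketch below assumes the standard VGG16 layout ($3\times3$ convolutions with biases) and the 512-512-10 head built in `build_model`; the helper names are made up for illustration.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# (in_channels, out_channels) of every conv layer, grouped by VGG16 block
blocks = [
    [(3, 64), (64, 64)],
    [(64, 128), (128, 128)],
    [(128, 256), (256, 256), (256, 256)],
    [(256, 512), (512, 512), (512, 512)],
    [(512, 512), (512, 512), (512, 512)],
]
conv_base = sum(conv_params(i, o) for block in blocks for i, o in block)

# Global average pooling yields 512 features, followed by Dense 512, 512, 10
head = dense_params(512, 512) + dense_params(512, 512) + dense_params(512, 10)

print(f"conv base : {conv_base:,}")         # conv base : 14,714,688
print(f"head      : {head:,}")              # head      : 530,442
print(f"all layers: {conv_base + head:,}")  # all layers: 15,245,130
```

With the base frozen, only the roughly half a million head parameters are updated. These figures can be compared with the trainable and non-trainable counts printed by `print_params`.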

In [None]:
model_1C = build_model('full_classifier', weights='imagenet', unfreeze_from=19)

In [None]:
history[model_1C.name] = model_1C.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should start around 1.2 and end around 0.24 
with the training accuracy starting around 60% and ending around 91%
</pre>

In [None]:
plot_history(history[model_1C.name])

scores[model_1C.name] = model_1C.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_1C.name][1]*100))

### NOTE
* ***Oh No!*** *The model overfits!*

* This may happen because, again, the model is too deep and the output is just $[1\times1]$

* The features created by the later conv layers are too detailed for the CIFAR-10 dataset, thus the classifier head overfits to the training set
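
One simple way to put a number on overfitting is the gap between the final training and validation accuracy. The helper below is a hypothetical sketch, and the two curves are placeholder values invented only to exercise the function; they are not results from this notebook.

```python
def generalization_gap(train_acc, val_acc):
    """Final-epoch training accuracy minus final-epoch validation accuracy."""
    return train_acc[-1] - val_acc[-1]

# Placeholder curves (NOT real results), one value per epoch
train_acc = [0.60, 0.80, 0.91]
val_acc = [0.58, 0.62, 0.63]

gap = generalization_gap(train_acc, val_acc)
print(f"{gap:.2f}")  # 0.28
```

In this notebook the same function could be called with `history['full_classifier'].history['accuracy']` and `history['full_classifier'].history['val_accuracy']`. A large positive gap suggests that the model fits the training set much better than unseen data.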

---
## 4 - Train From Layer 10

So now, let's try to fine tune more layers so that the model can fit better to CIFAR-10 dataset

Here we train the layers from the $4^{th}$ block up to the classifier head
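
To see why index 10 corresponds to the $4^{th}$ block, the layer order of the VGG16 base can be listed by index. This is a sketch that assumes the usual Keras layer naming (`block1_conv1` ... `block5_pool`); the name of the input layer at index 0 varies between sessions, so `'input'` is a placeholder.

```python
# Number of conv layers in each of the five VGG16 blocks
convs_per_block = [2, 2, 3, 3, 3]

layer_names = ['input']  # placeholder name for the input layer (index 0)
for b, n_convs in enumerate(convs_per_block, start=1):
    layer_names += [f'block{b}_conv{c}' for c in range(1, n_convs + 1)]
    layer_names.append(f'block{b}_pool')

unfreeze_from = 10
frozen = layer_names[:unfreeze_from]
unfrozen = layer_names[unfreeze_from:]

print(len(layer_names))   # 19
print(layer_names[10])    # block3_pool
print(unfrozen[1])        # block4_conv1
print(layer_names[13])    # block4_conv3
```

Index 10 is the pooling layer that closes block 3 and has no weights, so the first weights that become trainable belong to `block4_conv1`. The base has 19 layers (indices 0 to 18), which is why the classifier head of the full model starts at index 19.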

In [None]:
model_1N = build_model('full_10', weights='imagenet', unfreeze_from=10)

In [None]:
history[model_1N.name] = model_1N.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should start around 1.7 and end around 0.3 
with the training accuracy starting around 30% and ending around 89%
</pre>

In [None]:
plot_history(history[model_1N.name])

scores[model_1N.name] = model_1N.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_1N.name][1]*100))

### NOTE
* *Much better*

* By training from a much shallower layer, the accuracy has increased

* This happens because the earlier conv layers extract lower-level features such as dots, colors, and lines, which don't need to change much to be used on the CIFAR-10 dataset

---
## 5 - Comparison

In [None]:
plot_all(history, ['full_scratch', 'full_all', 'full_classifier', 'full_10'])

You should see that the validation accuracy is roughly equal between the model fine tuned on all layers and the one fine tuned only from layer 10

---
---
# [Part 3] Train Mini VGG

Now let's try all that again, but using Mini VGG

As you've implemented in the previous exercise, we cut the VGG at the fourth block so that the output feature activation is not too small.

Then we add three FC layers as the classification head

<br>

Again, we'll compare the performance if we train

    a. the architecture from scratch (random initialization),
    b. fine tune all layers starting from the ImageNet pretrained weights, 
    c. train the classifier head only, and
    d. fine tune from layer 10 to the classifier head

---
## 1 - Train From Scratch

In [None]:
model_2S = build_model('mini_scratch', cut_at=13, weights=None)

In [None]:
history[model_2S.name] = model_2S.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should start around 2 and end around 0.4 
with the training accuracy starting around 20% and ending around 85%
</pre>

In [None]:
plot_history(history[model_2S.name])

scores[model_2S.name] = model_2S.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_2S.name][1]*100))

### NOTE
* Training mini VGG from scratch performs much better than Full VGG, as the output feature map is larger

* With more input features, the classifier head can classify better

* <font color='red'>nb: You might get stuck because of a bad weight initialization; if so, re-initialize the model and train it again</font>

---
## 2 - Fine Tune ImageNet Pretrain

Now let's fine tune using ImageNet Pretraining

In [None]:
model_2A = build_model('mini_all', cut_at=13, weights='imagenet')

In [None]:
history[model_2A.name] = model_2A.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should start around 1.9 and end around 0.25
with the accuracy starting around 25% and ending around 91%
</pre>

In [None]:
plot_history(history[model_2A.name])

scores[model_2A.name] = model_2A.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_2A.name][1]*100))

### NOTE
* Training the pretrained mini VGG does not decrease the performance

* This indicates that the network does not have to be that deep for this small dataset

---
## 3 - Train Classifier Head

So let's try to train the classifier head only

In [None]:
# fine tune classifier head
model_2C = build_model('mini_classifier', cut_at=13, weights='imagenet', unfreeze_from=14)


In [None]:
history[model_2C.name] = model_2C.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should start around 1.0 and end around 0.2 
with the accuracy starting around 65% and ending around 92%
</pre>

In [None]:
plot_history(history[model_2C.name])

scores[model_2C.name] = model_2C.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_2C.name][1]*100))

### NOTE
* It seems that training only the classifier head still leads to overfitting

* Though with the shallower model, we get slightly better performance

---
## 4 - Train From Layer 10

Now let's fine tune the architecture from the $4^{th}$ block up to the classifier head 

In [None]:
model_2N = build_model('mini_10', cut_at=13, weights='imagenet', unfreeze_from=10 )

In [None]:
history[model_2N.name] = model_2N.fit(
    X_train, y_train_hot, 
    validation_data=(X_val, y_val_hot), 
    batch_size=batch_size, 
    epochs=epochs, 
    verbose=2
)

**EXPECTED OUTPUT**:
<pre>
the training loss should start around 1.5 and end around 0.3 
with the accuracy starting around 45% and ending around 89%
</pre>

In [None]:
plot_history(history[model_2N.name])

scores[model_2N.name] = model_2N.evaluate(X_test, y_test_hot, verbose=0)

print("\nModel Accuracy: %.2f%%" % (scores[model_2N.name][1]*100))

### NOTE
* Again, training from a much shallower layer increases the accuracy and reduces overfitting


---
## 5 - Comparison

In [None]:
plot_all(history, ['mini_scratch', 'mini_all', 'mini_classifier', 'mini_10'])

You should see that the validation accuracy is roughly equal between the model fine tuned on all layers and the one fine tuned only from layer 10

---
---

# [Part 4] Overall Scores

Let's show the training, validation, and testing accuracy comparison

---
## 1 - Train-Val Accuracy

In [None]:
plot_all(history)

---
## 2 - Test Accuracy

In [None]:
plot_bar(scores)

---
---
# [Part 5] CIFAR-10 Open-ended Challenge

In this section you can experiment with whatever ConvNet architecture you'd like on CIFAR-10.

You should experiment with **architectures**, **hyperparameters**, **loss functions**, **regularization**, or anything else you can think of to train a model 

You should achieve <font color='blue' size='5'><b>at least 90% accuracy</b></font> on the **validation** set <font color='red' size='4'><b>within 10-20 epochs</b></font>. 


---
## Available Models you can try:
- [VGG19](https://www.tensorflow.org/api_docs/python/tf/keras/applications/vgg19)
- [InceptionV3](https://www.tensorflow.org/api_docs/python/tf/keras/applications/inception_v3)
- [ResNet](https://www.tensorflow.org/api_docs/python/tf/keras/applications/resnet)
- [ResNet v2](https://www.tensorflow.org/api_docs/python/tf/keras/applications/resnet_v2)
- [Inception ResNet v2](https://www.tensorflow.org/api_docs/python/tf/keras/applications/inception_resnet_v2)
- [Xception](https://www.tensorflow.org/api_docs/python/tf/keras/applications/xception)
- [DenseNet](https://www.tensorflow.org/api_docs/python/tf/keras/applications/densenet)
- [NasNet](https://www.tensorflow.org/api_docs/python/tf/keras/applications/nasnet)
- [MobileNet](https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet)
- [MobileNet v2](https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet_v2)

---
## Tips for training
For each network architecture that you try, you should tune the learning rate and other hyperparameters. 

When doing this there are a couple important things to keep in mind:

- If the parameters are working well, you should see improvement within a few hundred iterations

- Use a small learning rate when fine tuning a pretrained model

- Once you have found some sets of parameters that seem to work, search more finely around these parameters. You may need to train for more epochs.

- You should use the validation set for hyperparameter search, and save your test set for evaluating your architecture on the best parameters as selected by the validation set.

<center>
<h2><font color='blue'>--- Go Wild, Have Fun, and Happy Training!  --- </font></h2>

---
## 1 - Define Model

---
#### <font color='red'>**EXERCISE:** </font>

    Design your Convolutional Neural Network Architecture

    

In [None]:
myModel = ??


myModel.summary()

---
## 2 - Train Model

---
#### <font color='red'>**EXERCISE:** </font>

    Compile the model
    Train the model
    

In [None]:
# Compile model
myModel.compile(??)

num_epochs = ??
batch_size = ??

history = myModel.fit(??)


---
## 3 - Evaluate Model

---
#### <font color='red'>**EXERCISE:** </font>

    evaluate your model on test set
    

In [None]:
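# load_model is not imported in Part 0, so import it here
from tensorflow.keras.models import load_model

myModel = load_model(??)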

train_scores = myModel.evaluate(X_train, y_train_hot, verbose=1)
val_scores   = myModel.evaluate(X_val, y_val_hot, verbose=1)
test_scores  = myModel.evaluate(X_test, y_test_hot, verbose=1)

print("\nTraining Accuracy: %.2f%%" % (train_scores[1]*100))
print("Validation Accuracy: %.2f%%" % (val_scores[1]*100))
print("Testing Accuracy:    %.2f%%" % (test_scores[1]*100))

**EXPECTED OUTPUT**:
<pre>
you should get above 90% accuracy on the train, val, and test sets
</pre>

---
## 4 - Test Model on New Image

For this part, you have to test your model on new images

First of all, search for five images on the Internet, then list their URLs in the code below.

The five images must belong to the 10 CIFAR-10 classes that the model recognizes.

---
#### <font color='red'>**EXERCISE:** </font>

    define five image urls
    one image has been given for an example, you can change it

In [None]:
!wget -q -O 'data_test_0.jpg' 'https://ichef.bbci.co.uk/news/912/cpsprodpb/160B4/production/_103229209_horsea.png'
!wget -q -O 'data_test_1.jpg' '??'
!wget -q -O 'data_test_2.jpg' '??'
!wget -q -O 'data_test_3.jpg' '??'
!wget -q -O 'data_test_4.jpg' '??'

Run and Recognize the images

In [None]:
import cv2 as cv
from PIL import Image
class_names = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


for i in range(5):
  # force 3 RGB channels (downloaded files may be grayscale or carry an alpha channel)
  new_img = Image.open('data_test_'+str(i)+'.jpg').convert('RGB')
  new_img = np.array(new_img)
  new_img2 = cv.resize(new_img, (32,32), interpolation=cv.INTER_AREA)
  plt.imshow(new_img2)
  plt.axis('off')
  plt.show()

  # normalize with the training-set statistics computed in Part 0;
  # mean_image has shape (1,1,1,3), so broadcasting yields a batch of one image: (1,32,32,3)
  new_img2 = (new_img2.astype('float32') - mean_image) / std_image
  pred = myModel.predict(new_img2)
  class_id = np.argmax(pred)
  print('predicted id   :',class_id)
  print('predicted class:', class_names[class_id])
  print('--------------------------------\n\n')


---

# Congratulations, You've Completed Exercise 10

<p>Copyright &copy;  <a href=https://www.linkedin.com/in/andityaarifianto/>2020 - ADF</a> </p>

![footer](https://i.ibb.co/yX0jfMS/footer2020.png)