# Observe results before and after applying Transfer Learning.

Transfer learning is a research problem in machine learning and deep learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks. In other words, what a model has already learned helps it pick up a new task. In practice, we take a pre-trained model and modify part of it to build a new model.

Here we first build a handwritten digit classifier on the MNIST data. We then build a classifier that predicts whether a given digit is odd or even in two ways: once trained from scratch, and once by modifying the pre-trained digit model through transfer learning. Finally, we compare the two.


# First, create a model that classifies MNIST handwritten digits:

In [1]:
# Importing necessary libraries
import os
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import time
plt.style.use("fivethirtyeight")
%load_ext tensorboard

In [2]:
# Loading the MNIST handwritten digits data
(X_train_full, y_train_full), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
# Scaling pixel values from [0, 255] to [0, 1]
X_train_full = X_train_full / 255.0
X_test = X_test / 255.0
# Holding out the first 5000 training samples for validation
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
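
The scaling and validation split in the cell above can be checked in isolation. The following is a minimal sketch on a small random stand-in array rather than the real MNIST images; the helper name `scale_and_split` and the stand-in data are assumptions made for illustration only.

```python
import numpy as np

def scale_and_split(images, labels, n_valid):
    """Scale pixels to [0, 1] and hold out the first n_valid samples for validation."""
    scaled = images / 255.0
    train = (scaled[n_valid:], labels[n_valid:])
    valid = (scaled[:n_valid], labels[:n_valid])
    return train, valid

# Small random stand-in for MNIST (hypothetical data, not the real images)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=100, dtype=np.uint8)

(X_tr, y_tr), (X_val, y_val) = scale_and_split(fake_images, fake_labels, n_valid=10)
print(X_tr.shape, X_val.shape)  # (90, 28, 28) (10, 28, 28)
```

With the real 60000 training images and `n_valid=5000`, the same split leaves 55000 samples for training.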

In [3]:
# Creating the layers of the model
tf.random.set_seed(42)  # For reproducible output (optional)
np.random.seed(42)  # For reproducible output (optional)

LAYERS = [ tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, kernel_initializer="he_normal"),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dense(100, kernel_initializer="he_normal"),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dense(10, activation="softmax")]


model = tf.keras.models.Sequential(LAYERS)

In [4]:
# Compiling the model
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3),
              metrics=["accuracy"])

In [5]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense (Dense)               (None, 300)               235500    
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 300)               0         
                                                                 
 dense_1 (Dense)             (None, 100)               30100     
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 100)               0         
                                                                 
 dense_2 (Dense)             (None, 10)                1010      
                                                                 
Total params: 266610 (1.02 MB)
Trainable params: 266610 
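
The parameter counts in the summary above follow from the layer sizes: a dense layer has one weight per input/unit pair plus one bias per unit. A short sketch of that arithmetic, where the helper name `dense_params` is an assumption made for illustration:

```python
def dense_params(n_inputs, n_units):
    """Number of weights plus biases in a fully connected layer."""
    return n_inputs * n_units + n_units

# Flattened 28x28 input, then the three Dense layers from the model above
layer_sizes = [784, 300, 100, 10]
per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [235500, 30100, 1010] 266610
```

The Flatten and LeakyReLU layers contribute no parameters, which matches the zeros in the summary.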

In [6]:
# Let's train the model
history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_valid, y_valid), verbose=2)

Epoch 1/10
1719/1719 - 6s - loss: 1.5417 - accuracy: 0.5967 - val_loss: 0.9407 - val_accuracy: 0.8024 - 6s/epoch - 3ms/step
Epoch 2/10
1719/1719 - 6s - loss: 0.7444 - accuracy: 0.8224 - val_loss: 0.5932 - val_accuracy: 0.8540 - 6s/epoch - 3ms/step
Epoch 3/10
1719/1719 - 6s - loss: 0.5451 - accuracy: 0.8571 - val_loss: 0.4782 - val_accuracy: 0.8738 - 6s/epoch - 3ms/step
Epoch 4/10
1719/1719 - 5s - loss: 0.4639 - accuracy: 0.8737 - val_loss: 0.4198 - val_accuracy: 0.8882 - 5s/epoch - 3ms/step
Epoch 5/10
1719/1719 - 5s - loss: 0.4189 - accuracy: 0.8835 - val_loss: 0.3850 - val_accuracy: 0.8958 - 5s/epoch - 3ms/step
Epoch 6/10
1719/1719 - 4s - loss: 0.3897 - accuracy: 0.8909 - val_loss: 0.3618 - val_accuracy: 0.9008 - 4s/epoch - 3ms/step
Epoch 7/10
1719/1719 - 5s - loss: 0.3686 - accuracy: 0.8958 - val_loss: 0.3438 - val_accuracy: 0.9054 - 5s/epoch - 3ms/step
Epoch 8/10
1719/1719 - 6s - loss: 0.3524 - accuracy: 0.9003 - val_loss: 0.3296 - val_accuracy: 0.9104 - 6s/epoch - 3ms/step
Epoch 9/
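
The `1719` shown at the start of each epoch line is the number of batches per epoch. A minimal sketch of where it comes from, assuming the Keras default batch size of 32 (the `fit` call above does not set `batch_size`):

```python
import math

n_train = 60000 - 5000   # MNIST training images minus the validation split
batch_size = 32          # Keras default when fit() is called without batch_size
steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)  # 1719
```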

In [8]:
# Saving the model
# model.save("pretrained_mnist_model.h5")
model.save("pretrained_mnist_model.keras")

# Now let's create a model that predicts whether a given number is odd or even, without using transfer learning.

In [9]:
# Converting digit labels to even/odd categories, where even is 1 and odd is 0

def update_even_odd_labels(labels):
  # labels is a list of label arrays; a new list of 0/1 arrays is returned
  return [np.where(label % 2 == 0, 1, 0) for label in labels]

In [10]:
y_train_bin, y_test_bin, y_valid_bin = update_even_odd_labels([y_train, y_test, y_valid])
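
To see what the relabeling does, here is a small sketch on the ten digits themselves. It uses a vectorized comparison instead of `np.where`; the helper name `to_even_odd` is an assumption made for illustration and is not part of the notebook.

```python
import numpy as np

def to_even_odd(labels):
    """Map digit labels to 1 for even and 0 for odd, without modifying the input."""
    labels = np.asarray(labels)
    return (labels % 2 == 0).astype(np.uint8)

digits = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=np.uint8)
parity = to_even_odd(digits)
print(parity)  # [1 0 1 0 1 0 1 0 1 0]
```

Because the function returns a new array, the original digit labels stay available for the 10-class model.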

In [11]:
# Creating the layers of the model
tf.random.set_seed(42)  # For reproducible output (optional)
np.random.seed(42)  # For reproducible output (optional)

LAYERS = [ tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(300, kernel_initializer="he_normal"),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dense(100, kernel_initializer="he_normal"),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.Dense(2, activation="softmax")]  # 2 output units, because the prediction is either 0 (odd) or 1 (even)


model_1 = tf.keras.models.Sequential(LAYERS)

In [12]:
model_1.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten_1 (Flatten)         (None, 784)               0         
                                                                 
 dense_3 (Dense)             (None, 300)               235500    
                                                                 
 leaky_re_lu_2 (LeakyReLU)   (None, 300)               0         
                                                                 
 dense_4 (Dense)             (None, 100)               30100     
                                                                 
 leaky_re_lu_3 (LeakyReLU)   (None, 100)               0         
                                                                 
 dense_5 (Dense)             (None, 2)                 202       
                                                                 
Total params: 265802 (1.01 MB)
Trainable params: 26580

In [13]:
# Compiling new model
model_1.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3),
              metrics=["accuracy"])

In [14]:
# Now training the model and measuring the training time

# starting time
start = time.time()

history = model_1.fit(X_train, y_train_bin, epochs=10,
                    validation_data=(X_valid, y_valid_bin), verbose=2)

#ending time
end = time.time()

# total time taken
print(f"Runtime of the program is {end - start}")

Epoch 1/10
1719/1719 - 6s - loss: 0.4486 - accuracy: 0.7950 - val_loss: 0.3346 - val_accuracy: 0.8620 - 6s/epoch - 3ms/step
Epoch 2/10
1719/1719 - 5s - loss: 0.3172 - accuracy: 0.8657 - val_loss: 0.2832 - val_accuracy: 0.8830 - 5s/epoch - 3ms/step
Epoch 3/10
1719/1719 - 5s - loss: 0.2813 - accuracy: 0.8845 - val_loss: 0.2530 - val_accuracy: 0.9026 - 5s/epoch - 3ms/step
Epoch 4/10
1719/1719 - 5s - loss: 0.2564 - accuracy: 0.8986 - val_loss: 0.2308 - val_accuracy: 0.9126 - 5s/epoch - 3ms/step
Epoch 5/10
1719/1719 - 5s - loss: 0.2358 - accuracy: 0.9090 - val_loss: 0.2119 - val_accuracy: 0.9232 - 5s/epoch - 3ms/step
Epoch 6/10
1719/1719 - 5s - loss: 0.2184 - accuracy: 0.9179 - val_loss: 0.1955 - val_accuracy: 0.9316 - 5s/epoch - 3ms/step
Epoch 7/10
1719/1719 - 6s - loss: 0.2029 - accuracy: 0.9258 - val_loss: 0.1819 - val_accuracy: 0.9398 - 6s/epoch - 3ms/step
Epoch 8/10
1719/1719 - 5s - loss: 0.1895 - accuracy: 0.9317 - val_loss: 0.1698 - val_accuracy: 0.9442 - 5s/epoch - 3ms/step
Epoch 9/

## Conclusion:
The runtime of the program is 50.24 sec and the val_accuracy is 0.9516.

# Now let's create the same odd/even model again, this time using transfer learning.

In [15]:
# Loading pre-trained model
# pretrained_mnist_model = tf.keras.models.load_model("pretrained_mnist_model.h5")
pretrained_mnist_model = tf.keras.models.load_model("pretrained_mnist_model.keras")

In [16]:
pretrained_mnist_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense (Dense)               (None, 300)               235500    
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 300)               0         
                                                                 
 dense_1 (Dense)             (None, 100)               30100     
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 100)               0         
                                                                 
 dense_2 (Dense)             (None, 10)                1010      
                                                                 
Total params: 266610 (1.02 MB)
Trainable params: 266610 

In [17]:
# Checking whether the layers are trainable
for layer in pretrained_mnist_model.layers:
  print(f"{layer.name}: {layer.trainable}")

flatten: True
dense: True
leaky_re_lu: True
dense_1: True
leaky_re_lu_1: True
dense_2: True


In [18]:
# Let's freeze every layer except the last one (set trainable to False)
for layer in pretrained_mnist_model.layers[:-1]:
  layer.trainable = False
  print(f"{layer.name}: {layer.trainable}")

flatten: False
dense: False
leaky_re_lu: False
dense_1: False
leaky_re_lu_1: False


In [19]:
for layer in pretrained_mnist_model.layers:
  print(f"{layer.name}: {layer.trainable}")

flatten: False
dense: False
leaky_re_lu: False
dense_1: False
leaky_re_lu_1: False
dense_2: True


In [20]:
# Now build a new model that reuses the frozen layers and adds a new 2-unit output layer
lower_pretrained_layers = pretrained_mnist_model.layers[:-1]

new_model = tf.keras.models.Sequential(lower_pretrained_layers)
new_model.add(
    tf.keras.layers.Dense(2, activation="softmax")
)
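
What freezing means for training can be illustrated without Keras. The sketch below is a toy NumPy version under stated assumptions: the layer sizes, the random data, the learning rate, and the LeakyReLU slope of 0.3 are all chosen for illustration. It runs one gradient step in which only the new output head is updated, while the reused hidden layer is left untouched.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "pretrained" hidden layer: frozen, so it is never updated below
W_frozen = rng.normal(size=(4, 3))
b_frozen = np.zeros(3)

# New trainable output head with 2 classes (odd / even)
W_head = rng.normal(scale=0.1, size=(3, 2))
b_head = np.zeros(2)

def leaky_relu(x, alpha=0.3):
    return np.where(x > 0, x, alpha * x)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

X = rng.normal(size=(8, 4))        # toy inputs
y = rng.integers(0, 2, size=8)     # toy 0/1 labels

W_frozen_before = W_frozen.copy()
W_head_before = W_head.copy()

# Forward pass: frozen features, then the new head
features = leaky_relu(X @ W_frozen + b_frozen)
probs = softmax(features @ W_head + b_head)

# Gradient of the mean sparse categorical cross-entropy w.r.t. the logits
grad_logits = probs.copy()
grad_logits[np.arange(len(y)), y] -= 1.0
grad_logits /= len(y)

# One SGD step on the head only; no gradient is applied to the frozen layer
lr = 0.1
W_head -= lr * (features.T @ grad_logits)
b_head -= lr * grad_logits.sum(axis=0)

print(np.array_equal(W_frozen, W_frozen_before))  # True
```

In the notebook, Keras does the equivalent bookkeeping: layers with `trainable = False` are excluded from the weight updates during `fit`.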

In [21]:
new_model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense (Dense)               (None, 300)               235500    
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 300)               0         
                                                                 
 dense_1 (Dense)             (None, 100)               30100     
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 100)               0         
                                                                 
 dense_6 (Dense)             (None, 2)                 202       
                                                                 
Total params: 265802 (1.01 MB)
Trainable params: 202 (
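
The split between frozen and trainable parameters in this summary can be reproduced from the per-layer counts shown above. A short sketch of the arithmetic (the variable names are illustrative):

```python
frozen_params = 235500 + 30100     # dense and dense_1, reused with trainable = False
trainable_params = 100 * 2 + 2     # new Dense(2) head: weights plus biases
total_params = frozen_params + trainable_params
trainable_share = trainable_params / total_params

print(total_params, trainable_params)  # 265802 202
print(f"{trainable_share:.4%} of the parameters are trained")
```

So well under 1% of the weights are updated during training of the transfer learning model.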

In [22]:
# Converting digit labels to even/odd categories, where even is 1 and odd is 0
# (same helper as in the previous section)

def update_even_odd_labels(labels):
  # labels is a list of label arrays; a new list of 0/1 arrays is returned
  return [np.where(label % 2 == 0, 1, 0) for label in labels]

In [23]:
y_train_bin, y_test_bin, y_valid_bin = update_even_odd_labels([y_train, y_test, y_valid])

In [24]:
# Compiling new model
new_model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3),
              metrics=["accuracy"])

In [25]:
# starting time
start = time.time()

history = new_model.fit(X_train, y_train_bin, epochs=10,
                    validation_data=(X_valid, y_valid_bin), verbose=2)

# ending time
end = time.time()
print(f"Runtime of the program is {end - start}")

Epoch 1/10
1719/1719 - 4s - loss: 0.4332 - accuracy: 0.8106 - val_loss: 0.3448 - val_accuracy: 0.8480 - 4s/epoch - 3ms/step
Epoch 2/10
1719/1719 - 4s - loss: 0.3490 - accuracy: 0.8460 - val_loss: 0.3230 - val_accuracy: 0.8592 - 4s/epoch - 3ms/step
Epoch 3/10
1719/1719 - 4s - loss: 0.3332 - accuracy: 0.8553 - val_loss: 0.3107 - val_accuracy: 0.8662 - 4s/epoch - 2ms/step
Epoch 4/10
1719/1719 - 4s - loss: 0.3231 - accuracy: 0.8606 - val_loss: 0.3027 - val_accuracy: 0.8720 - 4s/epoch - 2ms/step
Epoch 5/10
1719/1719 - 4s - loss: 0.3155 - accuracy: 0.8643 - val_loss: 0.2961 - val_accuracy: 0.8774 - 4s/epoch - 2ms/step
Epoch 6/10
1719/1719 - 4s - loss: 0.3097 - accuracy: 0.8674 - val_loss: 0.2909 - val_accuracy: 0.8806 - 4s/epoch - 2ms/step
Epoch 7/10
1719/1719 - 4s - loss: 0.3047 - accuracy: 0.8699 - val_loss: 0.2868 - val_accuracy: 0.8818 - 4s/epoch - 2ms/step
Epoch 8/10
1719/1719 - 4s - loss: 0.3006 - accuracy: 0.8726 - val_loss: 0.2831 - val_accuracy: 0.8808 - 4s/epoch - 3ms/step
Epoch 9/

## Conclusion:
The runtime of the program is 42.07 sec and the val_accuracy is 0.8830.

# Comparison:

## Without Transfer learning:
- Runtime of the program is 50.24 sec 
- val_accuracy: 0.9516

## With Transfer Learning:
- Runtime of the program is 42.07 sec
- val_accuracy: 0.8830


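Here we can see that the transfer learning model reaches a validation accuracy (0.8830) that is reasonably close to that of the model trained from scratch (0.9516), even though it trains only 202 parameters. Training for more epochs would likely raise its accuracy further. The training times are similar on this small problem (42.07 sec versus 50.24 sec), but on a larger problem transfer learning may take noticeably less time than training without it.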
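
The comparison can be put into numbers directly from the figures reported above. A minimal sketch, where the dictionary layout and variable names are assumptions made for illustration:

```python
results = {
    "without_transfer": {"runtime_s": 50.24, "val_accuracy": 0.9516},
    "with_transfer": {"runtime_s": 42.07, "val_accuracy": 0.8830},
}

base = results["without_transfer"]
transfer = results["with_transfer"]

# Accuracy given up, and time saved, by training only the new output layer
accuracy_gap = base["val_accuracy"] - transfer["val_accuracy"]
time_saved_s = base["runtime_s"] - transfer["runtime_s"]
time_saved_pct = 100 * time_saved_s / base["runtime_s"]

print(f"accuracy gap: {accuracy_gap:.4f}")
print(f"time saved: {time_saved_s:.2f} sec ({time_saved_pct:.1f}%)")
```

These are single-run timings, so the difference in runtime should be read as indicative rather than as a precise measurement.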