# Problem III

## Part A
### In theory

When applying transfer learning to a small dataset that differs substantially from the data the pre-trained model was trained on, one guideline is worth keeping in mind:

**Train only the higher layers of the pre-trained model**: Freeze the parameters of the first layers of the network. These layers extract generic features of the input (such as edges and textures) that are likely to be useful for our task, so we keep them exactly as they were in the pre-trained model. The higher layers extract features that are more specific to the original dataset and may not be what we need for ours, so we train these layers together with any layers we add to the end of the network.

![freezing lower layers](https://www.researchgate.net/publication/333882146/figure/fig4/AS:771649246879745@1560986925876/TOP-LEVEL-DIAGRAM-OF-TRANSFER-LEARNING-FROM-A-PRE-TRAINED-CNN-MODEL.png)
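
The freezing strategy described above can be sketched on a toy network. This is only a minimal illustration, not the model used in the next section: the layer names (`low_conv`, `mid_conv`, `high_conv`, `new_head`) and layer sizes are made up, and the weights are randomly initialized rather than pre-trained.

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained network (random weights here).
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, activation="relu", name="low_conv")(inputs)
x = tf.keras.layers.Conv2D(16, 3, activation="relu", name="mid_conv")(x)
x = tf.keras.layers.Conv2D(32, 3, activation="relu", name="high_conv")(x)
x = tf.keras.layers.GlobalAveragePooling2D(name="pool")(x)

# New classification head added on top for the target task.
outputs = tf.keras.layers.Dense(10, activation="softmax", name="new_head")(x)
model = tf.keras.Model(inputs, outputs)

# Freeze the lower layers by name; everything else stays trainable.
layers_to_freeze = {"low_conv", "mid_conv"}
for layer in model.layers:
    if layer.name in layers_to_freeze:
        layer.trainable = False

frozen_names = [layer.name for layer in model.layers if not layer.trainable]
print("Frozen layers:", frozen_names)
print("Trainable weight tensors:", len(model.trainable_weights))
print("Frozen weight tensors:", len(model.non_trainable_weights))
```

Only `high_conv` and `new_head` receive gradient updates in this sketch. Note that `trainable` must be set before `model.compile` for the change to take effect during training.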

### In practice

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from skimage.color import gray2rgb

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

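# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# VGG16 expects 3-channel input, so replicate the grayscale channel
x_train_rgb = np.array([gray2rgb(img) for img in x_train])
x_test_rgb = np.array([gray2rgb(img) for img in x_test])

# Upsample from 28x28 to 32x32, the smallest input size VGG16 accepts
x_train_rgb_resized = tf.image.resize(x_train_rgb, size=(32, 32)).numpy()
x_test_rgb_resized = tf.image.resize(x_test_rgb, size=(32, 32)).numpy()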

y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

base_model = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))

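# Freeze the first 13 entries of base_model.layers (input layer through block4_conv2);
# the remaining layers stay trainable
for layer in base_model.layers[0:13]:
    layer.trainable = False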

model = Sequential([
    base_model,
    Flatten(),
    Dense(10, activation='softmax')
])

model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(
    x_train_rgb_resized, y_train,
    batch_size=32,
    epochs=1,
    validation_data=(x_test_rgb_resized, y_test)
)

 144/1875 [=>............................] - ETA: 1:06:59 - loss: 0.6032 - accuracy: 0.7834

KeyboardInterrupt: 

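As you can see, I froze the first 13 layers of the VGG16 base model (the input layer, nine convolutional layers, and three pooling layers) and left the rest trainable. Although I interrupted training after 144 of the 1875 batches in the first epoch, the running training accuracy on Fashion MNIST had already reached about 0.78, even though this dataset is significantly different from ImageNet, the dataset the VGG16 model was trained on.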

![vgg16](https://media.geeksforgeeks.org/wp-content/uploads/20200219152327/conv-layers-vgg16.jpg)
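
To check exactly which layers a slice such as `base_model.layers[0:13]` covers, one option is to print every layer with its index and `trainable` flag. The sketch below assumes `weights=None` so that it runs without downloading the ImageNet weights; the layer list itself does not depend on the weights.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Conv2D

# weights=None is an assumption made here to avoid the ImageNet download.
base_model = VGG16(weights=None, include_top=False, input_shape=(32, 32, 3))

# Same slice as in the cell above.
for layer in base_model.layers[0:13]:
    layer.trainable = False

# Print index, name, layer type, and whether the layer is frozen.
for index, layer in enumerate(base_model.layers):
    status = "trainable" if layer.trainable else "frozen"
    print(f"{index:2d}  {layer.name:<14}  {type(layer).__name__:<12}  {status}")

frozen_layers = [layer for layer in base_model.layers if not layer.trainable]
frozen_conv = [layer for layer in frozen_layers if isinstance(layer, Conv2D)]
print(f"{len(frozen_layers)} frozen layers, {len(frozen_conv)} of them convolutional")
```

Under these assumptions the slice ends at `block4_conv2`, so `block4_conv3` and all of block 5 remain trainable.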

In [2]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.applications import VGG16
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

x_train = x_train / 255.0
x_test = x_test / 255.0

y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

base_model = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))

# Freeze the entire VGG16 base so that only the new head below is trained
for layer in base_model.layers:
    layer.trainable = False

x = base_model.output
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(10, activation='softmax')(x)

model = tf.keras.Model(inputs=base_model.input, outputs=x)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(x_train, y_train, epochs=3, batch_size=32)

loss, accuracy = model.evaluate(x_test, y_test)
print(f'Test loss: {loss}')
print(f'Test accuracy: {accuracy}')

Epoch 1/3
Epoch 2/3
Epoch 3/3
Test loss: 1.182462453842163
Test accuracy: 0.579200029373169
