### Codio Activity 23.4: Fine-Tuning a Pre-trained Network

Beyond using a pre-trained network as a fixed feature extractor for a new dataset, you can adjust, or **fine-tune**, some of its weights as a last step to squeeze additional performance from the network. To do so, you will again use the `EfficientNetV2B0` network on the `cifar10` data from `keras`. This time you are encouraged to use the functional API syntax to construct your network.

For a second example, consult the [transfer learning and fine-tuning guide](https://keras.io/guides/transfer_learning/) in the `keras` documentation.

#### Index

- [Problem 1](#-Problem-1)
- [Problem 2](#-Problem-2)
- [Problem 3](#-Problem-3)

Run the code cell below to import the necessary libraries.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


from tensorflow.keras.applications.efficientnet_v2 import EfficientNetV2B0
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras import Model, Input



In [2]:
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Keep a stratified 30% sample of the training set (same fraction from every class)
sub_indices = (
    pd.DataFrame(y_train, columns=["labels"])
    .groupby("labels", group_keys=False)
    .apply(lambda x: x.sample(frac=0.3, random_state=123))
    .index
)
X_train = X_train[sub_indices]
y_train = y_train[sub_indices]

# One-hot encode the integer labels for use with categorical_crossentropy
Y_train = to_categorical(y_train)
Y_test = to_categorical(y_test)
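
The subsampling step in the cell above keeps the same fraction of every class. As a minimal sketch of that idea, the example below runs on made-up toy labels (the names `toy_labels`, `toy_indices`, and `class_counts` are illustrative assumptions, not part of the activity). It uses pandas' built-in `GroupBy.sample`, which may draw different rows than the `apply`-based version but preserves the class balance in the same way.

```python
import numpy as np
import pandas as pd

# Toy stand-in for y_train: 3 classes with 10 examples each (illustrative only)
toy_labels = np.repeat(np.arange(3), 10).reshape(-1, 1)

# Draw 30% of the rows from every class
toy_indices = (
    pd.DataFrame(toy_labels, columns=["labels"])
    .groupby("labels")
    .sample(frac=0.3, random_state=123)
    .index
)

# Count how many sampled rows fall in each class
class_counts = np.bincount(toy_labels[toy_indices].ravel())
print(class_counts)
```

Each class contributes three of its ten rows, so the sample stays balanced.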


### Problem 1

#### Training the Network to Convergence

In the code cell below, use the function `EfficientNetV2B0` with the appropriate input shape and the argument `include_top` equal to `False` to create the model you will be using for this activity. Assign your result to `base_model`.

Use the function `Input()` with argument `shape` equal to `(32, 32, 3)` and assign your result to the variable `inputs`.

Use `base_model` with argument equal to `inputs` and assign your result to the variable `x`.

Use the function `Flatten` to flatten `x`. The pseudocode to complete this step is given below:

```Python
x = Flatten()(...)
```

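Pass `x` through a `Dense` layer with 10 hidden nodes and `activation` equal to `relu`, assigning the result back to `x`. Then pass `x` through a second `Dense` layer with 10 nodes (one per class) and `activation` equal to `sigmoid`, and assign the result to the variable `output`. The pseudocode to complete this step is given below:

```Python
x = Dense(..., activation = ...)(...)
output = Dense(..., activation = ...)(...)
```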

Use the code `base_model.trainable = False` to freeze the pre-trained weights, so that only the new `Dense` layers are updated during training.

Use the function `Model()` with argument `inputs` and `output` to define your model. Assign the result to the variable `model`.

Compile `model` using `categorical_crossentropy` as your `loss` and `accuracy` as your `metrics`.

Use the `fit()` function on `model` to fit the training data `X_train` and `Y_train`. Set the argument `validation_data` equal to `(X_test, Y_test)` and the argument `epochs` equal to 1. Assign the resulting `History` object to the variable `bottom_model` below.

NOTE: This question is computationally expensive, so please be patient with the processing. It may take a few minutes based on your computing power. 
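
The pattern in the steps above (frozen base, new trainable head) can be sketched on a much smaller scale. The example below is an illustration under stated assumptions, not the activity's solution: `tiny_base` is a hypothetical stand-in for `EfficientNetV2B0` so that nothing has to be downloaded, the head is a single `softmax` layer, and the base is called with `training=False`, which the `keras` transfer learning guide recommends so that any batch normalization layers stay in inference mode.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv2D, Dense, GlobalAveragePooling2D

# Hypothetical stand-in for a pre-trained base (randomly initialized here)
tiny_base = tf.keras.Sequential(
    [
        Input(shape=(32, 32, 3)),
        Conv2D(8, 3, strides=2, activation="relu"),
        GlobalAveragePooling2D(),
    ],
    name="tiny_base",
)
tiny_base.trainable = False  # freeze every weight in the base

# Functional API: inputs -> frozen base -> new classification head
inputs = Input(shape=(32, 32, 3))
features = tiny_base(inputs, training=False)
outputs = Dense(10, activation="softmax")(features)
sketch_model = Model(inputs, outputs)

# Count parameters by hand to confirm that only the head is trainable
n_trainable = sum(int(np.prod(list(w.shape))) for w in sketch_model.trainable_weights)
n_frozen = sum(int(np.prod(list(w.shape))) for w in sketch_model.non_trainable_weights)
print(n_trainable, n_frozen)
```

Only the 90 parameters of the `Dense` head are trainable here; the 224 convolution parameters in the base stay fixed, mirroring the split between trainable and non-trainable parameters that `model.summary()` reports in this activity.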

In [5]:
base_model = ''
inputs = ''
x = ''
x = ''
x = ''
output = ''
model = ''
#be sure to compile

bottom_model = ''
    
### BEGIN SOLUTION
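# Pre-trained convolutional base without its original classification head
base_model = EfficientNetV2B0(input_shape = (32, 32, 3), include_top=False)
inputs = Input(shape = (32, 32, 3))
x = base_model(inputs)
x = Flatten()(x)
x = Dense(10, activation = 'relu')(x)
# The answer key uses sigmoid here; softmax is the more common choice
# for single-label, multi-class outputs with categorical_crossentropy
output = Dense(10, activation = 'sigmoid')(x)
# Freeze the base so that only the two new Dense layers are trained
base_model.trainable = False
model = Model(inputs, output)
# No optimizer is passed, so Keras falls back to its default
model.compile(loss = 'categorical_crossentropy', metrics = ['accuracy'])
bottom_model = model.fit(X_train, Y_train, validation_data=(X_test, Y_test), 
                    epochs = 1)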
### END SOLUTION

### ANSWER CHECK
model.summary()

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_5 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
 efficientnetv2-b0 (Function  (None, 1, 1, 1280)       5919312   
 al)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 1280)              0         
                                                                 
 dense_2 (Dense)             (None, 10)                12810     
                                                                 
 dense_3 (Dense)             (None, 10)                110       
                                                                 
Total params: 5,932,232
Trainable params: 12,920
Non-trainable params: 5,919,312
____________________________________________

### Problem 2

#### Setting to Trainable

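In the code cell below, use the code `base_model.trainable = True` to make the weights of `base_model` trainable again.

Then freeze every layer except the final five by setting `layer.trainable = False` on each layer in `base_model.layers[:-5]`, as demonstrated in the lectures, so that only the final five layers are fine-tuned.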
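
The slicing logic can be checked on a small model before applying it to `EfficientNetV2B0`. The sketch below is illustrative only: `toy_base` is a hypothetical eight-layer stand-in for the real base model, and `n_unfrozen` is a name introduced here for the number of layers left trainable.

```python
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense

# Hypothetical stand-in for a pre-trained base: eight small Dense layers
toy_base = tf.keras.Sequential(
    [Input(shape=(4,))] + [Dense(4, name=f"block_{i}") for i in range(8)],
    name="toy_base",
)

# Step 1: mark the whole base as trainable
toy_base.trainable = True

# Step 2: freeze everything except the final n_unfrozen layers
n_unfrozen = 5
for layer in toy_base.layers[:-n_unfrozen]:
    layer.trainable = False

# Inspect the result layer by layer
trainable_flags = [layer.trainable for layer in toy_base.layers]
for layer in toy_base.layers:
    print(f"{layer.name}: trainable={layer.trainable}")
```

The negative slice `[:-n_unfrozen]` selects every layer except the last five, so the first three layers of this toy model are frozen and the last five remain trainable.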

In [6]:
base_model.trainable #are the base model weights set to trainable?

False

In [7]:
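base_model.trainable = True
# Freeze everything except the final five layers
for layer in base_model.layers[:-5]:
    layer.trainable = False
# Inspect the last ten layers: expect five frozen, then five trainable
for layer in base_model.layers[-10:]:
    print(f'Layer is trainable: {layer.trainable}')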

Layer is trainable: False
Layer is trainable: False
Layer is trainable: False
Layer is trainable: False
Layer is trainable: False
Layer is trainable: True
Layer is trainable: True
Layer is trainable: True
Layer is trainable: True
Layer is trainable: True


### Problem 3

#### Refitting the network

In the code cell below, use the function `compile` on `model` again, using `categorical_crossentropy` as your `loss` and `accuracy` as your `metrics`. Recompiling is required for the changes to `trainable` made in Problem 2 to take effect.

Next, use the `fit()` function on `model` to fit the training data `X_train` and `Y_train`. Set the argument `validation_data` equal to `(X_test, Y_test)` and the argument `epochs` equal to 1. Assign the result to the variable `fine_tuned_history` below.
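
A common refinement when refitting, suggested in the `keras` transfer learning guide but not required by this activity, is to recompile with a small learning rate so that the unfrozen pre-trained weights change only gradually. The sketch below shows the unfreeze, recompile, and refit sequence on a tiny model with random data; `demo_model`, `X_demo`, `Y_demo`, and the learning rate of `1e-5` are all assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

# Random toy data standing in for the image features and one-hot labels
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(64, 8)).astype("float32")
Y_demo = tf.keras.utils.to_categorical(rng.integers(0, 3, size=64), num_classes=3)

# Tiny two-layer model: "hidden" plays the role of the pre-trained base
inputs = Input(shape=(8,))
hidden = Dense(16, activation="relu", name="hidden")(inputs)
outputs = Dense(3, activation="softmax", name="head")(hidden)
demo_model = Model(inputs, outputs)

# Stage 1: the base is frozen, so only the head's kernel and bias are trainable
demo_model.get_layer("hidden").trainable = False
n_trainable_before = len(demo_model.trainable_weights)

# Stage 2: unfreeze, then recompile with a low learning rate before refitting
demo_model.get_layer("hidden").trainable = True
demo_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
demo_history = demo_model.fit(
    X_demo, Y_demo, validation_data=(X_demo, Y_demo), epochs=1, verbose=0
)

# The History object stores one value per epoch for each tracked quantity
print(sorted(demo_history.history.keys()))
```

The returned `History` object has the same structure as `bottom_model` and `fine_tuned_history`, so the training and validation accuracy of the two stages can be compared through their `history` dictionaries.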
 

In [9]:
model.compile(loss = 'categorical_crossentropy', metrics = ['accuracy'])
fine_tuned_history = model.fit(X_train, Y_train, validation_data = (X_test, Y_test), epochs = 1)

fine_tuned_history.history['accuracy'][-1]

0.47226667404174805