## Fine-tuning Transfer Learning

In fine-tuning transfer learning, the pre-trained weights from another model are unfrozen and tweaked during training to better suit your own data.

Fine-tuning is a type of transfer learning where you:

1. Start with a pre-trained model (e.g., ResNet, BERT, GPT).

2. Adjust its parameters (weights) by continuing training on your smaller, task-specific dataset.

When fine-tuning, you can choose one of the following strategies:

1. Freeze some layers (e.g., early layers that detect generic features such as edges or word syntax) and train the rest.

2. Train the entire model end-to-end with a lower learning rate.

3. Update only the later layers (which capture more task-specific features, such as organs in X-rays or sentiment in reviews), as we already did in the previous notebook.
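The strategies above can be sketched on a toy model. This is a minimal sketch, not the model used in this notebook: `toy_base_model` is a hypothetical, randomly initialised stand-in for a pre-trained network (so nothing is downloaded), and the learning rate of `1e-4` is an illustrative assumption for "lower than the Adam default".

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained base model (random weights, no download).
# In practice this would be a model from tf.keras.applications.
base_model = tf.keras.Sequential(
    [
        tf.keras.layers.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(8, 3, activation="relu", name="early_conv"),
        tf.keras.layers.Conv2D(16, 3, activation="relu", name="middle_conv"),
        tf.keras.layers.Conv2D(32, 3, activation="relu", name="late_conv"),
    ],
    name="toy_base_model",
)

# Unfreeze the whole base model, then re-freeze everything except the last layer
base_model.trainable = True
for layer in base_model.layers[:-1]:
    layer.trainable = False

# Attach a small classification head with the Functional API
inputs = tf.keras.layers.Input(shape=(32, 32, 3))
x = base_model(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Recompile after changing `trainable`, using a lower learning rate (assumed value)
model.compile(loss="categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              metrics=["accuracy"])

# Show which base model layers will be updated during training
for layer in base_model.layers:
    print(layer.name, layer.trainable)
```

Only `late_conv` should be reported as trainable. Setting `base_model.trainable = True` without the re-freezing loop would correspond to training the entire model end-to-end.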



Let us first import TensorFlow and check its version:

In [2]:
import tensorflow as tf

print(f"TensorFlow version: {tf.__version__}")

TensorFlow version: 2.19.0


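Then check whether a GPU is available. If `nvidia-smi` is not found, as in the output below, no NVIDIA driver is installed and the notebook runs on the CPU: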

In [3]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found
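The check above fails with a shell error when `nvidia-smi` is missing. A minimal sketch of a more graceful check from Python, using only the standard library, might look like this (the printed messages are illustrative):

```python
import shutil
import subprocess

# nvidia-smi is only present on machines with NVIDIA drivers installed
nvidia_smi_path = shutil.which("nvidia-smi")

if nvidia_smi_path is None:
    print("nvidia-smi not found: most likely running on CPU.")
else:
    # -L lists the GPUs visible to the driver
    result = subprocess.run([nvidia_smi_path, "-L"], capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

From within TensorFlow, `tf.config.list_physical_devices("GPU")` gives a similar answer: an empty list means no GPU is visible to TensorFlow.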


We also need to download the `helper_functions.py` module:

In [4]:
import os

# GitHub raw URL of the module
url = "https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/refs/heads/main/extras/helper_functions.py"
filename = "helper_functions.py"

# Download only if not already present
if not os.path.exists(filename):
    print(f"Downloading {filename} from GitHub...")
    !wget -q -O {filename} {url}
    print(f"Downloaded {filename} to the current directory.")
else:
    print(f"{filename} already exists in the current directory.")

Downloading helper_functions.py from GitHub...
Downloaded helper_functions.py to the current directory.
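The `!wget` line only works in notebook environments where `wget` is installed. As a hedged alternative, the same download-if-missing logic can be sketched with the standard library alone; `download_if_missing` is a hypothetical helper name, not part of `helper_functions.py`.

```python
import os
import urllib.request


def download_if_missing(url: str, filename: str) -> bool:
    """Download `url` to `filename` unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(filename):
        print(f"{filename} already exists in the current directory.")
        return False
    print(f"Downloading {filename}...")
    urllib.request.urlretrieve(url, filename)
    print(f"Downloaded {filename} to the current directory.")
    return True
```

Calling `download_if_missing(url, filename)` with the `url` and `filename` variables defined in the cell above would replace the `if`/`else` block.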


In [5]:
# Import helper functions we're going to use
from helper_functions import create_tensorboard_callback, plot_loss_curves, unzip_data, walk_through_dir

Now download the data, as we did in the previous notebook.

In [6]:
import zipfile

# Download data
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip

# Unzip the file (the context manager closes the archive automatically)
with zipfile.ZipFile("10_food_classes_10_percent.zip", "r") as zip_ref:
    zip_ref.extractall()

# OR alternatively

# !wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
# unzip_data("10_food_classes_10_percent.zip")

--2025-09-18 13:37:41--  https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.126.207, 173.194.206.207, 74.125.132.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.126.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 168546183 (161M) [application/zip]
Saving to: ‘10_food_classes_10_percent.zip’


2025-09-18 13:37:42 (203 MB/s) - ‘10_food_classes_10_percent.zip’ saved [168546183/168546183]



The extracted data should have the following structure (only two of the ten classes are shown):

```
10_food_classes_10_percent/          <- parent directory
├── train                            <- training images
│   ├── pizza
│   │   │   1647351.jpg
│   │   │   1647352.jpg
│   │   │   ...
│   └── steak
│       │   1648001.jpg
│       │   1648050.jpg
│       │   ...
│
└── test                             <- testing images
    ├── pizza
    │   │   1001116.jpg
    │   │   1507019.jpg
    │   │   ...      
    └── steak
        │   100274.jpg
        │   1653815.jpg
        │   ...
```
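To confirm that the extracted folders match this layout, the files in each class directory can be counted. The following is a minimal sketch using only the standard library; `count_files_per_class` is a hypothetical helper written for illustration and is not the implementation of `walk_through_dir` from `helper_functions.py`.

```python
import os


def count_files_per_class(split_dir: str) -> dict:
    """Return {class_name: number_of_files} for one split directory (e.g. train/)."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_path = os.path.join(split_dir, class_name)
        if os.path.isdir(class_path):
            counts[class_name] = len(os.listdir(class_path))
    return counts


for split in ("train", "test"):
    split_dir = os.path.join("10_food_classes_10_percent", split)
    if os.path.isdir(split_dir):  # skip quietly if the data has not been downloaded
        print(split, count_files_per_class(split_dir))
```

Each split should list the same class names; a missing or misspelled class folder would show up here before it causes a confusing error during training.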

In [7]:
import tensorflow as tf

# Create training and test directories
train_dir = "10_food_classes_10_percent/train/"
test_dir = "10_food_classes_10_percent/test/"

# Create data inputs
IMG_SIZE = (224, 224) # define image size
train_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(directory=train_dir,
                                                                            image_size=IMG_SIZE,
                                                                            label_mode="categorical", # what type are the labels?
                                                                            batch_size=32) # batch_size is 32 by default, this is generally a good number
test_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(directory=test_dir,
                                                                           image_size=IMG_SIZE,
                                                                           label_mode="categorical")

Found 750 files belonging to 10 classes.
Found 2500 files belonging to 10 classes.


The call to `tf.keras.preprocessing.image_dataset_from_directory` above builds a `tf.data.Dataset` of images and labels directly from the folder structure, inferring the class names from the subdirectory names.

If you want to see an example batch of data, you could use the `take()` method.

In [8]:
# See an example batch of data
for images, labels in train_data_10_percent.take(1):
  print(images, labels)

tf.Tensor(
[[[[1.46127548e+02 7.61275482e+01 1.53010197e+01]
   [1.59525513e+02 9.15255127e+01 2.85255127e+01]
   [1.42290817e+02 7.18622437e+01 1.07346935e+01]
   ...
   [1.49862198e+02 5.90764580e+01 6.07645750e+00]
   [1.45806198e+02 5.48062057e+01 2.60251731e-01]
   [1.58872543e+02 6.88725433e+01 8.15818596e+00]]

  [[1.48530609e+02 7.91275558e+01 1.53367357e+01]
   [1.58903061e+02 9.17653122e+01 2.41887741e+01]
   [1.53000000e+02 8.35000000e+01 1.93571415e+01]
   ...
   [1.46515259e+02 5.55152588e+01 2.51526070e+00]
   [1.45637878e+02 5.47042122e+01 2.90869564e-01]
   [1.65974701e+02 7.59747009e+01 1.40665283e+01]]

  [[1.49030609e+02 7.98775482e+01 1.32397966e+01]
   [1.65367355e+02 9.80255127e+01 2.78979588e+01]
   [1.56336731e+02 8.54795914e+01 1.77653046e+01]
   ...
   [1.44474396e+02 5.26427460e+01 1.47441959e+00]
   [1.49785843e+02 5.84134064e+01 2.64293575e+00]
   [1.67699020e+02 7.51939545e+01 1.30510254e+01]]

  ...

  [[6.43878174e+01 5.68163452e+01 4.31735535e+01]
   [1
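Printing a whole batch produces a very long output. A lighter check is to print only the shapes and dtypes. The sketch below uses a synthetic dataset (`demo_data`, an assumption made so the snippet runs without the food images); with the real data you would iterate over `train_data_10_percent` instead.

```python
import tensorflow as tf

# Synthetic stand-in for train_data_10_percent (assumption): 64 random "images"
# with one-hot labels over 10 classes, batched like the real dataset.
images = tf.random.uniform(shape=(64, 224, 224, 3), maxval=255.0)
labels = tf.one_hot(tf.random.uniform(shape=(64,), maxval=10, dtype=tf.int32), depth=10)
demo_data = tf.data.Dataset.from_tensor_slices((images, labels)).batch(32)

# Inspect one batch without dumping every pixel value
for image_batch, label_batch in demo_data.take(1):
    print(f"Images: shape={image_batch.shape}, dtype={image_batch.dtype}")
    print(f"Labels: shape={label_batch.shape}, dtype={label_batch.dtype}")
```

For the real training data, with `batch_size=32`, `IMG_SIZE = (224, 224)` and `label_mode="categorical"` over 10 classes, the image batch should have shape `(32, 224, 224, 3)` and the label batch `(32, 10)`.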

Now we are ready to build our first model (the baseline).

We’ll use the `tf.keras.applications` module, which provides a collection of pre-trained computer vision models (trained on ImageNet), together with the Keras Functional API to build our model. The steps are listed below.

### Model 0: Building a transfer learning model using the Keras Functional API

To create the baseline, follow these steps:

1. Instantiate a pre-trained base model object by choosing a target model such as `EfficientNetV2B0` and setting the `include_top` parameter to `False`.
2. Set the base model's `trainable` attribute to `False` to freeze all of the weights in the pre-trained model.
3. Define an input layer for our model (the data shape the model expects).
4. Normalize the inputs if the model requires it (e.g., `ResNet50V2` needs rescaled inputs, whereas `EfficientNetV2B0` includes its own preprocessing).
5. Pass the inputs to the base model.
6. Pool the outputs of the base model into a shape compatible with the output activation layer (turn the base model's output tensors into the same shape as the label tensors).
7. Create an output activation layer via `tf.keras.layers.Dense()`.
8. Merge the input and output layers into a new model using `tf.keras.Model()`.
9. Compile the model, passing the loss function, optimizer, and metrics.
10. Fit the model for as many epochs as necessary.

In [12]:
# Create base model
base_model = tf.keras.applications.EfficientNetV2B0(include_top=False)

# Freeze base model layers
base_model.trainable = False

# Create the input layer
inputs = tf.keras.layers.Input(shape=(224, 224, 3), name="input_layer")

# If using ResNet50V2, rescale the inputs to speed up convergence
# (not needed for EfficientNetV2, which has rescaling built in)
# x = tf.keras.layers.Rescaling(1./255)(inputs)

# Pass the inputs to the base model
x = base_model(inputs)

print(f"Shape after base_model: {x.shape}")

# Average pool the outputs of the base model
x = tf.keras.layers.GlobalAveragePooling2D(name="global_average_pooling_layer")(x)
print(f"After GlobalAveragePooling2D(): {x.shape}")

# Create the activation layer
outputs = tf.keras.layers.Dense(10, activation="softmax", name="output_layer")(x)

# Merge inputs and outputs into a new model via Model()
model_0 = tf.keras.Model(inputs, outputs)

# Compile the model
model_0.compile(loss="categorical_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=["accuracy"])

# Fit the model
history_0 = model_0.fit(train_data_10_percent,
                        epochs=5,
                        steps_per_epoch=len(train_data_10_percent),
                        validation_data=test_data_10_percent,
                        validation_steps=int(0.15 * len(test_data_10_percent)),
                        callbacks=[create_tensorboard_callback(dir_name="training_logs",
                                                               experiment_name="model_0_baseline")])

Shape after base_model: (None, 7, 7, 1280)
After GlobalAveragePooling2D(): (None, 1280)
Saving TensorBoard log files to: training_logs/model_0_baseline/20250918-144718
Epoch 1/5
24/24 ━━━━━━━━━━━━━━━━━━━━ 89s 3s/step - accuracy: 0.2801 - loss: 2.1429 - val_accuracy: 0.7443 - val_loss: 1.2693
Epoch 2/5
24/24 ━━━━━━━━━━━━━━━━━━━━ 66s 3s/step - accuracy: 0.7381 - loss: 1.2030 - val_accuracy: 0.7983 - val_loss: 0.8711
Epoch 3/5
24/24 ━━━━━━━━━━━━━━━━━━━━ 65s 3s/step - accuracy: 0.8286 - loss: 0.8559 - val_accuracy: 0.8494 - val_loss: 0.7135
Epoch 4/5
24/24 ━━━━━━━━━━━━━━━━━━━━ 84s 3s/step - accuracy: 0.8618 - loss: 0.6578 - val_accuracy: 0.8352 - val_loss: 0.6222
Epoch 5/5
24/24 ━━━━━━━━━━━━━━━━━━━━ 81s 3s/step - accuracy: 0.8729 - loss: 0.5872 - val_accuracy: 0.8438 - val_loss: 0.5955


Here is what we did:

We took our custom dataset, fed it into a pre-trained model (`EfficientNetV2B0`), let it identify meaningful patterns, and then added our own output layer to match the number of classes we needed.
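The shape change printed during model construction, from `(None, 7, 7, 1280)` to `(None, 1280)`, comes from the global average pooling step: each channel's 7x7 grid of values is averaged into a single number. The following is a minimal NumPy sketch of that operation on a small random tensor (4 channels instead of 1280, chosen for readability); it illustrates the idea and is not the Keras layer's implementation.

```python
import numpy as np

# Hypothetical feature map laid out like the base model output:
# (batch, height, width, channels), with few channels for readability
rng = np.random.default_rng(42)
feature_map = rng.random((1, 7, 7, 4))

# Global average pooling: average over the two spatial axes (height, width)
pooled = feature_map.mean(axis=(1, 2))

print(f"Before pooling: {feature_map.shape}")
print(f"After pooling:  {pooled.shape}")
```

The pooled tensor keeps one value per channel, which is the flat feature vector the `Dense` output layer expects.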

Note that we built the model with the Keras Functional API rather than the Sequential API.
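The `steps_per_epoch` and `validation_steps` values passed to `fit()` can be checked by hand from the file counts reported when the data was loaded (750 training and 2,500 test images) and the batch size of 32. A short worked calculation, assuming the final partial batch is kept (the default behaviour):

```python
import math

BATCH_SIZE = 32
train_images, test_images = 750, 2500  # file counts reported when loading the data

train_batches = math.ceil(train_images / BATCH_SIZE)  # len(train_data_10_percent)
test_batches = math.ceil(test_images / BATCH_SIZE)    # len(test_data_10_percent)
validation_steps = int(0.15 * test_batches)

print(f"Training batches per epoch: {train_batches}")
print(f"Test batches available: {test_batches}")
print(f"Validation batches used per epoch: {validation_steps}")
print(f"Validation images used per epoch: {validation_steps * BATCH_SIZE}")
```

The 24 training batches match the `24/24` shown in the training log. Validation runs on only 11 of the 79 test batches per epoch, which keeps each epoch faster but means the reported `val_accuracy` is an estimate based on a subset of the test data.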