## Transfer Learning with MobileNetV2

Welcome to the assignment, where you'll be using *transfer learning* on a pre-trained CNN to build an Alpaca/Not Alpaca classifier!

A pre-trained model is a network that has already been trained on a large dataset and saved, so you can reuse it to build your own model cheaply and efficiently. The one you'll be using, MobileNetV2, was designed to be fast and computationally efficient. It has been pre-trained on ImageNet, a dataset containing over 14 million images; the pre-trained weights used here classify images into 1000 classes.

By the end of this assignment, you'll be able to:
- Create a dataset from a directory
- Preprocess and augment data using the Sequential API
- Adapt a pre-trained model to new data and train a classifier using the Functional API and MobileNetV2
- Fine-tune a classifier's final layers to improve accuracy


### 1 - Packages


In [1]:
import matplotlib.pyplot as plt
import json 
import numpy as np
import os
import tensorflow as tf
import tensorflow.keras.layers as tfl

from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.layers.experimental.preprocessing import RandomFlip, RandomRotation



### 1.1 - Create the Dataset and Split It into Training and Validation Sets

When training and evaluating deep learning models in Keras, generating a dataset from image files stored on disk is simple and fast. Call `image_dataset_from_directory()` to read from the directory and create both the training and validation datasets.
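`image_dataset_from_directory()` infers labels from the directory layout: each immediate subdirectory is treated as one class, and class names are sorted alphanumerically. The sketch below mimics only that label inference using the standard library. The folder names `alpaca` and `not_alpaca` and the helper `infer_class_names` are hypothetical, chosen for illustration; the real function also decodes, resizes, and batches the images.

```python
import os
import tempfile


def infer_class_names(directory):
    """Return the sorted names of the immediate subdirectories (one per class)."""
    return sorted(
        entry for entry in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, entry))
    )


# Build a throwaway layout with two hypothetical class folders.
with tempfile.TemporaryDirectory() as root:
    for class_name in ("not_alpaca", "alpaca"):
        os.makedirs(os.path.join(root, class_name))
    class_names = infer_class_names(root)

print(class_names)  # ['alpaca', 'not_alpaca']
```

With two subdirectories, the integer labels would be 0 and 1 in that sorted order, which is consistent with a "2 classes" message in the cell output.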

If you specify a validation split, you must also pass the `subset` argument: use `subset='training'` for the training set and `subset='validation'` for the validation set.

You'll also use the same `seed` in both calls. With `shuffle=True`, a matching seed means both calls see the same shuffled file order, so your training and validation sets don't overlap.
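Why the seed matters can be illustrated with a simplified, pure-Python model of a seeded split. This is a sketch of the idea only, not Keras's actual implementation: the helper `split_files` and the placeholder file names are assumptions made for the example.

```python
import random


def split_files(files, validation_split, subset, seed):
    """Shuffle with a fixed seed, then keep the head (training) or the tail (validation)."""
    shuffled = sorted(files)               # start from a deterministic order
    random.Random(seed).shuffle(shuffled)  # same seed -> same permutation
    num_val = int(validation_split * len(shuffled))
    if subset == "training":
        return shuffled[:-num_val]
    return shuffled[-num_val:]


files = [f"img_{i:03d}.jpg" for i in range(327)]  # placeholder file names

train = split_files(files, 0.2, "training", seed=42)
val = split_files(files, 0.2, "validation", seed=42)
val_other_seed = split_files(files, 0.2, "validation", seed=7)

# Matching seeds: both calls cut the same permutation, so the sets are disjoint.
same_seed_overlap = set(train) & set(val)
# Mismatched seeds: the permutations differ, so files can land in both sets.
diff_seed_overlap = set(train) & set(val_other_seed)

print(len(train), len(val), len(same_seed_overlap))  # 262 65 0
```

In this model, `diff_seed_overlap` will almost certainly be non-empty, which is the leakage that matching seeds are meant to prevent.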


In [None]:
os.chdir(os.path.join(os.getcwd(), 'Chapter04-Convolutional-Neural-Networks',
                       'DeepConvolutional_Models-CaseStudies', 
                       'W2A2'))

In [6]:
BATCH_SIZE = 32
IMG_SIZE = (160, 160)
directory = 'dataset/'
train_dataset = image_dataset_from_directory(directory,
                                             shuffle=True,
                                             validation_split=0.2,
                                             subset="training",
                                             seed=42,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)

validation_dataset = image_dataset_from_directory(directory,
                                                  shuffle=True,
                                                  validation_split=0.2,
                                                  subset="validation",
                                                  seed=42,
                                                  batch_size=BATCH_SIZE,
                                                  image_size=IMG_SIZE)

Found 327 files belonging to 2 classes.
Using 262 files for training.




Found 327 files belonging to 2 classes.
Using 65 files for validation.
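
The reported counts can be checked with a little arithmetic. The sketch below assumes the validation count is the split fraction times the file count, truncated to an integer, and that the final partial batch is kept rather than dropped; both assumptions are consistent with the numbers printed above.

```python
import math

total_files = 327
validation_split = 0.2
batch_size = 32

# 0.2 * 327 = 65.4, truncated to 65 validation files.
num_val = int(validation_split * total_files)
num_train = total_files - num_val

# The last batch of each dataset is smaller than batch_size.
train_batches = math.ceil(num_train / batch_size)
val_batches = math.ceil(num_val / batch_size)

print(num_train, num_val, train_batches, val_batches)  # 262 65 9 3
```

Under these assumptions, one pass over the training set takes 9 batches and one pass over the validation set takes 3.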
