In [1]:
import os
# necessary as keras 3 has some issues with hublayers
os.environ["TF_USE_LEGACY_KERAS"] = "1"
import tensorflow as tf
tf.config.list_physical_devices()

2024-09-12 00:03:26.111136: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-12 00:03:26.127658: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-12 00:03:26.132923: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-12 00:03:26.145365: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
I0000 00:00:1726099407.320504      34 cuda_executor.c

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

# Transfer learning with TensorFlow part 1: feature extraction

Transfer learning means taking a working model's existing architecture and learned patterns and applying them to our own problem.

There are two main benefits:

1. We can leverage an existing neural network architecture that is proven to work on problems similar to our own.
2. We can leverage a working neural network architecture that has already learned patterns on data similar to ours, then adapt those patterns to our own data.

## Downloading and becoming one with the data

In [2]:
# Get data (10% of the training images for 10 food classes)
import zipfile
import os

# Download the zip file (if it isn't already here) and unzip it
if "10_food_classes_10_percent.zip" not in os.listdir("./"):
    !wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
    with zipfile.ZipFile("10_food_classes_10_percent.zip") as zip_ref:
        zip_ref.extractall()
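
The `!wget` line only works inside a notebook. For a plain Python script, the same download-then-unzip step could be sketched with the standard library as follows. `fetch_and_unzip` is a hypothetical helper, and unlike the cell above it checks for the extracted folder separately, so an interrupted unzip is retried on the next run.

```python
import os
import urllib.request
import zipfile

def fetch_and_unzip(url, zip_path, extracted_dir):
    """Download zip_path from url if it is missing, then unzip it if extracted_dir is missing."""
    if not os.path.exists(zip_path):
        # Only hit the network when the archive is not already on disk
        urllib.request.urlretrieve(url, zip_path)
    if not os.path.isdir(extracted_dir):
        # Extract next to where the extracted folder is expected to appear
        with zipfile.ZipFile(zip_path) as zip_ref:
            zip_ref.extractall(os.path.dirname(extracted_dir) or ".")
    return extracted_dir

# Example call (commented out so nothing is downloaded just by running this block):
# fetch_and_unzip(
#     "https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip",
#     "10_food_classes_10_percent.zip",
#     "10_food_classes_10_percent",
# )
```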

In [3]:
# walk through 10% data directory and list number of files
for dirpath, dirnames, filenames in os.walk("10_food_classes_10_percent"):
    print(f"There are {len(dirnames)} directories and {len(filenames)} in {dirpath}")

There are 2 directories and 0 in 10_food_classes_10_percent
There are 10 directories and 0 in 10_food_classes_10_percent/train
There are 0 directories and 75 in 10_food_classes_10_percent/train/ramen
There are 0 directories and 75 in 10_food_classes_10_percent/train/grilled_salmon
There are 0 directories and 75 in 10_food_classes_10_percent/train/chicken_curry
There are 0 directories and 75 in 10_food_classes_10_percent/train/sushi
There are 0 directories and 75 in 10_food_classes_10_percent/train/pizza
There are 0 directories and 75 in 10_food_classes_10_percent/train/ice_cream
There are 0 directories and 75 in 10_food_classes_10_percent/train/steak
There are 0 directories and 75 in 10_food_classes_10_percent/train/hamburger
There are 0 directories and 75 in 10_food_classes_10_percent/train/fried_rice
There are 0 directories and 75 in 10_food_classes_10_percent/train/chicken_wings
There are 10 directories and 0 in 10_food_classes_10_percent/test
There are 0 directories and 250 in 10_f

## Creating data loaders (preparing the data)
We'll use the `ImageDataGenerator` class to load our images in batches.

In [4]:
# Setup data inputs 
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMAGE_SHAPE=(224,224)
BATCH_SIZE=32
EPOCHS = 5

train_dir = "10_food_classes_10_percent/train"
test_dir = "10_food_classes_10_percent/test"
train_datagen=ImageDataGenerator(rescale=1/255.)
test_datagen=ImageDataGenerator(rescale=1/255.)

print("Training Images:")
train_data_10_percent = train_datagen.flow_from_directory(train_dir,
                                                          target_size=IMAGE_SHAPE,
                                                          batch_size=BATCH_SIZE,
                                                          class_mode="categorical")
test_data = test_datagen.flow_from_directory(test_dir,
                                             target_size=IMAGE_SHAPE,
                                             batch_size=BATCH_SIZE,
                                             class_mode="categorical")

Training Images:
Found 750 images belonging to 10 classes.
Found 2500 images belonging to 10 classes.
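
The two counts above also determine how many batches each generator yields per epoch, which is the value that `len(train_data_10_percent)` and `len(test_data)` pass to `steps_per_epoch` and `validation_steps` later on. A minimal sketch of that arithmetic, assuming the last partial batch is kept (the helper name `batches_per_epoch` is made up for illustration):

```python
import math

BATCH_SIZE = 32  # same value as in the cell above

def batches_per_epoch(num_images, batch_size=BATCH_SIZE):
    """Number of batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(num_images / batch_size)

print(batches_per_epoch(750))   # training images -> 24
print(batches_per_epoch(2500))  # test images -> 79
```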


## Setting up callbacks (things to run whilst our model trains)
Callbacks are extra functionality you can add to your models to be performed during or after training. Some of the most popular callbacks:

* Tracking experiments with the `TensorBoard` callback
* Saving model checkpoints with the `ModelCheckpoint` callback
* Stopping a model from training (before it trains for too long and overfits) with the `EarlyStopping` callback
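
To make the early-stopping idea concrete, here is a plain-Python sketch of the patience rule: stop once the monitored validation loss has failed to improve for `patience` epochs in a row. It is an illustration only, not Keras's implementation; the function name `stop_epoch` and the loss values are made up.

```python
def stop_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Validation loss improves until epoch 2, then fails to beat 0.7 twice in a row
print(stop_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.9], patience=2))  # -> 4
```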

In [5]:
# Create a TensorBoard callback (wrapped in a function because each model needs a fresh one)
import datetime

def create_tensorboard_callback(dir_name, experiment_name):
    # Each run logs to its own timestamped subdirectory: dir_name/experiment_name/YYYYmmdd-HHMMSS
    log_dir = dir_name + "/" + experiment_name + "/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
    print(f"Saving TensorBoard log files to: {log_dir}")
    return tensorboard_callback
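
The timestamp is what keeps repeated runs of the same experiment from overwriting each other's logs. A small standalone sketch of just the path-building step, with a fixed example time passed in so the result is reproducible (`build_log_dir` and its `now` parameter are hypothetical, not part of the notebook):

```python
import datetime

def build_log_dir(dir_name, experiment_name, now=None):
    """Build dir_name/experiment_name/<timestamp>, using the current time by default."""
    if now is None:
        now = datetime.datetime.now()
    return "/".join([dir_name, experiment_name, now.strftime("%Y%m%d-%H%M%S")])

fixed_time = datetime.datetime(2024, 9, 12, 0, 3, 26)  # example value only
print(build_log_dir("tensorflow_hub", "resnet50V2", now=fixed_time))
# -> tensorflow_hub/resnet50V2/20240912-000326
```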

## Creating models using TensorFlow Hub

In the past we've used TensorFlow to create our own models layer by layer from scratch.

Now we're going to do a similar process, except the majority of our model's layers are going to come from TensorFlow Hub.

We can access pretrained models from https://www.tensorflow.org/hub

In [6]:
# Import dependencies
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers


In [7]:
# Let's make a create_model() function to create a model from a URL

def create_model(model_url, num_classes=10):
    """
    Takes a TensorFlow Hub URL and creates a Keras Sequential model with it.

    Args:
        model_url (str): A TensorFlow Hub feature extraction URL.
        num_classes (int): Number of output neurons in the output layer,
            should be equal to the number of target classes, default 10.

    Returns:
        An uncompiled Keras Sequential model with model_url as the feature extractor
        layer and a Dense output layer with num_classes output neurons.
    """

    # Download the pretrained model and save it as a Keras layer
    feature_extractor_layer = hub.KerasLayer(model_url,
                                             trainable=False,  # freeze the already learned patterns
                                             input_shape=IMAGE_SHAPE + (3,))  # (224, 224, 3)
    # Stack our own output layer on top of the frozen feature extractor
    model = tf.keras.Sequential([
        feature_extractor_layer,
        tf.keras.layers.Dense(num_classes, activation="softmax")
    ])
    return model

In [8]:
# Resnet 50 V2 feature vector
resnet_url = "https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4"

# Original: EfficientNetB0 feature vector (version 1)
efficientnet_url = "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1"

# # New: EfficientNetB0 feature vector (version 2)
# efficientnet_url = "https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_b0/feature_vector/2"


### Creating ResNet TensorFlow Hub Feature Extraction Model

In [9]:
# Create resnet model
resnet_model = create_model(resnet_url,10)
resnet_model.compile(loss="categorical_crossentropy",optimizer=tf.keras.optimizers.Adam(),metrics=["accuracy"])
resnet_model.summary()


AttributeError: module 'tensorflow_hub' has no attribute 'KerasLayer'
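
This `AttributeError` means the name `hub` is bound to a module that lacks `KerasLayer`. One possible cause (an assumption, not confirmed by the output above) is that Python imported something other than the installed `tensorflow_hub` package, for example a plain directory called `tensorflow_hub` in the working directory, which Python can treat as an empty namespace package when the real package is not installed. Note that the TensorBoard logs in this notebook are written to a directory of exactly that name. The sketch below uses only the standard library to show where a module name resolves from; `describe_module` is a hypothetical helper.

```python
import importlib.util

def describe_module(name):
    """Report where `name` would be imported from and whether it is a namespace package."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "namespace_package": False}
    # Namespace packages (plain directories without __init__.py) have no origin file
    is_namespace = spec.origin is None or spec.origin == "namespace"
    return {"found": True, "origin": spec.origin, "namespace_package": is_namespace}

print(describe_module("tensorflow_hub"))
```

If `namespace_package` comes back `True`, installing `tensorflow_hub` or choosing a different `dir_name` for the logs are two things worth trying.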

In [None]:
resnet_history = resnet_model.fit(train_data_10_percent,
                                  epochs=5,
                                  steps_per_epoch=len(train_data_10_percent),
                                  validation_data=test_data,
                                  validation_steps=len(test_data),
                                  callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub",
                                                                         experiment_name="resnet50V2")])

In [None]:
# Let's create a function to plot two curves (e.g. training vs. validation) on one figure
from matplotlib import pyplot as plt

def plot_graphs(title, plot1, plot2, label1, label2, xlabel, ylabel):
    epochs = range(1, len(plot1) + 1)  # number epochs from 1 instead of hard-coding the ticks
    plt.title(title)
    plt.plot(epochs, plot1, label=label1)
    plt.plot(epochs, plot2, label=label2)
    plt.xticks(epochs)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.legend()
    plt.show()

Wow!

That. Is. Incredible. Our transfer learning feature extraction model outperformed ALL of the previous models we built by hand (substantially), in a quicker training time AND with only 10% of the training examples.

In [None]:
plot_graphs(title="Training & Validation Accuracy",
            plot1=resnet_history.history["accuracy"],
            plot2=resnet_history.history["val_accuracy"],
            label1="Training Accuracy",
            label2="Validation Accuracy",
            xlabel="epochs",
            ylabel="accuracy")

In [None]:
plot_graphs(title="Training & Validation Loss",
            plot1=resnet_history.history["loss"],
            plot2=resnet_history.history["val_loss"],
            label1="Training Loss",
            label2="Validation Loss",
            xlabel="epochs",
            ylabel="loss")


### Creating and testing EfficientNetB0 Tensorflow Hub model

In [None]:
# Create EfficientNetB0 model
efficientnet_model = create_model(efficientnet_url,10)
efficientnet_model.compile(loss="categorical_crossentropy",optimizer=tf.keras.optimizers.Adam(),metrics=["accuracy"])
efficientnet_model.summary()


In [None]:
# details on efficient net - https://paperswithcode.com/method/efficientnet#:~:text=EfficientNet%20is%20a%20convolutional%20neural,resolution%20using%20a%20compound%20coefficient
efficientnet_history = efficientnet_model.fit(train_data_10_percent,
                                              epochs=10,
                                              steps_per_epoch=len(train_data_10_percent),
                                              validation_data=test_data,
                                              validation_steps=len(test_data),
                                              callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub",
                                                                                     experiment_name="efficientnetB0")])

In [None]:
plot_graphs(title="Training & Validation Accuracy",
            plot1=efficientnet_history.history["accuracy"],
            plot2=efficientnet_history.history["val_accuracy"],
            label1="Training Accuracy",
            label2="Validation Accuracy",
            xlabel="epochs",
            ylabel="accuracy")

In [None]:
plot_graphs(title="Training & Validation Loss",
            plot1=efficientnet_history.history["loss"],
            plot2=efficientnet_history.history["val_loss"],
            label1="Training Loss",
            label2="Validation Loss",
            xlabel="epochs",
            ylabel="loss")


In [None]:
efficientnet_model.summary()

In [None]:
resnet_model.summary()

In [11]:
# How many weight tensors does our EfficientNetB0 feature extractor layer have?
len(efficientnet_model.layers[0].weights)

NameError: name 'efficientnet_model' is not defined

## Different types of transfer learning

 * **"As is"** transfer learning - using an existing model with no changes whatsoever (e.g. using an ImageNet model on the 1,000 ImageNet classes)
 * **"Feature extraction"** transfer learning - using the pretrained patterns of an existing model (e.g. EfficientNetB0 trained on ImageNet) and adjusting the output layer for your own problem (e.g. as above, 1,000 classes -> 10 classes of food)
 * **"Fine-tuning"** transfer learning - using the pretrained patterns of an existing model and "fine-tuning" many or all of the underlying layers (including new output layers)
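
In the feature extraction setup used in this notebook, the pretrained layers are frozen (`trainable=False`), so the only weights that get trained are those of the new `Dense` output layer. A quick sketch of how small that is, assuming the usual feature vector sizes of 2048 for ResNet50V2 and 1280 for EfficientNetB0 (check `model.summary()` for the values in your own run; `dense_param_count` is a made-up helper name):

```python
def dense_param_count(num_features, num_classes):
    """Weights plus biases of a Dense layer mapping num_features -> num_classes."""
    return num_features * num_classes + num_classes

print(dense_param_count(2048, 10))  # ResNet50V2 output head -> 20490
print(dense_param_count(1280, 10))  # EfficientNetB0 output head -> 12810
```

Fine-tuning, by contrast, would also unfreeze some or all of the pretrained layers, so the number of trainable weights grows far beyond these figures.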

## Comparing our model results using TensorBoard

**Note:** When you upload things to TensorBoard.dev, your experiments are public.

**Update:** The note above no longer applies. With the latest TensorBoard, the dashboard is hosted locally.

In [None]:
# Launch TensorBoard locally on the logged experiments
!tensorboard --logdir ./tensorflow_hub/ 

2024-09-12 00:03:33.124284: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-12 00:03:33.140350: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-12 00:03:33.145231: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-12 00:03:33.156895: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
I0000 00:00:1726099414.197928     157 cuda_executor.c

In [None]:
# Checkout what tensorboard experiments you have - deprecated
# !tensorboard dev list

In [None]:
# delete experiment -deprecated
# !tensorboard dev delete --experiment_id 
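
Since the `tensorboard dev` commands are deprecated, one way to see which experiments exist is to look at the log directory itself. A minimal sketch, assuming TensorBoard event files keep their usual `events.out.tfevents` name prefix (`list_tensorboard_runs` is a hypothetical helper):

```python
import os

def list_tensorboard_runs(log_root):
    """Return the subdirectories of log_root that contain TensorBoard event files."""
    runs = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        if any(name.startswith("events.out.tfevents") for name in filenames):
            runs.append(os.path.relpath(dirpath, log_root))
    return sorted(runs)

# Prints an empty list if the directory does not exist or holds no event files
print(list_tensorboard_runs("tensorflow_hub"))
```

Deleting an experiment then amounts to removing its run directory from disk.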