# Transfer Learning in TensorFlow Part 1: Feature extraction

## Import the data

In [8]:
import os
import requests

In [6]:
import zipfile

# URL of the file to download
url = "https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip"

# Local filename
filename = "10_food_classes_10_percent.zip"

# Download the file (raise an error if the request failed)
response = requests.get(url)
response.raise_for_status()
with open(filename, 'wb') as file:
    file.write(response.content)

# Unzip the file (the context manager closes the archive automatically)
with zipfile.ZipFile(filename, 'r') as zip_ref:
    zip_ref.extractall()

# Remove the zip file
os.remove(filename)

In [9]:
# Walk through 10 percent data directory and list number of files
for dirpath, dirnames, filenames in os.walk("10_food_classes_10_percent"):
  print(f"There are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")

There are 2 directories and 0 images in '10_food_classes_10_percent'.
There are 10 directories and 0 images in '10_food_classes_10_percent/test'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/ice_cream'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/chicken_curry'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/steak'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/sushi'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/chicken_wings'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/grilled_salmon'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/hamburger'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/pizza'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/ramen'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test

> Each class in the train folder has only 75 images (versus 250 per class in the test folder) 🧐
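
To confirm the per-class image counts directly, a small helper can tally the files in each class folder. This is a minimal sketch using only the standard library; the function name `count_images_per_class` is hypothetical, and it assumes the `<split>/<class_name>/` layout shown in the directory walk.

```python
import os

def count_images_per_class(split_dir):
    """Return {class_name: number_of_files} for each subdirectory of split_dir."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if os.path.isdir(class_dir):
            # Count regular files only (ignore nested directories)
            counts[class_name] = sum(
                1 for name in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, name))
            )
    return counts

train_split = "10_food_classes_10_percent/train"
if os.path.isdir(train_split):  # only runs once the dataset has been downloaded
    for class_name, n in count_images_per_class(train_split).items():
        print(f"{class_name}: {n} images")
```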

## Creating data loaders (preparing the data)

In [10]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_dir = "10_food_classes_10_percent/train/"
test_dir = "10_food_classes_10_percent/test/"


# Get all pixel values between 0 & 1
train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)

# Load data in from directories and turn it into batches
train_data = train_datagen.flow_from_directory(directory=train_dir,
                                               batch_size=32,
                                               target_size=(224, 224),
                                               class_mode="categorical")
test_data = test_datagen.flow_from_directory(directory=test_dir,
                                             batch_size=32,
                                             target_size=(224, 224),
                                             class_mode="categorical")

Found 750 images belonging to 10 classes.
Found 2500 images belonging to 10 classes.
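
With `batch_size=32`, the number of batches each generator yields per epoch follows from the image counts reported above. Here is a quick worked check of that arithmetic. The helper name `batches_per_epoch` is hypothetical, and the ceiling rule assumes the final, smaller batch is kept rather than dropped.

```python
import math

def batches_per_epoch(num_images, batch_size=32):
    """Return (number of batches, size of the final batch)."""
    if num_images == 0:
        return 0, 0
    num_batches = math.ceil(num_images / batch_size)
    # The last batch holds whatever is left after the full batches
    last_batch = num_images - (num_batches - 1) * batch_size
    return num_batches, last_batch

print(batches_per_epoch(750))   # training set → (24, 14)
print(batches_per_epoch(2500))  # test set → (79, 4)
```

So one pass over the training data is 24 steps, and one pass over the test data is 79 steps.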


## Setting up callbacks (things to run whilst our model trains)

Callbacks are extra functionality you can add to your models to be performed during or after training. Some of the most popular callbacks:

<table style="font-size: 17px;">
  <tr>
    <th style="text-align: center;">Callback name</th>
    <th style="text-align: center;">Use case</th> 
    <th style="text-align: center;">Code</th>
  </tr>
  <tr>
    <td style="text-align: center; text-decoration: underline;">TensorBoard</td>
    <td style="text-align: center;">Log the performance of multiple models, then view and compare them visually in TensorBoard (a dashboard for inspecting neural network parameters). Helpful for comparing the results of different models on your data.</td> 
    <td style="text-align: center;">tf.keras.callbacks.TensorBoard()</td>
  </tr>
  <tr>
    <td style="text-align: center; text-decoration: underline;">Model checkpoint</td>
    <td style="text-align: center;">Save your model as it trains so you can stop training if needed and resume from where you left off. Helpful if training takes a long time and can't be done in one sitting.</td> 
    <td style="text-align: center;">tf.keras.callbacks.ModelCheckpoint()</td>
  </tr>
  <tr>
    <td style="text-align: center; text-decoration: underline;">Early stopping</td>
    <td style="text-align: center;">Leave your model training for an arbitrary amount of time and have it stop automatically when it ceases to improve. Helpful when you've got a large dataset and don't know how long training will take.</td> 
    <td style="text-align: center;">tf.keras.callbacks.EarlyStopping()</td>
  </tr>
</table>
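
As a rough illustration of the table above, the three callbacks can be constructed as follows. This is a sketch, not the setup used elsewhere in this notebook: the monitored metric, the `patience` value, and the file paths are assumptions chosen for the example.

```python
import tensorflow as tf

# Stop training once validation loss has not improved for 3 epochs in a row
# (monitor and patience are illustrative values)
early_stopping = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                                  patience=3,
                                                  restore_best_weights=True)

# Save only the best weights seen so far (the path is a hypothetical example)
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath="checkpoints/best.weights.h5",
                                                monitor="val_loss",
                                                save_best_only=True,
                                                save_weights_only=True)

# Write logs that TensorBoard can display (the directory is a hypothetical example)
tensorboard = tf.keras.callbacks.TensorBoard(log_dir="tensorboard_logs/example_run")

# Callbacks are passed to fit() as a list, for example:
# model.fit(train_data, epochs=5, validation_data=test_data, callbacks=callbacks)
callbacks = [early_stopping, checkpoint, tensorboard]
```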

In [12]:
# Create TensorBoard callback (functionized because we need to create a new one for each model)
import datetime
import tensorflow as tf

def create_tensorboard_callback(dir_name, experiment_name):
  log_dir = dir_name + "/" + experiment_name + "/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
  tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
  print(f"Saving TensorBoard log files to: {log_dir}")
  return tensorboard_callback
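
To see why a fresh callback is created for each model, it helps to look at the log directory the function builds: every call embeds the experiment name and a timestamp, so runs started at different times land in different directories. Below is a minimal stdlib-only sketch of just the path logic. `build_log_dir` is a hypothetical helper, and the directory and experiment names are made up for illustration.

```python
import datetime

def build_log_dir(dir_name, experiment_name, now=None):
    """Mirror the path format used by create_tensorboard_callback."""
    now = now or datetime.datetime.now()
    return dir_name + "/" + experiment_name + "/" + now.strftime("%Y%m%d-%H%M%S")

# Fixed timestamps make the example reproducible
run_1 = build_log_dir("tensorflow_hub", "model_a", datetime.datetime(2024, 1, 1, 9, 30, 0))
run_2 = build_log_dir("tensorflow_hub", "model_b", datetime.datetime(2024, 1, 1, 10, 0, 0))
print(run_1)  # tensorflow_hub/model_a/20240101-093000
print(run_2)  # tensorflow_hub/model_b/20240101-100000
```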

## Creating models using TensorFlow Hub

In the past we've used TensorFlow to create our own models layer by layer from scratch.

Now we're going to do a similar process, except the majority of our model's layers are going to come from TensorFlow Hub.

We can access pretrained models on: https://tfhub.dev/

Browsing the TensorFlow Hub page and filtering for image classification, we found the following feature vector model link: https://www.kaggle.com/models/tensorflow/efficientnet/frameworks/TensorFlow2/variations/b0-classification/versions/1