# Transfer learning

Transfer learning takes a model that has already been pre-trained on a large dataset and adapts it to a specific problem. It often gives better results with less data than a model trained from scratch on more data.

Ex: Using models trained on ImageNet

**Continuation of the CNN notebook**

## Data Exploration

In [2]:
# Get data (10% of labels)
import zipfile

# Download data
!curl -O https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip

# Unzip the downloaded file into data/ (the paths used in later cells expect this location)
with zipfile.ZipFile('10_food_classes_10_percent.zip', 'r') as zip_ref:
    zip_ref.extractall('data')


In [3]:
# How many images in each folder?
import os

# Walk through 10 percent data directory and list number of files
for dirpath, dirnames, filenames in os.walk('data/10_food_classes_10_percent'):
    print(f'There are {len(dirnames)} directories and {len(filenames)} images in \'{dirpath}\'.')

There are 2 directories and 0 images in 'data/10_food_classes_10_percent'.
There are 10 directories and 0 images in 'data/10_food_classes_10_percent\test'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\chicken_curry'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\chicken_wings'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\fried_rice'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\grilled_salmon'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\hamburger'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\ice_cream'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\pizza'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\ramen'.
There are 0 directories and 250 images in 'data/10_food_classes_10_percent\test\steak'.
There are 0 di

In [7]:
# Create data generators
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SHAPE = (224, 224)
BATCH_SIZE = 32
EPOCHS = 5

train_dir = 'data/10_food_classes_10_percent/train/'
test_dir = 'data/10_food_classes_10_percent/test/'

train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)

train_data = train_datagen.flow_from_directory(
    train_dir,
    target_size=IMG_SHAPE,
    batch_size=BATCH_SIZE,
    class_mode='categorical'
)

test_data = test_datagen.flow_from_directory(
    test_dir,
    target_size=IMG_SHAPE,
    batch_size=BATCH_SIZE,
    class_mode='categorical'
)

Found 750 images belonging to 10 classes.
Found 2500 images belonging to 10 classes.
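
The two counts above, together with `BATCH_SIZE = 32`, determine how many batches each generator yields per epoch. A minimal sketch of that arithmetic follows. The helper names `batches_per_epoch` and `last_batch_size` are hypothetical and used only for illustration. It assumes the final, partially filled batch is kept rather than dropped.

```python
import math

def batches_per_epoch(num_images, batch_size):
    # Number of batches needed to see every image once,
    # counting a final partial batch if the sizes don't divide evenly.
    return math.ceil(num_images / batch_size)

def last_batch_size(num_images, batch_size):
    # Size of the final batch (a full batch if the division is exact).
    return num_images % batch_size or batch_size

BATCH_SIZE = 32
print(batches_per_epoch(750, BATCH_SIZE), last_batch_size(750, BATCH_SIZE))    # 24 14
print(batches_per_epoch(2500, BATCH_SIZE), last_batch_size(2500, BATCH_SIZE))  # 79 4
```

So one epoch over the training data is 24 steps, and one pass over the test data is 79 steps.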


### Callbacks

Extra functionality added to a model that runs during or after training:
* EarlyStopping - stop training if a monitored metric hasn't improved for a set number of epochs
* TensorBoard - log training metrics so the training process can be monitored
* ModelCheckpoint - save the model (or its weights) at checkpoints during training
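
To make the EarlyStopping idea concrete, here is a plain-Python sketch of the stopping rule. It is a simplified illustration, not the Keras implementation. The function name `early_stopping_epoch` and the example loss values are made up. It assumes a metric where lower is better, such as validation loss.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop early, or None."""
    best = float('inf')
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Metric improved: remember it and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Illustrative (made-up) validation losses, one per epoch
losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.66, 0.64]
print(early_stopping_epoch(losses, patience=3))  # 6
print(early_stopping_epoch(losses, patience=5))  # None
```

With `patience=3` training stops at epoch 6, so the improvement at epoch 7 is never reached. A larger patience trades extra training time for a lower chance of stopping too soon.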

In [10]:
# Create tensorboard callback - each model needs its own callback
import datetime
import tensorflow as tf

def create_tensorboard_callback(dir_name, experiment_name):
    # Timestamped log directory so each run is stored separately
    log_dir = f'{dir_name}/{experiment_name}/{datetime.datetime.now().strftime("%Y%m%d-%H%M%S")}'
    tensorboard_callback = tf.keras.callbacks.TensorBoard(
        log_dir=log_dir
    )
    
    print(f'Saving TensorBoard log files to: {log_dir}')
    return tensorboard_callback