# Transfer Learning

Transfer learning is based on the idea that the features a network learns for one problem can be reused for a variety of other tasks. In the real world, this idea is very natural. When humans learn how to perform a new task, we seldom start from scratch. We carry over what we have learned over a lifetime. Sometimes this knowledge lets us learn new things quickly, often from a single example. Other times it actually hinders us. Babies, of course, learn differently because they lack the same store of prior knowledge.

Resources:

https://www.tensorflow.org/tutorials/images/transfer_learning

For an excellent resource on Transfer Learning models, peruse:

https://towardsdatascience.com/an-intuitive-guide-to-deep-network-architectures-65fdc477db41

# Import **tensorflow** Library

Import library and alias it:

In [None]:
import tensorflow as tf

# GPU Hardware Accelerator

To vastly speed up processing, we can use the GPU available from the Google Colab cloud service. Colab provides a free Tesla K80 GPU of about 12 GB. It’s very easy to enable the GPU in a Colab notebook:

1.	click **Runtime** in the top left menu
2.	click **Change runtime type** from the drop-down menu
3.	choose **GPU** from the Hardware accelerator drop-down menu
4.	click **SAVE**

Verify that GPU is active:

In [None]:
tf.__version__, tf.test.gpu_device_name()

If '/device:GPU:0' is displayed, the GPU is active. If an empty string ('') is displayed, the regular CPU is active.

# Beans Experiment

**Beans** is a TensorFlow dataset (TFDS) of bean plant images taken in the field using smartphone cameras. It consists of 3 classes (bean_rust, angular_leaf_spot, healthy). Angular Leaf Spot and Bean Rust are diseases that can befall bean plants, so a plant in this dataset is either healthy or afflicted with one of the two diseases. Data was annotated by experts from the National Crops Resources Research Institute (NaCRRI) in Uganda and collected by the Makerere AI research lab.

We train beans data with two pre-trained models. In this section, we load and explore the dataset.

Load **beans** as a TFDS:

In [None]:
import tensorflow_datasets as tfds

beans, beans_info = tfds.load(
    'beans', with_info=True, as_supervised=True,
    try_gcs=True)

## Explore the Dataset

Display the contents of the **info** object:

In [None]:
beans_info

Display splits:

In [None]:
beans

Simplify processing splits:

In [None]:
train = beans['train']
valid = beans['validation']
test = beans['test']

Get labels and number of classes:

In [None]:
class_labels = beans_info.features['label'].names
num_classes = beans_info.features['label'].num_classes
class_labels, num_classes

Check image sizes:

In [None]:
for img, lbl in train.take(10):
  print (img.shape)

Although images are of the same size, we resize them to 224 x 224 to reduce the computation needed per image. Because we load the pre-trained models without their top layers, they can accept this input size.

## Visualize

Visualize with **show_examples**:

In [None]:
fig = tfds.show_examples(train, beans_info)

## Reformat Images

Resize and process images:

In [None]:
def preprocess(image, label):
  resized_image = tf.image.resize(image, [224, 224])
  final_image = tf.keras.applications.xception.\
                preprocess_input(resized_image)
  return final_image, label

We resize images to 224 x 224 and run them through Xception's preprocessing input function since we are leveraging an Xception model.

## Build the Input Pipeline

Shuffle train data, then preprocess, batch, and prefetch the train, validation, and test data:

In [None]:
BATCH_SIZE = 32
shuffle = 250

train_ds = train.shuffle(shuffle).\
  map(preprocess).batch(BATCH_SIZE).prefetch(1)
valid_ds = valid.map(preprocess).batch(BATCH_SIZE).prefetch(1)
test_ds = test.map(preprocess).batch(BATCH_SIZE).prefetch(1)
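To give a feel for what the `shuffle(250)` call does, here is a minimal pure-Python sketch of a fixed-size shuffle buffer. It assumes the usual tf.data behavior: fill a buffer, emit a random buffered element, and refill the freed slot from the input. The function name `buffered_shuffle` is ours and is not part of TensorFlow.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in the order a fixed-size shuffle buffer would emit them."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)       # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]             # emit a random buffered element
        buffer[idx] = item            # refill the freed slot
    rng.shuffle(buffer)               # input exhausted: drain what is left
    yield from buffer


# A buffer of size 1 cannot reorder anything.
in_order = list(buffered_shuffle(range(10), buffer_size=1))
# A buffer of size 5 only mixes elements that are close together.
shuffled = list(buffered_shuffle(range(20), buffer_size=5))
print(in_order)
print(shuffled)
```

A buffer smaller than the dataset gives only a partial shuffle, since the first element emitted always comes from the first `buffer_size` inputs. A larger buffer mixes more thoroughly at the cost of memory.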

Inspect the train tensor:

In [None]:
train_ds

Visualize examples from the train set:

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 12))
for img, lbl in train_ds.take(1):
  for index in range(9):
    plt.subplot(3, 3, index + 1)
    plt.imshow(img[index] / 2 + 0.5)
    plt.title(class_labels[lbl[index]])
    plt.axis('off')
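The `img[index] / 2 + 0.5` expression undoes the scaling applied during preprocessing. Xception's `preprocess_input` scales pixel values from [0, 255] to [-1, 1], while `plt.imshow` expects floats in [0, 1]. A minimal sketch of both mappings on single pixel values (the helper names are ours, not part of Keras):

```python
def to_model_range(pixel):
    """Map a pixel value in [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0


def to_display_range(value):
    """Map a value in [-1, 1] back to [0, 1], as img / 2 + 0.5 does."""
    return value / 2 + 0.5


for pixel in (0, 127.5, 255):
    scaled = to_model_range(pixel)
    print(pixel, '->', scaled, '->', to_display_range(scaled))
```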

## Model Beans with the Xception Model

The Xception model was proposed by Francois Chollet in 2017. **Xception** is an extension of the Inception architecture that replaces the standard Inception modules with depthwise separable convolutions. Xception often outperforms VGGNet, ResNet, and Inception-v3 models. As a side note, Chollet is also the author of Keras.
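To see why depthwise separable convolutions are attractive, compare weight counts. A standard convolution learns one k x k filter per input/output channel pair. A separable one learns a k x k filter per input channel (depthwise) followed by a 1 x 1 convolution that mixes channels (pointwise). The layer sizes below are hypothetical and biases are ignored:

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution."""
    depthwise = kernel * kernel * c_in   # one spatial filter per input channel
    pointwise = c_in * c_out             # 1 x 1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 128 input channels, 256 output channels.
standard = standard_conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable, round(standard / separable, 1))
```

For this made-up layer, the separable version needs roughly one ninth of the weights.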

Resources:

https://maelfabien.github.io/deeplearning/xception/#

https://towardsdatascience.com/review-xception-with-depthwise-separable-convolution-better-than-inception-v3-image-dc967dd42568

https://medium.com/analytics-vidhya/image-recognition-using-pre-trained-xception-model-in-5-steps-96ac858f4206#:~:text=Xception%20Model%20is%20proposed%20by,modules%20with%20depthwise%20Separable%20Convolutions.

Clear previous models and generate a seed:

In [None]:
import numpy as np

tf.keras.backend.clear_session()
np.random.seed(0)
tf.random.set_seed(0)

### Create a Model

Create a base model from the pre-trained **Xception** model. Later, we add a global average pooling layer and a dense **softmax** output layer on top of it to create the final model:

In [None]:
Xception = tf.keras.applications.xception.Xception
xception_model = Xception(
    weights='imagenet', include_top=False)

Load an Xception model pre-trained on ImageNet and exclude the top layer of the network by setting **include_top=False**, which excludes the global average pooling layer and the dense output layer. 

View all the layers:

In [None]:
xception_model.summary()

View layer objects:

In [None]:
xception_model.layers

Get number of layers:

In [None]:
len(xception_model.layers)

Display model as a readable diagram:

In [None]:
tf.keras.utils.plot_model(
    xception_model,
    show_shapes=True,
    show_layer_names=True)

Import libraries:

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout,\
                                    GlobalAveragePooling2D

Build the final model:

In [None]:
x_model = tf.keras.Sequential([
  xception_model,
  GlobalAveragePooling2D(),
  Dropout(0.5),
  Dense(num_classes, activation='softmax')
])

Since we excluded the top layer of the pre-trained network that has a global average pooling layer and a dense output layer, we must add our own global average pooling layer and a dense output layer with three classes and *softmax* activation.
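To see what these added layers compute, here is a small pure-Python sketch of global average pooling followed by a dense softmax layer. The feature maps, weights, and helper names are made up for illustration. In the real model, the feature maps come from Xception's output and the weights are learned. Dropout is left out for brevity.

```python
import math


def global_average_pool(feature_maps):
    """Reduce each 2-D feature map (one per channel) to its mean value."""
    pooled = []
    for grid in feature_maps:
        values = [v for row in grid for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def dense(inputs, weights, biases):
    """One output per class: weighted sum of the inputs plus a bias."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)                       # for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical 2 x 2 feature maps (channels) and three classes.
feature_maps = [[[1.0, 3.0], [5.0, 7.0]],
                [[0.0, 2.0], [2.0, 0.0]]]
weights = [[0.5, -1.0], [0.1, 0.2], [-0.3, 0.4]]   # one row per class
biases = [0.0, 0.0, 0.0]

pooled = global_average_pool(feature_maps)
probs = softmax(dense(pooled, weights, biases))
print(pooled, probs)
```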

Alternatively, we can build the final model in this form:

avg = tf.keras.layers.GlobalAveragePooling2D()(xception_model.output)\
output = tf.keras.layers.Dense(num_classes, activation='softmax')(avg)\
model = tf.keras.models.Model(inputs=xception_model.input, outputs=output)

### Model the Data

Freeze the weights of the pre-trained layers, compile, and train:

In [None]:
for layer in xception_model.layers:
  layer.trainable = False

optimizer = tf.keras.optimizers.SGD(
    learning_rate=0.2, momentum=0.9, decay=0.01)

x_model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=optimizer,
    metrics=['accuracy'])

history = x_model.fit(
    train_ds, validation_data=valid_ds, epochs=10)

### Visualize

Create a function to visualize:

In [None]:
def visualize(span):
  acc = history.history['accuracy']
  val_acc = history.history['val_accuracy']
  loss = history.history['loss']
  val_loss = history.history['val_loss']
  epochs_range = span
  plt.figure(figsize=(8, 8))
  plt.subplot(1, 2, 1)
  plt.plot(epochs_range, acc, label='Training Accuracy')
  plt.plot(epochs_range, val_acc, label='Validation Accuracy')
  plt.legend(loc='lower right')
  plt.title('Training and Validation Accuracy')
  plt.subplot(1, 2, 2)
  plt.plot(epochs_range, loss, label='Training Loss')
  plt.plot(epochs_range, val_loss, label='Validation Loss')
  plt.legend(loc='upper right')
  plt.title('Training and Validation Loss')
  plt.show()

Invoke:

In [None]:
visualize(range(10))

We set a very aggressive learning rate. But loss is not divergent!
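One thing that may help keep the loss from diverging is the `decay=0.01` argument, which shrinks the learning rate as training proceeds. The sketch below assumes the time-based rule of the legacy Keras optimizers, where the rate is divided by `1 + decay * iterations` and an iteration is one batch, not one epoch. The function name is ours, and newer Keras releases may handle `decay` differently or reject it.

```python
def decayed_learning_rate(initial_lr, decay, iteration):
    """Time-based decay (assumed legacy Keras rule)."""
    return initial_lr / (1.0 + decay * iteration)


# Learning rate after 0, 100, and 400 batches with the settings used above.
rates = [decayed_learning_rate(0.2, 0.01, step) for step in (0, 100, 400)]
print(rates)
```

Under this assumption, the rate halves after 100 batches and falls to a fifth of its starting value after 400.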

### Model Trained Data with Unfrozen Layers

After we trained the model for a few epochs, validation accuracy is pretty good, but it doesn't get better. This means that the top layers are pretty well trained. Continue training with all the layers unfrozen. We use a **much lower learning rate** to avoid damaging the pre-trained weights.

In [None]:
for layer in xception_model.layers:
  layer.trainable = True

optimizer = tf.keras.optimizers.SGD(
    learning_rate=0.01, momentum=0.9,
    nesterov=True, decay=0.001)

x_model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=optimizer, metrics=['accuracy'])

history = x_model.fit(
    train_ds, validation_data=valid_ds, epochs=10)

### Visualize

Visualize performance:

In [None]:
visualize(range(10))

Definitely an improvement. For real-world data, set learning rates much lower when fine-tuning so that each update changes the pre-trained weights only slightly. High learning rates shorten training time, but the large updates can overwrite what the network has already learned. A good starting point for the learning rate might be in the vicinity of 0.0001!

## Model Beans with the Inception Model

**Inception-v3** is a pre-trained convolutional neural network model that is 48 layers deep. It is a version of the network already trained on more than a million images from the ImageNet database. The pre-trained network can classify images into 1000 object categories such as keyboard, mouse, pencil, and many animals.

Resources:

https://www.tensorflow.org/api_docs/python/tf/keras/applications/InceptionV3

https://medium.com/analytics-vidhya/transfer-learning-using-inception-v3-for-image-classification-86700411251b

https://towardsdatascience.com/classify-any-object-using-pre-trained-cnn-model-77437d61e05f#:~:text=Inception%2Dv3%20is%20a%20pre,images%20from%20the%20ImageNet%20database.&text=This%20pre%2Dtrained%20network%20can,%2C%20pencil%2C%20and%20many%20animals.

### Build a Base Model

Create the base model:

In [None]:
inception_v3 = tf.keras.applications.InceptionV3
inception_model = inception_v3(
    include_top=False, weights='imagenet',
    input_shape=(224, 224, 3))

Clear models and seed:

In [None]:
tf.keras.backend.clear_session()
np.random.seed(0)
tf.random.set_seed(0)

Create the final model:

In [None]:
i_model = tf.keras.Sequential([
  inception_model,
  GlobalAveragePooling2D(),
  Dropout(0.5),
  Dense(num_classes, activation='softmax')
])

Leave out the last fully connected layer because it is specific to the ImageNet competition. We can use the current shape since include_top is False. Otherwise, the input shape must be (299, 299, 3).

Explore base model layers:

In [None]:
tf.keras.utils.plot_model(
    inception_model,
    show_shapes=True,
    show_layer_names=True)

### Model the Data

Freeze the weights of the pretrained layers, compile, and train:

In [None]:
for layer in inception_model.layers:
  layer.trainable = False

optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.1)

i_model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=optimizer, metrics=['accuracy'])

history = i_model.fit(
    train_ds, validation_data=valid_ds, epochs=10)

Notice that we used the **RMSprop** optimizer.
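To make the role of RMSprop concrete, here is a minimal pure-Python sketch of its textbook update rule on the toy function `f(w) = w**2`. RMSprop divides each gradient by a running root-mean-square of recent gradients, so the step size adapts to the gradient scale. The function name and toy problem are illustrative assumptions. The Keras implementation has extra options, such as momentum and centering, that are not shown.

```python
import math


def rmsprop_minimize(grad_fn, w, learning_rate=0.1, rho=0.9,
                     epsilon=1e-7, steps=200):
    """Minimize a 1-D function with the basic RMSprop update."""
    avg_sq = 0.0                                   # running mean of squared gradients
    for _ in range(steps):
        g = grad_fn(w)
        avg_sq = rho * avg_sq + (1.0 - rho) * g * g
        w -= learning_rate * g / (math.sqrt(avg_sq) + epsilon)
    return w


# Gradient of f(w) = w**2 is 2 * w; the minimum is at w = 0.
final_w = rmsprop_minimize(lambda w: 2.0 * w, w=5.0)
print(final_w)
```

Because the gradient is normalized, steps stay on the order of the learning rate. With a large rate the weight tends to hover around the minimum instead of settling, which is one way an erratic loss curve can arise.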

### Visualize Performance

Visualize performance:

In [None]:
visualize(range(10))

We set an aggressive learning rate. Although loss is erratic, it still is not divergent. As an experiment, set a much lower learning rate. Training will take longer, but loss should be less erratic.

### Model Trained Data with Unfrozen Layers

Let's see if we can squeeze out more performance by unfreezing all layers:

In [None]:
for layer in inception_model.layers:
  layer.trainable = True

optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.0001)

i_model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=optimizer, metrics=['accuracy'])

history = i_model.fit(
    train_ds, validation_data=valid_ds, epochs=10)

We use a **much lower learning rate** to avoid damaging the pretrained weights.

### Visualize Performance

Visualize:

In [None]:
visualize(range(10))

We are able to increase performance and somewhat smooth out loss divergence.

## Generalize on Unseen Data

Generalize on the unseen test dataset for the Xception model:

In [None]:
x_model.evaluate(test_ds)

Generalize on the unseen test dataset for the Inception model:

In [None]:
i_model.evaluate(test_ds)

# Stanford Dogs Experiment

The **Stanford Dogs** dataset contains images of 120 breeds of dogs from around the world. It has been built using images and annotation from ImageNet for the task of fine-grained image categorization. The dataset contains 20,580 images split into 12,000 training images and 8,580 testing images. Class labels and bounding box annotations are provided for all 12,000 images.

## Model Stanford Dogs with the MobileNet Model

MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases. They can be built upon for classification, detection, embedding, and segmentation, much like other popular large-scale models such as Inception.

The **MobileNet V2** model was developed at Google. It is pre-trained on the ImageNet dataset, which is a large dataset consisting of 1.4 million images and 1,000 classes. **ImageNet** is a research training dataset with a wide variety of categories like jackfruit and syringe. Its base knowledge helps us classify dogs from our specific dataset.

Resources:

https://www.tensorflow.org/api_docs/python/tf/keras/applications/MobileNetV2

https://www.tensorflow.org/tutorials/images/transfer_learning

Load the train set:

In [None]:
train_pups, dogs_info = tfds.load(
    'stanford_dogs', with_info=True,
    as_supervised=True, try_gcs=True,
    split='train')

Get metadata:

In [None]:
dogs_info

Now that we know the splits, create the validation and test sets by dividing the test split in half:

In [None]:
(validation_pups, test_pups) = tfds.load(
    'stanford_dogs',
    split=['test[:50%]', 'test[50%:]'],
    as_supervised=True, try_gcs=True)

## Visualize Examples

Create a function to get the named label:

In [None]:
get_name = dogs_info.features['label'].int2str

By trial and error, we found that taking 464 examples is enough to see every breed at least once. Convert the integer labels to named ones and count the unique names:

In [None]:
lbls = []
for image, label in train_pups.take(464):
  lbls.append(get_name(label))
set_lbl = set(lbls)
len(set_lbl)

We have all of the labels in a set!

Grab some images and labels for visualization:

In [None]:
img, lbl = [], []
for image, label in train_pups.take(9):
  img.append(image)
  lbl.append(get_name(label)[10:])
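The `[10:]` slice drops a fixed-width prefix from each label name. The sketch below assumes label names consist of a 9-character WordNet ID, a hyphen, and the breed name. The example strings are illustrative assumptions and are not read from the dataset. The helper `breed_name` is ours. It splits on the first hyphen instead of counting characters, which may be more robust if the prefix width ever differs.

```python
def breed_name(label_name):
    """Return the text after the first hyphen, or the whole name if none."""
    _prefix, sep, breed = label_name.partition('-')
    return breed if sep else label_name


# Assumed label format: '<WordNet ID>-<breed>'.
examples = ['n02085620-chihuahua', 'n02098105-soft-coated_wheaten_terrier']
for name in examples:
    print(name[10:], '|', breed_name(name))
```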

Display the first one:

In [None]:
lbl[0]

Display some examples:

In [None]:
plt.figure(figsize=(12, 12))
for index in range(9):
  plt.subplot(3, 3, index + 1)
  plt.imshow(img[index])
  plt.title(lbl[index])
  plt.axis('off')

Display some examples with **show_examples**:

In [None]:
fig = tfds.show_examples(train_pups, dogs_info)

## Check Image Size

Display some example shapes:

In [None]:
for img, lbl in train_pups.take(10):
  print (img.shape)

Since images are of varying sizes, we must resize.

## Explore Metadata

Get number of classes:

In [None]:
class_labels = dogs_info.features['label']
num_breeds = dogs_info.features['label'].num_classes
class_labels, num_breeds

## Prepare Data and Build the Input Pipeline

Create variables:

In [None]:
IMG_LEN = 224
IMG_SHAPE = (IMG_LEN,IMG_LEN,3)

Create a function to preprocess:

In [None]:
def preprocess(img, lbl):
  resized_image = tf.image.resize(img, [IMG_LEN, IMG_LEN])
  final_image = tf.keras.applications.mobilenet_v2.preprocess_input(
      resized_image)
  label = tf.one_hot(lbl, num_breeds)
  return final_image, label

The function resizes and preprocesses images. It also one-hot encodes labels.
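One-hot labels pair with the `categorical_crossentropy` loss, while the integer labels in the Beans experiment pair with `sparse_categorical_crossentropy`. The plain-Python sketch below, using hypothetical predicted probabilities, suggests that the two losses give the same value for a single example and differ only in the label format they expect:

```python
import math


def one_hot(index, depth):
    """Encode an integer label as a one-hot list."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def categorical_crossentropy(one_hot_label, probs):
    """Cross-entropy for a one-hot label."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


def sparse_categorical_crossentropy(int_label, probs):
    """Cross-entropy for an integer label."""
    return -math.log(probs[int_label])


probs = [0.7, 0.2, 0.1]   # hypothetical softmax output for 3 classes
label = 1                 # true class as an integer

loss_sparse = sparse_categorical_crossentropy(label, probs)
loss_one_hot = categorical_crossentropy(one_hot(label, 3), probs)
print(loss_sparse, loss_one_hot)
```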

Create a function to build the pipeline:

In [None]:
def prepare(dataset, batch_size=None, shuffle_size=None):
  ds = dataset.map(preprocess, num_parallel_calls=4)
  # shuffle individual examples only when a shuffle size is given
  if shuffle_size:
    ds = ds.shuffle(shuffle_size)
  if batch_size:
    ds = ds.batch(batch_size)
  ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
  return ds

Build the pipeline with batch size of 32 and shuffle size of 1,000:

In [None]:
BATCH_SIZE = 32
SHUFFLE_SIZE = 1000

train_dogs = prepare(train_pups, batch_size=BATCH_SIZE,
                     shuffle_size=SHUFFLE_SIZE)
validation_dogs = prepare(validation_pups, batch_size=32)
test_dogs = prepare(test_pups, batch_size=32)

Inspect training tensors:

In [None]:
train_dogs, validation_dogs, test_dogs

## Model Data

For a nice tutorial on transfer learning with Stanford Dogs, peruse:

https://www.angioi.com/dog-breed-classification/

Create the base model:

In [None]:
mobile_v2 = tf.keras.applications.MobileNetV2
mobile_model = mobile_v2(
    input_shape=IMG_SHAPE, include_top=False,
    weights='imagenet')

Explore the base model layers:

In [None]:
tf.keras.utils.plot_model(
    mobile_model,
    show_shapes=True,
    show_layer_names=True)

## Create and Train the Model

Clear previous models and generate a seed for reproducibility of results:

In [None]:
tf.keras.backend.clear_session()
np.random.seed(0)
tf.random.set_seed(0)

Verify number of classes:

In [None]:
num_breeds

Create a simple feedforward network with the pre-trained model as its first layer:

In [None]:
mobile_model.trainable = False

sd_model = tf.keras.Sequential([
  mobile_model,
  GlobalAveragePooling2D(),
  Dropout(0.5),
  Dense(num_breeds, activation='softmax')
])

Notice that we freeze the pre-trained base model, so only the new top layers are trained.

Since we are training on **many** more images and **many** more classes, training time is much longer. So be patient. Don't be concerned if your runtime crashes. It is not a coding error. More RAM is needed. We use Colab Pro and don't seem to have an issue.

Compile and train:

In [None]:
EPOCHS = 5

sd_model.compile(
    optimizer=tf.keras.optimizers.Adamax(learning_rate=0.005),
    loss='categorical_crossentropy',
    metrics=['accuracy', 'top_k_categorical_accuracy'])

history = sd_model.fit(
    train_dogs, epochs=EPOCHS, validation_data=validation_dogs)

We set a faster learning rate and got pretty good results. By setting a low learning rate, training progresses slowly as we are making very tiny updates to the weights in the network. However, if learning rate is set too high, it can cause undesirable divergent behavior in the loss function.

## Visualize Performance

Visualize:

In [None]:
visualize(range(EPOCHS))

Not bad! Accuracy is over 80% in our experiment. If we look at the top-5 predictions, the chance of guessing the correct breed jumps to over 97%. Loss is somewhat erratic. Setting a less aggressive learning rate should smooth it out.

Visualize Top 5:

In [None]:
acc = history.history['top_k_categorical_accuracy']
val_acc = history.history['val_top_k_categorical_accuracy']

epochs_range = range(EPOCHS)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Top 5 Training and Validation Accuracy')
plt.grid()  # toggles the grid; recent Matplotlib no longer accepts `b`

Not bad! We get over 80% accuracy for breed detection. If we look at the top-5 predictions, the chance of guessing the correct breed jumps to over 97%! Since the dataset is large and complex, we show that using pre-trained models can have real-world use cases. Also it is amazing that we can build such a powerful model in just a few lines of code!
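The `top_k_categorical_accuracy` metric counts a prediction as correct when the true class is among the k highest-scoring classes (k defaults to 5 in Keras). A minimal pure-Python sketch with made-up scores for four classes and our own helper name:

```python
def top_k_accuracy(true_labels, score_rows, k=5):
    """Fraction of examples whose true label is among the k top scores."""
    hits = 0
    for label, scores in zip(true_labels, score_rows):
        ranked = sorted(range(len(scores)),
                        key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical model scores for three images and four classes.
scores = [[0.10, 0.60, 0.20, 0.10],
          [0.50, 0.30, 0.15, 0.05],
          [0.40, 0.30, 0.20, 0.10]]
labels = [1, 1, 3]

top1 = top_k_accuracy(labels, scores, k=1)
top2 = top_k_accuracy(labels, scores, k=2)
print(top1, top2)
```

The second image is a miss for top-1 but a hit for top-2, which is why top-k accuracy is never lower than plain accuracy.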

## Model Trained Data with Unfrozen Layers

After we trained the model for several epochs, validation accuracy is pretty good, but its trajectory is not increasing. So the top layers are pretty well trained. Unfreeze the pre-trained layers and continue training. We use a **much lower learning rate** to avoid damaging the pre-trained weights.

In [None]:
mobile_model.trainable = True

sd_model.compile(
    optimizer=tf.keras.optimizers.Adamax(learning_rate=0.00001),
    loss='categorical_crossentropy',
    metrics=['accuracy', 'top_k_categorical_accuracy'])

history = sd_model.fit(
    train_dogs, epochs=3,
    validation_data=validation_dogs)

Unfreezing the layers doesn't seem to improve performance, but we only ran for three epochs.

## Visualize Performance

Visualize:

In [None]:
visualize(range(3))

Visualize Top 5:

In [None]:
acc = history.history['top_k_categorical_accuracy']
val_acc = history.history['val_top_k_categorical_accuracy']

epochs_range = range(3)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Top 5 Training and Validation Accuracy')
plt.grid()  # toggles the grid; recent Matplotlib no longer accepts `b`

## Generalize

Evaluate on unseen test data:

In [None]:
sd_model.evaluate(test_dogs)

# Flowers Experiment

Load flowers as TFRecords and use a pre-trained model for learning.

## Read Flowers as TFRecords

Read TFRecord files from GCS:

In [None]:
piece1 = 'gs://flowers-public/'
piece2 = 'tfrecords-jpeg-192x192-2/*.tfrec'
TFR_GCS_PATTERN = piece1 + piece2
tfr_filenames = tf.io.gfile.glob(TFR_GCS_PATTERN)

## Create Data Splits

Set parameters:

In [None]:
IMAGE_SIZE = [192, 192]
AUTO = tf.data.experimental.AUTOTUNE
BATCH_SIZE = 64
SHUFFLE_SIZE = 100
EPOCHS = 5
VALIDATION_SPLIT = 0.19
CLASSES = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

Create splits:

In [None]:
split = int(len(tfr_filenames) * VALIDATION_SPLIT)
training_filenames = tfr_filenames[split:]
validation_filenames = tfr_filenames[:split]
print ('Splitting {} files into {} training files and {} '
       'validation files'.format(
           len(tfr_filenames), len(training_filenames),
           len(validation_filenames)), end=' ')
print ('with a batch size of {}.'.format(BATCH_SIZE))

validation_steps = int(3670 // len(tfr_filenames) *\
                       len(validation_filenames)) // BATCH_SIZE
steps_per_epoch = int(3670 // len(tfr_filenames) *\
                      len(training_filenames)) // BATCH_SIZE
print ('There are {} batches per training epoch and {} '\
       'batches per validation run.'\
       .format(steps_per_epoch, validation_steps))
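The step counts matter because the training dataset repeats indefinitely, so Keras needs to be told where an epoch ends. The arithmetic can be checked in isolation. The sketch below reuses 3,670 images, a 0.19 validation split, and a batch size of 64, but the file count of 16 is a hypothetical value chosen for illustration. The real count comes from `len(tfr_filenames)`.

```python
NUM_IMAGES = 3670
NUM_FILES = 16            # hypothetical; use len(tfr_filenames) in practice
VALIDATION_SPLIT = 0.19
BATCH_SIZE = 64


def steps_for(files_in_split):
    """Whole batches available from the given number of files."""
    images_per_file = NUM_IMAGES // NUM_FILES
    return (images_per_file * files_in_split) // BATCH_SIZE


split = int(NUM_FILES * VALIDATION_SPLIT)     # files held out for validation
steps_per_epoch = steps_for(NUM_FILES - split)
validation_steps = steps_for(split)
print(split, steps_per_epoch, validation_steps)
```

Integer division rounds down twice, so a few images at the end of each split are not counted toward a step.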

## Create Functions to Load and Process TFRecord Files

Demonstrate one-hot encoding:

In [None]:
named_lbl = 'sunflowers'
indx = CLASSES.index(named_lbl)
encode = tf.one_hot([indx], 5)
one_hot = encode[0].numpy()
print ('encoded label:', one_hot)
pos = tf.math.argmax(one_hot).numpy()
print ('integer label:', pos)

Create a function to parse a TFRecord file:

In [None]:
def read_tfrecord(example):
  features = {
      'image': tf.io.FixedLenFeature([], tf.string),
      'class': tf.io.FixedLenFeature([], tf.int64)
  }
  example = tf.io.parse_single_example(example, features)
  image = tf.image.decode_jpeg(example['image'], channels=3)
  image = tf.cast(image, tf.float32) / 255.0 
  image = tf.reshape(image, [*IMAGE_SIZE, 3])
  class_label = example['class']
  one_hot = tf.one_hot(class_label, 5)
  return image, one_hot

Create a function to load TFRecord files as tf.data.Dataset:

In [None]:
def load_dataset(filenames):
  option_no_order = tf.data.Options()
  option_no_order.experimental_deterministic = False
  dataset = tf.data.TFRecordDataset(
      filenames, num_parallel_reads=AUTO)
  dataset = dataset.with_options(option_no_order)
  dataset = dataset.map(read_tfrecord, num_parallel_calls=AUTO)
  return dataset

Create a function to build an input pipeline from TFRecord files:

In [None]:
def get_batched_dataset(filenames, train=False):
  dataset = load_dataset(filenames)
  dataset = dataset.cache()
  if train:
    dataset = dataset.repeat()
    dataset = dataset.shuffle(SHUFFLE_SIZE)
  dataset = dataset.batch(BATCH_SIZE)
  dataset = dataset.prefetch(AUTO)
  return dataset

## Create Train and Validation Sets

Instantiate the datasets:

In [None]:
training_dataset = get_batched_dataset(
    training_filenames, train=True)
validation_dataset = get_batched_dataset(
    validation_filenames, train=False)
training_dataset, validation_dataset

Display an image:

In [None]:
for img, lbl in training_dataset.take(1):
  plt.axis('off')
  label = tf.math.argmax(lbl[0]).numpy()
  plt.title(CLASSES[label])
  fig = plt.imshow(img[0])
  tfr_flower_shape = img.shape[1:]

## Model Data

Create a list of pre-trained models:

In [None]:
ptm =\
  [tf.keras.applications.MobileNetV2,
   tf.keras.applications.VGG16,
   tf.keras.applications.MobileNet,
   tf.keras.applications.xception.Xception,
   tf.keras.applications.InceptionV3,
   tf.keras.applications.ResNet50]

Choose any of the pre-trained models by index. We use Xception in this use case:

In [None]:
pre_trained_model = ptm[3](
    weights='imagenet', include_top=False,
    input_shape=[*IMAGE_SIZE, 3])

Clear and seed:

In [None]:
tf.keras.backend.clear_session()
np.random.seed(0)
tf.random.set_seed(0)

Create the model:

In [None]:
pre_trained_model.trainable = True

flower_model = tf.keras.Sequential([
  pre_trained_model,
  GlobalAveragePooling2D(),
  Dense(5, activation='softmax')])

We use the Xception pre-trained model. We drop the ImageNet-specific top layers with include_top=False and add a global average pooling layer and a softmax layer to predict the 5 flower classes. We also leave all of the pre-trained layers unfrozen, so the whole network is fine-tuned!

## Compile and Train

Compile:

In [None]:
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)

flower_model.compile(
    optimizer=optimizer,
    loss = 'categorical_crossentropy',
    metrics=['accuracy'])

The initial learning rate is often the single most important hyperparameter. If one can tune only one hyperparameter, learning rate is the one worth tuning. The **Adam** optimizer adapts the step size for each weight during training, but the initial learning rate we pass in still matters.

By training with a small learning rate, the model can settle into a better set of weights. However, training takes significantly longer. When the learning rate is too large, gradient descent can inadvertently increase rather than decrease the training error, because each update overshoots the minimum.
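The trade-off can be seen on a toy problem. The sketch below runs plain gradient descent on `f(w) = w**2`, whose minimum is at `w = 0`, with three hypothetical learning rates. It illustrates the general behavior and does not model the flower network.

```python
def gradient_descent(learning_rate, w=1.0, steps=20):
    """Run plain gradient descent on f(w) = w**2 and return the final w."""
    for _ in range(steps):
        grad = 2.0 * w                 # derivative of w**2
        w -= learning_rate * grad
    return w


slow = gradient_descent(0.01)      # tiny steps: still far from 0 after 20 updates
fast = gradient_descent(0.4)       # well-chosen rate: essentially at the minimum
diverged = gradient_descent(1.1)   # too large: every update overshoots further
print(slow, fast, diverged)
```

Each update multiplies `w` by `1 - 2 * learning_rate`. The error shrinks when that factor is smaller than 1 in magnitude and grows once it is larger.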

Train:

In [None]:
history = flower_model.fit(
    training_dataset, epochs=EPOCHS,
    verbose=1, steps_per_epoch=steps_per_epoch,
    validation_steps=validation_steps, 
    validation_data=validation_dataset)

## Visualize

Visualize:

In [None]:
visualize(range(EPOCHS))

## Generalize

Generalize on the validation set because we didn't split out a test one:

In [None]:
flower_model.evaluate(validation_dataset)

# Rock Paper Scissors Experiment

The data contains images of hands playing the rock, paper, scissors game.

## Load the Data

Load the train set:

In [None]:
train_digits, rps_info = tfds.load(
    'rock_paper_scissors', with_info=True,
    split='train', as_supervised=True,
    try_gcs=True)

Load the test set:

In [None]:
test_digits = tfds.load(
    'rock_paper_scissors',  try_gcs=True,
    as_supervised=True, split='test')

Display metadata:

In [None]:
rps_info

Inspect:

In [None]:
for image, label in train_digits.take(5):
  print (image.shape, label.numpy())

## Visualize

Visualize examples from the train set:

In [None]:
fig = tfds.show_examples(train_digits, rps_info)

## Build the Input Pipeline

Create a function to process images and labels:

In [None]:
def process_digits(image, label):
  resized_image = tf.image.resize(image, [224, 224])
  final_image = tf.keras.applications.xception.\
                preprocess_input(resized_image)
  one_hot = tf.one_hot(label, 3)
  return final_image, one_hot

Build the pipeline:

In [None]:
BATCH_SIZE = 64
shuffle = 250

train_fingers = train_digits.shuffle(shuffle).\
  map(process_digits).batch(BATCH_SIZE).prefetch(1)
test_fingers = test_digits.map(process_digits).\
  batch(BATCH_SIZE).prefetch(1)

## Create the Model

Create the base model:

In [None]:
Xception = tf.keras.applications.xception.Xception
xception_model = Xception(
    weights='imagenet', include_top=False)

Clear and seed:

In [None]:
tf.keras.backend.clear_session()
np.random.seed(0)
tf.random.set_seed(0)

Create the final model:

In [None]:
xception_model.trainable = True

fingers_model = tf.keras.Sequential([
  xception_model,
  GlobalAveragePooling2D(),
  Dense(3, activation='softmax')])

## Compile and Train

Compile:

In [None]:
optimizer = tf.keras.optimizers.Adam(learning_rate=0.00001)

fingers_model.compile(
    optimizer=optimizer,
    loss = 'categorical_crossentropy',
    metrics=['accuracy'])

Train:

In [None]:
history = fingers_model.fit(
    train_fingers, epochs=10,
    validation_data=test_fingers)

## Visualize Performance

Visualize:

In [None]:
visualize(range(10))

## Generalize

Generalize on test data:

In [None]:
fingers_model.evaluate(test_fingers)

# Tips and Concepts

For additional tips to tune transfer learning models, peruse:

https://medium.com/@kenneth.ca95/a-guide-to-transfer-learning-with-keras-using-resnet50-a81a4a28084b

For a comprehensive take on the subject, peruse:

https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a