##### Copyright 2019 The TensorFlow Authors.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/rses-dl-course/rses-dl-course.github.io/blob/master/notebooks/python/L06_saving_and_loading_models.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/rses-dl-course/rses-dl-course.github.io/blob/master/notebooks/python/L06_saving_and_loading_models.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

# Lab 05: Saving and Loading Models

In this lab we will learn how to take a trained model, save it, and then load it back to keep training it or to perform inference. We will train a classifier for images of cats and dogs, just as we did in the previous lesson. This time, however, we will add checkpoints during training. We will then restore the model from a checkpoint, use it to make predictions, and continue training it. Finally, we will save the trained model as a TensorFlow SavedModel and download it to a local disk so that it can later be deployed on different platforms.

## Concepts that will be covered in this Colab

1. Saving checkpoints while training the model
2. Saving models in the TensorFlow SavedModel format
3. Loading models
4. Downloading models to local disk

Before starting this Colab, you should reset the Colab environment by selecting `Runtime -> Reset all runtimes...` from the menu above.

# Imports

In [None]:
import math
import os
import time
import numpy as np
import matplotlib.pylab as plt

import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from tensorflow.keras import layers

# Part 1: Load the Cats vs. Dogs Dataset

In [None]:

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
zip_dir = tf.keras.utils.get_file('cats_and_dogs_filtered.zip', origin=_URL, extract=True)
base_dir = os.path.join(os.path.dirname(zip_dir), 'cats_and_dogs_filtered')
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

train_cats_dir = os.path.join(train_dir, 'cats')  # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')  # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')  # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')  # directory with our validation dog pictures

train_image_generator      = ImageDataGenerator(rescale=1./255)  # Generator for our training data
validation_image_generator = ImageDataGenerator(rescale=1./255)  # Generator for our validation data

BATCH_SIZE = 32

# The images in the Cats vs. Dogs dataset are not all the same size, so we'll
# resize them to a resolution of 150x150, just like in our previous exercise.
IMAGE_RES = 150

train_batches = train_image_generator.flow_from_directory(batch_size=BATCH_SIZE,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMAGE_RES,IMAGE_RES), #(150,150)
                                                           class_mode='binary')

validation_batches = validation_image_generator.flow_from_directory(batch_size=BATCH_SIZE,
                                                              directory=validation_dir,
                                                              shuffle=True,
                                                              target_size=(IMAGE_RES,IMAGE_RES), #(150,150)
                                                              class_mode='binary')

class_names = np.array(["cats", "dogs"])

# Part 2: Train the Model with checkpoints

We will use the same model as at the beginning of the previous exercise to quickly put together a cats and dogs classifier. This time, however, we will add checkpointing so that the model weights are saved as the model is being trained.

Checkpointing is especially important with larger models that take a long time to train. It can save you from having to train from scratch should something go wrong (e.g. power cuts, system crashes, etc.), and it means you can go back to a specific stage of training and refine your hyperparameters.


In [None]:
# Define the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(2)
])
# Compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

Create a [`ModelCheckpoint`](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/ModelCheckpoint), which will be used as the callback when we train the model. By default, it saves after every epoch, but the `save_freq=` parameter can be used to specify how frequently to save.

Named formatting can be used when specifying the path of our checkpoint files. In this case, `cp-epoch{epoch:04d}-val_loss{val_loss:.2f}.ckpt` will record the epoch and the validation loss at that point.
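
The placeholders in the path behave much like Python's own `str.format` fields. As a minimal sketch, assuming Keras fills them with the epoch number and the logged metrics, the snippet below shows what one resulting filename would look like. The epoch and loss values are made up for illustration.

```python
# Same template as the lab's checkpoint_path, without the directory part.
template = "cp-epoch{epoch:04d}-val_loss{val_loss:.2f}.ckpt"

# Hypothetical values standing in for what Keras would supply at the end of an epoch.
example_name = template.format(epoch=3, val_loss=0.51234)
print(example_name)  # cp-epoch0003-val_loss0.51.ckpt
```

The `04d` field pads the epoch to four digits so that filenames sort in training order, and `.2f` rounds the validation loss to two decimal places.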

In [None]:
checkpoint_path = "catsdogs_checkpoint/cp-epoch{epoch:04d}-val_loss{val_loss:.2f}.ckpt"  
checkpoint_dir = os.path.dirname(checkpoint_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1,)
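
When `save_freq` is an integer rather than the default `'epoch'`, it counts batches, not epochs. The sketch below works out how many batches make up one epoch so that a value such as "every two epochs" can be expressed in batches. The helper name `batches_per_epoch` and the sample count of 2000 are assumptions made for illustration.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


# Assumed value: the training split is taken to hold 2000 images.
NUM_TRAIN_SAMPLES = 2000
BATCH_SIZE = 32

steps = batches_per_epoch(NUM_TRAIN_SAMPLES, BATCH_SIZE)
save_every_two_epochs = 2 * steps
print(steps, save_every_two_epochs)  # 63 126
```

In the lab itself, `len(train_batches)` should give the same per-epoch count. One caveat, stated as an assumption rather than verified here: when saving mid-epoch, validation metrics such as `val_loss` may not be available yet, so a filename template that includes `{val_loss}` might need adjusting.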

We'll train the model for only 6 epochs for now. Notice that `cp_callback` is passed in the `callbacks` parameter.

In [None]:
EPOCHS = 6
history = model.fit(
    train_batches,
    epochs=EPOCHS,
    validation_data=validation_batches,
    callbacks=[cp_callback], # Add the ModelCheckpoint to the list of training callbacks
)

You can see that the checkpoint files are saved in the `catsdogs_checkpoint` directory as we specified:

In [None]:
os.listdir(checkpoint_dir)

## Check the predictions

The example function below runs the provided model on the first batch of `validation_batches`, prints the predicted and true labels, and plots the images with their predicted classes.

In [None]:
# Gets the first batch from validation_batches set
image_batch, label_batch = next(iter(validation_batches))


def check_predictions(model):
    """
    Function for visualising predictions
    """

    # Perform prediction on the batch
    predicted_batch = model.predict(image_batch)
    predicted_batch = tf.squeeze(predicted_batch).numpy()
    predicted_ids = np.argmax(predicted_batch, axis=-1)
    predicted_class_names = class_names[predicted_ids]
    print("Predicted class names: ", predicted_class_names)
    
    # Print out label and predictions as a comparison
    print("Labels: ", label_batch)
    print("Predicted labels: ", predicted_ids)
    
    # Visually plot this result
    plt.figure(figsize=(10,9))
    for n in range(30):
      plt.subplot(6,5,n+1)
      plt.imshow(image_batch[n])
      color = "blue" if predicted_ids[n] == label_batch[n] else "red"
      plt.title(predicted_class_names[n].title(), color=color)
      plt.axis('off')
    _ = plt.suptitle("Model predictions (blue: correct, red: incorrect)")
    

In [None]:
check_predictions(model)
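
The model's final `Dense(2)` layer outputs raw logits (which is why the loss uses `from_logits=True`), and `check_predictions` picks the class with the larger logit via `argmax`. The pure-Python sketch below illustrates that step for a single image. The logit values are hypothetical, and the `softmax` and `predict_class` helpers are written here for illustration only.

```python
import math

class_names = ["cats", "dogs"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Return (class name, probability) for the largest logit."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical logits for one image, standing in for one row of model.predict().
name, prob = predict_class([-1.2, 2.3])
print(name, round(prob, 3))
```

Because softmax preserves the ordering of the logits, taking `argmax` of the logits directly (as the lab does) selects the same class as taking it over the probabilities.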

# Part 4: Load from a previously saved checkpoint

We will now load a model checkpoint we just saved into a new model called `reloaded`. When we created the `ModelCheckpoint`, we specified `save_weights_only=True`, which means that only the model's weights were saved, not its architecture. We therefore also need to define the model architecture again:

In [None]:
# Define the model
reloaded = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(2)
])
# Compile the model
reloaded.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

As it's a new, untrained model, it performs poorly when running a prediction:

In [None]:
check_predictions(reloaded)

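The function `tf.train.latest_checkpoint()` returns the path of the most recent checkpoint in a directory. We can pass that path, or the path of any specific checkpoint, to the `load_weights` method of our `reloaded` model. (`tf.train.load_checkpoint()`, by contrast, returns a reader for inspecting a checkpoint's contents rather than loading weights into a model.)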

In [None]:
# Gets the path of the latest checkpoint in the directory
latest = tf.train.latest_checkpoint(checkpoint_dir)

# Loads to our model
reloaded.load_weights(latest)
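
To go back to a specific stage of training instead of the latest one, the validation loss recorded in each filename can be used to choose a checkpoint. The sketch below picks the lowest-loss checkpoint from a directory listing. The helper `best_checkpoint_prefix` and the example listing are hypothetical, and the sketch assumes the files follow this lab's naming scheme, with several files (such as `.index` and `.data-*`) sharing one prefix.

```python
import os
import re

# Matches the naming scheme used for checkpoint_path in this lab.
_PATTERN = re.compile(r"cp-epoch(\d+)-val_loss(\d+\.\d+)\.ckpt")


def best_checkpoint_prefix(checkpoint_dir, filenames):
    """Return the checkpoint prefix with the lowest validation loss, or None."""
    candidates = {}
    for name in filenames:
        match = _PATTERN.match(name)
        if match:
            epoch, val_loss = int(match.group(1)), float(match.group(2))
            # Files sharing one prefix collapse into a single dictionary entry.
            candidates[match.group(0)] = (val_loss, epoch)
    if not candidates:
        return None
    best = min(candidates, key=candidates.get)  # ties go to the earlier epoch
    return os.path.join(checkpoint_dir, best)


# Hypothetical listing, shaped like what os.listdir(checkpoint_dir) might return.
listing = [
    "checkpoint",
    "cp-epoch0001-val_loss0.69.ckpt.index",
    "cp-epoch0001-val_loss0.69.ckpt.data-00000-of-00001",
    "cp-epoch0002-val_loss0.58.ckpt.index",
    "cp-epoch0002-val_loss0.58.ckpt.data-00000-of-00001",
    "cp-epoch0003-val_loss0.61.ckpt.index",
    "cp-epoch0003-val_loss0.61.ckpt.data-00000-of-00001",
]
print(best_checkpoint_prefix("catsdogs_checkpoint", listing))
```

The returned prefix could then be passed to `reloaded.load_weights(...)` in place of `latest`. Note that the loss in the filename is rounded to two decimals, so near-ties cannot be distinguished this way.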

Let's check the predictions again. They should match those of our previously trained model.

In [None]:
check_predictions(reloaded)

# Keep Training

Besides making predictions, we can also take our `reloaded` model and keep training it. To do this, just train `reloaded` as usual with the `.fit` method.

In [None]:
EPOCHS = 10
history = reloaded.fit(train_batches,
                    epochs=EPOCHS,
                    validation_data=validation_batches)

Check the prediction again to see the results of further training:

In [None]:
check_predictions(reloaded)

# Part 5: Export as SavedModel


You can also export a whole model to the TensorFlow SavedModel format. SavedModel is a standalone serialization format for TensorFlow objects, supported by TensorFlow Serving as well as TensorFlow implementations other than Python. A SavedModel contains a complete TensorFlow program, including weights and computation. It does not require the original model-building code to run, which makes it useful for sharing or deploying (with TFLite, TensorFlow.js, TensorFlow Serving, or TFHub).

The SavedModel files that were created contain:

* A TensorFlow checkpoint containing the model weights.
* A SavedModel proto containing the underlying TensorFlow graph. Separate graphs are saved for prediction (serving), training, and evaluation. If the model wasn't compiled before, then only the inference graph gets exported.
* The model's architecture config, if available.


Let's save our original `model` as a TensorFlow SavedModel. To do this we will use the `model.save()` method, which takes the path to the folder where we want to save the model.

This function will create a folder where you will find an `assets` folder, a `variables` folder, `keras_metadata.pb` and the `saved_model.pb` file.

In [None]:
t = time.time()

export_path_sm = "./{}".format(int(t))
print(export_path_sm)

model.save(export_path_sm)

In [None]:
!ls {export_path_sm}
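
The listing above can also be checked programmatically. The sketch below reports which of the expected entries are missing from an export folder. The helper `missing_savedmodel_entries` is hypothetical, and only `saved_model.pb` and `variables` are treated as required, on the assumption that the other entries may vary between TensorFlow versions. A temporary stand-in folder is used so the sketch runs without TensorFlow.

```python
import os
import tempfile

# Entries assumed to be present in every SavedModel export (see the lead-in).
REQUIRED = ("saved_model.pb", "variables")


def missing_savedmodel_entries(export_dir):
    """Return the required entries that are absent from export_dir."""
    present = set(os.listdir(export_dir))
    return [name for name in REQUIRED if name not in present]


# Stand-in directory holding only a variables folder.
with tempfile.TemporaryDirectory() as fake_export:
    os.mkdir(os.path.join(fake_export, "variables"))
    print(missing_savedmodel_entries(fake_export))  # ['saved_model.pb']
```

In the lab, calling `missing_savedmodel_entries(export_path_sm)` after `model.save(...)` would be expected to return an empty list.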

# Part 6: Load SavedModel

Now, let's load our SavedModel and use it to make predictions. We use the `tf.saved_model.load()` function to load our SavedModels. The object returned by `tf.saved_model.load` is 100% independent of the code that created it.

In [None]:
reloaded_sm = tf.saved_model.load(export_path_sm)

Now, let's use the `reloaded_sm` (reloaded SavedModel) to make predictions on a batch of images.

In [None]:
# Prediction from the original in-memory model as a comparison
result_batch = model.predict(image_batch)

# Prediction from the loaded model
reload_sm_result_batch = reloaded_sm(image_batch, training=False).numpy()

We can check that the reloaded SavedModel and the previous model give the same result.

In [None]:
(abs(result_batch - reload_sm_result_batch)).max()

As we can see, the result is 0.0, which indicates that both models made the same predictions on the same batch of images.
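
An exact 0.0 is the ideal case. As a hedged alternative, comparing against a small tolerance is more forgiving if floating-point rounding ever introduces a tiny difference. The sketch below uses plain nested lists in place of the NumPy arrays from the lab. The helper names, the example values, and the tolerance of `1e-6` are assumptions for illustration.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equally shaped nested lists."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def outputs_match(a, b, tol=1e-6):
    """True if no pair of corresponding elements differs by more than tol."""
    return max_abs_diff(a, b) <= tol


# Hypothetical logits from two models for a batch of two images.
original_out = [[0.12, -1.5], [2.0, 0.3]]
reloaded_out = [[0.12, -1.5], [2.0, 0.3000001]]

print(outputs_match(original_out, reloaded_out))  # True
```

With NumPy arrays, `np.allclose(result_batch, reload_sm_result_batch)` plays a similar role.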

# Part 7: Loading the SavedModel as a Keras Model

The object returned by `tf.saved_model.load` is not a Keras object (i.e. it doesn't have `.fit`, `.predict`, `.summary`, etc. methods). Therefore, you can't simply take your `reloaded_sm` model and keep training it by running `.fit`. To get back a full Keras model from the TensorFlow SavedModel format, we must use the `tf.keras.models.load_model` function, passing it the path to the folder containing our SavedModel.

In [None]:
t = time.time()

export_path_sm = "./{}".format(int(t))
print(export_path_sm)
model.save(export_path_sm)

In [None]:
reload_sm_keras = tf.keras.models.load_model(
  export_path_sm,
  custom_objects={'KerasLayer': hub.KerasLayer})

reload_sm_keras.summary()

Now, let's use `reload_sm_keras` (the Keras model reloaded from our SavedModel) to make predictions on a batch of images.

In [None]:
result_batch = model.predict(image_batch)
reload_sm_keras_result_batch = reload_sm_keras.predict(image_batch)

We can check that the reloaded Keras model and the previous model give the same result.

In [None]:
(abs(result_batch - reload_sm_keras_result_batch)).max()

# Part 8: Download your model

You can download the SavedModel to your local disk by creating a zip file. We will use the `-r` (recursive) option to zip all subfolders.

In [None]:
!zip -r model.zip {export_path_sm}

The zip file is saved in the current working directory. You can see what the current working directory is by running:

In [None]:
!ls

Once the file is zipped, you can download it to your local disk.

In [None]:
try:
  from google.colab import files
  files.download('./model.zip')
except ImportError:
  pass

The `files.download` command will search for files in your current working directory. If the file you want to download is in a different directory, you have to include the path to that directory.
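
Outside Colab, where the `!zip` shell command may not be available, the recursive zip step can also be done from Python. The sketch below uses the standard library's `shutil.make_archive`. The helper `zip_directory` and the stand-in folder tree are hypothetical and exist only so the example runs without a real SavedModel.

```python
import os
import shutil
import tempfile
import zipfile


def zip_directory(source_dir, archive_base):
    """Zip source_dir recursively and return the path of the created .zip file."""
    source_dir = os.path.abspath(source_dir)
    parent = os.path.dirname(source_dir)
    folder = os.path.basename(source_dir)
    # root_dir/base_dir keep the top-level folder name inside the archive,
    # similar to what `zip -r model.zip <folder>` produces.
    return shutil.make_archive(archive_base, "zip", root_dir=parent, base_dir=folder)


# Stand-in folder tree shaped loosely like a SavedModel export.
with tempfile.TemporaryDirectory() as work:
    export_dir = os.path.join(work, "1234567890")
    os.makedirs(os.path.join(export_dir, "variables"))
    with open(os.path.join(export_dir, "saved_model.pb"), "wb") as f:
        f.write(b"placeholder")

    archive = zip_directory(export_dir, os.path.join(work, "model"))
    with zipfile.ZipFile(archive) as zf:
        archived_names = zf.namelist()

print(archived_names)
```

In the lab, `zip_directory(export_path_sm, "model")` would be expected to write `model.zip` to the current working directory.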