# Week 2 Assignment: Zombie Detection

Welcome to this week's programming assignment! You will use the Object Detection API to retrain [RetinaNet](https://arxiv.org/abs/1708.02002) to spot zombies using just 5 training images. You will set up the model to restore pretrained weights and fine-tune the classification layers.

<img src='https://drive.google.com/uc?export=view&id=18Ck0qNSZy9F1KsUKWc4Jv7_x_1e_fXTN' alt='zombie'>

## Exercises

* [Exercise 1 - Import Object Detection API packages](#exercise-1)
* [Exercise 2 - Visualize the training images](#exercise-2)
* [Exercise 3 - Define the category index dictionary](#exercise-3)
* [Exercise 4 - Download checkpoints](#exercise-4)
* [Exercise 5.1 - Locate and read from the configuration file](#exercise-5-1)
* [Exercise 5.2 - Get the model configuration](#exercise-5-2)
* [Exercise 5.3 - Modify model_config](#exercise-5-3)
* [Exercise 5.4 - Build the custom model](#exercise-5-4)
* [Exercise 6.1 - Define Checkpoints for the box predictor](#exercise-6-1)
* [Exercise 6.2 - Define the temporary model checkpoint](#exercise-6-2)
* [Exercise 6.3 - Restore the checkpoint](#exercise-6-3)
* [Exercise 7 - Run a dummy image to generate the model variables](#exercise-7)
* [Exercise 8 - Set training hyperparameters](#exercise-8)
* [Exercise 9 - Select the prediction layer variables](#exercise-9)
* [Exercise 10 - Define the training step](#exercise-10)
* [Exercise 11 - Preprocess, predict, and postprocess an image](#exercise-11)

## Installation

You'll start by installing the TensorFlow 2 [Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection).

In [None]:
# Delete an existing models directory
!rm -rf ./models/

# Clone the TensorFlow Model Garden
!git clone --depth 1 https://github.com/tensorflow/models/

In [None]:
# Install Object Detection API
!cd models/research/ && protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf2/setup.py . && python -m pip install .

## Imports

In [None]:
import matplotlib
import matplotlib.pyplot as plt

import os
import random
import zipfile
import io
import scipy.misc
import numpy as np

import glob
import imageio
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage

try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass

import tensorflow as tf
tf.get_logger().setLevel('ERROR')

<a name='exercise-1'></a>
### **Exercise 1**: Import Object Detection API packages

In [None]:
# Import object detection utilities

from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder
from object_detection.utils import colab_utils

## Utilities

You'll define a couple of utility functions for loading images and plotting detections. This code is provided for you.

In [None]:
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
    path: a file path.

    Returns:
    uint8 numpy array with shape (img_height, img_width, 3)
    """
    
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)


def plot_detections(image_np,
                    boxes,
                    classes,
                    scores,
                    category_index,
                    figsize=(12, 16),
                    image_name=None):
    """Wrapper function to visualize detections.

    Args:
    image_np: uint8 numpy array with shape (img_height, img_width, 3)
    boxes: a numpy array of shape [N, 4]
    classes: a numpy array of shape [N]. Note that class indices are 1-based,
          and match the keys in the label map.
    scores: a numpy array of shape [N] or None.  If scores=None, then
          this function assumes that the boxes to be plotted are groundtruth
          boxes and plot all boxes as black with no classes or scores.
    category_index: a dict containing category dictionaries (each holding
          category index `id` and category name `name`) keyed by category indices.
    figsize: size for the figure.
    image_name: a name for the image file.
    """
    
    image_np_with_annotations = image_np.copy()
    
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_annotations,
        boxes,
        classes,
        scores,
        category_index,
        use_normalized_coordinates=True,
        min_score_thresh=0.8)
    
    if image_name:
        plt.imsave(image_name, image_np_with_annotations)
    
    else:
        plt.imshow(image_np_with_annotations)


## Download the Zombie data

In [None]:
# Delete existing zip and training directory
!rm -f training-zombie.zip
!rm -rf ./training

# Download zombie images
!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/training-zombie.zip \
    -O ./training-zombie.zip

# Unzip
local_zip = './training-zombie.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('./training')
zip_ref.close()
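
The unzip step above opens the archive, extracts it, and closes it by hand. As a minimal sketch of the same pattern, `zipfile.ZipFile` can also be used as a context manager so the archive is closed even if extraction fails. The archive name, its single placeholder file, and the temporary directory below are made up for illustration and are not part of the assignment data.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical archive and target directory, created only for this sketch.
    zip_path = os.path.join(tmp_dir, 'example.zip')
    extract_dir = os.path.join(tmp_dir, 'extracted')

    # Build a tiny archive holding one placeholder file.
    with zipfile.ZipFile(zip_path, 'w') as zip_ref:
        zip_ref.writestr('image1.txt', 'placeholder')

    # Extract it; the context manager closes the archive automatically.
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(extract_dir)

    extracted_files = sorted(os.listdir(extract_dir))

print(extracted_files)
```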

<a name='exercise-2'></a>

### **Exercise 2**: Visualize the training images

In [None]:
%matplotlib inline

# Set directory name
train_image_dir = './training'

# Initialize list to hold image data
train_images_np = []

# Iterate over training images
for i in range(1, 6):

    image_path = os.path.join(train_image_dir, 'training-zombie' + str(i) + '.jpg')
    print(image_path)

    # Load images into numpy arrays and append to a list
    train_images_np.append(load_image_into_numpy_array(image_path))

# Configure plot settings
plt.rcParams['axes.grid'] = False
plt.rcParams['xtick.labelsize'] = False
plt.rcParams['ytick.labelsize'] = False
plt.rcParams['xtick.top'] = False
plt.rcParams['xtick.bottom'] = False
plt.rcParams['ytick.left'] = False
plt.rcParams['ytick.right'] = False
plt.rcParams['figure.figsize'] = [14, 7]

# Plot images
for idx, train_image_np in enumerate(train_images_np):
    plt.subplot(1, 5, idx+1)
    plt.imshow(train_image_np)

plt.show()

<a name='gt_boxes_definition'></a>
## Prepare data for training

In [None]:
# Define the list of ground truth boxes
gt_boxes = []

In [None]:
# Draw ground truth boxes
colab_utils.annotate(train_images_np, box_storage_pointer=gt_boxes)

In [None]:
# TEST CODE:
try:
  assert(len(gt_boxes) == 5), "Warning: gt_boxes does not hold boxes for all 5 images. Did you click `submit`?"

except AssertionError as e:
  print(e)

# checks if there are boxes for all 5 images
for gt_box in gt_boxes:
    try:
      assert(gt_box is not None), "There are fewer than 5 sets of box coordinates. " \
                                  "Please re-run the cell above to draw the boxes again.\n" \
                                  "Alternatively, you can run the next cell to load pre-determined " \
                                  "ground truth boxes."
    
    except AssertionError as e:
        print(e)
        break


ref_gt_boxes = [
        np.array([[0.27333333, 0.41500586, 0.74333333, 0.57678781]]),
        np.array([[0.29833333, 0.45955451, 0.75666667, 0.61078546]]),
        np.array([[0.40833333, 0.18288394, 0.945, 0.34818288]]),
        np.array([[0.16166667, 0.61899179, 0.8, 0.91910903]]),
        np.array([[0.28833333, 0.12543962, 0.835, 0.35052755]]),
      ]

for gt_box, ref_gt_box in zip(gt_boxes, ref_gt_boxes):
    try:
      assert(np.allclose(gt_box, ref_gt_box, atol=0.04)), "One of the boxes is too big or too small. " \
                                                          "Please re-draw and make the box tighter around the zombie."
    
    except AssertionError as e:
      print(e)
      break

#### View your ground truth box coordinates

In [None]:
# Print ground truth coordinates
for gt_box in gt_boxes:
  print(gt_box)
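
The coordinates printed here follow the Object Detection API's normalized `[ymin, xmin, ymax, xmax]` order, where each value is a fraction of the image height or width. As a rough sketch, the hypothetical helper `to_pixel_box` below converts one such box to pixel coordinates. The box is the first reference box from the test cell above; the 600 x 853 image size is an assumption made for illustration.

```python
import numpy as np

def to_pixel_box(box, image_height, image_width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates."""
    scale = np.array([image_height, image_width, image_height, image_width],
                     dtype=np.float64)
    return np.asarray(box, dtype=np.float64) * scale

# First reference box from the test cell; the image size is assumed.
normalized_box = [0.27333333, 0.41500586, 0.74333333, 0.57678781]
pixel_box = to_pixel_box(normalized_box, image_height=600, image_width=853)

# Round to whole pixels for display.
print(np.round(pixel_box).astype(int))
```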

<a name='exercise-3'></a>

### **Exercise 3**: Define the category index dictionary


In [None]:
# Set zombie class ID
zombie_class_id = 1

# Define zombie class dictionary
category_index = {zombie_class_id: {'id': zombie_class_id, 'name': 'zombie'}}

# Specify number of classes
num_classes = 1

### Data preprocessing

In [None]:
# Set label ID offset
label_id_offset = 1

# Initialize list for image tensors
train_image_tensors = []

# Initialize lists for ground truth classes and bounding boxes
gt_classes_one_hot_tensors = []
gt_box_tensors = []

for (train_image_np, gt_box_np) in zip(train_images_np, gt_boxes):
    
    # Convert training image to tensor and add to list
    train_image_tensors.append(tf.expand_dims(tf.convert_to_tensor(
        train_image_np, dtype=tf.float32), axis=0))
    
    # Convert GT box array to tensor and add to list
    gt_box_tensors.append(tf.convert_to_tensor(gt_box_np, dtype=tf.float32))
    
    # Apply offset to get zero-indexed GT classes
    zero_indexed_gt_classes = tf.convert_to_tensor(
        np.ones(shape=[gt_box_np.shape[0]], dtype=np.int32) - label_id_offset)
    
    # Get one-hot encoded gt classes
    gt_classes_one_hot_tensors.append(tf.one_hot(
        zero_indexed_gt_classes, num_classes))

print('Done prepping data.')
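
The label offset step above can be easier to follow on a tiny example. The sketch below mirrors it in plain NumPy, using `np.eye` as a stand-in for `tf.one_hot`. The two-box image is hypothetical; the assignment images each carry a single box.

```python
import numpy as np

label_id_offset = 1   # same offset as in the preprocessing cell
num_classes = 1       # only the zombie class

# Hypothetical image with two boxes, both labelled with the 1-based zombie ID.
class_ids = np.array([1, 1], dtype=np.int32)

# Shift the 1-based class IDs to 0-based indices.
zero_indexed = class_ids - label_id_offset

# Row i of the identity matrix is the one-hot vector for class index i.
one_hot = np.eye(num_classes, dtype=np.float32)[zero_indexed]

print(zero_indexed)
print(one_hot.shape)
```

With a single class, every one-hot row has length 1, which is why the ground truth class tensors in this assignment have shape `[N_i, 1]`.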

## Visualize the zombies with their ground truth bounding boxes


In [None]:
# Give boxes a score of 100%
dummy_scores = np.array([1.0], dtype=np.float32)

# Plot images
plt.figure(figsize=(30, 15))
for idx in range(5):
    plt.subplot(2, 4, idx+1)
    plot_detections(
      train_images_np[idx],
      gt_boxes[idx],
      np.ones(shape=[gt_boxes[idx].shape[0]], dtype=np.int32),
      dummy_scores, category_index)

plt.show()

## Download the checkpoint containing the pre-trained weights

Next, you will download [RetinaNet](https://arxiv.org/abs/1708.02002) and copy it inside the object detection directory.

<a name='exercise-4'></a>
### **Exercise 4**: Download checkpoints

  - Download the compressed SSD Resnet 50 version 1, 640 x 640 checkpoint.
  - Untar (decompress) the tar file
  - Move the decompressed checkpoint to `models/research/object_detection/test_data/`


In [None]:
# Download the SSD Resnet 50 version 1, 640x640 checkpoint
!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
    
# Untar file
!tar -xf ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz

# Move checkpoint to test_data folder
!mv ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint models/research/object_detection/test_data/

## Configure the model
Here, you will configure the model for this use case.

<a name='exercise-5-1'></a>

### **Exercise 5.1**: Locate and read from the configuration file


In [None]:
tf.keras.backend.clear_session()

# Define .config file path
pipeline_config = 'models/research/object_detection/configs/tf2/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.config'

# Load config file
configs = config_util.get_configs_from_pipeline_file(pipeline_config)

# Show configs
configs

<a name='exercise-5-2'></a>

### **Exercise 5.2**: Get the model configuration

In [None]:
# Get 'model' configs object
model_config = configs['model']

# Show model_config
model_config

<a name='exercise-5-3'></a>

### **Exercise 5.3**: Modify model_config

In [None]:
# Modify num_classes
model_config.ssd.num_classes = num_classes

# Freeze batch normalization
model_config.ssd.freeze_batchnorm = True

# Show updated model_config
model_config

## Build the model

<a name='exercise-5-4'></a>

### **Exercise 5.4**: Build the custom model

In [None]:
# Build new model using model_config
detection_model = model_builder.build(model_config, is_training=True, add_summaries=True)

## Restore weights from your checkpoint

Now, you will selectively restore weights from your checkpoint.
- The parts of RetinaNet that you want to reuse are:
  - Feature extraction layers
  - Bounding box regression prediction layer

<a name='exercise-6-1'></a>
### **Exercise 6.1**: Define Checkpoints for the box predictor

In [None]:
# Define checkpoint to load box predictor
tmp_box_predictor_checkpoint = tf.train.Checkpoint(
    _base_tower_layers_for_heads = detection_model._box_predictor._base_tower_layers_for_heads,
    _box_prediction_head = detection_model._box_predictor._box_prediction_head
)

<a name='exercise-6-2'></a>
### **Exercise 6.2**: Define the temporary model checkpoint


In [None]:
# Define checkpoint to load temporary model
tmp_model_checkpoint = tf.train.Checkpoint(
    _feature_extractor = detection_model._feature_extractor,
    _box_predictor = tmp_box_predictor_checkpoint
)

<a name='exercise-6-3'></a>
### **Exercise 6.3**: Restore the checkpoint

In [None]:
checkpoint_path = 'models/research/object_detection/test_data/checkpoint/ckpt-0'

# Define checkpoint to restore model
checkpoint = tf.train.Checkpoint(model=tmp_model_checkpoint)

# Restore checkpoint from checkpoint_path
checkpoint.restore(checkpoint_path)

<a name='exercise-7'></a>
### **Exercise 7**: Run a dummy image to generate the model variables

Run a dummy image through the model so that variables are created.

In [None]:
# Run dummy image through model to restore weights
tmp_image, tmp_shapes = detection_model.preprocess(tf.zeros([1, 640, 640, 3]))
tmp_prediction_dict = detection_model.predict(tmp_image, tmp_shapes)
tmp_detections = detection_model.postprocess(tmp_prediction_dict, tmp_shapes)

print('Weights restored!')

## Eager mode custom training loop

With the data and model now set up, you can proceed to configure the training.


<a name='exercise-8'></a>
### **Exercise 8**: Set training hyperparameters

Set an appropriate learning rate and optimizer for the training. 

In [None]:
# Set model training hyperparameters
tf.keras.backend.set_learning_phase(True)

# Batch size
batch_size = 4

# Number of batches
num_batches = 1000

# Learning rate
learning_rate = 0.01

# Optimizer
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)
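
To make the role of `learning_rate` and `momentum` concrete, here is a minimal NumPy sketch of the update rule documented for Keras SGD with (non-Nesterov) momentum: `velocity = momentum * velocity - learning_rate * gradient`, then `weight = weight + velocity`. The helper `sgd_momentum_step` and the toy loss `w**2` are assumptions for illustration and are not part of the assignment.

```python
import numpy as np

def sgd_momentum_step(weight, velocity, gradient, learning_rate=0.01, momentum=0.9):
    """One SGD-with-momentum update; returns the new weight and velocity."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity

# Toy problem: minimize the hypothetical loss w**2, whose gradient is 2 * w.
w = np.array([1.0])
v = np.array([0.0])

for step in range(2):
    grad = 2.0 * w
    w, v = sgd_momentum_step(w, v, grad)
    print(step, w, v)
```

The velocity accumulates past gradients, so the second step moves further than the first even though the gradient has shrunk slightly.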

## Choose the layers to fine-tune

In [None]:
# Inspect the layers of detection_model
for i, v in enumerate(detection_model.trainable_variables):
    print("i: {} \t name: {} \t shape:{} \t dtype={}".format(i, v.name, v.shape, v.dtype))

<a name='exercise-9'></a>

### **Exercise 9**: Select the prediction layer variables

In [None]:
# Collect layers to fine tune
to_fine_tune = []
prefixes_to_train = [
  'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead',
  'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead'
]

for var in detection_model.trainable_variables:
  if any([var.name.startswith(prefix) for prefix in prefixes_to_train]):
    to_fine_tune.append(var)
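
The selection above is a prefix match on variable names. The sketch below shows the same filter on plain strings, using the fact that `str.startswith` accepts a tuple of prefixes. The variable names are made up for illustration; the real names are the ones printed by the inspection cell earlier.

```python
prefixes_to_train = [
    'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead',
    'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead',
]

# Hypothetical variable names, for illustration only.
variable_names = [
    'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead/BoxPredictor/kernel:0',
    'WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead/ClassPredictor/bias:0',
    'WeightSharedConvolutionalBoxPredictor/BoxPredictionTower/conv2d_0/kernel:0',
    'ResNet50V1_FPN/bottom_up_block5_conv/kernel:0',
]

# Keep only the names that begin with one of the prefixes.
selected = [name for name in variable_names
            if name.startswith(tuple(prefixes_to_train))]

for name in selected:
    print(name)
```

Only the two head variables pass the filter; the tower and backbone names share part of the path but not the full prefix, so they stay frozen.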

## Train the model

<a name='exercise-10'></a>

### **Exercise 10**: Define the training step

In [None]:
# Define training step
@tf.function
def train_step_fn(image_list,
                groundtruth_boxes_list,
                groundtruth_classes_list,
                model,
                optimizer,
                vars_to_fine_tune):
    """A single training iteration.

    Args:
      image_list: A list of [1, height, width, 3] Tensors of type tf.float32.
        Note that the height and width can vary across images, as they are
        resized within this function to be 640x640.
      groundtruth_boxes_list: A list of Tensors of shape [N_i, 4] with type
        tf.float32 representing groundtruth boxes for each image in the batch.
      groundtruth_classes_list: A list of Tensors of shape [N_i, num_classes]
        with type tf.float32 representing one-hot groundtruth classes for each
        image in the batch.
      model: The detection model to train.
      optimizer: The optimizer used to update the variables.
      vars_to_fine_tune: A list of the model variables to update.

    Returns:
      A scalar tensor representing the total loss for the input batch.
    """

    with tf.GradientTape() as tape:

        # Preprocess input images
        preprocessed_images = []
        image_shapes = []

        for img in image_list:
          preprocessed_img, img_shape = model.preprocess(img)
          preprocessed_images.append(preprocessed_img)
          image_shapes.append(img_shape)

        preprocessed_images_tensor = tf.concat(preprocessed_images, axis=0)
        image_shapes_tensor = tf.concat(image_shapes, axis=0)

        # Make a prediction
        prediction_dict = model.predict(preprocessed_images_tensor, image_shapes_tensor)

        # Calculate loss
        model.provide_groundtruth(
            groundtruth_boxes_list=groundtruth_boxes_list,
            groundtruth_classes_list=groundtruth_classes_list)
        
        losses_dict = model.loss(prediction_dict, image_shapes_tensor)
        total_loss = losses_dict['Loss/localization_loss'] + losses_dict['Loss/classification_loss']

    # Calculate the gradients (outside the tape context, so the tape stops recording)
    gradients = tape.gradient(total_loss, vars_to_fine_tune)

    # Optimize training variables
    optimizer.apply_gradients(zip(gradients, vars_to_fine_tune))

    return total_loss

In [None]:
# Define early stopping callback
def early_stop_callback(losses, min_delta=0.00001, patience=100):
  # No early stopping for at least (2 * patience) batches
  if len(losses) < 2 * patience:
    return False
  
  # Get recent losses
  recent_losses = np.array(losses[-(patience + 1):])

  # Pop current loss
  current_loss, recent_losses = recent_losses[-1], recent_losses[:-1]

  # Test whether current_loss is less than moving avg loss
  if current_loss < (np.mean(recent_losses, axis=0) - min_delta):
    return False
  else:
    return True
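
To see when this callback fires, it helps to run it on synthetic loss histories. The sketch below repeats the same logic in compact form and uses `patience=3` so the lists stay short. Both loss lists are invented for illustration.

```python
import numpy as np

def early_stop_callback(losses, min_delta=0.00001, patience=100):
    """Return True when the current loss is not below the recent average."""
    if len(losses) < 2 * patience:
        return False
    recent = np.array(losses[-(patience + 1):])
    current_loss, previous = recent[-1], recent[:-1]
    return not current_loss < (np.mean(previous) - min_delta)

# Synthetic histories: one still improving, one that has flattened out.
improving = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
plateaued = [1.0, 0.9, 0.8, 0.5, 0.5, 0.5, 0.5]

print(early_stop_callback(improving, patience=3))
print(early_stop_callback(plateaued, patience=3))
print(early_stop_callback([1.0, 0.9], patience=3))
```

The improving history keeps training because its latest loss is below the mean of the previous three. The plateaued history triggers a stop, and the short history never stops because fewer than `2 * patience` losses have been recorded.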

## Run the training loop

In [None]:
print('Start fine-tuning!', flush=True)

# Define list to store losses
losses = []

for idx in range(num_batches):
  # Grab indices for random sample of images
  indices = list(range(len(train_images_np)))
  random.shuffle(indices)
  sample_keys = indices[:batch_size]

  # Get ground truth classes & boxes
  gt_classes_list = [gt_classes_one_hot_tensors[key] for key in sample_keys]
  gt_boxes_list = [gt_box_tensors[key] for key in sample_keys]
  
  # Get training images
  image_tensors = [train_image_tensors[key] for key in sample_keys]

  # Training step
  total_loss = train_step_fn(image_tensors, 
                              gt_boxes_list, 
                              gt_classes_list,
                              detection_model,
                              optimizer,
                              to_fine_tune
                            )
  
  # Save loss
  losses.append(total_loss)

  # Print progress
  if idx % 10 == 0:
    print('batch ' + str(idx) + ' of ' + str(num_batches)
    + ', loss=' +  str(total_loss.numpy()), flush=True)
  
  # Test for early stop
  if early_stop_callback(losses):
    # Print final output
    print('\nStopped at batch ' + str(idx) + ' of ' + str(num_batches)
    + ', loss=' +  str(total_loss.numpy()), flush=True)
    break

print('Done fine-tuning!')
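
Each iteration of the loop above shuffles all image indices and keeps the first `batch_size` of them, so every batch holds 4 distinct images out of the 5. As a minimal sketch, the same sampling can be written with `random.sample`, which draws without replacement. The fixed seed is an assumption that only makes this illustration repeatable.

```python
import random

random.seed(0)  # assumption: fixed seed for a repeatable illustration

num_images = 5
batch_size = 4

# Draw batch_size distinct image indices without replacement.
sample_keys = random.sample(range(num_images), batch_size)

print(sample_keys)
```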

## Load test images and run inference with new model!

You can now test your model on a new set of images. The cell below downloads 237 images of a walking zombie and stores them in a `results/` directory.

In [None]:
# Delete existing files
!rm -f zombie-walk-frames.zip
!rm -rf ./zombie-walk
!rm -rf ./results

# Download test images
!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/zombie-walk-frames.zip \
    -O zombie-walk-frames.zip

# Unzip test images
local_zip = './zombie-walk-frames.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('./results')
zip_ref.close()

Load these images into a numpy array to prepare for inference.

In [None]:
test_image_dir = './results/'
test_images_np = []

# Load images into a numpy array
for i in range(0, 237):
    image_path = os.path.join(test_image_dir, 'zombie-walk' + "{0:04}".format(i) + '.jpg')
    print(image_path)
    test_images_np.append(np.expand_dims(
      load_image_into_numpy_array(image_path), axis=0))

<a name='exercise-11'></a>

### **Exercise 11**: Preprocess, predict, and postprocess an image

In [None]:
@tf.function
def detect(input_tensor):
    """Run detection on an input image.

    Args:
    input_tensor: A [1, height, width, 3] Tensor of type tf.float32.
      Note that height and width can be anything since the image will be
      immediately resized according to the needs of the model within this
      function.

    Returns:
    A dict containing 3 Tensors (`detection_boxes`, `detection_classes`,
      and `detection_scores`).
    """
    preprocessed_image, shapes = detection_model.preprocess(input_tensor)
    prediction_dict = detection_model.predict(preprocessed_image, shapes)
    detections = detection_model.postprocess(prediction_dict, shapes)
    
    return detections

In [None]:
# Set label ID offset
label_id_offset = 1

# Initialize dict to store results
results = {'boxes': [], 'scores': []}

# Iterate over test images to get results
for i in range(len(test_images_np)):

    # Convert input image array to tensor
    input_tensor = tf.convert_to_tensor(test_images_np[i], dtype=tf.float32)

    # Detect on input image
    detections = detect(input_tensor)

    # Plot detections
    plot_detections(
      test_images_np[i][0],
      detections['detection_boxes'][0].numpy(),
      detections['detection_classes'][0].numpy().astype(np.uint32)
      + label_id_offset,
      detections['detection_scores'][0].numpy(),
      category_index, figsize=(15, 20), image_name="./results/gif_frame_" + ('%03d' % i) + ".jpg")
    
    # Save results
    results['boxes'].append(detections['detection_boxes'][0][0].numpy())
    results['scores'].append(detections['detection_scores'][0][0].numpy())

In [None]:
# TEST CODE

# Compare with expected bounding boxes
print(np.allclose(results['boxes'][0], [0.28838485, 0.06830047, 0.7213766 , 0.19833465], rtol=0.18))
print(np.allclose(results['boxes'][5], [0.29168868, 0.07529271, 0.72504973, 0.20099735], rtol=0.18))
print(np.allclose(results['boxes'][10], [0.29548776, 0.07994056, 0.7238164 , 0.20778716], rtol=0.18))
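
The checks above use a relative tolerance. `np.allclose(a, b, rtol=...)` passes when `|a - b| <= atol + rtol * |b|` holds for every element, with `atol` defaulting to `1e-8`, so small coordinates get a tighter absolute margin than large ones. The sketch below works this out for the first expected box. The shifted predictions are hypothetical and are not real model output.

```python
import numpy as np

expected = np.array([0.28838485, 0.06830047, 0.7213766, 0.19833465])
rtol = 0.18

# Per-coordinate error that np.allclose tolerates (default atol is 1e-8).
allowed_error = 1e-8 + rtol * np.abs(expected)
print(allowed_error)

# Hypothetical predictions: every coordinate shifted by a fixed amount.
small_shift = expected + 0.01
large_shift = expected + 0.02

print(np.allclose(small_shift, expected, rtol=rtol))
print(np.allclose(large_shift, expected, rtol=rtol))
```

The smallest coordinate (about 0.068) allows an error of only about 0.012, so a uniform shift of 0.02 fails even though it is well inside the margin for the larger coordinates.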

In [None]:
# Check percent of frames where a zombie is detected
scores = np.array(results['scores'])
zombie_detected = (np.where(scores > 0.9, 1, 0).sum())/237*100
print(zombie_detected)
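
The percentage above is the share of frames whose top score exceeds 0.9. As a small sketch, the same quantity can be computed as the mean of a boolean mask, which avoids hard-coding the frame count. The five scores below are invented for illustration.

```python
import numpy as np

# Hypothetical top scores for five frames (not real model output).
scores = np.array([0.99, 0.95, 0.42, 0.91, 0.88])
threshold = 0.9

# The mean of a boolean mask is the fraction of True entries.
detected_percent = np.mean(scores > threshold) * 100

print(detected_percent)
```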

In [None]:
# Inspect individual frames

print('Frame 0')
display(IPyImage('./results/gif_frame_000.jpg'))
print()
print('Frame 5')
display(IPyImage('./results/gif_frame_005.jpg'))
print()
print('Frame 10')
display(IPyImage('./results/gif_frame_010.jpg'))

## Create a zip of the zombie-walk images

In [None]:
# Create zombie detections zip file

zipf = zipfile.ZipFile('./zombie.zip', 'w', zipfile.ZIP_DEFLATED)

filenames = glob.glob('./results/gif_frame_*.jpg')
filenames = sorted(filenames)

for filename in filenames:
    zipf.write(filename)

zipf.close()

## Create Zombie animation

In [None]:
# Create zombie detections GIF

imageio.plugins.freeimage.download()

!rm -rf ./zombie-anim.gif

anim_file = './zombie-anim.gif'

filenames = glob.glob('./results/gif_frame_*.jpg')
filenames = sorted(filenames)
last = -1
images = []

for filename in filenames:
    image = imageio.imread(filename)
    images.append(image)

imageio.mimsave(anim_file, images, 'GIF-FI', fps=10)

## Save results file for grading

In [None]:
import pickle

# Remove file if it exists
!rm -f results.data

# Write results to binary file
with open('results.data', 'wb') as filehandle:
    pickle.dump(results['boxes'], filehandle)

print('Done saving!')

In [None]:
from google.colab import files

files.download('results.data')