# EDA + TensorFlow Object Detection API training
This kernel aims to: 
- explore the dataset for the competition [TensorFlow - Help Protect the Great Barrier Reef](https://www.kaggle.com/c/tensorflow-great-barrier-reef/overview/description)
- show how to train a TensorFlow Object Detection API model and do transfer learning for this task (this part is taken from https://www.kaggle.com/khanhlvg/cots-detection-w-tensorflow-object-detection-api/)

# Table of Contents:
* **1. [TensorFlow Object Detection API installation and Libraries](#Libraries)** <br>
* **2. [Data Analysis](#Data_Analysis)** <br>
 2.0  [Helper functions](#Helper_functions) <br>
 2.1. [Video Id](#Video_Id) <br>
 2.2. [Sequences](#Sequences) <br>
 2.3. [Annotations](#Annotations) <br>
 2.4. [Image visualization](#Image_visualization) <br>
* **3. [Data Preparation](#Data_Preparation)** <br>
* **4. [Model](#Model)** <br>
* **5. [Results](#Results)** <br>
* **6. [Zip and Download trained model](#Download)** <br>

<a id="Libraries"></a> <br> 
# **1. TensorFlow Object Detection API installation and Libraries** 

## Install TensorFlow Object Detection API

### Clone github project

In [None]:
!git clone https://github.com/tensorflow/models
    
# Check out a certain commit to ensure that future changes in the TF ODT API codebase won't affect this notebook.
!cd models && git checkout ac8d06519

### Install tensorflow object detection API

In [None]:
%%bash
cd models/research

# Compile protos.
protoc object_detection/protos/*.proto --python_out=.

wget https://storage.googleapis.com/odml-dataset/others/setup.py
pip install -q --user .

# Test if the Object Detection API is working correctly
python object_detection/builders/model_builder_tf2_test.py

## Import dependencies

In [None]:
# Libraries
import os
import io
import json
import sys
import cv2
from PIL import Image, ImageDraw
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib as mply
import matplotlib.patches as patches
import matplotlib.pyplot as plt
import contextlib2
import IPython
import time
import pathlib
import tensorflow as tf
import random

In [None]:
INPUT_DIR = '/kaggle/input/tensorflow-great-barrier-reef/'
sys.path.insert(0, INPUT_DIR)

In [None]:
# The notebook is supposed to run with TF 2.6.0
print(tf.__version__)
print(tf.test.is_gpu_available())
print(tf.config.list_physical_devices('GPU'))

<a id="Data_Analysis"></a> <br> 
# **2. Data Analysis** 

In [None]:
# Read metadata
train_df = pd.read_csv(os.path.join(INPUT_DIR,"train.csv"))
test_df =  pd.read_csv(os.path.join(INPUT_DIR,"test.csv"))

In [None]:
train_df.shape

In [None]:
train_df.head(5)

In [None]:
test_df.shape

In [None]:
test_df

The [train/test].csv files contain the metadata for the images. Most of the test metadata is only available to your notebook upon submission; only the first few rows are available for download.

We have 6 columns (the annotations column is present only in the train data):
* **video_id** - ID number of the video the image was part of. The video ids are not meaningfully ordered.
* **video_frame** - The frame number of the image within the video. Expect to see occasional gaps in the frame number from when the diver surfaced.
* **sequence** - ID of a gap-free subset of a given video. The sequence ids are not meaningfully ordered.
* **sequence_frame** - The frame number within a given sequence.
* **image_id** - ID code for the image, in the format '{video_id}-{video_frame}'
* **annotations** - The bounding boxes of any starfish detections in a string format that can be evaluated directly with Python. Does not use the same format as the predictions you will submit. Not available in test.csv. A bounding box is described by the pixel coordinate (x_min, y_min) of its upper left corner within the image together with its width and height in pixels.
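
As a small illustration of the annotation format described above, the sketch below parses one annotation string and converts each box from (x, y, width, height) to corner coordinates. The example string is made up, and `parse_annotations` and `to_corners` are hypothetical helper names, not part of the competition code. It uses `ast.literal_eval`, which only accepts Python literals, as a safer stand-in for `eval`.

```python
import ast

def parse_annotations(raw):
    """Parse the annotations string of one train.csv row into a list of dicts."""
    # literal_eval only accepts Python literals, so it cannot run arbitrary code.
    return ast.literal_eval(raw)

def to_corners(box):
    """Convert an {x, y, width, height} box to (x_min, y_min, x_max, y_max)."""
    return (box["x"], box["y"], box["x"] + box["width"], box["y"] + box["height"])

# Made-up example row, not taken from the dataset.
raw = "[{'x': 559, 'y': 213, 'width': 50, 'height': 32}]"
boxes = parse_annotations(raw)
corners = [to_corners(b) for b in boxes]
print(corners)
```

A frame without starfish has the string `[]`, which parses to an empty list.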

In [None]:
# Look at the data types
train_df.dtypes

In [None]:
type(train_df['annotations'][0])

In [None]:
# See if the training dataset contains null values
train_df.isnull().sum()

<a id="Helper_functions"></a> <br> 
# **2.0 Helper functions** 

In [None]:
def show_values_on_bars(axs, h_v="v", space=0.4):
    '''Plots the value at the end of each bar of a seaborn barplot.
    axs: the ax of the plot
    h_v: whether the barplot is vertical ("v") or horizontal ("h")'''
    
    def _show_on_single_plot(ax):
        if h_v == "v":
            for p in ax.patches:
                _x = p.get_x() + p.get_width() / 2
                _y = p.get_y() + p.get_height()
                value = int(p.get_height())
                ax.text(_x, _y, format(value, ','), ha="center") 
        elif h_v == "h":
            for p in ax.patches:
                _x = p.get_x() + p.get_width() + float(space)
                _y = p.get_y() + p.get_height()
                value = int(p.get_width())
                ax.text(_x, _y, format(value, ','), ha="left")

    if isinstance(axs, np.ndarray):
        for idx, ax in np.ndenumerate(axs):
            _show_on_single_plot(ax)
    else:
        _show_on_single_plot(axs)

        
#----------------------------------------------
def show_image(path, annot, axs=None):
    '''Shows an image and marks any COTS annotated within the frame.
    path: full path to the .jpg image
    annot: string of the annotation for the coordinates of COTS'''
    
    # This is in case we plot only 1 image
    if axs is None:
        fig, axs = plt.subplots(figsize=(23, 8))
    
    img = plt.imread(path)
    axs.imshow(img)

    if annot:
        for a in eval(annot):
            rect = patches.Rectangle((a["x"], a["y"]), a["width"], a["height"], 
                                     linewidth=3, edgecolor="#FF6103", facecolor='none')
            axs.add_patch(rect)

    axs.axis("off")

<a id="Video_Id"></a> <br> 
# **2.1 Video Id** 

In [None]:
# Count the number of frames in each video
fig_dims = (10, 8)
fig, ax = plt.subplots(figsize=fig_dims)
df1 = train_df["video_id"].value_counts().reset_index()
sns.barplot(data=df1, x="index", y="video_id", ax=ax,
            palette=["r","g","b"])
show_values_on_bars(ax, h_v="v", space=0.1)
ax.set_xlabel("Video ID")
ax.set_ylabel("")
ax.title.set_text("Frequency of Frames per Video")
ax.set_yticks([])

We have only 3 videos. The third video (video_id 2) has the largest number of frames.

<a id="Sequences"></a> <br> 
# **2.2 Sequences** 

In [None]:
groups = train_df.groupby(["video_id","sequence"]).size()

In [None]:
groups

In [None]:
print('sequences first video: {} \nsequences second video: {} \nsequences third video: {}'.format(len(groups[0]),len(groups[1]),len(groups[2])))

We have 3 videos: the first and the second video contain 8 sequences each, while the third video contains only 4 sequences but has more frames (see previous section).

<a id="Annotations"></a> <br> 
# **2.3 Annotations** 

In [None]:
# Calculate the total number of annotations within each frame.
train_df["no_annotations"] = train_df["annotations"].apply(lambda x: len(eval(x)))

In [None]:
fig_dims = (10, 8)
fig, ax = plt.subplots(figsize=fig_dims)
train_df["no_annotations"].hist()
n_of_images = len(train_df)
no_annotations = round(train_df[train_df.no_annotations==0].shape[0])
with_annotations = round(train_df[train_df.no_annotations!=0].shape[0])

In [None]:
print('Total number of frames: {} \nframes with annotations: {} \nframes without annotations: {}'.format(n_of_images,with_annotations,no_annotations))

The histogram shows that most frames have at most 1 annotation. Of the 23501 total frames, 4919 have annotations and 18582 have none.

<a id="Image_visualization"></a> <br> 
# **2.4 Image visualization** 

In [None]:
# Create a "path" column containing full path to the frames
base_folder = os.path.join(INPUT_DIR,"train_images")

train_df["path"] = base_folder + "/video_" + \
                         train_df['video_id'].astype(str) + "/" +\
                         train_df['video_frame'].astype(str) +".jpg"
train_df.head()

In [None]:
# look for an image with some annotations
image_path = list(train_df[train_df["no_annotations"] == 4]["path"])[0]
annotation = list(train_df[train_df["no_annotations"] == 4]["annotations"])[0]
show_image(image_path, annotation, axs=None)

<a id="Data_Preparation"></a> <br> 
# **3. Data Preparation** 

In [None]:
train_df = train_df.loc[train_df["annotations"].astype(str) != "[]"].copy() # remove images without annotations; copy to avoid modifying a view
train_df['annotations'] = train_df['annotations'].apply(eval)


In [None]:
TRAINING_RATIO = 0.8

# Split the dataset so that no sequence is leaked from the training dataset into the validation dataset.
split_index = int(TRAINING_RATIO * len(train_df))
while train_df.iloc[split_index - 1].sequence == train_df.iloc[split_index].sequence:
    split_index += 1
    
# Shuffle both the training and validation datasets.
train_data_df = train_df.iloc[:split_index].sample(frac=1).reset_index(drop=True)
val_data_df = train_df.iloc[split_index:].sample(frac=1).reset_index(drop=True)

print('Training ratio:', 
      float(len(train_data_df)) / (len(train_data_df) + len(val_data_df)))
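
The split above can be sketched on a plain list to make the idea easier to check. This is a minimal sketch, assuming the rows are ordered so that frames of the same sequence are contiguous. `sequence_safe_split` is a hypothetical helper, and the bounds check in the loop is an addition that the cell above does not have.

```python
def sequence_safe_split(sequence_ids, training_ratio):
    """Return an index that splits an ordered list of per-frame sequence ids
    so that no sequence straddles the boundary."""
    split_index = int(training_ratio * len(sequence_ids))
    # Move the boundary forward until it falls between two different sequences.
    while (0 < split_index < len(sequence_ids)
           and sequence_ids[split_index - 1] == sequence_ids[split_index]):
        split_index += 1
    return split_index

# Toy sequence ids (hypothetical), ordered as the frames appear in the table.
sequence_ids = ["a"] * 4 + ["b"] * 5 + ["c"] * 1
split = sequence_safe_split(sequence_ids, 0.8)
train_ids, val_ids = sequence_ids[:split], sequence_ids[split:]
print(split, set(train_ids), set(val_ids))
```

Because the boundary only moves forward, the effective training ratio can end up larger than the requested one, which is why the cell above prints it.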


In [None]:
del train_df

Convert the training and validation dataset into TFRecord format as required by the TensorFlow Object Detection API.
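
The TFRecord entries store box corners as fractions of the image size rather than as pixels. The sketch below shows that conversion on its own. `normalize_box` is a hypothetical helper, the frame size is an assumed example value, and the clamping to [0, 1] is an extra safeguard that is not part of the conversion code in this notebook.

```python
def normalize_box(box, image_width, image_height):
    """Convert a pixel {x, y, width, height} box to normalized
    (xmin, xmax, ymin, ymax) values in the range [0, 1]."""
    xmin = box["x"] / image_width
    xmax = (box["x"] + box["width"]) / image_width
    ymin = box["y"] / image_height
    ymax = (box["y"] + box["height"]) / image_height

    # Clamp so that boxes touching or crossing the border stay within [0, 1].
    def clamp(value):
        return min(max(value, 0.0), 1.0)

    return tuple(clamp(v) for v in (xmin, xmax, ymin, ymax))

# Hypothetical box on an assumed 1280x720 frame.
print(normalize_box({"x": 640, "y": 360, "width": 320, "height": 180}, 1280, 720))
```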

In [None]:
from object_detection.utils import dataset_util
from object_detection.dataset_tools import tf_record_creation_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import label_map_util

def create_tf_example(video_id, video_frame, data_df, image_path):
    """Create a tf.Example entry for a given training image."""
    full_path = os.path.join(image_path, os.path.join(f'video_{video_id}', f'{video_frame}.jpg'))
    with tf.io.gfile.GFile(full_path, 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    if image.format != 'JPEG':
        raise ValueError('Image format not JPEG')

    height = image.size[1] # Image height
    width = image.size[0] # Image width
    filename = f'{video_id}:{video_frame}'.encode('utf8') # Unique id of the image.
    encoded_image_data = None # Encoded image bytes
    image_format = 'jpeg'.encode('utf8') # b'jpeg' or b'png'

    xmins = [] # List of normalized left x coordinates in bounding box (1 per box)
    xmaxs = [] # List of normalized right x coordinates in bounding box
             # (1 per box)
    ymins = [] # List of normalized top y coordinates in bounding box (1 per box)
    ymaxs = [] # List of normalized bottom y coordinates in bounding box
             # (1 per box)
    classes_text = [] # List of string class name of bounding box (1 per box)
    classes = [] # List of integer class id of bounding box (1 per box)
    
    rows = data_df[(data_df.video_id == video_id) & (data_df.video_frame == video_frame)]
    for _, row in rows.iterrows():
        for annotation in row.annotations:
            xmins.append(annotation['x'] / width) 
            xmaxs.append((annotation['x'] + annotation['width']) / width) 
            ymins.append(annotation['y'] / height) 
            ymaxs.append((annotation['y'] + annotation['height']) / height) 

            classes_text.append('COTS'.encode('utf8'))
            classes.append(1)

    tf_example = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_jpg),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    
    return tf_example


def convert_to_tfrecord(data_df, tfrecord_filebase, image_path, num_shards = 10):
    """Convert the object detection dataset to TFRecord as required by the TF ODT API."""
    with contextlib2.ExitStack() as tf_record_close_stack:
        output_tfrecords = tf_record_creation_util.open_sharded_output_tfrecords(
            tf_record_close_stack, tfrecord_filebase, num_shards)
        
        for index, row in data_df.iterrows():
            if index % 500 == 0:
                print('Processed {0} images.'.format(index))
            tf_example = create_tf_example(row.video_id, row.video_frame, data_df, image_path)
            output_shard_index = index % num_shards
            output_tfrecords[output_shard_index].write(tf_example.SerializeToString())
        
        print('Completed processing {0} images.'.format(len(data_df)))

In [None]:
!mkdir dataset
image_path = os.path.join(INPUT_DIR, 'train_images')

# Convert train images to TFRecord
print('Converting TRAIN images...')
convert_to_tfrecord(
  train_data_df,
  'dataset/cots_train',
  image_path,
  num_shards = 4
)

# Convert validation images to TFRecord
print('Converting VALIDATION images...')
convert_to_tfrecord(
  val_data_df,
  'dataset/cots_val',
  image_path,
  num_shards = 4
)

In [None]:
# Create a label map to map between label index and human-readable label name.

label_map_str = """item {
  id: 1
  name: 'COTS'
}"""

with open('dataset/label_map.pbtxt', 'w') as f:
  f.write(label_map_str)

!more dataset/label_map.pbtxt
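
If more classes were ever needed, the label map text could be generated instead of written by hand. This is a minimal sketch with a hypothetical `build_label_map` helper. It assumes class ids start at 1, as in the label map above.

```python
def build_label_map(class_names):
    """Build label map text with one item per class, ids starting at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}" % (class_id, name))
    return "\n".join(items) + "\n"

# For this competition there is a single class.
print(build_label_map(["COTS"]))
```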

<a id="Model"></a> <br> 
# **4. Model** 

## Train an object detection model

I'll use [TensorFlow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection) and an [EfficientDet-D0](https://arxiv.org/pdf/1911.09070v7.pdf) base model and apply transfer learning to train a COTS detection model. 

### Download the pretrained EfficientDet-D0 checkpoint

In [None]:
!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz
!tar -xvzf efficientdet_d0_coco17_tpu-32.tar.gz

## Prepare template for model configuration

For this task we need to copy the content of the pipeline.config file (this file can be found in the dataset repository) and modify some fields:

- *num_classes* should be changed to 1
- *fine_tune_checkpoint* should be set to the checkpoint path
- *label_map_path* inside train_input_reader should be set to the labels file
- *label_map_path* inside eval_input_reader should be set to the labels file
- *input_path* inside train_input_reader should be set to the training dataset
- *input_path* inside eval_input_reader should be set to the validation dataset

Other parameters can be changed to optimize the model
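
The configuration text in this notebook is filled in with `string.Template`, which replaces `$name` placeholders with values. The sketch below shows the mechanism on a shortened, hypothetical excerpt rather than the full configuration. It also shows that `substitute` raises `KeyError` when a placeholder is left without a value, which catches a forgotten parameter before training starts.

```python
from string import Template

# Shortened stand-in for the full configuration text (hypothetical excerpt).
snippet = """train_config: {
  num_steps: $training_steps
  warmup_steps: $warmup_steps
}"""

filled = Template(snippet).substitute(training_steps=1000, warmup_steps=100)
print(filled)

# Leaving out a value makes substitute() fail instead of writing a broken file.
try:
    Template(snippet).substitute(training_steps=1000)
    missing = None
except KeyError as err:
    missing = err.args[0]
print("missing placeholder:", missing)
```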

In [None]:
from string import Template

config_file_template = """
# SSD with EfficientNet-b0 + BiFPN feature extractor,
# shared box predictor and focal loss (a.k.a EfficientDet-d0).
# See EfficientDet, Tan et al, https://arxiv.org/abs/1911.09070
# See Lin et al, https://arxiv.org/abs/1708.02002
# Initialized from an EfficientDet-D0 checkpoint.
#
# Train on GPU

model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    num_classes: 1
    add_background_class: false
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    encode_background_as_zeros: true
    anchor_generator {
      multiscale_anchor_generator {
        min_level: 3
        max_level: 7
        anchor_scale: 4.0
        aspect_ratios: [1.0, 2.0, 0.5]
        scales_per_octave: 3
      }
    }
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 1280
        max_dimension: 1280
        pad_to_max_dimension: true
        }
    }
    box_predictor {
      weight_shared_convolutional_box_predictor {
        depth: 64
        class_prediction_bias_init: -4.6
        conv_hyperparams {
          force_use_bias: true
          activation: SWISH
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            random_normal_initializer {
              stddev: 0.01
              mean: 0.0
            }
          }
          batch_norm {
            scale: true
            decay: 0.99
            epsilon: 0.001
          }
        }
        num_layers_before_predictor: 3
        kernel_size: 3
        use_depthwise: true
      }
    }
    feature_extractor {
      type: 'ssd_efficientnet-b0_bifpn_keras'
      bifpn {
        min_level: 3
        max_level: 7
        num_iterations: 3
        num_filters: 64
      }
      conv_hyperparams {
        force_use_bias: true
        activation: SWISH
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          scale: true,
          decay: 0.99,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.25
          gamma: 1.5
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.5
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  fine_tune_checkpoint: "efficientdet_d0_coco17_tpu-32/checkpoint/ckpt-0"
  fine_tune_checkpoint_version: V2
  fine_tune_checkpoint_type: "detection"
  batch_size: 2
  sync_replicas: false
  startup_delay_steps: 0
  replicas_to_aggregate: 1
  use_bfloat16: false
  num_steps: $training_steps
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_scale_crop_and_pad_to_square {
      output_size: 1280
      scale_min: 0.5
      scale_max: 2.0
    }
  }
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 5e-3
          total_steps: $training_steps
          warmup_learning_rate: 5e-4
          warmup_steps: $warmup_steps
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
}

train_input_reader: {
  label_map_path: "dataset/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "dataset/cots_train-?????-of-00004"
  }
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
  batch_size: 2;
}

eval_input_reader: {
  label_map_path: "dataset/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "dataset/cots_val-?????-of-00004"
  }
}
"""

### Define the training pipeline

In [None]:
# Here I redefine the training and warmup steps
# Note: setting TRAINING_STEPS = 20000 and WARMUP_STEPS = 2000 gave me a score of 0.335
TRAINING_STEPS = 28000 # change to improve results
WARMUP_STEPS = 2000 # change to improve results
PIPELINE_CONFIG_PATH='dataset/pipeline.config'

pipeline = Template(config_file_template).substitute(
    training_steps=TRAINING_STEPS, warmup_steps=WARMUP_STEPS)

with open(PIPELINE_CONFIG_PATH, 'w') as f:
    f.write(pipeline)
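
To get a feeling for how `TRAINING_STEPS` and `WARMUP_STEPS` interact with the learning rate values in the configuration, the sketch below computes a cosine decay schedule with linear warmup in plain Python. It assumes the usual shape (linear ramp from the warmup rate to the base rate, then a half cosine down to zero). The API's own implementation may differ in details, so treat this as an approximation rather than the exact schedule.

```python
import math

def cosine_decay_with_warmup(step, base_lr, total_steps, warmup_lr, warmup_steps):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step > total_steps:
        return 0.0
    if step < warmup_steps:
        # Linear ramp from warmup_lr up to base_lr.
        slope = (base_lr - warmup_lr) / warmup_steps
        return warmup_lr + slope * step
    # Fraction of the decay phase that has been completed, between 0 and 1.
    progress = (step - warmup_steps) / float(total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Values taken from the configuration and the cell above.
for step in (0, 1000, 2000, 15000, 28000):
    print(step, cosine_decay_with_warmup(step, 5e-3, 28000, 5e-4, 2000))
```

Changing the total number of steps stretches or compresses the whole curve, so the two values are best adjusted together.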

## Train the object detection model

In [None]:
MODEL_DIR='cots_efficientdet_d0'
!mkdir {MODEL_DIR}
!python models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={PIPELINE_CONFIG_PATH} \
    --model_dir={MODEL_DIR} \
    --alsologtostderr

## Evaluate the object detection model

In [None]:
!python models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={PIPELINE_CONFIG_PATH} \
    --model_dir={MODEL_DIR} \
    --checkpoint_dir={MODEL_DIR} \
    --eval_timeout=0 \
    --alsologtostderr

## Export as SavedModel for inference

In [None]:
!python models/research/object_detection/exporter_main_v2.py \
    --input_type image_tensor \
    --pipeline_config_path={PIPELINE_CONFIG_PATH} \
    --trained_checkpoint_dir={MODEL_DIR} \
    --output_directory={MODEL_DIR}/output

In [None]:
!ls {MODEL_DIR}/output

In [None]:
MODEL_DIR

<a id="Results"></a> <br> 
# **5. Results** 

In [None]:
# Define some utility functions for prediction and image display
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
    path: a file path (this can be local or on colossus)

    Returns:
    uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(io.BytesIO(img_data))
    (im_width, im_height) = image.size
    
    return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)

def detect(image_np, model):
    """Detect COTS from a given numpy image."""

    input_tensor = np.expand_dims(image_np, 0)
    start_time = time.time()
    detections = model(input_tensor)
    return detections

# redefine function to show images
def show_image(path, annot, axs=None):
    '''Shows an image and marks any COTS annotated within the frame.
    path: full path to the .jpg image
    annot: string of the annotation for the coordinates of COTS'''
        
    # This is in case we plot only 1 image
    if axs is None:
        fig, axs = plt.subplots(figsize=(23, 8))
        
    img = plt.imread(path)
    axs.imshow(img)

    if annot:
        for a in annot:
            rect = patches.Rectangle((a["x"], a["y"]), a["width"], a["height"], 
                                     linewidth=3, edgecolor="#FF6103", facecolor='none')
            axs.add_patch(rect)

    axs.axis("off") 
  
# Display the predicted detection boxes on an image
def disp_prediction(path, detections, detection_threshold, axs=None):
    '''Shows an image and marks the predicted COTS boxes whose score reaches the threshold.
    path: full path to the .jpg image
    detections: output dictionary of the detection model
    detection_threshold: minimum score for a box to be drawn'''
    
    # Use the image at `path`, not the global `image_path`, to get the frame size
    image_np = load_image_into_numpy_array(path)
    height, width, _ = image_np.shape
    
    num = len(detections['detection_boxes'].numpy()[0])
    detection_array = detections['detection_boxes'].numpy()[0]
    
    
    # This is in case we plot only 1 image
    if axs is None:
        fig, axs = plt.subplots(figsize=(23, 8))
    
    img = plt.imread(path)
    axs.imshow(img)

    if detection_array is not None:
        for i in range(0, num):
            score = detections['detection_scores'][0][i].numpy()
            
            if score < detection_threshold:
                continue
        
            bbox = detection_array[i]
            y_min = int(bbox[0] * height)
            x_min = int(bbox[1] * width)
            y_max = int(bbox[2] * height)
            x_max = int(bbox[3] * width)
                                   
            bbox_width = x_max - x_min
            bbox_height = y_max - y_min
                                   
            rect = patches.Rectangle((x_min, y_min), bbox_width, bbox_height, 
                                     linewidth=3, edgecolor="#FF6103", facecolor='none')
            axs.add_patch(rect)

    axs.axis("off")

In [None]:
# Load the TensorFlow COTS detection model into memory.
start_time = time.time()
tf.keras.backend.clear_session()
detect_fn_tf_odt = tf.saved_model.load(os.path.join(os.path.join(MODEL_DIR, 'output'), 'saved_model'))
end_time = time.time()
elapsed_time = end_time - start_time
print('Elapsed time: ' + str(elapsed_time) + 's')

In [None]:
# look for an image with some annotations
image_path = list(train_data_df[train_data_df["no_annotations"] == 4]["path"])[0]
annotation = list(train_data_df[train_data_df["no_annotations"] == 4]["annotations"])[0]
show_image(image_path, annotation, axs=None)

In [None]:
image_np = load_image_into_numpy_array(image_path)
detections = detect(image_np, detect_fn_tf_odt)

In [None]:
detection_threshold = 0.3
disp_prediction(image_path, detections, detection_threshold)
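
The box conversion performed inside `disp_prediction` can be looked at in isolation. The sketch below takes normalized `[ymin, xmin, ymax, xmax]` boxes with their scores and returns pixel boxes in the same `x, y, width, height` layout as the annotations. `detections_to_pixel_boxes` is a hypothetical helper, and the input values are made up rather than real model output.

```python
def detections_to_pixel_boxes(boxes, scores, image_width, image_height, threshold):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel
    {x, y, width, height} dicts, keeping only scores >= threshold."""
    results = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < threshold:
            continue
        x = int(xmin * image_width)
        y = int(ymin * image_height)
        results.append({
            "x": x,
            "y": y,
            "width": int(xmax * image_width) - x,
            "height": int(ymax * image_height) - y,
            "score": score,
        })
    return results

# Made-up detections for an assumed 1280x720 frame.
boxes = [[0.25, 0.5, 0.5, 0.75], [0.0, 0.0, 0.1, 0.1]]
scores = [0.9, 0.1]
kept = detections_to_pixel_boxes(boxes, scores, 1280, 720, threshold=0.3)
print(kept)
```

Lowering the threshold keeps more boxes, at the cost of more false positives.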

In [None]:
threshold = 0.25
# Visualize several images at the same time
def compare_predictions_real_detection(number_images, detection_threshold):
    # Randomly select some images from the validation set.
    # Display the original image with its annotations on the left
    # and the predictions on the right.
    indices = []
    for i in range(0, number_images):
        indices.append(random.randint(0, len(val_data_df)-1))
    fig, axs = plt.subplots(number_images, 2,figsize=(20,20))
    for i in range(0, number_images):
        index = indices[i]
        image_path = val_data_df["path"][index]
        image_np = load_image_into_numpy_array(image_path)
        detections = detect(image_np, detect_fn_tf_odt)
        annotations = val_data_df["annotations"][index]
        if number_images>1:
            show_image(image_path, annotations, axs=axs[i, 0])
            axs[i, 0].set_title('Image with annotations')
            disp_prediction(image_path, detections, detection_threshold, axs=axs[i, 1])
            axs[i, 1].set_title('Image with predicted bounding boxes')
        else:
            show_image(image_path, annotations, axs=axs[0])
            axs[0].set_title('Image with annotations')
            disp_prediction(image_path, detections, detection_threshold, axs=axs[1])
            axs[1].set_title('Image with predicted bounding boxes')

Display some detections and original images with annotations

In [None]:
number_images = 5
compare_predictions_real_detection(number_images, threshold)

Results can be improved by fine-tuning the model further.

<a id="Download"></a> <br> 
# **6. Zip and download trained model** 
Here I zip and download the trained model. I will do the inference in a second notebook, since the competition rules don't allow an internet connection.

In [None]:
!ls

In [None]:
!zip -r trained_model.zip /kaggle/working/cots_efficientdet_d0
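
The same archive can also be produced from Python, which avoids depending on the `zip` command being installed. This is a minimal sketch using `shutil.make_archive` from the standard library. `zip_directory` is a hypothetical helper, and the demo runs on a throwaway directory that stands in for the trained model folder.

```python
import os
import shutil
import tempfile
import zipfile

def zip_directory(source_dir, archive_base):
    """Create <archive_base>.zip from source_dir and return the archive path."""
    source_dir = os.path.abspath(source_dir)
    parent = os.path.dirname(source_dir)
    folder = os.path.basename(source_dir)
    # root_dir/base_dir keep the top-level folder name inside the archive.
    return shutil.make_archive(archive_base, "zip", root_dir=parent, base_dir=folder)

# Demo on a throwaway directory (stand-in for the trained model folder).
work = tempfile.mkdtemp()
model_dir = os.path.join(work, "demo_model")
os.makedirs(model_dir)
with open(os.path.join(model_dir, "checkpoint.txt"), "w") as f:
    f.write("placeholder")

archive_path = zip_directory(model_dir, os.path.join(work, "trained_model"))
with zipfile.ZipFile(archive_path) as archive:
    names = archive.namelist()
print(os.path.basename(archive_path), names)
```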

In [None]:
from IPython.display import FileLink
FileLink(r'trained_model.zip')

The model can then be downloaded and opened in a new notebook to make a submission.