NOTE: to obtain the most recent version of this notebook, please copy from 

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1U3fkRu6-hwjk7wWIpg-iylL2u5T9t7rr#scrollTo=lsT4-_Eq45Ww)


## **Training Faster R-CNN Object Detection on a Custom Dataset**

### **Overview**

This notebook walks through how to train a Faster R-CNN object detection model using the TensorFlow Object Detection API.

In this specific example, we'll train an object detection model to recognize cell types: white blood cells, red blood cells, and platelets. **To adapt this example to train on your own dataset, you only need to change two lines of code in this notebook.**

Everything in this notebook is also hosted on this [GitHub repo](https://github.com/roboflow-ai/tensorflow-object-detection-faster-rcnn).

![Blood Cell Example](https://i.imgur.com/QwyX2aD.png)

**Credit to [DLology](https://www.dlology.com/blog/how-to-train-an-object-detection-model-easy-for-free/) and [Tony607](https://github.com/Tony607)**, who wrote the first notebook on which much of this example is based. 

### **Our Data**

We'll be using an open source cell dataset called BCCD (Blood Cell Count and Detection). Our dataset contains 364 images (and 4888 annotations!) and is hosted publicly on Roboflow [here](https://public.roboflow.ai/object-detection/bccd).

When adapting this example to your own data, create two datasets in Roboflow: `train` and `test`. Use Roboflow to generate TFRecords for each, replace their URLs in this notebook, and you're able to train on your own custom dataset.

### **Our Model**

We'll be training a Faster R-CNN neural network. Faster R-CNN is a two-stage detector: a region proposal network (RPN) first identifies regions of interest on shared convolutional feature maps, and a second stage then classifies each region and refines its bounding box. Classification uses a softmax layer (the support vector machine belongs to the original R-CNN, not Faster R-CNN), and the box refinement is learned by regressing predicted bounding boxes toward the ground truth bounding boxes. (Consider [this](https://towardsdatascience.com/faster-r-cnn-object-detection-implemented-by-keras-for-custom-data-from-googles-open-images-125f62b9141a) deep dive for more!)
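
To make the box-regression step concrete, here is a minimal sketch of the target parameterization described in the Faster R-CNN paper, where a ground truth box is encoded relative to an anchor (or proposal) box. The function names `encode_box` and `decode_box` and the `(cx, cy, w, h)` tuple layout are illustrative assumptions, not the TensorFlow Object Detection API's own code.

```python
import math


def encode_box(gt, anchor):
    """Encode a ground truth box relative to an anchor box.

    Both boxes are (cx, cy, w, h): center x, center y, width, height.
    Returns the regression targets (tx, ty, tw, th).
    """
    gx, gy, gw, gh = gt
    ax, ay, aw, ah = anchor
    tx = (gx - ax) / aw          # horizontal shift, in anchor widths
    ty = (gy - ay) / ah          # vertical shift, in anchor heights
    tw = math.log(gw / aw)       # log-scale width ratio
    th = math.log(gh / ah)       # log-scale height ratio
    return tx, ty, tw, th


def decode_box(targets, anchor):
    """Invert encode_box: turn regression outputs back into a box."""
    tx, ty, tw, th = targets
    ax, ay, aw, ah = anchor
    return (ax + tx * aw, ay + ty * ah, aw * math.exp(tw), ah * math.exp(th))


# Hypothetical anchor and ground truth box, in pixels.
anchor = (100.0, 100.0, 50.0, 80.0)
gt = (110.0, 96.0, 60.0, 72.0)

targets = encode_box(gt, anchor)
recovered = decode_box(targets, anchor)
print(targets)
print(recovered)
```

The network learns to predict `targets`; at inference time the decode step turns its predictions back into image-space boxes.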

The model architecture is one of many available via TensorFlow's [model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models).

### **Training**

Google Colab provides free GPU resources. Click "Runtime" → "Change runtime type" → Hardware Accelerator dropdown to "GPU."
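
After switching the runtime, it can help to confirm that a GPU is actually visible. The following is a minimal sketch, assuming the NVIDIA driver tool `nvidia-smi` is on the `PATH` (as it typically is on Colab GPU runtimes); the helper name `list_gpus` is hypothetical.

```python
import shutil
import subprocess


def list_gpus():
    """Return the lines printed by `nvidia-smi -L`, or [] if no GPU tool is found."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
print(gpus if gpus else "No GPU detected; check the runtime type.")
```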

Colab does have memory limitations, and notebooks must be open in your browser to run. Sessions automatically clear themselves after 12 hours.

### **Inference**

We'll run inference directly in this notebook, and on three test images contained in the "test" folder from our GitHub repo. 

When adapting to your own dataset, you'll need to add test images to the `test` folder located at `tensorflow-object-detection/test`.

### **About**

[Roboflow](https://roboflow.ai) makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, automate labelling quality assurance, save training time, and increase model reproducibility.

#### ![Roboflow Workmark](https://i.imgur.com/WHFqYSJ.png)







In [None]:
!pip install tensorflow_gpu==1.15

Collecting tensorflow_gpu==1.15
  Downloading https://files.pythonhosted.org/packages/a5/ad/933140e74973fb917a194ab814785e7c23680ca5dee6d663a509fe9579b6/tensorflow_gpu-1.15.0-cp36-cp36m-manylinux2010_x86_64.whl (411.5MB)
     |████████████████████████████████| 411.5MB 41kB/s 
Collecting tensorboard<1.16.0,>=1.15.0
  Downloading https://files.pythonhosted.org/packages/1e/e9/d3d747a97f7188f48aa5eda486907f3b345cd409f0a0850468ba867db246/tensorboard-1.15.0-py3-none-any.whl (3.8MB)
     |████████████████████████████████| 3.8MB 31.6MB/s 
Collecting gast==0.2.2
  Downloading https://files.pythonhosted.org/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz
Collecting tensorflow-estimator==1.15.1
  Downloading https://files.pythonhosted.org/packages/de/62/2ee9cd74c9fa2fa450877847ba560b260f5d0fb70ee0595203082dafcc9d/tensorflow_estimator-1.15.1-py2.py3-none-any.whl (503kB)
     |████████████████████████████████| 512k

## Configs and Hyperparameters

The TensorFlow Object Detection API supports a variety of models. You can find more pretrained models in the [Tensorflow detection model zoo: COCO-trained models](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models), as well as their pipeline config files in [object_detection/samples/configs/](https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs).

In [None]:
# If you forked the repo, you can replace the link.
repo_url = 'https://github.com/roboflow-ai/tensorflow-object-detection-faster-rcnn'

# Number of training steps - 1000 will train very quickly, but more steps will increase accuracy.
num_steps = 200000  # 200000 to improve

# Number of evaluation steps.
num_eval_steps = 50

MODELS_CONFIG = {
    'ssd_mobilenet_v2': {
        'model_name': 'ssd_mobilenet_v2_coco_2018_03_29',
        'pipeline_file': 'ssd_mobilenet_v2_coco.config',
        'batch_size': 12
    },
    'faster_rcnn_inception_v2': {
        'model_name': 'faster_rcnn_inception_v2_coco_2018_01_28',
        'pipeline_file': 'faster_rcnn_inception_v2_pets.config',
        'batch_size': 12
    },
    'rfcn_resnet101': {
        'model_name': 'rfcn_resnet101_coco_2018_01_28',
        'pipeline_file': 'rfcn_resnet101_pets.config',
        'batch_size': 8
    }
}

# Pick the model you want to use: select a key from `MODELS_CONFIG`.
selected_model = 'faster_rcnn_inception_v2'

# Name of the object detection model to use.
MODEL = MODELS_CONFIG[selected_model]['model_name']

# Name of the pipeline file in tensorflow object detection API.
pipeline_file = MODELS_CONFIG[selected_model]['pipeline_file']

# Training batch size that fits in Colab's Tesla K80 GPU memory for the selected model.
batch_size = MODELS_CONFIG[selected_model]['batch_size']

## Clone the `tensorflow-object-detection` repository or your fork.

In [None]:
import os

%cd /content

repo_dir_path = os.path.abspath(os.path.join('.', os.path.basename(repo_url)))

!git clone {repo_url}
%cd {repo_dir_path}
!git pull

/content
fatal: destination path 'tensorflow-object-detection-faster-rcnn' already exists and is not an empty directory.
/content/tensorflow-object-detection-faster-rcnn
Already up to date.


## Install required packages

In [None]:
%cd /content
!git clone --quiet https://github.com/tensorflow/models.git

!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk

!pip install -q Cython contextlib2 pillow lxml matplotlib

!pip install -q pycocotools

%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

import os
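# Append the API paths; .get() avoids a KeyError if PYTHONPATH is not set yet.
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ':/content/models/research/:/content/models/research/slim/'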

!pip install tf_slim

!python object_detection/builders/model_builder_test.py

/content
Selecting previously unselected package python-bs4.
(Reading database ... 144328 files and directories currently installed.)
Preparing to unpack .../0-python-bs4_4.6.0-1_all.deb ...
Unpacking python-bs4 (4.6.0-1) ...
Selecting previously unselected package python-pkg-resources.
Preparing to unpack .../1-python-pkg-resources_39.0.1-2_all.deb ...
Unpacking python-pkg-resources (39.0.1-2) ...
Selecting previously unselected package python-chardet.
Preparing to unpack .../2-python-chardet_3.0.4-1_all.deb ...
Unpacking python-chardet (3.0.4-1) ...
Selecting previously unselected package python-six.
Preparing to unpack .../3-python-six_1.11.0-2_all.deb ...
Unpacking python-six (1.11.0-2) ...
Selecting previously unselected package python-webencodings.
Preparing to unpack .../4-python-webencodings_0.5-2_all.deb ...
Unpacking python-webencodings (0.5-2) ...
Selecting previously unselected package python-html5lib.
Preparing to unpack .../5-python-html5lib_0.999999999-1_all.deb ...
Unpa

## Prepare `tfrecord` files

Roboflow automatically creates our TFRecord and label_map files that we need!

**Generating your own TFRecords is the only step you need to change for your own custom dataset.**

Because we need one TFRecord file for our training data, and one TFRecord file for our test data, we'll create two separate datasets in Roboflow and generate one set of TFRecords for each.

To create a dataset in Roboflow and generate TFRecords, follow [this step-by-step guide](https://blog.roboflow.ai/getting-started-with-roboflow/).
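
Once a TFRecord file has been downloaded, a quick sanity check is to count how many records it holds. The sketch below walks the TFRecord framing directly (an 8-byte little-endian length, a 4-byte checksum of the length, the payload, then a 4-byte checksum of the payload) using only the standard library. It skips checksum verification, so it is a rough check rather than a validator. The helper names and the demo file are hypothetical, and the demo writes placeholder checksums that TensorFlow itself would reject.

```python
import os
import struct
import tempfile


def count_tfrecords(path):
    """Count records in a TFRecord file by walking its length-prefixed framing."""
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("Truncated record header")
            (length,) = struct.unpack("<Q", header)
            f.seek(4, 1)             # skip the checksum of the length field
            data = f.read(length)    # record payload (a serialized tf.Example)
            footer = f.read(4)       # checksum of the payload (not verified here)
            if len(data) < length or len(footer) < 4:
                raise ValueError("Truncated record")
            count += 1
    return count


def _write_fake_record(f, payload):
    """Write one record with zeroed checksums (for this demo only)."""
    f.write(struct.pack("<Q", len(payload)))
    f.write(b"\x00" * 4)
    f.write(payload)
    f.write(b"\x00" * 4)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.tfrecord")
    with open(demo_path, "wb") as f:
        _write_fake_record(f, b"first")
        _write_fake_record(f, b"second example")
    demo_count = count_tfrecords(demo_path)

print(demo_count)
```

For the BCCD data, the count for the training file should match the number of training images in your Roboflow export.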

In [None]:
%cd /content/tensorflow-object-detection-faster-rcnn/data

/content/tensorflow-object-detection-faster-rcnn/data


In [None]:
# UPDATE THIS LINK - get our data from Roboflow
# The URL is quoted so the shell does not treat `?` as a glob character.
!curl -L "https://public.roboflow.ai/ds/E301mdvkk3?key=k4PF0rxMgl" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   891  100   891    0     0   1280      0 --:--:-- --:--:-- --:--:--  1280
100 3731k  100 3731k    0     0  2604k      0  0:00:01  0:00:01 --:--:-- 7027k
Archive:  roboflow.zip
 extracting: test/People.tfrecord    
 extracting: train/People.tfrecord   
 extracting: valid/People.tfrecord   
 extracting: test/People_label_map.pbtxt  
 extracting: train/People_label_map.pbtxt  
 extracting: valid/People_label_map.pbtxt  
 extracting: README.roboflow.txt     
 extracting: README.dataset.txt      


In [None]:
%ls

FYI.txt  README.dataset.txt  README.roboflow.txt  test/  train/  valid/


In [None]:
# check out what we have in train
%ls train

People_label_map.pbtxt  People.tfrecord


In [None]:
# show what we have in test
%ls test

People_label_map.pbtxt  People.tfrecord


## Prepare `tfrecord` files from Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


In [None]:
# NOTE: Update these paths to point to your own TFRecord and label map files!
test_record_fname = '/content/drive/My Drive/Colab Notebooks/TIL/CV/input/tfrecord/coco_val.record-00000-of-00001'
train_record_fname = '/content/drive/My Drive/Colab Notebooks/TIL/CV/input/tfrecord/coco_train.record-00000-of-00001'
label_map_pbtxt_fname = '/content/drive/My Drive/Colab Notebooks/TIL/CV/input/tfrecord/tfrecord_label_map.pbtxt'

## Download base model

In [None]:
%cd /content/models/research

import os
import shutil
import glob
import urllib.request
import tarfile
MODEL_FILE = MODEL + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
DEST_DIR = '/content/models/research/pretrained_model'

if not (os.path.exists(MODEL_FILE)):
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

tar = tarfile.open(MODEL_FILE)
tar.extractall()
tar.close()

os.remove(MODEL_FILE)
if (os.path.exists(DEST_DIR)):
    shutil.rmtree(DEST_DIR)
os.rename(MODEL, DEST_DIR)

/content/models/research


In [None]:
!echo {DEST_DIR}
!ls -alh {DEST_DIR}

/content/models/research/pretrained_model
total 111M
drwxr-xr-x  3 345018 5000 4.0K Feb  1  2018 .
drwxr-xr-x 64 root   root 4.0K Jun 14 14:25 ..
-rw-r--r--  1 345018 5000   77 Feb  1  2018 checkpoint
-rw-r--r--  1 345018 5000  55M Feb  1  2018 frozen_inference_graph.pb
-rw-r--r--  1 345018 5000  51M Feb  1  2018 model.ckpt.data-00000-of-00001
-rw-r--r--  1 345018 5000  16K Feb  1  2018 model.ckpt.index
-rw-r--r--  1 345018 5000 5.5M Feb  1  2018 model.ckpt.meta
-rw-r--r--  1 345018 5000 3.2K Feb  1  2018 pipeline.config
drwxr-xr-x  3 345018 5000 4.0K Feb  1  2018 saved_model


In [None]:
fine_tune_checkpoint = os.path.join(DEST_DIR, "model.ckpt")
fine_tune_checkpoint

'/content/models/research/pretrained_model/model.ckpt'

## Configuring a Training Pipeline

In [None]:
import os
pipeline_fname = os.path.join('/content/models/research/object_detection/samples/configs/', pipeline_file)

assert os.path.isfile(pipeline_fname), '`{}` does not exist'.format(pipeline_fname)

In [None]:
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
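
If the Object Detection API is not importable yet, the class count can also be read straight from the label map text. This is a minimal sketch that assumes the usual `item { id: ... name: '...' }` layout; `parse_label_map` is a hypothetical helper, and the sample names are illustrative stand-ins for the three blood cell classes.

```python
import re


def parse_label_map(text):
    """Return {id: name} for each `item { ... }` block in label map text."""
    labels = {}
    for block in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        id_match = re.search(r"\bid:\s*(\d+)", block)
        name_match = re.search(r"\bname:\s*['\"](.*?)['\"]", block)
        if id_match and name_match:
            labels[int(id_match.group(1))] = name_match.group(1)
    return labels


# Illustrative label map contents (not read from disk).
sample = """
item {
  id: 1
  name: 'Platelets'
}
item {
  id: 2
  name: 'RBC'
}
item {
  id: 3
  name: 'WBC'
}
"""

labels = parse_label_map(sample)
print(len(labels), labels)
```

The length of the returned dictionary plays the same role as `num_classes` in the pipeline config.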

In [None]:
import re

num_classes = get_num_classes(label_map_pbtxt_fname)  # Can actually just sub with no. of classes

# Rewrite the pipeline config in place so it points at our checkpoint, TFRecords, and label map
with open(pipeline_fname) as f:
    s = f.read()
with open(pipeline_fname, 'w') as f:

    # fixed_shape_resizer - changed as aspect_ratio_resizer causes image shape errors
    # Refer to: https://github.com/tensorflow/tensorflow/issues/34544
    s = re.sub('''keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }''', 
      '''fixed_shape_resizer {
        height: 600
        width: 800
      }''', s)

    # # keep_aspect_ratio_resizer
    # s = re.sub('max_dimension: \d*',
    #            'max_dimension: {}'.format(1024), s)
    
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"',
               'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
    
    # tfrecord files train and test.
    s = re.sub(
        '(input_path: ".*?)(train.record)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub(
        '(input_path: ".*?)(val.record)(.*?")', 'input_path: "{}"'.format(test_record_fname), s)

    # label_map_path
    s = re.sub(
        'label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)

    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+',
               'batch_size: {}'.format(batch_size), s)

    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+',
               'num_steps: {}'.format(num_steps), s)
    
    # Set number of classes num_classes.
    s = re.sub('num_classes: [0-9]+',
               'num_classes: {}'.format(num_classes), s)
    f.write(s)
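
The substitutions in this cell are easier to reason about on a small string. The sketch below applies the same kind of `re.sub` patching to a made-up config fragment, with one addition that is an assumption of this example rather than part of the cell: it uses `re.subn` so that a pattern matching nothing raises an error instead of passing silently. The names `replace_field`, `patch_config`, and the sample values are hypothetical.

```python
import re

# A made-up fragment in the style of a pipeline config.
sample_config = '''model {
  faster_rcnn {
    num_classes: 37
  }
}
train_config {
  batch_size: 1
  num_steps: 200000
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
}
'''


def replace_field(text, pattern, replacement):
    """Apply one substitution and fail loudly if the pattern never matched."""
    # A function replacement keeps backslashes in paths from being
    # interpreted as regex escapes.
    new_text, count = re.subn(pattern, lambda m: replacement, text)
    if count == 0:
        raise ValueError('Pattern not found: {}'.format(pattern))
    return new_text


def patch_config(text, num_classes, batch_size, num_steps, checkpoint):
    text = replace_field(text, r'num_classes: [0-9]+',
                         'num_classes: {}'.format(num_classes))
    text = replace_field(text, r'batch_size: [0-9]+',
                         'batch_size: {}'.format(batch_size))
    text = replace_field(text, r'num_steps: [0-9]+',
                         'num_steps: {}'.format(num_steps))
    text = replace_field(text, r'fine_tune_checkpoint: ".*?"',
                         'fine_tune_checkpoint: "{}"'.format(checkpoint))
    return text


patched = patch_config(sample_config, num_classes=3, batch_size=12,
                       num_steps=1000, checkpoint='/tmp/model.ckpt')
print(patched)
```

Checking the match count is useful here because the sample configs differ between models, and a silently unmatched pattern leaves a placeholder path in the file.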

In [None]:
!cat {pipeline_fname}

# Faster R-CNN with Inception v2, configured for Oxford-IIIT Pets Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  faster_rcnn {
    num_classes: 5
    image_resizer {
      fixed_shape_resizer {
        height: 600
        width: 800
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_v2'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initiali

In [None]:
model_dir = '/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/Roboflow_tf_save_folder/'
# Optionally remove content in output model directory to fresh start.
# !rm -rf {model_dir}
os.makedirs(model_dir, exist_ok=True)

## Run TensorBoard (Optional)

In [None]:
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip -o ngrok-stable-linux-amd64.zip

--2020-06-19 06:46:33--  https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
Resolving bin.equinox.io (bin.equinox.io)... 52.20.175.105, 54.208.57.0, 34.225.3.211, ...
Connecting to bin.equinox.io (bin.equinox.io)|52.20.175.105|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13773305 (13M) [application/octet-stream]
Saving to: ‘ngrok-stable-linux-amd64.zip.1’


2020-06-19 06:46:34 (17.7 MB/s) - ‘ngrok-stable-linux-amd64.zip.1’ saved [13773305/13773305]

Archive:  ngrok-stable-linux-amd64.zip
  inflating: ngrok                   


In [None]:
LOG_DIR = model_dir
get_ipython().system_raw(
    'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'
    .format(LOG_DIR)
)

In [None]:
get_ipython().system_raw('./ngrok http 6006 &')

### Get TensorBoard link

In [None]:
! curl -s http://localhost:4040/api/tunnels | python3 -c \
    "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

https://57f8a2d1e88e.ngrok.io


## Train the model

In [None]:
# Paths are quoted because the Google Drive model_dir contains spaces.
!python /content/models/research/object_detection/model_main.py \
    --pipeline_config_path="{pipeline_fname}" \
    --model_dir="{model_dir}" \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --num_eval_steps={num_eval_steps}

W0614 15:09:03.489730 139873361508224 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 10000
I0614 15:09:03.489978 139873361508224 config_util.py:523] Maybe overwriting train_steps: 10000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0614 15:09:03.490128 139873361508224 config_util.py:523] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0614 15:09:03.490294 139873361508224 config_util.py:523] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0614 15:09:03.490489 139873361508224 config_util.py:523] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0614 15:09:03.490665 139873361508224 config_util.py:523] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0614 15:09:03.490848 139873361508224 config_util.py

In [None]:
# List the checkpoints written so far (quoted because the path may contain spaces)
!ls "{model_dir}"

checkpoint				     model.ckpt-0.index
events.out.tfevents.1592144802.e43594651d21  model.ckpt-0.meta
events.out.tfevents.1592145810.e43594651d21  model.ckpt-446.data-00000-of-00001
events.out.tfevents.1592147393.e43594651d21  model.ckpt-446.index
graph.pbtxt				     model.ckpt-446.meta
model.ckpt-0.data-00000-of-00001


## Exporting a Trained Inference Graph
Once your training job is complete, you need to extract the newly trained inference graph, which will later be used to perform object detection. This can be done as follows:

In [None]:
import re
import numpy as np

output_directory = './fine_tuned_model'

# Find the checkpoint with the highest step number in model_dir.
lst = os.listdir(model_dir)
lst = [name for name in lst if 'model.ckpt-' in name and '.meta' in name]
steps = np.array([int(re.findall(r'\d+', name)[0]) for name in lst])
last_model = lst[steps.argmax()].replace('.meta', '')

last_model_path = os.path.join(model_dir, last_model)
print(last_model_path)
!python /content/models/research/object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path="{pipeline_fname}" \
    --output_directory="{output_directory}" \
    --trained_checkpoint_prefix="{last_model_path}"

training/model.ckpt-446
Instructions for updating:
Please use `layer.__call__` method instead.
W0614 15:21:44.195484 140602950838144 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tf_slim/layers/layers.py:2802: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:Scale of 0 disables regularizer.
I0614 15:21:45.963340 140602950838144 regularizers.py:99] Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
I0614 15:21:45.980528 140602950838144 regularizers.py:99] Scale of 0 disables regularizer.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0614 15:21:45.980917 140602950838144 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0614 15:21:46.044064 1
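
The checkpoint selection in the export cell can be isolated into a small helper that is easy to test without a training run. This is a sketch under the assumption that checkpoints follow the `model.ckpt-<step>.meta` naming seen in the directory listing; `latest_checkpoint_prefix` is a hypothetical name.

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return the `model.ckpt-N` prefix with the largest step N, or None."""
    best_step, best_prefix = -1, None
    for name in filenames:
        match = re.fullmatch(r"(model\.ckpt-(\d+))\.meta", name)
        if match and int(match.group(2)) > best_step:
            best_step = int(match.group(2))
            best_prefix = match.group(1)
    return best_prefix


# File names in the style of the model_dir listing.
files = [
    "checkpoint",
    "graph.pbtxt",
    "model.ckpt-0.meta",
    "model.ckpt-446.meta",
    "model.ckpt-446.index",
    "events.out.tfevents.1592144802.e43594651d21",
]

print(latest_checkpoint_prefix(files))
```

Matching the full file name avoids picking up digits from unrelated files, and returning `None` for an empty directory makes the "no checkpoint yet" case explicit.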

In [None]:
!ls {output_directory}

checkpoint			model.ckpt.index  saved_model
frozen_inference_graph.pb	model.ckpt.meta
model.ckpt.data-00000-of-00001	pipeline.config


## Download the model `.pb` file

In [None]:
import os

pb_fname = os.path.join(os.path.abspath(output_directory), "frozen_inference_graph.pb")
assert os.path.isfile(pb_fname), '`{}` does not exist'.format(pb_fname)

In [None]:
!ls -alh {pb_fname}

-rw-r--r-- 1 root root 50M Jun 14 15:22 /content/models/research/fine_tuned_model/frozen_inference_graph.pb


In [None]:
!cp /content/models/research/fine_tuned_model/frozen_inference_graph.pb "/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/roboflow_faster_rcnn_inception_v2"

### Option 1: Upload the `.pb` file to your Google Drive
Then download it from your Google Drive to local file system.

During this step, you will be prompted to enter the token.

In [None]:
# Install the PyDrive wrapper & import libraries.
# This only needs to be done once in a notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials


# Authenticate and create the PyDrive client.
# This only needs to be done once in a notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

fname = os.path.basename(pb_fname)
# Create & upload the `.pb` file.
uploaded = drive.CreateFile({'title': fname})
uploaded.SetContentFile(pb_fname)
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

Uploaded file with ID 1xxOWOG26NV1tKxH8d7Vg4wOwXd8sWNLW


### Option 2: Download the `.pb` file directly to your local file system
This method may be unreliable for large files such as the model `.pb` file. If it does not work, try **Option 1** instead.

In [None]:
from google.colab import files
files.download(pb_fname)

### OPTIONAL: Download the `label_map.pbtxt` file

In [None]:
from google.colab import files
files.download(label_map_pbtxt_fname)

### OPTIONAL: Download the modified pipeline file
Download this file if you plan to use the OpenVINO toolkit to convert the `.pb` file for faster inference on Intel hardware (CPU/GPU, Movidius, etc.).

In [None]:
files.download(pipeline_fname)

In [None]:
# !tar cfz fine_tuned_model.tar.gz fine_tuned_model
# from google.colab import files
# files.download('fine_tuned_model.tar.gz')
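
The commented-out cell above bundles the whole export directory with `tar`. A pure-Python equivalent using the standard library is sketched below; `archive_directory` is a hypothetical helper, and the demo uses a temporary directory with a placeholder file rather than a real exported model.

```python
import os
import shutil
import tarfile
import tempfile


def archive_directory(src_dir, archive_base):
    """Create `<archive_base>.tar.gz` containing src_dir and return its path."""
    parent = os.path.dirname(os.path.abspath(src_dir))
    name = os.path.basename(os.path.normpath(src_dir))
    # root_dir/base_dir keep the top-level folder name inside the archive.
    return shutil.make_archive(os.path.abspath(archive_base), "gztar",
                               root_dir=parent, base_dir=name)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the real ./fine_tuned_model directory.
    model_dir_demo = os.path.join(tmp, "fine_tuned_model")
    os.makedirs(model_dir_demo)
    with open(os.path.join(model_dir_demo, "frozen_inference_graph.pb"), "wb") as f:
        f.write(b"placeholder")

    archive_path = archive_directory(model_dir_demo,
                                     os.path.join(tmp, "fine_tuned_model"))
    with tarfile.open(archive_path) as tar:
        member_names = sorted(tar.getnames())

print(member_names)
```

In the notebook, the resulting archive path could then be passed to `files.download`.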

## Run inference test

To test on your own images, you need to upload raw test images to the `test` folder located inside `/data`.

Right now, this folder contains TFRecord files from Roboflow. We need the raw images.


In [None]:
# optionally, remove the TFRecord and cells_label_map.pbtxt from
# the test directory so it is only raw images
%cd {repo_dir_path}
%cd data/test
%rm cells.tfrecord
%rm cells_label_map.pbtxt

/content/tensorflow-object-detection-faster-rcnn
/content/tensorflow-object-detection-faster-rcnn/data/test


In [None]:
import os
import glob

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = pb_fname

# Path to the label map, which is used to add the correct label for each box.
PATH_TO_LABELS = label_map_pbtxt_fname

# If you want to test the code with your images, just add image files to the PATH_TO_TEST_IMAGES_DIR.
# PATH_TO_TEST_IMAGES_DIR = os.path.join(repo_dir_path, "data/test")
# PATH_TO_TEST_IMAGES_DIR = "/content/drive/My Drive/Colab Notebooks/TIL/CV/DeepFashion2/train/test/"
PATH_TO_TEST_IMAGES_DIR = "/content/drive/My Drive/Colab Notebooks/TIL/Search Rescue/Barbie/test/images"
# PATH_TO_TEST_IMAGES_DIR = "/content/drive/My Drive/Colab Notebooks/TIL/Search Rescue/Barbie/test/test_doll_video/dresses"
# PATH_TO_TEST_IMAGES_DIR = "/content/drive/My Drive/Colab Notebooks/TIL/Search Rescue/Barbie/doll"

# Both names are used by the inference cells below, so they must be defined here.
assert os.path.isfile(PATH_TO_CKPT)
assert os.path.isfile(PATH_TO_LABELS)
TEST_IMAGE_PATHS = glob.glob(os.path.join(PATH_TO_TEST_IMAGES_DIR, "*.*"))
assert len(TEST_IMAGE_PATHS) > 0, 'No image found in `{}`.'.format(PATH_TO_TEST_IMAGES_DIR)
print(TEST_IMAGE_PATHS[:2])

['/content/drive/My Drive/Colab Notebooks/TIL/Search Rescue/Barbie/test/images/dresses_0.png', '/content/drive/My Drive/Colab Notebooks/TIL/Search Rescue/Barbie/test/images/dresses_1.png']


In [None]:
%cd /content/models/research/object_detection

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops


# This is needed to display the images.
%matplotlib inline


from object_detection.utils import label_map_util

from object_detection.utils import visualization_utils as vis_util

/content/models/research/object_detection


In [None]:
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')


label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=num_classes, use_display_name=True)
category_index = label_map_util.create_category_index(categories)


def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)


def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {
                output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in [
                'num_detections', 'detection_boxes', 'detection_scores',
                'detection_classes', 'detection_masks'
            ]:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                        tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for single image
                detection_boxes = tf.squeeze(
                    tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(
                    tensor_dict['detection_masks'], [0])
                # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
                real_num_detection = tf.cast(
                    tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [
                                           real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [
                                           real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[0], image.shape[1])
                detection_masks_reframed = tf.cast(
                    tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension
                tensor_dict['detection_masks'] = tf.expand_dims(
                    detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

            # Run inference
            output_dict = sess.run(tensor_dict,
                                   feed_dict={image_tensor: np.expand_dims(image, 0)})

            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(
                output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict[
                'detection_classes'][0].astype(np.uint8)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict

In [None]:
# Output images not showing? Run this cell again, and try the cell above
# This is needed to display the images.
%matplotlib inline

In [None]:
!fc-list | grep ""

/usr/share/fonts/truetype/liberation/LiberationSansNarrow-Italic.ttf: Liberation Sans Narrow:style=Italic
/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf: Liberation Sans:style=Regular
/usr/share/fonts/truetype/liberation/LiberationMono-BoldItalic.ttf: Liberation Mono:style=Bold Italic
/usr/share/fonts/truetype/liberation/LiberationSerif-Italic.ttf: Liberation Serif:style=Italic
/usr/share/fonts/truetype/liberation/LiberationMono-Bold.ttf: Liberation Mono:style=Bold
/usr/share/fonts/truetype/liberation/LiberationSansNarrow-Regular.ttf: Liberation Sans Narrow:style=Regular
/usr/share/fonts/truetype/liberation/LiberationSerif-Bold.ttf: Liberation Serif:style=Bold
/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf: Liberation Mono:style=Regular
/usr/share/fonts/truetype/liberation/LiberationSans-Italic.ttf: Liberation Sans:style=Italic
/usr/share/fonts/truetype/liberation/LiberationSerif-BoldItalic.ttf: Liberation Serif:style=Bold Italic
/usr/share/fonts/truet

In [None]:
from PIL import ImageFont, ImageDraw

# Double check if this is still valid
cat_list = ['tops', 'trousers', 'outerwear', 'dresses', 'skirts']
rank_colors = ['cyan', 'magenta', 'DarkOrange', 'DimGray', 'DarkTurquoise']

for image_path in TEST_IMAGE_PATHS[:20]:
    image = Image.open(image_path)
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    try:
      if np.array(image).shape[2] == 4:
        png = image.convert('RGBA')
        background = Image.new('RGBA', png.size, (255,255,255))
        image = Image.alpha_composite(background, png).convert('RGB')
      image_np = load_image_into_numpy_array(image)
    except ValueError:
      print("image of shape:", np.array(image).shape)
      continue

    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    
    score_list = output_dict['detection_scores']
    cat_id_list = output_dict['detection_classes']
    bbox_list = output_dict['detection_boxes']

    for i_in, i_val in enumerate(score_list>0.85):
      if i_val:
        score = score_list[i_in]
        cat_id = cat_id_list[i_in]
        y0,x0,y1,x1 = bbox_list[i_in]
        W,H = image.size
        wh = W*H

        x0 = int(W*x0)
        y0 = int(H*y0)
        x1 = int(W*x1)
        y1 = int(H*y1)
        text = cat_list[cat_id-1]
        score = str(round(score, 5))

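        # Scale font size and outline width with image area; floor at 1 so small images still draw
        if wh>2000000:
          fn_size = max(1, wh//80000)
          rec_width = max(1, wh//500000)
        else:
          fn_size = max(1, wh//30000)
          rec_width = max(1, wh//80000)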
        print(W, H)
        print(fn_size, rec_width)

        font = ImageFont.truetype('/usr/share/fonts/truetype/liberation/LiberationMono-Bold.ttf', int(fn_size))
        draw = ImageDraw.Draw(image)
        draw.rectangle([x0, y0, x1, y1], outline = rank_colors[cat_id-1], width=rec_width)
        draw.text([x0, y0], text, fill = (255,255,255), font=font)
        draw.text([x0, y0-100], score, fill = (255,255,255), font=font)
 
    # display(image.resize((W//5,H//5)))
    display(image)

# The model frequently confuses a top worn with a skirt or trousers with a dress

Output hidden; open in https://colab.research.google.com to view.

## Evaluation

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
!pip install tensorflow_gpu==1.15

Collecting tensorflow_gpu==1.15
  Downloading https://files.pythonhosted.org/packages/a5/ad/933140e74973fb917a194ab814785e7c23680ca5dee6d663a509fe9579b6/tensorflow_gpu-1.15.0-cp36-cp36m-manylinux2010_x86_64.whl (411.5MB)
Collecting tensorflow-estimator==1.15.1
  Downloading https://files.pythonhosted.org/packages/de/62/2ee9cd74c9fa2fa450877847ba560b260f5d0fb70ee0595203082dafcc9d/tensorflow_estimator-1.15.1-py2.py3-none-any.whl (503kB)
Collecting tensorboard<1.16.0,>=1.15.0
  Downloading https://files.pythonhosted.org/packages/1e/e9/d3d747a97f7188f48aa5eda486907f3b345cd409f0a0850468ba867db246/tensorboard-1.15.0-py3-none-any.whl (3.8MB)
Collecting gast==0.2.2
  Downloading https://files.pythonhosted.org/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz


In [None]:
%cd /content
!git clone --quiet https://github.com/tensorflow/models.git

!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk

!pip install -q Cython contextlib2 pillow lxml matplotlib

!pip install -q pycocotools

%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

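import os
# Append the Object Detection API paths; fall back to an empty string if PYTHONPATH is unset
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ':/content/models/research/:/content/models/research/slim/'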

!pip install tf_slim

!python object_detection/builders/model_builder_test.py

/content
Selecting previously unselected package python-bs4.
(Reading database ... 144379 files and directories currently installed.)
Preparing to unpack .../0-python-bs4_4.6.0-1_all.deb ...
Unpacking python-bs4 (4.6.0-1) ...
Selecting previously unselected package python-pkg-resources.
Preparing to unpack .../1-python-pkg-resources_39.0.1-2_all.deb ...
Unpacking python-pkg-resources (39.0.1-2) ...
Selecting previously unselected package python-chardet.
Preparing to unpack .../2-python-chardet_3.0.4-1_all.deb ...
Unpacking python-chardet (3.0.4-1) ...
Selecting previously unselected package python-six.
Preparing to unpack .../3-python-six_1.11.0-2_all.deb ...
Unpacking python-six (1.11.0-2) ...
Selecting previously unselected package python-webencodings.
Preparing to unpack .../4-python-webencodings_0.5-2_all.deb ...
Unpacking python-webencodings (0.5-2) ...
Selecting previously unselected package python-html5lib.
Preparing to unpack .../5-python-html5lib_0.999999999-1_all.deb ...
Unpa

In [None]:
import os
import PIL
import json
import numpy as np
import tensorflow as tf
from PIL import Image
from tqdm import tqdm
from multiprocessing import Pool
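from object_detection.utils import label_map_util
from object_detection.utils import ops as utils_ops  # needed for mask reframing in run_inference_for_single_image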
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.python.keras.utils.data_utils import Sequence


In [None]:
# PATH_TO_CKPT = '/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/roboflow_faster_rcnn_inception_v2_12595.pb'
# PATH_TO_LABELS = '/content/drive/My Drive/Colab Notebooks/TIL/CV/input/tfrecord/tfrecord_label_map.pbtxt'
# _eval_folder = "interim_1"
# base_folder = "/content/drive/My Drive/Colab Notebooks/TIL/CV"
# eval_folder = os.path.join(base_folder, _eval_folder)
# model_name = "roboflow"
# num_classes = 5

# # Load dataset path
# eval_annotations = os.path.join(eval_folder, "CV_interim_evaluation.json")
# eval_imgs_folder = os.path.join(eval_folder, "CV_interim_images")

# # Submission file name
# submit_annotations = os.path.join( base_folder, _eval_folder, "interim_2", "Jun_Kai_" + model_name + "_12595_submission.json")

# print("{:<20}{}".format("eval_folder:", eval_folder))
# print("{:<20}{}".format("model_name:", model_name))
# print("{:<20}{}".format("eval_annotations:", eval_annotations))
# print("{:<20}{}".format("eval_imgs_folder:", eval_imgs_folder))
# print("{:<20}{}".format("PATH_TO_CKPT:", PATH_TO_CKPT))
# print("{:<20}{}".format("submit_annotations:", submit_annotations))


PATH_TO_CKPT = '/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/improved_frcnn_inception_v2_36000.pb'
PATH_TO_LABELS = '/content/drive/My Drive/Colab Notebooks/TIL/CV/input/tfrecord/tfrecord_label_map.pbtxt'
_eval_folder = "input"
base_folder = "/content/drive/My Drive/Colab Notebooks/TIL/CV"
eval_folder = os.path.join(base_folder, _eval_folder)
model_name = "frcnn"
num_classes = 5

# Load dataset path
eval_annotations = os.path.join(eval_folder, "val.json")
eval_imgs_folder = os.path.join(eval_folder, "val", "val")

# Submission file name
submit_annotations = os.path.join( base_folder, "junkai", "Jun_Kai_" + _eval_folder + "_" + model_name + "_36000.json")

print("{:<20}{}".format("eval_folder:", eval_folder))
print("{:<20}{}".format("model_name:", model_name))
print("{:<20}{}".format("eval_annotations:", eval_annotations))
print("{:<20}{}".format("eval_imgs_folder:", eval_imgs_folder))
print("{:<20}{}".format("PATH_TO_CKPT:", PATH_TO_CKPT))
print("{:<20}{}".format("submit_annotations:", submit_annotations))


eval_folder:        /content/drive/My Drive/Colab Notebooks/TIL/CV/input
model_name:         frcnn
eval_annotations:   /content/drive/My Drive/Colab Notebooks/TIL/CV/input/val.json
eval_imgs_folder:   /content/drive/My Drive/Colab Notebooks/TIL/CV/input/val/val
PATH_TO_CKPT:       /content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/improved_frcnn_inception_v2_36000.pb
submit_annotations: /content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/Jun_Kai_input_frcnn_36000.json


In [None]:
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=num_classes, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {
                output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in [
                'num_detections', 'detection_boxes', 'detection_scores',
                'detection_classes', 'detection_masks'
            ]:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                        tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for single image
                detection_boxes = tf.squeeze(
                    tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(
                    tensor_dict['detection_masks'], [0])
                # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
                real_num_detection = tf.cast(
                    tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [
                                           real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [
                                           real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[0], image.shape[1])
                detection_masks_reframed = tf.cast(
                    tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension
                tensor_dict['detection_masks'] = tf.expand_dims(
                    detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

            # Run inference
            output_dict = sess.run(tensor_dict,
                                   feed_dict={image_tensor: np.expand_dims(image, 0)})

            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(
                output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict[
                'detection_classes'][0].astype(np.uint8)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict

# Custom function to replace decode_tensor()
# Returns one (score, class_id, ymin, xmin, ymax, xmax) tuple per detection.
# Box coordinates are normalized to [0, 1], as output by the TF Object Detection API.
def unpack_preds(pred_dict):
  results = []
  det_num = pred_dict['num_detections']
  det_score = pred_dict['detection_scores']
  category_id = pred_dict['detection_classes']
  for i in range(det_num):
    ymin, xmin, ymax, xmax = pred_dict['detection_boxes'][i]
    results.append((det_score[i], category_id[i], ymin, xmin, ymax, xmax))
  return results
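
The exported graph returns boxes as normalized `[ymin, xmin, ymax, xmax]`, while the COCO results file expects `[x, y, width, height]` in pixels. The conversion used in the detection loop can be sketched as a small standalone helper. The name `to_coco_bbox` and the example numbers are illustrative assumptions, not part of the original notebook.

```python
def to_coco_bbox(box, img_width, img_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to COCO [x, y, w, h] in pixels."""
    ymin, xmin, ymax, xmax = box
    x = img_width * xmin              # left edge in pixels
    y = img_height * ymin             # top edge in pixels
    w = img_width * (xmax - xmin)     # box width in pixels
    h = img_height * (ymax - ymin)    # box height in pixels
    return [round(v, 1) for v in (x, y, w, h)]


# Hypothetical box on a 1000x800 image
example = to_coco_bbox([0.25, 0.1, 0.75, 0.5], img_width=1000, img_height=800)
print(example)
```

Keeping the conversion in one function makes it harder to mix up the axis order, which is the most common mistake when moving between the two formats.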


In [None]:
# Custom sequence to run the evaluation annotation file

print("Loading eval images from:     ", eval_imgs_folder)
imgs_dict = {im.split('.')[0]:im for im in os.listdir(eval_imgs_folder) if im.endswith('.jpg')}

print("Loading eval annotations from:", eval_annotations)
test_sequence = []
with open(eval_annotations, 'r') as f:
  annotations_dict = json.load(f)
annotations_list = annotations_dict['images']
for annotation in annotations_list:
  img_id = str(annotation['id'])
  if img_id in imgs_dict:
    img_fp = os.path.join(eval_imgs_folder, imgs_dict[img_id])
    test_sequence.append( (int(img_id), img_fp) )  # img_id, (w,h), input_arr

print('Running detections:')

# Generating detections on the folder of validation images
detections = []
det_threshold=0.
for i in tqdm(range(len(test_sequence))):
  img_id, img_fp = test_sequence[i]
  image_pil = PIL.Image.open(img_fp)
  W,H = image_pil.size
  image_np = np.array(image_pil)

  try:
    pred_dict = run_inference_for_single_image(image_np, detection_graph)
  except ValueError:
    print("Image shape is not (?, ?, ?, 3): ", end='')
    print(img_id, image_np.shape)
    continue

  # Visualise output
  # vis_util.visualize_boxes_and_labels_on_image_array(
  #     image_np,
  #     pred_dict['detection_boxes'],
  #     pred_dict['detection_classes'],
  #     pred_dict['detection_scores'],
  #     category_index,
  #     instance_masks=pred_dict.get('detection_masks'),
  #     use_normalized_coordinates=True,
  #     line_thickness=8)
  # plt.figure(figsize=(10,10))
  # plt.imshow(image_np)

  preds = unpack_preds(pred_dict)

  # Post-processing
  preds = [pred for pred in preds if pred[0] >= det_threshold]
  preds.sort( key=lambda x:x[0], reverse=True )
  preds = preds[:100] # we only evaluate you on 100 detections per image
  
  # Boxes are normalized [ymin, xmin, ymax, xmax]; convert to COCO [x, y, width, height] in pixels
  for pred in preds:
    conf,cat_id,y1,x1,y2,x2 = pred  # y1, x1, y2, x2
    width = W*(x2-x1)
    height = H*(y2-y1)
    x1 = W*x1
    y1 = H*y1

    width = round(width,1)
    height = round(height,1)
    x1 = round(x1,1)
    y1 = round(y1,1)
    conf = float(conf)
    cat_id = int(cat_id)
    detections.append( {'image_id':img_id, 'category_id':cat_id, 'bbox':[x1, y1, width, height], 'score':conf} )


Loading eval images from:      /content/drive/My Drive/Colab Notebooks/TIL/CV/input/val/val
Loading eval annotations from: /content/drive/My Drive/Colab Notebooks/TIL/CV/input/val.json


  0%|          | 0/1474 [00:00<?, ?it/s]

Running detections:


  4%|▍         | 61/1474 [03:51<1:13:09,  3.11s/it]

Image shape is not (?, ?, ?, 3): 10221 (2592, 1944, 4)


  5%|▌         | 76/1474 [04:45<1:07:02,  2.88s/it]

Image shape is not (?, ?, ?, 3): 10266 (3264, 1675, 4)


 12%|█▏        | 178/1474 [11:14<1:02:09,  2.88s/it]

Image shape is not (?, ?, ?, 3): 10609 (3264, 1836)


 22%|██▏       | 327/1474 [20:57<59:54,  3.13s/it]  

Image shape is not (?, ?, ?, 3): 1512 (2668, 1480, 4)


 26%|██▋       | 387/1474 [24:36<48:37,  2.68s/it]  

Image shape is not (?, ?, ?, 3): 1746 (851, 555, 4)


 33%|███▎      | 480/1474 [30:26<45:42,  2.76s/it]  

Image shape is not (?, ?, ?, 3): 95 (1127, 805, 4)


 41%|████      | 597/1474 [37:44<40:13,  2.75s/it]

Image shape is not (?, ?, ?, 3): 785 (554, 800, 4)


 49%|████▉     | 726/1474 [46:20<33:49,  2.71s/it]

Image shape is not (?, ?, ?, 3): 343 (525, 632, 4)


 50%|█████     | 742/1474 [47:18<34:15,  2.81s/it]

Image shape is not (?, ?, ?, 3): 401 (619, 619, 4)


 53%|█████▎    | 780/1474 [49:45<32:53,  2.84s/it]

Image shape is not (?, ?, ?, 3): 8276 (2048, 1150, 4)


 54%|█████▍    | 801/1474 [51:01<31:57,  2.85s/it]

Image shape is not (?, ?, ?, 3): 6977 (4608, 2844)


 59%|█████▉    | 876/1474 [55:42<27:30,  2.76s/it]

Image shape is not (?, ?, ?, 3): 12497 (636, 497, 4)


 67%|██████▋   | 994/1474 [1:03:36<22:30,  2.81s/it]

Image shape is not (?, ?, ?, 3): 5010 (594, 608, 4)


 69%|██████▉   | 1014/1474 [1:04:49<21:08,  2.76s/it]

Image shape is not (?, ?, ?, 3): 14737 (960, 640, 4)


 71%|███████   | 1045/1474 [1:06:45<20:29,  2.87s/it]

Image shape is not (?, ?, ?, 3): 8876 (1080, 893, 4)


 72%|███████▏  | 1063/1474 [1:07:50<20:29,  2.99s/it]

Image shape is not (?, ?, ?, 3): 7556 (2352, 1262, 4)


 77%|███████▋  | 1134/1474 [1:12:44<15:37,  2.76s/it]

Image shape is not (?, ?, ?, 3): 7306 (3264, 1669)


 83%|████████▎ | 1228/1474 [1:18:42<11:31,  2.81s/it]

Image shape is not (?, ?, ?, 3): 17677 (1077, 566, 4)


 84%|████████▍ | 1235/1474 [1:19:05<10:50,  2.72s/it]

Image shape is not (?, ?, ?, 3): 4875 (2048, 1366)


100%|██████████| 1474/1474 [1:34:58<00:00,  3.87s/it]
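
The skipped images in the log above have either four channels (RGBA) or no channel axis at all (greyscale), so they do not match the `(?, ?, ?, 3)` input of the graph. One possible way to keep them in the evaluation is to convert every image to RGB before building the array, compositing transparent pixels onto white as the visualization cell does. This is a sketch using Pillow only; the helper name `to_rgb` is an assumption.

```python
from PIL import Image


def to_rgb(image):
    """Return a 3-channel RGB copy; transparent pixels are composited onto white."""
    if image.mode in ("RGBA", "LA") or "transparency" in image.info:
        rgba = image.convert("RGBA")
        background = Image.new("RGBA", rgba.size, (255, 255, 255, 255))
        return Image.alpha_composite(background, rgba).convert("RGB")
    return image.convert("RGB")


# Tiny synthetic images standing in for the problematic files
rgba_img = Image.new("RGBA", (4, 3), (255, 0, 0, 0))  # fully transparent red
grey_img = Image.new("L", (4, 3), 128)                # single-channel grey

print(to_rgb(rgba_img).mode, to_rgb(grey_img).mode)
```

With a helper like this, `np.array(to_rgb(PIL.Image.open(img_fp)))` would always have shape `(H, W, 3)`, so none of the validation images would need to be skipped.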


In [None]:
with open(submit_annotations, 'w') as f:
  json.dump(detections, f)

In [None]:
submit_annotations

'/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/Jun_Kai_input_frcnn_36000.json'

In [None]:
!nvidia-smi


NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.



In [None]:
# Auto reload after installing new modules
%load_ext autoreload
%autoreload 2

In [None]:
# First, we need to install cocoapi to evaluate our detections
# This installation is a modified version of the original to suit this competition
! pip install git+https://github.com/jinmingteo/cocoapi.git#subdirectory=PythonAPI --upgrade

Collecting git+https://github.com/jinmingteo/cocoapi.git#subdirectory=PythonAPI
  Cloning https://github.com/jinmingteo/cocoapi.git to /tmp/pip-req-build-4wvnjk6r
  Running command git clone -q https://github.com/jinmingteo/cocoapi.git /tmp/pip-req-build-4wvnjk6r
Building wheels for collected packages: pycocotools
  Building wheel for pycocotools (setup.py) ... done
  Created wheel for pycocotools: filename=pycocotools-2.0-cp36-cp36m-linux_x86_64.whl size=267047 sha256=54456ab404a78bf6c79c27df7820099f81784ff5999d4bf67801811af88a9c85
  Stored in directory: /tmp/pip-ephem-wheel-cache-2r0thblt/wheels/27/81/92/3a512329d1b1ae7fc278285a1f114ef08082568bf32eee0002
Successfully built pycocotools
Installing collected packages: pycocotools
  Found existing installation: pycocotools 2.0.1
    Uninstalling pycocotools-2.0.1:
      Successfully uninstalled pycocotools-2.0.1
Successfully installed pycocotools-2.0


In [None]:
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

In [None]:
val_annotations = "/content/drive/My Drive/Colab Notebooks/TIL/CV/input/val.json"
print(val_annotations)
submit_annotations = '/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/Jun_Kai_input_frcnn_36000.json'
print(submit_annotations)

/content/drive/My Drive/Colab Notebooks/TIL/CV/input/val.json
/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/Jun_Kai_input_frcnn_36000.json


In [None]:
# Get evaluation score against validation set
coco_gt = COCO(val_annotations)
coco_dt = coco_gt.loadRes(submit_annotations)
cocoEval = COCOeval(cocoGt=coco_gt, cocoDt=coco_dt, iouType='bbox')
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()

loading annotations into memory...
Done (t=1.98s)
creating index...
index created!
Loading and preparing results...
DONE (t=1.78s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=4.06s).
Accumulating evaluation results...
DONE (t=1.53s).
 Average Precision  (AP) @[ IoU=0.20:0.50 | area=   all | maxDets=100 ] = 0.587
 Average Precision  (AP) @[ IoU=0.20      | area=   all | maxDets=100 ] = 0.628
 Average Precision  (AP) @[ IoU=0.30      | area=   all | maxDets=100 ] = 0.605
 Average Precision  (AP) @[ IoU=0.40      | area=   all | maxDets=100 ] = 0.576
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.532


## Initial Run:
#### 12595 FIXED
```
Average Precision  (AP) @[ IoU=0.20:0.50 | area=   all | maxDets=100 ] = 0.629
Average Precision  (AP) @[ IoU=0.20      | area=   all | maxDets=100 ] = 0.643
Average Precision  (AP) @[ IoU=0.30      | area=   all | maxDets=100 ] = 0.635
Average Precision  (AP) @[ IoU=0.40      | area=   all | maxDets=100 ] = 0.625
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.607
```

#### 22182 (44.49%)
```
 Average Precision  (AP) @[ IoU=0.20:0.50 | area=   all | maxDets=100 ] = 0.645
 Average Precision  (AP) @[ IoU=0.20      | area=   all | maxDets=100 ] = 0.658
 Average Precision  (AP) @[ IoU=0.30      | area=   all | maxDets=100 ] = 0.652
 Average Precision  (AP) @[ IoU=0.40      | area=   all | maxDets=100 ] = 0.642
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.626
```
#### 28182 (56.94%)
```
Average Precision  (AP) @[ IoU=0.20:0.50 | area=   all | maxDets=100 ] = 0.644
Average Precision  (AP) @[ IoU=0.20      | area=   all | maxDets=100 ] = 0.657
Average Precision  (AP) @[ IoU=0.30      | area=   all | maxDets=100 ] = 0.651
Average Precision  (AP) @[ IoU=0.40      | area=   all | maxDets=100 ] = 0.641
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.626
```



## Improved
#### 20000
```
Average Precision  (AP) @[ IoU=0.20:0.50 | area=   all | maxDets=100 ] = 0.565
Average Precision  (AP) @[ IoU=0.20      | area=   all | maxDets=100 ] = 0.609
Average Precision  (AP) @[ IoU=0.30      | area=   all | maxDets=100 ] = 0.581
Average Precision  (AP) @[ IoU=0.40      | area=   all | maxDets=100 ] = 0.555
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.508
```

#### 24000 (52.06%)
```
Average Precision  (AP) @[ IoU=0.20:0.50 | area=   all | maxDets=100 ] = 0.574
Average Precision  (AP) @[ IoU=0.20      | area=   all | maxDets=100 ] = 0.616
Average Precision  (AP) @[ IoU=0.30      | area=   all | maxDets=100 ] = 0.592
Average Precision  (AP) @[ IoU=0.40      | area=   all | maxDets=100 ] = 0.562
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.520
```
#### 36000
```
Average Precision  (AP) @[ IoU=0.20:0.50 | area=   all | maxDets=100 ] = 0.587
Average Precision  (AP) @[ IoU=0.20      | area=   all | maxDets=100 ] = 0.628
Average Precision  (AP) @[ IoU=0.30      | area=   all | maxDets=100 ] = 0.605
Average Precision  (AP) @[ IoU=0.40      | area=   all | maxDets=100 ] = 0.576
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.532
```

In [None]:
with open("/content/drive/My Drive/Colab Notebooks/TIL/CV/working/submission-model-7x7-14x14-3aspect-modyoloposneg-wd0.0005.json") as f:
  valannos = json.load(f)
with open(submit_annotations) as f:
  subannos = json.load(f)

In [None]:
valannos[0]

{'bbox': [571.5, 194.6, 815.2, 821.1],
 'category_id': 4,
 'image_id': 1,
 'score': 0.6992745399475098}

In [None]:
subannos[0]

{'bbox': [526.3, 127.6, -1021701.6, -246942.5],
 'category_id': 4,
 'image_id': 1,
 'score': 0.8680194616317749}
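
The negative width and height in the entry above point to a box-conversion bug rather than a model problem. A quick sanity check over the results list can catch this before the file is scored. The sketch below assumes COCO-style entries with a `bbox` of `[x, y, width, height]`; the helper name `find_invalid_boxes` and the shortened scores are illustrative.

```python
def find_invalid_boxes(detections):
    """Return the indices of detections whose width or height is not positive."""
    bad = []
    for idx, det in enumerate(detections):
        _, _, w, h = det["bbox"]
        if w <= 0 or h <= 0:
            bad.append(idx)
    return bad


# Two entries modelled on the outputs shown above
sample = [
    {"image_id": 1, "category_id": 4, "bbox": [571.5, 194.6, 815.2, 821.1], "score": 0.699},
    {"image_id": 1, "category_id": 4, "bbox": [526.3, 127.6, -1021701.6, -246942.5], "score": 0.868},
]
invalid = find_invalid_boxes(sample)
print(invalid)
```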

In [None]:
a=0
for i in valannos:
  if i['image_id'] == 1:  # 2084, 1472, 3030
    a+=1
    print (i)
print(a)

{'image_id': 1, 'category_id': 4, 'bbox': [571.5, 194.6, 815.2, 821.1], 'score': 0.6992745399475098}
{'image_id': 1, 'category_id': 4, 'bbox': [731.3, 128.9, 530.9, 857.4], 'score': 0.11952656507492065}
{'image_id': 1, 'category_id': 4, 'bbox': [668.4, 226.5, 613.6, 971.3], 'score': 0.03665538132190704}
{'image_id': 1, 'category_id': 3, 'bbox': [879.9, 223.0, 901.2, 669.5], 'score': 0.014186333864927292}
{'image_id': 1, 'category_id': 4, 'bbox': [582.3, 108.9, 709.3, 917.1], 'score': 0.01228678971529007}
{'image_id': 1, 'category_id': 4, 'bbox': [632.7, 57.0, 704.6, 505.9], 'score': 0.011506722308695316}
{'image_id': 1, 'category_id': 3, 'bbox': [1292.6, 144.2, 382.8, 665.6], 'score': 0.004970135632902384}
{'image_id': 1, 'category_id': 4, 'bbox': [891.0, 32.2, 655.1, 426.8], 'score': 0.00437589455395937}
{'image_id': 1, 'category_id': 5, 'bbox': [759.9, 550.0, 495.9, 805.6], 'score': 0.0040878006257116795}
{'image_id': 1, 'category_id': 2, 'bbox': [769.7, 842.0, 491.4, 654.4], 'score'

In [None]:
a=0
for i in subannos:
  if i['image_id'] == 1:  # 2037, 4601, 3921
    a+=1
    print (i)
print(a)

{'image_id': 1, 'category_id': 4, 'area': 893729.7, 'bbox': [935.0, 659.4, 819.6, 1090.4], 'score': 0.8415889739990234}
{'image_id': 1, 'category_id': 3, 'area': 570104.6, 'bbox': [959.2, 402.6, 929.4, 613.4], 'score': 0.6006994247436523}
{'image_id': 1, 'category_id': 2, 'area': 113575.7, 'bbox': [1011.7, 1485.6, 190.4, 596.5], 'score': 0.13425743579864502}
{'image_id': 1, 'category_id': 3, 'area': 944046.1, 'bbox': [967.3, 584.3, 1019.4, 926.1], 'score': 0.12345073372125626}
{'image_id': 1, 'category_id': 2, 'area': 194867.6, 'bbox': [901.0, 1454.5, 387.8, 502.6], 'score': 0.11707131564617157}
{'image_id': 1, 'category_id': 5, 'area': 667710.1, 'bbox': [906.4, 781.2, 759.0, 879.7], 'score': 0.07988986372947693}
{'image_id': 1, 'category_id': 2, 'area': 346890.4, 'bbox': [917.8, 1463.1, 549.9, 630.8], 'score': 0.06630423665046692}
{'image_id': 1, 'category_id': 2, 'area': 132761.8, 'bbox': [826.1, 1482.7, 220.0, 603.4], 'score': 0.05301739275455475}
{'image_id': 1, 'category_id': 1, '

In [None]:
a


100

In [None]:
with open(submit_annotations) as f:
  subannos = json.load(f)

In [None]:
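# Convert center-based boxes [cx, cy, width, height] to COCO's top-left format [x, y, width, height].
# Run this once only, and only on a file that stores box centers; rerunning shifts the boxes again.
for i in subannos:
  cx, cy, width, height = i['bbox']

  x1 = cx - width/2
  y1 = cy - height/2

  x1 = round(x1, 1)
  y1 = round(y1, 1)

  i['bbox'] = [x1, y1, width, height]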
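
The same shift can be written as a pure function, which avoids mutating the list in place and makes it easy to check against the values printed in this notebook. This is a minimal sketch; `center_to_corner` is a hypothetical helper name, and it assumes the input box stores its center.

```python
def center_to_corner(bbox):
    """Convert [cx, cy, w, h] (box center) to [x, y, w, h] (top-left corner)."""
    cx, cy, w, h = bbox
    return [round(cx - w / 2, 1), round(cy - h / 2, 1), w, h]


# First detection for image 1, before the conversion cell is run
converted = center_to_corner([935.0, 659.4, 819.6, 1090.4])
print(converted)
```

The result matches the first box printed after the conversion cell, `[525.2, 114.2, 819.6, 1090.4]`.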


In [None]:
a=0
for i in subannos:
  if i['image_id'] == 1:  # 2037, 4601, 3921
    a+=1
    print (i)
print(a)

{'image_id': 1, 'category_id': 4, 'area': 893729.7, 'bbox': [525.2, 114.2, 819.6, 1090.4], 'score': 0.8415889739990234}
{'image_id': 1, 'category_id': 3, 'area': 570104.6, 'bbox': [494.5, 95.9, 929.4, 613.4], 'score': 0.6006994247436523}
{'image_id': 1, 'category_id': 2, 'area': 113575.7, 'bbox': [916.5, 1187.3, 190.4, 596.5], 'score': 0.13425743579864502}
{'image_id': 1, 'category_id': 3, 'area': 944046.1, 'bbox': [457.6, 121.2, 1019.4, 926.1], 'score': 0.12345073372125626}
{'image_id': 1, 'category_id': 2, 'area': 194867.6, 'bbox': [707.1, 1203.2, 387.8, 502.6], 'score': 0.11707131564617157}
{'image_id': 1, 'category_id': 5, 'area': 667710.1, 'bbox': [526.9, 341.4, 759.0, 879.7], 'score': 0.07988986372947693}
{'image_id': 1, 'category_id': 2, 'area': 346890.4, 'bbox': [642.8, 1147.7, 549.9, 630.8], 'score': 0.06630423665046692}
{'image_id': 1, 'category_id': 2, 'area': 132761.8, 'bbox': [716.1, 1181.0, 220.0, 603.4], 'score': 0.05301739275455475}
{'image_id': 1, 'category_id': 1, 'ar

In [None]:
{'image_id': 1, 'category_id': 4, 'bbox': [571.5, 194.6, 815.2, 821.1], 'score': 0.6992745399475098}
{'image_id': 1, 'category_id': 4, 'bbox': [731.3, 128.9, 530.9, 857.4], 'score': 0.11952656507492065}
{'image_id': 1, 'category_id': 4, 'bbox': [668.4, 226.5, 613.6, 971.3], 'score': 0.03665538132190704}
{'image_id': 1, 'category_id': 3, 'bbox': [879.9, 223.0, 901.2, 669.5], 'score': 0.014186333864927292}
{'image_id': 1, 'category_id': 4, 'bbox': [582.3, 108.9, 709.3, 917.1], 'score': 0.01228678971529007}
{'image_id': 1, 'category_id': 4, 'bbox': [632.7, 57.0, 704.6, 505.9], 'score': 0.011506722308695316}
{'image_id': 1, 'category_id': 3, 'bbox': [1292.6, 144.2, 382.8, 665.6], 'score': 0.004970135632902384}
{'image_id': 1, 'category_id': 4, 'bbox': [891.0, 32.2, 655.1, 426.8], 'score': 0.00437589455395937}
{'image_id': 1, 'category_id': 5, 'bbox': [759.9, 550.0, 495.9, 805.6], 'score': 0.0040878006257116795}
{'image_id': 1, 'category_id': 2, 'bbox': [769.7, 842.0, 491.4, 654.4], 'score': 0.003458512481302023}
{'image_id': 1, 'category_id': 4, 'bbox': [240.1, 116.0, 999.3, 1029.8], 'score': 0.0032126344740390778}
{'image_id': 1, 'category_id': 4, 'bbox': [1015.3, 108.2, 408.5, 731.6], 'score': 0.0021131697576493025}
{'image_id': 1, 'category_id': 2, 'bbox': [1287.0, 529.4, 460.0, 782.3], 'score': 0.0020606054458767176}
{'image_id': 1, 'category_id': 4, 'bbox': [810.7, 222.9, 729.1, 598.3], 'score': 0.001816713367588818}
{'image_id': 1, 'category_id': 4, 'bbox': [469.0, 202.3, 924.5, 737.0], 'score': 0.0018118319567292929}
{'image_id': 1, 'category_id': 4, 'bbox': [506.5, 221.6, 870.6, 1004.6], 'score': 0.001490538357757032}
{'image_id': 1, 'category_id': 2, 'bbox': [632.2, 1003.7, 638.9, 470.0], 'score': 0.0012443060986697674}
{'image_id': 1, 'category_id': 5, 'bbox': [688.0, 771.1, 640.4, 467.9], 'score': 0.001170646632090211}
{'image_id': 1, 'category_id': 4, 'bbox': [861.2, 315.5, 210.0, 208.8], 'score': 0.0010894681327044964}
{'image_id': 1, 'category_id': 4, 'bbox': [538.0, 480.3, 711.3, 898.4], 'score': 0.0010651021730154753}

{'image_id': 1, 'category_id': 4, 'area': 893729.7, 'bbox': [525.2, 114.2, 819.6, 1090.4], 'score': 0.8415889739990234}
{'image_id': 1, 'category_id': 3, 'area': 570104.6, 'bbox': [494.5, 95.9, 929.4, 613.4], 'score': 0.6006994247436523}
{'image_id': 1, 'category_id': 2, 'area': 113575.7, 'bbox': [916.5, 1187.3, 190.4, 596.5], 'score': 0.13425743579864502}
{'image_id': 1, 'category_id': 3, 'area': 944046.1, 'bbox': [457.6, 121.2, 1019.4, 926.1], 'score': 0.12345073372125626}
{'image_id': 1, 'category_id': 2, 'area': 194867.6, 'bbox': [707.1, 1203.2, 387.8, 502.6], 'score': 0.11707131564617157}
{'image_id': 1, 'category_id': 5, 'area': 667710.1, 'bbox': [526.9, 341.4, 759.0, 879.7], 'score': 0.07988986372947693}
{'image_id': 1, 'category_id': 2, 'area': 346890.4, 'bbox': [642.8, 1147.7, 549.9, 630.8], 'score': 0.06630423665046692}
{'image_id': 1, 'category_id': 2, 'area': 132761.8, 'bbox': [716.1, 1181.0, 220.0, 603.4], 'score': 0.05301739275455475}
{'image_id': 1, 'category_id': 1, 'area': 432656.9, 'bbox': [531.6, 73.7, 903.7, 478.8], 'score': 0.03916747868061066}
{'image_id': 1, 'category_id': 1, 'area': 739060.4, 'bbox': [426.1, 83.8, 990.0, 746.5], 'score': 0.03728179633617401}
{'image_id': 1, 'category_id': 3, 'area': 325459.8, 'bbox': [603.0, 91.6, 726.5, 448.0], 'score': 0.031403981149196625}
{'image_id': 1, 'category_id': 5, 'area': 1166127.0, 'bbox': [486.0, 234.0, 1013.5, 1150.5], 'score': 0.026136226952075958}
{'image_id': 1, 'category_id': 3, 'area': 673538.9, 'bbox': [491.2, 135.5, 661.6, 1018.0], 'score': 0.018608085811138153}
{'image_id': 1, 'category_id': 2, 'area': 73202.2, 'bbox': [751.2, 1214.8, 140.5, 521.1], 'score': 0.013777441345155239}
{'image_id': 1, 'category_id': 1, 'area': 250654.0, 'bbox': [630.5, 96.1, 646.1, 387.9], 'score': 0.013378710485994816}
{'image_id': 1, 'category_id': 2, 'area': 1022226.1, 'bbox': [459.6, 171.6, 913.3, 1119.3], 'score': 0.01241259090602398}
{'image_id': 1, 'category_id': 2, 'area': 251845.7, 'bbox': [812.2, 1110.2, 353.1, 713.1], 'score': 0.011168341152369976}
{'image_id': 1, 'category_id': 2, 'area': 555896.6, 'bbox': [556.6, 475.9, 754.3, 737.0], 'score': 0.011137024499475956}
{'image_id': 1, 'category_id': 2, 'area': 49380.1, 'bbox': [957.2, 1277.8, 119.9, 411.9], 'score': 0.010573280975222588}
{'image_id': 1, 'category_id': 1, 'area': 969069.8, 'bbox': [455.5, 160.1, 914.5, 1059.7], 'score': 0.009601874276995659}
{'image_id': 1, 'category_id': 3, 'area': 231278.7, 'bbox': [418.5, 87.2, 990.5, 233.5], 'score': 0.009305201470851898}
{'image_id': 1, 'category_id': 2, 'area': 930816.7, 'bbox': [504.5, 604.9, 849.5, 1095.7], 'score': 0.008759144693613052}

In [None]:
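# To suppress multiple overlapping detections of the same object, we use non-maximum suppression (NMS).
# Each detection is a tuple (score, class_id, cx, cy, w, h) with a center-based box, as iou() expects.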
def nms(detections, iou_thresh=0.):
  dets_by_class = {}
  final_result = []
  for det in detections:
    cls = det[1]
    if cls not in dets_by_class:
      dets_by_class[cls] = []
    dets_by_class[cls].append( det )
  for _, dets in dets_by_class.items():
    candidates = list(dets)
    candidates.sort( key=lambda x:x[0], reverse=True )
    while len(candidates) > 0:
      candidate = candidates.pop(0)
      _,_,cx,cy,cw,ch = candidate
      copy = list(candidates)
      for other in candidates:
        # Compute the IoU. If it exceeds thresh, we remove it
        _,_,ox,oy,ow,oh = other
        if iou( (cx,cy,cw,ch), (ox,oy,ow,oh) ) > iou_thresh:
          copy.remove(other)
      candidates = list(copy)
      final_result.append(candidate)
  return final_result

# Computes the intersection-over-union (IoU) of two bounding boxes,
# each given as (x, y, w, h) where (x, y) is the box center
def iou(bb1, bb2):
  x1,y1,w1,h1 = bb1
  xmin1 = x1 - w1/2
  xmax1 = x1 + w1/2
  ymin1 = y1 - h1/2
  ymax1 = y1 + h1/2

  x2,y2,w2,h2 = bb2
  xmin2 = x2 - w2/2
  xmax2 = x2 + w2/2
  ymin2 = y2 - h2/2
  ymax2 = y2 + h2/2

  area1 = w1*h1
  area2 = w2*h2

  # Compute the boundary of the intersection
  xmin_int = max( xmin1, xmin2 )
  xmax_int = min( xmax1, xmax2 )
  ymin_int = max( ymin1, ymin2 )
  ymax_int = min( ymax1, ymax2 )
  intersection = max(xmax_int - xmin_int, 0) * max( ymax_int - ymin_int, 0 )

  # Remove the double counted region
  union = area1+area2-intersection

  # Guard against degenerate (zero-area) boxes
  if union <= 0:
    return 0.
  return intersection / union
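
A few hand-checkable boxes make it easier to confirm that an IoU routine behaves as expected. The block below is a compact, standalone re-implementation for center-based `(cx, cy, w, h)` boxes, written only for this check; the name `iou_center` and the three boxes are illustrative assumptions.

```python
def iou_center(bb1, bb2):
    """IoU of two boxes given as (cx, cy, w, h), where (cx, cy) is the box center."""
    x1, y1, w1, h1 = bb1
    x2, y2, w2, h2 = bb2
    # Width and height of the overlap region (negative when the boxes are disjoint)
    inter_w = min(x1 + w1 / 2, x2 + w2 / 2) - max(x1 - w1 / 2, x2 - w2 / 2)
    inter_h = min(y1 + h1 / 2, y2 + h2 / 2) - max(y1 - h1 / 2, y2 - h2 / 2)
    intersection = max(inter_w, 0) * max(inter_h, 0)
    union = w1 * h1 + w2 * h2 - intersection
    return intersection / union if union > 0 else 0.0


a = (5, 5, 10, 10)    # covers x in [0, 10], y in [0, 10]
b = (10, 5, 10, 10)   # covers x in [5, 15], y in [0, 10]; overlaps half of a
c = (30, 30, 10, 10)  # far away from a

print(iou_center(a, a), iou_center(a, b), iou_center(a, c))
```

Boxes `a` and `b` share an area of 50 out of a union of 150, so their IoU is 1/3; identical boxes give 1 and disjoint boxes give 0.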


In [None]:
submit_annotations

'/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/Jun_Kai_input_frcnn_36000.json'

In [None]:
# Display some of the images
import os
import PIL
import json
from collections import OrderedDict
from PIL import ImageEnhance, ImageFont, ImageDraw

submit_annotations = '/content/drive/My Drive/Colab Notebooks/TIL/CV/junkai/Jun_Kai_input_frcnn_36000.json'
val_annotations = '/content/drive/My Drive/Colab Notebooks/TIL/CV/input/val.json'
eval_imgs_folder = '/content/drive/My Drive/Colab Notebooks/TIL/CV/input/val/val'

# Category names, indexed by category_id - 1. Double check that these still match the dataset
cat_list = ['tops', 'trousers', 'outerwear', 'dresses', 'skirts']

with open(submit_annotations, 'r') as f:
  results = json.load(f)

sorted_results = OrderedDict()
for i in results:
  img_id = i['image_id']

  if img_id in sorted_results:
    sorted_results[img_id].append(i)
  else:
    sorted_results[img_id] = [i]
  
# sorted_results = OrderedDict({img_id_1: [det, det, ...], img_id_2: [det, ...], ...})

# Run this to visualize
rank_colors = ['cyan', 'magenta', 'DarkOrange', 'DimGray', 'DarkTurquoise']
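det_threshold=0.  # minimum score to display (defined here but not applied in the loop below)
top_dets=3        # maximum detections per image (defined here but not applied in the loop below)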

start=0
end=20
image_ids = list(sorted_results.keys())
# Slicing avoids an IndexError when there are fewer than `end` images
for image_id in image_ids[start:end]:
  image = PIL.Image.open(os.path.join(eval_imgs_folder, str(image_id)+'.jpg'))

  detection = []
  for image_dict in sorted_results[image_id]:
    score = image_dict['score']
    cat_id = image_dict['category_id']
    x,y,w,h = image_dict['bbox']  # [x,y,width,height]

    detection.append((score, cat_id, x, y, w, h))

  # nms takes in and returns a list of tuples [(score, cat_id, x, y, w, h), (...)]
  preds = nms(detection, iou_thresh=0.01)  # Originally 0.5

  # Create the font and drawing context once per image rather than once per box
  font = ImageFont.truetype('/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf', 100)
  draw = ImageDraw.Draw(image)
  for pred in preds:
    score, cat_id = pred[0], pred[1]
    x0,y0,w,h = pred[2:6]

    # Bottom-right corner of the box
    x1 = int(x0 + w)
    y1 = int(y0 + h)
    # cat_id - 1 indexes both cat_list and rank_colors
    color = rank_colors[cat_id-1]
    text = cat_list[cat_id-1]
    score = str(round(score, 5))

    draw.rectangle([x0, y0, x1, y1], outline=color, width=30)
    # The label and score are drawn at the bottom-right corner of the box
    draw.text([x1, y1], text, fill=color, font=font)
    draw.text([x1, y1+100], score, fill=color, font=font)
  display(image.resize((255,255)))
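
The cell above defines `det_threshold` and `top_dets` but never applies them, so every box that survives NMS is drawn. One way they might be used is sketched below: drop predictions below the score threshold, then keep only the highest-scoring few. `filter_detections` and `toy_preds` are hypothetical names, not part of the original notebook, and the tuples follow the same `(score, cat_id, x, y, w, h)` layout that `nms` returns.

```python
# Hypothetical helper: a sketch of how det_threshold and top_dets could be applied.
def filter_detections(preds, det_threshold=0.0, top_dets=3):
  """Keep at most top_dets predictions whose score is at least det_threshold."""
  confident = [p for p in preds if p[0] >= det_threshold]
  confident.sort(key=lambda p: p[0], reverse=True)
  return confident[:top_dets]


# Made-up predictions in (score, cat_id, x, y, w, h) form
toy_preds = [
  (0.95, 1, 0, 0, 10, 10),
  (0.40, 2, 5, 5, 10, 10),
  (0.75, 3, 20, 20, 10, 10),
  (0.10, 1, 30, 30, 10, 10),
  (0.85, 2, 40, 40, 10, 10),
]

filtered = filter_detections(toy_preds, det_threshold=0.5, top_dets=2)
for pred in filtered:
  print(pred)
```

In the visualization loop, this could be called on the output of `nms` before drawing, for example `preds = filter_detections(preds, det_threshold, top_dets)`.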

