# TensorFlow Object Detection API - Tutorial
This tutorial introduces the basic workflows of the Object Detection API, one of the most popular research models in the TensorFlow models repository. It walks through every step: setting up a development environment, assembling a dataset, preparing the input files, training a detection model, and running data through it. Each step is demonstrated on the Oxford-IIIT Pet Dataset.

## Environment setup

Check the GPU type assigned to your instance.

In [1]:
!nvidia-smi

Thu Oct 22 10:28:19 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   66C    P8    11W /  70W |      0MiB / 15079MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces
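
The same check can be scripted, for example to stop a notebook early when no GPU is attached. The following is a minimal sketch rather than part of the original notebook: the helper name `query_gpus` is hypothetical, and it assumes that `nvidia-smi` is on the `PATH` whenever a GPU is present.

```python
import shutil
import subprocess


def query_gpus():
    """Return one 'name, total memory' string per visible GPU, or [] if none."""
    # Without the NVIDIA driver tools there is nothing to query.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    # A non-zero exit code usually means the driver is not loaded.
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(query_gpus())
```

An empty list means training would fall back to the CPU, which is likely to be impractically slow for the model used later in this tutorial.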

Browse information about the instance's CPU.

In [2]:
!lscpu

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU @ 2.20GHz
Stepping:            0
CPU MHz:             2200.000
BogoMIPS:            4400.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            56320K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_sin

## Installation - Dependencies

In [5]:
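# Set to True to keep the repository and data on Google Drive.
# The outputs shown below were recorded with Drive mounted.
use_my_drive = False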

if use_my_drive:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=False)

    from os import chdir
    chdir("/content/drive/My Drive/")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [10]:
!pwd

/content/drive/My Drive


In [11]:
!git clone https://github.com/tensorflow/models

Cloning into 'models'...
remote: Enumerating objects: 67, done.
remote: Counting objects: 100% (67/67), done.
remote: Compressing objects: 100% (65/65), done.
remote: Total 46144 (delta 26), reused 43 (delta 2), pack-reused 46077
Receiving objects: 100% (46144/46144), 551.17 MiB | 18.96 MiB/s, done.
Resolving deltas: 100% (31629/31629), done.
Checking out files: 100% (2105/2105), done.


In [12]:
%%bash
cd models
git reset --hard 126ce65
rm -rf .git

HEAD is now at 126ce652 Do not access self.embeddings.dtype.


Checking out files: 100% (82/82), done.


In [13]:
!pip install -U --pre tensorflow=="2.3.0"
!pip install tf_slim

Requirement already up-to-date: tensorflow==2.3.0 in /usr/local/lib/python3.6/dist-packages (2.3.0)
Collecting tf_slim
  Downloading https://files.pythonhosted.org/packages/02/97/b0f4a64df018ca018cc035d44f2ef08f91e2e8aa67271f6f19633a015ff7/tf_slim-1.1.0-py2.py3-none-any.whl (352kB)
     |████████████████████████████████| 358kB 4.5MB/s 
Installing collected packages: tf-slim
Successfully installed tf-slim-1.1.0


In [14]:
!apt-get install git protobuf-compiler python3-pil python3-lxml python3-tk
!pip install --user Cython
!pip install --user contextlib2
!pip install --user jupyter
!pip install --user matplotlib

Reading package lists... Done
Building dependency tree       
Reading state information... Done
protobuf-compiler is already the newest version (3.0.0-9.1ubuntu1).
git is already the newest version (1:2.17.1-1ubuntu0.7).
python3-tk is already the newest version (3.6.9-1~18.04).
The following additional packages will be installed:
  python3-bs4 python3-chardet python3-html5lib python3-olefile
  python3-pkg-resources python3-six python3-webencodings
Suggested packages:
  python3-genshi python3-lxml-dbg python-lxml-doc python-pil-doc
  python3-pil-dbg python3-setuptools
The following NEW packages will be installed:
  python3-bs4 python3-chardet python3-html5lib python3-lxml python3-olefile
  python3-pil python3-pkg-resources python3-six python3-webencodings
0 upgraded, 9 newly installed, 0 to remove and 21 not upgraded.
Need to get 1,805 kB of archives.
After this operation, 7,686 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 python3-bs

In [15]:
!pip install --user pycocotools



In [16]:
%cd models/research/

/content/drive/My Drive/models/research


In [17]:
!protoc object_detection/protos/*.proto --python_out=.
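
`protoc` prints nothing on success, so it can be reassuring to confirm that every `.proto` file now has a generated `*_pb2.py` module next to it. The sketch below is an illustration only: the helper name `find_uncompiled_protos` is hypothetical, and it assumes the command above was run from `models/research` so that the generated files land in `object_detection/protos`.

```python
from pathlib import Path


def find_uncompiled_protos(proto_dir):
    """Return the .proto files in proto_dir that lack a generated *_pb2.py."""
    proto_dir = Path(proto_dir)
    missing = []
    for proto in sorted(proto_dir.glob("*.proto")):
        # protoc names the Python module <stem>_pb2.py.
        generated = proto.with_name(proto.stem + "_pb2.py")
        if not generated.exists():
            missing.append(proto.name)
    return missing


print(find_uncompiled_protos("object_detection/protos"))
```

An empty list indicates that every definition was compiled; any names that are printed point to files that `protoc` skipped or failed on.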

In [18]:
!pip install .

Processing /content/drive/My Drive/models/research
Building wheels for collected packages: object-detection
  Building wheel for object-detection (setup.py) ... done
  Created wheel for object-detection: filename=object_detection-0.1-cp36-none-any.whl size=1350338 sha256=a6d509938d06f53ad6992eaa40e1f17e1db2f320e769a54d8c9d1acd88dcfc39
  Stored in directory: /tmp/pip-ephem-wheel-cache-ofebzvkx/wheels/c2/aa/a5/d66d82acb7c4274fd81cecca818e0bf840ace7b657663f672e
Successfully built object-detection
Installing collected packages: object-detection
Successfully installed object-detection-0.1


In [19]:
%env PYTHONPATH=/env/python:/content/models:/content/models/research/slim

env: PYTHONPATH=/env/python:/content/models:/content/models/research/slim


In [20]:
!python object_detection/builders/model_builder_tf2_test.py

2020-10-22 10:32:16.985029: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Running tests under Python 3.6.9: /usr/bin/python3
[ RUN      ] ModelBuilderTF2Test.test_create_center_net_model
2020-10-22 10:32:19.452566: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-22 10:32:19.511819: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-22 10:32:19.512428: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2020-10-22 10:32:19.512490: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully o

## Data preparation

In [None]:
!mkdir data data/tfrecords

In [None]:
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
!tar -xvf images.tar.gz
!tar -xvf annotations.tar.gz
!rm -rf images.tar.gz
!rm -rf annotations.tar.gz
!mv images data/
!mv annotations data/

In [None]:
!git clone https://github.com/johntikas/pet-detection

In [None]:
!cp -avr pet-detection/images data/

In [None]:
!cp -avr pet-detection/xmls data/annotations/

In [None]:
!python object_detection/dataset_tools/create_pet_tf_record.py \
 --label_map_path=object_detection/data/pet_label_map.pbtxt \
 --faces_only=False \
 --data_dir=data \
 --output_dir=data/tfrecords
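
Before or after generating the TFRecords, it can help to see which images have no bounding-box annotation. The following is a rough sketch, not part of the original notebook: the helper name `images_without_annotation` is hypothetical, and the layout it assumes (`.jpg` files in `data/images`, one `.xml` file per annotated image in `data/annotations/xmls`) is inferred from the commands above.

```python
from pathlib import Path


def images_without_annotation(image_dir, xml_dir):
    """Return the sorted stems of .jpg images that have no matching .xml file."""
    image_stems = {p.stem for p in Path(image_dir).glob("*.jpg")}
    xml_stems = {p.stem for p in Path(xml_dir).glob("*.xml")}
    return sorted(image_stems - xml_stems)


missing = images_without_annotation("data/images", "data/annotations/xmls")
print(len(missing), "images have no XML annotation")
```

A non-empty result is not necessarily an error, since a dataset may ship more images than annotations, but it shows how many images cannot contribute to training.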

In [None]:
!wget http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz
!tar -xvf mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz
!rm -rf mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz

## Model - Training

In [None]:
!python object_detection/model_main_tf2.py \
--logtostderr \
--pipeline_config_path=pet-detection/configs/mask_rcnn_inception_resnet_v2_pets.config \
--model_dir=model

2020-10-22 10:27:15.973507: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-22 10:27:18.197331: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-22 10:27:18.200015: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-22 10:27:18.200431: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla P100-PCIE-16GB computeCapability: 6.0
coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-10-22 10:27:18.200465: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-22 10:27:18.202448: I tensorflow/stream_executor/pl
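
While training runs, checkpoints accumulate in the directory passed as `--model_dir`. The sketch below lists the checkpoint numbers found there. It is an illustration under assumptions: the helper name `list_checkpoint_numbers` is hypothetical, and it assumes checkpoints are saved with the `ckpt-<number>` prefix, each with a `.index` file.

```python
import re
from pathlib import Path


def list_checkpoint_numbers(model_dir):
    """Return the sorted checkpoint numbers that have a ckpt-<n>.index file."""
    pattern = re.compile(r"^ckpt-(\d+)\.index$")
    numbers = []
    for path in Path(model_dir).glob("ckpt-*.index"):
        match = pattern.match(path.name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


print(list_checkpoint_numbers("model"))
```

An empty list means no checkpoint has been written yet, in which case the export and inference steps that follow have nothing to load.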

## Model - Inference

In [None]:
!python object_detection/exporter_main_v2.py \
--input_type=image_tensor \
--pipeline_config_path=pet-detection/configs/mask_rcnn_inception_resnet_v2_pets.config \
--trained_checkpoint_dir=model \
--output_directory=model/export

In [None]:
import sys
sys.path.append('/content/models')

In [None]:
import os
from io import BytesIO

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from PIL import Image

from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.builders import model_builder

%matplotlib inline

In [None]:
def load_image_into_numpy_array(path):
    """Load an image file into a uint8 array of shape (height, width, 3)."""
    with tf.io.gfile.GFile(path, 'rb') as f:
        img_data = f.read()
    # Force three channels so grayscale or RGBA files do not break the shape.
    image = Image.open(BytesIO(img_data)).convert('RGB')
    return np.array(image, dtype=np.uint8)

def get_keypoint_tuples(eval_config):
    """Return the keypoint edges of eval_config as (start, end) tuples."""
    return [(edge.start, edge.end) for edge in eval_config.keypoint_edge]

In [None]:
current_dir = os.getcwd()
config_path = 'pet-detection/configs/mask_rcnn_inception_resnet_v2_pets.config'
pipeline_config = os.path.join(current_dir, config_path)
model_dir = os.path.join(current_dir, 'model')

# Load pipeline config and build a detection model
configs = config_util.get_configs_from_pipeline_file(pipeline_config)
model_config = configs['model']
detection_model = model_builder.build(
      model_config=model_config, is_training=False)

# Restore the latest training checkpoint; without it the model keeps its
# randomly initialized weights and the detections are meaningless.
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
latest_ckpt = tf.train.latest_checkpoint(model_dir)
if latest_ckpt is not None:
    ckpt.restore(latest_ckpt).expect_partial()

def get_model_detection_function(model):
  @tf.function
  def detect_fn(image):
    image, shapes = model.preprocess(image)
    prediction_dict = model.predict(image, shapes)
    detections = model.postprocess(prediction_dict, shapes)
    return detections, prediction_dict, tf.reshape(shapes, [-1])
  return detect_fn

detect_fn = get_model_detection_function(detection_model)

In [None]:
label_map_path = configs['eval_input_config'].label_map_path
label_map = label_map_util.load_labelmap(label_map_path)
categories = label_map_util.convert_label_map_to_categories(
    label_map,
    max_num_classes=label_map_util.get_max_label_map_index(label_map),
    use_display_name=True)
category_index = label_map_util.create_category_index(categories)
label_map_dict = label_map_util.get_label_map_dict(label_map, use_display_name=False)

In [None]:
image_dir = '/content/models/research/data/images/'
image_path = os.path.join(image_dir, 'Abyssinian_100.jpg')
image_np = load_image_into_numpy_array(image_path)
input_tensor = tf.convert_to_tensor(
    np.expand_dims(image_np, 0), dtype=tf.float32)
detections, predictions_dict, shapes = detect_fn(input_tensor)

label_id_offset = 1
image_np_with_detections = image_np.copy()

# Use keypoints if available in detections
keypoints, keypoint_scores = None, None
if 'detection_keypoints' in detections:
  keypoints = detections['detection_keypoints'][0].numpy()
  keypoint_scores = detections['detection_keypoint_scores'][0].numpy()

viz_utils.visualize_boxes_and_labels_on_image_array(
      image_np_with_detections,
      detections['detection_boxes'][0].numpy(),
      (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
      detections['detection_scores'][0].numpy(),
      category_index,
      use_normalized_coordinates=True,
      max_boxes_to_draw=200,
      min_score_thresh=.30,
      agnostic_mode=False,
      keypoints=keypoints,
      keypoint_scores=keypoint_scores,
      keypoint_edges=get_keypoint_tuples(configs['eval_config']))

plt.figure(figsize=(12,16))
plt.imshow(image_np_with_detections)
plt.show()
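
The visualization call above is given `use_normalized_coordinates=True`, because each box in `detections['detection_boxes']` is stored as `[ymin, xmin, ymax, xmax]` relative to the image size. To crop a detection or draw it with another library, the box has to be converted to pixels first. The following is a minimal sketch of that conversion; the helper name `box_to_pixels` is hypothetical and not part of the Object Detection API.

```python
def box_to_pixels(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates.

    Returns (left, top, right, bottom), the order PIL's Image.crop expects.
    """
    ymin, xmin, ymax, xmax = box
    return (
        int(round(xmin * image_width)),
        int(round(ymin * image_height)),
        int(round(xmax * image_width)),
        int(round(ymax * image_height)),
    )


# A box covering the right half of the middle band of a 400x200 image.
print(box_to_pixels([0.25, 0.5, 0.75, 1.0], image_width=400, image_height=200))
# prints (200, 50, 400, 150)
```

With the variables from the cell above, the width and height would come from `image_np.shape[1]` and `image_np.shape[0]` respectively.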