<a href="https://colab.research.google.com/github/JAMES-YI/T01_Tensorflow_Tutorials/blob/master/Object_Detection_Sample_Code_JYI_20210712.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2020 The TensorFlow Hub Authors.

Licensed under the Apache License, Version 2.0 (the "License");

Code originally from: https://github.com/tensorflow/hub/blob/master/examples/colab/tf2_object_detection.ipynb

Modified by JYI, 07/12/2021

In [None]:
#@title Copyright 2020 The TensorFlow Hub Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/hub/tutorials/tf2_object_detection"><img src="https://www.tensorflow.org/images/tf_logo_32px.png" />View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/tf2_object_detection.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/hub/blob/master/examples/colab/tf2_object_detection.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_docs/hub/examples/colab/tf2_object_detection.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
  <td>
    <a href="https://tfhub.dev/tensorflow/collections/object_detection/1"><img src="https://www.tensorflow.org/images/hub_logo_32px.png" />See TF Hub models</a>
  </td>
</table>


- visit [object detection models in TF](https://tfhub.dev/tensorflow/collections/object_detection/1)
- visit [object detection models in tfhub](https://tfhub.dev/s?module-type=image-object-detection)

# Step: Library installation, imports and setup


In [None]:
# This Colab requires TF 2.5 or later. Quote the requirement so the shell
# does not treat ">" as an output redirect.
!pip install -U "tensorflow>=2.5"


In [None]:
import os
import pathlib

import matplotlib
import matplotlib.pyplot as plt

import io
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from six.moves.urllib.request import urlopen

import tensorflow as tf
import tensorflow_hub as hub

tf.get_logger().setLevel('ERROR')

Documentations
- pip install, https://pip.pypa.io/en/stable/cli/pip_install/
- tf.get_logger().setLevel('ERROR'), controls which log messages are printed (here only errors), see [ref 1](https://stackoverflow.com/questions/38073432/how-to-suppress-verbose-tensorflow-logging)



# Step: Utilities

Run the following cell to create some utils that will be needed later:

- Helper method to load an image
- Map of Model Name to TF Hub handle
- List of tuples with Human Keypoints for the COCO 2017 dataset. This is needed for models with keypoints.

In [None]:
def load_image_into_numpy_array(path):
  """Load an image from file into a numpy array.

  Puts image into numpy array to feed into tensorflow graph.
  Note that by convention we put it into a numpy array with shape
  (height, width, channels), where channels=3 for RGB.

  Args:
    path: the file path to the image

  Returns:
    uint8 numpy array with shape (1, img_height, img_width, 3)
  """
  image = None
  if(path.startswith('http')):
    response = urlopen(path)
    image_data = response.read()
    image_data = BytesIO(image_data)
    image = Image.open(image_data)
  else:
    image_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(image_data))

  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (1, im_height, im_width, 3)).astype(np.uint8)


In [None]:
"""
JYI - Exploration of image loading

"""

# IMAGES_FOR_TEST = {
#   'Beach' : 'models/research/object_detection/test_images/image2.jpg',
#   'Dogs' : 'models/research/object_detection/test_images/image1.jpg',
#   # By Heiko Gorski, Source: https://commons.wikimedia.org/wiki/File:Naxos_Taverna.jpg
#   'Naxos Taverna' : 'https://upload.wikimedia.org/wikipedia/commons/6/60/Naxos_Taverna.jpg',
#   # Source: https://commons.wikimedia.org/wiki/File:The_Coleoptera_of_the_British_islands_(Plate_125)_(8592917784).jpg
#   'Beatles' : 'https://upload.wikimedia.org/wikipedia/commons/1/1b/The_Coleoptera_of_the_British_islands_%28Plate_125%29_%288592917784%29.jpg',
#   # By Américo Toledano, Source: https://commons.wikimedia.org/wiki/File:Biblioteca_Maim%C3%B3nides,_Campus_Universitario_de_Rabanales_007.jpg
#   'Phones' : 'https://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/Biblioteca_Maim%C3%B3nides%2C_Campus_Universitario_de_Rabanales_007.jpg/1024px-Biblioteca_Maim%C3%B3nides%2C_Campus_Universitario_de_Rabanales_007.jpg',
#   # Source: https://commons.wikimedia.org/wiki/File:The_smaller_British_birds_(8053836633).jpg
#   'Birds' : 'https://upload.wikimedia.org/wikipedia/commons/0/09/The_smaller_British_birds_%288053836633%29.jpg',
# }

# img_path = IMAGES_FOR_TEST["Birds"]

# response = urlopen(img_path)
# # print(f"response: {response}")
# # print(f"response: {response.read()}")
# # print(f"response: {type(response.read())}")

# image_data = response.read() # binary data
# print(f"image_data: {image_data}")

# image_data = BytesIO(image_data) # io.Bytesio object
# print(f"image data: {image_data}")

# image = Image.open(image_data) # PIL.JpegImagePlugin.JpegImageFile
# print(f"image: {image}")
# print(f"image size: {image.size}")

# image = image.getdata() # ImagingCore object
# print(f"image: {image}")

# image = np.array(image) # np.ndarray
# print(f"image: {image}")


Tips
- images need to be loaded as numpy arrays to feed into the tensorflow graph
- the loading chain: urlopen() returns a response object whose read() method returns the raw bytes --> BytesIO() wraps the bytes in an io.BytesIO file-like object --> Image.open() returns a PIL image (here a PIL.JpegImagePlugin.JpegImageFile) --> its getdata() method returns an ImagingCore object --> np.array() turns the ImagingCore object into an np.ndarray
- 
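
The loading chain in the tip above can be tried without any network access. The sketch below is only an illustration: it builds a tiny image in memory to stand in for the bytes that `response.read()` would return, and it uses `np.asarray` on the PIL image instead of `getdata()`, which is a swapped-in shortcut rather than what `load_image_into_numpy_array` does.

```python
from io import BytesIO

import numpy as np
from PIL import Image

# Build a small RGB image in memory and encode it as PNG bytes,
# standing in for the bytes that response.read() would return.
original = Image.new("RGB", (4, 2), color=(255, 0, 0))  # width=4, height=2
buffer = BytesIO()
original.save(buffer, format="PNG")
raw_bytes = buffer.getvalue()

# bytes -> file-like object -> PIL image -> numpy array
image = Image.open(BytesIO(raw_bytes)).convert("RGB")
image_array = np.asarray(image, dtype=np.uint8)  # shape (height, width, 3)
batched = image_array[np.newaxis, ...]           # shape (1, height, width, 3)

print(type(raw_bytes).__name__, image.size, batched.shape)
```

Note that `image.size` is (width, height), while the array shape is (height, width, channels).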


ToDos
- try other models
- try other images
- read papers about the models
- meaning of COCO17_HUMAN_POSE_KEYPOINTS? specifies how the 17 key points are connected
- explore all the models in ALL_MODELS

Documentations
- path.startswith, https://www.tutorialspoint.com/python/string_startswith.htm
- urlopen, https://docs.python.org/3/library/urllib.request.html
- BytesIO, https://docs.python.org/3/library/io.html#io.BytesIO
- Image.open, https://pillow.readthedocs.io/en/stable/reference/Image.html
- tf.io.gfile.GFile, https://www.tensorflow.org/api_docs/python/tf/io/gfile/GFile
- 

In [None]:
# model list, and image list

ALL_MODELS = {
'CenterNet HourGlass104 512x512' : 'https://tfhub.dev/tensorflow/centernet/hourglass_512x512/1',
'CenterNet HourGlass104 Keypoints 512x512' : 'https://tfhub.dev/tensorflow/centernet/hourglass_512x512_kpts/1',
'CenterNet HourGlass104 1024x1024' : 'https://tfhub.dev/tensorflow/centernet/hourglass_1024x1024/1',
'CenterNet HourGlass104 Keypoints 1024x1024' : 'https://tfhub.dev/tensorflow/centernet/hourglass_1024x1024_kpts/1',
'CenterNet Resnet50 V1 FPN 512x512' : 'https://tfhub.dev/tensorflow/centernet/resnet50v1_fpn_512x512/1',
'CenterNet Resnet50 V1 FPN Keypoints 512x512' : 'https://tfhub.dev/tensorflow/centernet/resnet50v1_fpn_512x512_kpts/1',
'CenterNet Resnet101 V1 FPN 512x512' : 'https://tfhub.dev/tensorflow/centernet/resnet101v1_fpn_512x512/1',
'CenterNet Resnet50 V2 512x512' : 'https://tfhub.dev/tensorflow/centernet/resnet50v2_512x512/1',
'CenterNet Resnet50 V2 Keypoints 512x512' : 'https://tfhub.dev/tensorflow/centernet/resnet50v2_512x512_kpts/1',
'EfficientDet D0 512x512' : 'https://tfhub.dev/tensorflow/efficientdet/d0/1',
'EfficientDet D1 640x640' : 'https://tfhub.dev/tensorflow/efficientdet/d1/1',
'EfficientDet D2 768x768' : 'https://tfhub.dev/tensorflow/efficientdet/d2/1',
'EfficientDet D3 896x896' : 'https://tfhub.dev/tensorflow/efficientdet/d3/1',
'EfficientDet D4 1024x1024' : 'https://tfhub.dev/tensorflow/efficientdet/d4/1',
'EfficientDet D5 1280x1280' : 'https://tfhub.dev/tensorflow/efficientdet/d5/1',
'EfficientDet D6 1280x1280' : 'https://tfhub.dev/tensorflow/efficientdet/d6/1',
'EfficientDet D7 1536x1536' : 'https://tfhub.dev/tensorflow/efficientdet/d7/1',
'SSD MobileNet v2 320x320' : 'https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2',
'SSD MobileNet V1 FPN 640x640' : 'https://tfhub.dev/tensorflow/ssd_mobilenet_v1/fpn_640x640/1',
'SSD MobileNet V2 FPNLite 320x320' : 'https://tfhub.dev/tensorflow/ssd_mobilenet_v2/fpnlite_320x320/1',
'SSD MobileNet V2 FPNLite 640x640' : 'https://tfhub.dev/tensorflow/ssd_mobilenet_v2/fpnlite_640x640/1',
'SSD ResNet50 V1 FPN 640x640 (RetinaNet50)' : 'https://tfhub.dev/tensorflow/retinanet/resnet50_v1_fpn_640x640/1',
'SSD ResNet50 V1 FPN 1024x1024 (RetinaNet50)' : 'https://tfhub.dev/tensorflow/retinanet/resnet50_v1_fpn_1024x1024/1',
'SSD ResNet101 V1 FPN 640x640 (RetinaNet101)' : 'https://tfhub.dev/tensorflow/retinanet/resnet101_v1_fpn_640x640/1',
'SSD ResNet101 V1 FPN 1024x1024 (RetinaNet101)' : 'https://tfhub.dev/tensorflow/retinanet/resnet101_v1_fpn_1024x1024/1',
'SSD ResNet152 V1 FPN 640x640 (RetinaNet152)' : 'https://tfhub.dev/tensorflow/retinanet/resnet152_v1_fpn_640x640/1',
'SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)' : 'https://tfhub.dev/tensorflow/retinanet/resnet152_v1_fpn_1024x1024/1',
'Faster R-CNN ResNet50 V1 640x640' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet50_v1_640x640/1',
'Faster R-CNN ResNet50 V1 1024x1024' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet50_v1_1024x1024/1',
'Faster R-CNN ResNet50 V1 800x1333' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet50_v1_800x1333/1',
'Faster R-CNN ResNet101 V1 640x640' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet101_v1_640x640/1',
'Faster R-CNN ResNet101 V1 1024x1024' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet101_v1_1024x1024/1',
'Faster R-CNN ResNet101 V1 800x1333' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet101_v1_800x1333/1',
'Faster R-CNN ResNet152 V1 640x640' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet152_v1_640x640/1',
'Faster R-CNN ResNet152 V1 1024x1024' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet152_v1_1024x1024/1',
'Faster R-CNN ResNet152 V1 800x1333' : 'https://tfhub.dev/tensorflow/faster_rcnn/resnet152_v1_800x1333/1',
'Faster R-CNN Inception ResNet V2 640x640' : 'https://tfhub.dev/tensorflow/faster_rcnn/inception_resnet_v2_640x640/1',
'Faster R-CNN Inception ResNet V2 1024x1024' : 'https://tfhub.dev/tensorflow/faster_rcnn/inception_resnet_v2_1024x1024/1',
'Mask R-CNN Inception ResNet V2 1024x1024' : 'https://tfhub.dev/tensorflow/mask_rcnn/inception_resnet_v2_1024x1024/1'
}

COCO17_HUMAN_POSE_KEYPOINTS = [(0, 1),
 (0, 2),
 (1, 3),
 (2, 4),
 (0, 5),
 (0, 6),
 (5, 7),
 (7, 9),
 (6, 8),
 (8, 10),
 (5, 6),
 (5, 11),
 (6, 12),
 (11, 12),
 (11, 13),
 (13, 15),
 (12, 14),
 (14, 16)]

# Step: Visualization tools

Commonly used utility functions
- To visualize the images with the detected boxes, keypoints and segmentation masks, we will use the TensorFlow Object Detection API. To install it, we clone the repository.

Tips
- visit https://github.com/tensorflow/models
- file managements in colab, https://neptune.ai/blog/google-colab-dealing-with-files
- use line-magic (%) or bash (!) to use shell commands
- clone an entire github repository to the colab environment via git clone; then read files as on a local machine
- access local file systems using python code, from google.colab import files
- access google drive from google colab, from google.colab import drive
- access google sheets from google colab
- access google cloud storage from google colab
- access AWS S3 from google colab
- access kaggle datasets from google colab
- access MySQL databases from google colab

ToDos
- how to check the directories in colab? use file panel on the left, https://neptune.ai/blog/google-colab-dealing-with-files
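
Besides the file panel, directories can also be listed from Python. A minimal sketch, assuming only the standard library; `list_directory` is a hypothetical helper name, and the demo runs on a temporary directory so it does not depend on the Colab file system.

```python
import tempfile
from pathlib import Path

def list_directory(path="."):
    """Return sorted entry names in `path`, with a trailing '/' on directories."""
    entries = []
    for entry in sorted(Path(path).iterdir()):
        entries.append(entry.name + "/" if entry.is_dir() else entry.name)
    return entries

# Demonstrate on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "models").mkdir()
    (Path(tmp) / "notes.txt").write_text("hello")
    listing = list_directory(tmp)
    print(listing)
```

In Colab, calling `list_directory(".")` after the clone step should show a `models/` entry.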


Documentations
- git clone --depth, https://www.perforce.com/blog/vcs/git-beyond-basics-using-shallow-clones 

In [None]:
# Clone the tensorflow models repository for the most recent commit
!git clone --depth 1 https://github.com/tensorflow/models

Tips
- after cloning all the files from https://github.com/tensorflow/models, we simply use the repository as a library of common functions; the installation steps below set up its dependencies.
- protoc object_detection/protos/*.proto --python_out=. (this command compiles the .proto definitions into Python modules, written to the current directory)

Documentations
- sudo apt install, https://linuxize.com/post/how-to-use-apt-command/
- protobuf-compiler, https://developers.google.com/protocol-buffers/docs/overview
- protoc, https://developers.google.com/protocol-buffers/docs/proto3
- more about protoc see [ref1](https://towardsdatascience.com/how-to-install-tensorflow-2-object-detection-api-on-windows-2eef9b7ae869)
- python -m pip install . (this command installs the package found in the current directory, together with the dependencies listed in its setup.py)
- cp object_detection/packages/tf2/setup.py . (this command copies the file setup.py into the current directory)
- more about cp in terminal see [ref 1](https://www.geeksforgeeks.org/cp-command-linux-examples/)


Tips
- dot in terminal: refers to the current directory
- check version of pip via: !python -m pip --version



In [None]:
%%bash
# install object detection API (the %%bash magic must be the first line of the cell)
sudo apt install -y protobuf-compiler
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .


In [None]:

"""
JYI - Exploration of installed object detection APIs

"""


Tips
- put %%bash on the first line of a cell to run the whole cell as a bash script

In [None]:
!pwd
!ls


Tips
- after running the bash commands, the working directory returns to the original directory, because a %%bash cell runs in its own subprocess

ToDos
- how to import functions from deep_speech, lstm_object_detection
- play with a complete research project in a research paper


In [None]:
"""
JYI - Import dependencies
- the following is required for importing functions from object_detection.utils

%%bash
sudo apt install -y protobuf-compiler
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .

- 
"""
# import dependencies
from urllib.request import OpenerDirector
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import ops as utils_ops

%matplotlib inline

Documentations
- urllib.request.OpenerDirector
- object_detection.utils.label_map_util
- object_detection.utils.visualization_utils
- object_detection.utils.ops

ToDos
- how to import libraries, module, directory, functions?
- 


# Step: Load label map data (for plotting).

Tips
- Label maps map index numbers to category names, so that when our convolutional network predicts `5`, we know that this corresponds to `airplane`.  Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine.
- the label map is loaded from the same repository that we cloned for the Object Detection API code
- the label map is a nested dictionary: the outer dictionary is keyed by the class index, and each inner dictionary contains the 'id' and 'name' of the class
- try different label maps to see how data is stored
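
To make the nested-dictionary structure concrete, here is a small sketch. `sample_category_index` is a hand-written excerpt in the same shape as `category_index` (it is not produced by `label_map_util`), and `class_name` is a hypothetical lookup helper.

```python
# Hand-written excerpt in the same shape as category_index.
sample_category_index = {
    1: {"id": 1, "name": "person"},
    2: {"id": 2, "name": "bicycle"},
    3: {"id": 3, "name": "car"},
    5: {"id": 5, "name": "airplane"},
}

def class_name(category_index, class_id, default="unknown"):
    """Look up the display name for a predicted class id."""
    return category_index.get(int(class_id), {}).get("name", default)

print(class_name(sample_category_index, 5))
print(class_name(sample_category_index, 4.0))  # id 4 is left out of this excerpt
```

The `int(...)` cast matters because models typically return class ids as floats.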

In [None]:
PATH_TO_LABELS = './models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

In [None]:
"""
JYI - Exploration of other label_map
"""

# path_to_label = './models/research/object_detection/data/face_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # face

# path_to_label = './models/research/object_detection/data/ava_label_map_v2.1.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # activities

# path_to_label = './models/research/object_detection/data/face_person_with_keypoints_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # key points in a face
# print(f"cate_index[1]: {cate_index[1]}")

# path_to_label = './models/research/object_detection/data/fgvc_2854_classes_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") #?

# path_to_label = './models/research/object_detection/data/kitti_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # car & ped

# path_to_label = './models/research/object_detection/data/mscoco_complete_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # various objects

# path_to_label = './models/research/object_detection/data/oid_bbox_trainable_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # various objects

# path_to_label = './models/research/object_detection/data/pascal_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # various objects

# path_to_label = './models/research/object_detection/data/pet_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}") # different pets

# path_to_label = './models/research/object_detection/data/snapshot_serengeti_label_map.pbtxt'
# cate_index = label_map_util.create_category_index_from_labelmap(path_to_label,
#                                                                 use_display_name=True)
# print(f"cate_index: {cate_index}")

In [None]:
"""
JYI - Exploration of label map
"""

# print(f"category_index: {category_index}")
# print(f"category_index[2]: {category_index[2]}")
# print(f"category_index[2]['id']: {category_index[2]['id']}")
# print(f"category_index[2]['name']: {category_index[2]['name']}")


# Step: Build a detection model and load pre-trained model weights

Tips
- choose which Object Detection model we will use.
Select the architecture and it will be loaded automatically.
If you want to change the model to try other architectures later, just change the next cell and execute following ones.
- we specify the URL of the model in model_handle


ToDos
- construct customized object detection models
- retrain the model
- fine-tuning of models

In [None]:
# Select model and specify the web link of models
model_display_name = 'CenterNet HourGlass104 Keypoints 512x512' 
model_handle = ALL_MODELS[model_display_name]

print('Selected model: ' + model_display_name)
print('Model Handle at TensorFlow Hub: {}'.format(model_handle))

Tips
- create code chunk title by using #@title
- list of all models: ['CenterNet HourGlass104 512x512','CenterNet HourGlass104 Keypoints 512x512','CenterNet HourGlass104 1024x1024','CenterNet HourGlass104 Keypoints 1024x1024','CenterNet Resnet50 V1 FPN 512x512','CenterNet Resnet50 V1 FPN Keypoints 512x512','CenterNet Resnet101 V1 FPN 512x512','CenterNet Resnet50 V2 512x512','CenterNet Resnet50 V2 Keypoints 512x512','EfficientDet D0 512x512','EfficientDet D1 640x640','EfficientDet D2 768x768','EfficientDet D3 896x896','EfficientDet D4 1024x1024','EfficientDet D5 1280x1280','EfficientDet D6 1280x1280','EfficientDet D7 1536x1536','SSD MobileNet v2 320x320','SSD MobileNet V1 FPN 640x640','SSD MobileNet V2 FPNLite 320x320','SSD MobileNet V2 FPNLite 640x640','SSD ResNet50 V1 FPN 640x640 (RetinaNet50)','SSD ResNet50 V1 FPN 1024x1024 (RetinaNet50)','SSD ResNet101 V1 FPN 640x640 (RetinaNet101)','SSD ResNet101 V1 FPN 1024x1024 (RetinaNet101)','SSD ResNet152 V1 FPN 640x640 (RetinaNet152)','SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)','Faster R-CNN ResNet50 V1 640x640','Faster R-CNN ResNet50 V1 1024x1024','Faster R-CNN ResNet50 V1 800x1333','Faster R-CNN ResNet101 V1 640x640','Faster R-CNN ResNet101 V1 1024x1024','Faster R-CNN ResNet101 V1 800x1333','Faster R-CNN ResNet152 V1 640x640','Faster R-CNN ResNet152 V1 1024x1024','Faster R-CNN ResNet152 V1 800x1333','Faster R-CNN Inception ResNet V2 640x640','Faster R-CNN Inception ResNet V2 1024x1024','Mask R-CNN Inception ResNet V2 1024x1024']
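
With this many models, picking names by keyword can be easier than scanning the list. A minimal sketch, assuming a small excerpt copied from `ALL_MODELS`; `find_models` is a hypothetical helper, not part of TensorFlow Hub.

```python
# Small excerpt of ALL_MODELS, copied from the dictionary defined earlier.
models_excerpt = {
    'CenterNet HourGlass104 512x512': 'https://tfhub.dev/tensorflow/centernet/hourglass_512x512/1',
    'CenterNet HourGlass104 Keypoints 512x512': 'https://tfhub.dev/tensorflow/centernet/hourglass_512x512_kpts/1',
    'EfficientDet D0 512x512': 'https://tfhub.dev/tensorflow/efficientdet/d0/1',
    'Mask R-CNN Inception ResNet V2 1024x1024': 'https://tfhub.dev/tensorflow/mask_rcnn/inception_resnet_v2_1024x1024/1',
}

def find_models(models, keyword):
    """Return display names containing `keyword`, case-insensitively."""
    return [name for name in models if keyword.lower() in name.lower()]

keypoint_models = find_models(models_excerpt, "keypoints")
print(keypoint_models)
print(models_excerpt[keypoint_models[0]])
```

The same helper can be applied to the full `ALL_MODELS` dictionary, for example to list every "Faster R-CNN" variant.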

# Step: Loading the selected model from TensorFlow Hub

Tips
- Here we just need the model handle that was selected and use the Tensorflow Hub library to load it to memory.
- handle is essentially the hyperlink to the model
- 

Documentations
- hub.load, https://github.com/tensorflow/hub/blob/v0.12.0/tensorflow_hub/module_v2.py#L50-L108
- tf.saved_model.load, https://github.com/tensorflow/tensorflow/blob/v2.5.0/tensorflow/python/saved_model/load.py#L778-L869



In [None]:
# Load model
print('loading model...')
hub_model = hub.load(model_handle)
print('model loaded!')

In [None]:
"""
JYI - Exploration of loaded model
- hub_model has the following attributes or methods: 
hub_model.graph_debug_info, hub_model.tensorflow_git_version,
hub_model.tensorflow_version, hub_model.signatures 
"""

print(f"hub_model: {hub_model}")

In [None]:
"""
JYI - Exploration of loaded model
"""

In [None]:
"""
JYI - Exploration of loading other models
"""

ToDos
- how to explore and use loaded models? 

Tips
- tensorflow_hub.load returns a trackable object

Documentations
- hub.load, similar to tf.saved_model.load, https://www.tensorflow.org/hub/api_docs/python/hub/load
- tf.saved_model.load, https://www.tensorflow.org/api_docs/python/tf/saved_model/load

- exploration of loaded model
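
One way to explore a loaded model is to look at its `signatures` mapping. The sketch below is an assumption-laden illustration: `summarize_signatures` is a hypothetical helper, it assumes each signature exposes `structured_input_signature` and `structured_outputs` (as TensorFlow concrete functions do), and the demo uses a fake stand-in object so that it runs without downloading a model.

```python
from types import SimpleNamespace

def summarize_signatures(model):
    """Map each signature name to its declared inputs and outputs, if exposed."""
    summary = {}
    signatures = getattr(model, "signatures", None) or {}
    for name, fn in signatures.items():
        summary[name] = {
            "inputs": getattr(fn, "structured_input_signature", None),
            "outputs": getattr(fn, "structured_outputs", None),
        }
    return summary

# Hypothetical stand-in for a loaded model, only to exercise the helper offline.
fake_fn = SimpleNamespace(
    structured_input_signature="<input spec>",
    structured_outputs={"detection_boxes": "<tensor spec>"},
)
fake_model = SimpleNamespace(signatures={"serving_default": fake_fn})
print(summarize_signatures(fake_model))
```

After `hub.load`, calling `summarize_signatures(hub_model)` should list the output keys the model declares, if it exports any signatures.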

# Step: Loading an image

ToDos
- try different images, ['Beach', 'Dogs', 'Naxos Taverna', 'Beatles', 'Phones', 'Birds']
- Try running inference on your own images: just upload them to colab and load them the same way it's done in the cell below.
- Modify some of the input images and see if detection still works.  Some simple things to try out here include flipping the image horizontally, or converting to grayscale (note that we still expect the input image to have 3 channels).
- when using images with an alpha channel, note that the model expects 3-channel images and the alpha channel would count as a 4th.
- alpha channel
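
A possible way to handle images that do not have exactly 3 channels is sketched below. `ensure_three_channels` is a hypothetical helper; it simply discards the alpha channel, which ignores transparency, so treat it as one option rather than the recommended fix.

```python
import numpy as np

def ensure_three_channels(image_np):
    """Return an (H, W, 3) uint8 array from a grayscale, RGB, or RGBA array."""
    if image_np.ndim == 2:                 # grayscale: replicate to 3 channels
        image_np = np.stack([image_np] * 3, axis=-1)
    elif image_np.shape[-1] == 4:          # RGBA: drop the alpha channel
        image_np = image_np[..., :3]
    return image_np.astype(np.uint8)

rgba = np.zeros((2, 3, 4), dtype=np.uint8)
rgba[..., 3] = 255                         # fully opaque alpha
gray = np.full((2, 3), 7, dtype=np.uint8)
print(ensure_three_channels(rgba).shape, ensure_three_channels(gray).shape)
```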



In [None]:
# utility functions

# Flip horizontally
def horizontal_flip(image_np):
  """
  image_np: a single image stored in numpy array
  """
  image_np = np.fliplr(image_np).copy()
  return image_np

# Flip vertical

def vertical_flip(image_np):

  pass

# Convert image to grayscale
def to_gray(image_np):
  """
  image_np: a single image stored in numpy array

  tips
  - average over channels, then replicate the result three times along the channel axis
  - the returned image is still of shape (H,W,C)
  """
  image_np = np.tile(
      np.mean(image_np, 2, keepdims=True),
      (1, 1, 3)).astype(np.uint8)
  return image_np

# pixel shifting

def shift(image_np):
  pass

# rotation

def rotation(image_np):
  pass


ToDos
- implement vertical_flip, shift, rotation

Documentations
- np.fliplr, https://numpy.org/doc/stable/reference/generated/numpy.fliplr.html
- np.tile, replicate given number of times along given dimension, https://numpy.org/doc/stable/reference/generated/numpy.tile.html
- np.mean, https://numpy.org/doc/stable/reference/generated/numpy.mean.html
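
For the ToDo on vertical_flip, shift and rotation, one possible set of implementations is sketched here. These are assumptions, not the author's versions: the extra parameters `dy`, `dx` and `quarter_turns` are introduced for illustration, the shift wraps pixels around the edges (via `np.roll`), and the rotation only supports multiples of 90 degrees (via `np.rot90`).

```python
import numpy as np

def vertical_flip(image_np):
    """Flip an (H, W, C) image top to bottom."""
    return np.flipud(image_np).copy()

def shift(image_np, dy=0, dx=0):
    """Shift pixels by (dy, dx); pixels pushed past an edge wrap around."""
    return np.roll(image_np, shift=(dy, dx), axis=(0, 1))

def rotation(image_np, quarter_turns=1):
    """Rotate counterclockwise in 90-degree steps within the (H, W) plane."""
    return np.rot90(image_np, k=quarter_turns, axes=(0, 1)).copy()

demo = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
print(vertical_flip(demo)[0, 0], demo[1, 0])  # the same pixel
print(shift(demo, dx=1).shape, rotation(demo).shape)
```

Arbitrary-angle rotation or zero-filled shifting would need a different approach, for example an image library.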

In [None]:
# image exploration

IMAGES_FOR_TEST = {
  'Beach' : 'models/research/object_detection/test_images/image2.jpg',
  'Dogs' : 'models/research/object_detection/test_images/image1.jpg',
  # By Heiko Gorski, Source: https://commons.wikimedia.org/wiki/File:Naxos_Taverna.jpg
  'Naxos Taverna' : 'https://upload.wikimedia.org/wikipedia/commons/6/60/Naxos_Taverna.jpg',
  # Source: https://commons.wikimedia.org/wiki/File:The_Coleoptera_of_the_British_islands_(Plate_125)_(8592917784).jpg
  'Beatles' : 'https://upload.wikimedia.org/wikipedia/commons/1/1b/The_Coleoptera_of_the_British_islands_%28Plate_125%29_%288592917784%29.jpg',
  # By Américo Toledano, Source: https://commons.wikimedia.org/wiki/File:Biblioteca_Maim%C3%B3nides,_Campus_Universitario_de_Rabanales_007.jpg
  'Phones' : 'https://upload.wikimedia.org/wikipedia/commons/thumb/0/0d/Biblioteca_Maim%C3%B3nides%2C_Campus_Universitario_de_Rabanales_007.jpg/1024px-Biblioteca_Maim%C3%B3nides%2C_Campus_Universitario_de_Rabanales_007.jpg',
  # Source: https://commons.wikimedia.org/wiki/File:The_smaller_British_birds_(8053836633).jpg
  'Birds' : 'https://upload.wikimedia.org/wikipedia/commons/0/09/The_smaller_British_birds_%288053836633%29.jpg',
}

selected_image = 'Beach' 
flip_image_horizontally = False 
convert_image_to_grayscale = False 

image_path = IMAGES_FOR_TEST[selected_image]
image_np = load_image_into_numpy_array(image_path)

plt.figure(figsize=(10,10))
plt.imshow(image_np[0])
plt.show()

# image_fliplr = horizontal_flip(image_np[0])
# plt.figure(figsize=(10,10))
# plt.imshow(image_fliplr)
# plt.show()

# image_gray = to_gray(image_np[0])
# plt.figure(figsize=(10,10))
# plt.imshow(image_gray)
# plt.show()


# Step: Doing the inference

Tips
- Print out `result['detection_boxes']` and try to match the box locations to the boxes in the image.  Notice that coordinates are given in normalized form (i.e., in the interval [0, 1]).
- representation of detection boxes
- 
- inspect other output keys present in the result. A full documentation can be seen on the models documentation page (pointing your browser to the model handle printed earlier)
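
To match normalized boxes to positions in the image, they can be scaled to pixels. A minimal sketch, assuming the [ymin, xmin, ymax, xmax] order used by the Object Detection API; `boxes_to_pixels` is a hypothetical helper and the sample box is made up.

```python
import numpy as np

def boxes_to_pixels(boxes, image_height, image_width):
    """Convert normalized [ymin, xmin, ymax, xmax] rows to integer pixel coordinates."""
    scale = np.array([image_height, image_width, image_height, image_width])
    return np.round(np.asarray(boxes) * scale).astype(int)

sample_boxes = np.array([[0.25, 0.10, 0.75, 0.50]])  # hypothetical normalized box
pixel_boxes = boxes_to_pixels(sample_boxes, image_height=400, image_width=600)
print(pixel_boxes)
```

With the notebook's variables, this would be called as `boxes_to_pixels(result['detection_boxes'][0], image_np.shape[1], image_np.shape[2])`.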

ToDos
- retrain models
- transfer learning
- customized models
- exploration of loaded model


In [None]:
# running inference
results = hub_model(image_np)

# different object detection models have additional results
# all of them are explained in the documentation
result = {key:value.numpy() for key,value in results.items()}
print(result.keys())

In [None]:
"""
JYI - Exploration of model
"""

print(hub_model)

In [None]:
"""
JYI - Exploration of results

"""

# print(f"results: {type(results)}")
# #print(f"results: {results['detection_boxes']}")
# print(f"results: {type(results['detection_boxes'])}")

ToDos
- meaning of result['detection_scores']
- meaning of result['detection_keypoints']
- difference between EagerTensor, Tensor, numpy.ndarray?

Tips
- the category_index covers only the classes in the loaded label map, so it may not describe every object in every test image; a different dataset needs its own category index
- tensorflow eager execution computes results immediately, more see [ref 1](https://www.tensorflow.org/guide/eager)

In [None]:
"""
JYI - Inference results exploration
- 
"""

# # detection_scores
# print(f"result['detection_scores']: {result['detection_scores']}")
# print(f"result['detection_scores'] sum: {np.sum(result['detection_scores'])}")
# plt.figure()
# plt.plot(result['detection_scores'][0]) # (100,1)
# plt.show()

# # detection_keypoints
# print(f"result['detection_keypoints']: {result['detection_keypoints']}")
# print(f"result['detection_keypoints shape']: {result['detection_keypoints'].shape}") # (1,100,17,2)
# print(f"result['detection_keypoints range']: {np.max(result['detection_keypoints'])}, {np.min(result['detection_keypoints'])}") # (1,100,17,2)
# plt.figure(figsize=(24,24))
# for row_ind in range(10):
#   for col_ind in range(10):

#     temp_ind = row_ind*10 + col_ind
#     plt.subplot(10,10,temp_ind+1)
    
#     plt.plot(result['detection_keypoints'][0,temp_ind][:,0],
#              result['detection_keypoints'][0,temp_ind][:,1],'+')
# plt.show()

# # temp_ind =1 
# plt.figure()
# plt.plot(result['detection_keypoints'][0,1][:,0],
#          result['detection_keypoints'][0,1][:,1],"+")
# plt.show()

# # detection classes
# print(f"result['detection_classes']: {result['detection_classes']}")
# print(f"result['detection_classes']: {result['detection_classes'].shape}")
# print(f"category_index: {category_index[result['detection_classes'][0,1]]}")
# print(f"category_index: {category_index[38]}")

# # num_detections
# # print(f"result['num_detections']: {result['num_detections']}")

# # detection_keypoint_scores
# print(f"result['detection_keypoint_scores']: {result['detection_keypoint_scores']}")
# print(f"result['detection_keypoint_scores shape']: {result['detection_keypoint_scores'].shape}")
# plt.figure()
# plt.subplot(2,1,1)
# plt.plot(result['detection_keypoint_scores'][0,1])
# print(f"sum result['detection_keypoint_scores'][0,1]: {np.sum(result['detection_keypoint_scores'][0,1])}")
# plt.subplot(2,1,2)
# plt.plot(result['detection_keypoint_scores'][0,:,1])
# print(f"sum result['result['detection_keypoint_scores'][0,:,1]: {np.sum(result['detection_keypoint_scores'][0,:,1])}")
# plt.show()

# # detection_boxes
# print(f"result['detection_boxes']: {result['detection_boxes']}")
# print(f"result['detection_boxes shape']: {result['detection_boxes'].shape}")


Tips
- num_detections: a tf.int tensor with only one value, the number of detections [N].
- detection_boxes: a tf.float32 tensor of shape [N, 4] containing bounding box coordinates in the following order: [ymin, xmin, ymax, xmax]. each object has a box
- detection_classes: a tf.int tensor of shape [N] containing detection class index from the label file. numerical class index of objects
- detection_scores: a tf.float32 tensor of shape [N] containing detection scores. confidence score for classifying the object as a particular class
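
These outputs can be combined into a readable summary. The sketch below assumes the arrays carry a leading batch dimension of 1, as the shape comments in the exploration cell above suggest; `summarize_detections`, `fake_result` and `fake_index` are hypothetical names with hand-made values.

```python
import numpy as np

def summarize_detections(result, category_index, min_score=0.3):
    """List (class name, score) pairs for detections at or above `min_score`."""
    scores = result["detection_scores"][0]
    classes = result["detection_classes"][0].astype(int)
    keep = scores >= min_score
    return [(category_index[int(c)]["name"], float(s))
            for c, s in zip(classes[keep], scores[keep])]

# Hypothetical, hand-made outputs with a leading batch dimension of 1.
fake_result = {
    "detection_scores": np.array([[0.9, 0.5, 0.1]], dtype=np.float32),
    "detection_classes": np.array([[1.0, 3.0, 1.0]], dtype=np.float32),
}
fake_index = {1: {"id": 1, "name": "person"}, 3: {"id": 3, "name": "car"}}
detections = summarize_detections(fake_result, fake_index)
print(detections)
```

Lowering `min_score` lets more, less confident detections through.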

# Step: Visualizing the results

Tips
- we will need the TensorFlow Object Detection API to draw the boxes from the inference step (and the keypoints when available); see [visualization_utils.py](https://github.com/tensorflow/models/blob/master/research/object_detection/utils/visualization_utils.py)
- you can, for example, set `min_score_thresh` to other values (between 0 and 1) to allow more detections in or to filter out more detections.

ToDos
- 

Documentations
- copy
- deepcopy
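
Copying matters here because the visualization function draws onto the array it is given, which appears to be why the cell below works on `image_np.copy()`. The sketch illustrates views versus copies with a small made-up array, and the shallow/deep distinction with a made-up nested dictionary.

```python
import copy

import numpy as np

batch = np.zeros((1, 2, 2, 3), dtype=np.uint8)

view = batch[0]              # indexing returns a view that shares memory with batch
view[0, 0] = 255             # so writing through it also changes batch
independent = batch.copy()   # ndarray.copy() allocates new memory
independent[0, 1, 1] = 7     # batch is unaffected by this write

print(batch[0, 0, 0], batch[0, 1, 1])

# For nested Python containers, copy.copy is shallow and copy.deepcopy is recursive.
nested = {"boxes": [[0.1, 0.2, 0.3, 0.4]]}
shallow, deep = copy.copy(nested), copy.deepcopy(nested)
nested["boxes"][0][0] = 0.9
print(shallow["boxes"][0][0], deep["boxes"][0][0])
```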


Tips
- in viz_utils.visualize_boxes_and_labels_on_image_array, the image input should not contain the batch dimension, and only a single image should be fed. The same holds for detection_keypoints, detection_keypoint_scores, detection_boxes, detection_classes and detection_scores
- viz_utils: from object_detection.utils import visualization_utils as viz_utils

ToDos
- meaning of COCO17_HUMAN_POSE_KEYPOINTS
- what's return of viz_utils.visualize_boxes_and_labels_on_image_array
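
On the meaning of COCO17_HUMAN_POSE_KEYPOINTS: each tuple is an edge between two keypoint indices, so the list describes the skeleton to draw. The sketch below names the indices using the standard COCO keypoint order; `COCO17_KEYPOINT_NAMES` and `describe_edges` are names introduced here for illustration.

```python
# COCO's 17 person keypoints in their standard order (index -> name).
COCO17_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# A few edges copied from COCO17_HUMAN_POSE_KEYPOINTS.
sample_edges = [(0, 1), (5, 7), (7, 9), (11, 13), (13, 15)]

def describe_edges(edges, names):
    """Turn index pairs into readable 'start -> end' strings."""
    return [f"{names[a]} -> {names[b]}" for a, b in edges]

for line in describe_edges(sample_edges, COCO17_KEYPOINT_NAMES):
    print(line)
```

For example, (5, 7) and (7, 9) together trace the left arm from shoulder to wrist.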

In [None]:
label_id_offset = 0
image_np_with_detections = image_np.copy()

# show image without boxes or labels or scores
plt.figure(figsize=(24,32))
plt.imshow(image_np_with_detections[0])
plt.show()

# Use keypoints if available in detections
keypoints, keypoint_scores = None, None
if 'detection_keypoints' in result:
  keypoints = result['detection_keypoints'][0] # (100,17,2)
  keypoint_scores = result['detection_keypoint_scores'][0] # (100,17)

# show image with boxes, labels, and scores
viz_utils.visualize_boxes_and_labels_on_image_array(
      image_np_with_detections[0],
      result['detection_boxes'][0],
      (result['detection_classes'][0] + label_id_offset).astype(int),
      result['detection_scores'][0],
      category_index,
      use_normalized_coordinates=True,
      max_boxes_to_draw=200,
      min_score_thresh=.30,
      agnostic_mode=False,
      keypoints=keypoints,
      keypoint_scores=keypoint_scores,
      keypoint_edges=COCO17_HUMAN_POSE_KEYPOINTS)

plt.figure(figsize=(24,32))
plt.imshow(image_np_with_detections[0])
plt.show()

# Step: Segmentation

Tips
- Among the available object detection models there's Mask R-CNN, and the output of this model allows instance segmentation. To visualize it we will use the same method as before, but adding an additional parameter: `instance_masks=result.get('detection_masks_reframed', None)`

ToDos
- difference between object detection and segmentation


Documentations
- tf.convert_to_tensor
- 

In [None]:
# Handle models with masks:
image_np_with_mask = image_np.copy()

if 'detection_masks' in result:
  # we need to convert np.arrays to tensors
  detection_masks = tf.convert_to_tensor(result['detection_masks'][0])
  detection_boxes = tf.convert_to_tensor(result['detection_boxes'][0])

  # Reframe the bbox mask to the image size.
  detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes,
              image_np.shape[1], image_np.shape[2])
  detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                      tf.uint8)
  result['detection_masks_reframed'] = detection_masks_reframed.numpy()

viz_utils.visualize_boxes_and_labels_on_image_array(
      image_np_with_mask[0],
      result['detection_boxes'][0],
      (result['detection_classes'][0] + label_id_offset).astype(int),
      result['detection_scores'][0],
      category_index,
      use_normalized_coordinates=True,
      max_boxes_to_draw=200,
      min_score_thresh=.30,
      agnostic_mode=False,
      instance_masks=result.get('detection_masks_reframed', None),
      line_thickness=8)

plt.figure(figsize=(24,32))
plt.imshow(image_np_with_mask[0])
plt.show()