<a href="https://colab.research.google.com/github/ricky-kiva/dl-deep-tf-cv-advanced/blob/main/2_l2_object_detection_visualized.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Object Detection with Visualized Output**

Import packages

In [4]:
import time
import tempfile
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

import matplotlib.pyplot as plt

Pick model

- `SSD + MobileNetV2`: small & fast
- `Faster R-CNN + InceptionResNetV2`: high accuracy

In [3]:
# SSD + MobileNetV2
module_handle = "https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1"

# Faster R-CNN + InceptionResNetV2
#module_handle = "https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1"

Load the model

In [5]:
model = hub.load(module_handle)

Check the *model signatures*. In short, a signature is a task the model can perform; more precisely, it is a named collection of input & output tensors.

**Note:**
- The `default` signature of `SSD + MobileNetV2` outputs:
  - Class Names (None, 1)
  - Class Labels (None, 1)
  - Detection Boxes (None, 4)
  - Detection Scores (None, 1)
  - Class Entities (None, 1)
    - **Note:** `None` in a `shape` means the size of that dimension is not fixed (it can vary); it is commonly used for the `batch size`

In [6]:
model.signatures.keys()

KeysView(_SignatureMap({'default': <ConcreteFunction () -> Dict[['detection_class_names', TensorSpec(shape=(None, 1), dtype=tf.string, name=None)], ['detection_class_labels', TensorSpec(shape=(None, 1), dtype=tf.int64, name=None)], ['detection_boxes', TensorSpec(shape=(None, 4), dtype=tf.float32, name=None)], ['detection_scores', TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)], ['detection_class_entities', TensorSpec(shape=(None, 1), dtype=tf.string, name=None)]] at 0x7ED180A57010>}))
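
To make these shapes concrete: each output key holds one row per detection, so row `i` of `detection_boxes` belongs with row `i` of `detection_scores` and `detection_class_entities`. The sketch below uses made-up values in plain Python lists (stand-ins for tensors, not real model output) to filter detections by score and scale a box to pixels. The helpers `filter_detections` and `to_pixel_box` are hypothetical, and the code assumes boxes are ordered `[ymin, xmin, ymax, xmax]` and normalized to `[0, 1]`.

```python
def filter_detections(boxes, scores, entities, min_score=0.5):
    """Keep detections whose score is at least min_score (hypothetical helper)."""
    kept = []
    for box, score, entity in zip(boxes, scores, entities):
        if score >= min_score:
            kept.append({"box": box, "score": score, "entity": entity})
    return kept


def to_pixel_box(box, im_width, im_height):
    """Scale a normalized [ymin, xmin, ymax, xmax] box to (left, right, top, bottom) pixels."""
    ymin, xmin, ymax, xmax = box
    return (xmin * im_width, xmax * im_width, ymin * im_height, ymax * im_height)


# Made-up values for illustration only (assumed layout: one row per detection)
boxes = [[0.25, 0.5, 0.75, 1.0], [0.3, 0.3, 0.9, 0.8], [0.0, 0.0, 0.5, 0.25]]
scores = [0.92, 0.35, 0.61]
entities = ["Person", "Tree", "Car"]

confident = filter_detections(boxes, scores, entities, min_score=0.5)
for det in confident:
    print(det["entity"], det["score"], to_pixel_box(det["box"], 400, 200))
```

The score threshold is a free choice; a higher value gives fewer but more reliable boxes.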

Pick model signature

In [8]:
detector = model.signatures['default']

Check model input for this specific signature (`default`)

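**Note:** The input is a `float32` tensor of shape `(batch, height, width, 3)`: colored (3-channel) images with arbitrary width & height. The batch dimension is declared as `None`, but the TF Hub documentation for these detectors states that batching is not supported, so pass one image at a time (shape `(1, height, width, 3)`).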

In [10]:
detector.inputs

[<tf.Tensor 'hub_input/image_tensor:0' shape=(None, None, None, 3) dtype=float32>]
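
As a rough sketch of what this input shape asks for: an 8-bit RGB image of shape `(height, width, 3)` needs a leading batch axis and a `float32` dtype. Scaling pixel values to `[0, 1]` is an assumption here (the usual convention for float images in TensorFlow). The helper name `to_model_input` is hypothetical, and NumPy stands in for the TensorFlow ops a real pipeline would use.

```python
import numpy as np


def to_model_input(img_uint8):
    """Turn an (H, W, 3) uint8 image into a (1, H, W, 3) float32 array in [0, 1]."""
    if img_uint8.ndim != 3 or img_uint8.shape[-1] != 3:
        raise ValueError("expected an image of shape (height, width, 3)")
    scaled = img_uint8.astype(np.float32) / 255.0  # 0..255 -> 0.0..1.0
    return scaled[np.newaxis, ...]  # add the leading batch axis


# Dummy all-white image, 4 pixels high and 6 pixels wide
dummy = np.full((4, 6, 3), 255, dtype=np.uint8)
batch = to_model_input(dummy)
print(batch.shape, batch.dtype)
```

In TensorFlow itself, `tf.image.convert_image_dtype(img, tf.float32)[tf.newaxis, ...]` performs the same scaling and adds the same axis for a `uint8` image tensor.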

Function: display an image

In [11]:
def display_image(img):
  plt.figure(figsize=(20, 15))
  plt.grid(False)
  plt.imshow(img)

Function: download & resize image

In [12]:
import os
from io import BytesIO
from urllib.request import urlopen

from PIL import Image
from PIL import ImageOps

def download_and_resize_image(url, new_width=256, new_height=256, display=False):
  fd, filename = tempfile.mkstemp(suffix='.jpg') # make a temporary file with a '.jpg' suffix
  os.close(fd) # close the OS-level file handle; only the path is needed

  response = urlopen(url) # opens given url

  img_data = response.read() # reads image fetched from URL
  img_data = BytesIO(img_data) # puts image data to memory buffer

  pil_img = Image.open(img_data) # open image using PIL

  # resize the image to exactly (new_width, new_height), cropping if the aspect ratio differs
  # `Resampling.LANCZOS`: high-quality interpolation filter used during the resize
  # - 'interpolation': estimating pixel values at non-integer coordinates from known pixel values
  # --- 'non-integer coordinates': positions such as (2.3, 4.7) that do not align with the pixel grid
  pil_img = ImageOps.fit(pil_img, (new_width, new_height), Image.Resampling.LANCZOS)
  pil_img_rgb = pil_img.convert('RGB')  # convert to RGB colorspace

  pil_img_rgb.save(filename, format='JPEG', quality=90) # saves image to the temporary file

  print(f"Image saved to {filename}")

  if display:
    display_image(pil_img)

  return filename
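
The "crop if the aspect ratio differs" step in the function above can be pictured with a little arithmetic. The sketch below is a conceptual approximation of a centered crop to a target aspect ratio, not Pillow's actual implementation (its rounding may differ); `center_crop_box` is a hypothetical helper.

```python
def center_crop_box(width, height, target_width, target_height):
    """Return (left, top, right, bottom) of a centered crop matching the target aspect ratio."""
    target_ratio = target_width / target_height
    source_ratio = width / height
    if source_ratio > target_ratio:
        # Source is too wide: keep the full height and trim the sides equally
        new_width = round(height * target_ratio)
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    # Source is too tall (or already matches): keep the full width and trim top and bottom
    new_height = round(width / target_ratio)
    top = (height - new_height) // 2
    return (0, top, width, top + new_height)


# A 400x200 landscape image fitted to a square target loses 100 px on each side
print(center_crop_box(400, 200, 100, 100))
```

After the crop, the remaining region is resized to the target size, which is where the resampling filter comes in.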

Download image for detection

In [None]:
import ssl

# disable HTTPS certificate verification for all `urlopen` calls (insecure; use only for trusted URLs)
ssl._create_default_https_context = ssl._create_unverified_context

img_url = 'https://backpanel.kemlu.go.id/PublishingImages/Foto_Berita/2021/praha%201sept21.jpg'

dl_img_path = download_and_resize_image(img_url, 1996, 2000, True)
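
Patching `ssl._create_default_https_context` turns off certificate checks for every HTTPS request in the session. A narrower alternative, sketched below under the assumption that only this one download needs it, is to build an unverified context with the public `ssl` API and pass it to `urlopen` through its `context` argument. The helper name `make_unverified_context` is hypothetical, and `download_and_resize_image` would need an extra parameter to accept such a context.

```python
import ssl


def make_unverified_context():
    """Build an SSL context that skips certificate checks (insecure; trusted URLs only)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before verify_mode is relaxed
    ctx.verify_mode = ssl.CERT_NONE  # do not validate the server certificate
    return ctx


ctx = make_unverified_context()
# Usage (not executed here): urlopen(img_url, context=ctx)
```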

Function: draw a bounding box on an image

In [None]:
from PIL import ImageDraw

def draw_bounding_box_on_image(img, ymin, xmin, ymax, xmax, color, font, thickness=4, display_str_list=()):
  draw = ImageDraw.Draw(img) # instantiate ImageDraw object, enable drawing on image
  im_width, im_height = img.size # get image width & height

  # scale bounding box coordinates to height & width of the image
  (left, right, top, bottom) = (xmin * im_width,
                                xmax * im_width,
                                ymin * im_height,
                                ymax * im_height)

  # draw the detection box
  draw.line([(left, top), (left, bottom), (right, bottom), (right, top), (left, top)],
            width=thickness, fill=color)