<a href="https://colab.research.google.com/github/mahdiislam79/Computer_Vision_practice/blob/main/Bounding_box_prediction_using_tensoflow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook is a practice run of the programming assignment on predicting bounding boxes. It uses the [`Caltech Birds - 2010`](https://www.vision.caltech.edu/datasets/cub_200_2011/) dataset.

# Setting up data location

A copy of the data is provided in a Google Drive folder named [TF3 C3 W1 Data](https://drive.google.com/drive/folders/1xgqUw9uWzL5Kh88iPdX1TBQgnkc-wVKd). Create a shortcut to that folder in your own Drive so that the notebook can read the data from there.

# Mounting the drive

In [2]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


# Imports

In [3]:
# If you get a checksum error with the dataset, you'll need this
!pip install tfds-nightly==4.0.1.dev202010100107

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tfds-nightly==4.0.1.dev202010100107
  Downloading tfds_nightly-4.0.1.dev202010100107-py3-none-any.whl (3.5 MB)
Collecting dill
  Downloading dill-0.3.6-py3-none-any.whl (110 kB)
Installing collected packages: dill, tfds-nightly
Successfully installed dill-0.3.6 tfds-nightly-4.0.1.dev202010100107


In [8]:
import os,re,time,json
import PIL.Image,PIL.ImageFont,PIL.ImageDraw
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import cv2

In [7]:
data_dir =  '/content/drive/MyDrive/caltech_birds2010'

# Visualization tools

## 1. Bounding Box Utilities

These functions draw bounding boxes on an image:

- `draw_bounding_box_on_image`: Draws a single bounding box on an image.
- `draw_bounding_boxes_on_image`: Draws multiple bounding boxes on an image.
- `draw_bounding_boxes_on_image_array`: Draws multiple bounding boxes on an image given as a numpy array and returns that image.

In [9]:
def draw_bounding_box_on_image(image, ymin, xmin, ymax, xmax, color=(255,0,0), thickness=5):

  """
    Adds a bounding box to an image.
    Bounding box coordinates must be given in absolute (pixel) coordinates.
    
    Args:
      image: a numpy array of shape (height, width, channels).
      ymin: ymin of bounding box.
      xmin: xmin of bounding box.
      ymax: ymax of bounding box.
      xmax: xmax of bounding box.
      color: color to draw bounding box. Default is red.
      thickness: line thickness. Default value is 5.
  """

  # cv2.rectangle takes the top-left and bottom-right corners as two (x, y) points.
  cv2.rectangle(image, (int(xmin), int(ymin)), (int(xmax), int(ymax)), color, thickness)

def draw_bounding_boxes_on_image(image, boxes, color=(), thickness=5):

  """
    Draws bounding boxes on image.
    
    Args:
      image: a numpy array of shape (height, width, channels).
      boxes: a 2 dimensional numpy array of [N, 4]: (xmin, ymin, xmax, ymax).
             The coordinates are absolute (pixel) values.
      color: a sequence of colors, one per box. A box without a matching
             entry is drawn in red.
      thickness: line thickness. Default value is 5.
                           
    Raises:
      ValueError: if boxes is not a [N, 4] array
  """

  boxes_shape = boxes.shape
  if not boxes_shape:
    return 
  if len(boxes_shape) != 2 or boxes_shape[1] != 4:
    raise ValueError('Input must be of size [N, 4]')
  for i in range(boxes_shape[0]):
    # Fall back to red when no color is supplied for this box.
    box_color = color[i] if i < len(color) else (255, 0, 0)
    # Columns are (xmin, ymin, xmax, ymax); the helper expects (ymin, xmin, ymax, xmax).
    draw_bounding_box_on_image(image, boxes[i, 1], boxes[i, 0], boxes[i, 3], 
                               boxes[i, 2], box_color, thickness)
    
def draw_bounding_boxes_on_image_array(image, boxes, color=(), thickness=5):
  """
    Draws bounding boxes on image (numpy array) and returns the image.
    
    Args:
      image: a numpy array object.
      boxes: a 2 dimensional numpy array of [N, 4]: (xmin, ymin, xmax, ymax).
             The coordinates are absolute (pixel) values.
      color: a sequence of colors, one per box.
      thickness: line thickness. Default value is 5.
    
    Raises:
      ValueError: if boxes is not a [N, 4] array
  """

  draw_bounding_boxes_on_image(image, boxes, color, thickness)

  return image
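
The drawing helpers above work in pixel coordinates, while model outputs are often normalized to the range [0, 1]. The following is a minimal sketch of that conversion, assuming boxes are ordered `(xmin, ymin, xmax, ymax)`. The helper name `to_pixel_box` is hypothetical and is not part of the notebook.

```python
def to_pixel_box(box, image_width, image_height):
    """Scale a normalized (xmin, ymin, xmax, ymax) box to integer pixel coordinates.

    Hypothetical helper for illustration; x values scale with the image width
    and y values scale with the image height.
    """
    xmin, ymin, xmax, ymax = box
    return (
        int(round(xmin * image_width)),
        int(round(ymin * image_height)),
        int(round(xmax * image_width)),
        int(round(ymax * image_height)),
    )


# A box covering the central half of a 224x224 image.
pixel_box = to_pixel_box((0.25, 0.25, 0.75, 0.75), image_width=224, image_height=224)
print(pixel_box)  # (56, 56, 168, 168)
```

Rounding to integers matters here because the drawing call casts each coordinate with `int`, which truncates rather than rounds.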

## 2. Data and Predictions Utilities

The helper functions below visualize the data and the model's predictions.

- `display_digits_with_boxes`: Displays a row of images with the predicted and ground-truth bounding boxes drawn on each one, along with the IoU score for each image.
- `plot_metrics`: Plots a given metric (like loss) for training and validation as it changes over the training epochs.

In [10]:
# Matplotlib config 
plt.rc('image', cmap='gray')
plt.rc('grid', linewidth=0)
plt.rc('xtick', top=False, bottom=False, labelsize='large')
plt.rc('ytick', left=False, right=False, labelsize='large')
plt.rc('axes', facecolor='F8F8F8', titlesize='large', edgecolor='white')
plt.rc('text', color='a8151a')
plt.rc('figure', facecolor='F0F0F0')
# Matplotlib fonts
MATPLOTLIB_FONT_DIR = os.path.join(os.path.dirname(plt.__file__), "mpl-data/fonts/ttf")

In [12]:
# utility to display a row of images with their predicted and true boxes
def display_digits_with_boxes(images, pred_bboxes, bboxes, iou, title, bboxes_normalized=False):

  n = len(images)

  fig = plt.figure(figsize=(20,4))
  plt.title(title)
  plt.yticks([])
  plt.xticks([])

  for i in range(n):
    # The grid is one row of 10 slots, so at most 10 images can be shown.
    ax = fig.add_subplot(1, 10, i+1)
    bboxes_to_plot = []
    if (len(pred_bboxes) > i):
      # Predicted boxes are normalized; scale them to pixel coordinates.
      bbox = pred_bboxes[i]
      bbox = [bbox[0] * images[i].shape[1], bbox[1] * images[i].shape[0], 
              bbox[2] * images[i].shape[1], bbox[3] * images[i].shape[0]]
      bboxes_to_plot.append(bbox)

    if (len(bboxes) > i):
      bbox = bboxes[i]
      if bboxes_normalized:
        bbox = [bbox[0] * images[i].shape[1], bbox[1] * images[i].shape[0], 
                bbox[2] * images[i].shape[1], bbox[3] * images[i].shape[0]]
      bboxes_to_plot.append(bbox)

    # The first box is drawn in red and the second in green.
    img_to_draw = draw_bounding_boxes_on_image_array(image=images[i], boxes=np.asarray(bboxes_to_plot), color=[(255,0,0), (0, 255, 0)])
    plt.xticks([])
    plt.yticks([])
    
    plt.imshow(img_to_draw)

    if len(iou) > i :
      color = "black"
      # iou_threshold is read from the global scope and must be defined before this function is called.
      if (iou[i][0] < iou_threshold):
        color = "red"
      ax.text(0.2, -0.3, "iou: %s" %(iou[i][0]), color=color, transform=ax.transAxes)
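
The display utility above labels each image with an IoU (intersection over union) score and colors the label red when the score falls below `iou_threshold`. The notebook's own IoU computation is not shown in this section, so the following is only a minimal sketch of the standard definition, assuming boxes are ordered `(xmin, ymin, xmax, ymax)`. The function name `intersection_over_union` and the threshold value `0.5` are assumptions made for illustration.

```python
def intersection_over_union(box_a, box_b):
    """Return the IoU of two (xmin, ymin, xmax, ymax) boxes as a float in [0, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamp at zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - intersection

    # Guard against division by zero for degenerate (zero-area) boxes.
    return intersection / union if union > 0 else 0.0


iou_threshold = 0.5  # assumed value, for illustration only

score = intersection_over_union((0, 0, 2, 2), (1, 1, 3, 3))
label_color = "red" if score < iou_threshold else "black"
print(round(score, 4), label_color)  # 0.1429 red
```

In the example, the two 2x2 boxes overlap in a 1x1 square, so the IoU is 1 / (4 + 4 - 1) = 1/7, which is below the assumed threshold.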

In [13]:
# utility to display training and validation curves
def plot_metrics(metric_name, title, ylim=5):
    # Reads the global `history` object, which must hold both metric_name
    # and 'val_' + metric_name in its `history` dict.
    plt.title(title)
    plt.ylim(0,ylim)
    plt.plot(history.history[metric_name],color='blue',label=metric_name)
    plt.plot(history.history['val_' + metric_name],color='green',label='val_' + metric_name)