<a href="https://colab.research.google.com/github/rahiakela/pytorch-computer-vision-cookbook/blob/main/5-multi-object-detection/multi_object_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Multi-Object Detection

Object detection is the process of locating and classifying existing objects in an image. Identified objects are shown with bounding boxes in the image. There are two methods for general object detection: region proposal-based and regression/classification-based. 

In this notebook, we will use a regression/classification-based method called YOLO. We will learn how to implement the YOLO-v3 algorithm, then train and
deploy it for object detection using PyTorch.


## Setup

In [1]:
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms.functional as TF
from torchvision.transforms.functional import to_pil_image
from torch import optim
from torch.optim.lr_scheduler import ReduceLROnPlateau


import torch
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(torch.__version__)

from PIL import Image, ImageDraw, ImageFont

import copy
import os
import random
import numpy as np
import matplotlib.pylab as plt

%matplotlib inline

1.7.0+cu101


## Creating datasets

We will need to download the COCO dataset.

In [None]:
%%shell

# Download the following GitHub repository
git clone https://github.com/pjreddie/darknet

# Create a folder named data
mkdir data

# copy the get_coco_dataset.sh file
cp darknet/scripts/get_coco_dataset.sh data

# execute the get_coco_dataset.sh file from inside data,
# so that the dataset is created under data/coco
chmod 755 data/get_coco_dataset.sh
(cd data && ./get_coco_dataset.sh)

# Create a folder named config
mkdir data/config
# copy the yolov3.cfg file
cp darknet/cfg/yolov3.cfg data/config/

# Finally, download the coco.names file and put it in the data folder
# (use the raw URL; the github.com/.../blob/... URL returns an HTML page)
wget https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names
cp coco.names data/
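
After the download, it can be worth checking that `coco.names` really is a plain-text list with one class name per line, and not an HTML page saved by mistake. The helper below is a minimal sketch: `looks_like_names_file` is a hypothetical name, and the expected count of 80 is an assumption based on the 80 COCO classes used later in this notebook.

```python
import os


def looks_like_names_file(text, expected_classes=80):
    """Heuristic check (hypothetical helper): plain text, one class name per line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # An HTML page saved by mistake typically starts with a doctype or a tag.
    has_markup = any(line.startswith("<") for line in lines[:5])
    return (not has_markup) and len(lines) == expected_classes


names_path = "./data/coco.names"
if os.path.exists(names_path):
    with open(names_path, "r") as fp:
        print("coco.names looks valid:", looks_like_names_file(fp.read()))
```

If the check fails, inspecting the first few lines of the file usually shows whether markup was downloaded instead of the raw text.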

### Creating a custom COCO dataset

Now that we've downloaded the COCO dataset, we will create training and validation datasets and dataloaders using PyTorch's Dataset and DataLoader classes.

We will define the CocoDataset class and show some sample images from
the training and validation datasets.

In [3]:
class CocoDataset(Dataset):

  def __init__(self, files_path, transform=None, trans_params=None):
    # get list of images
    with open(files_path, "r") as file:
      self.img_path = file.readlines()
    # get list of labels: same path as the image, under "labels" and with a .txt extension
    self.label_path = [path.replace("images", "labels").replace(".png", ".txt").replace(".jpg", ".txt") for path in self.img_path]
    self.trans_params = trans_params 
    self.transform = transform 

  def __len__(self):
    return len(self.img_path)

  def __getitem__(self, index):
    img_path = self.img_path[index % len(self.img_path)].rstrip()
    img = Image.open(img_path).convert("RGB")
    label_path = self.label_path[index % len(self.img_path)].rstrip()

    labels = None
    if os.path.exists(label_path):
      # each row is [class_id, xc, yc, w, h]; reshape keeps a single-object file 2-D
      labels = np.loadtxt(label_path).reshape(-1, 5)
    if self.transform:
      img, labels = self.transform(img, labels, self.trans_params)

    return img, labels, img_path
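
The two data-handling steps in the class can be tried in isolation. The sketch below uses a made-up image path and made-up label rows (both are assumptions for illustration, not files from the dataset) to show how a label path is derived from an image path, and why a reshape to `(-1, 5)` is useful: `np.loadtxt` returns a 1-D array when a file has a single row.

```python
import io
import numpy as np

# Hypothetical image path, used only to illustrate the path mapping.
img_path = "./data/coco/images/train2014/example.jpg"
label_path = (
    img_path.replace("images", "labels")
    .replace(".png", ".txt")
    .replace(".jpg", ".txt")
)
print(label_path)  # ./data/coco/labels/train2014/example.txt

# Hypothetical label contents: one "class_id xc yc w h" row per object.
one_object = "16 0.5 0.5 0.2 0.4\n"
two_objects = "16 0.5 0.5 0.2 0.4\n0 0.25 0.3 0.1 0.1\n"

raw_one = np.loadtxt(io.StringIO(one_object))
raw_two = np.loadtxt(io.StringIO(two_objects))
print(raw_one.shape, raw_two.shape)  # (5,) (2, 5)

# Reshaping gives a consistent (num_objects, 5) layout in both cases.
labels_one = raw_one.reshape(-1, 5)
labels_two = raw_two.reshape(-1, 5)
print(labels_one.shape, labels_two.shape)  # (1, 5) (2, 5)
```

With a consistent 2-D layout, code that iterates over rows works the same way for images with one object and images with many.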

Next, we will create an object of the CocoDataset class for the training data:

In [None]:
root_data = "./data/coco"
train_file_path = os.path.join(root_data, "trainvalno5k.txt")
coco_train = CocoDataset(train_file_path)
print(len(coco_train))

In [None]:
# Get a sample item from coco_train:
img, labels, img_path = coco_train[1] 
print("image size:", img.size, type(img))
print("labels shape:", labels.shape, type(labels))
print("labels \n", labels)

Similarly, we will create an object of the CocoDataset class for the validation data:

In [None]:
val_file_path = os.path.join(root_data, "5k.txt")
coco_val = CocoDataset(val_file_path, transform=None, trans_params=None)
print(len(coco_val))

In [None]:
# Get a sample item from coco_val:
img, labels, img_path = coco_val[7] 
print("image size:", img.size, type(img))
print("labels shape:", labels.shape, type(labels))
print("labels \n", labels)

Let's display a sample image from the coco_train and coco_val datasets.

In [None]:
# Get a list of COCO object names
coco_names_path = "./data/coco.names"
with open(coco_names_path, "r") as fp:
  coco_names = fp.read().split("\n")[:-1]
print("number of classes:", len(coco_names))
print(coco_names)

In [4]:
# Define a rescale_bbox helper function to rescale normalized bounding boxes to the original image size
def rescale_bbox(bb, W, H):
  x, y, w, h = bb
  return [x * W, y * H, w * W, h * H]
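
As a worked example, the sketch below rescales one normalized box and then converts it from center format to corner format, which is the form a rectangle-drawing call needs. The image size and box values are made up for illustration, and `center_to_corners` is a hypothetical helper that is not part of the notebook.

```python
def rescale_bbox(bb, W, H):
  # Same logic as the helper above, repeated so this example runs on its own.
  x, y, w, h = bb
  return [x * W, y * H, w * W, h * H]


def center_to_corners(xc, yc, w, h):
  # Hypothetical helper: (center x, center y, width, height) -> (x0, y0, x1, y1)
  return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


W, H = 640, 480              # assumed image size in pixels
bb = [0.5, 0.5, 0.25, 0.5]   # normalized [xc, yc, w, h]

pixel_bb = rescale_bbox(bb, W, H)
print(pixel_bb)              # [320.0, 240.0, 160.0, 240.0]

corners = center_to_corners(*pixel_bb)
print(corners)               # (240.0, 120.0, 400.0, 360.0)
```

The x values scale with the image width and the y values with the image height, so the same normalized box maps to different pixel boxes on images of different sizes.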

In [None]:
# Define the show_img_bbox helper function to show an image with object bounding boxes
COLORS = np.random.randint(0, 255, size=(80, 3),dtype="uint8")
# If the font passed to ImageFont.truetype below is not available on your system,
# you may use a more common font instead, for example:
# fnt = ImageFont.truetype('arial.ttf', 16)
fnt = ImageFont.truetype('Pillow/Tests/fonts/FreeMono.ttf', 16)

def show_img_bbox(img, targets):
  if torch.is_tensor(img):
    img = to_pil_image(img)
  if torch.is_tensor(targets):
    targets = targets.numpy()[:, 1:]

  W, H = img.size
  draw = ImageDraw.Draw(img)

  for target in targets:
    id_ = int(target[0])
    bbox = target[1:]
    bbox = rescale_bbox(bbox, W, H)
    xc, yc, w, h = bbox

    color = [int(c) for c in COLORS[id_]]
    name = coco_names[id_]

    draw.rectangle(((xc - w / 2, yc - h / 2), (xc + w / 2, yc + h / 2)), outline=tuple(color), width=3)
    draw.text((xc - w / 2, yc - h / 2), name, font=fnt, fill=(255, 255, 255, 0))
  plt.imshow(np.array(img))

In [None]:
# Call the show_img_bbox helper function to show a sample image from coco_train
np.random.seed(2)
rnd_ind=np.random.randint(len(coco_train))
img, labels, img_path = coco_train[rnd_ind] 
print(img.size, labels.shape)

plt.rcParams['figure.figsize'] = (20, 10)
show_img_bbox(img, labels)

In [None]:
# Call the show_img_bbox helper function to show a sample image from coco_val
np.random.seed(0)
rnd_ind=np.random.randint(len(coco_val))
img, labels, img_path = coco_val[rnd_ind] 
print(img.size, labels.shape)

plt.rcParams['figure.figsize'] = (20, 10)
show_img_bbox(img, labels)

### Transforming data