<a href="https://colab.research.google.com/github/wisdomscode/AI-Lab-Deep-Learning-PyTorch/blob/main/AI_Lab_Project_3_2_Traffic_Monitoring_in_Bangladesh.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Object Detection


Object detection is a computer vision task that involves identifying and locating objects in images or video.

Challenges of object detection
1. Partially visible or incomplete objects
2. Overlapping objects



**Object Classification vs Object Detection**

**Object Classification**

Labels an entire image with the single class of its primary object. There is one output per image, and the prediction of that single label is either right or wrong.

**Object Detection**

Identifies, locates, and labels multiple objects within an image, so there can be multiple outputs per image.
The task is more complex because a prediction must account for several aspects:
 * Presence of objects
 * Class of each object
 * Location of each object in a bounding box
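
To make the difference in outputs concrete, here is a minimal sketch. The `Detection` class, its field names, and all the values are hypothetical and chosen only for illustration; they are not the output format of any particular model.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Detection:
    """One detected object (hypothetical structure for illustration)."""
    label: str                               # class of the object
    box: Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax) in pixels
    score: float                             # confidence that an object is present


# Classification: a single label for the whole image.
classification_output = "bus"

# Detection: a variable-length list, one entry per object found.
detection_output = [
    Detection("bus", (34.0, 50.0, 210.0, 180.0), 0.91),
    Detection("car", (220.0, 95.0, 300.0, 160.0), 0.84),
    Detection("car", (180.0, 100.0, 260.0, 170.0), 0.62),
]

labels = [detection.label for detection in detection_output]
print("Classification:", classification_output)
print("Detection labels:", labels)
```

Each entry carries all three aspects listed above: a confidence for the presence of the object, its class, and its bounding box.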


Object Detection Applications
1. Wildlife monitoring
2. Medical Imaging
3. Traffic data used in self-driving cars


**Object Detection Components**
1. Model Architecture
 * Focus on YOLO (You Only Look Once)
2. Pretrained models
 * Fixed architecture
 * Pretrained weights for common objects
3. Training Custom models
 * For custom classes
 * Need labeled data
 * Define a loss function for the multiple parts of object detection

### 3.2 Traffic Data as Images and Video

This lesson covers working with image and video data for object detection. We'll explore the specific traffic dataset for this project, focusing on how video data can be converted to images, how bounding boxes mark the objects in an image, and how XML annotations represent those bounding boxes.

**Objectives:**

* Load and organize the project dataset, separating images and their corresponding XML annotations.
* Extract frames from a video at regular intervals.
* Parse XML annotations.
* Visualize bounding boxes on image data.

### Setting Up

In [None]:
import sys
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

import cv2
import matplotlib.pyplot as plt
import pytubefix
from pytubefix import YouTube
from IPython.display import Video
import torch
import torchvision
from torchvision import transforms
from torchvision.io import read_image
from torchvision.transforms.functional import to_pil_image
from torchvision.utils import draw_bounding_boxes, make_grid

In [None]:
print("Platform:", sys.platform)
print("Python version:", sys.version)
print("---")
print("CV2 version : ", cv2.__version__)
print("torch version : ", torch.__version__)
print("torchvision version : ", torchvision.__version__)

 We are ready to start looking at the data. 🏎️💨

### Exploring Our Data

In this project, we'll work with two datasets. Let's start with the [Dhaka AI dataset](https://www.kaggle.com/datasets/rifat963/dhakaai-dhaka-based-traffic-detection-dataset), which contains images of vehicles in urban traffic scenes from Dhaka, Bangladesh. This dataset is particularly interesting for computer vision as it captures the unique characteristics of Dhaka's busy streets, including a diverse mix of vehicle types and dense traffic conditions.

We'll use the dataset for object detection, which is a more complex task than image classification. Object detection identifies specific objects within an image (e.g., cars, buses, motorcycles), determines the precise location of these objects, and draws a bounding box around each detected object.

This dataset will be used in a later lesson to create a custom model. For now, we'll begin by exploring the dataset.

**Task 3.2.1:** Create a variable for the train directory using `pathlib` syntax.

In [None]:
dhaka_image_dir = Path("data_images", "train")

print("Data directory:", dhaka_image_dir)

#output
Data directory: data_images/train


Let's examine some of the contents of the train directory. You'll see two types of files:

1. `.xml` files: These contain the annotations for the images.
2. `.jpg` files: These are the actual image files.

Each image typically has a corresponding XML file.

In [None]:
dhaka_files = list(dhaka_image_dir.iterdir())
dhaka_files[-5:]

# Output
[PosixPath('data_images/train/Dipto_442.xml'),
 PosixPath('data_images/train/Numan_(56).jpg'),
 PosixPath('data_images/train/Navid_323.xml'),
 PosixPath('data_images/train/Navid_97.jpg'),
 PosixPath('data_images/train/Navid_181.jpg')]

Even though we only see one type of image file, it turns out that the image files can have many different possible extensions. Let's count the file extensions by type and print the results.

In [None]:
file_extension_counts = Counter(Path(file).suffix for file in dhaka_files)

for extension, count in file_extension_counts.items():
    print(f"Files with extension {extension}: {count}")

# Output
Files with extension .xml: 3003
Files with extension .jpg: 2844
Files with extension .JPG: 143
Files with extension .png: 12
Files with extension .jpeg: 2
Files with extension .PNG: 2
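
The counts above treat `.jpg` and `.JPG` as different extensions. As a small sketch, we can fold the cases together and check whether the number of images matches the number of annotations. The counts are copied from the output above; the variable names are only illustrative.

```python
from collections import Counter

# Counts copied from the output above.
extension_counts = Counter(
    {".xml": 3003, ".jpg": 2844, ".JPG": 143, ".png": 12, ".jpeg": 2, ".PNG": 2}
)

# Merge extensions that differ only by case.
normalized_counts = Counter()
for extension, count in extension_counts.items():
    normalized_counts[extension.lower()] += count

# Everything that is not an annotation file is an image here.
image_total = sum(
    count for extension, count in normalized_counts.items() if extension != ".xml"
)

for extension, count in sorted(normalized_counts.items()):
    print(f"Files with extension {extension}: {count}")
print("Total images:", image_total)
```

The image extensions add up to 3003, the same as the number of `.xml` files, which is consistent with each image having one annotation file.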

### Separating images and bounding boxes data

Bounding boxes are rectangles around a detected object. All bounding box information is contained in the `.xml` files. The images have several different extensions. It makes sense to separate the different file types into different folders. We'll want to put all `.xml` files in an annotations folder and the various image types in an images folder.

**Task 3.2.2:** Create variables for the images and annotations directories using `pathlib` syntax.

In [None]:
images_dir = dhaka_image_dir / "images"
annotations_dir = dhaka_image_dir / "annotations"

images_dir.mkdir(exist_ok=True)
annotations_dir.mkdir(exist_ok=True)

**Task 3.2.3:** Move files to the appropriate directory based on file extensions.

In [None]:
for file in dhaka_files:
    suffix = file.suffix.lower()
    if suffix in (".jpg", ".jpeg", ".png"):
        target_dir = images_dir
    elif suffix == ".xml":
        target_dir = annotations_dir
    else:
        # Skip anything that is neither an image nor an annotation
        continue
    file.rename(target_dir / file.name)

Let's confirm that all the files were moved by making sure there is an equal number of images and annotations.

In [None]:
images_files = list(images_dir.iterdir())
annotations_files = list(annotations_dir.iterdir())

assert len(images_files) == len(annotations_files)
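
Equal counts do not guarantee that every image has its own annotation. A stricter check, sketched below, compares file names without their extensions. The helper `find_unpaired` and the demo files are hypothetical; the sketch assumes that an image and its annotation share the same stem (for example `Navid_97.jpg` and `Navid_97.xml`), and it runs in a temporary folder so it doesn't touch the real data.

```python
import tempfile
from pathlib import Path


def find_unpaired(images_dir, annotations_dir):
    """Return (images without XML, XML without image) as sorted lists of stems."""
    image_stems = {path.stem for path in Path(images_dir).iterdir()}
    annotation_stems = {path.stem for path in Path(annotations_dir).iterdir()}
    return (
        sorted(image_stems - annotation_stems),
        sorted(annotation_stems - image_stems),
    )


# Demo on throwaway files: "b" has no annotation, "c" has no image.
with tempfile.TemporaryDirectory() as tmp:
    demo_images = Path(tmp, "images")
    demo_annotations = Path(tmp, "annotations")
    demo_images.mkdir()
    demo_annotations.mkdir()

    for name in ("a.jpg", "b.PNG"):
        (demo_images / name).touch()
    for name in ("a.xml", "c.xml"):
        (demo_annotations / name).touch()

    images_without_xml, xml_without_image = find_unpaired(
        demo_images, demo_annotations
    )

print("Images without annotation:", images_without_xml)
print("Annotations without image:", xml_without_image)
```

Calling `find_unpaired(images_dir, annotations_dir)` on the project folders should return two empty lists if every file is paired.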