### CSE 5320 Neural Networks Project 2
#### Report (with Code)
Josias Moukpe\
11/18/2022

#### Introduction
Object detection extends image classification: a neural network both identifies the objects in an image and localizes each one with a bounding box. In other words, object detection is the detection and localization of objects in an image that belong to a predefined set of classes. Detection, recognition, and localization tasks are widely applicable in real-world scenarios such as autonomous driving, robotics, and product quality assurance, which makes object detection (also referred to as object recognition) an important subdomain of computer vision [2]. Object detection is called real-time when objects can be recognized within milliseconds, allowing timely reactions based on the detections. This project aims to build a real-time object detector that finds and tracks objects of predefined classes in images or video feeds.
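
To make the "milliseconds" criterion concrete, here is a minimal sketch of how per-image latency could be measured. The names `mean_latency_ms` and `frames_per_second`, and the `predict` callable, are hypothetical helpers introduced for illustration; they are not part of this project's code or of any library.

```python
import time
from typing import Any, Callable


def mean_latency_ms(predict: Callable[[Any], Any], sample: Any,
                    warmup: int = 3, runs: int = 20) -> float:
    """Average wall-clock time of predict(sample) in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew the mean.
    for _ in range(warmup):
        predict(sample)
    start = time.perf_counter()
    for _ in range(runs):
        predict(sample)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / runs


def frames_per_second(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms if latency_ms > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in for a detector: any callable that takes one input works here.
    dummy_detector = lambda image: sum(image)
    latency = mean_latency_ms(dummy_detector, list(range(1000)))
    print(f"{latency:.4f} ms per call, {frames_per_second(latency):.1f} FPS")
```

With a trained model, `predict` would wrap the model's inference call on a single preprocessed image, which is an assumption about how the detector will eventually be invoked.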

#### Problem
This objective combines object classification and localization (a bounding box regression task). To process images and capture their features, we will use convolutional neural networks, which capture local pixel structure. We will also comment on how our model performs in real-time object detection. To train our model, we will use the MS COCO dataset [1]. This dataset contains more than 200,000 labeled color images with 1.5 million object instances across 80 object categories. Image sizes vary, with many images at 640 x 480 pixels, and each image comes with various annotations such as key points, captions, segmentations, and bounding boxes (the ones that interest us). The model will take an image or a batch of images and output the classes and bounding boxes of all objects detected in each image.
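
Because localization is a box regression task, the overlap between a predicted box and a ground-truth box is commonly summarized with Intersection over Union (IoU). The sketch below assumes boxes in (min x, min y, max x, max y) order, the same corner layout the dataset labels use later in this notebook. The function name `box_iou` is a hypothetical helper, and the report does not state that IoU is the metric this project uses.

```python
from typing import Sequence


def box_iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle, clamped at zero
    # so that disjoint boxes contribute no intersection area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - inter

    # Degenerate (zero-area) inputs yield an IoU of 0 instead of dividing by zero.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(box_iou(predicted, ground_truth))
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap.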

#### Methodology
To prepare the data, we .

#### Benchmarking
To measure training performance, we 

#### Conclusion



References\
[1] https://cocodataset.org/#home \
[2] https://www.v7labs.com/blog/yolo-object-detection

#### Obtaining the COCO 2017 Dataset

In [1]:
# list the contents of the current 
# directory on my remote server
!ls
%cd "/notebooks/data"


README.md  data  dev.ipynb
/notebooks/data


In [1]:
# downloading coco (2017) dataset
!wget http://images.cocodataset.org/zips/train2017.zip
!wget http://images.cocodataset.org/zips/val2017.zip
!wget http://images.cocodataset.org/zips/test2017.zip
!wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
!wget http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip
!wget http://images.cocodataset.org/annotations/image_info_test2017.zip

# unzip the files
!unzip train2017.zip
!unzip val2017.zip
!unzip test2017.zip
!unzip annotations_trainval2017.zip
!unzip stuff_annotations_trainval2017.zip
!unzip image_info_test2017.zip

# remove the zip files
!rm train2017.zip
!rm val2017.zip
!rm test2017.zip
!rm annotations_trainval2017.zip
!rm stuff_annotations_trainval2017.zip
!rm image_info_test2017.zip


--2022-11-19 03:20:27--  http://images.cocodataset.org/zips/train2017.zip
Resolving images.cocodataset.org (images.cocodataset.org)... 54.231.133.57, 3.5.19.134, 52.217.198.161, ...
Connecting to images.cocodataset.org (images.cocodataset.org)|54.231.133.57|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 19336861798 (18G) [application/zip]
Saving to: ‘train2017.zip’


2022-11-19 04:12:37 (5.89 MB/s) - ‘train2017.zip’ saved [19336861798/19336861798]

--2022-11-19 04:12:38--  http://images.cocodataset.org/zips/val2017.zip
Resolving images.cocodataset.org (images.cocodataset.org)... 52.217.132.25, 52.217.229.137, 52.216.233.75, ...
Connecting to images.cocodataset.org (images.cocodataset.org)|52.217.132.25|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 815585330 (778M) [application/zip]
Saving to: ‘val2017.zip’


2022-11-19 04:14:51 (5.85 MB/s) - ‘val2017.zip’ saved [815585330/815585330]

--2022-11-19 04:14:52--  http://images.cocodataset

In [None]:
# install gluoncv to help with the dataset
%pip install gluoncv

In [None]:
# imports 
from gluoncv import data, utils
from matplotlib import pyplot as plt
import numpy as np
import cv2  # OpenCV's Python module is named cv2 (installed via the opencv-python package)
from random import randint
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import callbacks
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow.keras import losses
from tensorflow.keras import metrics
from tensorflow.python.client import device_lib
import tensorboard


print(tf.__version__)

In [None]:
# check if gpu is available
numGPUs = len(tf.config.list_physical_devices('GPU'))

print('Num GPUs Available: ', numGPUs)

if numGPUs > 0:
    print(tf.test.gpu_device_name())
    for device in device_lib.list_local_devices():
        print(device.physical_device_desc)

In [None]:
# loading the coco dataset
train_ds = data.COCODetection(splits=['instances_train2017'])
val_ds = data.COCODetection(splits=['instances_val2017'])
# test_ds = data.COCODetection(splits=['test2017']) # TODO: figure this out 

# print the length of the dataset
print('Length of training dataset:', len(train_ds))
print('Length of validation dataset:', len(val_ds))

In [None]:
# plot a random image in the dataset
# randint is inclusive on both ends, so the upper bound must be len - 1
image, label = train_ds[randint(0, len(train_ds) - 1)]
print('Image size (height, width, RGB):', image.shape)
print('Label:', label)
# plot the image
bounding_boxes = label[:, :4]
class_ids = label[:, 4:5]
print('number of objects in the image:', bounding_boxes.shape[0])
print('bounding boxes (# boxes, min x, min y, max x, max y): \n', bounding_boxes)
print('class ids (# boxes, class id): \n', class_ids)
ax = utils.viz.plot_bbox(image.asnumpy(), bounding_boxes, scores=None, labels=class_ids, class_names=train_ds.classes)
plt.show()

2.9.1
