# Object Detection Challenge
Below is a project I completed as part of <b>Data-Driven Science's Object Detection Challenge</b>.

## Step 1: Setup

Below are some key set-up steps from <b>Data-Driven Science</b>:

In [None]:
!git clone https://github.com/ultralytics/yolov5
%cd yolov5
%pip install -qr requirements.txt

Now I will import PyTorch and the utils package that ships with the YOLOv5 repository.

In [None]:
import torch
import utils
display = utils.notebook_init()

In [None]:
##download the COCO128 dataset
!bash data/scripts/get_coco128.sh


Now that the packages are set up, I need to confirm that the COCO128 image dataset downloaded correctly. I will do so by displaying the first image in the dataset, which should show some vegetables, according to <b>Data-Driven Science</b>.

In [None]:
import matplotlib.pyplot as plt 
import matplotlib.image as img
import os

##get the list of all files in the directory (os.listdir returns them in arbitrary order)
file_list = os.listdir("/etc/noteable/project/datasets/coco128/images/train2017")

##use matplotlib to display the image
testImage = img.imread(f"/etc/noteable/project/datasets/coco128/images/train2017/{file_list[0]}")

plt.imshow(testImage)

## Step 2: Data Preparation
I'll now explore the dataset to evaluate its quality.

To explain the structure of the dataset, I will answer a few questions about it.

I've summarized this exploration in the table below, and have followed with further details:
<table>
  <tr>
    <th>Question</th>
    <th>Answer</th>
  </tr>
  <tr>
    <td>What is the file type of the dataset?</td>
    <td>Each image is a JPEG</td>
  </tr>
  <tr>
    <td>What is the size of your dataset? (You’re on the right track if you find 128 images in the train set)</td>
    <td>6930964 bytes for the image folder (the train set is expected to hold 128 images)</td>
  </tr>
  <tr>
    <td>What are the dimensions of an image?</td>
    <td>480 x 640 pixels for the two images checked</td>
  </tr>
  <tr>
    <td>How many classes are there?</td>
    <td>At most 80, the number of classes COCO defines</td>
  </tr>
  <tr>
    <td>How are the labels formatted and stored in the data?</td>
    <td>In text files, one per image</td>
  </tr>
</table>

### Data Exploration
- What is the file type of the dataset?

In [None]:
#print out the first 10 files in the directory to understand what the file type of the dataset is
print(file_list[:10])


The file names printed above all end in .jpg, so each image is a JPEG.
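Printing ten names only samples the folder. As a minimal sketch, the extension of every file can be tallied instead. The helper name count_extensions and the example file names are hypothetical; in the notebook, file_list would be passed in.

```python
from collections import Counter
from pathlib import Path


def count_extensions(filenames):
    """Return a Counter mapping lowercase file extension -> number of files."""
    return Counter(Path(name).suffix.lower() for name in filenames)


# Hypothetical file names for illustration; in the notebook, pass file_list instead.
example_names = ["000000000009.jpg", "000000000025.jpg", "notes.txt"]
print(count_extensions(example_names))
```

If the tally contains a single key, every file shares that extension.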

- What is the size of the dataset?

In [None]:
# assign size
size = 0
 
# assign folder path
Folderpath = '/etc/noteable/project/datasets/coco128/images/train2017/'

# get size
for path, dirs, files in os.walk(Folderpath):
    for f in files:
        fp = os.path.join(path, f)
        size += os.path.getsize(fp)
 
# display size
print("Folder size: " + str(size))


The images folder totals 6930964 bytes (about 6.9 MB).
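The size question can also be read as "how many images are there". A small sketch, assuming nothing beyond the standard library, that returns both the file count and the byte total; the function name folder_stats is hypothetical, and the demo runs on a temporary directory rather than the real dataset path.

```python
import tempfile
from pathlib import Path


def folder_stats(folder):
    """Return (number_of_files, total_bytes) for all files under folder."""
    n_files = 0
    total_bytes = 0
    for path in Path(folder).rglob("*"):
        if path.is_file():
            n_files += 1
            total_bytes += path.stat().st_size
    return n_files, total_bytes


# Demo on a throwaway directory so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.jpg").write_bytes(b"x" * 10)
    (Path(tmp) / "b.jpg").write_bytes(b"y" * 5)
    demo_stats = folder_stats(tmp)
print(demo_stats)
```

Calling folder_stats on the train2017 images folder should report the number of images alongside the byte count.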

- What are the dimensions of an image?


In [None]:
## use the matplotlib library to get the dimensions of an image
## (the array shape is rows x columns x channels, so height comes first, then width)
img1 = plt.imread(f'/etc/noteable/project/datasets/coco128/images/train2017/{file_list[0]}')
height, width = img1.shape[:2]
print(height,width)

img2 = plt.imread(f'/etc/noteable/project/datasets/coco128/images/train2017/{file_list[1]}')
height2, width2 = img2.shape[:2]
print(height2,width2)


The two images checked are 480 pixels tall x 640 pixels wide. COCO images are not all the same size, so other files in the folder may differ.

- How many classes are there?

In [None]:
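prefix="/etc/noteable/project/datasets/coco128/labels/train2017/"
file_list = os.listdir(prefix)

##collect the class ID, which is the first value on each line of a label file
classes=set()

for file in file_list:
    with open(prefix+file,"r") as f:
        for line in f:
            if line.strip():
                classes.add(int(line.split()[0]))
print(len(classes))

The printed number is the count of distinct class IDs that appear in this subset. COCO defines 80 classes, so the count can be at most 80. The labels are stored as text files, one per image, with one line per object: the class ID followed by the normalized box center x, center y, width, and height.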
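To make the label format concrete, the sketch below converts one label line of the form class x_center y_center width height (the normalized format YOLOv5 reads) into pixel coordinates. The function name yolo_to_pixel_box and the example line are hypothetical, chosen only for illustration.

```python
def yolo_to_pixel_box(line, img_width, img_height):
    """Parse one YOLO label line and return (class_id, x_min, y_min, x_max, y_max) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])
    # Box values are fractions of the image size, so scale by width or height
    x_min = (x_c - w / 2) * img_width
    y_min = (y_c - h / 2) * img_height
    x_max = (x_c + w / 2) * img_width
    y_max = (y_c + h / 2) * img_height
    return class_id, x_min, y_min, x_max, y_max


# Hypothetical label line: class 45, box centered in the image, half its width and height
example = yolo_to_pixel_box("45 0.5 0.5 0.5 0.5", img_width=640, img_height=480)
print(example)
```

Because the stored values are normalized, the same label stays valid if the image is resized; only the scaling step changes.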

## Step 3: Image Augmentation
Now, I will show an example of image augmentation for one of the images in the dataset.

In [None]:
from PIL import Image
from pathlib import Path
import numpy as np
import torchvision.transforms as T

In [None]:
plt.rcParams["savefig.bbox"] = 'tight'
torch.manual_seed(0) #for randomly applied transforms

In [None]:
##use matplotlib to display the original image
img_prefix="/etc/noteable/project/datasets/coco128/images/train2017/"
img_list = os.listdir(img_prefix)

##randomly select an image (NumPy's generator is not seeded by torch.manual_seed, so the pick changes between runs)
num = np.random.randint(len(img_list))
testImage = img.imread(f"{img_prefix}{img_list[num]}")

#use matplotlib to show the image
plt.imshow(testImage)

In [None]:
##convert the image to a numpy array
arr=np.array(testImage)

#ndarray.resize works in place and fails here because the array does not own its data;
#to rescale the picture to 128 x 128, go through PIL instead:
resized_image=Image.fromarray(testImage).resize((128,128))
resized_arr=np.array(resized_image)

In [None]:
#Grayscale the image
test_img_pil=Image.fromarray(testImage)
grayscale_image=T.Grayscale()(test_img_pil)

#display the grayscaled image (cmap="gray" stops matplotlib from applying its default colormap)
plt.imshow(grayscale_image, cmap="gray")

In [None]:
#Rotate the image by a random angle between -90 and +90 degrees
rotate=T.RandomRotation(90)

rotated_image=rotate(test_img_pil)

#display the rotated image
plt.imshow(rotated_image)

In [None]:
#Randomly flip the image horizontally (the default probability is 0.5, so it may come back unchanged)
flip=T.RandomHorizontalFlip()

flipped_image=flip(test_img_pil)

#display the flipped image
plt.imshow(flipped_image)
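The flip above only changes the pixels. In an object detection setting the bounding boxes would need the same mirror, or the labels would no longer match the image. A minimal sketch, assuming labels in the normalized YOLO form (class_id, x_center, y_center, width, height); the function name hflip_yolo_box and the example box are hypothetical.

```python
def hflip_yolo_box(box):
    """Mirror a normalized YOLO box (class_id, x_center, y_center, width, height) left to right."""
    class_id, x_c, y_c, w, h = box
    # Only the horizontal center moves; the box size and vertical position are unchanged
    return (class_id, 1.0 - x_c, y_c, w, h)


original = (0, 0.25, 0.5, 0.2, 0.4)
flipped = hflip_yolo_box(original)
print(flipped)
```

Applying the function twice returns the original box, which is a quick sanity check for this kind of label transform.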

In [None]:
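#normalize the original image with the standard ImageNet channel means and standard deviations
normalize = T.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
tensor_image = T.ToTensor()(test_img_pil)
normalized_img_tensor=normalize(tensor_image)
#convert back to a PIL image; normalized values fall outside [0, 1], so the displayed colors look distorted
normalized_PIL=T.ToPILImage()(normalized_img_tensor)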

plt.imshow(normalized_PIL)
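Normalization subtracts a per-channel mean and divides by a per-channel standard deviation. The sketch below works the arithmetic by hand for a single pixel, using the same mean and std values as the cell above; the function name normalize_pixel is hypothetical, and no torch import is needed.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply (value - mean) / std to each channel of one RGB pixel scaled to [0, 1]."""
    return tuple((value - m) / s for value, m, s in zip(rgb, mean, std))


black = normalize_pixel((0.0, 0.0, 0.0))
white = normalize_pixel((1.0, 1.0, 1.0))
print(black)  # every channel is negative
print(white)  # every channel is greater than 1
```

Since both extremes land outside the [0, 1] range, a normalized tensor is meant as model input rather than as something to display directly.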

In [None]:
#Create a pipeline of 3 transformations and display final image
transform1 = T.Resize([224,224])
transform2 = T.RandomCrop([128,128])
transform3 = T.RandomHorizontalFlip()

pipeline_transform= torch.nn.Sequential(transform1, transform2, transform3)

transformed_img=pipeline_transform(test_img_pil)
plt.imshow(transformed_img)