# Open In Colab



[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/HelloJahid/Project-NID/blob/main/NID-%20Object%20Detection/NID-Object%20Detection.ipynb)


# Detecto 
Detecto is a Python library built on top of PyTorch that simplifies building computer vision and object detection models. It is a lightweight package that reduces the amount of code needed to initialize models, apply transfer learning on custom datasets, and run inference on images and videos.

Detecto’s Model class is built on a Faster R-CNN ResNet-50 FPN architecture from torchvision’s models subpackage, which is pre-trained on the COCO 2017 dataset. By default, it can detect about 80 different objects such as fruits, animals, vehicles, kitchen appliances, and more.



First, mount Google Drive so the notebook can read the files stored there.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
! dir '/content/drive/My Drive/Colab Notebooks/Object Detection/Dectecto/own/Data'

Data.zip  peoples.zip


In [None]:
import zipfile

zip_filepath = '/content/drive/My Drive/Colab Notebooks/Object Detection/Dectecto/own/Data/peoples.zip'

# Extract the archive into ./Data; the context manager closes the file for us
with zipfile.ZipFile(zip_filepath, 'r') as zip_ref:
    zip_ref.extractall("./Data")

print("ok")

ok
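
The extraction step can be wrapped in a small helper that also reports what landed in the target folder. This is a minimal sketch using only the standard library; the helper name `extract_and_list` and the throwaway archive it builds are illustrative assumptions, not part of the original notebook.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_and_list(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted top-level names."""
    dest = Path(dest_dir)
    # The context manager closes the archive even if extraction fails
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


# Self-contained demo: build a tiny throwaway archive (placeholder file names)
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("images/a.jpg", b"")
        archive.writestr("train_labels/a.xml", "<annotation/>")
        archive.writestr("val_labels/b.xml", "<annotation/>")
    top_level = extract_and_list(demo_zip, Path(tmp) / "Data")

print(top_level)  # ['images', 'train_labels', 'val_labels']
```

Checking the top-level names right after extraction makes it easier to spot whether the archive unpacked into the folder layout that later cells expect.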


In [None]:
! dir  '/content/Data'

images	train_labels  val_labels


In [None]:
! ls

Data  drive  sample_data


In [None]:
# Note: if it states you must restart the runtime in order to use a
# newly installed version of a package, you do NOT need to do this. 
!pip install detecto

Collecting detecto
  Downloading https://files.pythonhosted.org/packages/a1/7f/f00c24877398ada1e56a401bcb8e6fa6ced476ddb2c73f310b0b0e89495d/detecto-1.1.4-py3-none-any.whl
Installing collected packages: detecto
Successfully installed detecto-1.1.4


Import everything we need in the following code block:

In [None]:
import torch
import torchvision
import matplotlib.pyplot as plt

from torchvision import transforms
from detecto import core, utils, visualize

To check that everything's working, we can try reading in one of the images from our images folder (uncomment the lines below to run the check).

In [None]:
# image = utils.read_image('/content/Data/peoples/images/DSC_0006.JPG')
# plt.imshow(image)
# plt.show()

Now we're ready to create our dataset and train our model. Before doing so, note that working with hundreds of individual XML label files is slow, so we first convert them into a single CSV file to save time later down the line.

In [None]:
# Do this twice: once for our training labels and once for our validation labels
utils.xml_to_csv('/content/Data/peoples/train_labels', 'train.csv')
utils.xml_to_csv('/content/Data/peoples/val_labels', 'val.csv')

| | filename | width | height | class | xmin | ymin | xmax | ymax |
|---|---|---|---|---|---|---|---|---|
| 0 | DSC_0003.JPG | 3000 | 4496 | h | 1024 | 297 | 2129 | 1515 |
| 1 | DSC_0114.JPG | 4496 | 3000 | j | 2007 | 120 | 2513 | 685 |
| 2 | IMG_20200131_124203.jpg | 3120 | 4160 | h | 1359 | 1349 | 1820 | 1782 |
| 3 | IMG_20200204_091655.jpg | 4160 | 3120 | j | 1674 | 612 | 2507 | 1312 |
| 4 | IMG_20200131_110220.jpg | 3968 | 2976 | h | 1468 | 659 | 1937 | 1169 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 56 | DSC_0105.JPG | 3000 | 4496 | h | 1371 | 597 | 1994 | 1191 |
| 57 | IMG_20200131_122719.jpg | 4608 | 3456 | j | 854 | 1243 | 2248 | 2655 |
| 58 | IMG_20200131_122719.jpg | 4608 | 3456 | h | 2285 | 1062 | 4129 | 3018 |
| 59 | IMG_20200204_162040.jpg | 3456 | 4608 | j | 928 | 575 | 2665 | 2169 |
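
To make the XML-to-CSV conversion less of a black box, here is a rough sketch of the idea: read each object in a Pascal VOC style annotation and emit one CSV row per bounding box. It uses only the standard library and is not Detecto's own implementation; the function name `voc_xml_to_rows` and the sample annotation are made up for illustration, while the column names follow the label preview above.

```python
import csv
import io
import xml.etree.ElementTree as ET

COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]

# Hypothetical annotation in Pascal VOC layout (values are invented)
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>h</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>j</name>
    <bndbox><xmin>300</xmin><ymin>40</ymin><xmax>400</xmax><ymax>140</ymax></bndbox>
  </object>
</annotation>
"""


def voc_xml_to_rows(xml_text):
    """Return one row per labeled object found in a single annotation."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append([filename, width, height, obj.findtext("name")] + coords)
    return rows


rows = voc_xml_to_rows(SAMPLE_XML)

# Write the rows to an in-memory CSV with a header line
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

An image with several labeled objects produces several rows that share the same filename, which is why `IMG_20200131_122719.jpg` appears twice in the preview.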


Below, we create our dataset, applying a couple of transforms beforehand. These are optional, but they can be useful for augmenting your dataset without gathering more data. 

In [None]:
# Specify a list of transformations for our dataset to apply on our images
transform_img = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(800),
    transforms.RandomHorizontalFlip(0.5),
    transforms.ToTensor(),
    utils.normalize_transform(),
])

dataset = core.Dataset('train.csv', '/content/Data/peoples/images/', transform=transform_img)

# dataset[i] returns a tuple containing our transformed image
# and a dictionary containing label and box data
image, target = dataset[4]

# Show our image along with the box. Note: it may
# be colored oddly due to being normalized by the 
# dataset and then reverse-normalized for plotting
visualize.show_labeled_image(image, target['boxes'], target['labels'])
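
Geometric transforms such as resizing and horizontal flipping change where objects sit in the image, so the bounding boxes have to move with them. The sketch below illustrates that geometry in plain Python. It is not Detecto's internal code, the helper names `resize_box` and `hflip_box` are hypothetical, and the numbers are chosen for easy arithmetic rather than taken from the dataset.

```python
def resize_box(box, image_size, short_side=800):
    """Scale an (xmin, ymin, xmax, ymax) box as the image's shorter side
    is resized to short_side with the aspect ratio preserved."""
    width, height = image_size
    scale = short_side / min(width, height)
    new_size = (width * scale, height * scale)
    return tuple(v * scale for v in box), new_size


def hflip_box(box, image_width):
    """Mirror a box around the vertical center line of the image."""
    xmin, ymin, xmax, ymax = box
    return (image_width - xmax, ymin, image_width - xmin, ymax)


# Illustrative numbers: a 1600x3200 image whose shorter side becomes 800
box, (new_w, new_h) = resize_box((400, 300, 800, 600), (1600, 3200))
flipped = hflip_box(box, new_w)

print(box)      # (200.0, 150.0, 400.0, 300.0)
print(flipped)  # (400.0, 150.0, 600.0, 300.0)
```

Note that a flip swaps the roles of `xmin` and `xmax`: the old right edge becomes the new left edge, which keeps `xmin` smaller than `xmax`.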

Finally, let's train our model! First, we create a DataLoader over our dataset to specify how we feed the images into our model. We also pass in our validation dataset so we can track the validation loss throughout training.

In [None]:
# Create our validation dataset
val_dataset = core.Dataset('val.csv', '/content/Data/peoples/images/')

# Create the loader for our training dataset
loader = core.DataLoader(dataset, batch_size=2, shuffle=True)

# Create our model, passing in all unique classes we're predicting
# Note: these must match the labels in the XML/CSV files exactly, so we
# read them from train.csv instead of typing them by hand
import pandas as pd
classes = sorted(pd.read_csv('train.csv')['class'].unique())
model = core.Model(classes)

# Train the model! This step can take a while, so make sure
# the GPU is turned on in Edit -> Notebook settings
losses = model.fit(loader, val_dataset, epochs=10, verbose=True)

# Plot the validation loss after each epoch
plt.show()
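
Because the class names given to the model must match the labels in the CSV file exactly, a quick sanity check before training can save a failed run. The following is a minimal sketch using only the standard library; the helper name `missing_classes`, the sample CSV text, and the example class lists are illustrative assumptions.

```python
import csv
import io


def missing_classes(csv_text, model_classes):
    """Return label names found in the CSV that the model class list lacks."""
    reader = csv.DictReader(io.StringIO(csv_text))
    found = {row["class"] for row in reader}
    return sorted(found - set(model_classes))


# Made-up label file with the same columns as the CSV created earlier
sample_csv = """filename,width,height,class,xmin,ymin,xmax,ymax
a.jpg,640,480,h,10,20,110,220
b.jpg,640,480,j,300,40,400,140
"""

matching = missing_classes(sample_csv, ["h", "j"])
mismatched = missing_classes(sample_csv, ["pen", "mouse", "remote"])

print(matching)    # []
print(mismatched)  # ['h', 'j']
```

An empty list means every label in the file is covered by the class list; anything else names the labels the model would not recognize.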

Let's see how well our model does on a few images from our validation set:

In [None]:
images = []
# Create a list of images 0, 5, 10, ... 40 from val_dataset
for i in range(0, 45, 5):
    image, _ = val_dataset[i]
    images.append(image)

# Plot a 3x3 grid of the model's predictions on our 9 images
visualize.plot_prediction_grid(model, images, dim=(3, 3), figsize=(16, 12))
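
When reading predictions, it often helps to keep only the detections above a confidence threshold. The sketch below assumes the predictions are available as three parallel sequences of labels, boxes, and scores; the helper name `filter_predictions`, the threshold, and the sample values are assumptions made for illustration, and real model outputs may be tensors rather than plain lists.

```python
def filter_predictions(labels, boxes, scores, threshold=0.6):
    """Keep only the detections whose score is at least threshold."""
    keep = [i for i, score in enumerate(scores) if score >= threshold]
    return (
        [labels[i] for i in keep],
        [boxes[i] for i in keep],
        [scores[i] for i in keep],
    )


# Invented predictions for a single image
labels = ["h", "j", "h"]
boxes = [(10, 20, 110, 220), (300, 40, 400, 140), (5, 5, 50, 50)]
scores = [0.92, 0.71, 0.18]

kept_labels, kept_boxes, kept_scores = filter_predictions(labels, boxes, scores)

print(kept_labels)  # ['h', 'j']
print(kept_scores)  # [0.92, 0.71]
```

A higher threshold gives fewer but more reliable boxes, while a lower one shows more candidates at the cost of extra false positives.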

If training went well, the model should output high confidence values for the correct class in most of the images. With a bit more fine-tuning, we could make it even better!

## Conclusion

Thanks for making it this far through the demo!

A great next step would be seeing how well the model works on a live video with several of the labeled classes in the same frame at the same time. To learn more about Detecto, be sure to check out the [Quickstart guide](https://detecto.readthedocs.io/en/latest/usage/quickstart.html), [Further Usage guide](https://detecto.readthedocs.io/en/latest/usage/further-usage.html), and [API docs](https://detecto.readthedocs.io/en/latest/api.html)!