$$
\newcommand{\mat}[1]{\boldsymbol {#1}}
\newcommand{\mattr}[1]{\boldsymbol {#1}^\top}
\newcommand{\matinv}[1]{\boldsymbol {#1}^{-1}}
\newcommand{\vec}[1]{\boldsymbol {#1}}
\newcommand{\vectr}[1]{\boldsymbol {#1}^\top}
\newcommand{\rvar}[1]{\mathrm {#1}}
\newcommand{\rvec}[1]{\boldsymbol{\mathrm{#1}}}
\newcommand{\diag}{\mathop{\mathrm {diag}}}
\newcommand{\set}[1]{\mathbb {#1}}
\newcommand{\cset}[1]{\mathcal{#1}}
\newcommand{\norm}[1]{\left\lVert#1\right\rVert}
\newcommand{\pderiv}[2]{\frac{\partial #1}{\partial #2}}
\newcommand{\bb}[1]{\boldsymbol{#1}}
\newcommand{\E}[2][]{\mathbb{E}_{#1}\left[#2\right]}
\newcommand{\ip}[3]{\left<#1,#2\right>_{#3}}
\newcommand{\given}[]{\,\middle\vert\,}
\newcommand{\DKL}[2]{\cset{D}_{\text{KL}}\left(#1\,\Vert\, #2\right)}
\newcommand{\grad}[]{\nabla}
$$

# Part 1: Mini-Project
<a id=part3></a>

In this part you'll implement a small comparative-analysis project, heavily based on the materials from the tutorials and homework.

### Guidelines

- You should implement the code which displays your results in this notebook, and add any additional code files for your implementation in the `project/` directory. You can import these files here, as we do for the homeworks.
- Running this notebook should not perform any training - load your results from some output files and display them here. The notebook must be runnable from start to end without errors.
- You must include a detailed write-up (in the notebook) of what you implemented and how. 
- Explain the structure of your code and how to run it to reproduce your results.
- Explicitly state any external code you used, including built-in pytorch models and code from the course tutorials/homework.
- Analyze your numerical results, explaining **why** you got these results (not just specifying the results).
- Where relevant, place all results in a table or display them using a graph.
- Before submitting, make sure all files which are required to run this notebook are included in the generated submission zip.
- Try to keep the submission file size under 10MB. Do not include model checkpoint files, dataset files, or any other non-essential files. Instead, include your results as images/text files/pickles/etc., and load them for display in this notebook.
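
As a rough illustration of the "load your results from output files" guideline, the sketch below reads a per-epoch metrics CSV and picks the best epoch for display. The column names (`epoch`, `train_loss`, `val_map50`) and the in-memory file are assumptions made for the example, not a format required by the course; adapt them to whatever your training script actually writes.

```python
import csv
import io


def load_metrics(csv_file):
    """Read a per-epoch metrics CSV into a list of dicts with float values."""
    # skipinitialspace drops the blank that follows each comma in the file
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    rows = []
    for row in reader:
        rows.append({key.strip(): float(value) for key, value in row.items()})
    return rows


# Stand-in for a results file saved by a training run (hypothetical columns).
example_csv = io.StringIO(
    "epoch, train_loss, val_map50\n"
    "0, 2.10, 0.05\n"
    "1, 1.74, 0.11\n"
)

metrics = load_metrics(example_csv)
best = max(metrics, key=lambda row: row["val_map50"])
print(f"best epoch: {int(best['epoch'])}, mAP50 = {best['val_map50']:.2f}")
```

In the notebook itself you would pass an open file handle, for example `open(path, newline="")`, instead of the `io.StringIO` stand-in.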

## Object detection on TACO dataset

TACO is a growing image dataset of waste in the wild. It contains images of litter taken in diverse environments such as woods, roads and beaches.

<center><img src="imgs/taco.png" /></center>


You can read more about the dataset here: https://github.com/pedropro/TACO

You can explore the data distribution and see how to load it here: https://github.com/pedropro/TACO/blob/master/demo.ipynb


The stable version of the dataset, which contains 1500 images and 4787 annotations, is available in `datasets/TACO-master`.
You do not need to download the dataset.


### Project goals:

* You need to perform an object detection task over 7 classes of the dataset.
* The annotation for object detection can be downloaded from here: https://github.com/wimlds-trojmiasto/detect-waste/tree/main/annotations.
* The data and annotation format follows the COCO API: https://github.com/cocodataset/cocoapi (you can find a notebook showing how to perform evaluation with it here: https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb). You need to install it yourself.
* If you need a beginner's guide to object detection with the COCO API, you can read and watch this link: https://www.neuralception.com/cocodatasetapi/
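
COCO-style annotations store each box as `[x_min, y_min, width, height]` in pixels, while YOLO-family models expect `(x_center, y_center, width, height)` normalized by the image size. If you train such a model on these annotations, a conversion along the following lines is needed. This is a minimal sketch; the function name `coco_to_yolo` is made up for the example.

```python
def coco_to_yolo(bbox, img_width, img_height):
    """Convert a COCO box [x_min, y_min, width, height] (pixels) into a
    YOLO box (x_center, y_center, width, height) normalized to [0, 1]."""
    x_min, y_min, box_w, box_h = bbox
    x_center = (x_min + box_w / 2) / img_width
    y_center = (y_min + box_h / 2) / img_height
    return x_center, y_center, box_w / img_width, box_h / img_height


# A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image.
yolo_box = coco_to_yolo([100, 50, 200, 100], img_width=400, img_height=200)
print(yolo_box)
```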

### What do I need to do?

* **Everything is fair game!** As long as your model does not require more than 8 GB of memory and you follow the guidelines above.


### What does it mean?
* You can use data augmentation, either what is already implemented in the directory or external libraries such as https://albumentations.ai/ (note that when you create your own augmentations you need to transform the annotations as well).
* You can use more data if you find it useful (for example, review https://github.com/AgaMiko/waste-datasets-review).
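
To make the point about annotations concrete: if you flip an image horizontally yourself, every box has to be mirrored too. The sketch below does this for a COCO-style `[x_min, y_min, width, height]` box. The helper name `hflip_coco_bbox` is hypothetical, and augmentation libraries typically handle this step for you when you pass them the boxes.

```python
def hflip_coco_bbox(bbox, img_width):
    """Mirror a COCO box [x_min, y_min, width, height] around the vertical
    center line of an image that is img_width pixels wide."""
    x_min, y_min, box_w, box_h = bbox
    # The old right edge becomes the new left edge after the flip.
    new_x_min = img_width - (x_min + box_w)
    return [new_x_min, y_min, box_w, box_h]


flipped = hflip_coco_bbox([10, 20, 30, 40], img_width=100)
print(flipped)
```

Flipping twice should return the original box, which is a cheap sanity check for any custom augmentation.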


### What model can I use?
* Whatever you want! You can review strong models for the COCO object detection task as a reference:
  * SOTA: https://paperswithcode.com/sota/object-detection-on-coco
  * Real-time: https://paperswithcode.com/sota/real-time-object-detection-on-coco
  * Or you can use older models such as YOLOv3 or Faster R-CNN.
* As long as you can justify your choice (complexity, speed, performance), you are golden.

### Tips for a good grade:
* Start as simple as possible. Working with these APIs is not easy the first time, and I expect this to be your main issue. Only once you have a running model that learns should you add training tricks.
* Use the notebook's visualization tools, as we did throughout the course, to check that your input actually fits the model, that the output has the desired size, and so on.
* It is recommended to resize the images to a fixed size, as shown here: https://github.com/pedropro/TACO/blob/master/detector/inspect_data.ipynb
* Please describe the architecture and your loss function(s) in this notebook. If you decide to add a loss component such as the focal loss, try to show the results before and after using it.
* Plot your losses in this notebook. Any evaluation metric can be shown as a function of time and, where possible, analyzed per class.
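
For reference, the focal loss mentioned above down-weights easy examples by scaling the cross-entropy term with $(1 - p_t)^\gamma$: $\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^\gamma \log(p_t)$. The following is a minimal pure-Python sketch for a single binary prediction, assuming the commonly used defaults $\gamma = 2$ and $\alpha = 0.25$. It is meant only to illustrate the formula, not to replace a batched PyTorch implementation.

```python
import math


def binary_focal_loss(p, target, gamma=2.0, alpha=0.25):
    """Focal loss for one binary prediction.

    p: predicted probability of the positive class, strictly inside (0, 1)
    target: 1 for a positive example, 0 for a negative one
    """
    p_t = p if target == 1 else 1.0 - p               # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, target=1)  # confident and correct
hard = binary_focal_loss(0.1, target=1)  # confident and wrong
print(f"easy example: {easy:.5f}, hard example: {hard:.5f}")
```

With `gamma=0` and `alpha=1` the expression reduces to plain cross-entropy for positive examples, which gives a natural "before" baseline for the comparison.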

Good luck!

## Implementation

**TODO**: This is where you should write your explanations and implement the code to display the results.
See guidelines about what to include in this section.

In [1]:
from roboflow import Roboflow
from ultralytics import YOLO
import project.model_training as mt

In [3]:
# Run this block only on the first run.
# Download the datasets.
train_set, test_set = mt.load_datasets()

# Edit the test set to fit future evaluations.
# %run project/edit_test.py

# Store the datasets for future use.
# %store train_set
# %store test_set
# %store

loading Roboflow workspace...
loading Roboflow project...
Dependency ultralytics<=8.0.20 is required but found version=8.0.59, to fix: `pip install ultralytics<=8.0.20`
Downloading Dataset Version Zip in TACO_train_only-1 to yolov8: 100% [126861221 / 126861221] bytes


Extracting Dataset Version Zip to TACO_train_only-1 in yolov8:: 100%|████████████████████████████████████████████████████| 2373/2373 [00:10<00:00, 226.23it/s]

loading Roboflow workspace...
loading Roboflow project...
Dependency ultralytics<=8.0.20 is required but found version=8.0.59, to fix: `pip install ultralytics<=8.0.20`
Downloading Dataset Version Zip in TACO_test_set-1 to yolov8: 100% [33459346 / 33459346] bytes


Extracting Dataset Version Zip to TACO_test_set-1 in yolov8:: 100%|████████████████████████████████████████████████████████| 640/640 [00:02<00:00, 281.45it/s]


In [8]:
# If you've already downloaded the datasets, you can restore them here:
%store -r
%store

Stored variables and their in-db values:
test_set              -> <roboflow.core.dataset.Dataset object at 0x7f3b208
train_set             -> <roboflow.core.dataset.Dataset object at 0x7f39fa3


In [None]:
# On the first run, download the model.
model, train_res = mt.set_model(train_set, "yolov8l.pt")

# On later runs, load the existing model instead.
# model = YOLO("runs/detect/train21/weights/best.pt") # edit path

Ultralytics YOLOv8.0.59 🚀 Python-3.8.12 torch-1.10.1 CUDA:0 (NVIDIA GeForce GTX 1080 Ti, 11178MiB)
yolo/engine/trainer: task=detect, mode=train, model=yolov8l.pt, data=/home/muradek/Deep_project/TACO_train_only-1/data.yaml, epochs=10, patience=5, batch=2, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=None, exist_ok=False, pretrained=False, optimizer=SGD, verbose=True, seed=0, deterministic=True, single_cls=False, image_weights=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, hide_labels=False, hide_conf=False, vid_stride=1, line_thickness=3, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, boxes=True, format=torchscript, ke

In [6]:
# predict with the model on the test set for evaluation 
test_res = mt.evaluate_model(test_set, model)

started evaluating


Ultralytics YOLOv8.0.59 🚀 Python-3.8.12 torch-1.10.1 CUDA:0 (NVIDIA GeForce GTX 1080 Ti, 11178MiB)
Model summary (fused): 268 layers, 43612005 parameters, 0 gradients, 164.8 GFLOPs
val: Scanning /home/muradek/Deep_project/TACO_test_set-1/valid/labels... 317 images, 0 backgrounds, 0 corrupt: 100%|██████████| 317/317 [00:00
val: New cache created: /home/muradek/Deep_project/TACO_test_set-1/valid/labels.cache
This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 20/20 [00:24<00:00,  1.22s/it]
                   all        317        957     0.0269      0.464     0

finished evaluating
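
The `mAP50` column in the validation output counts a predicted box as a match when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5, and `mAP50-95` averages over IoU thresholds from 0.5 to 0.95. As a reminder of what is being thresholded, here is a minimal IoU sketch for boxes given as `(x1, y1, x2, y2)` corners. The function name `iou_xyxy` is chosen for the example and is not part of `project.model_training`.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


overlap = iou_xyxy((0, 0, 2, 2), (1, 1, 3, 3))
print(f"IoU = {overlap:.3f}")
```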
