# Testing YOLO on the COCO dataset

1. Load YOLOv5 from the repository: [https://pytorch.org/hub/ultralytics_yolov5/](https://pytorch.org/hub/ultralytics_yolov5/)
or from [TorchHub](https://pytorch.org/hub/ultralytics_yolov5/)

2. Use the fastest model, yolov5s:
  https://github.com/ultralytics/yolov5/releases

3. Make sure the model runs on an arbitrary image.

4. Using [pycocotools](https://github.com/cocodataset/cocoapi.git),
load the first 5 images that contain bicycles.


5. Detect the bicycles with YOLOv5.

6. Display the images with bounding boxes overlaid: YOLO detections in red, COCO annotations in green.

7. Optional: evaluate the detection accuracy with the [AP](https://cocodataset.org/#detection-eval) metric using [cocoEval.evaluate](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py)





## Load YOLOv5
from the repository: [https://pytorch.org/hub/ultralytics_yolov5/](https://pytorch.org/hub/ultralytics_yolov5/) or
from [TorchHub](https://pytorch.org/hub/ultralytics_yolov5/)

Use the fastest model, **yolov5s**: https://github.com/ultralytics/yolov5/releases

In [3]:
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

Using cache found in /root/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 🚀 2023-2-16 Python-3.8.10 torch-1.13.1+cu116 CPU

Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...


[Errno 2] No such file or directory: '/usr/local/lib/python3.8/dist-packages/setuptools-57.4.0.dist-info/METADATA'


  0%|          | 0.00/14.1M [00:00<?, ?B/s]


Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 


## Smoke test

Make sure the model runs on an arbitrary image.

Use the fastest model, yolov5s: https://github.com/ultralytics/yolov5/releases

In [4]:
from PIL import Image

!wget -O cat.png https://edunet.kea.su/repo/EduNet-content/L12/out/semantic_segmentation_1.png

# wget saved the file into the current working directory
img = Image.open("cat.png")

--2023-02-16 10:34:15--  https://edunet.kea.su/repo/EduNet-content/L12/out/semantic_segmentation_1.png
Resolving edunet.kea.su (edunet.kea.su)... 188.44.53.138
Connecting to edunet.kea.su (edunet.kea.su)|188.44.53.138|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 348184 (340K) [image/png]
Saving to: ‘cat.png’


2023-02-16 10:34:16 (654 KB/s) - ‘cat.png’ saved [348184/348184]



In [5]:
display(img)

<PIL.PngImagePlugin.PngImageFile image mode=RGB size=600x450 at 0x7FE9E3DD4BE0>

In [6]:
result = model(img)

In [7]:
result.show()

<PIL.Image.Image image mode=RGB size=600x450 at 0x7FE9E3E66070>

## Загрузка данных

Load the first 5 images with bicycles from COCO

*  Use [pycocotools](https://github.com/cocodataset/cocoapi.git)

*  Display the loaded images



In [8]:
%%bash

rm -rf coco
mkdir coco
cd coco

# mkdir images
# cd images
# wget -c http://images.cocodataset.org/zips/val2017.zip
# unzip -q val2017.zip
# rm val2017.zip
# cd ../

wget -qc http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip -qu annotations_trainval2017.zip
rm annotations_trainval2017.zip

In [9]:
from pycocotools.coco import COCO

annFile = 'coco/annotations/instances_val2017.json'
coco = COCO(annFile)
cat_ids = coco.getCatIds(catNms=['bicycle'])
bicycle_ids = coco.getImgIds(catIds=cat_ids)
bicycle_imgs = coco.loadImgs(bicycle_ids[0:5])

loading annotations into memory...
Done (t=0.86s)
creating index...
index created!


In [10]:
from torchvision.utils import draw_bounding_boxes
from torchvision.ops import box_convert
import skimage.io as io
import torchvision.transforms.functional as F
import numpy as np

images = []
boxes = []
for img_json in bicycle_imgs:
  img = io.imread(img_json['coco_url'])
  img = np.moveaxis(img, 2, 0)
  img_tensor = torch.tensor(img)
  images.append(img_tensor)

  bbox_ids = coco.getAnnIds(imgIds=img_json['id'], catIds=cat_ids)
  bboxes = [ann['bbox'] for ann in coco.loadAnns(bbox_ids)]
  box_tensor = box_convert(torch.tensor(bboxes), 'xywh', 'xyxy')
  boxes.append(box_tensor)

  pic = draw_bounding_boxes(img_tensor, box_tensor, colors="#0f0")
  display(F.to_pil_image(pic))

<PIL.Image.Image image mode=RGB size=640x425 at 0x7FE9E5B7AC70>

<PIL.Image.Image image mode=RGB size=640x480 at 0x7FE9E5B7AC70>

<PIL.Image.Image image mode=RGB size=640x428 at 0x7FEA04640E80>

<PIL.Image.Image image mode=RGB size=640x480 at 0x7FE9E5CFD6D0>

<PIL.Image.Image image mode=RGB size=640x426 at 0x7FE9E5CFD6D0>
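The cell above relies on `box_convert` because COCO stores each box as `[x, y, width, height]`, while `draw_bounding_boxes` expects corner coordinates `(xmin, ymin, xmax, ymax)`. As a minimal sketch of what that conversion does, here is a plain-Python version. The helper names and the sample box are made up for illustration and are not part of torchvision or pycocotools.

```python
def xywh_to_xyxy(box):
    """Convert a COCO-style [x, y, width, height] box to [xmin, ymin, xmax, ymax]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xyxy_to_xywh(box):
    """Convert a corner-style [xmin, ymin, xmax, ymax] box back to [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# Made-up box, not taken from the dataset
coco_box = [10.0, 20.0, 30.0, 40.0]
corners = xywh_to_xyxy(coco_box)
print(corners)
print(xyxy_to_xywh(corners))
```

The same round trip is needed later, when the YOLO predictions (corner format) are converted back to the COCO layout for evaluation.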

## Детектирование

Detect the bicycles in the loaded images with YOLO.

Then draw the bounding boxes predicted by YOLO in red and the COCO annotations in green.

https://pytorch.org/vision/stable/utils.html

### Result format

<img src ="https://edunet.kea.su/repo/EduNet-web_dependencies/Exercises/EX12/result_4_task_ex12.png"  width="400">






In [11]:
preds = [] 
scores = []
for img, box in zip(images, boxes):
  result = model(F.to_pil_image(img))
  df = result.pandas().xyxy[0]
  cond = (df.name == 'bicycle') & (df.confidence >= 0.4)
  res_box = df[cond][["xmin","ymin","xmax","ymax"]] 
  box_tensor = torch.tensor(np.array(res_box))
  preds.append(box_tensor)
  scores.append(df[cond].confidence)

  pic = draw_bounding_boxes(img, box, colors="#0f0")
  pic = draw_bounding_boxes(pic, box_tensor, colors="#f00")
  display(F.to_pil_image(pic))

<PIL.Image.Image image mode=RGB size=640x425 at 0x7FE9D6AB9250>

<PIL.Image.Image image mode=RGB size=640x480 at 0x7FE9D6AC9100>

<PIL.Image.Image image mode=RGB size=640x428 at 0x7FE9D6AC9100>

<PIL.Image.Image image mode=RGB size=640x480 at 0x7FE9D6AC9100>

<PIL.Image.Image image mode=RGB size=640x426 at 0x7FE9E3E6D5E0>
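How well a red box agrees with a green one is usually quantified with intersection over union (IoU), which is also the quantity the COCO AP thresholds refer to. The following is a minimal plain-Python sketch for two boxes in `(xmin, ymin, xmax, ymax)` format. The function name and the sample boxes are illustrative assumptions. For tensors, torchvision offers `torchvision.ops.box_iou`.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the overlapping rectangle (may be empty)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up boxes: the prediction is shifted by half the box width
ground_truth = (0.0, 0.0, 10.0, 10.0)
prediction = (5.0, 0.0, 15.0, 10.0)
print(box_iou(ground_truth, prediction))
```

In this example the overlap is 50 and the union is 150, so the IoU is one third. Such a detection would not count as a match at the IoU=0.50 threshold.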

## Quality evaluation

 Evaluate the detection accuracy with the mAP metric using [cocoEval.evaluate](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py)


 P.S. Instead of pycocotools you may use https://github.com/qubvel/segmentation_models.pytorch or https://torchmetrics.readthedocs.io/en/stable/

### Result format


AP values:

```
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.098
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.259
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.102
 ...
```
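For orientation, each AP line summarizes a precision-recall curve. The sketch below shows one way to compute AP at a single IoU threshold for one category, assuming every detection has already been marked as a true or false positive. It follows the 101-point recall interpolation that pycocotools uses, to the best of my understanding, but leaves out crowd handling, area ranges, and the averaging over IoU thresholds and categories. The function name and input layout are assumptions made for this example.

```python
def average_precision(detections, num_gt, num_points=101):
    """AP from (score, is_true_positive) pairs; assumes num_gt > 0."""
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ordered:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)

    # Make precision non-increasing as recall grows (the usual envelope)
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sample precision at evenly spaced recall levels 0.00, 0.01, ..., 1.00
    total = 0.0
    for k in range(num_points):
        level = k / (num_points - 1)
        for prec, rec in zip(precisions, recalls):
            if rec >= level:
                total += prec
                break
    return total / num_points


# Made-up example: one correct and one wrong detection, two ground-truth boxes
print(average_precision([(0.9, True), (0.8, False)], num_gt=2))
```

With half of the ground-truth boxes never found, precision is zero for every recall level above 0.5, so AP stays near 0.5 even though the top detection is correct.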


Now we need to put our results back into the format that COCO expects. I could not find documentation for it, and it appears to require all sorts of segmentation fields as well. The simplest approach turned out to be copying everything from the original annotations and replacing only `bbox`.

In [12]:
from pycocotools.cocoeval import COCOeval

coco_pred = []
for pred, img_json, score in zip(preds, bicycle_imgs, scores):
  ann_ids = coco.getAnnIds(imgIds=img_json['id'], catIds=cat_ids)
  arr = coco.loadAnns(ann_ids)
  # Only as many predictions as there are ground-truth annotations are kept
  count = min(len(arr), pred.shape[0])
  for ann, box, confidence in zip(arr[:count], pred, score):
    # Copy the ground-truth record and overwrite only the box and the score
    res = ann.copy()
    res["bbox"] = box_convert(box, 'xyxy', 'xywh').tolist()
    res["score"] = confidence
    coco_pred.append(res)

# running evaluation
coco_results = coco.loadRes(coco_pred)
cocoEval = COCOeval(coco, coco_results, 'bbox')
cocoEval.params.imgIds = bicycle_ids[0:5]
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()

Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.04s).
Accumulating evaluation results...
DONE (t=0.08s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.029
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.046
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.033
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.021
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.050
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.163
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.014
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.032
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.032
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=10

The results do not look great. Our (well, not really ours) model routinely misses bicycles or confuses people with them.
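As a possible alternative to copying ground-truth annotations: as far as I know, the COCO detection results format needs only four fields per detection (`image_id`, `category_id`, `bbox` in `[x, y, width, height]`, and `score`). Below is a minimal sketch of building such records. The helper name and the sample values are hypothetical, and the claim that `loadRes` accepts exactly this layout should be checked against the pycocotools source.

```python
def to_coco_detections(image_id, category_id, boxes_xyxy, scores):
    """Build COCO-style detection records from corner-format boxes."""
    results = []
    for (xmin, ymin, xmax, ymax), score in zip(boxes_xyxy, scores):
        results.append({
            "image_id": int(image_id),
            "category_id": int(category_id),
            # COCO expects [x, y, width, height]
            "bbox": [float(xmin), float(ymin),
                     float(xmax - xmin), float(ymax - ymin)],
            "score": float(score),
        })
    return results


# Made-up image id, category id, box and score, for illustration only
records = to_coco_detections(42, 2, [(10, 20, 40, 60)], [0.87])
print(records)
```

Built this way, every prediction is kept rather than only as many as there are ground-truth boxes. It may also be worth setting `cocoEval.params.catIds = cat_ids` before calling `evaluate()`: without it the summary is averaged over every category annotated in the selected images, which could be one reason for the low numbers above.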

## What if the model is bigger

In [13]:
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5x', pretrained=True)

Using cache found in /root/.cache/torch/hub/ultralytics_yolov5_master
[31m[1mrequirements:[0m YOLOv5 requirement "setuptools>=65.5.1" not found, attempting AutoUpdate...
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/

[31m[1mrequirements:[0m 1 package updated per /root/.cache/torch/hub/ultralytics_yolov5_master/requirements.txt
[31m[1mrequirements:[0m ⚠️ [1mRestart runtime or rerun command for updates to take effect[0m

YOLOv5 🚀 2023-2-16 Python-3.8.10 torch-1.13.1+cu116 CPU

Fusing layers... 
YOLOv5x summary: 444 layers, 86705005 parameters, 0 gradients
Adding AutoShape... 


In [14]:
preds = [] 
scores = []
for img, box in zip(images, boxes):
  result = model(F.to_pil_image(img))
  df = result.pandas().xyxy[0]
  cond = (df.name == 'bicycle') & (df.confidence >= 0.4)
  res_box = df[cond][["xmin","ymin","xmax","ymax"]] 
  box_tensor = torch.tensor(np.array(res_box))
  preds.append(box_tensor)
  scores.append(df[cond].confidence)

  pic = draw_bounding_boxes(img, box, colors="#0f0")
  pic = draw_bounding_boxes(pic, box_tensor, colors="#f00")
  display(F.to_pil_image(pic))

<PIL.Image.Image image mode=RGB size=640x425 at 0x7FE9D6019C70>

<PIL.Image.Image image mode=RGB size=640x480 at 0x7FE9E3E63E20>

<PIL.Image.Image image mode=RGB size=640x428 at 0x7FE9D60199A0>

<PIL.Image.Image image mode=RGB size=640x480 at 0x7FE9E3E6D490>

<PIL.Image.Image image mode=RGB size=640x426 at 0x7FE9D60198E0>

In [15]:
coco_pred = []
for pred, img_json, score in zip(preds, bicycle_imgs, scores):
  ann_ids = coco.getAnnIds(imgIds=img_json['id'], catIds=cat_ids)
  arr = coco.loadAnns(ann_ids)
  # Only as many predictions as there are ground-truth annotations are kept
  count = min(len(arr), pred.shape[0])
  for ann, box, confidence in zip(arr[:count], pred, score):
    # Copy the ground-truth record and overwrite only the box and the score
    res = ann.copy()
    res["bbox"] = box_convert(box, 'xyxy', 'xywh').tolist()
    res["score"] = confidence
    coco_pred.append(res)

# running evaluation
coco_results = coco.loadRes(coco_pred)
cocoEval = COCOeval(coco, coco_results, 'bbox')
cocoEval.params.imgIds = bicycle_ids[0:5]
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()

Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.04s).
Accumulating evaluation results...
DONE (t=0.11s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.036
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.060
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.030
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.031
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.067
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.155
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.014
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.039
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.039
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=10

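The larger model barely helped: AP@[0.50:0.95] went from 0.029 to 0.036 and AP@0.50 from 0.046 to 0.060, while AP@0.75 (0.033 to 0.030) and AP for large objects (0.163 to 0.155) got slightly worse. If we want better results, we probably need to move on to YOLOv8.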