# Semana Acadêmica do Curso de Informática IFSul (SACI) 2024
Short course: AI in action: object detection with YOLO

Instructor: Prof. Sandro Camargo

To open this notebook in your Google Colab, [click here](https://colab.research.google.com/github/Sandrocamargo/courses/blob/main/2024%20SACI/Oficina_YOLOv5.ipynb).

*You only look once* (YOLO) is a state-of-the-art system for real-time object detection. On a Pascal Titan X GPU it processes images at 30 FPS and reaches a mAP of 57.9% on the COCO test set.

The YOLO algorithm was trained on the *Common Objects in Context* (COCO) dataset: <https://cocodataset.org/>

YOLO is one of the most widely used models for image segmentation and object detection. It was developed by Joseph Redmon and Ali Farhadi at the University of Washington. Released in 2015, YOLO quickly gained popularity thanks to its speed and accuracy.

As of this course (2024), the latest version is YOLOv10.

# Environment setup

The first step is to clone the GitHub [repository](https://github.com/ultralytics/yolov5), install the [dependencies](https://github.com/ultralytics/yolov5/blob/master/requirements.txt), and check PyTorch and the GPU.

In [1]:
!git clone https://github.com/ultralytics/yolov5  # clone
%cd yolov5
%pip install -qr requirements.txt  # install

import torch
import utils
display = utils.notebook_init()  # checks

YOLOv5 🚀 v7.0-366-gf7322921 Python-3.10.12 torch-2.4.0+cu121 CUDA:0 (Tesla T4, 15102MiB)


Setup complete ✅ (2 CPUs, 12.7 GB RAM, 32.9/112.6 GB disk)


In [2]:
from google.colab import drive
drive.mount('/content/drive',force_remount=True)

Mounted at /content/drive


# 1. Detecting objects

`detect.py` runs YOLOv5 inference on a variety of image and video sources, automatically downloads the models from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases), and saves the results to the `runs/detect` folder of your Google Colab environment. Examples of inference sources are:

```shell
python detect.py --source 0  # webcam
                          img.jpg  # image
                          vid.mp4  # video
                          screen  # screenshot
                          path/  # directory
                         'path/*.jpg'  # glob
```
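
Since every cell below calls `detect.py` with the same three flags (`--conf`, `--source`, `--data`) and only the source changes, the calls can be generated from a list of sources. The following is a minimal sketch; `build_detect_command` is a hypothetical helper introduced here for illustration and is not part of the YOLOv5 repository.

```python
import shlex


def build_detect_command(source, conf=0.3, data="data/coco.yaml"):
    """Return the argument list for one detect.py call (hypothetical helper).

    Only the flags used in this notebook are included.
    """
    return [
        "python", "detect.py",
        "--conf", str(conf),
        "--source", str(source),
        "--data", data,
    ]


# Sources taken from the examples in this notebook.
sources = ["data/images/bus.jpg", "data/images/", "/content/video0.mp4", "screen"]

for src in sources:
    # shlex.join adds quotes only where the shell would need them.
    print(shlex.join(build_detect_command(src)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with sources such as `'path/*.jpg'`.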

In [3]:
# to run on an image stored in your Colab environment, pass the file path as source
!python detect.py --conf 0.3 --source 'data/images/bus.jpg' --data data/coco.yaml

[34m[1mdetect: [0mweights=yolov5s.pt, source=data/images/bus.jpg, data=data/coco.yaml, imgsz=[640, 640], conf_thres=0.3, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-366-gf7322921 Python-3.10.12 torch-2.4.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...
100% 14.1M/14.1M [00:00<00:00, 120MB/s] 

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 /content/yolov5/data/images/bus.jpg: 640x480 4 persons, 1 bus, 28.8ms
Speed: 0.5ms pre-process, 28.8ms inference, 844.2ms NMS per image at shape (1, 3, 640, 640)
Results save
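
The log above reports `conf_thres=0.3` (set by `--conf 0.3`) and `iou_thres=0.45`, and its timing line includes an NMS (non-maximum suppression) step. As a rough illustration of what these two thresholds do, the sketch below discards low-confidence boxes and then greedily removes boxes that overlap a higher-scoring one too much. It is a simplified, single-class, pure-Python version written for this explanation, not the YOLOv5 implementation, and the example boxes are invented.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def simple_nms(detections, conf_thres=0.3, iou_thres=0.45):
    """Greedy NMS over (box, score) pairs belonging to one class."""
    # 1. Drop detections whose confidence does not exceed the threshold.
    candidates = [d for d in detections if d[1] > conf_thres]
    # 2. Visit boxes from highest to lowest score.
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        # 3. Keep a box only if it does not overlap a kept box too much.
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


# Invented example data: two overlapping boxes, one separate, one weak.
detections = [
    ((100, 100, 200, 200), 0.9),
    ((110, 110, 210, 210), 0.6),  # overlaps the first box heavily
    ((300, 300, 400, 400), 0.5),
    ((0, 0, 50, 50), 0.2),        # below conf_thres
]
for box, score in simple_nms(detections):
    print(box, score)
```

Raising `--conf` removes more uncertain detections. The IoU threshold controls how much two boxes may overlap before the lower-scoring one is treated as a duplicate.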

In [4]:
# to run on several images stored in your Colab environment, pass the folder path as source
!python detect.py --conf 0.3 --source 'data/images/' --data data/coco.yaml

[34m[1mdetect: [0mweights=yolov5s.pt, source=data/images/, data=data/coco.yaml, imgsz=[640, 640], conf_thres=0.3, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-366-gf7322921 Python-3.10.12 torch-2.4.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/2 /content/yolov5/data/images/bus.jpg: 640x480 4 persons, 1 bus, 29.2ms
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 2 persons, 1 tie, 34.8ms
Speed: 0.5ms pre-process, 32.0ms inference, 356.8ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp2
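
The per-image and per-frame lines printed by `detect.py` can be turned into object counts for further analysis. The sketch below parses lines in the format shown in this notebook. It assumes that the log pluralizes a class name by simply appending an "s" when the count is greater than one (which would explain forms such as "3 buss" in the video logs). The function name `parse_detect_line` is hypothetical.

```python
import re

# Matches e.g. "image 1/2 /path/bus.jpg: 640x480 4 persons, 1 bus, 29.2ms"
# and      "video 1/1 (3/4466) /path/video0.mp4: 384x640 1 person, 8.0ms"
LINE_RE = re.compile(
    r"^(?:image|video) \d+/\d+ (?:\(\d+/\d+\) )?(?P<path>.+?): \d+x\d+ (?P<body>.*)$"
)
ITEM_RE = re.compile(r"(\d+) (.+)")


def parse_detect_line(line):
    """Return (path, {class_name: count}) or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    counts = {}
    for item in match.group("body").split(", "):
        item_match = ITEM_RE.fullmatch(item)
        if item_match is None:
            continue  # skips the timing ("29.2ms") and "(no detections)"
        n, name = int(item_match.group(1)), item_match.group(2)
        if n > 1 and name.endswith("s"):
            name = name[:-1]  # undo the assumed "append s" pluralization
        counts[name] = n
    return match.group("path"), counts


line = "image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 2 persons, 1 tie, 34.8ms"
print(parse_detect_line(line))
```

Applied to a video log, summing the dictionaries frame by frame gives a quick picture of which classes appear and how often.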


In [5]:
# To run on an image available on the internet, pass its direct link
!python detect.py --conf 0.3 --source 'https://revistanews.com.br/wp-content/uploads/2022/09/Desfile-farroupilha-NH.jpg' --data data/coco.yaml

[34m[1mdetect: [0mweights=yolov5s.pt, source=https://revistanews.com.br/wp-content/uploads/2022/09/Desfile-farroupilha-NH.jpg, data=data/coco.yaml, imgsz=[640, 640], conf_thres=0.3, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
Downloading https://revistanews.com.br/wp-content/uploads/2022/09/Desfile-farroupilha-NH.jpg to Desfile-farroupilha-NH.jpg...
100% 54.0k/54.0k [00:00<00:00, 239kB/s]
YOLOv5 🚀 v7.0-366-gf7322921 Python-3.10.12 torch-2.4.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 /content/yolov5/Desfile-farroupilha-NH.jpg: 384x640 6 persons, 2 cars, 7 horses, 31.6ms
Sp

In [6]:
!python detect.py --conf 0.3 --source 'https://estado.rs.gov.br/upload/recortes/201707/19202713_1174768_GD.jpg' --data data/coco.yaml

[34m[1mdetect: [0mweights=yolov5s.pt, source=https://estado.rs.gov.br/upload/recortes/201707/19202713_1174768_GD.jpg, data=data/coco.yaml, imgsz=[640, 640], conf_thres=0.3, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
Downloading https://estado.rs.gov.br/upload/recortes/201707/19202713_1174768_GD.jpg to 19202713_1174768_GD.jpg...
100% 112k/112k [00:00<00:00, 276kB/s]
YOLOv5 🚀 v7.0-366-gf7322921 Python-3.10.12 torch-2.4.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 /content/yolov5/19202713_1174768_GD.jpg: 384x640 4 cows, 33.1ms
Speed: 0.5ms pre-process, 33.1ms inference, 514.0

To run on a YouTube video, first download the video and store it in your environment.

In [7]:
!pip install yt-dlp
!yt-dlp -f best -o "/content/video0.mp4" "https://www.youtube.com/watch?v=JpPP7OQQasA"

Collecting yt-dlp
  Downloading yt_dlp-2024.8.6-py3-none-any.whl.metadata (170 kB)
Collecting brotli (from yt-dlp)
  Downloading Brotli-1.1.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl.metadata (5.5 kB)
Collecting mutagen (from yt-dlp)
  Downloading mutagen-1.47.0-py3-none-any.whl.metadata (1.7 kB)
Collecting pycryptodomex (from yt-dlp)
  Downloading pycryptodomex-3.20.0-cp35-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting websockets>=12.0 (from yt-dlp)
  Downloading websockets-13.0.1-cp310-cp310-manylinux_2_5_x86_64.manyli

In [8]:
!python detect.py --conf 0.3 --source '/content/video0.mp4' --data data/coco.yaml

[34m[1mdetect: [0mweights=yolov5s.pt, source=/content/video0.mp4, data=data/coco.yaml, imgsz=[640, 640], conf_thres=0.3, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-366-gf7322921 Python-3.10.12 torch-2.4.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
video 1/1 (1/4466) /content/video0.mp4: 384x640 1 person, 10 cars, 1 bus, 1 traffic light, 1 clock, 31.9ms
video 1/1 (2/4466) /content/video0.mp4: 384x640 1 person, 10 cars, 1 bus, 1 traffic light, 1 clock, 8.0ms
video 1/1 (3/4466) /content/video0.mp4: 384x640 1 person, 10 cars, 1 bus, 1 traffic light, 1 clock, 7.9ms
video 1/
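
The per-frame times in the log can be related to the frames-per-second figure quoted in the introduction: a frame that takes `t` milliseconds corresponds to `1000 / t` FPS. The sketch below uses the three inference times visible above. The first frame is skipped on the assumption that its higher time reflects one-off warm-up. These times cover inference only; the "Speed:" summary lines in earlier cells list pre-processing and NMS separately, so end-to-end throughput is lower.

```python
def fps_from_ms(ms_per_frame):
    """Convert a per-frame time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


# Inference times of the first three frames of video0.mp4, copied from the log.
inference_ms = [31.9, 8.0, 7.9]

# Assumption: the first frame includes warm-up cost, so it is left out.
steady = inference_ms[1:]
mean_ms = sum(steady) / len(steady)

print(f"{mean_ms:.2f} ms/frame is about {fps_from_ms(mean_ms):.0f} FPS (inference only)")
```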

In [12]:
!mv /content/yolov5/runs/detect/exp5 /content/drive/MyDrive

mv: inter-device move failed: '/content/yolov5/runs/detect/exp5' to '/content/drive/MyDrive/exp5'; unable to remove target: Directory not empty
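
The error above suggests that `/content/drive/MyDrive/exp5` already existed and was not empty, possibly from an earlier run of the same cell, so `mv` could not replace it. One alternative is to copy the results folder and merge it into the existing one. The sketch below uses `shutil.copytree` with `dirs_exist_ok=True` (available since Python 3.8). `copy_run` is a hypothetical helper, and the demonstration uses temporary folders so that it runs outside Colab as well.

```python
import shutil
import tempfile
from pathlib import Path


def copy_run(src, dst_root):
    """Copy a results folder into dst_root, merging into an existing folder."""
    src = Path(src)
    dst = Path(dst_root) / src.name
    # dirs_exist_ok=True lets the copy proceed when dst already exists;
    # files with the same name are overwritten, other files are kept.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Demonstration with temporary folders standing in for runs/detect and Drive.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "runs" / "detect" / "exp5"
    run_dir.mkdir(parents=True)
    (run_dir / "bus.jpg").write_text("fake image data")

    drive_dir = Path(tmp) / "MyDrive"
    (drive_dir / "exp5").mkdir(parents=True)  # target already exists
    (drive_dir / "exp5" / "old.jpg").write_text("earlier result")

    copied = copy_run(run_dir, drive_dir)
    print(sorted(p.name for p in copied.iterdir()))
```

In the notebook, the equivalent call would be `copy_run("/content/yolov5/runs/detect/exp5", "/content/drive/MyDrive")`. Unlike `mv`, this leaves the original folder in place.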


In [10]:
!yt-dlp -f best -o "/content/video1.mp4" "https://youtu.be/z0AfpPI_ecI"

         To let yt-dlp download and merge the best available formats, simply do not pass any format selection.
[youtube] Extracting URL: https://youtu.be/z0AfpPI_ecI
[youtube] z0AfpPI_ecI: Downloading webpage
[youtube] z0AfpPI_ecI: Downloading ios player API JSON
[youtube] z0AfpPI_ecI: Downloading web creator player API JSON
[youtube] z0AfpPI_ecI: Downloading m3u8 information
[info] z0AfpPI_ecI: Downloading 1 format(s): 18
[download] Destination: /content/video1.mp4
[download] 100% of   16.60MiB in 00:00:00 at 25.36MiB/s


In [15]:
!yt-dlp -f best -o "/content/video2.mp4" "https://www.youtube.com/watch?v=AMPxKJdmnR8"

         To let yt-dlp download and merge the best available formats, simply do not pass any format selection.
[youtube] Extracting URL: https://www.youtube.com/watch?v=AMPxKJdmnR8
[youtube] AMPxKJdmnR8: Downloading webpage
[youtube] AMPxKJdmnR8: Downloading ios player API JSON
[youtube] AMPxKJdmnR8: Downloading web creator player API JSON
[youtube] AMPxKJdmnR8: Downloading m3u8 information
[info] AMPxKJdmnR8: Downloading 1 format(s): 18
[download] Destination: /content/video2.mp4
[download] 100% of    5.60MiB in 00:00:00 at 5.82MiB/s


In [16]:
# Running detection
!python detect.py --conf 0.3 --source '/content/video2.mp4' --data data/coco.yaml

[34m[1mdetect: [0mweights=yolov5s.pt, source=/content/video2.mp4, data=data/coco.yaml, imgsz=[640, 640], conf_thres=0.3, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-366-gf7322921 Python-3.10.12 torch-2.4.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
video 1/1 (1/2071) /content/video2.mp4: 384x640 1 person, 7 cars, 3 buss, 1 truck, 1 tv, 30.3ms
video 1/1 (2/2071) /content/video2.mp4: 384x640 1 person, 7 cars, 3 buss, 1 truck, 1 tv, 8.0ms
video 1/1 (3/2071) /content/video2.mp4: 384x640 6 cars, 2 buss, 2 trucks, 1 tv, 7.9ms
video 1/1 (4/2071) /content/video2.mp4: 384x640 6 

In [None]:
# Running detection
!python detect.py --conf 0.3 --source '/content/video1.mp4' --data data/coco.yaml

Streaming output truncated to the last 5000 lines.
video 1/1 (7830/12827) /content/video1.mp4: 384x640 (no detections), 8.2ms
video 1/1 (7831/12827) /content/video1.mp4: 384x640 (no detections), 7.5ms
video 1/1 (7832/12827) /content/video1.mp4: 384x640 (no detections), 5.9ms
video 1/1 (7833/12827) /content/video1.mp4: 384x640 (no detections), 5.9ms
video 1/1 (7834/12827) /content/video1.mp4: 384x640 (no detections), 5.6ms
video 1/1 (7835/12827) /content/video1.mp4: 384x640 (no detections), 5.9ms
video 1/1 (7836/12827) /content/video1.mp4: 384x640 (no detections), 5.8ms
video 1/1 (7837/12827) /content/video1.mp4: 384x640 (no detections), 6.2ms
video 1/1 (7838/12827) /content/video1.mp4: 384x640 (no detections), 6.0ms
video 1/1 (7839/12827) /content/video1.mp4: 384x640 (no detections), 7.7ms
video 1/1 (7840/12827) /content/video1.mp4: 384x640 (no detections), 7.2ms
video 1/1 (7841/12827) /content/video1.mp4: 384x640 (no detections), 5.9ms
video 1/1 (7842/12827) /con

In [None]:
!mv /content/yolov5/runs/detect/exp6 /content/drive/MyDrive

In [None]:
!python detect.py --conf 0.3 --source screen --data data/coco.yaml

# Training YOLO for new problems

The process starts with labeling, continues with retraining, and finally reaches the usage phase.

Labeling is done with tools such as LabelImg.
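
To make the labeling step more concrete: LabelImg can save annotations either as Pascal VOC XML (pixel corner coordinates) or directly in the YOLO text format, where each line holds `class x_center y_center width height` with the four box values normalized to the range 0 to 1. The sketch below converts one pixel-coordinate box to a YOLO label line. The function names, the class id, and the example box are illustrative assumptions.

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalized
    (x_center, y_center, width, height)."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_label_line(class_id, box, img_w, img_h):
    """Format one object as a line of a YOLO label file."""
    values = voc_to_yolo(box, img_w, img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# Invented example: one object of class 0 centered in a 640x480 image.
print(yolo_label_line(0, (160, 120, 480, 360), 640, 480))
# prints: 0 0.500000 0.500000 0.500000 0.500000
```

Each image gets one text file with one such line per labeled object. Choosing the YOLO output format in LabelImg produces these files directly and makes the conversion unnecessary.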