
# Setup

Clone the repo, install the dependencies, and check the PyTorch version and GPU availability.

In [2]:
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
%pip install -qr requirements.txt  # install dependencies

import torch
from IPython.display import Image, clear_output  # to display images

clear_output()
print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")

Setup complete. Using torch 2.3.1+cpu (CPU)


# 1. Inference

`detect.py` runs YOLOv5 inference on a variety of sources, downloading models automatically from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases), and saving results to `runs/detect`. Example inference sources are:

```shell
python detect.py --source 0  # webcam
                          file.jpg  # image 
                          file.mp4  # video
                          path/  # directory
                          path/*.jpg  # glob
                          'https://youtu.be/NUsoVlDFqZg'  # YouTube
                          'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream
```
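Every source form listed above is passed through the same `--source` flag. As a minimal sketch, the command can also be assembled from Python using only the flags that appear in this notebook (`--weights`, `--img`, `--conf`, `--source`). The helper name `build_detect_command` and its default values are illustrative assumptions, not part of YOLOv5.

```python
import shlex
import sys


def build_detect_command(source, weights="yolov5s.pt", img=640, conf=0.25,
                         script="detect.py"):
    """Return the argument list for one detect.py run (hypothetical helper)."""
    return [
        sys.executable, script,   # run the script with the current interpreter
        "--weights", weights,
        "--img", str(img),
        "--conf", str(conf),
        "--source", str(source),  # file, directory, glob, URL or webcam index
    ]


cmd = build_detect_command("data/images/zidane.jpg")
print(shlex.join(cmd[1:]))  # command line without the interpreter path
```

Passing the list to `subprocess.run(cmd, check=True)` instead of a shell string sidesteps quoting problems with URLs and paths that contain spaces.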

In [3]:
!python yolov5-master/detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source yolov5-master/data/images/zidane.jpg
Image(filename='yolov5-master/data/images/zidane.jpg', width=600)

[34m[1mdetect: [0mweights=['yolov5s.pt'], source=yolov5-master/data/images/zidane.jpg, data=yolov5-master\data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=yolov5-master\runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  2024-10-19 Python-3.11.9 torch-2.3.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 D:\WORKSHOP\code-deep-Learning-book\ch 11\yolov5-master\data\images\zidane.jpg: 384x640 2 persons, 2 ties, 138.0ms
Speed: 1.0ms pre-process, 138.0ms inference, 2.0ms NMS per image at shape (1, 3, 640, 640)
Results saved to yolov5-master\runs\detect\exp10


<IPython.core.display.Image object>
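The `Speed:` line in the output reports the per-image time of each stage. Summing the three stages gives the end-to-end latency, and its reciprocal gives a rough throughput figure. The sketch below does that arithmetic on the line printed above; the function name `parse_speed_line` is an assumption for illustration, and a single CPU run like this one is an indication rather than a benchmark.

```python
import re


def parse_speed_line(line):
    """Extract per-stage times in milliseconds from a detect.py 'Speed:' line."""
    pattern = r"([\d.]+)ms pre-process, ([\d.]+)ms inference, ([\d.]+)ms NMS"
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not a Speed line")
    pre, inference, nms = (float(value) for value in match.groups())
    total = pre + inference + nms  # end-to-end latency per image
    return {
        "pre_ms": pre,
        "inference_ms": inference,
        "nms_ms": nms,
        "total_ms": total,
        "fps": 1000.0 / total,  # images per second at this latency
    }


speed = parse_speed_line(
    "Speed: 1.0ms pre-process, 138.0ms inference, 2.0ms NMS per image "
    "at shape (1, 3, 640, 640)"
)
print(f"{speed['total_ms']:.1f} ms per image, about {speed['fps']:.1f} images/s")
```

For this run the three stages add up to 141.0 ms, or roughly 7 images per second, with almost all of the time spent in inference.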

In [3]:
from IPython.display import Image
Image(filename='yolov5-master/runs/detect/exp10/zidane.jpg')

<IPython.core.display.Image object>
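Because each run is written to a new numbered folder (`exp`, `exp2`, ..., here `exp10`), a hardcoded path stops working after the next run. A minimal sketch of locating the highest-numbered run folder instead; `latest_run_dir` is a hypothetical helper, and the default `project` path simply mirrors the one used in this notebook.

```python
import re
from pathlib import Path


def latest_run_dir(project="yolov5-master/runs/detect", name="exp"):
    """Return the highest-numbered run folder (exp, exp2, ...) or None."""
    best, best_index = None, -1
    for path in Path(project).glob(f"{name}*"):
        match = re.fullmatch(rf"{re.escape(name)}(\d*)", path.name)
        if match is None or not path.is_dir():
            continue  # ignore files and unrelated names
        index = int(match.group(1) or 1)  # bare "exp" counts as run 1
        if index > best_index:
            best, best_index = path, index
    return best


run_dir = latest_run_dir()
print(run_dir)  # None if no run folder exists yet
```

The annotated image can then be shown with `Image(filename=str(run_dir / 'zidane.jpg'))`. Comparing the numeric suffix rather than sorting names avoids placing `exp9` after `exp10`.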

In [17]:
#from google.colab import files
#uploaded = files.upload()

In [16]:
!python yolov5-master/detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source city2.jpg
Image(filename='city2.jpg', width=600)

[34m[1mdetect: [0mweights=['yolov5s.pt'], source=city2.jpg, data=yolov5-master\data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=yolov5-master\runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  2024-10-19 Python-3.11.9 torch-2.3.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 D:\WORKSHOP\code-deep-Learning-book\ch 11\city2.jpg: 448x640 5 persons, 1 traffic light, 3 sports balls, 271.3ms
Speed: 3.0ms pre-process, 271.3ms inference, 6.0ms NMS per image at shape (1, 3, 640, 640)
Results saved to yolov5-master\runs\detect\exp8


<IPython.core.display.Image object>

In [4]:
from IPython.display import Image
Image(filename='yolov5-master/runs/detect/exp8/city2.jpg')

<IPython.core.display.Image object>
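The per-image line in the output above packs the inference shape (height x width), the detected class counts, and the inference time into one string. When the counts are needed programmatically, a sketch like the following can split that line apart. The function name and the regular expression are assumptions based only on the log format shown in this notebook, and labels are kept exactly as printed (including plural forms such as `persons`).

```python
import re

# ": <height>x<width> <detections>, <time>ms" at the end of the line
LINE_PATTERN = re.compile(r": (\d+)x(\d+) (.*)\s([\d.]+)ms$")


def parse_detection_line(line):
    """Split a detect.py result line into image shape, class counts and time."""
    match = LINE_PATTERN.search(line.strip())
    if match is None:
        raise ValueError("unrecognized detect.py line")
    height, width, items, ms = match.groups()
    counts = {}
    for item in items.strip().rstrip(",").split(", "):
        number, _, label = item.partition(" ")
        if number.isdigit():  # skips text that does not start with a count
            counts[label] = int(number)
    return {"shape": (int(height), int(width)), "counts": counts, "ms": float(ms)}


result = parse_detection_line(
    r"image 1/1 D:\WORKSHOP\code-deep-Learning-book\ch 11\city2.jpg: "
    "448x640 5 persons, 1 traffic light, 3 sports balls, 271.3ms"
)
print(result["counts"])
```

Parsing log text is fragile if the format changes between releases, so treat this as a convenience for quick checks rather than a stable interface.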

In [12]:
!python yolov5-master/detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source vdo.mp4

[34m[1mdetect: [0mweights=['yolov5s.pt'], source=vdo.mp4, data=yolov5-master\data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=yolov5-master\runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  2024-10-19 Python-3.11.9 torch-2.3.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
video 1/1 (1/443) D:\WORKSHOP\code-deep-Learning-book\ch 11\vdo.mp4: 384x640 2 persons, 1 tv, 1 laptop, 1 mouse, 1 cell phone, 140.3ms
video 1/1 (2/443) D:\WORKSHOP\code-deep-Learning-book\ch 11\vdo.mp4: 384x640 2 persons, 1 tv, 1 laptop, 1 mouse, 1 cell phone, 109.1ms
video 1/1 (3/443) D:\WORKSHOP\code-deep-Learning-book\ch 11\vdo.mp4: 