### This notebook is optionally accelerated with a GPU runtime.
### If you would like to use this acceleration, please select the menu option "Runtime" -> "Change runtime type", select "Hardware Accelerator" -> "GPU" and click "SAVE"

----------------------------------------------------------------------

# YOLOv5

*Author: Ultralytics*

**YOLOv5 in PyTorch > ONNX > CoreML > TFLite**

_ | _
- | -
![alt](https://pytorch.org/assets/images/ultralytics_yolov5_img1.jpg) | ![alt](https://pytorch.org/assets/images/ultralytics_yolov5_img2.png)


## Before You Start

Start from a **Python>=3.8** environment with **PyTorch>=1.7** installed. To install PyTorch see [https://pytorch.org/get-started/locally/](https://pytorch.org/get-started/locally/). To install YOLOv5 dependencies:

In [None]:
%%bash
pip install -qr https://raw.githubusercontent.com/ultralytics/yolov5/master/requirements.txt  # install dependencies
pip install ninja
pip install qtorch-plus==0.2.0

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 898.8/898.8 kB 59.9 MB/s eta 0:00:00
Collecting ninja
  Downloading ninja-1.11.1.2-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.metadata (5.3 kB)
Downloading ninja-1.11.1.2-py3-none-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (422 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 422.9/422.9 kB 16.3 MB/s eta 0:00:00
Installing collected packages: ninja
Successfully installed ninja-1.11.1.2
Collecting qtorch-plus==0.2.0
  Downloading qtorch_plus-0.2.0-py3-none-any.whl.metadata (4.2 kB)
Downloading qtorch_plus-0.2.0-py3-none-any.whl (34 kB)
Installing collected packages: qtorch-plus
Successfully installed qtorch-plus-0.2.0


In [None]:
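import numpy as np
import torch
from qtorch_plus.quant import posit_quantize

# Sample values from -20 to 17.5 in steps of 2.5
a = torch.tensor(np.arange(-20, 20, 2.5), dtype=torch.float)

# Quantize to posit(nsize=4, es=1) on the CPU
b = posit_quantize(a, nsize=4, es=1)
print(a)
print("cpu quantize ", b)

# Repeat on the GPU when one is available (requires the GPU runtime)
if torch.cuda.is_available():
    a = a.cuda()
    b = posit_quantize(a, nsize=4, es=1)
    print("cuda quantize ", b)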

Using /root/.cache/torch_extensions/py310_cu121 as PyTorch extensions root...
Creating extension directory /root/.cache/torch_extensions/py310_cu121/quant_cpu...
Emitting ninja build file /root/.cache/torch_extensions/py310_cu121/quant_cpu/build.ninja...
Building extension module quant_cpu...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module quant_cpu...
Using /root/.cache/torch_extensions/py310_cu121 as PyTorch extensions root...
Creating extension directory /root/.cache/torch_extensions/py310_cu121/quant_cuda...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py310_cu121/quant_cuda/build.ninja...
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
Building extension module quant_cuda...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module quant_cud

tensor([-20.0000, -17.5000, -15.0000, -12.5000, -10.0000,  -7.5000,  -5.0000,
         -2.5000,   0.0000,   2.5000,   5.0000,   7.5000,  10.0000,  12.5000,
         15.0000,  17.5000])
cpu quantize  tensor([-16., -16., -16., -16., -16.,  -4.,  -4.,  -2.,   0.,   2.,   4.,   4.,
         16.,  16.,  16.,  16.])
cuda quantize  tensor([-16., -16., -16., -16., -16.,  -4.,  -4.,  -2.,   0.,   2.,   4.,   4.,
         16.,  16.,  16.,  16.], device='cuda:0')
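
The quantized values above collapse onto a handful of magnitudes because a 4-bit posit has very few code points. The sketch below is an independent pure-Python decoder, not part of qtorch_plus, that enumerates every value a posit with a given `nsize` and `es` can represent, following the standard posit layout (sign, regime, exponent, fraction). The names `decode_posit` and `representable_values` are illustrative, and the sketch does not reproduce the library's rounding rule.

```python
def decode_posit(bits, nsize, es):
    """Decode an nsize-bit posit bit pattern (given as an int) to a float."""
    mask = (1 << nsize) - 1
    if bits == 0:
        return 0.0
    if bits == 1 << (nsize - 1):
        return float("nan")  # NaR (not a real)
    negative = bool(bits >> (nsize - 1))
    if negative:
        bits = (-bits) & mask  # two's complement gives the magnitude pattern
    # Bits after the sign, most significant first
    body = [(bits >> i) & 1 for i in range(nsize - 2, -1, -1)]
    # Regime: a run of identical bits, ended by the opposite bit or the end of the word
    run = 1
    while run < len(body) and body[run] == body[0]:
        run += 1
    k = run - 1 if body[0] == 1 else -run
    tail = body[run + 1:]  # skip the regime terminator
    # Exponent: up to es bits; bits cut off by the word length count as zero
    exp_bits = tail[:es]
    exponent = 0
    for b in exp_bits:
        exponent = (exponent << 1) | b
    exponent <<= es - len(exp_bits)
    # Fraction: remaining bits with an implicit leading 1
    fraction = 1.0
    for i, b in enumerate(tail[es:], start=1):
        fraction += b * 2.0 ** -i
    value = 2.0 ** (k * (1 << es) + exponent) * fraction
    return -value if negative else value


def representable_values(nsize, es):
    """Return all finite values of posit(nsize, es), sorted ascending."""
    nar = 1 << (nsize - 1)
    return sorted(decode_posit(b, nsize, es) for b in range(1 << nsize) if b != nar)


positives = [v for v in representable_values(4, 1) if v > 0]
print(positives)  # [0.0625, 0.25, 0.5, 1.0, 2.0, 4.0, 16.0]
```

With `nsize=4, es=1` the only positive magnitudes are 1/16, 1/4, 1/2, 1, 2, 4 and 16, which is why inputs from 12.5 through 17.5 all map to 16 in the output above.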


## Model Description

<img width="800" alt="YOLOv5 Model Comparison" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/model_comparison.png">
&nbsp;

[YOLOv5](https://ultralytics.com/yolov5) 🚀 is a family of compound-scaled object detection models trained on the COCO dataset, and includes simple functionality for Test Time Augmentation (TTA), model ensembling, hyperparameter evolution, and export to ONNX, CoreML and TFLite.

|Model |size<br><sup>(pixels) |mAP<sup>val<br>0.5:0.95 |mAP<sup>test<br>0.5:0.95 |mAP<sup>val<br>0.5 |Speed<br><sup>V100 (ms) | |params<br><sup>(M) |FLOPS<br><sup>640 (B)
|---   |---  |---        |---         |---             |---                |---|---              |---
|[YOLOv5s6](https://github.com/ultralytics/yolov5/releases)   |1280 |43.3     |43.3     |61.9     |**4.3** | |12.7  |17.4
|[YOLOv5m6](https://github.com/ultralytics/yolov5/releases)   |1280 |50.5     |50.5     |68.7     |8.4     | |35.9  |52.4
|[YOLOv5l6](https://github.com/ultralytics/yolov5/releases)   |1280 |53.4     |53.4     |71.1     |12.3    | |77.2  |117.7
|[YOLOv5x6](https://github.com/ultralytics/yolov5/releases)   |1280 |**54.4** |**54.4** |**72.0** |22.4    | |141.8 |222.9
|[YOLOv5x6](https://github.com/ultralytics/yolov5/releases) TTA |1280 |**55.0** |**55.0** |**72.0** |70.8 | |-  |-

<details>
  <summary>Table Notes (click to expand)</summary>

  * AP<sup>test</sup> denotes COCO [test-dev2017](http://cocodataset.org/#upload) server results, all other AP results denote val2017 accuracy.
  * AP values are for single-model single-scale unless otherwise noted. **Reproduce mAP** by `python test.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65`
  * Speed<sub>GPU</sub> averaged over 5000 COCO val2017 images using a GCP [n1-standard-16](https://cloud.google.com/compute/docs/machine-types#n1_standard_machine_types) V100 instance, and includes FP16 inference, postprocessing and NMS. **Reproduce speed** by `python test.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45`
  * All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
  * Test Time Augmentation ([TTA](https://github.com/ultralytics/yolov5/issues/303)) includes reflection and scale augmentation. **Reproduce TTA** by `python test.py --data coco.yaml --img 1536 --iou 0.7 --augment`

</details>

<p align="left"><img width="800" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/model_plot.png"></p>

<details>
  <summary>Figure Notes (click to expand)</summary>

  * GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS.
  * EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8.
  * **Reproduce** by `python test.py --task study --data coco.yaml --iou 0.7 --weights yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt`

</details>

## Load From PyTorch Hub


This example loads a pretrained **YOLOv5s** model and passes an image for inference. YOLOv5 accepts **URL**, **Filename**, **PIL**, **OpenCV**, **Numpy** and **PyTorch** inputs, and returns detections in **torch**, **pandas**, and **JSON** output formats. See our [YOLOv5 PyTorch Hub Tutorial](https://github.com/ultralytics/yolov5/issues/36) for details.

In [None]:
import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Images
imgs = ['https://ultralytics.com/images/zidane.jpg']  # batch of images

# Inference
results = model(imgs)

# Results
results.print()
results.save()  # or .show()

results.xyxy[0]  # img1 predictions (tensor)
results.pandas().xyxy[0]  # img1 predictions (pandas)

Downloading: "https://github.com/ultralytics/yolov5/zipball/master" to /root/.cache/torch/hub/master.zip


Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.


YOLOv5 🚀 2024-12-9 Python-3.10.12 torch-2.5.1+cu121 CUDA:0 (Tesla T4, 15102MiB)

Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...
100%|██████████| 14.1M/14.1M [00:00<00:00, 408MB/s]

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 
image 1/1: 720x1280 2 persons, 1 tie, 1 cell phone
Speed: 1749.2ms pre-process, 94.1ms inference, 593.3ms NMS per image at shape (1, 3, 384, 640)
Saved 1 image to runs/detect/exp


Unnamed: 0,xmin,ymin,xmax,ymax,confidence,class,name
0,745.578735,48.470276,1142.694336,720.0,0.86891,0,person
1,124.74408,197.334503,844.397644,716.650513,0.630325,0,person
2,441.238708,439.350616,498.380737,708.570923,0.616793,27,tie
3,594.081787,377.300354,635.42395,437.147827,0.274014,67,cell phone
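
The pandas output above is an ordinary table with one row per detection, so it can be filtered like any other tabular data. As a minimal sketch that needs neither the model nor pandas, the snippet below parses the rows printed above with the standard `csv` module and keeps detections at or above a confidence threshold. The helper name `filter_detections` and the 0.5 threshold are illustrative choices, not part of the YOLOv5 API.

```python
import csv
import io

# Rows copied from the printed results above (index column omitted)
ROWS = """xmin,ymin,xmax,ymax,confidence,class,name
745.578735,48.470276,1142.694336,720.0,0.86891,0,person
124.74408,197.334503,844.397644,716.650513,0.630325,0,person
441.238708,439.350616,498.380737,708.570923,0.616793,27,tie
594.081787,377.300354,635.42395,437.147827,0.274014,67,cell phone
"""


def filter_detections(text, min_confidence):
    """Return detections (as dicts) whose confidence is at least min_confidence."""
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        if float(row["confidence"]) >= min_confidence:
            kept.append(row)
    return kept


confident = filter_detections(ROWS, 0.5)
print([d["name"] for d in confident])  # ['person', 'person', 'tie']
```

On the DataFrame itself, the equivalent filter is a boolean mask on the `confidence` column.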


In [None]:
# Quantize the model: posit(6, 0) for Conv2d/Linear layers, posit(8, 1) for everything else
from qtorch_plus.quant import posit_quantize
import torch.nn as nn

def linear_weight(input):
    # Conv2d/Linear weights: posit(6, 0) with scale=2
    return posit_quantize(input, nsize=6, es=0, scale=2)

def other_weight(input):
    # Weights of all other modules: posit(8, 1)
    return posit_quantize(input, nsize=8, es=1)

def linear_activation(input):
    # Inputs to Conv2d/Linear layers: posit(6, 0)
    return posit_quantize(input, nsize=6, es=0)

def other_activation(input):
    # All other activations, including every hooked layer's output: posit(8, 1)
    return posit_quantize(input, nsize=8, es=1)

def forward_pre_hook_linear(m, input):
    # Quantize the first positional input before the layer runs
    return (linear_activation(input[0]),)

def forward_hook(m, input, output):
    # Quantize the layer output
    return other_activation(output)

def forward_pre_hook_other(m, input):
    # Quantize float32 tensor inputs only; pass anything else through unchanged
    if isinstance(input[0], torch.Tensor) and input[0].dtype == torch.float32:
        return (other_activation(input[0]),)
    return input

layer_count = 0
# Quantize weights in place and attach hooks that quantize each layer's inputs and outputs.
for name, module in model.named_modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        module.weight.data = linear_weight(module.weight.data)
        layer_count += 1
        module.register_forward_pre_hook(forward_pre_hook_linear)
        module.register_forward_hook(forward_hook)
        # The label reads "posit(61)", but the helpers defined earlier in this cell use nsize=6, es=0.
        print("Use posit(61) weight + activation for layer ", name)
    elif hasattr(module, 'weight'):
        # Any other module with a weight attribute uses posit(8, 1)
        module.register_forward_pre_hook(forward_pre_hook_other)
        module.weight.data = other_weight(module.weight.data)
        module.register_forward_hook(forward_hook)

print("total processed dense layer %d \n" % layer_count)
print("------------------")

# Run the same image as the FP32 baseline through the quantized model
imgs = ['https://ultralytics.com/images/zidane.jpg']  # batch of images

# Inference
results_posit = model(imgs)

# Results
results_posit.print()
results_posit.save()  # or .show()

results_posit.xyxy[0]  # img1 predictions (tensor)
results_posit.pandas().xyxy[0]  # img1 predictions (pandas)

Use posit(61) weight + activation for layer  model.model.model.0.conv
Use posit(61) weight + activation for layer  model.model.model.1.conv
Use posit(61) weight + activation for layer  model.model.model.2.cv1.conv
Use posit(61) weight + activation for layer  model.model.model.2.cv2.conv
Use posit(61) weight + activation for layer  model.model.model.2.cv3.conv
Use posit(61) weight + activation for layer  model.model.model.2.m.0.cv1.conv
Use posit(61) weight + activation for layer  model.model.model.2.m.0.cv2.conv
Use posit(61) weight + activation for layer  model.model.model.3.conv
Use posit(61) weight + activation for layer  model.model.model.4.cv1.conv
Use posit(61) weight + activation for layer  model.model.model.4.cv2.conv
Use posit(61) weight + activation for layer  model.model.model.4.cv3.conv
Use posit(61) weight + activation for layer  model.model.model.4.m.0.cv1.conv
Use posit(61) weight + activation for layer  model.model.model.4.m.0.cv2.conv
Use posit(61) weight + activation 

image 1/1: 720x1280 (no detections)
Speed: 2260.6ms pre-process, 21.8ms inference, 1.0ms NMS per image at shape (1, 3, 384, 640)
Saved 1 image to runs/detect/exp2


Unnamed: 0,xmin,ymin,xmax,ymax,confidence,class,name
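
One plausible contributor to the empty result above is the narrow dynamic range of the format used for the Conv2d and Linear layers. This is an interpretation, not something the run itself confirms. For a posit with `nsize` bits and `es` exponent bits, the largest finite magnitude is `useed ** (nsize - 2)` with `useed = 2 ** (2 ** es)`, and the smallest nonzero magnitude is its reciprocal. The sketch below evaluates these bounds; `posit_range` is a hypothetical helper, not a qtorch_plus function.

```python
def posit_range(nsize, es):
    """Return (minpos, maxpos) for a posit with nsize bits and es exponent bits."""
    useed = 2 ** (2 ** es)
    maxpos = float(useed ** (nsize - 2))
    return 1.0 / maxpos, maxpos


for nsize, es in [(4, 1), (6, 0), (8, 1)]:
    minpos, maxpos = posit_range(nsize, es)
    print(f"posit({nsize},{es}): minpos={minpos}, maxpos={maxpos}")
```

Under these definitions posit(6, 0) covers magnitudes from 1/16 to 16, while posit(8, 1) covers 2^-12 to 4096. Magnitudes above the maximum saturate, as the 4-bit example earlier in this notebook shows with 20 mapping to 16.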


In [None]:
# Try posit 8_1 for everything (Conv2d/Linear weights additionally use scale=8):

# load the model again
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
def linear_weight(input):
  return posit_quantize(input,nsize=8, es=1, scale=8)
  # input_cpu = input.cpu().numpy()
  # epsilon = 1e-16  # To avoid log(0)
  # log2_weights = np.log2(np.abs(input_cpu) + epsilon)
  # counts, bins = np.histogram(log2_weights, bins=100)
  # max_bin_index = np.argmax(counts)
  # x_with_max_frequency = (bins[max_bin_index] + bins[max_bin_index + 1]) / 2  # Bin center
  # print(f"x_with_max_frequency for {name}: {x_with_max_frequency:.2f}")
  # scale = 2 ** (-x_with_max_frequency)
  # return posit_quantize(input,nsize=6, es=1, scale=scale)

def other_weight(input):
  return posit_quantize(input,nsize=8, es=1)
  # input_cpu = input.cpu().numpy()
  # epsilon = 1e-12  # To avoid log(0)
  # log2_weights = np.log2(np.abs(input_cpu) + epsilon)
  # counts, bins = np.histogram(log2_weights, bins=100)
  # max_bin_index = np.argmax(counts)
  # x_with_max_frequency = (bins[max_bin_index] + bins[max_bin_index + 1]) / 2  # Bin center
  # print(f"x_with_max_frequency for {name}: {x_with_max_frequency:.2f}")
  # scale = 2 ** (-x_with_max_frequency)
  # return posit_quantize(input,nsize=8, es=1, scale = scale)

def linear_activation(input):
  return posit_quantize(input,nsize=8, es=1)
  # input_cpu = input.cpu().numpy()
  # epsilon = 1e-12  # To avoid log(0)
  # log2_weights = np.log2(np.abs(input_cpu) + epsilon)
  # counts, bins = np.histogram(log2_weights, bins=100)
  # max_bin_index = np.argmax(counts)
  # x_with_max_frequency = (bins[max_bin_index] + bins[max_bin_index + 1]) / 2  # Bin center
  # print(f"x_with_max_frequency for {name}: {x_with_max_frequency:.2f}")
  # scale = 2 ** (-x_with_max_frequency)
  # return posit_quantize(input,nsize=6, es=1, scale = scale)

def other_activation(input):
  return posit_quantize(input,nsize=8, es=1)
  # input_cpu = input.cpu().numpy()
  # epsilon = 1e-12  # To avoid log(0)
  # log2_weights = np.log2(np.abs(input_cpu) + epsilon)
  # counts, bins = np.histogram(log2_weights, bins=100)
  # max_bin_index = np.argmax(counts)
  # x_with_max_frequency = (bins[max_bin_index] + bins[max_bin_index + 1]) / 2  # Bin center
  # print(f"x_with_max_frequency for {name}: {x_with_max_frequency:.2f}")
  # scale = 2 ** (-x_with_max_frequency)
  # return posit_quantize(input,nsize=8, es=1, scale = scale)

def forward_pre_hook_linear(m, input):
    return (linear_activation(input[0]),)

def forward_hook(m, input,output):
    return other_activation(output)

def forward_pre_hook_other(m,input):
  if isinstance(input[0], torch.Tensor):
    if (input[0].dtype == torch.float32):
      return (other_activation(input[0]),)
    else:
      return input
  else:
    return input

layer_count = 0
# Quantize weights in place and attach hooks that quantize each layer's inputs and outputs.
for name, module in model.named_modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        module.weight.data = linear_weight(module.weight.data)
        layer_count += 1
        module.register_forward_pre_hook(forward_pre_hook_linear)
        module.register_forward_hook(forward_hook)
        print("Use posit(81) weight + activation for layer ", name)
    elif hasattr(module, 'weight'):
        # Any other module with a weight attribute also uses posit(8, 1)
        module.register_forward_pre_hook(forward_pre_hook_other)
        module.weight.data = other_weight(module.weight.data)
        module.register_forward_hook(forward_hook)

print("total processed dense layer %d \n" % layer_count)
print("------------------")

# Run the same image as the FP32 baseline through the quantized model
imgs = ['https://ultralytics.com/images/zidane.jpg']  # batch of images

# Inference
results_posit = model(imgs)

# Results
results_posit.print()
results_posit.save()  # or .show()

results_posit.xyxy[0]  # img1 predictions (tensor)
results_posit.pandas().xyxy[0]  # img1 predictions (pandas)

Using cache found in /root/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 🚀 2024-12-4 Python-3.10.12 torch-2.5.1+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 


Use posit(81) weight + activation for layer  model.model.model.0.conv
Use posit(81) weight + activation for layer  model.model.model.1.conv
Use posit(81) weight + activation for layer  model.model.model.2.cv1.conv
Use posit(81) weight + activation for layer  model.model.model.2.cv2.conv
Use posit(81) weight + activation for layer  model.model.model.2.cv3.conv
Use posit(81) weight + activation for layer  model.model.model.2.m.0.cv1.conv
Use posit(81) weight + activation for layer  model.model.model.2.m.0.cv2.conv
Use posit(81) weight + activation for layer  model.model.model.3.conv
Use posit(81) weight + activation for layer  model.model.model.4.cv1.conv
Use posit(81) weight + activation for layer  model.model.model.4.cv2.conv
Use posit(81) weight + activation for layer  model.model.model.4.cv3.conv
Use posit(81) weight + activation for layer  model.model.model.4.m.0.cv1.conv
Use posit(81) weight + activation for layer  model.model.model.4.m.0.cv2.conv
Use posit(81) weight + activation 

image 1/1: 720x1280 2 persons, 1 tie, 1 cell phone
Speed: 2332.0ms pre-process, 22.6ms inference, 1.5ms NMS per image at shape (1, 3, 384, 640)
Saved 1 image to runs/detect/exp3


Unnamed: 0,xmin,ymin,xmax,ymax,confidence,class,name
0,749.894836,48.575165,1143.026855,718.524048,0.856165,0,person
1,117.795044,192.027344,852.184692,718.722656,0.584223,0,person
2,441.910309,438.640381,496.982635,707.279053,0.546922,27,tie
3,594.23175,378.403473,632.595642,437.721527,0.303981,67,cell phone
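
To put a number on how closely the posit(8,1) run tracks the FP32 baseline, the boxes from the two result tables can be compared by intersection over union (IoU). The sketch below copies the coordinates from the two printed tables in this notebook and pairs rows by index, which assumes both runs list the same objects in the same order (the class names do match row by row here). The `iou` helper is illustrative.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# (name, box, confidence) copied from the FP32 results table
fp32 = [
    ("person", (745.578735, 48.470276, 1142.694336, 720.0), 0.86891),
    ("person", (124.74408, 197.334503, 844.397644, 716.650513), 0.630325),
    ("tie", (441.238708, 439.350616, 498.380737, 708.570923), 0.616793),
    ("cell phone", (594.081787, 377.300354, 635.42395, 437.147827), 0.274014),
]

# (name, box, confidence) copied from the posit(8,1) results table
posit81 = [
    ("person", (749.894836, 48.575165, 1143.026855, 718.524048), 0.856165),
    ("person", (117.795044, 192.027344, 852.184692, 718.722656), 0.584223),
    ("tie", (441.910309, 438.640381, 496.982635, 707.279053), 0.546922),
    ("cell phone", (594.23175, 378.403473, 632.595642, 437.721527), 0.303981),
]

for (name, box_a, conf_a), (_, box_b, conf_b) in zip(fp32, posit81):
    print(f"{name}: IoU={iou(box_a, box_b):.3f}, confidence change={conf_b - conf_a:+.3f}")
```

Every pair overlaps with an IoU of about 0.9 or more, with the cell phone (the smallest box) lowest. Confidence drops slightly for both persons and the tie and rises slightly for the cell phone.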


In [None]:
from google.colab import drive
drive.mount('/content/drive')

## Citation

[![DOI](https://zenodo.org/badge/264818686.svg)](https://zenodo.org/badge/latestdoi/264818686)


## Contact


**Issues should be raised directly in https://github.com/ultralytics/yolov5.** For business inquiries or professional support requests please visit [https://ultralytics.com](https://ultralytics.com) or email Glenn Jocher at [glenn.jocher@ultralytics.com](mailto:glenn.jocher@ultralytics.com).


&nbsp;