# Training YOLOv7 on a Custom Dataset

This notebook is based on the [YOLOv7 repository](https://github.com/WongKinYiu/yolov7) by WongKinYiu. Many thanks to WongKinYiu and AlexeyAB for putting this repository together.

### **Steps Covered in this Notebook**

To train our detector we take the following steps:

* Install YOLOv7 dependencies
* Load custom dataset from Roboflow in YOLOv7 format
* Run YOLOv7 training
* Evaluate YOLOv7 performance
* Run YOLOv7 inference on test images
* OPTIONAL: Deployment
* OPTIONAL: Active Learning


### Preparing a Custom Dataset

In this notebook, we will use one of the 90,000+ open source computer vision datasets available on [Roboflow Universe](https://universe.roboflow.com).

If you already have your own images (and, optionally, annotations), you can convert your dataset using [Roboflow](https://roboflow.com), a set of tools developers use to build better computer vision models quickly and accurately. More than 100,000 developers use Roboflow for (automatic) annotation, converting dataset formats (such as to YOLOv7), training, deploying, and improving their datasets and models.

Follow [the getting started guide here](https://docs.roboflow.com/quick-start) to create and prepare your own custom dataset.

# Install Dependencies

_(Remember to choose GPU in Runtime if not already selected. Runtime --> Change Runtime Type --> Hardware accelerator --> GPU)_

In [1]:
# Download YOLOv7 repository and install requirements
!git clone https://github.com/WongKinYiu/yolov7
%cd yolov7
%pip install -r requirements.txt

Cloning into 'yolov7'...
remote: Enumerating objects: 1197, done.
remote: Total 1197 (delta 0), reused 0 (delta 0), pack-reused 1197
Receiving objects: 100% (1197/1197), 74.23 MiB | 10.22 MiB/s, done.
Resolving deltas: 100% (519/519), done.
/home/invigilo/evan/hooper/cv/yolov7

Collecting matplotlib>=3.2.2
  Downloading matplotlib-3.8.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
Collecting numpy<1.24.0,>=1.18.5
  Downloading numpy-1.23.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Collecting opencv-python>=4.1.1
  Using cached opencv_python-4.9.0.80-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (62.2 MB)
Collecting Pillow>=7.1.2
  Using cached pillow-10.2.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.5 MB)
Collecting PyYAML>=5.3.1
  Using cached PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (705 kB)
Collecting requests>=2.23.0
  Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Collectin

In [1]:
# Check that a GPU is visible and see its current usage
!nvidia-smi

Fri Feb 16 15:02:52 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce ...  Off  | 00000000:3E:00.0  On |                  N/A |
|  0%   40C    P8    14W / 170W |    402MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces
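
The same check can be made from Python before training starts. This is a minimal sketch, assuming PyTorch has been installed by `requirements.txt`; `gpu_summary` is a hypothetical helper name, and it returns `None` rather than failing when `torch` or a CUDA device is missing.

```python
import importlib.util


def gpu_summary():
    """Return the name of the first CUDA device, or None if none is usable."""
    # torch may not be installed yet, so look for it before importing.
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_name(0)


print(gpu_summary() or "No CUDA GPU visible from Python.")
```

If this prints the fallback message, revisit the runtime settings before launching training with `--device 0`.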

# Download Correctly Formatted Custom Data

Next, we'll download our dataset in the right format. Use the `YOLOv7 PyTorch` export. Note that this model requires YOLO TXT annotations, a custom YAML file, and organized directories. The Roboflow export writes these for us and saves them in the correct location.
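
For reference, a YOLO TXT annotation file holds one object per line as `class_id x_center y_center width height`, with the four box values normalized to the image size. The sketch below converts such a line to pixel corners; `yolo_to_pixels` and the sample label are hypothetical and not part of the Roboflow export.

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO TXT label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalized centre and size back to pixel units.
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    return int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2


# Hypothetical label for a 640x640 image: class 0, centred, half as wide and a quarter as tall
print(yolo_to_pixels("0 0.5 0.5 0.5 0.25", 640, 640))
```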


In [4]:
# REPLACE with the code snippet Roboflow generates for your own dataset export

%pip install roboflow

from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")  # use your own private API key; never publish a real key
project = rf.workspace("healthhack").project("hop-tiny")
dataset = project.version(1).download("yolov7")

Note: you may need to restart the kernel to use updated packages.
loading Roboflow workspace...
loading Roboflow project...


Downloading Dataset Version Zip in hop-tiny-1 to yolov7pytorch:: 100%|██████████| 3119/3119 [00:01<00:00, 2133.47it/s]





Extracting Dataset Version Zip to hop-tiny-1 in yolov7pytorch:: 100%|██████████| 190/190 [00:00<00:00, 9082.20it/s]
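
Before training, it can help to confirm that the export landed where the later cells expect it. The sketch below assumes the layout is a `data.yaml` file plus `train`, `valid`, and `test` folders, each with `images` and `labels` subfolders. Only `data.yaml` and `test/images` are confirmed by this notebook, so treat the rest as an assumption. `missing_dataset_parts` is a hypothetical helper.

```python
from pathlib import Path


def missing_dataset_parts(root, splits=("train", "valid", "test")):
    """List the expected files and folders that are absent under a dataset root."""
    root = Path(root)
    expected = [root / "data.yaml"]
    for split in splits:
        expected += [root / split / "images", root / split / "labels"]
    return [str(p.relative_to(root)) for p in expected if not p.exists()]


# An empty list means the assumed layout is complete.
print(missing_dataset_parts("hop-tiny-1"))
```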


# Begin Custom Training

We're ready to start custom training.

NOTE: We will only modify one of the YOLOv7 training defaults in our example: `epochs`. The command below reduces it from 300 to 3 so that the run finishes quickly. That is enough to confirm the pipeline works end to end, but a usable detector needs a longer run (for example, 100 epochs). If you'd like to change other settings, see details in [our accompanying blog post](https://blog.roboflow.com/yolov7-custom-dataset-training-tutorial/).
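
To get a feel for how much work a given `epochs` setting implies, we can count the batches processed over a whole run. This is a rough sketch with a hypothetical training split of 1,000 images. The real count depends on your dataset, and `training_iterations` is an illustrative helper, not part of YOLOv7.

```python
import math


def training_iterations(num_images, batch_size, epochs):
    """Number of batches processed over a whole run (last partial batch included)."""
    return math.ceil(num_images / batch_size) * epochs


# Hypothetical training split of 1,000 images with a batch size of 16
for epochs in (300, 100, 3):
    print(epochs, "epochs ->", training_iterations(1000, 16, epochs), "iterations")
```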

In [1]:
# download the COCO-pretrained starting checkpoint
# (run from inside the cloned yolov7 directory, which the install cell changes into)
!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt


--2024-02-16 13:47:56--  https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/511187726/ba7d01ee-125a-4134-8864-fa1abcbf94d5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20240216%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240216T054756Z&X-Amz-Expires=300&X-Amz-Signature=29c701a9c89ad284faf7a7a7e7d764475ccc2ca928aad3b95b1758baa44a7d8e&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=511187726&response-content-disposition=attachment%3B%20filename%3Dyolov7-tiny.pt&response-content-type=application%2Foctet-stream [following]
--2024-02-16 13:47:56--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/511187726/ba7d01ee-125a-4134-8864-fa1abcbf94d5?X-Amz-A

In [3]:
# run this cell to begin training (skip the %cd if you are already inside yolov7/)
%cd yolov7/
!python train.py --batch-size 16 --epochs 3 --data ./hop-tiny-1/data.yaml --weights 'yolov7-tiny.pt' --device 0

/home/invigilo/evan/hooper/cv/yolov7


YOLOR 🚀 v0.1-128-ga207844 torch 2.2.0+cu121 CUDA:0 (NVIDIA GeForce RTX 3060, 12038.6875MB)

Namespace(weights='yolov7-tiny.pt', cfg='', data='./hop-tiny-1/data.yaml', hyp='data/hyp.scratch.p5.yaml', epochs=3, batch_size=16, img_size=[640, 640], rect=False, resume=False, nosave=False, notest=False, noautoanchor=False, evolve=False, bucket='', cache_images=False, image_weights=False, device='0', multi_scale=False, single_cls=False, adam=False, sync_bn=False, local_rank=-1, workers=8, project='runs/train', entity=None, name='exp', exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, upload_dataset=False, bbox_interval=-1, save_period=-1, artifact_alias='latest', freeze=[0], v5_metric=False, world_size=1, global_rank=-1, save_dir='runs/train/exp4', total_batch_size=16)
[34m[1mtensorboard: [0mStart with 'tensorboard --logdir runs/train', view at http://localhost:6006/
[34m[1mhyperparameters: [0mlr0=0.01, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, war

# Evaluation

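We can check the results of our custom training by running the provided detection script, `detect.py`, on the test images. Point `--weights` at the `best.pt` written by your own training run; the run directory (`runs/train/exp`, `runs/train/exp2`, and so on) is reported as `save_dir` in the training log.

We can adjust the arguments below. For details, see [the arguments accepted by detect.py](https://github.com/WongKinYiu/yolov7/blob/main/detect.py#L154).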
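
Two of these arguments shape which boxes survive: `--conf` (the confidence threshold) and the IoU threshold used for non-maximum suppression. The sketch below shows the general idea in plain Python with a greedy, class-agnostic NMS. It illustrates the technique and is not the repository's implementation; the detections are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_thres=0.1, iou_thres=0.45):
    """Keep (box, score) pairs above conf_thres, then apply greedy NMS."""
    confident = [d for d in dets if d[1] > conf_thres]
    kept = []
    # Visit boxes from most to least confident; drop any that overlap a kept box too much.
    for box, score in sorted(confident, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= iou_thres for k in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections: two overlapping boxes, one distant box, one low-confidence box
dets = [((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.6), ((20, 20, 30, 30), 0.5), ((0, 0, 5, 5), 0.05)]
print(filter_detections(dets))
```

Raising `conf_thres` removes weak detections earlier; lowering `iou_thres` suppresses overlapping boxes more aggressively.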

In [4]:
# Run evaluation
!python detect.py --weights runs/train/exp2/weights/best.pt --conf 0.1 --source hop-tiny-1/test/images


Namespace(weights=['runs/train/exp2/weights/best.pt'], source='hop-tiny-1/test/images', img_size=640, conf_thres=0.1, iou_thres=0.45, device='', view_img=False, save_txt=False, save_conf=False, nosave=False, classes=None, agnostic_nms=False, augment=False, update=False, project='runs/detect', name='exp', exist_ok=False, no_trace=False)
YOLOR 🚀 v0.1-128-ga207844 torch 2.2.0+cu121 CUDA:0 (NVIDIA GeForce RTX 3060, 12038.6875MB)

Fusing layers... 
Model Summary: 200 layers, 6009343 parameters, 0 gradients, 13.0 GFLOPS
 Convert model to Traced-model... 
 traced_script_module saved! 
 model is traced! 

Done. (3.0ms) Inference, (0.7ms) NMS
 The image with the result is saved in: runs/detect/exp5/IMG_0665_JPG.rf.3bbac9ffbb80dd2f3a4bc934c70c7260.jpg
1 Paper, Done. (2.6ms) Inference, (0.8ms) NMS
 The image with the result is saved in: runs/detect/exp5/IMG_0681_JPG.rf.4f424eec2400d62d419495c9e50bff0c.jpg
Done. (2.6ms) Inferen

In [6]:
# display inference results for the first 10 test images

import glob
from IPython.display import Image, display

limit = 10  # maximum number of images to display
# assumes results were saved as JPGs; use the output directory reported by detect.py
for image_path in sorted(glob.glob('runs/detect/exp5/*.jpg'))[:limit]:
    display(Image(filename=image_path))
    print("\n")


_(Four result images from `runs/detect/exp5` are displayed here.)_

### Upload Custom-Trained Weights back to Roboflow


In [9]:
%pip install ultralytics

Note: you may need to restart the kernel to use updated packages.


In [11]:
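# `project` must already be defined in this kernel; re-run the dataset download cell first if the kernel was restarted
version = project.version(1)
version.deploy(model_type="yolov7", model_path="/home/invigilo/evan/hooper/cv/yolov7/runs/train/exp4")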

NameError: name 'project' is not defined

# Reparameterize for Inference

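To reparameterize the trained weights for inference, follow the [reparameterization notebook](https://github.com/WongKinYiu/yolov7/blob/main/tools/reparameterization.ipynb) in the YOLOv7 repository.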

# OPTIONAL: Deployment

To deploy, you'll need to export your weights and save them to use later.

In [None]:
# optional: zip the weights and results to download them locally
# (replace `exp` with your own run directory, reported as `save_dir` in the training log)

!zip -r export.zip runs/detect
!zip -r export.zip runs/train/exp/weights/best.pt
!zip export.zip runs/train/exp/*
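
If the `zip` command is not available on your machine, the same archive can be built with Python's standard library. This is a minimal sketch: `export_zip` is a hypothetical helper, and the demo call writes to a separate file name so it does not overwrite the archive from the cell above.

```python
import zipfile
from pathlib import Path


def export_zip(paths, archive="export.zip"):
    """Write every file found under the given paths into one zip archive; return the file count."""
    files = set()
    for path in map(Path, paths):
        if path.is_dir():
            files.update(p for p in path.rglob("*") if p.is_file())
        elif path.is_file():
            files.add(path)
        # Paths that do not exist are skipped silently.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(files):
            zf.write(f)
    return len(files)


# Paths mirror the cell above; replace `exp` with your own run directory.
print(export_zip(["runs/detect", "runs/train/exp"], "export_python.zip"), "files archived")
```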

# OPTIONAL: Active Learning Example

Once our first training run is complete, we should use our model to help identify which images are most problematic in order to investigate, annotate, and improve our dataset (and, therefore, model).

To do that, we can execute code that automatically uploads images back to our hosted dataset if the image is a specific class or below a given confidence threshold.


In [None]:
# set up access to your workspace
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")  # use your own private API key; never publish a real key
inference_project = rf.workspace("healthhack").project("hop-vbavn")
model = inference_project.version(1).model

# destination project for uploads (replace "healthhack" with your own project ID)
upload_project = rf.workspace().project("healthhack")

print("inference reference point: ", inference_project)
print("upload destination: ", upload_project)

In [None]:
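# example upload: if any prediction on an image falls inside the confidence interval, upload that image
import glob

confidence_interval = [10, 80]  # [lower_bound_percent, upper_bound_percent]

# assumed image folder; point this at the images you want to screen
for image in glob.glob('hop-tiny-1/test/images/*.jpg'):
    # assumes the hosted model returns JSON with a 'predictions' list,
    # each entry holding a 'confidence' between 0 and 1
    predictions = model.predict(image, confidence=confidence_interval[0], overlap=30).json()['predictions']
    for prediction in predictions:
        if confidence_interval[0] <= prediction['confidence'] * 100 <= confidence_interval[1]:
            # upload the image in question once, then move on to the next image
            upload_project.upload(image, num_retry_uploads=3)
            print(' >> image uploaded!')
            break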
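
The selection rule in the cell above can be separated from the Roboflow calls and tried on its own. The sketch below uses hypothetical predictions in the shape that cell expects (dicts with a `confidence` between 0 and 1); `needs_review` is an illustrative helper, not part of the Roboflow package.

```python
def needs_review(predictions, lower=10, upper=80):
    """True if any prediction's confidence (0-1) falls inside [lower, upper] percent."""
    return any(lower <= p["confidence"] * 100 <= upper for p in predictions)


# Hypothetical predictions keyed by image name
batch = {
    "clear.jpg": [{"class": "Paper", "confidence": 0.93}],
    "unsure.jpg": [{"class": "Paper", "confidence": 0.42}, {"class": "Paper", "confidence": 0.95}],
    "empty.jpg": [],
}
to_upload = [name for name, preds in batch.items() if needs_review(preds)]
print(to_upload)
```

Only images with at least one mid-confidence prediction are selected, so confident detections and empty results are left out of the upload.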

# Next steps

Congratulations, you've trained a custom YOLOv7 model! Next, start thinking about deploying and [building an MLOps pipeline](https://docs.roboflow.com) so your model gets better the more data it sees in the wild.