# How to Train YOLOv7 on a Custom Dataset

This tutorial is based on the [YOLOv7 repository](https://github.com/WongKinYiu/yolov7) by WongKinYiu. This notebook shows training on **your own custom objects**. Many thanks to WongKinYiu and AlexeyAB for putting this repository together.


### **Accompanying Blog Post**

We recommend following along in this notebook while reading the blog post on [how to train YOLOv7](https://blog.roboflow.com/yolov7-custom-dataset-training-tutorial/).

### **Steps Covered in this Tutorial**

To train our detector we take the following steps:

* Install YOLOv7 dependencies
* Load custom dataset from Roboflow in YOLOv7 format
* Run YOLOv7 training
* Evaluate YOLOv7 performance
* Run YOLOv7 inference on test images
* OPTIONAL: Deployment
* OPTIONAL: Active Learning


### Preparing a Custom Dataset

In this tutorial, we will use an open source computer vision dataset, one of the 90,000+ available on [Roboflow Universe](https://universe.roboflow.com).

If you already have your own images (and, optionally, annotations), you can convert your dataset using [Roboflow](https://roboflow.com), a set of tools developers use to build better computer vision models quickly and accurately. 100k+ developers use Roboflow for (automatic) annotation, converting dataset formats (for example, to YOLOv7), training, deploying, and improving their datasets and models.

Follow [the getting started guide here](https://docs.roboflow.com/quick-start) to create and prepare your own custom dataset.

# Install Dependencies

_(Remember to choose GPU in Runtime if not already selected. Runtime --> Change Runtime Type --> Hardware accelerator --> GPU)_
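
Before installing anything, it can help to confirm that the runtime actually exposes a GPU. The sketch below is not part of the original notebook and rests on an assumption about how one might check: it asks PyTorch when the package is importable, and otherwise only looks for the `nvidia-smi` executable on the `PATH`. The helper name `gpu_status` is hypothetical.

```python
import importlib.util
import shutil


def gpu_status() -> str:
    """Return a short description of the GPU support visible to this runtime."""
    # Prefer PyTorch's own answer when the package is importable.
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            return "cuda available"
        return "torch installed, no CUDA device"
    # Fallback: the NVIDIA driver tool is present (this alone does not prove a usable GPU).
    if shutil.which("nvidia-smi"):
        return "nvidia-smi found, torch not installed"
    return "no GPU tooling found"


print(gpu_status())
```

If this does not report a CUDA device, revisit the hardware accelerator setting before training.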

In [1]:
# Download YOLOv7 repository and install requirements
#!git clone https://github.com/WongKinYiu/yolov7
%cd yolov7
!pip install -r requirements.txt

/notebooks/yolov7
Collecting thop
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl (15 kB)
Installing collected packages: thop
Successfully installed thop-0.1.1.post2209072238

# Download Correctly Formatted Custom Data

Next, we'll download our dataset in the right format. Use the `YOLOv7 PyTorch` export. Note that this model requires YOLO TXT annotations, a custom YAML file, and organized directories. The Roboflow export writes these for us and saves them in the correct location.
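
To make the YOLO TXT annotation format concrete: each label line holds a class index followed by a box center, width, and height, all normalized to the image size. The following is a minimal sketch of converting one such line to pixel corner coordinates. The names `PixelBox` and `parse_yolo_line` and the sample values are illustrative, not part of the Roboflow export.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_line(line: str, img_w: int, img_h: int) -> PixelBox:
    """Convert 'class x_center y_center width height' (normalized) to pixel corners."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalized values back to pixels.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return PixelBox(int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# Hypothetical label line for a 640x480 image.
box = parse_yolo_line("1 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
print(box)
```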


In [None]:
# REPLACE with your custom code snippet generated above

!pip install roboflow

from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")  # replace with your own Roboflow API key
project = rf.workspace("capstoneroboboxer").project("robobox2")
dataset = project.version(4).download("yolov7")



Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting roboflow
  Downloading roboflow-0.2.25-py3-none-any.whl (46 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.5/46.5 KB 6.8 MB/s eta 0:00:00
Collecting python-dotenv
  Downloading python_dotenv-0.21.0-py3-none-any.whl (18 kB)
Collecting cycler==0.10.0
  Downloading cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
Collecting pyparsing==2.4.7
  Downloading pyparsing-2.4.7-py2.py3-none-any.whl (67 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.8/67.8 KB 10.6 MB/s eta 0:00:00
Collecting urllib3==1.26.6
  Downloading urllib3-1.26.6-py2.py3-none-any.whl (138 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 138.5/138.5 KB 18.9 MB/s eta 0:00:00
Collecting requests-toolbelt
  Downloading requests_toolbelt-0.10.1-py2.py3-none-any.whl (54 kB)

loading Roboflow workspace...
loading Roboflow project...
Downloading Dataset Version Zip in RoboBox2-4 to yolov7pytorch: 100% [106928744 / 106928744] bytes


Extracting Dataset Version Zip to RoboBox2-4 in yolov7pytorch:: 100%|██████████| 1996/1996 [00:01<00:00, 1858.06it/s]


testing new sus data


In [2]:
%cd yol

!pip install roboflow

from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")  # replace with your own Roboflow API key
project = rf.workspace("roboboxer3").project("boxing-4")
dataset = project.version(3).download("yolov7")

Collecting roboflow
  Downloading roboflow-0.2.34-py3-none-any.whl (50 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.2/50.2 kB 12.5 MB/s eta 0:00:00
Collecting wget
  Downloading wget-3.2.zip (10 kB)
  Preparing metadata (setup.py) ... done
Collecting python-dotenv
  Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)
Collecting certifi==2022.12.7
  Downloading certifi-2022.12.7-py3-none-any.whl (155 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 155.3/155.3 kB 30.3 MB/s eta 0:00:00
Collecting cycler==0.10.0
  Downloading cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
Collecting pyparsing==2.4.7
  Downloading pyparsing-2.4.7-py2.py3-none-any.whl (67 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.8/67.8 kB 17.4 MB/s eta 0:00:00
Collecting chardet==4.0.0
  Downloading chardet-4.0.0-py2.py3-none-any.whl (178 kB)
     ━━━━━━━━━━━

Extracting Dataset Version Zip to boxing-4-1 in yolov7pytorch:: 100%|██████████| 10950/10950 [00:08<00:00, 1325.03it/s]


# Begin Custom Training

We're ready to start custom training.

NOTE: For speed, the regular YOLOv7 command below lowers `epochs` from the default of 300 to 200. The YOLOv7 Tiny commands further down use shorter runs (20 and 50 epochs) together with a custom `--cfg` and `--hyp`. If you'd like to change other settings, see details in [our accompanying blog post](https://blog.roboflow.com/yolov7-custom-dataset-training-tutorial/).
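
Because the training cells in this notebook differ only in a handful of flags, it can be convenient to assemble the command in one place. The sketch below is one possible way to do that, not part of the YOLOv7 repository: `build_train_command` is a hypothetical helper, and it uses only flags that appear in the training cells of this notebook (`--batch`, `--epochs`, `--data`, `--weights`, `--cfg`, `--hyp`, `--name`, `--device`). The `batch=16` default simply mirrors the regular YOLOv7 cell.

```python
import shlex
from typing import List, Optional


def build_train_command(data: str, weights: str, epochs: int = 300, batch: int = 16,
                        cfg: Optional[str] = None, hyp: Optional[str] = None,
                        name: Optional[str] = None, device: Optional[str] = None) -> List[str]:
    """Assemble a train.py argument list; optional flags are added only when set."""
    cmd = ["python", "train.py", "--batch", str(batch), "--epochs", str(epochs),
           "--data", data, "--weights", weights]
    for flag, value in (("--cfg", cfg), ("--hyp", hyp), ("--name", name), ("--device", device)):
        if value is not None:
            cmd += [flag, value]
    return cmd


# Example values taken from the Tiny training cell in this notebook.
cmd = build_train_command("boxing-4-3/data.yaml", "yolov7-tiny.pt",
                          epochs=20, batch=32, device="0")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside the repository directory, or the printed string can be pasted after `!` in a notebook cell.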

In [None]:
# download COCO starting checkpoint
%cd /content/yolov7
!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt

/content/yolov7
--2023-01-14 21:50:17--  https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/511187726/13e046d1-f7f0-43ab-910b-480613181b1f?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230114%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230114T215017Z&X-Amz-Expires=300&X-Amz-Signature=6e14aab59761c8625883d8ee4bd809f5d1ec9e97fca4fbfd543e9ba14f97d496&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=511187726&response-content-disposition=attachment%3B%20filename%3Dyolov7_training.pt&response-content-type=application%2Foctet-stream [following]
--2023-01-14 21:50:17--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/511187726/13e046d1-f7f0-43ab-910b-

Using YOLOv7 Tiny


In [3]:
# download YOLOv7_tiny
%cd /content/yolov7
!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt

[Errno 2] No such file or directory: '/content/yolov7'
/notebooks/yolov7
--2023-03-11 04:11:22--  https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/511187726/ba7d01ee-125a-4134-8864-fa1abcbf94d5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230311%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230311T041122Z&X-Amz-Expires=300&X-Amz-Signature=00af159caa8be1b18e057e0d63149f44812d45066de93dc504975da19054fe2d&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=511187726&response-content-disposition=attachment%3B%20filename%3Dyolov7-tiny.pt&response-content-type=application%2Foctet-stream [following]
--2023-03-11 04:11:22--  https://objects.githubusercontent.com/github-production-releas
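
The `[Errno 2]` line in the output above shows that `/content/yolov7` does not exist in this session, so the working directory stayed at `/notebooks/yolov7`. Some cells in this notebook use the `/content` prefix and others use `/notebooks`. A small guard like the following sketch avoids hard-coding either one. The helper name `first_existing_dir` is hypothetical, and the two candidate paths are simply the ones that appear in this notebook.

```python
from pathlib import Path
from typing import Iterable, Optional


def first_existing_dir(candidates: Iterable[str]) -> Optional[Path]:
    """Return the first candidate that exists as a directory, or None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            return path
    return None


repo_dir = first_existing_dir(["/content/yolov7", "/notebooks/yolov7"])
print(repo_dir if repo_dir else "YOLOv7 checkout not found; clone the repository first")
```

In a notebook, the result could then be used with `%cd {repo_dir}` instead of a fixed path.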

In [6]:
%ls

LICENSE.md    data/      hubconf.py        scripts/      utils/
README.md     deploy/    inference/        test.py       yolov7-tiny.pt
__pycache__/  detect.py  models/           tools/
boxing-4-1/   export.py  paper/            train.py
cfg/          figure/    requirements.txt  train_aux.py


For regular YOLO

In [None]:
# run this cell to begin training

%cd /content/yolov7
!python train.py --batch 16 --epochs 200 --data {dataset.location}/data.yaml --weights 'yolov7_training.pt' 


For Tiny YOLO

In [1]:
# run this cell to begin training

%cd /notebooks/yolov7

#!python train.py --batch 16 --epochs 200 --data {dataset.location}/data.yaml --weights 'yolov7-tiny.pt'

#NEED TO UPLOAD CUSTOM CFG FILE FOR OUR CUSTOM CLASSES
!python train.py --batch 32 --epochs 20  --device 0 \
--data /notebooks/yolov7/boxing-4-3/data.yaml \
--cfg /notebooks/yolo_tiny_2c_deploy.yaml \
--weights 'yolov7-tiny.pt' --name frfrfr_yolov7_tiny_box_2c_deploy \
--hyp /notebooks/yolov7/data/hyp.scratch.tiny.yaml

#!python train.py --epochs 100 --workers 4 --device 0 --batch-size 32 \
#--data data/pothole.yaml --img 640 640 --cfg /content/yolo_tiny_model_2c.yaml \
#--weights 'yolov7-tiny.pt' --name yolov7_tiny_pothole_fixed_res --hyp data/hyp.scratch.tiny.yaml


/notebooks/yolov7
YOLOR 🚀 v0.1-122-g3b41c2c torch 1.12.1+cu116 CUDA:0 (NVIDIA RTX A4000, 16117.3125MB)

Namespace(weights='yolov7-tiny.pt', cfg='/notebooks/yolo_tiny_2c_deploy.yaml', data='/notebooks/yolov7/boxing-4-3/data.yaml', hyp='/notebooks/yolov7/data/hyp.scratch.tiny.yaml', epochs=20, batch_size=32, img_size=[640, 640], rect=False, resume=False, nosave=False, notest=False, noautoanchor=False, evolve=False, bucket='', cache_images=False, image_weights=False, device='0', multi_scale=False, single_cls=False, adam=False, sync_bn=False, local_rank=-1, workers=8, project='runs/train', entity=None, name='frfrfr_yolov7_tiny_box_2c_deploy', exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, upload_dataset=False, bbox_interval=-1, save_period=-1, artifact_alias='latest', freeze=[0], v5_metric=False, world_size=1, global_rank=-1, save_dir='runs/train/frfrfr_yolov7_tiny_box_2c_deploy', total_batch_size=32)
[34m[1mtensorboard: [0mStart with 'tensorboard --logdir runs/train'

Testing Script

In [3]:
%cd /notebooks/yolov7
!python3 test.py --weights /notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy11/weights/best.pt --data /notebooks/yolov7/boxing-4-3/data.yaml --task test --name yolo_bs_reduce_2

/notebooks/yolov7
Namespace(weights=['/notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy11/weights/best.pt'], data='/notebooks/yolov7/boxing-4-3/data.yaml', batch_size=32, img_size=640, conf_thres=0.001, iou_thres=0.65, task='test', device='', single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project='runs/test', name='yolo_bs_reduce_2', exist_ok=False, no_trace=False, v5_metric=False)
YOLOR 🚀 v0.1-122-g3b41c2c torch 1.12.1+cu116 CUDA:0 (NVIDIA RTX A4000, 16117.3125MB)

Fusing layers... 
Model Summary: 200 layers, 6009343 parameters, 0 gradients
 Convert model to Traced-model... 
 traced_script_module saved! 
 model is traced! 

  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[34m[1mtest: [0mScanning 'boxing-4-3/test/labels.cache' images and labels... 833 found, 0 [0m
               Class      Images      Labels           P           R      mAP@.5
                 all         8
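
The `mAP@.5` column and the `iou_thres` setting in the log above both rest on intersection over union (IoU): the overlap area of two boxes divided by the area of their union. In `mAP@.5`, a prediction counts as a match when its IoU with a ground-truth box is at least 0.5. The function below is a minimal sketch of that calculation for boxes given as `(x_min, y_min, x_max, y_max)`. It is an illustration, not the code `test.py` runs.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes that overlap by half: IoU is one third, below the 0.5 cutoff.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```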

Testing Batch Size reduction

In [None]:
%cd /notebooks/yolov7

#!python train.py --batch 16 --epochs 200 --data {dataset.location}/data.yaml --weights 'yolov7-tiny.pt'

#NEED TO UPLOAD CUSTOM CFG FILE FOR OUR CUSTOM CLASSES
!python train.py --batch 64 --epochs 50  --device 0 \
--data /notebooks/yolov7/boxing-4-3/data.yaml \
--cfg /notebooks/yolo_tiny_2c_deploy.yaml \
--weights '/notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy9/weights/last.pt' --name frfr_resume_yolov7_tiny_box_2c_deploy \
--hyp /notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy9/hyp.yaml


/notebooks/yolov7
YOLOR 🚀 v0.1-122-g3b41c2c torch 1.12.1+cu116 CUDA:0 (NVIDIA RTX A4000, 16117.3125MB)

Namespace(weights='/notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy9/weights/last.pt', cfg='/notebooks/yolo_tiny_2c_deploy.yaml', data='/notebooks/yolov7/boxing-4-3/data.yaml', hyp='/notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy9/hyp.yaml', epochs=50, batch_size=64, img_size=[640, 640], rect=False, resume=False, nosave=False, notest=False, noautoanchor=False, evolve=False, bucket='', cache_images=False, image_weights=False, device='0', multi_scale=False, single_cls=False, adam=False, sync_bn=False, local_rank=-1, workers=8, project='runs/train', entity=None, name='frfr_resume_yolov7_tiny_box_2c_deploy', exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, upload_dataset=False, bbox_interval=-1, save_period=-1, artifact_alias='latest', freeze=[0], v5_metric=False, world_size=1, global_rank=-1, save_dir='runs/train/frfr_resume_yolov7_t

# Evaluation

We can evaluate the performance of our custom training using the provided evaluation script.

Note that the arguments below can be adjusted. For details, see [the arguments accepted by detect.py](https://github.com/WongKinYiu/yolov7/blob/main/detect.py#L154).

In [6]:
# Run evaluation
# ~~~~~CHANGE THE EXP NUMBER~~~~~
# Head to yolov7/runs/train/ and find the highest exp number; make sure that folder contains files

#!python detect.py --weights runs/train/exp/weights/best.pt --conf 0.1 --source {dataset.location}/test/images
%cd /notebooks/yolov7
!python detect.py --weights /notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy9/weights/best.pt --conf 0.5 --source boxing-4-3/test/images


/notebooks/yolov7
Namespace(weights=['/notebooks/yolov7/runs/train/frfr_resume_yolov7_tiny_box_2c_deploy9/weights/best.pt'], source='boxing-4-3/test/images', img_size=640, conf_thres=0.5, iou_thres=0.45, device='', view_img=False, save_txt=False, save_conf=False, nosave=False, classes=None, agnostic_nms=False, augment=False, update=False, project='runs/detect', name='exp', exist_ok=False, no_trace=False)
YOLOR 🚀 v0.1-122-g3b41c2c torch 1.12.1+cu116 CUDA:0 (NVIDIA RTX A4000, 16117.3125MB)

Fusing layers... 
Model Summary: 200 layers, 6009343 parameters, 0 gradients
 Convert model to Traced-model... 
 traced_script_module saved! 
 model is traced! 

  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
2 Faces, Done. (5.5ms) Inference, (1.1ms) NMS
 The image with the result is saved in: runs/detect/exp7/-I1-MS09uaqsLdGTFkgnS0Rcg1mmPyAj95ySg_eckoM_jpeg_jpg.rf.463c5b6963e14a75b918609e4a48c55a.jpg
1 Face, Done. (5.7ms) Inference, (0.7ms) NMS
 The image with the result is sa

In [19]:
# Display inference results for ALL test images

import glob
from IPython.display import Image, display

limit = 10000  # max images to display

# Head to yolov7/runs/detect/ and find the highest exp number; make sure that folder contains files

# Sort for a stable order, then cap the number of images shown
for image_name in sorted(glob.glob('/content/yolov7/runs/detect/exp/*.jpg'))[:limit]:  # assuming JPG
    display(Image(filename=image_name))
    print("\n")
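
The comments in the evaluation and display cells ask you to look up the highest `exp` number by hand. A sketch of doing that lookup in code follows. It assumes run folders are named with a base name plus an optional numeric suffix (`exp`, `exp5`, `exp7`), as the paths in this notebook suggest. `latest_run` is a hypothetical helper, not part of YOLOv7.

```python
import re
from pathlib import Path
from typing import Optional


def latest_run(root: str, name: str = "exp") -> Optional[Path]:
    """Return the run folder under root with the highest numeric suffix, or None."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    pattern = re.compile(rf"^{re.escape(name)}(\d*)$")
    best, best_index = None, -1
    for child in root_path.iterdir():
        match = pattern.match(child.name)
        if child.is_dir() and match:
            # Assumption: a bare "exp" folder is the first run.
            index = int(match.group(1) or 1)
            if index > best_index:
                best, best_index = child, index
    return best


print(latest_run("runs/detect"))  # prints None when no run folder exists yet
```

The returned path can replace the hard-coded `exp` folder in the `glob` pattern above.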

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
%cp /content/drive/MyDrive/Capstone_Test/WIN_20230112_18_22_47_Pro.mp4 /content/yolov7

In [None]:
!python detect.py --weights runs/train/yolov7_tiny_box_2c/weights/best.pt --conf 0.1 --source /content/yolov7/WIN_20230112_18_22_47_Pro.mp4

Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.1, device='', exist_ok=False, img_size=640, iou_thres=0.45, name='exp', no_trace=False, nosave=False, project='runs/detect', save_conf=False, save_txt=False, source='/content/yolov7/WIN_20230112_18_22_47_Pro.mp4', update=False, view_img=False, weights=['runs/train/yolov7_tiny_box_2c/weights/best.pt'])
YOLOR 🚀 v0.1-121-g2fdc7f1 torch 1.13.0+cu116 CUDA:0 (Tesla T4, 15109.75MB)

Fusing layers... 
IDetect.fuse
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Model Summary: 208 layers, 6010302 parameters, 0 gradients, 13.0 GFLOPS
 Convert model to Traced-model... 
 traced_script_module saved! 
 model is traced! 

video 1/1 (1/671) /content/yolov7/WIN_20230112_18_22_47_Pro.mp4: 1 Boxing_Glove, 2 Faces, Done. (6.6ms) Inference, (1.5ms) NMS
video 1/1 (2/671) /content/yolov7/WIN_20230112_18_22_47_Pro.mp4: 1 Boxing_Glove, 2 Faces, Done. (18.9ms) Inference, (1.3ms) NMS
video 1/1 (3/671) /content/yolov7/W
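
Each line of the video log above ends with the inference and NMS time for one frame. To summarize speed across a whole clip, those timings can be pulled out with a regular expression. This is a sketch that assumes the log format stays exactly as printed above. `mean_timings` is a hypothetical helper, and the two sample lines are copied from this output.

```python
import re
from statistics import mean
from typing import Iterable, Optional, Tuple

# Matches the "(6.6ms) Inference, (1.5ms) NMS" tail of a detect.py log line.
TIMING = re.compile(r"\((\d+(?:\.\d+)?)ms\) Inference, \((\d+(?:\.\d+)?)ms\) NMS")


def mean_timings(log_lines: Iterable[str]) -> Optional[Tuple[float, float]]:
    """Return (mean inference ms, mean NMS ms), or None if no line matches."""
    matches = [TIMING.search(line) for line in log_lines]
    pairs = [(float(m.group(1)), float(m.group(2))) for m in matches if m]
    if not pairs:
        return None
    return mean(p[0] for p in pairs), mean(p[1] for p in pairs)


sample = [
    "video 1/1 (1/671) /content/yolov7/WIN_20230112_18_22_47_Pro.mp4: "
    "1 Boxing_Glove, 2 Faces, Done. (6.6ms) Inference, (1.5ms) NMS",
    "video 1/1 (2/671) /content/yolov7/WIN_20230112_18_22_47_Pro.mp4: "
    "1 Boxing_Glove, 2 Faces, Done. (18.9ms) Inference, (1.3ms) NMS",
]
print(mean_timings(sample))
```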

# Reparameterize for Inference

See the [reparameterization notebook](https://github.com/WongKinYiu/yolov7/blob/main/tools/reparameterization.ipynb) in the YOLOv7 repository.

# OPTIONAL: Deployment

To deploy, you'll need to export your weights and save them to use later.

In [5]:
# optional, zip to download weights and results locally


# -r recurses into subfolders such as weights/, so the trained weights are included
!zip -r export_deploy.zip /notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/*
!zip -r export_deploy.zip /notebooks/yolov7/runs/detect/exp5/*
!zip -r export_deploy.zip /notebooks/yolov7/runs/test/yolo_200_test3/*

  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/F1_curve.png (deflated 10%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/PR_curve.png (deflated 17%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/P_curve.png (deflated 13%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/R_curve.png (deflated 11%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/confusion_matrix.png (deflated 35%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/events.out.tfevents.1678991400.nvcf96z9vc.62.0 (deflated 70%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/hyp.yaml (deflated 45%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/opt.yaml (deflated 49%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/results.png (deflated 9%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_box_2c_deploy/results.txt (deflated 75%)
  adding: notebooks/yolov7/runs/train/yolov7_tiny_b
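
As an alternative to the shell `zip` command, the same export can be done from Python with the standard library. Unlike `zip` without `-r`, `shutil.make_archive` always includes subdirectories such as `weights/`. The sketch below assumes a run folder laid out like the ones in this notebook. `archive_run` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def archive_run(run_dir: str, archive_base: str) -> str:
    """Zip a run folder (including subfolders) and return the archive path."""
    run_path = Path(run_dir)
    if not run_path.is_dir():
        raise FileNotFoundError(f"run folder not found: {run_dir}")
    # root_dir/base_dir keep the folder name as the top-level entry in the archive.
    return shutil.make_archive(archive_base, "zip",
                               root_dir=str(run_path.parent),
                               base_dir=run_path.name)


run_dir = "runs/train/yolov7_tiny_box_2c_deploy"  # run name taken from the cell above
if Path(run_dir).is_dir():
    print(archive_run(run_dir, "export_deploy"))
```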

# OPTIONAL: Active Learning Example

Once our first training run is complete, we should use our model to help identify which images are most problematic in order to investigate, annotate, and improve our dataset (and, therefore, model).

To do that, we can execute code that automatically uploads images back to our hosted dataset if the image is a specific class or below a given confidence threshold.


In [None]:
# # setup access to your workspace
# rf = Roboflow(api_key="YOUR_API_KEY")                               # used above to load data
# inference_project =  rf.workspace().project("YOUR_PROJECT_NAME")    # used above to load data
# model = inference_project.version(1).model

# upload_project = rf.workspace().project("YOUR_PROJECT_NAME")

# print("inference reference point: ", inference_project)
# print("upload destination: ", upload_project)

In [None]:
# # example upload: if prediction is below a given confidence threshold, upload it 

# confidence_interval = [10,70]                                   # [lower_bound_percent, upper_bound_percent]

# for prediction in predictions:                                  # predictions list to loop through
#   if(prediction['confidence'] * 100 >= confidence_interval[0] and 
#           prediction['confidence'] * 100 <= confidence_interval[1]):
        
#           # upload on success!
#           print(' >> image uploaded!')
#           upload_project.upload(image, num_retry_uploads=3)     # upload image in question
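
The commented-out loop above depends on a `predictions` list and an `image` that the notebook never defines. The selection step on its own can be sketched without any API calls, as below. The prediction dictionaries, their `class` values, and the helper name `select_for_review` are illustrative assumptions. Only the `confidence` key and the 10 to 70 percent interval come from the cell above.

```python
from typing import Dict, List


def select_for_review(predictions: List[Dict], lower_percent: float = 10,
                      upper_percent: float = 70) -> List[Dict]:
    """Keep predictions whose confidence falls inside the given percent interval."""
    lower, upper = lower_percent / 100, upper_percent / 100
    return [p for p in predictions if lower <= p["confidence"] <= upper]


# Hypothetical predictions; confidence is a fraction between 0 and 1.
predictions = [
    {"class": "Face", "confidence": 0.05},
    {"class": "Boxing_Glove", "confidence": 0.42},
    {"class": "Face", "confidence": 0.70},
    {"class": "Face", "confidence": 0.91},
]
for prediction in select_for_review(predictions):
    print(prediction)
```

Images whose predictions pass this filter are the ones worth uploading for re-annotation, since the model is neither confidently right nor clearly wrong about them.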

# Next steps

Congratulations, you've trained a custom YOLOv7 model! Next, start thinking about deploying and [building an MLOps pipeline](https://docs.roboflow.com) so your model gets better the more data it sees in the wild.