# How to Train YOLOv7 on a Custom Dataset

This tutorial is based on the [YOLOv7 repository](https://github.com/WongKinYiu/yolov7) by WongKinYiu. This notebook shows training on **your own custom objects**. Many thanks to WongKinYiu and AlexeyAB for putting this repository together.


### **Accompanying Blog Post**

We recommend following along in this notebook while reading the blog post on [how to train YOLOv7](https://blog.roboflow.com/yolov7-custom-dataset-training-tutorial/).

### **Steps Covered in this Tutorial**

To train our detector we take the following steps:

* Install YOLOv7 dependencies
* Load custom dataset from Roboflow in YOLOv7 format
* Augment the training images
* Run YOLOv7 training
* Evaluate YOLOv7 performance
* Export the trained model


## Roboflow Universe

Need data for your project? Before spending time on annotating, check out Roboflow Universe, a repository of more than 200,000 open-source datasets that you can use in your projects. You'll find datasets containing everything from annotated cracks in concrete to plant images with disease annotations.


[![Roboflow Universe](https://media.roboflow.com/notebooks/template/uni-banner-frame.png?ik-sdk-version=javascript-1.4.3&updatedAt=1672878480290)](https://universe.roboflow.com/)

## Preparing a custom dataset

Building a custom dataset can be a painful process. It might take dozens or even hundreds of hours to collect images, label them, and export them in the proper format. Fortunately, Roboflow makes this process as straightforward and fast as possible. Let me show you how!

### Step 1: Creating project

Before you start, you need to create a Roboflow [account](https://app.roboflow.com/login). Once you do that, you can create a new project in the Roboflow [dashboard](https://app.roboflow.com/). Be sure to choose the right project type; in our case, that is Object Detection.

<div align="center">
  <img
    width="640"
    src="https://media.roboflow.com/preparing-custom-dataset-example/creating-project.gif?ik-sdk-version=javascript-1.4.3&updatedAt=1672929799852"
  >
</div>

### Step 2: Uploading images

Next, add the data to your newly created project. You can do this via the API or through our [web interface](https://docs.roboflow.com/adding-data/object-detection).

If you drag and drop a directory with a dataset in a supported format, the Roboflow dashboard will automatically read the images and annotations together.

<div align="center">
  <img
    width="640"
    src="https://media.roboflow.com/preparing-custom-dataset-example/uploading-images.gif?ik-sdk-version=javascript-1.4.3&updatedAt=1672929808290"
  >
</div>

### Step 3: Labeling

If you only have images, you can label them in [Roboflow Annotate](https://docs.roboflow.com/annotate).

<div align="center">
  <img
    width="640"
    src="https://user-images.githubusercontent.com/26109316/210901980-04861efd-dfc0-4a01-9373-13a36b5e1df4.gif"
  >
</div>

### Step 4: Generate new dataset version

Now that our images and annotations are added, we can generate a dataset version. When generating a version, you may choose to add preprocessing and augmentations. This step is optional, but it can significantly improve the robustness of your model.

<div align="center">
  <img
    width="640"
    src="https://media.roboflow.com/preparing-custom-dataset-example/generate-new-version.gif?ik-sdk-version=javascript-1.4.3&updatedAt=1673003597834"
  >
</div>

### Step 5: Exporting dataset

Once the dataset version is generated, we have a hosted dataset we can load directly into our notebook for easy training. Click `Export` and select the `YOLOv7 PyTorch` dataset format.

<div align="center">
  <img
    width="640"
    src="https://media.roboflow.com/preparing-custom-dataset-example/export.gif?ik-sdk-version=javascript-1.4.3&updatedAt=1672943313709"
  >
</div>


# Install Dependencies

In [None]:
# Create a virtual environment first!

# run these in your terminal:
# python -m venv env
# ./env/Scripts/activate.bat

# if the above doesn't work, try this one:
# ./env/Scripts/Activate.ps1
import sys
assert sys.prefix != sys.base_prefix, "Please activate your virtual environment. See the comment in the code cell."

# Download YOLOv7 repository and install requirements
# A custom version of YOLOv7 that works with PyTorch>1.12.1 and NumPy>1.20.1,
# and supports SnapML
!git clone https://github.com/gyfen/yolov7-snapml.git

%pip install -r requirements.txt
# %pip install albumentations
# %pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
# %pip install -r yolov7-snapml/requirements.txt
# %pip install protobuf
# Quote version specifiers so the shell does not treat ">" as a redirect
# %pip install "onnx>=1.9.0"
# %pip install "onnx-simplifier>=0.3.6"
# %pip install roboflow
# %pip install pyyaml

# Download Correctly Formatted Custom Data

Next, we'll download our dataset in the right format. Use the `YOLOv7 PyTorch` export. Note that this model requires YOLO TXT annotations, a custom YAML file, and organized directories. The Roboflow export writes these for us and saves them in the correct location.

You can find our project here: https://app.roboflow.com/diminishedreality/food-products-jnugg/17
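
To make the YOLO TXT annotation format concrete: each line of a label file holds a class index followed by the box center, width, and height, all normalized to the image size. The snippet below is a minimal sketch of that conversion, not part of the Roboflow export or the YOLOv7 code. The helper name `yolo_line_to_pixel_box` and the sample label line are hypothetical.

```python
def yolo_line_to_pixel_box(line, img_width, img_height):
    """Convert one YOLO TXT label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")

    class_id = int(parts[0])
    # Center x, center y, width, height, each normalized to the range 0..1
    x_c, y_c, w, h = (float(v) for v in parts[1:])

    # Scale back to pixels and move from center/size to corner coordinates
    x_min = (x_c - w / 2) * img_width
    y_min = (y_c - h / 2) * img_height
    x_max = (x_c + w / 2) * img_width
    y_max = (y_c + h / 2) * img_height
    return class_id, x_min, y_min, x_max, y_max


# Hypothetical label line: class 2, box centered in a 640x480 image,
# covering half of its width and half of its height
print(yolo_line_to_pixel_box("2 0.5 0.5 0.5 0.5", 640, 480))
# → (2, 160.0, 120.0, 480.0, 360.0)
```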


In [None]:
# REPLACE with your custom code snippet generated above

API_KEY = "YOUR_KEY_HERE"

from roboflow import Roboflow
rf = Roboflow(api_key=API_KEY)
project = rf.workspace("diminishedreality").project("food-products-jnugg")
dataset = project.version(17).download("yolov7")

print("Dataset downloaded!")

# Augment images
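
The augmentation cell opens one label file per training image, so an image without a matching `.txt` file would stop the loop with an error. As an optional pre-check, the sketch below lists such images. It assumes the `images` and `labels` folders pair files by name stem, as the augmentation cell does; the helper name `find_unlabeled_images` is hypothetical.

```python
from pathlib import Path


def find_unlabeled_images(images_dir, labels_dir, pattern="*.jpg"):
    """Return the names of images that have no same-stem .txt file in labels_dir."""
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    return sorted(
        img.name
        for img in images_dir.glob(pattern)
        if not (labels_dir / f"{img.stem}.txt").is_file()
    )
```

Once `images_loc` and `labels_loc` are defined, calling `find_unlabeled_images(images_loc, labels_loc)` should return an empty list for a complete export.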

In [None]:
import albumentations as A
from pathlib import Path
import cv2

# number of augmented copies to generate per image
N = 1

# locations
train_loc = Path(dataset.location) / "train"
images_loc = train_loc / "images"
labels_loc = train_loc / "labels"

# transformations to apply
transform = A.Compose(
    [
        A.Rotate(limit=15, border_mode=cv2.BORDER_CONSTANT, p=0.5),
        A.RandomScale(scale_limit=(-0.3, 0.2), p=0.5),
        A.RandomBrightnessContrast(p=0.5),
        A.MotionBlur(blur_limit=3, p=0.2),
    ],
    bbox_params=A.BboxParams(
        format="yolo",
        label_fields=["category_ids"],
        min_visibility=0.7,
        clip=True,
        filter_invalid_bboxes=True,
    ),
)

img_paths = list(images_loc.glob("*.jpg"))

for img_path in img_paths:
    stem = img_path.stem

    label_path = labels_loc / f"{stem}.txt"

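    # Pass a string path; cv2.imread returns None if the file cannot be read
    img = cv2.imread(str(img_path))
    if img is None:
        continue
    height, width = img.shape[:2]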

    bboxes = []
    class_ids = []

    with open(label_path, "r") as f:
        for line in f.readlines():
            cls, x, y, w, h = map(float, line.strip().split())
            bboxes.append([x, y, w, h])
            class_ids.append(int(cls))

    for i in range(N):
        # Apply augmentation
        augmented = transform(image=img, bboxes=bboxes, category_ids=class_ids)
        aug_img = augmented["image"]
        aug_bboxes = augmented["bboxes"]
        # Use the labels returned by the transform: boxes dropped by
        # min_visibility are removed from both lists, so the original
        # class_ids may no longer line up with aug_bboxes
        aug_classes = augmented["category_ids"]

        aug_img_name = f"{stem}_aug{i}.jpg"
        aug_label_name = f"{stem}_aug{i}.txt"

        # Save new image (string path for cv2.imwrite)
        cv2.imwrite(str(images_loc / aug_img_name), aug_img)

        # Save new label file
        with open(labels_loc / aug_label_name, "w") as f:
            for cls_id, (x, y, w, h) in zip(aug_classes, aug_bboxes):
                f.write(f"{int(cls_id)} {x:.6f} {y:.6f} {w:.6f} {h:.6f}\n")

print("Augmentation done!")

# Begin Custom Training

We're ready to start custom training.
If you'd like to change any settings, see details in [our accompanying blog post](https://blog.roboflow.com/yolov7-custom-dataset-training-tutorial/).

In [None]:
# run this cell to begin training

IMG_SIZE = 640
EPOCHS = 1

import os
os.environ["WANDB_MODE"] = "disabled"

!python yolov7-snapml/train.py --workers 8 --device 0 --batch 16 --epochs {EPOCHS} --img {IMG_SIZE} {IMG_SIZE} --data "{dataset.location}/data.yaml" --weights yolov7-snapml/yolov7-tiny.pt --cfg yolov7-snapml/cfg/training/yolov7-tiny.yaml

# Evaluation

We can evaluate the performance of our custom training using the provided evaluation script.

Note that we can adjust the custom arguments below. For details, see [the arguments accepted by detect.py](https://github.com/WongKinYiu/yolov7/blob/main/detect.py#L154).
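
For context when reading the results: detection metrics such as mAP@0.5, which YOLO-style evaluation scripts commonly report, count a prediction as correct when its overlap with a ground-truth box reaches a threshold. That overlap is measured by intersection over union (IoU). The sketch below is an illustration only, not the code used by the evaluation script; the function name `box_iou` and the example boxes are hypothetical.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes contribute no intersection area
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes offset by half their width overlap by one third,
# which would fall below an IoU threshold of 0.5
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```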

In [None]:
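# Run evaluation
from pathlib import Path
import yaml

# Build the path with pathlib instead of hard-coded backslashes
data_yaml = Path(dataset.location) / "data.yaml"

with open(data_yaml, "r") as file:
    y = yaml.safe_load(file)

# Point the test split at <dataset folder name>/test/images
y["test"] = f"{Path(dataset.location).name}/test/images"

with open(data_yaml, "w") as f:
    yaml.dump(y, f)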

!python yolov7-snapml/test.py --weights runs/exp/weights/best.pt --data "{dataset.location}/data.yaml" --task test

# Export the model

In [None]:
!python export.py --weights runs/exp/weights/best.pt --grid --simplify --export-snapml --img-size {IMG_SIZE} {IMG_SIZE} --max-wh {IMG_SIZE}

# Next steps

Congratulations, you've trained a custom YOLOv7 model!
Now take the ONNX file and bring it into Lens Studio.