# How to Train YOLOv7 Instance Segmentation on a Custom Dataset
---
[![Roboflow](https://raw.githubusercontent.com/roboflow-ai/notebooks/main/assets/badges/roboflow-blogpost.svg)](https://blog.roboflow.com/train-yolov7-instance-segmentation-on-custom-data) [![YouTube](https://badges.aleen42.com/src/youtube.svg)](https://www.youtube.com/watch?v=vFGxM2KLs10)

This tutorial is based on the [YOLOv7 repository](https://github.com/WongKinYiu/yolov7) by WongKinYiu. This notebook shows training on **your own custom objects**. Many thanks to WongKinYiu and AlexeyAB for putting this repository together. 🙌


**Steps Covered in this Tutorial**

To train our segmentor we take the following steps:

* Before you start
* Install YOLOv7
* Install Requirements
* Inference with pre-trained COCO model
* Required data format
* Download dataset from Roboflow Universe
* Custom Training
* Evaluation

**Preparing a Custom Dataset**

In this tutorial, we will use one of the 90,000+ open source computer vision datasets available on [Roboflow Universe](https://universe.roboflow.com).

If you already have your own images (and, optionally, annotations), you can convert your dataset using [Roboflow](https://roboflow.com), a set of tools developers use to build better computer vision models quickly and accurately. 100k+ developers use Roboflow for (automatic) annotation, converting dataset formats (for example, to YOLOv7), training, deploying, and improving their datasets and models.

Follow [the getting started guide here](https://docs.roboflow.com/quick-start) to create and prepare your own custom dataset. If you create your own dataset on Roboflow, make sure to select the **Instance Segmentation** option.

## Before you start

Let's make sure that we have access to a GPU. We can use the `nvidia-smi` command to do that. In case of any problems, navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator` and set it to `GPU`.
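
Keep in mind that `nvidia-smi` only shows what the driver can see. The prediction logs further down in this notebook report `torch-2.0.1+cpu CPU`, which indicates a CPU-only PyTorch build even though `nvidia-smi` lists an RTX 3060. As a complementary check, here is a minimal sketch that also asks PyTorch whether CUDA is usable. The `gpu_status` helper is a hypothetical name introduced for illustration, not part of YOLOv7.

```python
import shutil


def gpu_status():
    """Report what the NVIDIA driver tooling and PyTorch can each see."""
    status = {
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        "torch_installed": False,
        "torch_cuda_available": False,
    }
    try:
        import torch  # optional: may not be installed yet at this point
    except (ImportError, OSError):
        return status
    status["torch_installed"] = True
    status["torch_cuda_available"] = bool(torch.cuda.is_available())
    return status


status = gpu_status()
print(status)
```

If `nvidia_smi_found` is `True` but `torch_cuda_available` is `False`, training and inference will most likely run on the CPU until a CUDA-enabled PyTorch build is installed.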

In [1]:
!nvidia-smi

Wed Sep  6 09:29:57 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.19                 Driver Version: 536.19       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce RTX 3060      WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   43C    P8               9W / 170W |    978MiB / 12288MiB |     21%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [2]:
ls

 Volume in drive C is OS
 Volume Serial Number is 1441-B12A

 Directory of c:\Users\ubayd\models\yolov7

09/04/2023  01:47 PM    <DIR>          .
08/31/2023  04:33 PM    <DIR>          ..
08/30/2023  11:28 AM    <DIR>          .git
08/30/2023  11:12 AM    <DIR>          cls
09/04/2023  11:58 AM                 0 crackdet_stream.py
08/30/2023  11:18 AM    <DIR>          det
08/31/2023  05:46 PM    <DIR>          images
08/30/2023  11:12 AM               320 README.md
09/04/2023  10:06 AM    <DIR>          seg
09/04/2023  02:03 PM             1,963 stream.py
09/04/2023  02:25 PM             2,786 test1.py
08/31/2023  06:17 PM    <DIR>          video
09/04/2023  12:41 PM    <DIR>          videos
               4 File(s)          5,069 bytes
               9 Dir(s)  48,654,540,800 bytes free


### Install YOLOv7

In [2]:
cd ..

c:\Users\ubayd\models


In [3]:
ls

 Volume in drive C is OS
 Volume Serial Number is 1441-B12A

 Directory of c:\Users\ubayd\models

09/05/2023  03:05 PM    <DIR>          .
09/01/2023  05:35 PM    <DIR>          ..
08/09/2023  04:36 PM    <DIR>          gstreamer-vaapi
09/05/2023  03:05 PM    <DIR>          runs
08/30/2023  11:25 AM    <DIR>          test
08/30/2023  05:59 PM         1,443,531 train-yolov7-instance-segmentation-on-custom-data.ipynb
08/30/2023  05:05 PM             7,558 Untitled-1.ipynb
08/17/2023  12:55 PM    <DIR>          yolov5
09/04/2023  03:58 PM    <DIR>          yolov7
08/31/2023  04:32 PM    <DIR>          yolov722
               2 File(s)      1,451,089 bytes
               8 Dir(s)  28,586,614,784 bytes free


In [4]:
import os
HOME = os.getcwd()
print(HOME)

c:\Users\ubayd\models


In [7]:
# clone YOLOv7 repository
%cd {HOME}
#!git clone https://github.com/WongKinYiu/yolov7

# navigate to yolov7 directory and checkout u7 branch of YOLOv7 - this is the hash of the latest commit from the u7 branch as of 12/21/2022
%cd {HOME}
#!git checkout 44f30af0daccb1a3baecc5d80eae22948516c579

c:\Users\ubayd\models
c:\Users\ubayd\models


### Install Requirements

In [5]:
%cd {HOME}/yolov7/seg
!pip install --upgrade pip
!pip install -r requirements.txt

c:\Users\ubayd\models\yolov7\seg


### Inference with pre-trained COCO model

In [None]:
# download COCO starting checkpoint to yolov7/seg directory
%cd {HOME}/yolov7/seg
!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-seg.pt

WEIGHTS_PATH = f"{HOME}/yolov7/seg/yolov7-seg.pt"

In [None]:
# download example image to yolov7/seg directory
#%cd {HOME}/yolov7/seg
#!wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=1sPYHUcIW48sJ67kh5MHOI3GfoXlYNOfJ' -O dog.jpeg

IMAGE_PATH = f"{HOME}/yolov7/seg/dog.jpeg"

In [None]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights $WEIGHTS_PATH --source $IMAGE_PATH --name coco

**NOTE:** For each experiment, YOLOv7 creates a separate result directory. By default, the result directories are named `exp`, `exp2`, `exp3`, and so on. We can change this name using the `--name` parameter passed to the `segment/predict.py` script.
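
Because the numeric suffix grows with every rerun, a hard-coded result path goes stale quickly. The sketch below picks the run directory with the highest suffix for a given `--name`. It assumes no earlier run directories were deleted in between (otherwise the highest number may not be the newest run), and `latest_run_dir` is a hypothetical helper, not part of the YOLOv7 code base.

```python
import tempfile
from pathlib import Path


def latest_run_dir(project_dir, name):
    """Return the run directory `name`, `name2`, `name3`, ... with the highest suffix."""
    def run_number(path):
        suffix = path.name[len(name):]
        return int(suffix) if suffix else 1

    candidates = [
        p for p in Path(project_dir).glob(f"{name}*")
        if p.is_dir() and (p.name == name or p.name[len(name):].isdigit())
    ]
    if not candidates:
        raise FileNotFoundError(f"no '{name}' run directories under {project_dir}")
    return max(candidates, key=run_number)


# Demo on a throwaway directory tree instead of a real runs/ folder.
with tempfile.TemporaryDirectory() as tmp:
    for run in ("coco", "coco2", "coco10", "cocoa"):
        (Path(tmp) / run).mkdir()
    latest_name = latest_run_dir(tmp, "coco").name

print(latest_name)  # coco10
```

In this notebook it could be used as `latest_run_dir(f"{HOME}/yolov7/seg/runs/predict-seg", "coco") / "dog.jpeg"`.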

In [None]:
RESULT_IMAGE_PATH = f"{HOME}/yolov7/seg/runs/predict-seg/coco6/dog.jpeg"

In [None]:
from IPython.display import Image, display

display(Image(filename=RESULT_IMAGE_PATH))

### Required data format

For YOLOv7 segmentation models, we will use the YOLOv7 PyTorch format.

**NOTE:** If you want to learn more about annotation formats, visit [Computer Vision Annotation Formats](https://roboflow.com/formats), where we talk about each of them in detail.

1. Dataset directory structure 

The dataset directory contains images and labels divided into three subsets: train, validation, and test. In addition, there should be a `data.yaml` file in the dataset root directory.

```
HOME/
└── dataset-name/
    ├── train/
    │   ├── images/
    │   │   ├── image-0.jpg
    │   │   ├── image-1.jpg
    │   │   └── ...
    │   └── labels/
    │       ├── image-0.txt
    │       ├── image-1.txt
    │       └── ...
    ├── test/
    │   ├── images/
    │   │   └── ...
    │   └── labels/
    │       └── ...
    ├── valid/
    │   ├── images/
    │   │   └── ...
    │   └── labels/
    │       └── ...
    └── data.yaml
```
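
Before training, it can help to confirm that the layout above is really in place. Here is a minimal sketch that lists images without a same-named `.txt` file in the sibling `labels/` folder. The `find_unlabeled_images` helper and the set of image extensions are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: extend for other image types


def find_unlabeled_images(dataset_root, splits=("train", "valid", "test")):
    """Return images that have no same-named .txt file in the split's labels/ folder."""
    missing = []
    for split in splits:
        images_dir = Path(dataset_root) / split / "images"
        labels_dir = Path(dataset_root) / split / "labels"
        if not images_dir.is_dir():
            continue  # a split that is absent is simply skipped
        for image in sorted(images_dir.iterdir()):
            if image.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            if not (labels_dir / f"{image.stem}.txt").is_file():
                missing.append(image)
    return missing


# Demo on a temporary dataset with one labeled and one unlabeled image.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "train" / "images"
    labels = Path(tmp) / "train" / "labels"
    images.mkdir(parents=True)
    labels.mkdir(parents=True)
    (images / "image-0.jpg").touch()
    (images / "image-1.jpg").touch()
    (labels / "image-0.txt").touch()
    unlabeled = [p.name for p in find_unlabeled_images(tmp)]

print(unlabeled)  # ['image-1.jpg']
```

A non-empty result is a prompt to look closer rather than a definite error, since an image without annotations may be intentional.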

2. Label file structure

Each label file should be in `.txt` format and have the same name (except for the extension) as the corresponding image. Below is an example of a label file's content.

```
0 0.03686995913461539 0.9808467740384615 0.03245967788461539 0.9595654110576923 0.030569555288461538 0.9517249110576922 0.03497983894230769 0.9438844086538462 0.0375 0.9304435480769231 0.038130040865384615 0.9203629038461538 0.053251007211538456 0.9091621875 0.057031250000000006 0.9002016129807693 0.05955141105769231 0.8822804663461539 0.060181451923076924 0.8699596778846155 0.06585181490384616 0.85987903125 0.07089213701923078 0.8520385312500001 0.07341229807692308 0.8408378125 0.07341229807692308 0.8307571682692309 0.07782258173076924 0.8195564519230769 0.0765625 0.8128360216346154 0.08034274278846154 0.7926747307692307 0.08790322596153846 0.7769937283653847 0.09294354807692308 0.7691532259615385 0.09420362980769231 0.753472221153846 0.10050403125 0.7310707884615385 0.11499495913461538 0.7153897860576923 0.12444556490384615 0.7086693557692307 0.13641633173076922 0.6974686370192308 0.14523689423076924 0.6851478485576923 0.15090725721153847 0.6717069903846155 0.15342741826923076 0.6593861995192308 0.1572076610576923 0.6448252692307693 0.16224798317307693 0.6302643365384615 0.16980846875 0.6313844086538462 0.17736895192307692 0.6347446225961538 0.18492943509615384 0.6369847668269231 0.1912298389423077 0.6381048389423077 0.1962701610576923 0.6381048389423077 0.20698084615384615 0.6381048389423077 0.22336189423076924 0.6369847668269231 0.22714213701923078 0.6280241947115385 0.22084173317307693 0.6280241947115385 0.2107610889423077 0.6302643365384615 0.20446068509615384 0.6313844086538462 0.19942036298076923 0.6313844086538462 0.18744959615384615 0.6291442644230769 0.1761088701923077 0.6213037644230769 0.16602822596153846 0.6179435480769231 0.15783770192307692 0.6224238341346153 0.1565776201923077 0.6291442644230769 0.1521673389423077 0.6414650528846154 0.14964717788461537 0.6493055552884616 0.14712701682692308 0.6661066298076924 0.1401965721153846 0.6784274182692307 0.13515625 0.684027778846154 0.12822580528846153 0.6896281370192308 0.12192540384615384 
0.6952284951923077 0.11814516105769231 0.6985887091346155 0.1131048389423077 0.7041890673076924 0.10617439423076923 0.7097894254807693 0.10050403125 0.71875 0.0935735889423077 0.7310707884615385 0.09042338701923078 0.7422715048076923 0.09042338701923078 0.7545922932692307 0.08853326682692307 0.7657930096153845 0.08286290384615384 0.7736335120192308 0.07719254086538462 0.7837141586538461 0.07467237980769231 0.7926747307692307 0.07215221875 0.8072356634615385 0.06900201682692307 0.8184363798076923 0.06837197596153846 0.8329973125 0.06837197596153846 0.8441980288461538 0.06396169471153847 0.8509184591346154 0.057031250000000006 0.857638889423077 0.053251007211538456 0.8677195336538461 0.053251007211538456 0.8744399639423077 0.055141129807692306 0.884520608173077 0.05451108894230769 0.892361110576923 0.050730846153846154 0.9024417572115385 0.04254032211538462 0.9102822572115384 0.03182963701923077 0.9170026875 0.03182963701923077 0.9270833341346154 0.03182963701923077 0.9338037644230769 0.026789314903846152 0.9405241947115385 0.025529233173076923 0.9483646947115385 0.02741935576923077 0.9662858413461539 0.03245967788461539 0.9819668461538461 0.035609879807692306 0.9920474903846154 0.04632056490384616 0.9998879927884615 0.04191028125 0.9954077067307693 0.03686995913461539 0.9808467740384615
0 0.005997983173076923 0.533938173076923 0.012928427884615384 0.5428987451923077 0.01922883173076923 0.545138889423077 0.03434979807692308 0.5496191754807692 0.045690524038461536 0.55857975 0.053251007211538456 0.55857975 0.06459173317307693 0.5596998197115385 0.07278225721153846 0.5596998197115385 0.07782258173076924 0.5619399639423077 0.08664314423076923 0.5686603942307692 0.09231350721153847 0.5753808245192308 0.09861391105769231 0.5843413990384616 0.10680443509615384 0.5899417572115385 0.1131048389423077 0.5899417572115385 0.12255544471153847 0.5899417572115385 0.13011592788461537 0.5888216850961538 0.14145665384615386 0.5933019711538461 0.14712701682692308 0.6000224014423077 0.1565776201923077 0.6033826153846154 0.16791834615384615 0.6078629038461538 0.16854838701923078 0.6000224014423077 0.1572076610576923 0.5899417572115385 0.14460685576923077 0.5888216850961538 0.13956653125000001 0.5809811826923077 0.13074596875 0.579861110576923 0.11688508173076924 0.5832213269230769 0.10806451682692307 0.5832213269230769 0.10239415384615384 0.57762096875 0.0935735889423077 0.5653001802884615 0.08538306490384615 0.5596998197115385 0.07719254086538462 0.552979391826923 0.06774193509615384 0.5518593197115385 0.05388104807692307 0.5518593197115385 0.04191028125 0.5462589615384615 0.033089718750000004 0.5417786730769231 0.025529233173076923 0.53953853125 0.018598790865384615 0.5384184591346154 0.009778225961538461 0.5316980288461539 0.001587701923076923 0.5216173846153846 0.005997983173076923 0.533938173076923
0 0.23722278125 0.5989023293269231 0.2491935480769231 0.5921818990384615 0.25864415384615386 0.5877016129807692 0.2706149182692308 0.58546146875 0.2794354831730769 0.5787410384615385 0.2895161298076923 0.5709005384615384 0.2989667331730769 0.5675403221153846 0.31030745913461544 0.5686603942307692 0.3159778221153846 0.5709005384615384 0.32605846875 0.57762096875 0.3317288317307692 0.579861110576923 0.34432963701923075 0.591061826923077 0.36260080528846156 0.591061826923077 0.3764616947115385 0.591061826923077 0.3922127019230769 0.5888216850961538 0.4098538317307692 0.5921818990384615 0.42749495913461544 0.6000224014423077 0.43190524278846154 0.6011424735576922 0.4489163317307692 0.6011424735576922 0.46340725721153847 0.6011424735576922 0.4772681442307692 0.6011424735576922 0.49616935576923077 0.6022625456730769 0.5169606850961539 0.5977822572115384 0.5264112908653846 0.5944220432692308 0.5478326610576923 0.5944220432692308 0.5654737908653846 0.5955421153846154 0.5881552427884615 0.5955421153846154 0.6045362908653846 0.5955421153846154 0.6215473798076924 0.5989023293269231 0.6385584687500001 0.6011424735576922 0.6517893149038462 0.6011424735576922 0.6706905240384615 0.6089829759615385 0.6977822572115385 0.6123431899038462 0.7135332668269231 0.5933019711538461 0.7374747980769231 0.5865815408653846 0.7595262091346153 0.57762096875 0.7689768149038462 0.5675403221153846 0.7822076610576923 0.5619399639423077 0.7929183461538462 0.5675403221153846 0.8023689519230769 0.5765008966346155 0.8118195552884615 0.5765008966346155 0.8231602812500001 0.5877016129807692 0.8338709687500001 0.5921818990384615 0.8477318557692308 0.5989023293269231 0.8653729831730769 0.6101030456730769 0.8767137091346153 0.6123431899038462 0.8905745961538462 0.6190636201923077 0.9019153221153846 0.6145833341346154 0.9107358870192308 0.6168234759615385 0.9176663317307692 0.6201836923076923 0.9321572572115385 0.62354390625 0.9460181442307692 0.6123431899038462 0.9365675408653846 0.6190636201923077 
0.9308971778846153 0.6168234759615385 0.9208165312500001 0.6101030456730769 0.9101058461538462 0.6056227596153846 0.8949848798076924 0.6033826153846154 0.8880544350961539 0.6067428317307693 0.8748235889423076 0.6056227596153846 0.8489919350961539 0.5944220432692308 0.8382812500000001 0.5832213269230769 0.8294606850961539 0.5709005384615384 0.8187500000000001 0.5675403221153846 0.8055191538461539 0.5630600360576923 0.7985887091346153 0.5596998197115385 0.7866179447115385 0.5518593197115385 0.7752772187500001 0.5518593197115385 0.7658266129807692 0.5540994615384616 0.7582661298076924 0.55857975 0.7525957668269231 0.5619399639423077 0.7406250000000001 0.5686603942307692 0.7292842740384615 0.5742607524038461 0.7229838701923076 0.5753808245192308 0.7185735889423076 0.5787410384615385 0.7066028221153846 0.58546146875 0.6990423389423076 0.5944220432692308 0.6921118942307692 0.6022625456730769 0.6820312500000001 0.6000224014423077 0.6725806442307692 0.5977822572115384 0.6536794350961539 0.591061826923077 0.6417086682692308 0.5877016129807692 0.6221774182692308 0.5899417572115385 0.6076864927884615 0.58546146875 0.5982358870192308 0.5832213269230769 0.5856350817307692 0.5809811826923077 0.5768145168269231 0.58546146875 0.5623235889423077 0.58546146875 0.5434223798076923 0.5821012548076923 0.5327116947115385 0.5821012548076923 0.5251512091346154 0.5843413990384616 0.5100302427884615 0.5888216850961538 0.5005796370192308 0.5921818990384615 0.48608870913461544 0.5966621875 0.46529737980769226 0.5933019711538461 0.4558467740384616 0.5899417572115385 0.44324596875 0.5899417572115385 0.43190524278846154 0.591061826923077 0.41489415384615386 0.5809811826923077 0.40418346875 0.5809811826923077 0.39536290384615386 0.5809811826923077 0.3815020168269231 0.579861110576923 0.3670110889423077 0.579861110576923 0.34936995913461544 0.5753808245192308 0.3424395168269231 0.572020608173077 0.32857862980769226 0.56642025 0.3178679447115385 0.5619399639423077 0.3096774182692308 0.55857975 
0.29833669471153845 0.5552195336538461 0.29203629086538463 0.55857975 0.2863659278846154 0.5630600360576923 0.27754536298076926 0.5686603942307692 0.2699848798076923 0.5731406802884615 0.25864415384615386 0.5742607524038461 0.2517137091346154 0.5753808245192308 0.23722278125 0.5843413990384616 0.2296622980769231 0.5921818990384615 0.21832157211538464 0.6089829759615385 0.22714213701923078 0.6022625456730769 0.23722278125 0.5989023293269231
```

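Each row in the labels file describes one object and has the same structure: `class_index x1 y1 x2 y2 x3 y3 ...`, where the `x`/`y` pairs are the polygon vertices, normalized to the image width and height.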
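
To make the row structure concrete, here is a small sketch that splits one row into a class index and a list of polygon points, then scales the points to pixels. It assumes the coordinates are normalized to the 0-1 range, as the example values suggest. The function names and the sample row are made up for illustration and do not come from the dataset.

```python
def parse_segmentation_row(row):
    """Split one label row into (class_index, [(x, y), ...]) with normalized coordinates."""
    fields = row.split()
    class_index = int(fields[0])
    coords = [float(value) for value in fields[1:]]
    if len(coords) < 6 or len(coords) % 2 != 0:
        raise ValueError("expected at least three x/y pairs after the class index")
    return class_index, list(zip(coords[0::2], coords[1::2]))


def to_pixels(points, width, height):
    """Scale normalized (x, y) points to integer pixel coordinates."""
    return [(round(x * width), round(y * height)) for x, y in points]


row = "0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75"  # made-up square, not from the dataset
class_index, polygon = parse_segmentation_row(row)
pixels = to_pixels(polygon, width=640, height=384)
print(class_index, pixels)
# 0 [(160, 96), (480, 96), (480, 288), (160, 288)]
```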

3. `data.yaml` file structure

```
names:
- class_1
- ...
- class_n
nc: n
train: dataset-name/train/images
val: dataset-name/valid/images
```
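
If you assemble the dataset by hand, the file can be generated so that `nc` always matches the number of entries in `names`. The sketch below builds the text with plain string formatting. `build_data_yaml` is a hypothetical helper, and it assumes class names contain no characters that need YAML quoting.

```python
def build_data_yaml(class_names, train_dir, val_dir):
    """Return data.yaml content with nc derived from the list of class names."""
    lines = ["names:"]
    lines += [f"- {name}" for name in class_names]
    lines.append(f"nc: {len(class_names)}")
    lines.append(f"train: {train_dir}")
    lines.append(f"val: {val_dir}")
    return "\n".join(lines) + "\n"


yaml_text = build_data_yaml(
    ["crack"],
    train_dir="dataset-name/train/images",
    val_dir="dataset-name/valid/images",
)
print(yaml_text)
```

Writing the result to `dataset-name/data.yaml` gives a file with the same keys as the template above.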

### Download dataset from Roboflow Universe

You will need your `API_KEY`. You can find it by clicking on your profile in the upper right corner of the Roboflow app, then `Settings`. You will be redirected to the `Roboflow: Settings` page. On the left, below `WORKSPACES`, click `Roboflow` -> `Roboflow API`. Copy the `Private API Key`. Run the cell below with `Shift + Enter` and paste your `API_KEY` at the prompt.

In [None]:
from getpass import getpass

In [None]:
# paste your Private API Key at the prompt; it is stored in api_key for the next cell
api_key = getpass('Enter YOUR_API_KEY secret value:')

In [None]:
%cd {HOME}/yolov7/seg

!pip install roboflow --quiet

from roboflow import Roboflow

rf = Roboflow(api_key=api_key)
project = rf.workspace("university-bswxt").project("crack-bphdr")
dataset = project.version(2).download("yolov7")

### Custom Training

In [None]:
%cd {HOME}/yolov7/seg
!python segment/train.py --batch 16 \
 --epochs 10 \
 --data {dataset.location}/data.yaml \
 --weights $WEIGHTS_PATH \
 --device 0 \
 --name custom

In [None]:
from IPython.display import Image, display

display(Image(filename=f"{HOME}/yolov7/seg/runs/train-seg/custom/val_batch0_labels.jpg"))

### Evaluation

We can evaluate the performance of our custom training using the provided evaluation script.

In [12]:
print(HOME)

c:\Users\ubayd\models


In [25]:
%cd {HOME}/yolov7/seg

c:\Users\ubayd\models\yolov7\seg


In [44]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source ../images/frame0001.jpg

c:\Users\ubayd\models\yolov7\seg


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=../images/frame0001.jpg, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
image 1/1 C:\Users\ubayd\models\yolov7\images\frame0001.jpg: 384x640 1 crack, 347.9ms
Speed: 0.0ms pre-process, 347.9ms inference, 1.0ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs\predict-seg\exp5


In [38]:
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source ../images

[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=../images, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
image 1/6 C:\Users\ubayd\models\yolov7\images\frame0001.jpg: 384x640 1 crack, 354.5ms
image 2/6 C:\Users\ubayd\models\yolov7\images\frame0118.jpg: 384x640 3 cracks, 354.5ms
image 3/6 C:\Users\ubayd\models\yolov7\images\frame0205.jpg: 384x640 4 cracks, 319.8ms
image 4/6 C:\Users\ubayd\models\yolov7\images\frame0329.jpg: 384x640 3 cr

In [None]:
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source ../video

In [6]:
!python --version

Python 3.10.12


Now we can display some of the results.

In [7]:
import glob
from IPython.display import Image, display

# adjust `exp` to the run directory reported in the prediction output (for example `exp5`)
for image_name in glob.glob(f"{HOME}/yolov7/seg/runs/predict-seg/exp/*.jpg")[:2]:
    display(Image(filename=image_name))
    print("\n")

In [20]:
print(HOME)

c:\Users\ubayd\models


Congratulations on your first instance segmentation YOLOv7 model! If you're still hungry for knowledge, visit the [Roboflow Notebooks](https://github.com/roboflow/notebooks) repository. There you'll find plenty of Computer Vision tutorials - from ResNet to the latest Transformers. 

In [9]:
%cd {HOME}/yolov7


c:\Users\ubayd\models\yolov7


In [10]:
!python stream.py

Using cache found in C:\Users\ubayd/.cache\torch\hub\ultralytics_yolov5_v6.0
YOLOv5  2023-9-4 torch 2.0.1+cpu CPU

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients
Adding AutoShape... 


In [14]:
!python test1.py

Traceback (most recent call last):
  File "c:\Users\ubayd\models\yolov7\test1.py", line 1, in <module>
    import gi
ModuleNotFoundError: No module named 'gi'


In [23]:
!python test2.py

Traceback (most recent call last):
  File "c:\Users\ubayd\models\yolov7\test2.py", line 3, in <module>
    from utils.general import increment_path
ModuleNotFoundError: No module named 'utils'


In [31]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source ../video

c:\Users\ubayd\models\yolov7\seg


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=../video, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
video 1/1 (1/600) C:\Users\ubayd\models\yolov7\video\crack_video.mp4: 384x640 1 crack, 314.2ms
video 1/1 (2/600) C:\Users\ubayd\models\yolov7\video\crack_video.mp4: 384x640 1 crack, 338.6ms
video 1/1 (3/600) C:\Users\ubayd\models\yolov7\video\crack_video.mp4: 384x640 1 crack, 350.6ms
video 1/1 (4/600) C:\Users\ubayd\models\yolov7\vi

In [15]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source https://www.youtube.com/watch?v=ylgzNKpgh4s

c:\Users\ubayd\models\yolov7\seg


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=https://www.youtube.com/watch?v=ylgzNKpgh4s, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
1/1: https://www.youtube.com/watch?v=ylgzNKpgh4s...  Success (432 frames 1280x720 at 24.00 FPS)

0: 384x640 (no detections), 315.7ms
0: 384x640 (no detections), 305.0ms
0: 384x640 1 crack, 288.6ms
0: 384x640 1 crack, 291.7ms
0: 384x640 1 crack, 319.7ms
0: 384x640 1 crack, 322.6ms
0: 384x640 1 crack

In [23]:
import cv2

# Replace this with your video stream URL
stream_url = "http://localhost:8080/YUN_0006.mp4"

# Open the video stream and stop early if it cannot be opened
cap = cv2.VideoCapture(stream_url)
if not cap.isOpened():
    raise RuntimeError(f"Could not open video stream: {stream_url}")

frame_number = 0  # Initialize a frame counter

# Read until the stream ends; no window is shown, so there is no key press to wait for
while True:
    ret, frame = cap.read()

    if not ret:
        break

    # Save each frame as an image
    frame_number += 1
    image_name = f"frame_{frame_number:04d}.jpg"  # Use 4 digits for frame numbering
    cv2.imwrite(image_name, frame)

# Release the video capture object
cap.release()


In [42]:
# Import necessary libraries
import cv2

# # Set the preferable backend and target for OpenCV's DNN module
# cv2.dnn.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
# cv2.dnn.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

# Change directory to the YOLOv7 directory
#%cd {HOME}/yolov7

# Create a VideoCapture object to read the video stream
video_capture = cv2.VideoCapture('http://localhost:8080/YUN_0006.mp4')

# Load YOLOv7 model for object segmentation
net = cv2.dnn.readNet('./yolov7/seg/runs/train-seg/custom/weights/best.pt')

# Define a list of class names (you may need to adjust this based on your specific classes)
class_names = ['class1', 'class2', 'class3']

while True:
    # Read a frame from the video stream
    ret, frame = video_capture.read()

    if not ret:
        break

    # Perform object segmentation using YOLOv7
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    layer_names = net.getUnconnectedOutLayersNames()
    detections = net.forward(layer_names)

    # Process and display the segmented objects
    for detection in detections:
        # Extract class ID, confidence, bounding box coordinates
        # and calculate the corresponding box position
        for obj in detection:
            scores = obj[5:]
            class_id = scores.argmax()
            confidence = scores[class_id]

            if confidence > 0.25:  # Adjust this threshold as needed
                center_x = int(obj[0] * frame.shape[1])
                center_y = int(obj[1] * frame.shape[0])
                width = int(obj[2] * frame.shape[1])
                height = int(obj[3] * frame.shape[0])
                x = int(center_x - width / 2)
                y = int(center_y - height / 2)

                # Draw bounding box and label on the frame
                cv2.rectangle(frame, (x, y), (x + width, y + height), (0, 255, 0), 2)
                cv2.putText(frame, class_names[class_id], (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Display the frame with segmentation results
    cv2.imshow('YOLOv7 Object Segmentation', frame)

    # Exit the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the VideoCapture and close the OpenCV window
video_capture.release()
cv2.destroyAllWindows()

error: OpenCV(4.8.0) D:\a\opencv-python\opencv-python\opencv\modules\dnn\src\dnn_read.cpp:56: error: (-2:Unspecified error) Cannot determine an origin framework of files: ./yolov7/seg/runs/train-seg/custom/weights/best.pt in function 'cv::dnn::dnn4_v20230620::readNet'
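
The error above is consistent with how `cv2.dnn.readNet` chooses a loader: according to the OpenCV documentation, it infers the origin framework from the file extension, and a PyTorch `.pt` checkpoint is not among the recognized ones. The sketch below mirrors that lookup for illustration only. The extension table is transcribed from the `readNet` documentation for OpenCV 4.x and should be treated as an assumption, and `guess_dnn_framework` is a hypothetical helper, not an OpenCV function.

```python
from pathlib import Path

# Assumption: model-file extensions listed in the OpenCV 4.x readNet documentation.
READNET_EXTENSIONS = {
    ".caffemodel": "Caffe",
    ".pb": "TensorFlow",
    ".t7": "Torch7",
    ".net": "Torch7",
    ".weights": "Darknet",
    ".bin": "OpenVINO",
    ".onnx": "ONNX",
}


def guess_dnn_framework(model_path):
    """Return the framework name implied by the file extension, or None if unknown."""
    return READNET_EXTENSIONS.get(Path(model_path).suffix.lower())


print(guess_dnn_framework("best.pt"))    # None
print(guess_dnn_framework("best.onnx"))  # ONNX
```

A common route is to export the trained model to ONNX first and load the `.onnx` file. Whether OpenCV can then run the segmentation head of this particular model is not verified here.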


In [44]:
import cv2

# Change directory to the YOLOv7 directory
# %cd {HOME}/yolov7

# Create a VideoCapture object to read the video stream
video_capture = cv2.VideoCapture('http://localhost:8080/YUN_0006.mp4')

# Load YOLOv7 model for object segmentation and set backend and target
net = cv2.dnn.readNet('./yolov7/seg/runs/train-seg/custom/weights/best.pt', './yolov7/seg/runs/train-seg/custom/opt.yaml')

# Define a list of class names (you may need to adjust this based on your specific classes)
class_names = ['class1', 'class2', 'class3']

while True:
    # Read a frame from the video stream
    ret, frame = video_capture.read()

    if not ret:
        break

    # Perform object segmentation using YOLOv7
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    layer_names = net.getUnconnectedOutLayersNames()
    detections = net.forward(layer_names)

    # Process and display the segmented objects
    for detection in detections:
        # Extract class ID, confidence, bounding box coordinates
        # and calculate the corresponding box position
        for obj in detection:
            scores = obj[5:]
            class_id = scores.argmax()
            confidence = scores[class_id]

            if confidence > 0.25:  # Adjust this threshold as needed
                center_x = int(obj[0] * frame.shape[1])
                center_y = int(obj[1] * frame.shape[0])
                width = int(obj[2] * frame.shape[1])
                height = int(obj[3] * frame.shape[0])
                x = int(center_x - width / 2)
                y = int(center_y - height / 2)

                # Draw bounding box and label on the frame
                cv2.rectangle(frame, (x, y), (x + width, y + height), (0, 255, 0), 2)
                cv2.putText(frame, class_names[class_id], (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Display the frame with segmentation results
    cv2.imshow('YOLOv7 Object Segmentation', frame)

    # Exit the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the VideoCapture and close the OpenCV window
video_capture.release()
cv2.destroyAllWindows()


error: OpenCV(4.8.0) D:\a\opencv-python\opencv-python\opencv\modules\dnn\src\dnn_read.cpp:56: error: (-2:Unspecified error) Cannot determine an origin framework of files: ./yolov7/seg/runs/train-seg/custom/weights/best.pt, ./yolov7/seg/runs/train-seg/custom/opt.yaml in function 'cv::dnn::dnn4_v20230620::readNet'


In [50]:
import subprocess
import cv2

# Define the URL of the HTTP video stream
video_stream_url = "http://localhost:8080/YUN_0006.mp4"

# Run the YOLOv7 segmentation script using subprocess
yolov7_command = f"python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source {video_stream_url}"
subprocess.Popen(yolov7_command, shell=True)

# Open the video stream
cap = cv2.VideoCapture(video_stream_url)

# Check if the video stream was successfully opened
if not cap.isOpened():
    print("Error: Could not open video stream.")
else:
    # Loop to read frames from the video stream
    while True:
        ret, frame = cap.read()
        
        if not ret:
            break
        
        # Process the frame here, for example, run YOLOv7 segmentation on it.
        
        # Display the processed frame (replace this with your processing code)
        cv2.imshow("Frame", frame)
        
        # Break the loop when 'q' key is pressed
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    
    # Release the video capture object and close any open windows
    cap.release()
    cv2.destroyAllWindows()
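
The command string above depends on the shell and on the notebook's current directory: `segment/predict.py` is a relative path, so it only resolves when the process starts in `yolov7/seg`. Here is a minimal sketch of a more explicit launch, assuming the same directory layout as the rest of this notebook. `build_predict_command` is a hypothetical helper name, and the `models` root in the demo stands in for `HOME`.

```python
import sys
from pathlib import Path


def build_predict_command(home, source, conf=0.25):
    """Return (argv, working_dir) for running segment/predict.py without a shell."""
    seg_dir = Path(home) / "yolov7" / "seg"
    weights = seg_dir / "runs" / "train-seg" / "custom" / "weights" / "best.pt"
    argv = [
        sys.executable, "segment/predict.py",  # same interpreter as the notebook
        "--weights", str(weights),
        "--conf", str(conf),
        "--source", source,
    ]
    return argv, seg_dir


argv, working_dir = build_predict_command("models", "http://localhost:8080/YUN_0006.mp4")
print(argv)
print(working_dir)

# To launch it: subprocess.Popen(argv, cwd=working_dir)
```

Passing a list instead of a single string avoids shell quoting issues, and `cwd` makes the relative script path independent of earlier `%cd` cells.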


In [54]:
import subprocess
import cv2

# Define the URL of the HTTP video stream
video_stream_url = "http://localhost:8080/YUN_0006.mp4"

# Run the YOLOv7 segmentation script using subprocess
yolov7_command = f"python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source {video_stream_url}"
subprocess.Popen(yolov7_command, shell=True)

# Open the video stream
cap = cv2.VideoCapture(video_stream_url)

# Check if the video stream was successfully opened
if not cap.isOpened():
    print("Error: Could not open video stream.")
else:
    # Loop to read frames from the video stream
    while True:
        ret, frame = cap.read()
        
        if not ret:
            break
        
        # Process the frame here, for example, run YOLOv7 segmentation on it.
        # Ensure that the segmented frames are available in the 'frame' variable.
        
        # Display the segmented frame
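        cv2.imshow("Segmented Frame", frame)
        
        # Break the loop when 'q' key is pressed
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    
    # Release the video capture object and close any open windows
    cap.release()
    cv2.destroyAllWindows()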


In [55]:
import subprocess
import cv2

# Define the URL of the HTTP video stream
video_stream_url = "http://localhost:8080/YUN_0006.mp4"

# Run the YOLOv7 segmentation script using subprocess
yolov7_command = f"python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source {video_stream_url}"
subprocess.Popen(yolov7_command, shell=True)

# Open the video stream
cap = cv2.VideoCapture(video_stream_url)

# Create a window for displaying segmented frames
cv2.namedWindow("Segmented Frames", cv2.WINDOW_NORMAL)

# Check if the video stream was successfully opened
if not cap.isOpened():
    print("Error: Could not open video stream.")
else:
    # Loop to read frames from the video stream
    while True:
        ret, frame = cap.read()
        
        if not ret:
            break
        
        # Process the frame using YOLOv7 to obtain segmentation results
        # Replace this with your actual YOLOv7 code for segmentation
        segmented_frame = frame  # Replace this with the actual segmented result
        
        # Display the segmented frame
        cv2.imshow("Segmented Frames", segmented_frame)
        
        # Break the loop when 'q' key is pressed
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    
    # Release the video capture object and close the display window
    cap.release()
    cv2.destroyAllWindows()


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=http://localhost:8080/YUN_0006.mp4, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
Found http://localhost:8080/YUN_0006.mp4 locally at YUN_0006.mp4
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
video 1/1 (1/3770) C:\Users\ubayd\models\yolov7\seg\YUN_0006.mp4: 384x640 1 crack, 326.9ms
video 1/1 (2/3770) C:\Users\ubayd\models\yolov7\seg\YUN_0006.mp4: 384x640 1 crack, 348.8ms
video 1/1 (3/3770) C:\Users\ubayd\models\yolov7\seg\YUN_0006.
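
The line `Found http://localhost:8080/YUN_0006.mp4 locally at YUN_0006.mp4` suggests that the script treated the URL as a video file and resolved it to a local copy rather than reading it as a live stream. A simplified sketch of that kind of source classification follows, modeled on the YOLOv5-style logic this code base derives from. The function name `classify_source` and the abbreviated format lists are assumptions for illustration, not the repository's exact code.

```python
from pathlib import Path

# Abbreviated, illustrative lists; the real script supports more formats.
IMG_FORMATS = ("bmp", "jpg", "jpeg", "png")
VID_FORMATS = ("avi", "mkv", "mov", "mp4")
URL_PREFIXES = ("rtsp://", "rtmp://", "http://", "https://")


def classify_source(source: str) -> str:
    """Return 'download', 'stream', or 'local' for a --source value."""
    suffix = Path(source).suffix[1:].lower()
    is_file = suffix in IMG_FORMATS + VID_FORMATS
    is_url = source.lower().startswith(URL_PREFIXES)
    if is_url and is_file:
        return "download"   # URL ending in a media suffix: handled as a file
    if source.isnumeric() or source.endswith(".txt") or is_url:
        return "stream"     # webcam index, stream list, or live URL
    return "local"          # file, directory, or glob on disk


for src in ("http://localhost:8080/YUN_0006.mp4",
            "rtsp://127.0.0.1:8554/live",
            "../videos/YUN_0006.MP4"):
    print(src, "->", classify_source(src))
```

Under this logic an HTTP URL that ends in `.mp4` is processed frame by frame as an ordinary video file, which matches the `video 1/1 (n/3770)` progress lines in the log.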

In [61]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source https://www.youtube.com/watch?v=-BlvPpQLchw

c:\Users\ubayd\models\yolov7\seg


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=https://www.youtube.com/watch?v=-BlvPpQLchw, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
1/1: https://www.youtube.com/watch?v=-BlvPpQLchw...  Success (1885 frames 1280x720 at 25.00 FPS)

0: 384x640 1 crack, 316.5ms
0: 384x640 1 crack, 303.5ms
0: 384x640 (no detections), 293.2ms
0: 384x640 (no detections), 299.0ms
0: 384x640 (no detections), 294.0ms
0: 384x640 (no detections), 290.7ms
0
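
The per-frame times in these logs give a rough sense of CPU throughput. The small worked calculation below uses the six timings printed above and the stream properties reported by the loader (1885 frames at 25.00 FPS). It assumes the logged figure is the model's inference time and ignores decoding and drawing overhead.

```python
# Per-frame times (ms) copied from the log lines above.
frame_times_ms = [316.5, 303.5, 293.2, 299.0, 294.0, 290.7]

# Stream properties reported by the loader.
total_frames = 1885
source_fps = 25.0

mean_ms = sum(frame_times_ms) / len(frame_times_ms)
processing_fps = 1000.0 / mean_ms
total_seconds = total_frames * mean_ms / 1000.0
realtime_ratio = source_fps / processing_fps

print(f"mean time per frame: {mean_ms:.1f} ms")
print(f"processing rate:     {processing_fps:.2f} FPS")
print(f"time for all frames: {total_seconds / 60:.1f} min")
print(f"slower than source by a factor of about {realtime_ratio:.1f}")
```

On this CPU-only setup the model runs at roughly 3 FPS against a 25 FPS source, so it cannot process every frame of a live stream in real time.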

In [63]:

%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source rtsp://127.0.0.1:8554/live

c:\Users\ubayd\models\yolov7\seg


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=rtsp://127.0.0.1:8554/live, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
[ERROR:0@35.101] global cap.cpp:166 cv::VideoCapture::open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.8.0) D:\a\opencv-python\opencv-python\opencv\modules\videoio\src\cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): rtsp://127.0.0.1:8554/live in 
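
The `CAP_IMAGES` message typically appears when OpenCV could not open the URL with any streaming backend and fell through to its image-sequence reader, so a first thing to rule out is that nothing is listening at the RTSP address. A minimal sketch of such a check, using only the standard library, is shown here. `rtsp_endpoint` and `port_is_open` are hypothetical helper names, and 554 is the default RTSP port assumed when the URL gives none.

```python
import socket
from urllib.parse import urlparse


def rtsp_endpoint(url: str) -> tuple:
    """Split an RTSP URL into (host, port), defaulting to port 554."""
    parsed = urlparse(url)
    if parsed.scheme != "rtsp" or not parsed.hostname:
        raise ValueError(f"not an RTSP URL: {url}")
    return parsed.hostname, parsed.port or 554


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


host, port = rtsp_endpoint("rtsp://127.0.0.1:8554/live")
print(f"{host}:{port} reachable:", port_is_open(host, port))
```

A successful TCP connection only shows that a server is listening. The stream path (`/live`) and the OpenCV build's FFmpeg or GStreamer support still have to be verified separately.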

In [47]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source http://localhost:8080/YUN_0006.mp4

c:\Users\ubayd\models\yolov7\seg
^C


In [36]:
%cd {HOME}/yolov7/seg
!python test1.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source ../video

c:\Users\ubayd\models\yolov7\seg


[34m[1mtest1: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=../video, data=..\data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=..\runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
video 1/1 (1/600) C:\Users\ubayd\models\yolov7\video\crack_video.mp4: 384x640 1 crack, 313.2ms
video 1/1 (2/600) C:\Users\ubayd\models\yolov7\video\crack_video.mp4: 384x640 1 crack, 301.1ms
video 1/1 (3/600) C:\Users\ubayd\models\yolov7\video\crack_video.mp4: 384x640 1 crack, 304.2ms
video 1/1 (4/600) C:\Users\ubayd\models\yolov7\video\

In [73]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf-thres 0.25 --source ../videos/YUN_0006.MP4

c:\Users\ubayd\models\yolov7\seg


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=../videos/YUN_0006.MP4, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
video 1/1 (1/3770) C:\Users\ubayd\models\yolov7\videos\YUN_0006.MP4: 384x640 1 crack, 381.1ms
video 1/1 (2/3770) C:\Users\ubayd\models\yolov7\videos\YUN_0006.MP4: 384x640 1 crack, 348.7ms
video 1/1 (3/3770) C:\Users\ubayd\models\yolov7\videos\YUN_0006.MP4: 384x640 1 crack, 366.0ms
video 1/1 (4/3770) C:\Users\ubayd\mode

In [59]:
%cd {HOME}/yolov7/seg
!python teststream.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --source ../video/crack_video.mp4

c:\Users\ubayd\models\yolov7\seg


YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
Traceback (most recent call last):
  File "c:\Users\ubayd\models\yolov7\seg\teststream.py", line 59, in <module>
    main(opt)
  File "c:\Users\ubayd\models\yolov7\seg\teststream.py", line 39, in main
    pred, _ = model(img)
  File "c:\Users\ubayd\anaconda3\envs\ar310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "c:\Users\ubayd\models\yolov7\seg\models\common.py", line 537, in forward
    b, ch, h, w = im.shape  # batch, channel, height, width
ValueError: not enough values to unpack (expected 4, got 3)
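
The `ValueError` indicates that the model received a 3-dimensional array (a single image) where it expects a 4-dimensional batch of shape `(batch, channel, height, width)`. `teststream.py` is not shown in this notebook, so the following is only a sketch of the usual conversion, written with NumPy and a dummy frame. `frame_to_batch` is a hypothetical helper, and resizing or letterboxing to the model's input size is deliberately left out.

```python
import numpy as np


def frame_to_batch(frame_bgr: np.ndarray) -> np.ndarray:
    """Convert an OpenCV frame (H, W, 3), BGR, uint8 into a (1, 3, H, W) float32 array."""
    rgb = frame_bgr[:, :, ::-1]                 # BGR -> RGB
    chw = np.transpose(rgb, (2, 0, 1))          # HWC -> CHW
    scaled = np.ascontiguousarray(chw, dtype=np.float32) / 255.0  # 0-255 -> 0.0-1.0
    return scaled[None]                         # add the batch axis


dummy_frame = np.zeros((384, 640, 3), dtype=np.uint8)  # stands in for cap.read() output
batch = frame_to_batch(dummy_frame)
print(batch.shape)  # (1, 3, 384, 640)
```

With PyTorch, the resulting array could be wrapped with `torch.from_numpy(batch)` and moved to the model's device before the forward call.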


In [14]:
%cd {HOME}/yolov7/seg
!python segment/predict.py --weights {HOME}/yolov7/seg/runs/train-seg/custom/weights/best.pt --conf 0.25 --source rtsp://localhost:5000/live

c:\Users\ubayd\models\yolov7\seg


[34m[1msegment\predict: [0mweights=['c:\\Users\\ubayd\\models/yolov7/seg/runs/train-seg/custom/weights/best.pt'], source=rtsp://localhost:5000/live, data=data\coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\predict-seg, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2023-8-30 Python-3.10.12 torch-2.0.1+cpu CPU

Fusing layers... 
Model summary: 325 layers, 37842476 parameters, 0 gradients, 141.9 GFLOPs
[ERROR:0@36.382] global cap.cpp:166 cv::VideoCapture::open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.8.0) D:\a\opencv-python\opencv-python\opencv\modules\videoio\src\cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): rtsp://localhost:5000/live in 