
YOLOv7 on Cityscapes with bbox cropping

Introduction

In this project, we aimed to enhance the quality of dashcam and monitoring videos without costly upgrades. Using object detection and super-resolution techniques, we explored identifying cars and persons in low-quality frames and improving their visual details.

We trained a YOLOv7 model on the Cityscapes dataset (converted to COCO format using cityscapes-to-coco-conversion) to detect objects of interest. Additionally, we incorporated Latent Diffusion Models (LDM) for super-resolution to further enhance the cropped regions.
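
At a high level, the pipeline detects objects, crops each bounding box, and runs LDM super-resolution on the crops. The sketch below illustrates that flow only; run_yolov7 and run_ldm_sr are hypothetical placeholders, not this repo's API (the actual logic lives in yolov7/detect.py and the LDM submodule).

    # Sketch of the detect -> crop -> super-resolve flow described above.
    # run_yolov7 and run_ldm_sr are hypothetical callables, not this repo's API.
    def enhance_frame(frame, run_yolov7, run_ldm_sr):
        """Detect objects, crop their bounding boxes, and upscale each crop."""
        crops = []
        for x1, y1, x2, y2, conf, cls in run_yolov7(frame):  # boxes in pixel coords
            crop = frame[int(y1):int(y2), int(x1):int(x2)]   # numpy HxWxC slice
            crops.append(run_ldm_sr(crop))                    # 4x super-resolution
        return crops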

Environment

  • Python 3.10.11
  • PyTorch 1.13.1
  • Torchvision 0.14.1
  • CUDA 11.7
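
Once the environment is set up (see Setup below), a quick sanity check like the one here (not part of the repo) confirms the versions above:

    # Quick version sanity check for the environment listed above (not part of the repo).
    import sys
    import torch
    import torchvision

    print("Python     :", sys.version.split()[0])   # expect 3.10.11
    print("PyTorch    :", torch.__version__)        # expect 1.13.1
    print("Torchvision:", torchvision.__version__)  # expect 0.14.1
    print("CUDA build :", torch.version.cuda)       # expect 11.7
    print("CUDA usable:", torch.cuda.is_available())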

Setup

  1. Clone the project and its submodules

    $ git clone --recurse-submodules https://github.com/ghnmqdtg/yolov7-on-cityscapes-with-bbox-cropping.git
  2. Go into the project folder

    $ cd yolov7-on-cityscapes-with-bbox-cropping
  3. Run ./scripts/setup_env.sh to set up the environment.

    $ sh scripts/setup_env.sh
    • Creates a conda env named yolov7_with_cropping with Python 3.10.11.

    • Installs PyTorch with CUDA 11.7.

    • Installs the dependencies.

  4. (Optional) Set the VSCode interpreter path to ~/.conda/envs/yolov7_with_cropping/bin/python.

  5. Modify line 5 of ./scripts/setup_dataset.sh with your Cityscapes username and password.

  6. Run ./scripts/setup_dataset.sh to set up the dataset; this takes some time.

    $ sh scripts/setup_dataset.sh
    • Downloads the dataset.

    • Uses cityscapes-to-coco-conversion to generate bbox annotations for the Cityscapes dataset from its segmentation annotations (Cityscapes does not ship bbox annotations).

    • Converts the annotations from COCO format to YOLO format (a conversion sketch follows this list).

  7. Download the pretrained model and put it in the ./yolov7 folder.

    $ wget https://github.com/ghnmqdtg/yolov7-on-cityscapes-with-bbox-cropping/releases/download/v0.1/yolov7_cityscapes.pt \
        -O ./yolov7/yolov7_cityscapes.pt
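
For reference, the COCO-to-YOLO label conversion in step 6 boils down to turning absolute [x_min, y_min, width, height] boxes into normalized, class-prefixed [x_center, y_center, width, height] rows. The minimal sketch below is only an illustration; the repo's scripts handle the full dataset.

    # Convert one COCO bbox ([x_min, y_min, w, h] in pixels) to a YOLO label line
    # (class x_center y_center width height, all normalized to the image size).
    def coco_to_yolo(bbox, img_w, img_h, class_id):
        x_min, y_min, w, h = bbox
        x_center = (x_min + w / 2) / img_w
        y_center = (y_min + h / 2) / img_h
        return f"{class_id} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"

    # Example: a 200x100 px box at (50, 80) in a 2048x1024 Cityscapes frame
    print(coco_to_yolo([50, 80, 200, 100], 2048, 1024, 2))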

Test Interface

We provide a web interface for testing the model. Use the following commands to start it.

  1. Put your street view video in ./www, and rename it to street_view.mp4.

  2. Start the backend server in a terminal

    $ cd yolov7
    $ python detect-web.py
  3. Start the front end in another terminal

    $ cd www
    $ sh launch.sh
  4. Go to http://localhost:30700/

Train and evaluate the YOLOv7 model

  1. You should first cd into the yolov7 folder

    $ cd yolov7
  2. Train the model on Cityscapes

    $ python -m torch.distributed.launch \
        --nproc_per_node 1 \
        --master_port 9527 \
        train.py \
        --workers 2 \
        --device 0 \
        --sync-bn \
        --epochs 100 \
        --batch-size 32 \
        --data data/cityscape.yaml \
        --img 640 640 \
        --cfg cfg/training/yolov7.yaml \
        --weights ./yolov7.pt \
        --hyp data/hyp.scratch.p5.yaml

    The output will be saved in runs/train.

    YOLOv7 model training & evaluation report: mAP@50 = 0.61266, mAP@50:95 = 0.38005. The confusion matrix and the F1, PR, P, and R curves are included in the training output.
  3. Evaluation

    $ python test.py \
        --data data/cityscape.yaml \
        --img 640 \
        --batch 32 \
        --conf 0.001 \
        --iou 0.65 \
        --device 0 \
        --weights yolov7_cityscapes.pt \
        --name cityscapes_yolo_cityscapes

    The output will be saved in runs/test.

Run inference

  • On single image

    Only cropped regions whose width or height exceeds 32 px are saved: if a region is too small, super-resolution produces obvious artifacts (see the sketch at the end of this section). The output will be saved in runs/detect.

    $ python detect.py \
        --weights yolov7_cityscapes.pt \
        --conf 0.25 \
        --img-size 640 \
        --source customdata/images/test/bonn/bonn_000004_000019_leftImg8bit.png \
        --sr \
        --sr-step 100
    • --sr: Enable 4x super-resolution.
    • --sr-step: Controls the strength of the super-resolution; the larger the value, the better the result.

    Performance of cropping & super-resolution: example images comparing each raw crop with the same crop after 4x SR.

    If you want to test super-resolution only, you can use utils/custom_features.py under yolov7/. If the input's width or height is larger than 150 px, it is first resized down to 150 px while keeping the aspect ratio, and then super-resolution is applied (see the sketch at the end of this section).

    $ python utils/custom_features.py \
        --input-img inference/images/cropped_car.jpg \
        --sr-step 100
    
  • On a video

    Nope, I haven't tried it yet.
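
The two size rules mentioned above (skip crops whose sides do not exceed 32 px before super-resolution, and shrink inputs larger than 150 px before standalone super-resolution) roughly amount to the following. This is a hedged sketch of the idea, not the exact code in detect.py or utils/custom_features.py.

    # Sketch of the size handling described in this section (an interpretation,
    # not the repo's exact implementation).
    from PIL import Image

    MIN_SIDE = 32    # smaller crops are skipped: SR on tiny regions produces artifacts
    MAX_SIDE = 150   # larger inputs are shrunk before standalone SR

    def keep_crop(width, height):
        """Keep a crop only if both sides exceed 32 px (interpreting 'width or height')."""
        return width > MIN_SIDE and height > MIN_SIDE

    def shrink_for_sr(img: Image.Image) -> Image.Image:
        """Downscale in place so neither side exceeds 150 px, keeping the aspect ratio."""
        img.thumbnail((MAX_SIDE, MAX_SIDE))  # no-op if the image is already small enough
        return img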