YOLO-CROWD is a lightweight crowd counting and face detection model based on YOLOv5s. It can run on edge devices and addresses face occlusion, varying face scales, and other challenges of crowd counting.

zaki1003/YOLO-CROWD

YOLO-CROWD

Description

Deep learning-based algorithms for face and crowd identification have advanced significantly. They fall broadly into two groups: one-stage detectors such as YOLO and two-stage detectors such as Faster R-CNN. One-stage detectors are widely used because they offer a better balance between accuracy and speed, but YOLO algorithms are significantly affected by occlusion in crowd scenes. We propose YOLO-CROWD, a real-time crowd counter and face detector based on the one-stage detector YOLOv5. It has an inference time of 10.1 ms, 461 layers, and 18,388,982 parameters. To enlarge the receptive field for small faces, we use a Receptive Field Enhancement (RFE) module. We use NWD Loss to compensate for the sensitivity of IoU to position deviations of small objects. We also employ Repulsion Loss to address face occlusion, together with an attention module called SEAM.
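To illustrate why NWD is less sensitive than IoU to small position deviations, here is a minimal sketch that compares both metrics for a small face box shifted by one pixel. It follows the commonly cited NWD formulation (boxes modeled as 2D Gaussians, with the Wasserstein distance mapped through an exponential). The constant `c = 12.8` and the box values are assumptions chosen for illustration, and this is not the repository's own loss code.

```python
import math


def iou_cxcywh(a, b):
    """IoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd_cxcywh(a, b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    c is a dataset-dependent normalization constant; 12.8 is an assumed value.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


# A 6x6 px face and the same box shifted by one pixel in x and y (illustrative values).
small = (10.0, 10.0, 6.0, 6.0)
shifted = (11.0, 11.0, 6.0, 6.0)

iou = iou_cxcywh(small, shifted)
nwd = nwd_cxcywh(small, shifted)
print(f"IoU = {iou:.3f}, NWD = {nwd:.3f}, NWD loss = {1 - nwd:.3f}")
```

For this one-pixel shift, IoU drops to roughly 0.53 while NWD stays near 0.90, which is the behavior the NWD Loss relies on for small faces.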

Demo

Images

test-yolo-crowd

Screenshot from 2023-04-07 15-49-11

Screenshot from 2023-04-07 15-48-52

Videos

Without labels

inference.with.yolo-crowd.mp4

With labels (class name + confidence)

usmba2.after.compression.mp4

Comparison Between YOLOv5s and YOLO-CROWD

| Model | mAP@0.5 | mAP@0.5-0.95 | Precision | Recall | Box loss | Object loss | Inference time (ms) |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 39.4 | 0.15 | 0.754 | 0.382 | 0.120 | 0.266 | 7 |
| YOLO-CROWD | 43.6 | 0.158 | 0.756 | 0.424 | 0.091 | 0.158 | 10.1 |
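The trade-off in the comparison can be made explicit with a little arithmetic. The sketch below only restates the reported numbers as differences and as approximate frames per second. The FPS figures assume that the reported inference time is per image and excludes pre- and post-processing, which the source does not state.

```python
# Values copied from the comparison above; keys are illustrative names.
results = {
    "YOLOv5s": {"map50": 39.4, "recall": 0.382, "ms": 7.0},
    "YOLO-CROWD": {"map50": 43.6, "recall": 0.424, "ms": 10.1},
}


def fps(ms_per_image):
    """Approximate throughput, assuming one image per forward pass."""
    return 1000.0 / ms_per_image


base, crowd = results["YOLOv5s"], results["YOLO-CROWD"]
map_gain = crowd["map50"] - base["map50"]      # gain in mAP@0.5
recall_gain = crowd["recall"] - base["recall"]  # absolute recall gain
extra_ms = crowd["ms"] - base["ms"]             # added latency per image

print(f"mAP@0.5 gain: {map_gain:.1f}")
print(f"Recall gain: {recall_gain:.3f}")
print(f"Extra latency: {extra_ms:.1f} ms")
print(f"Approx. FPS: {fps(base['ms']):.1f} vs {fps(crowd['ms']):.1f}")
```

Under these assumptions, YOLO-CROWD gains about 4.2 points of mAP@0.5 for about 3.1 ms of extra latency per image.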

Environment Requirements

Create a Python Virtual Environment.

conda create -n {name} python=x.x

Enter Python Virtual Environment.

conda activate {name}

Install PyTorch.

pip install torch==1.11 torchvision==0.12 torchtext==0.12 torchaudio==0.11

Install the other Python packages.

pip install -r requirements.txt
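After installation, it can help to confirm that the environment matches the pinned PyTorch versions. The following is a minimal sketch using only the standard library. The helper name `check_versions` is hypothetical, and the prefix match (so that `1.11.0` or `1.11.0+cu113` counts as `1.11`) is an assumption about what should be treated as compatible.

```python
from importlib import metadata

# Versions pinned in the install command above.
EXPECTED = {
    "torch": "1.11",
    "torchvision": "0.12",
    "torchtext": "0.12",
    "torchaudio": "0.11",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the active environment.
            report[name] = (None, False)
            continue
        ok = installed == wanted or installed.startswith(wanted + ".")
        report[name] = (installed, ok)
    return report


for package, (installed, ok) in check_versions().items():
    status = "OK" if ok else "MISMATCH or missing"
    print(f"{package}: installed={installed} -> {status}")
```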

Step-Through Example

Installation

Get the code.

git clone https://github.com/zaki1003/YOLO-CROWD.git

Dataset

Download our dataset, crowd-counting-dataset-w3o7w. When exporting the dataset, select the YOLO v5 PyTorch format.

our-dataset
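In the YOLO v5 PyTorch format, each image has a text label file with one line per object: a class index followed by the box center, width, and height, all normalized to the image size. The sketch below converts one such line to pixel corner coordinates. The function name, the sample line, and the 416x416 image size are illustrative assumptions, not values taken from the dataset.

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Convert 'cls cx cy w h' (normalized) to (cls, (x1, y1, x2, y2)) in pixels."""
    cls, cx, cy, w, h = line.split()[:5]
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return int(cls), (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Hypothetical label line: class 0, centered, a quarter of the width, half the height.
sample = "0 0.5 0.5 0.25 0.5"
print(yolo_line_to_pixels(sample, 416, 416))
```

The number of lines in a label file is the ground-truth count for that image, which is what makes this format convenient for crowd counting.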

Preweight

Download the pretrained weights: yolov5s.pt

Training

Train your model on crowd-counting-dataset-w3o7w dataset.

python train.py --img 416 \
                --batch 16 \
                --epochs 200 \
                --data {dataset.location}/data.yaml \
                --cfg models/yolo-crowd.yaml \
                --weights yolov5s.pt \
                --name yolo_crowd_results \
                --cache
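The `{dataset.location}` placeholder in the training command has to be replaced with the folder that contains the exported `data.yaml`. As a sketch of how the same command could be assembled from a Python script or notebook, the snippet below builds the argument list without running it. The function name and the dataset path are hypothetical.

```python
import shlex
import sys


def build_train_command(dataset_location, epochs=200, batch=16, img=416):
    """Assemble the train.py arguments used above as a list for subprocess."""
    return [
        sys.executable, "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", f"{dataset_location}/data.yaml",
        "--cfg", "models/yolo-crowd.yaml",
        "--weights", "yolov5s.pt",
        "--name", "yolo_crowd_results",
        "--cache",
    ]


# Hypothetical location of the exported dataset.
cmd = build_train_command("datasets/crowd-counting")
print(shlex.join(cmd))

# To start training from the repository root, one could then call:
#   import subprocess; subprocess.run(cmd, check=True)
```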

Postweight

Download the trained weights: yolo-crowd.pt. For faster inference, install TensorRT and use this version: yolo-crowd.engine

Test

python detect.py --weights yolo-crowd.pt --source 0                               # webcam
                                                  img.jpg                         # image
                                                  vid.mp4                         # video
                                                  screen                          # screenshot
                                                  path/                           # directory
                                                  list.txt                        # list of images
                                                  list.streams                    # list of streams
                                                  'path/*.jpg'                    # glob
                                                  'https://youtu.be/Zgi9g1ksQHc'  # YouTube
                                                  'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream
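To turn detections into a crowd count, one option is to count the boxes written per image. The sketch below assumes that `detect.py` in this repository keeps the upstream YOLOv5 `--save-txt` and `--save-conf` options, which write one line per detection (`cls cx cy w h` plus an optional confidence) to a labels folder. The function names and the confidence threshold are illustrative.

```python
from pathlib import Path


def count_detections(label_file, min_conf=0.0):
    """Count detection lines in one label file, optionally filtering by confidence."""
    count = 0
    for line in Path(label_file).read_text().splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip empty or malformed lines
        # The sixth column exists only when confidences were saved.
        conf = float(parts[5]) if len(parts) >= 6 else 1.0
        if conf >= min_conf:
            count += 1
    return count


def count_directory(label_dir, min_conf=0.0):
    """Map each label file name in a folder to its detection count."""
    return {
        path.name: count_detections(path, min_conf)
        for path in sorted(Path(label_dir).glob("*.txt"))
    }
```

For example, `count_directory("runs/detect/exp/labels", min_conf=0.4)` would return a per-image count, assuming the results were saved to the default upstream output folder.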

Results

results-yolo-crowd

Finetune

See ultralytics/yolov5#607. The commands below use the --evolve flag of train.py for hyperparameter evolution.

# Single-GPU
python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve

# Multi-GPU
for i in 0 1 2 3 4 5 6 7; do
  sleep $(expr 30 \* $i) &&  # 30-second delay (optional)
  echo 'Starting GPU '$i'...' &&
  nohup python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --device $i --evolve > evolve_gpu_$i.log &
done

# Multi-GPU bash-while (not recommended)
for i in 0 1 2 3 4 5 6 7; do
  sleep $(expr 30 \* $i) &&  # 30-second delay (optional)
  echo 'Starting GPU '$i'...' &&
  "$(while true; do nohup python train.py... --device $i --evolve 1 > evolve_gpu_$i.log; done)" &
done

Reference

https://github.com/ultralytics/yolov5

https://github.com/deepcam-cn/yolov5-face

https://github.com/open-mmlab/mmdetection

https://github.com/dongdonghy/repulsion_loss_pytorch

Contact

This code is released under the MIT License. Contact us for business inquiries or professional support requests.
