VFP290K: A Large-Scale Benchmark Dataset for Vision-based Fallen Person Detection

This repository is the official documentation & implementation of VFP290K: A Large-Scale Benchmark Dataset for Vision-based Fallen Person Detection.


Requirements

Our pretrained models, except for YOLO, are based on the MMDetection v2 detection framework. You can download COCO-pretrained models for transfer learning.

Download our VFP290K dataset here: VFP290K.

Environments

CUDA=11.1
python=3.7
CUDNN=7.6.5

MMDetection-based models

1. Install PyTorch

We use pytorch==1.8.0, available from this link.
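
For example, a pip install along these lines should match the CUDA 11.1 environment above (wheel names taken from the PyTorch previous-versions page; double-check against the link):

pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html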

2. Install MMDetection & pkgs

pip install openmim
mim install mmdet
pip install future tensorboard
pip install -r requirements.txt

3. Prepare all preprocessed data for training and inference.

python make_label.py --data_root_dir <VFP directory>

4. Download the checkpoint files

You can find the checkpoint files in the official repository and put them into ./checkpoints, e.g. as sketched below.
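
For example (the filename is only a placeholder; use the actual checkpoint files linked in the repository):

mkdir -p checkpoints
mv <downloaded_checkpoint>.pth ./checkpoints/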

5. Set a config file.

We provide all config files used in our experiments in "configs/VFP290K".
Set your labels.txt path and VFP290K data directory, e.g. as in the sketch below:
(classes = "<YOUR labels.txt DIRECTORY>", data_root = "<YOUR DATA DIRECTORY>")
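
As a rough sketch, the edited lines in a config such as configs/VFP290K/faster_rcnn_r50_1x_benchmark.py would look like this (the paths are placeholders for your own setup):

classes = "<YOUR labels.txt DIRECTORY>"   # path to your labels.txt
data_root = "<YOUR DATA DIRECTORY>"       # root directory of the VFP290K data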

6. Run the benchmark or your desired experiment

To train and evaluate the model(s) in the paper, run the following commands:

  • Single-GPU training

    python tools/train.py <config> --gpu-ids <device> 
    

    where <config> is the path of the config file and <device> is the GPU id. This example trains a Faster R-CNN model on GPU 0:
    ex) python tools/train.py configs/VFP290K/faster_rcnn_r50_1x_benchmark.py --gpu-ids 0

  • Multi-GPU training

    bash ./tools/dist_train.sh <config> <num_gpu> 
    

    <num_gpu> is the number of GPUs to use. This example trains a Faster R-CNN model with 4 GPUs:
    ex) bash ./tools/dist_train.sh configs/VFP290K/faster_rcnn_r50_1x_benchmark.py 4

  • Testing: after training the model, you can evaluate the result.

    python tools/test.py <config> <weight> --eval bbox --gpu-ids <device>
    

    <weight> is the path of the trained model weights.
    ex) python tools/test.py configs/VFP290K/faster_rcnn_r50_1x_benchmark.py work_dirs/faster_rcnn_r50_1x_benchmark/latest.pth --eval bbox --gpu-ids 1

YOLOv5

1. Generate .txt label files for YOLO.

cd yolov5
python configs/data_refactoring.py --data_root_dir <{YOUR DATA ROOT DIRECTORY}/yolov5>

2. Change the configuration.

e.g., configs/benchmark.yaml:
train: /media/data1/VFP290K/VFP290K/yolov5/benchmark/train/image
val: /media/data1/VFP290K/VFP290K/yolov5/benchmark/val/image

3. Training.

The training process is exactly the same as in the official YOLOv5 code, e.g.:

python train.py --img-size 640 --epochs 100 --data configs/benchmark.yaml --batch-size 48 --cfg ./models/yolov5x.yaml --device 0,1 --workers 8 

You can train your model by using this script.

4. Testing.

After training, run the test process to evaluate the model.

python test.py --weights runs/train/exp<your_exp_num>/weights/best.pt --data data/test.yaml --batch-size 48 --img-size 640 --conf-thres 0.5 --iou-thres 0.5 --device 0,1

Results

Our models achieve the following performance on the benchmark:

| Model | Method | mAP | AP_50 | AP_75 |
|---|---|---|---|---|
| Faster R-CNN | Two-Stage | 0.732 | 0.873 | 0.799 |
| Cascade R-CNN | Two-Stage | 0.751 | 0.874 | 0.811 |
| DetectoRS | Two-Stage | 0.746 | 0.866 | 0.797 |
| RetinaNet | One-Stage | 0.750 | 0.910 | 0.811 |
| YOLOv3 | One-Stage | 0.590 | 0.813 | 0.670 |
| YOLOv5 | One-Stage | 0.741 | 0.838 | 0.784 |
| DETR | Transformer-based | 0.605 | 0.868 | 0.687 |

Our models achieve the following performance in the background experiments (training on one background type and testing on each):

Columns read Train→Test (e.g. Park→Street means trained on Park, tested on Street), grouped by test background.

| Backbone | Metric | Street→Street | Park→Street | Building→Street | Street→Park | Park→Park | Building→Park | Street→Building | Park→Building | Building→Building |
|---|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN | mAP | 0.742 | 0.732 | 0.616 | 0.620 | 0.706 | 0.517 | 0.748 | 0.847 | 0.702 |
| Faster R-CNN | AP_50 | 0.910 | 0.860 | 0.828 | 0.786 | 0.857 | 0.705 | 0.876 | 0.957 | 0.821 |
| Faster R-CNN | AP_75 | 0.829 | 0.809 | 0.723 | 0.690 | 0.768 | 0.588 | 0.813 | 0.920 | 0.791 |
| RetinaNet | mAP | 0.770 | 0.743 | 0.654 | 0.664 | 0.737 | 0.587 | 0.828 | 0.851 | 0.804 |
| RetinaNet | AP_50 | 0.922 | 0.861 | 0.811 | 0.830 | 0.888 | 0.752 | 0.932 | 0.960 | 0.915 |
| RetinaNet | AP_75 | 0.843 | 0.804 | 0.730 | 0.720 | 0.791 | 0.647 | 0.901 | 0.918 | 0.875 |
| YOLOv3 | mAP | 0.610 | 0.510 | 0.284 | 0.416 | 0.537 | 0.282 | 0.610 | 0.664 | 0.671 |
| YOLOv3 | AP_50 | 0.817 | 0.664 | 0.400 | 0.578 | 0.759 | 0.421 | 0.817 | 0.824 | 0.831 |
| YOLOv3 | AP_75 | 0.689 | 0.600 | 0.336 | 0.468 | 0.632 | 0.315 | 0.689 | 0.784 | 0.790 |
| YOLOv5 | mAP | 0.669 | 0.671 | 0.226 | 0.398 | 0.692 | 0.209 | 0.675 | 0.802 | 0.606 |
| YOLOv5 | AP_50 | 0.783 | 0.745 | 0.335 | 0.465 | 0.776 | 0.335 | 0.743 | 0.848 | 0.707 |
| YOLOv5 | AP_75 | 0.729 | 0.719 | 0.266 | 0.428 | 0.727 | 0.266 | 0.727 | 0.836 | 0.679 |

Our models achieve the following performance in the light condition and camera height experiments:

Columns read Train→Test; the first four columns are the light-condition experiments (Day/Night) and the last four are the camera-height experiments (Low/High).

| Backbone | Metric | Day→Day | Night→Day | Day→Night | Night→Night | Low→Low | High→Low | Low→High | High→High |
|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN | mAP | 0.767 | 0.632 | 0.523 | 0.559 | 0.700 | 0.573 | 0.561 | 0.729 |
| Faster R-CNN | AP_50 | 0.917 | 0.826 | 0.714 | 0.783 | 0.898 | 0.760 | 0.749 | 0.896 |
| Faster R-CNN | AP_75 | 0.843 | 0.808 | 0.572 | 0.609 | 0.808 | 0.669 | 0.636 | 0.817 |
| RetinaNet | mAP | 0.779 | 0.667 | 0.534 | 0.566 | 0.702 | 0.610 | 0.596 | 0.739 |
| RetinaNet | AP_50 | 0.932 | 0.856 | 0.747 | 0.785 | 0.903 | 0.818 | 0.780 | 0.909 |
| RetinaNet | AP_75 | 0.848 | 0.741 | 0.567 | 0.620 | 0.792 | 0.695 | 0.669 | 0.817 |
| YOLOv3 | mAP | 0.615 | 0.432 | 0.299 | 0.415 | 0.567 | 0.375 | 0.349 | 0.563 |
| YOLOv3 | AP_50 | 0.874 | 0.630 | 0.545 | 0.635 | 0.808 | 0.606 | 0.530 | 0.800 |
| YOLOv3 | AP_75 | 0.728 | 0.490 | 0.306 | 0.451 | 0.678 | 0.414 | 0.394 | 0.653 |
| YOLOv5 | mAP | 0.794 | 0.343 | 0.392 | 0.414 | 0.590 | 0.412 | 0.350 | 0.718 |
| YOLOv5 | AP_50 | 0.888 | 0.447 | 0.517 | 0.561 | 0.752 | 0.542 | 0.448 | 0.843 |
| YOLOv5 | AP_75 | 0.842 | 0.384 | 0.416 | 0.442 | 0.680 | 0.465 | 0.394 | 0.781 |

Contributing

This repository is licensed under the GPLv3 license.

About

This repository accompanies the *VFP290K: A Large-Scale Benchmark Dataset for Vision-based Fallen Person Detection* paper published at NeurIPS 2021 (Datasets and Benchmarks Track).
