Drone Human Detection & Counting System using VisDrone

This repository implements a complete computer vision pipeline for detecting and counting humans and cars in drone/aerial imagery using the VisDrone dataset and Ultralytics YOLO.

The project is designed for an AI/ML internship technical assessment and covers:

Dataset understanding and preprocessing.
VisDrone to YOLO conversion for a 2-class task.
YOLO training/fine-tuning.
Human and car detection.
Per-image and per-frame human counting.
Bounding-box visualization.
Evaluation with precision, recall, mAP, and speed.
Optional tracking with ByteTrack or BoT-SORT.

Dataset

Local dataset path:

D:\Antlings\archive\VisDrone_Dataset

Expected VisDrone-style folders include:

VisDrone2019-DET-train\images
VisDrone2019-DET-train\annotations or labels
VisDrone2019-DET-val\images
VisDrone2019-DET-val\annotations or labels
VisDrone2019-DET-test-dev\images
VisDrone2019-DET-test-dev\annotations or labels
VisDrone2019-DET-test-challenge\images

The converter supports both raw VisDrone annotation rows and YOLO-style VisDrone labels. For this assessment, classes are remapped to:

0: human  <- VisDrone pedestrian + people/person categories
1: car    <- VisDrone car category only

All other categories are dropped. Ignored regions and boxes with zero or negative width or height are skipped, and boxes that extend past the image bounds are clipped to the image.
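
The remapping and box filtering described above can be sketched as follows. This is a minimal illustration, not the code in convert_visdrone_to_yolo.py: the function name `visdrone_row_to_yolo` and the `CLASS_MAP` constant are hypothetical. It assumes the raw VisDrone DET row layout `bbox_left,bbox_top,bbox_width,bbox_height,score,object_category,truncation,occlusion`, where category 1 is pedestrian, 2 is people, and 4 is car, and where a score of 0 marks a box that should be ignored.

```python
from typing import Optional

# Assumed VisDrone DET category IDs -> 2-class IDs used in this project.
# 1 = pedestrian, 2 = people, 4 = car; every other category is dropped.
CLASS_MAP = {1: 0, 2: 0, 4: 1}


def visdrone_row_to_yolo(row: str, img_w: int, img_h: int) -> Optional[str]:
    """Convert one raw VisDrone row to a YOLO label line, or None to skip it."""
    # Some annotation files end rows with a trailing comma; drop empty fields.
    fields = [f for f in row.strip().split(",") if f != ""]
    if len(fields) < 6:
        return None  # malformed row
    try:
        left, top, width, height, score, category = (int(float(v)) for v in fields[:6])
    except ValueError:
        return None  # non-numeric field
    if score == 0 or category not in CLASS_MAP:
        return None  # ignored box or unused category
    # Clip the box to the image, then skip it if nothing is left.
    x1, y1 = max(0, left), max(0, top)
    x2, y2 = min(img_w, left + width), min(img_h, top + height)
    if x2 <= x1 or y2 <= y1:
        return None
    # YOLO format: class x_center y_center width height, normalized to [0, 1].
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    bw = (x2 - x1) / img_w
    bh = (y2 - y1) / img_h
    return f"{CLASS_MAP[category]} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}"
```

For example, the row `10,20,30,40,1,4,0,0` in a 100x200 image becomes a class-1 (car) line, while a bicycle row (category 3) returns None.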

Repository Structure

D:\Antlings
  README.md
  REPORT.md
  requirements.txt
  .gitignore
  configs\
    dataset.yaml
    train.yaml
  src\
    convert_visdrone_to_yolo.py
    explore_dataset.py
    train.py
    infer.py
    evaluate.py
    visualize.py
    utils.py
    track.py
  notebooks\
    01_dataset_understanding.ipynb
  outputs\
    samples\
    predictions\
    metrics\
    tracking\
  runs\
    .gitkeep

Installation

Run from Windows PowerShell:

cd D:\Antlings

python -m venv .venv

.venv\Scripts\activate

pip install -r requirements.txt

Full Demo Workflow

cd D:\Antlings

python -m venv .venv

.venv\Scripts\activate

pip install -r requirements.txt

python src\convert_visdrone_to_yolo.py --dataset-root "D:\Antlings\archive\VisDrone_Dataset" --output-root "D:\Antlings\data\visdrone_human_car"

python src\explore_dataset.py --dataset-root "D:\Antlings\data\visdrone_human_car" --output-dir "D:\Antlings\outputs\samples"

python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolov8n.pt --epochs 50 --imgsz 640 --batch 16 --project "D:\Antlings\runs\train" --name human_car_yolo

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\data\visdrone_human_car\images\val" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

python src\evaluate.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --data "D:\Antlings\configs\dataset.yaml" --output-dir "D:\Antlings\outputs\metrics"

Convert VisDrone Annotations

Default conversion:

python src\convert_visdrone_to_yolo.py --dataset-root "D:\Antlings\archive\VisDrone_Dataset" --output-root "D:\Antlings\data\visdrone_human_car"

Custom paths:

python src\convert_visdrone_to_yolo.py --dataset-root "PATH_TO_VISDRONE" --output-root "PATH_TO_OUTPUT"

Generated dataset:

D:\Antlings\data\visdrone_human_car\images\train
D:\Antlings\data\visdrone_human_car\images\val
D:\Antlings\data\visdrone_human_car\images\test
D:\Antlings\data\visdrone_human_car\labels\train
D:\Antlings\data\visdrone_human_car\labels\val
D:\Antlings\data\visdrone_human_car\labels\test

The script also writes:

D:\Antlings\configs\dataset.yaml
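
As a rough guide to what the generated file might contain, the sketch below builds dataset YAML text in the layout Ultralytics reads (`path`, `train`, `val`, `test`, `names`). The helper name `build_dataset_yaml` is hypothetical, and the exact contents written by the converter may differ.

```python
def build_dataset_yaml(output_root: str) -> str:
    """Return dataset YAML text for the 2-class human/car dataset."""
    # Forward slashes avoid backslash-escape surprises in YAML on Windows.
    root = output_root.replace("\\", "/")
    return (
        f"path: {root}\n"
        "train: images/train\n"
        "val: images/val\n"
        "test: images/test\n"
        "names:\n"
        "  0: human\n"
        "  1: car\n"
    )


if __name__ == "__main__":
    print(build_dataset_yaml(r"D:\Antlings\data\visdrone_human_car"))
```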

Explore the Dataset

python src\explore_dataset.py --dataset-root "D:\Antlings\data\visdrone_human_car" --output-dir "D:\Antlings\outputs\samples"

Outputs:

outputs\samples\class_distribution.png
outputs\samples\objects_per_image_histogram.png
Annotated sample images with bounding boxes and counts.
outputs\metrics\dataset_summary.csv
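
The class distribution and objects-per-image histogram both come down to counting rows in the YOLO label files. A minimal sketch of that counting step, assuming one `.txt` label file per image with the class ID as the first field on each row (the function name `summarize_labels` is hypothetical and not taken from explore_dataset.py):

```python
from collections import Counter
from pathlib import Path

CLASS_NAMES = {0: "human", 1: "car"}


def summarize_labels(label_dir):
    """Count objects per class and objects per image from YOLO .txt labels."""
    per_class = Counter()
    per_image = {}
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        # Each non-empty row is one object: "class xc yc w h".
        rows = [ln.split() for ln in label_file.read_text().splitlines() if ln.strip()]
        per_image[label_file.stem] = len(rows)
        for row in rows:
            per_class[CLASS_NAMES.get(int(row[0]), "unknown")] += 1
    return per_class, per_image
```

The `per_class` counter feeds a bar chart, and the values of `per_image` feed a histogram.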

Train YOLO

python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolov8n.pt --epochs 50 --imgsz 640 --batch 16 --project "D:\Antlings\runs\train" --name human_car_yolo

Default model: yolov8n.pt, chosen because it is lightweight and suitable for Colab or modest GPUs. When hardware allows, you can switch to a larger model (yolov8s.pt) or a newer-generation one (yolo11n.pt):

python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolov8s.pt --epochs 50
python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolo11n.pt --epochs 50

Best weights are saved under:

D:\Antlings\runs\train\human_car_yolo\weights\best.pt
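
For orientation, a training entry point with these flags could look roughly like the sketch below. It is not the actual src\train.py: the defaults are assumed from the commands shown in this README, and `build_parser` and `main` are illustrative names. The `YOLO(...)` constructor and `model.train(...)` keyword arguments are the standard Ultralytics Python API.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line flags mirroring the training commands in this README."""
    parser = argparse.ArgumentParser(description="Fine-tune YOLO on the human/car subset")
    parser.add_argument("--data", required=True, help="path to dataset.yaml")
    parser.add_argument("--model", default="yolov8n.pt")
    parser.add_argument("--epochs", type=int, default=50)
    parser.add_argument("--imgsz", type=int, default=640)
    parser.add_argument("--batch", type=int, default=16)
    parser.add_argument("--project", default="runs/train")
    parser.add_argument("--name", default="human_car_yolo")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported lazily so that argument parsing works without the package installed.
    from ultralytics import YOLO

    model = YOLO(args.model)  # loads pretrained weights for fine-tuning
    model.train(
        data=args.data,
        epochs=args.epochs,
        imgsz=args.imgsz,
        batch=args.batch,
        project=args.project,
        name=args.name,
    )
```

Calling `main(["--data", "configs/dataset.yaml"])` would start a run with the assumed defaults and write weights under the project/name folder.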

Inference and Human Counting

Run on a folder:

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\data\visdrone_human_car\images\val" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

Run on one image:

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\data\visdrone_human_car\images\val\example.jpg" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

Run on a video:

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\demo_video.mp4" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

Each output image/video displays bounding boxes and a readable counter:

Humans: N  Cars: M

CSV outputs:

outputs\predictions\predictions.csv
outputs\predictions\counts.csv
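
The counting step itself is small: tally the predicted class IDs for one image or frame and format the overlay text. A minimal sketch, assuming class 0 is human and class 1 is car as in the remapping above (the function names are hypothetical, not taken from infer.py):

```python
from collections import Counter

CLASS_NAMES = {0: "human", 1: "car"}


def count_objects(class_ids):
    """Tally detections per class for one image or frame."""
    counts = Counter(
        CLASS_NAMES[int(c)] for c in class_ids if int(c) in CLASS_NAMES
    )
    return {"human": counts["human"], "car": counts["car"]}


def counter_label(counts):
    """Format the overlay text drawn on each output image or frame."""
    return f"Humans: {counts['human']}  Cars: {counts['car']}"
```

With Ultralytics, the class IDs for a result are available as `result.boxes.cls.tolist()`; they arrive as floats, which is why the sketch casts each ID to `int`.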

Evaluate

python src\evaluate.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --data "D:\Antlings\configs\dataset.yaml" --output-dir "D:\Antlings\outputs\metrics"

Outputs:

outputs\metrics\metrics_summary.json
outputs\metrics\metrics_summary.csv
outputs\metrics\evaluation_notes.md

Printed metrics include precision, recall, mAP50, mAP50-95, and, when Ultralytics exposes speed values, an estimated FPS.
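
One way to turn speed values into an FPS estimate is to sum the per-image stage timings and invert. The sketch below assumes a dictionary of per-image timings in milliseconds, which is how Ultralytics validation results typically report `speed` (keys such as `preprocess`, `inference`, `postprocess`); the helper name `estimate_fps` is hypothetical.

```python
def estimate_fps(speed_ms):
    """Estimate frames per second from per-image stage timings in milliseconds.

    Returns None when no usable timings are available.
    """
    if not speed_ms:
        return None
    total_ms = sum(float(v) for v in speed_ms.values())
    return 1000.0 / total_ms if total_ms > 0 else None
```

This is an end-to-end estimate for the measured batch size and hardware, so it should be reported alongside both.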

Optional Tracking

For a demo video:

python src\track.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\demo_video.mp4" --output-dir "D:\Antlings\outputs\tracking" --tracker bytetrack.yaml

The tracking script saves:

Tracked video with IDs.
outputs\tracking\tracking_counts.csv

If tracker dependencies or configs are unavailable, the script prints a clear message explaining how to enable tracking.
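
Tracking adds something per-frame counting cannot: per-frame counts re-count the same person in every frame, while track IDs let you approximate how many distinct people appeared. A minimal sketch of both summaries, assuming each frame has already been reduced to a list of `(track_id, class_id)` pairs (this input format and the function name `summarize_tracks` are assumptions for illustration, not the layout of tracking_counts.csv):

```python
from collections import defaultdict


def summarize_tracks(frames):
    """Return per-frame human counts and the number of unique track IDs per class.

    frames: iterable of per-frame lists of (track_id, class_id) pairs.
    """
    unique_ids = defaultdict(set)
    per_frame = []
    for frame_idx, detections in enumerate(frames):
        humans = sum(1 for _, cls in detections if cls == 0)
        per_frame.append({"frame": frame_idx, "humans": humans})
        for track_id, cls in detections:
            if track_id is not None:  # untracked detections carry no ID
                unique_ids[cls].add(track_id)
    return per_frame, {cls: len(ids) for cls, ids in unique_ids.items()}
```

Unique-ID totals are an upper-leaning estimate, since ID switches under occlusion can assign one person several IDs.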

Demo Video Script Outline

Use this structure for a 3-5 minute demo:

Project goal: detect humans and cars from drone images and count humans.
Dataset: show D:\Antlings\archive\VisDrone_Dataset, annotation format, and class remapping.
Preprocessing: run or explain convert_visdrone_to_yolo.py and show configs\dataset.yaml.
Dataset understanding: show class distribution, object-count histogram, and sample annotated images.
Training: show the YOLO training command and explain why yolov8n.pt is used.
Inference: run infer.py, show predicted images, bounding boxes, and Humans: N.
Evaluation: show metrics JSON/CSV and discuss precision, recall, mAP, speed, limitations.
Bonus: briefly show tracking output if a video is available.

Notes for Assessment Discussion

Strengths:

End-to-end pipeline from dataset conversion to inference and evaluation.
Clear human counting logic from detection class IDs.
Works with raw VisDrone annotations and YOLO-style VisDrone label archives.
Uses a small YOLO model by default for practical training and demo speed.

Limitations:

Small aerial humans are difficult, especially under occlusion.
Counting accuracy depends on detection recall.
Dense scenes can create missed detections and duplicate boxes.
The optional tracker is most useful on video, not independent still images.
