Hamza-32/Antlings

Drone Human Detection & Counting System using VisDrone

This repository implements a complete computer vision pipeline for detecting and counting humans and cars in drone/aerial imagery using the VisDrone dataset and Ultralytics YOLO.

The project is designed for an AI/ML internship technical assessment and covers:

  • Dataset understanding and preprocessing.
  • VisDrone to YOLO conversion for a 2-class task.
  • YOLO training/fine-tuning.
  • Human and car detection.
  • Per-image and per-frame human counting.
  • Bounding-box visualization.
  • Evaluation with precision, recall, mAP, and speed.
  • Optional tracking with ByteTrack or BoT-SORT.

Dataset

Local dataset path:

D:\Antlings\archive\VisDrone_Dataset

Expected VisDrone-style folders include:

VisDrone2019-DET-train\images
VisDrone2019-DET-train\annotations or labels
VisDrone2019-DET-val\images
VisDrone2019-DET-val\annotations or labels
VisDrone2019-DET-test-dev\images
VisDrone2019-DET-test-dev\annotations or labels
VisDrone2019-DET-test-challenge\images

The converter supports both raw VisDrone annotation rows and YOLO-style VisDrone labels. For this assessment, classes are remapped to:

0: human  <- VisDrone pedestrian + people/person categories
1: car    <- VisDrone car category only

All other categories are ignored. Ignored regions and invalid boxes (zero or negative width or height) are skipped, and boxes that extend past the image bounds are clipped to the image.
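The remapping and box filtering described above can be sketched for a single raw annotation row. This is a minimal illustration, not the code in convert_visdrone_to_yolo.py: the function name convert_row is hypothetical, and it assumes the standard raw VisDrone DET row layout (bbox_left, bbox_top, bbox_width, bbox_height, score, object_category, truncation, occlusion) with category IDs 1 = pedestrian, 2 = people, 4 = car. Treating score 0 as an ignored box is also an assumption about how the converter behaves.

```python
from typing import Optional

# Raw VisDrone DET category IDs mapped to this project's two classes.
# 1 = pedestrian, 2 = people -> 0 (human); 4 = car -> 1 (car).
VISDRONE_TO_TARGET = {1: 0, 2: 0, 4: 1}


def convert_row(row: str, img_w: int, img_h: int) -> Optional[str]:
    """Convert one raw VisDrone row to a YOLO label line, or None to skip it."""
    fields = [f for f in row.strip().split(",") if f]  # tolerate a trailing comma
    if len(fields) < 6:
        return None
    left, top, width, height, score, category = (int(float(v)) for v in fields[:6])

    # Assumption: score 0 marks a box that is ignored in the DET ground truth.
    if score == 0 or category not in VISDRONE_TO_TARGET:
        return None

    # Clip the box to the image, then drop it if nothing is left.
    x1, y1 = max(0, left), max(0, top)
    x2, y2 = min(img_w, left + width), min(img_h, top + height)
    if x2 <= x1 or y2 <= y1:
        return None

    # YOLO labels use a normalized center point plus normalized width/height.
    cx, cy = (x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h
    w, h = (x2 - x1) / img_w, (y2 - y1) / img_h
    return f"{VISDRONE_TO_TARGET[category]} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

For example, a pedestrian row "100,200,50,100,1,1,0,0" in a 1000x1000 image becomes a class-0 label centered at (0.125, 0.25), while a bus row (category 9) returns None and is dropped.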

Repository Structure

D:\Antlings
  README.md
  REPORT.md
  requirements.txt
  .gitignore
  configs\
    dataset.yaml
    train.yaml
  src\
    convert_visdrone_to_yolo.py
    explore_dataset.py
    train.py
    infer.py
    evaluate.py
    visualize.py
    utils.py
    track.py
  notebooks\
    01_dataset_understanding.ipynb
  outputs\
    samples\
    predictions\
    metrics\
    tracking\
  runs\
    .gitkeep

Installation

Run from Windows PowerShell:

cd D:\Antlings

python -m venv .venv

.venv\Scripts\activate

pip install -r requirements.txt

Full Demo Workflow

cd D:\Antlings

python -m venv .venv

.venv\Scripts\activate

pip install -r requirements.txt

python src\convert_visdrone_to_yolo.py --dataset-root "D:\Antlings\archive\VisDrone_Dataset" --output-root "D:\Antlings\data\visdrone_human_car"

python src\explore_dataset.py --dataset-root "D:\Antlings\data\visdrone_human_car" --output-dir "D:\Antlings\outputs\samples"

python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolov8n.pt --epochs 50 --imgsz 640 --batch 16 --project "D:\Antlings\runs\train" --name human_car_yolo

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\data\visdrone_human_car\images\val" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

python src\evaluate.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --data "D:\Antlings\configs\dataset.yaml" --output-dir "D:\Antlings\outputs\metrics"

Convert VisDrone Annotations

Default conversion:

python src\convert_visdrone_to_yolo.py --dataset-root "D:\Antlings\archive\VisDrone_Dataset" --output-root "D:\Antlings\data\visdrone_human_car"

Custom paths:

python src\convert_visdrone_to_yolo.py --dataset-root "PATH_TO_VISDRONE" --output-root "PATH_TO_OUTPUT"

Generated dataset:

D:\Antlings\data\visdrone_human_car\images\train
D:\Antlings\data\visdrone_human_car\images\val
D:\Antlings\data\visdrone_human_car\images\test
D:\Antlings\data\visdrone_human_car\labels\train
D:\Antlings\data\visdrone_human_car\labels\val
D:\Antlings\data\visdrone_human_car\labels\test

The script also writes:

D:\Antlings\configs\dataset.yaml
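As a rough guide to what the generated file may contain, the sketch below builds dataset YAML text in the layout Ultralytics expects (path, train, val, test, names). The helper name build_dataset_yaml is hypothetical, and the exact keys the converter writes are an assumption based on the folder layout and class mapping listed above.

```python
def build_dataset_yaml(dataset_root: str) -> str:
    """Return dataset YAML text for the 2-class human/car dataset."""
    # Forward slashes keep Windows paths free of YAML escape problems.
    root = dataset_root.replace("\\", "/")
    lines = [
        f"path: {root}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        "names:",
        "  0: human",
        "  1: car",
    ]
    return "\n".join(lines) + "\n"


print(build_dataset_yaml(r"D:\Antlings\data\visdrone_human_car"))
```

The image folders are given relative to path; Ultralytics locates the matching labels folders by swapping images for labels in each path.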

Explore the Dataset

python src\explore_dataset.py --dataset-root "D:\Antlings\data\visdrone_human_car" --output-dir "D:\Antlings\outputs\samples"

Outputs:

  • outputs\samples\class_distribution.png
  • outputs\samples\objects_per_image_histogram.png
  • Annotated sample images with bounding boxes and counts.
  • outputs\metrics\dataset_summary.csv
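The class distribution and objects-per-image histogram both come from simple counts over the YOLO label files. The sketch below shows one way to compute them; summarize_labels and read_label_dir are hypothetical names, not functions from explore_dataset.py, and plotting is left out.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

CLASS_NAMES = {0: "human", 1: "car"}


def summarize_labels(label_texts: Iterable[str]) -> Tuple[Dict[str, int], List[int]]:
    """Return (objects per class, objects per image) from YOLO label file contents."""
    class_counts: Counter = Counter()
    objects_per_image: List[int] = []
    for text in label_texts:
        rows = [line.split() for line in text.splitlines() if line.strip()]
        objects_per_image.append(len(rows))  # an empty label file counts as 0 objects
        for row in rows:
            class_counts[CLASS_NAMES.get(int(row[0]), "other")] += 1
    return dict(class_counts), objects_per_image


def read_label_dir(label_dir: str) -> List[str]:
    """Load every .txt label file in a folder, one string per image."""
    return [p.read_text() for p in sorted(Path(label_dir).glob("*.txt"))]
```

Calling summarize_labels(read_label_dir(...)) on labels\train would give the numbers behind both plots: the first value feeds the class distribution bar chart, the second the histogram.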

Train YOLO

python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolov8n.pt --epochs 50 --imgsz 640 --batch 16 --project "D:\Antlings\runs\train" --name human_car_yolo

Default model: yolov8n.pt, chosen because it is lightweight and suitable for Colab or modest GPUs. You can use a larger model when hardware allows:

python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolov8s.pt --epochs 50
python src\train.py --data "D:\Antlings\configs\dataset.yaml" --model yolo11n.pt --epochs 50

Best weights are saved under:

D:\Antlings\runs\train\human_car_yolo\weights\best.pt
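For readers who prefer the Python API, the training command corresponds roughly to the sketch below. It is an assumption about how train.py is organized, not a copy of it: the argument defaults mirror the command shown above, and the Ultralytics calls used are YOLO(...) and model.train(...).

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the same flags used in the training command above."""
    parser = argparse.ArgumentParser(description="Train YOLO on the 2-class VisDrone data")
    parser.add_argument("--data", required=True, help="path to dataset.yaml")
    parser.add_argument("--model", default="yolov8n.pt")
    parser.add_argument("--epochs", type=int, default=50)
    parser.add_argument("--imgsz", type=int, default=640)
    parser.add_argument("--batch", type=int, default=16)
    parser.add_argument("--project", default="runs/train")
    parser.add_argument("--name", default="human_car_yolo")
    return parser.parse_args(argv)


def train(args: argparse.Namespace):
    """Fine-tune the chosen checkpoint; best.pt lands in <project>/<name>/weights."""
    from ultralytics import YOLO  # imported lazily so argument parsing works without it

    model = YOLO(args.model)
    return model.train(
        data=args.data,
        epochs=args.epochs,
        imgsz=args.imgsz,
        batch=args.batch,
        project=args.project,
        name=args.name,
    )

# Example: train(parse_args(["--data", "configs/dataset.yaml", "--model", "yolov8s.pt"]))
```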

Inference and Human Counting

Run on a folder:

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\data\visdrone_human_car\images\val" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

Run on one image:

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\data\visdrone_human_car\images\val\example.jpg" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

Run on a video:

python src\infer.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\demo_video.mp4" --output-dir "D:\Antlings\outputs\predictions" --conf 0.25 --save-csv

Each output image/video displays bounding boxes and a readable counter:

Humans: N  Cars: M

CSV outputs:

  • outputs\predictions\predictions.csv
  • outputs\predictions\counts.csv
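The counter is derived directly from the predicted class IDs. A minimal sketch of that logic follows; the helper names are hypothetical rather than taken from infer.py, and class_ids_from_image assumes the Ultralytics predict API, where each result exposes result.boxes.cls.

```python
from collections import Counter
from typing import Dict, Iterable, List

CLASS_NAMES = {0: "human", 1: "car"}


def count_by_class(class_ids: Iterable[float]) -> Dict[str, int]:
    """Count detections per class; unknown class IDs are ignored."""
    counts = Counter(CLASS_NAMES.get(int(c)) for c in class_ids)
    return {"human": counts["human"], "car": counts["car"]}


def overlay_text(counts: Dict[str, int]) -> str:
    """Format the counter text drawn on each output image or frame."""
    return f"Humans: {counts['human']}  Cars: {counts['car']}"


def class_ids_from_image(weights: str, image_path: str, conf: float = 0.25) -> List[float]:
    """Run one image through the model and return the predicted class IDs."""
    from ultralytics import YOLO  # imported lazily; only needed for real inference

    result = YOLO(weights).predict(image_path, conf=conf)[0]
    return result.boxes.cls.tolist()
```

With class IDs [0, 0, 1, 0], count_by_class returns 3 humans and 1 car, and overlay_text renders that as "Humans: 3  Cars: 1". Raising --conf lowers both counts, so the threshold directly affects the reported totals.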

Evaluate

python src\evaluate.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --data "D:\Antlings\configs\dataset.yaml" --output-dir "D:\Antlings\outputs\metrics"

Outputs:

  • outputs\metrics\metrics_summary.json
  • outputs\metrics\metrics_summary.csv
  • outputs\metrics\evaluation_notes.md

The printed metrics include precision, recall, mAP50, and mAP50-95, plus an estimated FPS when Ultralytics exposes per-image speed values.
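One way to derive these numbers from an Ultralytics validation result is sketched below. The function names are hypothetical, and computing FPS from the sum of preprocess, inference, and postprocess time is an assumption about what evaluate.py reports. Ultralytics gives speed as milliseconds per image, and the box metrics used here are metrics.box.mp, mr, map50, and map.

```python
from typing import Mapping, Optional


def estimate_fps(speed_ms: Mapping[str, float]) -> Optional[float]:
    """Convert per-image stage times in milliseconds to frames per second."""
    stages = ("preprocess", "inference", "postprocess")
    total_ms = sum(float(speed_ms.get(stage, 0.0)) for stage in stages)
    return 1000.0 / total_ms if total_ms > 0 else None


def summarize(metrics) -> dict:
    """Collect headline numbers from the object returned by model.val()."""
    return {
        "precision": float(metrics.box.mp),   # mean precision over classes
        "recall": float(metrics.box.mr),      # mean recall over classes
        "mAP50": float(metrics.box.map50),
        "mAP50-95": float(metrics.box.map),
        "fps": estimate_fps(getattr(metrics, "speed", None) or {}),
    }
```

For instance, 1 ms of preprocessing, 8 ms of inference, and 1 ms of postprocessing give 1000 / 10 = 100 FPS. This figure excludes image loading and drawing, so end-to-end throughput in infer.py will be lower.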

Optional Tracking

For a demo video:

python src\track.py --weights "D:\Antlings\runs\train\human_car_yolo\weights\best.pt" --source "D:\Antlings\demo_video.mp4" --output-dir "D:\Antlings\outputs\tracking" --tracker bytetrack.yaml

The tracking script saves:

  • Tracked video with IDs.
  • outputs\tracking\tracking_counts.csv

If tracker dependencies or configs are unavailable, the script prints a clear message explaining how to enable tracking.
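With tracking enabled, each box carries a persistent ID, which makes it possible to count unique objects across a video instead of per frame. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, the contents of tracking_counts.csv may differ, and the Ultralytics calls assumed are model.track(...) with stream=True and result.boxes.id, which is None on frames where nothing is tracked.

```python
from typing import Dict, Iterable, Set

CLASS_NAMES = {0: "human", 1: "car"}


def update_unique_ids(
    seen: Dict[str, Set[int]],
    class_ids: Iterable[float],
    track_ids: Iterable[float],
) -> Dict[str, int]:
    """Record this frame's track IDs and return cumulative unique counts per class."""
    for cls, tid in zip(class_ids, track_ids):
        name = CLASS_NAMES.get(int(cls))
        if name is not None:
            seen.setdefault(name, set()).add(int(tid))
    return {name: len(seen.get(name, ())) for name in CLASS_NAMES.values()}


def track_video(weights: str, source: str, tracker: str = "bytetrack.yaml") -> Dict[str, int]:
    """Track a video and return the number of unique humans and cars seen."""
    from ultralytics import YOLO  # imported lazily; only needed for real tracking

    seen: Dict[str, Set[int]] = {}
    totals = {"human": 0, "car": 0}
    for result in YOLO(weights).track(source=source, tracker=tracker, stream=True):
        if result.boxes.id is None:  # no confirmed tracks on this frame
            continue
        totals = update_unique_ids(seen, result.boxes.cls.tolist(), result.boxes.id.tolist())
    return totals
```

Unique-ID counts are only as stable as the tracker: an ID switch after occlusion counts the same person twice, so these totals are best read as an upper estimate.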

Demo Video Script Outline

Use this structure for a 3-5 minute demo:

  1. Project goal: detect humans and cars in drone images and count the humans in each image or frame.
  2. Dataset: show D:\Antlings\archive\VisDrone_Dataset, annotation format, and class remapping.
  3. Preprocessing: run or explain convert_visdrone_to_yolo.py and show configs\dataset.yaml.
  4. Dataset understanding: show class distribution, object-count histogram, and sample annotated images.
  5. Training: show the YOLO training command and explain why yolov8n.pt is used.
  6. Inference: run infer.py, show predicted images, bounding boxes, and Humans: N.
  7. Evaluation: show metrics JSON/CSV and discuss precision, recall, mAP, speed, limitations.
  8. Bonus: briefly show tracking output if a video is available.

Notes for Assessment Discussion

Strengths:

  • End-to-end pipeline from dataset conversion to inference and evaluation.
  • Clear human counting logic from detection class IDs.
  • Works with raw VisDrone annotations and YOLO-style VisDrone label archives.
  • Uses a small YOLO model by default for practical training and demo speed.

Limitations:

  • Small aerial humans are difficult, especially under occlusion.
  • Counting accuracy depends on detection recall.
  • Dense scenes can create missed detections and duplicate boxes.
  • The optional tracker is most useful on video, not on independent still images.
