An exercise in using machine learning on a Raspberry Pi at the edge to detect objects in images from high-resolution cameras. I tested five ways of feeding a 4K video stream into a fixed YOLO11m model running on a Raspberry Pi 5 with a Hailo8 accelerator. The examples run on a Pi with or without an accelerator, or on a host with or without CUDA.

Motion-based Region-of-Interest (ROI) selection improved F1 score by ~0.26 over full-frame resizing, even after controlling for confidence thresholds, duplicate suppression, and target distance. Target distance dominated performance: far targets reduced F1 by ~0.45 compared to near targets, regardless of method. The takeaway: how you prepare images matters as much as the model itself, especially on constrained hardware.
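The intuition behind motion-based ROI selection is that resizing a full 4K frame down to the model's input size destroys small, distant targets, while cropping a model-sized window around detected motion preserves them at native resolution. Below is a minimal sketch of that idea using OpenCV background subtraction; it is illustrative rather than the project's exact pipeline, and the `MODEL_SIZE`, subtractor parameters, and `motion_roi` helper name are my assumptions.

```python
import cv2
import numpy as np

MODEL_SIZE = 640  # assumed model input size; not necessarily the project's setting

# Background subtractor state persists across frames of the stream.
subtractor = cv2.createBackgroundSubtractorMOG2(history=120, varThreshold=32)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

def motion_roi(frame):
    """Return (crop, (x0, y0)) for a model-sized window around motion, or None."""
    mask = subtractor.apply(frame)
    # MOG2 marks shadows as 127; keep only confident foreground (255).
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # Bounding box of all moving pixels, then a fixed-size crop centered on it,
    # clamped so the crop stays inside the frame.
    x, y, w, h = cv2.boundingRect(np.vstack(contours))
    cx, cy = x + w // 2, y + h // 2
    H, W = frame.shape[:2]
    x0 = min(max(cx - MODEL_SIZE // 2, 0), max(W - MODEL_SIZE, 0))
    y0 = min(max(cy - MODEL_SIZE // 2, 0), max(H - MODEL_SIZE, 0))
    crop = frame[y0:y0 + MODEL_SIZE, x0:x0 + MODEL_SIZE]
    # Detections on `crop` map back to the 4K frame by adding (x0, y0).
    return crop, (x0, y0)
```

Returning the crop offset alongside the pixels is what lets detections be reported in full-frame coordinates, which matters when comparing methods against a single ground truth.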
The environment should look like this:

- Project root directory
  - .venv
  - benchmark_output
  - images (the folder that the Create_GT_Helper script outputs to by default)
  - recordings
  - Ground truth folders, one for each video used for testing. Each folder contains the YOLO label text files, one for each frame in the video (see the label-loading sketch below). Use the Create_GT_Helper script to create a first draft, then use YoloLabel to complete them.
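For reference, each YOLO label file holds one line per object in the standard normalized `class x_center y_center width height` format. A minimal loader that converts those lines to pixel-space boxes might look like the sketch below; the `load_yolo_labels` helper name is hypothetical, not something the project defines.

```python
from pathlib import Path

def load_yolo_labels(txt_path, img_w, img_h):
    """Parse one YOLO label file into [(class_id, x0, y0, x1, y1)] in pixels.

    Each non-empty line is "class x_center y_center width height",
    with coordinates normalized to [0, 1] relative to the image size.
    """
    boxes = []
    for line in Path(txt_path).read_text().splitlines():
        if not line.strip():
            continue
        cls, xc, yc, w, h = line.split()
        xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
        x0 = (xc - w / 2) * img_w
        y0 = (yc - h / 2) * img_h
        boxes.append((int(cls), x0, y0, x0 + w * img_w, y0 + h * img_h))
    return boxes
```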
A graph showing the results of the best-case parameters against a set of videos containing a person and a medium-sized dog at <150 ft (Near), 150 to 300 ft (Mid), and 300 to 400 ft (Far).
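For readers who want to reproduce numbers like these, the sketch below shows one common way to score detections against per-frame ground truth: greedy IoU matching followed by F1 = 2TP / (2TP + FP + FN). The greedy strategy and the 0.5 IoU threshold are my assumptions, not necessarily what the benchmark uses.

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def f1_score(detections, ground_truth, iou_thresh=0.5):
    """Score a single class for one frame.

    detections:   [(x0, y0, x1, y1, confidence)]
    ground_truth: [(x0, y0, x1, y1)]
    Greedy one-to-one matching: highest-confidence detections claim
    ground-truth boxes first, and each box may be matched at most once.
    """
    unmatched_gt = list(ground_truth)
    tp = 0
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        best = max(unmatched_gt, key=lambda g: iou(det[:4], g), default=None)
        if best is not None and iou(det[:4], best) >= iou_thresh:
            tp += 1
            unmatched_gt.remove(best)
    fp = len(detections) - tp
    fn = len(unmatched_gt)
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
```

Accumulating TP/FP/FN across all frames of a video before computing F1, rather than averaging per-frame scores, avoids frames with no objects dominating the result.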