
Snapback — On-device Vision Prototype (OpenCV + MediaPipe + VLM)

A small prototype that runs a real-time desk camera loop and overlays user state.

  • Capture/overlay: OpenCV
  • On-device inference: MediaPipe FaceLandmarker (Tasks API)
  • Coaching layer (optional): Ollama + qwen2.5vl:3b (VLM)

This repo is used as a concrete “on-device vision pipeline” artifact (capture → inference → overlay → latency measurement).


What it does

  • Grabs webcam frames
  • Runs face landmark detection (and derives simple features like EAR)
  • Tracks a coarse state (focused/distracted/etc.)
  • (Optional) calls a VLM periodically and shows a short coaching message
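The coarse state tracking in the loop above can be sketched as follows. This is an illustrative sketch only: the class name, thresholds, and the EAR-based rule are assumptions, not the repo's actual API.

```python
class StateTracker:
    """Coarse focused/distracted state with simple hysteresis.

    Requires `frames_to_flip` consecutive frames of a new candidate
    state before switching, to avoid flicker. Thresholds are
    illustrative; the repo's actual logic may differ.
    """

    def __init__(self, ear_closed=0.21, frames_to_flip=15):
        self.ear_closed = ear_closed
        self.frames_to_flip = frames_to_flip
        self.state = "focused"
        self._counter = 0

    def update(self, ear):
        # Low eye aspect ratio -> eyes closing -> treat as distracted.
        candidate = "distracted" if ear < self.ear_closed else "focused"
        if candidate != self.state:
            self._counter += 1
            if self._counter >= self.frames_to_flip:
                self.state = candidate
                self._counter = 0
        else:
            self._counter = 0
        return self.state
```

The hysteresis counter is the design point: per-frame signals like EAR are noisy, so the state only flips after a sustained run of contrary frames.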

Setup

This project uses uv (recommended).

# inside the repo
uv sync

If you don’t use uv, you can still install from pyproject.toml using your preferred tool.

Note: On first run, detector.py downloads face_landmarker.task from the official MediaPipe model bucket.
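If you prefer to fetch the model yourself, a download step might look like the sketch below. The URL shown is the publicly listed MediaPipe FaceLandmarker model path; detector.py may pin a different variant, so confirm against its source.

```python
import pathlib
import urllib.request

# Publicly listed MediaPipe FaceLandmarker model; the variant detector.py
# actually downloads may differ -- check its source before relying on this.
MODEL_URL = ("https://storage.googleapis.com/mediapipe-models/"
             "face_landmarker/face_landmarker/float16/1/face_landmarker.task")


def ensure_model(path="face_landmarker.task"):
    """Download the model once; skip the download if the file exists."""
    p = pathlib.Path(path)
    if not p.exists():
        urllib.request.urlretrieve(MODEL_URL, p)
    return p
```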


Run

uv run python main.py

Press q to quit.


Benchmark (latency)

A quick benchmark of the FaceLandmarker inference step:

uv run python bench_mediapipe.py --frames 200

It prints average / p50 / p95 milliseconds per frame.
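One way to compute those summary statistics from a list of per-frame timings is sketched below; this is not the repo's actual bench_mediapipe.py code, just a minimal stand-in.

```python
def summarize_ms(samples):
    """Return avg / p50 / p95 for a list of per-frame latencies in ms."""
    s = sorted(samples)

    def pct(p):
        # Nearest-rank percentile on the sorted samples.
        k = max(0, min(len(s) - 1, round(p / 100 * (len(s) - 1))))
        return s[k]

    return {"avg": sum(s) / len(s), "p50": pct(50), "p95": pct(95)}
```

p50 and p95 matter more than the average for an interactive loop: the average hides occasional slow frames, which is exactly what the p95 surfaces.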


Files

  • main.py — OpenCV loop + overlay + glue
  • detector.py — MediaPipe FaceLandmarker (Tasks API) + EAR/head signals
  • bench_mediapipe.py — simple latency benchmark runner
  • vlm_coach.py, vlm.py — VLM layer (Ollama + qwen2.5vl)
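The EAR (eye aspect ratio) signal that detector.py derives is conventionally computed from six eye landmarks using the formula from Soukupová & Čech; a minimal sketch (detector.py's exact landmark indexing may differ):

```python
import math


def ear(p1, p2, p3, p4, p5, p6):
    """Eye aspect ratio from six (x, y) eye landmarks.

    Ordering: p1 = outer corner, p2/p3 = upper lid, p4 = inner corner,
    p5/p6 = lower lid. Near ~0.3 for an open eye, drops toward 0 on blink.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Two vertical eye openings over twice the horizontal eye width.
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))
```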

Notes

  • This is a prototype; it is fine to describe it as a prototype / exploration on a resume.
  • To make it a shareable artifact, add:
    • README run steps (already included)
    • a screenshot under assets/
    • benchmark output (this repo includes a runner)
