Skip to content

Benchmarking potential solutions for faster QR decoding

Notifications You must be signed in to change notification settings

ReproNim/qr_decode_bm

Repository files navigation

QR Video Decoder

High-performance QR code extraction from video files with GPU acceleration support.

Features

  • GPU-accelerated video decoding via NVIDIA NVDEC (ffmpeg CUDA)
  • GPU-accelerated QR detection via QRDet (YOLOv8-based)
  • Multi-threaded processing with producer-consumer pattern
  • Frame skipping for faster processing
  • JSONL output compatible with reprostim tools

Installation

# Clone and setup with uv
cd qr_decode
uv sync

Usage

Recommended Commands

# Fastest configuration (OpenCV native + GPU QR detection, 2 workers) - 145.8 fps
uv run qr_video_decoder.py video.mkv -m opencv --gpu-qr -w 2

# Alternative fast (ffmpeg GPU + GPU QR detection, 2 workers) - 123.8 fps
uv run qr_video_decoder.py video.mkv -m ffmpeg-gpu --gpu-qr -w 2

# Fast CPU-only (OpenCV native + OpenCV QR, 8 workers) - 100.8 fps
uv run qr_video_decoder.py video.mkv -m opencv -w 8

# Output to file
uv run qr_video_decoder.py video.mkv -m opencv --gpu-qr -w 2 -o output.qrinfo.jsonl

# Run benchmark to compare configurations
uv run qr_video_decoder.py video.mkv --benchmark

CLI Options

Option Description
-m, --mode Video decoding mode: ffmpeg-gpu, ffmpeg, gpu, cpu, opencv
-w, --workers Number of QR detection workers (default: 4)
-s, --skip Process every Nth frame (default: 1 = all frames)
-o, --output Output file (default: stdout)
--gpu-qr Enable GPU-accelerated QR detection (YOLOv8-based)
--qr-model QRDet model size: n (nano), s (small), m (medium), l (large)
--video-name Override filename for timestamp parsing (useful for symlinks)
--benchmark Run benchmark comparing all modes

Benchmark Results

Tested on NVIDIA A100-PCIE-40GB with a 1920x1080 @ 60fps video (23,820 frames):

Configuration Time FPS QR Codes
OpenCV native + OpenCV QR (4 workers) 445.8s 53.4 47
OpenCV native + OpenCV QR (8 workers) 236.2s 100.8 47
ffmpeg CPU + OpenCV QR (4 workers) 445.1s 53.5 47
ffmpeg GPU + OpenCV QR (4 workers) 434.2s 54.9 47
ffmpeg CPU + OpenCV QR (8 workers) 244.4s 97.5 47
ffmpeg GPU + OpenCV QR (8 workers) 250.2s 95.2 47
OpenCV native + QRDet GPU (2 workers) 163.4s 145.8 46
ffmpeg GPU + QRDet GPU (1 worker) 276.8s 86.1 46
ffmpeg GPU + QRDet GPU (2 workers) 192.5s 123.7 46
ffmpeg GPU + QRDet GPU (4 workers) 263.3s 90.5 46

Key Findings

  1. Best performance: OpenCV native + QRDet GPU with 2 workers achieves 145.8 fps - 45% faster than the best CPU-only setup
  2. OpenCV native is faster than ffmpeg subprocess: Native OpenCV VideoCapture outperforms ffmpeg piping for video decoding
  3. Optimal worker count for GPU QR: 2 workers is optimal; more workers cause GPU contention
  4. Detection accuracy: QRDet finds slightly fewer codes (46 vs 47) due to different detection thresholds
  5. Video decoding is not the bottleneck: QR detection dominates processing time, not video decoding

Output Format

Output is in JSONL format with QrRecord entries followed by a ParseSummary:

{"type": "QrRecord", "index": 0, "frame_start": 2452, "frame_end": 2520, "isotime_start": "...", "isotime_end": "...", "time_start": 40.867, "time_end": 42.0, "duration": 1.133, "data": {...}}
{"type": "ParseSummary", "qr_count": 47, "parsing_duration": 199700.0, "video_frame_count": 23820, ...}

Requirements

  • Python 3.11+
  • NVIDIA GPU with CUDA support (for GPU acceleration)
  • ffmpeg with CUDA/NVDEC support
  • Dependencies managed via uv (see pyproject.toml)

Dependencies

  • opencv-python-headless - Video processing and QR decoding
  • qrdet - GPU-accelerated QR detection (YOLOv8-based)
  • torch - PyTorch for GPU inference
  • av - PyAV for alternative video decoding
  • numpy - Array operations

This project was developed with Claude Code using Claude Opus 4.5 (claude-opus-4-5-20251101).

About

Benchmarking potential solutions for faster QR decoding

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages