
perf(viewer): pace Qt6 video frame delivery to scene-render capacity#3006

Merged: vpetersson merged 4 commits into master from fix-pi4-video-pacing-2987 on Jun 7, 2026
Conversation

@vpetersson

Contributor

Issues Fixed

Part of #2987 (the Pi 4 symptoms: 50/60 fps clips jerky and "freezing / cut short towards the end").

Description

On Pi 4, 1080p60 H.264 measured 22.6 fps presented (DRM vblank tracepoint) with the GUI thread saturated. Worse, position-ms advanced at only ~0.6× realtime: QtMultimedia fell behind, dropped ~70% of frames before the sink (dropped=480+ per clip), and clips were killed by the slot duration before reaching their end. Root cause: every sink delivery schedules a QML scene render, and at 60 deliveries/s the GUI thread (which sustains only one render every ~40 ms at 1080p with the decoder running) collapses under the backlog.

QMediaPlayer now renders into an intermediate QVideoSink owned by VideoView. Frames forward to the VideoOutput's sink only when the scene graph has composited the previous one (QQuickWindow::afterRendering); a frame arriving mid-render parks in a single-slot mailbox and forwards the instant the render finishes, so renders chain at capacity with at most one frame of latency. If the render signal is ever un-wired, the gate falls back to unpaced forwarding.
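
For readers who want the gate logic in isolation, here is a minimal sketch in plain C++ with no Qt dependency. It is not the code from this PR: `Frame`, `PacingGate`, `onFrame` and `onRenderFinished` are hypothetical stand-ins (for `QVideoFrame`, the VideoView gate, the pacing sink's frame delivery and the `afterRendering` hook respectively), and the member names merely echo the `pendingFrame` / `sceneReadyForFrame` state mentioned in the review summary.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for QVideoFrame; an id is enough for the sketch.
struct Frame { int id; };

// Illustrative render-paced gate with a single-slot mailbox.
class PacingGate {
public:
    explicit PacingGate(std::function<void(const Frame&)> forward,
                        bool renderSignalWired = true)
        : forward_(std::move(forward)), wired_(renderSignalWired) {
        assert(forward_ && "a forwarding callback is required");
    }

    // Called for every frame delivered to the intermediate sink.
    void onFrame(const Frame& f) {
        if (!wired_) {                 // fallback: unpaced forwarding
            forward_(f);
            return;
        }
        if (sceneReadyForFrame_) {     // scene is idle: forward immediately
            sceneReadyForFrame_ = false;
            forward_(f);
        } else {
            pendingFrame_ = f;         // park; newest frame overwrites older one
        }
    }

    // Called once the scene graph has composited the previous frame.
    void onRenderFinished() {
        if (pendingFrame_) {
            Frame f = *pendingFrame_;
            pendingFrame_.reset();
            forward_(f);               // chain the next render; gate stays closed
        } else {
            sceneReadyForFrame_ = true; // nothing parked: re-arm the gate
        }
    }

private:
    std::function<void(const Frame&)> forward_;
    bool wired_;
    bool sceneReadyForFrame_ = true;
    std::optional<Frame> pendingFrame_;
};
```

With three frames delivered during one render, only the first and the newest reach the output; the middle one is overwritten in the mailbox. Constructing the gate with `renderSignalWired = false` models the unpaced fallback, where every frame is forwarded.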

Measured on the Pi 4 testbed (BBB 1080p60 H.264, official-image lineage build 4a30bb6-pi4-64):

| metric | before | after |
| --- | --- | --- |
| position-ms rate | ~0.6× realtime (clips cut short) | 1.0× realtime |
| decoder-side drops per clip | ~500 and climbing | 3 (startup) |
| sink deliveries | ~18/s (pipeline collapse) | 60/s (full rate) |
| presentation | 22.6 fps, increasingly stale frames | ~23 fps, always the freshest frame, even pacing |
| 1080p30 control | 30.0 fps | 30 forwarded/s (unchanged) |

~23 fps is this render architecture's ceiling for 60 fps sources on Pi 4 (render cost rises to ~40 ms while the decoder saturates the memory bus); the user-visible wins are eliminating the slow-motion/cut-short behaviour and the frame staleness. Stats lines gain a frames-forwarded field so the gate is observable in playback-stats.log.

Affects all Qt6 boards (pi3-64/pi4-64/pi5/x86/arm64); validated on pi4-64, the board from the report.

Checklist

  • I have performed a self-review of my own code.
  • New and existing unit tests pass locally and on CI with my changes. (QtTest suite extended: pacing-sink chain + gate behaviour)
  • I have done an end-to-end test for Raspberry Pi devices. (Pi 4 testbed, measurements above; left soaking on the new build)
  • I have tested my changes for x86 devices. (same code path; not separately measured)
  • I have added documentation for the changes I have made (when necessary).

🤖 Generated with Claude Code

vpetersson and others added 2 commits June 6, 2026 21:18
Part of issue #2987: 1080p60 content on Pi 4 presented at 22.6 fps
with the playback position falling to ~0.6x realtime (clips ended
early), because every sink delivery scheduled a QML scene render on
a GUI thread that sustains ~45 renders/s at 1080p — overload made
throughput collapse below even the 30 fps a 1080p30 clip achieves.

QMediaPlayer now renders into an intermediate QVideoSink; frames
forward to the VideoOutput's sink only once the scene graph has
composited the previous one (QQuickWindow::afterRendering). 30 fps
sources pass untouched; 60 fps sources settle into an even ~half
cadence instead of irregular drops. If the render signal is not
wired the gate falls back to unpaced forwarding.

Stats lines gain a frames-forwarded field between frames-delivered
and frames-rendered so the gate is observable in playback-stats.log.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The 1-deep gate was stop-and-wait: render -> re-arm -> idle until the
next sink delivery -> render, measuring only ~23 presented fps on a
GUI thread that renders faster back-to-back. Park the newest frame
that arrives mid-render in a single-slot mailbox and forward it the
moment afterRendering fires, so renders chain at capacity with at
most one frame of latency.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
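
To make the stop-and-wait versus mailbox difference concrete, the toy model below counts forwarded frames for both gate variants. It is a sketch under loud assumptions: render cost is a fixed constant (real render time varies with load), deliveries arrive on a perfect source-rate grid, and the function names are hypothetical. Its outputs illustrate the mechanism only and are not measurements from the testbed.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kUsPerSecond = 1000000;

// Time of the k-th sink delivery on an ideal source-rate grid (assumption).
std::int64_t deliveryTimeUs(std::int64_t k, int sourceFps) {
    assert(sourceFps > 0);
    return k * kUsPerSecond / sourceFps;
}

// 1-deep gate: a frame arriving mid-render is dropped, and the gate idles
// until the next delivery after the render has finished.
int forwardedStopAndWait(int sourceFps, std::int64_t renderUs, int seconds) {
    const std::int64_t endUs = seconds * kUsPerSecond;
    std::int64_t busyUntil = 0;
    int forwarded = 0;
    for (std::int64_t k = 0; deliveryTimeUs(k, sourceFps) < endUs; ++k) {
        const std::int64_t t = deliveryTimeUs(k, sourceFps);
        if (t >= busyUntil) {
            ++forwarded;
            busyUntil = t + renderUs;
        }
    }
    return forwarded;
}

// Mailbox gate: a frame arriving mid-render is parked and forwarded the
// moment the render finishes, so renders chain back-to-back.
int forwardedWithMailbox(int sourceFps, std::int64_t renderUs, int seconds) {
    const std::int64_t endUs = seconds * kUsPerSecond;
    std::int64_t busyUntil = 0;
    bool parked = false;
    int forwarded = 0;
    for (std::int64_t k = 0; deliveryTimeUs(k, sourceFps) < endUs; ++k) {
        const std::int64_t t = deliveryTimeUs(k, sourceFps);
        if (parked && busyUntil <= t) {
            // The render ended at busyUntil with a frame parked: forward it then.
            parked = false;
            ++forwarded;
            busyUntil += renderUs;
        }
        if (t >= busyUntil) {
            ++forwarded;               // scene idle: forward directly
            busyUntil = t + renderUs;
        } else {
            parked = true;             // newest frame replaces any parked one
        }
    }
    if (parked && busyUntil < endUs) ++forwarded;  // flush inside the window
    return forwarded;
}
```

With an assumed 40 ms render cost and a 60 fps source over 10 s, the model forwards 200 frames with stop-and-wait (each render waits for the next delivery boundary) and 250 with the mailbox (renders chain back-to-back). A 30 fps source with a render cost below its frame interval passes untouched in both variants.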
@vpetersson vpetersson requested a review from a team as a code owner June 6, 2026 21:36
@vpetersson vpetersson self-assigned this Jun 6, 2026
@vpetersson vpetersson requested a review from Copilot June 6, 2026 21:36

Copilot AI left a comment


Pull request overview

This PR addresses jerky/high-FPS video playback on Qt6 boards by pacing QtMultimedia frame forwarding to the QML scene graph’s actual render capacity, preventing GUI-thread render backlog collapse and improving playback position accuracy.

Changes:

  • Introduces an intermediate QVideoSink (pacingSink) and a render-gated forwarding path that only forwards the newest frame when the previous one has been composited (afterRendering).
  • Adds a single-slot mailbox (pendingFrame) plus new frames-forwarded stats to make the pacing behavior observable.
  • Extends QtTest coverage to assert the sink chaining and gating/drop behavior under the offscreen test platform.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/anthias_webview/tests/test_videoview.cpp | Adds tests asserting pacing-sink chaining and gate behavior (mailbox/drop until scene renders). |
| src/anthias_webview/src/videoview.h | Adds pacing-gate state (pacingSink, pendingFrame, sceneReadyForFrame) and a forwarded-frame counter. |
| src/anthias_webview/src/videoview.cpp | Implements render-paced forwarding via afterRendering, updates stats logging, and rewires the player to the intermediate sink. |


Comment thread src/anthias_webview/src/videoview.cpp
vpetersson and others added 2 commits June 6, 2026 21:41
A frame parked in the mailbox at stop() time could be forwarded by a
later afterRendering (stale-frame flash on the next reveal) and kept
its decoder buffer alive between assets. Clear the mailbox, re-arm
the gate, and push an empty frame to the VideoOutput so the last
displayed buffer is released too (review feedback).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
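
The stop() fix can be illustrated with the same kind of stand-in model. This is a hedged sketch, not the committed code: `GateSketch` and its members are hypothetical, and `std::nullopt` in `shown` stands for the empty frame pushed to the VideoOutput (in Qt terms, presumably a default-constructed `QVideoFrame` handed to `QVideoSink::setVideoFrame`).

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for QVideoFrame.
struct Frame { int id; };

struct GateSketch {
    std::optional<Frame> pendingFrame;         // single-slot mailbox
    bool sceneReadyForFrame = true;            // gate armed?
    std::vector<std::optional<Frame>> shown;   // what the output sink received

    void onFrame(const Frame& f) {
        if (sceneReadyForFrame) {
            sceneReadyForFrame = false;
            shown.push_back(f);
        } else {
            pendingFrame = f;
        }
    }

    void onRenderFinished() {
        if (pendingFrame) {
            shown.push_back(*pendingFrame);
            pendingFrame.reset();
        } else {
            sceneReadyForFrame = true;
        }
    }

    // On stop: drop the parked frame, re-arm the gate, and push an empty
    // frame so the last displayed buffer is released as well.
    void stop() {
        pendingFrame.reset();
        sceneReadyForFrame = true;
        shown.push_back(std::nullopt);
        assert(!pendingFrame);  // postcondition: nothing left to leak
    }
};
```

Without the reset in stop(), a render finishing after the stop would forward the parked frame, which is the stale-frame flash described above. With it, a late render finds an empty mailbox and the next asset's first frame passes straight through.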
@sonarqubecloud

sonarqubecloud Bot commented Jun 7, 2026


@vpetersson

Contributor Author

Independent on-device re-validation (Pi 4, 2026-06-07)

Re-measured this PR from scratch on a physical Pi 4, with the exact branch point (66ed89c) as the before-image and a PR-head-identical build (be63202, git diff against the PR head for src/anthias_webview is empty) as the after-image. git diff 66ed89c be63202 touches only this PR's 3 files, so nothing else can explain the delta. Presented fps measured kernel-side via the drm_vblank_event_delivered tracepoint (3×10 s windows) rather than the playback log.

1080p60 H.264 (BBB, AC3 audio, 60 s clip)

| metric | before (66ed89c, no gate) | after (this PR) |
| --- | --- | --- |
| wall time to play the 60 s clip | 48.3 s (END_FILE elapsed_ms=48350), early EOS | full 60 s (loops at exact 60 s intervals) |
| playback rate | 1.24× fast-forward | 0.998× realtime (19,950 ms position over 20,000 ms wall) |
| sink deliveries | 74/s (uncontrolled firehose) | 60/s (exactly container rate) |
| fresh video frames on screen | 642/clip = 13.3/s (frames-rendered) | 24.45/s (frames-forwarded = frames-rendered) |
| presented fps (vblank tracepoint) | 15.2 / 16.9 / 19.0; 30–40% of flips re-present a stale frame | 24.1 / 24.2 / 24.1; every flip a fresh frame |

1080p30 control

PR build: exactly 300 vblanks/10 s = 30.0 fps, 1.0× realtime, END_FILE elapsed_ms=59,916–59,928, dropped 1–2 per clip. The baseline also played 30p fine (1,762/1,796 rendered) — confirming the defect is 60 fps-specific.
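
The derived figures quoted above follow from simple ratios, reproduced here as a small sketch. The helper names are hypothetical, and the 60,000 ms nominal clip length is an assumption taken from the "60 s clip" wording rather than a logged value.

```cpp
#include <cassert>

// Ratio of media position advanced to wall-clock time elapsed.
// 1.0 is realtime, above 1.0 is fast-forward, below 1.0 is slow motion.
double playbackRate(double positionMs, double wallMs) {
    assert(wallMs > 0.0);
    return positionMs / wallMs;
}

// Events per second given a count observed over a window in milliseconds.
double perSecond(double count, double windowMs) {
    assert(windowMs > 0.0);
    return count * 1000.0 / windowMs;
}
```

Plugging in the reported numbers: `playbackRate(60000, 48350)` gives the ~1.24× baseline fast-forward, `playbackRate(19950, 20000)` gives 0.9975 (reported as 0.998×), `perSecond(642, 48350)` gives the ~13.3 fresh frames per second, and `perSecond(300, 10000)` gives the 30.0 fps of the 1080p30 control.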

The baseline failure is literally the report in #2987: a 60-second clip "finishing" in 48 s at 13 fresh fps is "jerky and cut short towards the end". A second datapoint on older master (1568e9e) was worse still — the same clip in 54 s, then 44.9 s, down to 10.4 fresh fps.

Two observations beyond the PR description

  1. The collapse direction depends on the active audio sink. The PR body describes ~0.6× slow-motion (measured with the HDMI sink, sysdefault:CARD=vc4hdmi0); in today's runs the device had picked the 3.5 mm jack sink (plughw:CARD=Headphones) and the baseline ran 1.24–1.44× fast instead (early EOS). Both builds used the same sink, so the comparison is fair. Same root cause — the GUI thread swamped by ungated 60/s deliveries breaks QtMultimedia's A/V sync, which then collapses in whichever direction the audio clock pushes. The gate holds 0.998× realtime in both modes.
  2. dropped= reads 0 in the fast-forward failure mode (the frames vanish pre-sink), so that counter cannot detect this in the field. frames-forwarded vs expected — added by this PR — can.

Verdict: the PR does what it claims on the board it claims it for. Device restored to the PR build for continued soak.

@vpetersson vpetersson merged commit e882417 into master Jun 7, 2026
7 checks passed