feat: add vision mixer block with PVW/PGM workflow#463
Merged
Conversation
New broadcast production block for video source selection and transitions.
Takes 2-10 video inputs, outputs a high-res PGM stream and a multiview
monitor with cairo-drawn overlays (borders, labels, clock).
Block architecture:
- Inputs tee'd to two compositors (distribution + multiview)
- GPU (glvideomixer) primary with CPU (compositor) fallback
- Multiview: 2-5-5 grid layout with PVW/PGM large panels and thumbnails
- Cairo overlay draws colored borders, labels, and local wall clock
- PVW/PGM swap on take (standard broadcast workflow)
API:
- POST /api/flows/{id}/blocks/{id}/preview - select preview source
- Reuses existing transition endpoint for take (cut/fade/dip)
- VisionMixerStateChanged WebSocket event for multi-client sync
Web control UI:
- /player/vision-mixer/{flow_id} serves a switcher control page
- WHEP multiview player + PGM/PVW bus buttons + transition controls
- Keyboard shortcuts: 1-9,0 = PVW select, Space = AUTO, X = CUT
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add "Vision Mixer" action button in block inspector (opens control page) - Fix button scaling: use flex layout with number-only labels and hover tooltips - Add connection health indicators (Video/WS status dots with green/yellow/red) - Add video stall detection (frozen frame warning) - Add WHEP auto-reconnect on disconnect - Remove focus outline on buttons to prevent sticky selection ring Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ck to open
Vision mixer improvements:
- DSK (Downstream Keyer): 0-4 graphics overlay inputs on dist compositor
with high zorder, 1-based API (POST .../dsk {"dsk": 1, "enabled": true}),
toggle buttons in web UI with clear ON/OFF state
- Pipeline restructure: multiview PGM big display now fed from tee_pgm
(dist compositor output), so transitions and DSK are visible in multiview
- PVW candidate pads shifted to offset N+1 to accommodate PGM tee pad
- Alpha cleanup during transitions skips DSK pads (index >= num_video_inputs)
- Double-click vision mixer block in graph opens control page in browser
- DSK link order fix: DSK pads created after video inputs for correct indexing
- Push transition type added to web UI
- Default input labels changed to "In 1", "In 2", etc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- FTB (Fade to Black): toggle all mixer pads (video + DSK) to alpha=0, F keyboard shortcut, pulsing red button when active, auto-un-FTB before transitions, "FTB" indicator in multiview PGM via cairo overlay - VisionMixerFtbChanged WebSocket event for multi-client FTB sync - VisionMixerDskChanged WebSocket event for multi-client DSK sync - Fix transition alpha bug: use server's authoritative PGM state instead of client-provided from_input to prevent stale alpha values - Fix button text centering: all buttons use flexbox centering - DSK initial state: buttons correctly show ON at startup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Track DSK enabled states in VisionMixerOverlayState (Vec<AtomicBool>) - Inject ftb_active and dsk_states into page config JSON from overlay state - JS initializes dskEnabled and ftbActive from server state, not hardcoded - Call updateFtbButton() on init to reflect correct FTB button state Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…anup - DSK inputs start disabled (alpha=0) instead of enabled - FTB is now an animated fade (ease-in-out) instead of instant cut, with 500ms default duration in web UI - FTB un-fade respects each DSK's individual enabled state - Transitions auto-cancel FTB server-side, avoiding client race condition - All pad properties (alpha, xpos, ypos, width, height) and control bindings are reset before each transition, fixing push/slide leaving pads off-screen - Capsfilters pin pixel-aspect-ratio=1/1 to prevent autovideoconvert from negotiating non-square pixels on GPU systems - Multiview clock shows timezone abbreviation (e.g. CET), refreshed every 60s to handle DST transitions, with zero heap allocations - Thumbnail layout: video at top of slot, label below, border wraps full slot area Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace unsafe libc::localtime_r with chrono for Windows compatibility. Remove orphaned dsk_comp from CPU pipeline that double-linked mixer.src, making CPU path match GPU path topology. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The cairo-rs crate requires libcairo2-dev at build time, which is not pulled in by the GStreamer dev packages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nversion Add queue_pgm_mv to decouple the distribution and multiview compositors onto separate threads (both GPU and CPU paths). Replace CPU videoconvert with glcolorconvert + GL memory capsfilter in the GPU multiview output chain. This forces BGRA format conversion to happen on the GPU before gldownload, eliminating the per-frame CPU videoconvert. Cairo's ARGB32 maps to GStreamer's BGRA on little-endian. Also update openapi_snapshot.json to match current API schema. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-frame cairooverlay with an appsrc-based overlay that pushes RGBA frames into the MV compositor as a separate GPU-composited pad. The overlay only re-renders on state changes (~1/sec for clock, instant on PGM/PVW/FTB switch), eliminating all per-frame CPU overlay work. Pipeline threading improvements: - Add queues after each input tee (3 per input) to decouple from compositor backpressure - Add queue after mv_comp to decouple from downstream - Add queue for overlay appsrc (solves GL caps negotiation timing) Overlay rendering: - Cairo renders with R/B-swapped colors, producing correct RGBA output without per-pixel byte swapping (1-2ms vs 15-24ms at 720p) - OverlayRenderer pushes via appsrc with push_sample (caps deferred until pipeline PLAYING, matching WHIP input pattern) - 1Hz timer for clock + trigger_overlay_update for instant state changes Bug fixes: - Fix capsfilter lookup in transition reset (capsfilter -> capsfilter_dist) which caused all pads to reset to 1920x1080 after every transition - Fix hardcoded 1920x1080 in TransitionController, now reads from capsfilter_dist dynamically - Extract dist_canvas_size() helper to eliminate resolution fallback duplication, defaults from DEFAULT_PGM_RESOLUTION - Fix resolution fallback in properties::parse_resolution to use the provided default parameter instead of hardcoded 1920x1080 Reduce default compositor latency from 200ms to 20ms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move default latency values (20ms) to strom-types constants (DEFAULT_LATENCY_MS, DEFAULT_MIN_UPSTREAM_LATENCY_MS) instead of hardcoding in builder.rs and definition.rs separately. Lower overlay timing logs from info to debug level. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move all hardcoded magic numbers to strom-types vision_mixer constants: - MV_THUMBNAIL_ZORDER, MV_BIG_DISPLAY_ZORDER, DIST_DSK_BASE_ZORDER, MV_OVERLAY_ZORDER - OVERLAY_FRAMERATE - TIMEZONE_REFRESH_SECS - TRANSITION_KEYFRAMES Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…YFRAMES constant Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…r control UI Overlay text, borders, and padding were hardcoded for 720p. Add a scale factor (canvas_height / 720) and apply it to border widths, label padding, and remove restrictive font size clamps so the overlay scales correctly from tiny canvases up to 4K. Vision mixer control page: remove unused webrtc.css import that leaked body padding (gray border), move DSK buttons below input rails, add minimize mode (M key) with localStorage persistence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extend the vision mixer to support ordered groups of 1-4 sources for both PVW and PGM. Groups are composited as split-screen layouts: fullscreen (1), side-by-side (2), 2-top+1-bottom (3), or 2x2 grid (4). Source groups use a packed AtomicU64 representation for lock-free reads from the overlay renderer thread. The distribution compositor positions each group member at a sub-rectangle, and the multiview PVW display shows the group layout within the PVW area. Frontend: hold G + press number keys to toggle sources in/out of the PVW group. Release G for normal single-select mode. Transitions between groups use cross-fade; single-to-single transitions use the existing optimized path unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a background source layer that sits fullscreen behind PGM group sources at a lower z-order. Visible through gaps in split-screen layouts (pillarboxing, empty quadrants). Background is exclusive: a source used as background cannot be selected for PVW/PGM groups. Controls: hold B + number key to set background (toggle), shown as yellow border + BG badge on multiview thumbnails and control UI buttons. Also resync full vision mixer state (PGM/PVW groups, background, FTB, DSK) from the server on every WebSocket reconnect, preventing stale UI after connection drops. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cairo is now a build dependency for the vision mixer multiview overlay. Add it to README, DEVELOPMENT, CONTRIBUTING, and cross-compile docs. Also add vision mixer to the Features and Blocks lists in README. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The install script downloads pre-built binaries and only needs runtime GStreamer libraries, not development headers and pkg-config files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The GStreamer 1.x libnice plugin package on Fedora is libnice-gstreamer1, not libnice-gstreamer. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
/vision-mixer.html) with keyboard shortcuts, WHEP preview, and WebSocket state synchronizationstrom-types, OpenAPI schema registered, WebSocket events withToSchemaannotationsTest plan
cargo test— all 17 tests pass including OpenAPI snapshot🤖 Generated with Claude Code