
Ship 8b-2a: Yuva420p family scalar prep (Yuva420p / Yuva420p9 / Yuva420p10 / Yuva420p16)#35

Merged
uqio merged 2 commits into main from feat/ship8b-2a-yuva420p-family-scalar
Apr 27, 2026


@uqio uqio commented Apr 27, 2026

Summary

First sub-PR of Ship 8b-2 (4:2:0 source-side YUVA mass-apply). Adds scalar reference + Frame types + walker scaffolding + sinker integration + 7 dispatchers for the 4 Yuva420p formats: Yuva420p (8-bit), Yuva420p9, Yuva420p10, Yuva420p16.

Mirrors PR #32 (Ship 8b-1a) structurally, but covers 4 formats instead of 1. Two sub-PRs follow:

  • 8b-2b: u8 SIMD across all 5 backends + dispatcher wiring + ~25 SIMD equivalence tests
  • 8b-2c: u16 SIMD across all 5 backends + dispatcher wiring + ~25 SIMD equivalence tests

Changes

Strategy B refactor — extend 5 4:2:0 scalar templates with ALPHA_SRC

Each existing yuv_420*_to_rgb_or_rgba_*_row<…, ALPHA> scalar template grows a third const-generic, ALPHA_SRC: bool, plus an a_src: Option<&[…]> parameter. The five templates extended:

| Template (existing) | Used by | New shape | Alpha plane type |
|---|---|---|---|
| yuv_420_to_rgb_or_rgba_row<ALPHA> | Yuv420p u8 | <ALPHA, ALPHA_SRC> | Option<&[u8]> |
| yuv_420p_n_to_rgb_or_rgba_row<BITS, ALPHA> | Yuv420p9/10/12/14 u8 | <BITS, ALPHA, ALPHA_SRC> | Option<&[u16]> |
| yuv_420p_n_to_rgb_or_rgba_u16_row<BITS, ALPHA> | Yuv420p9/10/12/14 u16 | <BITS, ALPHA, ALPHA_SRC> | Option<&[u16]> |
| yuv_420p16_to_rgb_or_rgba_row<ALPHA> | Yuv420p16 u8 | <ALPHA, ALPHA_SRC> | Option<&[u16]> |
| yuv_420p16_to_rgb_or_rgba_u16_row<ALPHA> | Yuv420p16 u16 | <ALPHA, ALPHA_SRC> | Option<&[u16]> |

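The per-pixel store branches on the three valid (ALPHA, ALPHA_SRC) combinations, identical to Ship 8b-1's pattern. The fourth combination (ALPHA_SRC without ALPHA) is rejected at compile time by a const assertion of !ALPHA_SRC || ALPHA. The existing _to_rgb_* / _to_rgba_* public wrappers stay backward-compatible (they pass ALPHA_SRC = false and None).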

5 new public scalar wrappers yuv_420*_to_rgba*_with_alpha_src_row<…> — Strategy B Yuva path consumed by the new dispatchers.
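To make the three-way store concrete, here is a minimal sketch of the pattern, not the crate's actual kernel. It assumes the RGB triples are already converted and only illustrates the store step; `AlphaCombo`, `store_row`, and the three wrapper names are hypothetical stand-ins for the real templates and wrappers.

```rust
/// Compile-time guard (hypothetical helper): sourcing alpha from a plane
/// only makes sense when an alpha channel is written at all.
struct AlphaCombo<const ALPHA: bool, const ALPHA_SRC: bool>;

impl<const ALPHA: bool, const ALPHA_SRC: bool> AlphaCombo<ALPHA, ALPHA_SRC> {
    const VALID: () = assert!(!ALPHA_SRC || ALPHA, "ALPHA_SRC requires ALPHA");
}

/// Hypothetical store step shared by all three output shapes.
/// `rgb` holds already-converted pixels; `a_src` is the source alpha row.
fn store_row<const ALPHA: bool, const ALPHA_SRC: bool>(
    rgb: &[[u8; 3]],
    a_src: Option<&[u8]>,
    out: &mut [u8],
) {
    // Referencing the constant forces the assertion to be evaluated.
    let () = AlphaCombo::<ALPHA, ALPHA_SRC>::VALID;

    let bpp = if ALPHA { 4 } else { 3 };
    let alpha: &[u8] = if ALPHA_SRC {
        a_src.expect("ALPHA_SRC requires an alpha plane")
    } else {
        &[]
    };

    for (x, (px, dst)) in rgb.iter().zip(out.chunks_exact_mut(bpp)).enumerate() {
        dst[..3].copy_from_slice(px);
        if ALPHA_SRC {
            dst[3] = alpha[x]; // (true, true): alpha comes from the source plane
        } else if ALPHA {
            dst[3] = 0xFF; // (true, false): opaque alpha
        }
        // (false, false): RGB only, nothing more to write
    }
}

// Wrappers in the spirit of the public ones: old entry points keep their
// shape by passing ALPHA_SRC = false and None.
fn to_rgb_row(rgb: &[[u8; 3]], out: &mut [u8]) {
    store_row::<false, false>(rgb, None, out);
}

fn to_rgba_row(rgb: &[[u8; 3]], out: &mut [u8]) {
    store_row::<true, false>(rgb, None, out);
}

fn to_rgba_with_alpha_src_row(rgb: &[[u8; 3]], a: &[u8], out: &mut [u8]) {
    store_row::<true, true>(rgb, Some(a), out);
}
```

Because both flags are const generics, each wrapper monomorphizes to a loop with the untaken branches removed, and instantiating `store_row::<false, true>` fails to compile instead of misbehaving at run time.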

Both PR #32 review fixes pre-applied upfront

The Codex review on Ship 8b-1a (PR #32) surfaced two must-fix issues that are now baked into the canonical Strategy B template:

  1. Source alpha must be masked with bits_mask::<BITS>() before depth conversion — try_new admits unchecked u16 samples; without masking an overrange 1024 at BITS=10 would shift to 256 and cast to u8 zero, silently turning over-range alpha into transparent output. Applied to all 5 scalar template extensions in this PR. Pinned by 2 yuva420pN_rgba_overrange_alpha_masked regression tests. (8-bit Yuva420p doesn't need the mask — u8 output already fits the source u8 alpha.)
  2. Sinker process() must wire alpha-drop paths for with_rgb / with_rgb_u16 / with_luma / with_hsv (declared on the generic MixedSinker<F> impl) — initial implementations only wrote RGBA, leaving the others as silent stale-buffer bugs. Applied to all 4 new sinker impls in this PR. Pinned by 4 yuva420pN_with_rgb_alpha_drop_matches_yuv420pN regression tests.
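The arithmetic behind the first fix can be sketched as follows. The PR only names `bits_mask::<BITS>()`; the body shown here (low `BITS` bits set) is an assumption, and `alpha_to_u8` / `alpha_to_native_u16` are hypothetical helpers standing in for the depth-conversion step inside the scalar templates.

```rust
/// Assumed definition: a mask with the low `BITS` bits set.
const fn bits_mask<const BITS: u32>() -> u16 {
    ((1u32 << BITS) - 1) as u16
}

/// Hypothetical u8 path: mask first, then drop the extra low-order bits.
fn alpha_to_u8<const BITS: u32>(a: u16) -> u8 {
    ((a & bits_mask::<BITS>()) >> (BITS - 8)) as u8
}

/// Hypothetical native-depth u16 path: the masked sample is stored as-is,
/// so the output never exceeds the BITS-wide maximum.
fn alpha_to_native_u16<const BITS: u32>(a: u16) -> u16 {
    a & bits_mask::<BITS>()
}
```

With this ordering, every value that reaches depth conversion is inside the declared range, even when `try_new` admitted an unchecked sample such as 0x07FF at BITS=10 (it becomes 0x03FF in the native-depth output). An over-range sample is reduced to its low `BITS` bits; it is not clamped.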

New Frame types (src/frame.rs +785)

  • Yuva420pFrame<'a> (8-bit) — extra a slice + a_stride. Alpha is full-width × full-height (4:2:0 only subsamples chroma).
  • Yuva420pFrame16<'a, const BITS: u32> — const-asserted BITS == 9 || 10 || 16 (FFmpeg has no yuva420p12 / yuva420p14).
  • Type aliases: Yuva420p9Frame, Yuva420p10Frame, Yuva420p16Frame.
  • try_new validates dimensions + plane lengths; try_new_checked (high-bit) additionally validates every active sample range.
  • Yuva420pFrameError + Yuva420pFrame16Error with Yuva420pFrame16Plane::A discriminator.
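As a rough illustration of the geometry these constructors have to validate, the sketch below checks an 8-bit frame whose alpha plane is full-size while chroma is halved with `div_ceil(2)`. All names (`PlaneRef`, `Yuva420Sketch`, `FrameError`, `Plane`) are hypothetical, and the exact rules (stride at least the row width, length at least stride × rows) are assumptions rather than the crate's documented contract.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Plane { Y, U, V, A }

#[derive(Debug, PartialEq, Eq)]
enum FrameError {
    ZeroDimension,
    StrideTooSmall(Plane),
    PlaneTooShort(Plane),
}

/// One plane plus its row stride, in samples.
#[derive(Clone, Copy)]
struct PlaneRef<'a> {
    data: &'a [u8],
    stride: usize,
}

struct Yuva420Sketch<'a> {
    y: PlaneRef<'a>,
    u: PlaneRef<'a>,
    v: PlaneRef<'a>,
    a: PlaneRef<'a>,
    width: usize,
    height: usize,
}

/// Assumed rule: each row fits in the stride and all rows fit in the slice.
fn check(plane: Plane, p: PlaneRef<'_>, row_width: usize, rows: usize) -> Result<(), FrameError> {
    if p.stride < row_width {
        return Err(FrameError::StrideTooSmall(plane));
    }
    if p.data.len() < p.stride * rows {
        return Err(FrameError::PlaneTooShort(plane));
    }
    Ok(())
}

impl<'a> Yuva420Sketch<'a> {
    fn try_new(
        y: PlaneRef<'a>,
        u: PlaneRef<'a>,
        v: PlaneRef<'a>,
        a: PlaneRef<'a>,
        width: usize,
        height: usize,
    ) -> Result<Self, FrameError> {
        if width == 0 || height == 0 {
            return Err(FrameError::ZeroDimension);
        }
        // 4:2:0 halves chroma in both directions, rounding up for odd sizes.
        let (cw, ch) = (width.div_ceil(2), height.div_ceil(2));
        check(Plane::Y, y, width, height)?;
        check(Plane::U, u, cw, ch)?;
        check(Plane::V, v, cw, ch)?;
        // Alpha is not subsampled: same geometry as luma.
        check(Plane::A, a, width, height)?;
        Ok(Self { y, u, v, a, width, height })
    }
}
```

For a 4×3 frame this requires 12 luma samples, 12 alpha samples, and 2×2 chroma samples per plane; a chroma plane sized with `height / 2` would be one row short.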

4 new walker modules (src/yuv/yuva420p{,9,10,16}.rs)

Each with: marker type, Row type carrying the alpha slice, Sink trait, walker fn. Wired into src/yuv/mod.rs.
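The Row / Sink / walker split might look roughly like the sketch below. It is a simplified illustration with hypothetical names, assumes tightly packed 8-bit planes (stride equal to row width), and omits the marker type; the point is only that each chroma row is shared by two luma rows while every luma row gets its own alpha row.

```rust
/// Hypothetical row type: one luma row, the shared half-width chroma rows,
/// and the full-width alpha row for the same line.
struct Yuva420pRowSketch<'a> {
    y: &'a [u8],
    u_half: &'a [u8],
    v_half: &'a [u8],
    a: &'a [u8],
    row: usize,
}

/// Hypothetical sink trait: receives rows in top-to-bottom order.
trait Yuva420pSinkSketch {
    fn process(&mut self, row: Yuva420pRowSketch<'_>);
}

/// Hypothetical walker over tightly packed planes.
fn walk_yuva420p_sketch<S: Yuva420pSinkSketch>(
    y: &[u8],
    u: &[u8],
    v: &[u8],
    a: &[u8],
    width: usize,
    height: usize,
    sink: &mut S,
) {
    let cw = width.div_ceil(2);
    for row in 0..height {
        let c = row / 2; // two luma rows map to one chroma row
        sink.process(Yuva420pRowSketch {
            y: &y[row * width..(row + 1) * width],
            u_half: &u[c * cw..(c + 1) * cw],
            v_half: &v[c * cw..(c + 1) * cw],
            a: &a[row * width..(row + 1) * width],
            row,
        });
    }
}

/// Example sink that records (row index, first U sample, first A sample).
struct Collect(Vec<(usize, u8, u8)>);

impl Yuva420pSinkSketch for Collect {
    fn process(&mut self, r: Yuva420pRowSketch<'_>) {
        self.0.push((r.row, r.u_half[0], r.a[0]));
    }
}
```

Walking a 2×3 frame with this sketch visits three rows: rows 0 and 1 see the first chroma row, row 2 sees the second, and each row carries a distinct alpha slice.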

7 new public dispatchers in src/row/mod.rs

yuva420p_to_rgba_row (u8 only: 8-bit input has no native-u16 RGBA path); yuva420p9/10/16_to_rgba_row + _u16_row (each high-bit variant has both u8 and u16 dispatchers). All seven currently discard the flag with a let _ = use_simd; stub, pending SIMD wiring in 8b-2b/2c. Doc strings include # ⚠ Scalar-only as of Ship 8b-2a headings naming the follow-up sub-PRs.
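The stub pattern can be pictured like this. Both function names are hypothetical, and the "kernel" is a gray-only stand-in rather than a real YUV conversion; the sketch only shows how a dispatcher can accept `use_simd` today while always taking the scalar path.

```rust
/// Stand-in scalar kernel (not a real YUV conversion): writes luma as gray
/// RGB and copies the source alpha into the fourth channel.
fn scalar_gray_rgba_with_alpha(y: &[u8], a: &[u8], rgba_out: &mut [u8]) {
    for ((&luma, &alpha), px) in y.iter().zip(a).zip(rgba_out.chunks_exact_mut(4)) {
        px.copy_from_slice(&[luma, luma, luma, alpha]);
    }
}

/// Hypothetical dispatcher: the signature already carries `use_simd`, so
/// callers do not change when SIMD backends are wired in later.
fn rgba_row_dispatch_sketch(y: &[u8], a: &[u8], rgba_out: &mut [u8], use_simd: bool) {
    // Scalar-only for now: explicitly discard the flag to avoid an
    // unused-variable warning without renaming the parameter.
    let _ = use_simd;
    scalar_gray_rgba_with_alpha(y, a, rgba_out);
}
```

Keeping the parameter in place means a later SIMD sub-PR only has to replace the `let _ = use_simd;` line with a real branch.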

Sinker integration (src/sinker/mixed/yuva_4_2_0.rs +752)

4 MixedSinker<F> impls with full Strategy A combine paths + alpha-drop wiring. The 9/10/16-bit process bodies share a generic yuva420p_high_bit_process<BITS, ...> helper to avoid 3× duplication.

Tests (src/sinker/mixed/tests.rs +701)

29 new sinker tests — gray-to-gray pass-through, opaque/zero alpha, depth conversion, alpha-drop equivalence vs Yuv420pN, Strategy A combine, overrange-alpha masking, buffer-too-short errors. 617 tests pass on host (was 588).

Test plan

  • cargo test --lib: 617 pass on aarch64-darwin, 0 fail
  • cargo check --tests --lib clean across host, x86_64-unknown-freebsd, wasm32-unknown-unknown
  • RUSTFLAGS="-Dwarnings" cargo clippy --lib --tests clean
  • Zero dead_code warnings — every new _with_alpha_src_row wrapper is consumed by a dispatcher

Codex adversarial review

Verdict: approve. No material findings. Quote: "SIMD backend changes are limited to scalar-tail call-site adaptation, while new YUVA420p RGBA paths are explicitly scalar-only."

Out of scope (deferred to follow-up sub-PRs)

  • u8 RGBA SIMD across all 5 backends → Ship 8b-2b
  • u16 RGBA SIMD across all 5 backends → Ship 8b-2c
  • 4:2:2 Yuva family (Yuva422p / Yuva422p9/10/16) → Ship 8b-3 (sinker-only — reuses 4:2:0 dispatchers per existing tranche-6 pattern)
  • Remaining 4:4:4 Yuva variants (Yuva444p / Yuva444p9 / Yuva444p16) → Ship 8b-4

🤖 Generated with Claude Code

@al8n al8n requested a review from Copilot April 27, 2026 12:23
@al8n al8n changed the title from "update" to "Ship 8b-2a: Yuva420p family scalar prep (Yuva420p / Yuva420p9 / Yuva420p10 / Yuva420p16)" Apr 27, 2026

Copilot AI left a comment


Pull request overview

Adds end-to-end support for YUVA 4:2:0 planar sources (8-bit and selected high-bit depths) across the frame validators, YUV walkers, row conversion dispatchers, and MixedSinker output wiring, including source-derived alpha in RGBA outputs.

Changes:

  • Introduces new YUVA420 frame types (Yuva420pFrame, Yuva420pFrame16) and per-format YUV walker modules for 8/9/10/16-bit.
  • Adds scalar row-kernel plumbing and public row dispatchers to convert YUVA420 → RGBA with alpha sourced from the input A plane (u8 and native-depth u16 outputs where applicable).
  • Wires MixedSinker implementations for YUVA420 formats, plus adds extensive test coverage for alpha passthrough/masking and buffer sizing errors.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| src/yuv/yuva420p.rs | New 8-bit YUVA420 walker + row type and sink trait. |
| src/yuv/yuva420p9.rs | New 9-bit YUVA420 walker + row type and sink trait. |
| src/yuv/yuva420p10.rs | New 10-bit YUVA420 walker + row type and sink trait. |
| src/yuv/yuva420p16.rs | New 16-bit YUVA420 walker + row type and sink trait. |
| src/yuv/mod.rs | Registers and re-exports the new YUVA420 modules/types. |
| src/frame.rs | Adds validated YUVA420 frame structs + error types (+ checked constructors for Frame16). |
| src/row/scalar.rs | Extends shared 4:2:0 scalar kernels to optionally source per-pixel alpha from an input plane; adds wrappers. |
| src/row/mod.rs | Adds public YUVA420 → RGBA row dispatchers (scalar-only for now). |
| src/row/arch/* | Updates scalar-tail calls for 8-bit 4:2:0 SIMD implementations to new scalar signature. |
| src/sinker/mixed/mod.rs | Adds RowSlice variants for alpha rows and registers the new YUVA420 sinker module. |
| src/sinker/mixed/yuva_4_2_0.rs | New MixedSinker impls for YUVA420 (8/9/10/16) including RGBA with source alpha and alpha-drop paths. |
| src/sinker/mixed/tests.rs | Adds tests for YUVA420 alpha passthrough/masking + buffer-too-short errors and RGB alpha-drop equivalence. |


Comment thread src/sinker/mixed/yuva_4_2_0.rs Outdated
Comment thread src/frame.rs Outdated
- yuva_4_2_0.rs: replace `impl SourceFormat` APIT with explicit
  `F: SourceFormat` generic on `yuva420p_high_bit_process` for
  consistency with the surrounding generic-heavy template (compiles
  identically; idiomatic style fix).
- frame.rs: correct `Yuva420pFrameError::{U,V}PlaneTooShort` doc
  comments to use `height.div_ceil(2)` so the docs match the
  implementation (`chroma_height = height.div_ceil(2)`), matching the
  already-correct `Yuva420pFrame16Error` siblings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@uqio uqio merged commit 2e88564 into main Apr 27, 2026
43 checks passed
@uqio uqio deleted the feat/ship8b-2a-yuva420p-family-scalar branch April 27, 2026 12:46