feat(be-tier8): BE support for Rgb48/Bgr48/Rgba64/Bgra64/X2Rgb10/X2Bgr10 row kernels #87
Open
…r10 row kernels
Add `<const BE: bool>` to all 6 packed-RGB 16-bit and 10-bit format row kernels (dispatchers, scalars, all 5 arch backends, sinkers) so big-endian pixel sources can decode each format without a separate code path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Tier 8 scalar BE-load helpers used `if BE { x.swap_bytes() } else { x }`,
which swaps whenever the source is BE, regardless of host endianness, and is
therefore wrong on big-endian hosts.
The companion SIMD `load_endian_u16x*` / `load_endian_u32x4` helpers are
target-endian aware (`#[cfg(target_endian = ...)]`), so a host-byte-order
mismatch between scalar and SIMD would corrupt s390x rows and break the
"SIMD matches scalar" parity property the dispatch tests rely on.
Replace the swap-on-BE pattern with the target-endian-aware primitives:
- `if BE { v.swap_bytes() } else { v }` → `if BE { u16::from_be(v) } else { u16::from_le(v) }`
- The fast-path `copy_from_slice` else-branches in `rgb48_to_rgb_u16_row`
and `rgba64_to_rgba_u16_row` are likewise replaced with a per-element
`u16::from_le` loop so the LE source path is also correct on BE hosts.
`from_be`/`from_le` are no-ops when the source byte order matches the host
and a `swap_bytes` otherwise, mirroring the SIMD `load_le_*` / `load_be_*`
semantics and keeping the scalar reference correct on every target.
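The replacement can be sketched as follows. This is a minimal illustration, not the code from the PR: `load_u16`, `load_u16_swap_on_be`, and `u16_row_to_native` are hypothetical names, and the row function only assumes that samples arrive as `u16` values read in host byte order.

```rust
/// Hypothetical helper (name not taken from the PR): convert a u16 that was
/// read from the buffer in host byte order into the sample value, given that
/// the source encoding is big-endian (`BE = true`) or little-endian.
#[inline]
fn load_u16<const BE: bool>(v: u16) -> u16 {
    // `from_be` / `from_le` are no-ops when the source order matches the
    // host and a byte swap otherwise.
    if BE { u16::from_be(v) } else { u16::from_le(v) }
}

/// The replaced pattern, kept only for comparison: it swaps whenever the
/// source is BE, which is correct only on little-endian hosts.
#[inline]
fn load_u16_swap_on_be<const BE: bool>(v: u16) -> u16 {
    if BE { v.swap_bytes() } else { v }
}

/// Hypothetical row kernel: a per-element loop instead of `copy_from_slice`,
/// so the LE source path is also converted on BE hosts.
fn u16_row_to_native<const BE: bool>(src: &[u16], dst: &mut [u16]) {
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = load_u16::<BE>(s);
    }
}
```

For a sample stored as the bytes `[0x12, 0x34]` by a BE source, `load_u16::<true>` yields `0x1234` on either host, while `load_u16_swap_on_be::<true>` yields `0x1234` only when the host is little-endian.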
Note: the X2Rgb10/X2Bgr10 (u32) scalar paths in `packed_rgb.rs` already use
`u32::from_be_bytes` / `u32::from_le_bytes` on raw `&[u8]` input, which are
target-endian aware by definition, so no fix is needed there.
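As a rough illustration of why the byte-array constructors need no fix, the sketch below decodes one packed 32-bit pixel from raw bytes. The helper names are hypothetical, and the channel layout (2 unused bits, then 10 bits each of R, G, B from the most significant end) is an assumption made for this example rather than something stated in the PR.

```rust
/// Hypothetical helper: build the pixel word from four raw bytes.
/// `from_be_bytes` / `from_le_bytes` take the bytes in a stated order,
/// so the result is the same on every host.
#[inline]
fn load_u32<const BE: bool>(px: [u8; 4]) -> u32 {
    if BE { u32::from_be_bytes(px) } else { u32::from_le_bytes(px) }
}

/// Split a word into 10-bit channels.
/// ASSUMPTION: layout is (msb) 2X 10R 10G 10B (lsb).
#[inline]
fn unpack_x2rgb10(word: u32) -> (u16, u16, u16) {
    let r = ((word >> 20) & 0x3ff) as u16;
    let g = ((word >> 10) & 0x3ff) as u16;
    let b = (word & 0x3ff) as u16;
    (r, g, b)
}
```

Feeding the same word through `to_be_bytes()` and `load_u32::<true>`, or `to_le_bytes()` and `load_u32::<false>`, returns the original word regardless of the host's byte order.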
Test fixtures (`byte_swap_*` / `to_be_bytes` helpers in `tests/`) are
intentionally left untouched — they synthesise BE-encoded byte buffers
from LE inputs and are correct as-is.
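A fixture of that kind might look like the sketch below. The function names are hypothetical and do not correspond to the helpers in `tests/`; the point is only that serialising with `to_be_bytes` produces the same byte buffer on any host, so it is a valid BE-encoded input for the decoder under test.

```rust
/// Hypothetical fixture helper: serialise samples as big-endian bytes.
fn to_be_byte_buffer(samples: &[u16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_be_bytes()).collect()
}

/// Hypothetical reference decoder used to check the fixture round trip.
fn decode_u16_row<const BE: bool>(bytes: &[u8]) -> Vec<u16> {
    bytes
        .chunks_exact(2)
        .map(|c| {
            let raw = [c[0], c[1]];
            if BE { u16::from_be_bytes(raw) } else { u16::from_le_bytes(raw) }
        })
        .collect()
}
```

Decoding the synthesised buffer with `BE = true` recovers the original samples; decoding it with `BE = false` yields the byte-swapped values.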
Verified:
- `cargo test --target aarch64-apple-darwin --lib` (2159 tests pass)
- `cargo build --target x86_64-apple-darwin --tests` (0 warnings)
- `RUSTFLAGS="-C target-feature=+simd128" cargo build --target wasm32-unknown-unknown --tests`
- `cargo build --no-default-features`
- `cargo fmt --check`
- `cargo clippy --all-targets --all-features -- -D warnings`
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Phase 2: Tier 8 BE rollout. Adds `<const BE: bool>` to all Rgb48 / Bgr48 / Rgba64 / Bgra64 / X2Rgb10 / X2Bgr10 row kernels across all 6 backends + dispatcher.
Implementation:
- `swap_bytes()` for u16 reads (Rgb48/Bgr48/Rgba64/Bgra64) and u32 reads (X2Rgb10/X2Bgr10)
- `load_endian_u16xN::<BE>` / `load_endian_u32xN::<BE>` from the merged BE infra (feat(be-infra): endian-aware SIMD loaders across 5 backends #81)
- `<false>` (sinker BE plumbing deferred to Phase 4)
Test results: 2159 tests pass. `cargo build --all-features`, clippy, fmt all clean.
Test plan
- `cargo test --target aarch64-apple-darwin --all-features`
- `cargo build --target x86_64-apple-darwin --tests`
- `cargo clippy --all-features` (no warnings)
🤖 Generated with Claude Code