PR-X12 A1: CTU carrier + quad-tree partition#170
Conversation
Worker A1 of PR-X12 per .claude/knowledge/pr-x12-codec-x265-design.md
§ "Worker decomposition" line 195. Ships the structural foundation of
the cognitive-cell codec — the carrier type, quad-tree partition with
arena-backed children, leaf-CU mode taxonomy. Subsequent workers
(A2-A8: mode tags, predict, transform, quantise, RDO, rANS, stream)
consume only A1's `Ctu` type + `crate::hpc::linalg::*`.
Surface (`crate::hpc::codec::*`, feature-gated):
pub struct Ctu { block_row, block_col, tier, split_depth, arena }
pub struct CtuArena { Vec<CtuPartition>, capacity 85 (= 1+4+16+64) }
pub enum CtuPartition { Leaf(LeafCu), Split([NodeIdx; 4]) }
pub struct LeafCu { mode, basin_idx, delta?, merge_dir?, escape_idx? }
pub enum CellMode { Skip = 00, Merge = 01, Delta = 10, Escape = 11 }
pub enum MergeDir { North, East, West, South }
pub struct NodeIdx(pub u16);
impl Ctu {
pub fn new_skip(row, col, tier, basin) -> Self
pub fn split(node, depth) -> Result<[NodeIdx;4], SplitError>
pub fn merge(node) -> Result<(), MergeError>
}
Key design choices:
- **Arena-allocated quad-tree** (`CtuArena` over `Vec<CtuPartition>`
with `with_capacity(MAX_QUAD_TREE_NODES = 85)`) — matches the design
doc's stack-arena pattern + PR-X10 invariant 1 (zero-cost
abstractions, no `Box<dyn>` in hot paths). All node references are
`NodeIdx(u16)`, never raw pointers. No further heap allocation can
happen during split — the cap is structural (depth 3 × 4-way
branching = 85 max).
- **Repr-stable `CellMode` / `MergeDir`** — discriminants match the
on-wire 2-bit codes the A7 rANS encoder will emit. Test pins this
ABI (`cell_mode_discriminants_match_wire_codes`).
- **Per-mode `Option<…>` fields on `LeafCu`** — clarity-first; the
fixed per-mode bit budget collapse happens in A7 (out of scope).
- **Constructor pattern** — `LeafCu::{skip, delta, merge, escape}`
enforce "only the field for the current mode is `Some`" at the type
level so consumers can't accidentally produce a `Skip` leaf carrying
a stale `delta: Some(_)`.
- **Split / merge** — split refuses non-leaves (`NotALeaf`) and depth
beyond `MAX_SPLIT_DEPTH` (with cap returned in the error variant);
merge refuses non-Splits (`NotASplit`), inner-Split children
(`ChildNotLeaf`), and heterogeneous-mode children
(`ChildrenDiverge`). Merge does NOT compact the arena — orphaned
child nodes remain; a GC pass is out of scope for A1.
13 tests cover:
new_skip_creates_root_leaf, split_root_yields_four_children,
split_at_max_depth_rejects, split_already_split_node_rejects,
merge_homogeneous_children_collapses, merge_heterogeneous_children_rejects,
merge_split_child_rejects, merge_leaf_rejects,
leaf_constructors_set_correct_fields,
arena_capacity_bound_85 (depth-3 recursive split → exactly 85 nodes),
cell_mode_discriminants_match_wire_codes,
merge_dir_discriminants_match_wire_codes,
node_idx_root_is_zero.
Cargo.toml: new `codec = ["std"]` feature alongside `splat3d`.
src/hpc/mod.rs: `#[cfg(feature = "codec")] pub mod codec;`.
Verified locally:
cargo test -p ndarray --features codec --lib hpc::codec 13 passed
cargo check -p ndarray clean (no-codec build)
cargo fmt --check clean
cargo clippy -p ndarray --features codec -- -D warnings clean
cargo clippy --features approx,serde,rayon -- -D warnings clean
Out of scope (A2-A8 sprints):
- A2 mode.rs — 2-bit mode bit-pack/unpack helpers
- A3 predict.rs — intra/inter prediction
- A4 transform.rs — optional DCT for delta residuals
- A5 quantize.rs — 8-bit scalar quantiser
- A6 rdo.rs — λ-RDO loop
- A7 ans.rs — rANS entropy coder
- A8 stream.rs — byte-stream pack/unpack
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: df963f8804
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if prev.mode != leaf.mode || prev.basin_idx != leaf.basin_idx { | ||
| return Err(MergeError::ChildrenDiverge); |
There was a problem hiding this comment.
Reject merges with differing leaf payloads
merge() only compares mode and basin_idx before collapsing children, so four Delta/Merge/Escape leaves with different delta/merge_dir/escape_idx values are treated as mergeable and one child payload is silently kept while the others are discarded. In any path that merges after child-level mode decisions, this loses encoded information and can produce incorrect reconstruction.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in followup commit. merge() now compares the full LeafCu via PartialEq rather than just mode + basin_idx — any divergence in delta / merge_dir / escape_idx now surfaces as ChildrenDiverge instead of silently dropping payload from three of the four children. Symmetric positive test (merge_identical_delta_payloads_collapses) ensures the path still works for legitimate same-payload merges.
New tests covering each mode:
merge_diverging_delta_payloads_rejectsmerge_diverging_merge_dirs_rejectsmerge_diverging_escape_idx_rejectsmerge_identical_delta_payloads_collapses
Generated by Claude Code
| /// | ||
| /// Panics if `tier == 0`. The cascade tiers are 1-indexed (L1..L4). | ||
| pub fn new_skip(block_row: u16, block_col: u16, tier: u8, basin_idx: u16) -> Self { | ||
| let tier = NonZeroU16::new(tier as u16).expect("Ctu::new_skip: tier must be 1..=4"); |
There was a problem hiding this comment.
Enforce tier upper bound in CTU constructor
Ctu::new_skip documents tier as 1..=4, but the constructor only checks non-zero via NonZeroU16::new, so values 5..=255 are accepted despite the invariant and panic message. This allows invalid tier metadata to enter the codec state and can break downstream assumptions that tiers are limited to four cascade levels.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in followup commit. Added explicit assert!((1..=4).contains(&tier), …) ahead of the NonZeroU16 wrap so values 5..=255 now panic with the documented message instead of silently entering the codec state. The NonZeroU16 conversion remains for the zero check but is now post-assert sugar.
New tests:
new_skip_rejects_tier_5(#[should_panic(expected = "tier must be in 1..=4")])new_skip_rejects_tier_0(same expected message)
Generated by Claude Code
| pub fn split(&mut self, node_idx: NodeIdx, current_depth: u8) -> Result<[NodeIdx; 4], SplitError> { | ||
| if current_depth >= MAX_SPLIT_DEPTH { | ||
| return Err(SplitError::MaxSplitDepthReached(MaxSplitDepthReached { | ||
| depth: current_depth, | ||
| cap: MAX_SPLIT_DEPTH, |
There was a problem hiding this comment.
Validate split depth from tree state
split() enforces MAX_SPLIT_DEPTH using only the caller-provided current_depth; if callers pass an incorrect value (for example always 0), descendants can be split deeper than depth 3 and the code eventually hits the arena assert! panic instead of returning MaxSplitDepthReached. This makes a public API contract easy to violate and turns a recoverable error into a runtime panic.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in followup commit. Dropped the current_depth parameter from split(); the depth is now computed from the tree itself via a new public Ctu::depth_of(target: NodeIdx) -> Option<u8> helper (BFS from root, O(N) where N ≤ 85). Targets unreachable from NodeIdx::ROOT (e.g. orphans left behind by merge) yield the new SplitError::NodeNotReachable variant instead of falling through to the arena overflow assert!.
API change: split(node, depth) → split(node). All in-tree callers (5 inline tests + the recursive_split helper in arena_capacity_bound_85) updated. current_depth was never wired into a public consumer, so this is a pre-1.0 breaking change with zero out-of-tree fallout.
New tests:
split_unreachable_node_rejects— split a merge-orphaned node →NodeNotReachablesplit_at_max_depth_rejects(rewritten) — navigate to a depth-3 leaf via 3 real splits, then verify the 4th fails with the computed depth inMaxSplitDepthReached.depth.
Generated by Claude Code
P1 — merge() loses payload info (line 393):
Prior implementation compared only `mode` + `basin_idx`, so four
Delta children with different δ values (or Merge with different
merge_dir, Escape with different escape_idx) collapsed into ONE
leaf — the diff was silently dropped. Now compares full LeafCu
via PartialEq, surfacing any divergence as ChildrenDiverge.
New tests:
- merge_diverging_delta_payloads_rejects
- merge_diverging_merge_dirs_rejects
- merge_diverging_escape_idx_rejects
- merge_identical_delta_payloads_collapses (symmetric positive)
P2 — Ctu::new_skip accepts tier > 4 (line 306):
Constructor docstring promised `1..=4` but only checked `tier != 0`.
Added explicit `assert!((1..=4).contains(&tier), …)` so values 5..=255
panic at construction instead of polluting downstream state.
New tests:
- new_skip_rejects_tier_5 #[should_panic]
- new_skip_rejects_tier_0 #[should_panic]
P2 — split() trusts caller-supplied depth (line 333):
Prior signature took `current_depth: u8` from the caller; a wrong
value let descendants split past MAX_SPLIT_DEPTH and hit the arena
overflow `assert!` panic instead of returning MaxSplitDepthReached.
Removed the `current_depth` parameter; `split()` now computes the
depth from the tree itself via a new public `Ctu::depth_of(target)`
helper (BFS from root, O(N) where N ≤ 85). Targets unreachable from
ROOT (orphans left behind by merge) return the new
`SplitError::NodeNotReachable` variant.
API change:
pub fn split(node, depth) → pub fn split(node)
All in-tree callers (5 inline tests + the recursive_split helper in
arena_capacity_bound_85) updated. `current_depth` was never wired
into a public consumer, so this is a pre-1.0 breaking change with
zero out-of-tree fallout.
New tests:
- split_unreachable_node_rejects (orphan after merge → NodeNotReachable)
- split_at_max_depth_rejects (rewritten: navigate to depth-3
leaf via 3 real splits, then
verify the 4th fails with the
computed depth in the error)
Verified:
cargo test -p ndarray --features codec --lib hpc::codec 20 passed
cargo fmt --check clean
cargo clippy -p ndarray --features codec -- -D warnings clean
cargo clippy --features approx,serde,rayon -- -D warnings clean
Resolves PR #170 review threads r3272585024, r3272585033, r3272585037.
Summary
PR-X12 Worker A1 per
.claude/knowledge/pr-x12-codec-x265-design.md§ "Worker decomposition" line 195. Ships the structural foundation of the cognitive-cell codec — the CTU carrier type, quad-tree partition with arena-backed children, leaf-CU mode taxonomy. Subsequent workers A2-A8 consume only A1'sCtutype +crate::hpc::linalg::*, so they can run in parallel after this lands.4 files, +666 / -0 vs
master.Surface (
crate::hpc::codec::*, feature-gated)Key design choices
CtuArenaoverVec<CtuPartition>withwith_capacity(MAX_QUAD_TREE_NODES = 85). Matches the design doc's stack-arena pattern + PR-X10 invariant 1 (zero-cost abstractions, noBox<dyn>in hot paths). All node references areNodeIdx(u16), never raw pointers. The 85-node cap is structural (depth 3 × 4-way branching) so no further heap allocation can happen during split.CellMode/MergeDir— discriminants match the on-wire 2-bit codes the A7 rANS encoder will emit. Pinned bycell_mode_discriminants_match_wire_codestest.Option<…>onLeafCu— clarity-first; the fixed per-mode bit budget collapse happens in A7 (out of scope).LeafCu::{skip, delta, merge, escape}enforce "only the field for the current mode isSome" at the type level.split()refuses non-leaves (NotALeaf) and depth pastMAX_SPLIT_DEPTH(with cap returned in the error variant);merge()refuses non-Splits (NotASplit), inner-Split children (ChildNotLeaf), and heterogeneous-mode children (ChildrenDiverge). Merge does NOT compact the arena — orphaned child nodes remain; a GC pass is out of scope for A1.Test coverage (13)
new_skip_creates_root_leafsplit_root_yields_four_childrensplit_at_max_depth_rejectssplit_already_split_node_rejectsmerge_homogeneous_children_collapsesmerge_heterogeneous_children_rejectsmerge_split_child_rejectsmerge_leaf_rejectsleaf_constructors_set_correct_fieldsarena_capacity_bound_85— depth-3 recursive split → exactly 85 nodescell_mode_discriminants_match_wire_codesmerge_dir_discriminants_match_wire_codesnode_idx_root_is_zeroVerified locally
cargo test -p ndarray --features codec --lib hpc::codec— 13 passedcargo check -p ndarray— clean (no-codec build path)cargo fmt --check— cleancargo clippy -p ndarray --features codec -- -D warnings— cleancargo clippy --features approx,serde,rayon -- -D warnings— cleanOut of scope (A2-A8 — follow-up sprints)
mode.rs— 2-bit mode bit-pack/unpack helperspredict.rs— intra/inter predictiontransform.rs— optional DCT for delta residualsquantize.rs— 8-bit scalar quantiser + rate modelrdo.rs— λ-RDO loop + mode selectionans.rs— rANS entropy coder + adaptive freq tablestream.rs— byte-stream pack/unpack + header + frame markersA2-A5 can spawn in parallel once this lands; A6+A7 in parallel after A2-A5; A8 sequential after A7.
🤖 Generated with Claude Code
Generated by Claude Code