Skip to content

feat(codec): PR-X12 A2 mode bit-pack + A3-intra prediction kernel#195

Merged
AdaWorldAPI merged 5 commits into
masterfrom
claude/pr-x12-a2-a3-intra-prediction
May 22, 2026
Merged

feat(codec): PR-X12 A2 mode bit-pack + A3-intra prediction kernel#195
AdaWorldAPI merged 5 commits into
masterfrom
claude/pr-x12-a2-a3-intra-prediction

Conversation

@AdaWorldAPI
Copy link
Copy Markdown
Owner

@AdaWorldAPI AdaWorldAPI commented May 22, 2026

Summary

Two codec sub-cards on top of PR-X12 A1 (Ctu / LeafCu):

A2 — src/hpc/codec/mode.rs (~270 lines, 11 tests)

  • 16-bit header: 2-bit mode (Skip/Merge/Delta/Escape) + 12-bit basin_idx
  • MergeDir 2-bit pack/unpack (high bits masked)
  • Whole-leaf compact pack/unpack — Skip=2B, Merge=3B, Delta=3B, Escape=6B
  • packed_byte_len() const fn for buffer pre-sizing
  • MAX_BASIN_IDX (4095) + BASIN_NONE sentinel
  • Stream roundtrip test exercising 8 mixed-mode leaves into one buffer

A3-intra — src/hpc/codec/predict.rs (~330 lines, 12 tests)

  • IntraContext { basin_idx, delta_i32, [Option<&LeafCu>; 4] NESW }
  • IntraConfig { escape_next_idx: Option<u32> }
  • predict_intra() decision tree: Skip → Merge → Delta → Escape
    (monotone wire cost 2 → 3 → 3 → 6 bytes; cheapest-fit policy)
  • Merge match: same basin_idx + same δ as u8 (sign-tolerant wrapping cast)
  • Escape fallback: lossy i8 clamp when allocator absent (never panics, lossy)
  • End-to-end pack/unpack chain test through the decision

Deferred to follow-up: A3-inter (cross-tier neighbour scan from BlockedGrid L2/L3), A4 transform, A6 RDO, A7 rANS entropy coder, A8 stream framing.

Test plan

  • cargo check --features codec --lib — clean
  • cargo test --features codec --lib hpc::codec45 passed (22 ctu + 11 mode + 12 predict)
  • cargo fmt --all -- --check — clean
  • cargo clippy --features codec --lib -- -D warnings — clean
  • Reviewer: verify the Merge sign-tolerance contract matches the eventual A7 rANS decoder's u8 read (no sign-bit handling on the wire is intentional)
  • Reviewer: confirm the lossy Escape fallback (no allocator → i8 clamp) is acceptable as a misconfiguration guard, vs. returning Result and forcing callers to handle

Generated by Claude Code

Summary by CodeRabbit

  • New Features

    • Expanded codec surface with deterministic intra-prediction (Skip → Merge → Delta → Escape) and public mode/prediction APIs.
    • Compact, stable bit-packing for cell modes and metadata with predictable per-mode encoded sizes.
    • Optional escape allocator with lossy fallback when allocator is absent.
  • Tests

    • Extensive unit and round-trip tests covering packing, decoding, prediction precedence, masking, and buffer-size safety.
  • Documentation

    • Added cross-domain design notes linking codec primitives to other application areas and integration plans.

Review Change Stack

A2 — mode.rs (~270 lines, 11 tests):
- 16-bit header: 2-bit mode (Skip/Merge/Delta/Escape) + 12-bit basin_idx
- MergeDir 2-bit pack/unpack (high bits masked)
- Whole-leaf compact pack/unpack: Skip=2B, Merge=3B, Delta=3B, Escape=6B
- packed_byte_len() const fn for buffer pre-sizing
- MAX_BASIN_IDX (4095) + BASIN_NONE sentinel
- Stream roundtrip test for mixed-mode leaves

A3-intra — predict.rs (~330 lines, 12 tests):
- IntraContext { basin_idx, delta_i32, NESW neighbours }
- IntraConfig { escape_next_idx: Option<u32> }
- predict_intra() decision tree: Skip → Merge → Delta → Escape
  (monotone wire cost 2 → 3 → 3 → 6 bytes; cheapest-fit policy)
- Merge match: same basin_idx + same δ as u8 (sign-tolerant wrapping cast)
- Escape fallback: lossy i8 clamp when allocator absent (never panics)
- End-to-end pack/unpack chain test through the decision

Deferred to follow-up: A3-inter (cross-tier neighbour scan from
BlockedGrid L2/L3), A4 transform, A6 RDO, A7 rANS, A8 stream framing.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d61faf3c-9df4-478c-beaa-22ee708be1c2

📥 Commits

Reviewing files that changed from the base of the PR and between 01c77cc and 2423298.

📒 Files selected for processing (1)
  • src/hpc/codec/mode.rs

📝 Walkthrough

Walkthrough

This PR adds two new modules: mode.rs for compact leaf header/tail packing and roundtrip pack/unpack helpers, and predict.rs for an encoder-side intra prediction decision kernel; mod.rs is updated to expose these modules and re-export their APIs. It also adds cross-domain design docs.

Changes

Codec Leaf Serialization and Intra Prediction

Layer / File(s) Summary
Leaf serialization wire format
src/hpc/codec/mode.rs
Implements 16-bit little-endian header (top 2 bits = CellMode, bottom 12 bits = basin_idx), MAX_BASIN_IDX/BASIN_NONE, pack_header/unpack_header, pack_merge_dir/unpack_merge_dir, pack_leaf/unpack_leaf, packed_byte_len, and comprehensive unit tests including buffer-length validation and stream roundtrips.
Intra prediction mode-decision kernel
src/hpc/codec/predict.rs
Adds IntraContext and IntraConfig and implements predict_intra with deterministic Skip → Merge → Delta → Escape logic (i8-fit gating, wrapping-cast delta byte, NEWS neighbor scan for Merge, optional Escape cursor with post-increment, lossy clamp fallback). Includes merge_dir_from_index, is_no_basin, and thorough tests and pack/unpack roundtrips.
Codec module API exposure
src/hpc/codec/mod.rs
Updates module docs, declares pub mod mode; and pub mod predict;, and re-exports mode packing/unpacking helpers and constants plus predict APIs/types (IntraContext, IntraConfig, predict_intra, is_no_basin).
Design & knowledge docs
.claude/knowledge/*
Adds pr-x12-cross-domain-synergies.md with cross-domain mapping, plans, explorations, and debt; updates pr-x12-codec-x265-design.md with a see-also reference.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble headers, pack them neat,

tails tucked in bytes both small and sweet,
Skip hops first, Merge joins the crew,
Delta wraps, Escape adds its cue,
A leaf encoded — a rabbit’s view.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately summarizes the main changes: adding A2 mode bit-packing and A3 intra prediction kernel to the codec module, matching the core deliverables in src/hpc/codec/mode.rs and src/hpc/codec/predict.rs.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/pr-x12-a2-a3-intra-prediction

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/hpc/codec/mode.rs`:
- Around line 58-64: The public constants and functions (MAX_BASIN_IDX,
BASIN_NONE, unpack_header, pack_merge_dir, unpack_merge_dir, unpack_leaf,
packed_byte_len and the other public items referenced around lines 84-89,
105-116, 182-221) are missing doctest examples; add short /// ```rust examples
that demonstrate basic usage and assertions so they compile as doctests (or use
no_run when appropriate). For each symbol, add one-line examples showing
input->output (e.g., assert_eq!(MAX_BASIN_IDX, (1<<12)-1);
assert_eq!(BASIN_NONE, MAX_BASIN_IDX); call unpack_header with a sample packed
header and assert fields; call pack_merge_dir/unpack_merge_dir as a roundtrip
and assert equality; call unpack_leaf and packed_byte_len with a minimal valid
buffer and assert expected lengths). Ensure examples import any types needed and
are small and self-contained so rustdoc tests pass.
- Around line 153-156: pack_leaf currently rejects buffers smaller than a
hardcoded 6 bytes which ignores mode-specific sizes and causes valid
Skip/Merge/Delta buffers to be rejected; change the length check to compute the
required size using the mode-aware helper (e.g., call packed_byte_len(leaf.mode)
or the equivalent function already used elsewhere) and compare out.len() against
that required usize before returning None, ensuring packed_byte_len is in scope
and used to determine the minimum buffer length for LeafCu packing.

In `@src/hpc/codec/predict.rs`:
- Around line 67-118: Add runnable doc-tests for the public API items: provide
concise /// examples demonstrating basic usage for IntraContext, IntraConfig,
and the is_no_basin helper so they satisfy public-API docs. Specifically, add a
small doctest showing how to construct an IntraContext (populate basin_idx,
delta_i32, neighbours using Option<&LeafCu> stubs or dummy values), an example
for IntraConfig showing both Some(escape_next_idx) and None behavior, and an
example call to is_no_basin illustrating its return for a boundary/empty basin;
ensure examples compile by importing any types used or using minimal mock values
and mark them as code blocks in the doc comments for each symbol (IntraContext,
IntraConfig, is_no_basin).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4bbaaa0b-b86e-4568-854e-a5c95e160f2e

📥 Commits

Reviewing files that changed from the base of the PR and between 6ff6f48 and 26d987f.

📒 Files selected for processing (3)
  • src/hpc/codec/mod.rs
  • src/hpc/codec/mode.rs
  • src/hpc/codec/predict.rs

Comment thread src/hpc/codec/mode.rs Outdated
Comment thread src/hpc/codec/mode.rs
Comment thread src/hpc/codec/predict.rs Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 26d987f9db

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/hpc/codec/predict.rs Outdated
// Multiple matches all collapse to the same coded leaf, so the
// first-hit policy is order-deterministic without affecting
// bitstream length.
let our_delta_u8 = ctx.delta_i32 as u8; // wrapping cast matches A2 pack
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Prevent overflow deltas from taking Merge path

predict_intra computes our_delta_u8 with a wrapping cast and checks Merge candidates before verifying that delta_i32 fits in i8, so out-of-range deltas can incorrectly return Merge instead of Escape. For example, delta_i32 = 200 wraps to 0xC8 and can match a neighbor byte 0xC8 (which represents -56 in i8 space), producing a leaf that cannot reconstruct the original delta and silently corrupts decoded values.

Useful? React with 👍 / 👎.

Comment thread src/hpc/codec/mode.rs Outdated
Comment on lines +154 to +155
if out.len() < 6 {
return None;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Allow packing into mode-sized output buffers

pack_leaf currently returns None unless out.len() >= 6, even when the leaf mode only needs 2 bytes (Skip) or 3 bytes (Merge/Delta). This breaks valid callers that pre-size streams using packed_byte_len and then pass exact remaining slices, especially near the end of a buffer, causing spurious packing failures for otherwise encodable data.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

@AdaWorldAPI AdaWorldAPI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read base (ctu.rs) + both new files end-to-end. The shape is solid — clean module boundary, 45 tests, the decision tree is the right call (Skip → Merge → Delta → Escape ordered by wire cost), and A1's P1 codex fix (full LeafCu equality on merge) carries through cleanly. Two correctness / API issues are worth holding the merge on, plus a few P2s.

CI snapshot at this commit: format, clippy/1.95, MSRV, AVX-512 check, hpc-stream-parallel, no-std all green; tests/{stable,beta,1.95.0} + native-backend/stable were still in-progress when I started.

Posting as COMMENT because GitHub blocks REQUEST_CHANGES on PRs authored under the same login — treat the P0 below as a request-changes in spirit.


🔴 P0 — Merge scan can fire incorrectly for out-of-i8-range δ (predict.rs:153)

let our_delta_u8 = ctx.delta_i32 as u8;        // wrapping cast
for (i, nb_slot) in ctx.neighbours.iter().enumerate() {
    ...
    if nb.delta != Some(our_delta_u8) { continue; }
    return LeafCu::merge(ctx.basin_idx, dir);
}

The wrapping cast collapses mod 256. A neighbor's stored δ_u8 was always written by a Delta-mode encode, so it represents an i8 in [-128, 127]. But the current cell's δ_i32 is allowed to be any i32 — if |δ| > 127, the wrapping cast can alias a neighbor's u8 even though the underlying i32 values differ by 256+. Concrete:

our_delta_i32 = 200            → our_delta_u8 = 0xC8 (200)
nb stored Delta(0xC8)          ← came from δ_i32 = -56 (the only i8 that fits)
Merge fires (alias match).
Decode reconstructs basin + (-56), original was basin + 200.
→ silent data loss of 256 units.

The PR body itself says "verify the Merge sign-tolerance contract matches the eventual A7 rANS decoder's u8 read" — sign-tolerance is correct inside the i8 window. Outside it, the wrapping cast aliases and the protocol breaks.

Fix is one line — gate the scan on |δ| ≤ 127:

if (-128..=127).contains(&ctx.delta_i32) {
    let our_delta_u8 = ctx.delta_i32 as u8;
    for (i, nb_slot) in ctx.neighbours.iter().enumerate() { ... }
}

(Or skip Merge entirely when the i8 guard fails, with a comment naming the alias.) No test catches this today — merge_negative_delta_via_wrapping_cast uses δ = -17 (in range). Add a regression test where our δ_i32 = 200 and a neighbor has Delta(_, 0xC8), assert the produced leaf is not Merge.

Caveat for the author: if the basin-nearest-neighbour search guarantees |δ| ≤ 127 for all valid inputs, this is theoretical and the escape_next_idx allocator path is dead code too. The i32 field width and the explicit Escape branch say otherwise — please confirm against the design doc.


🟡 P1 — IntraConfig::escape_next_idx: Option<u32> is a static field, not an allocator (predict.rs:107)

Doc:

Stateless API today: encoder calls escape_next_idx once per Escape decision.

But the impl just reads cfg.escape_next_idx — every cell in a batch with Escape mode emits the same idx. A caller doing:

for cell in cells {
    let leaf = predict_intra(&ctx, &cfg);   // cfg.escape_next_idx = Some(N)
    if leaf.mode == Escape { escape_vec.push(cell_val); }
}

gets multiple Escape leaves all pointing at idx N → escape-vector collision on decode.

Pick one of:

  • (a) accept &mut dyn FnMut() -> u32 so each Escape call advances,
  • (b) rename the field to escape_idx_to_assign (or similar) so the constraint that the caller must rebuild IntraConfig per Escape is loud,
  • (c) return the next idx alongside the leaf so the caller knows when to bump.

Test coverage gap mirrors the bug: escape_when_delta_overflows_i8_and_allocator_present only exercises one Escape. Add a two-cell test that runs predict twice and asserts the second Escape sees a different idx.


🟡 P1 — NESW / NEWS doc-vs-discriminant mismatch (predict.rs:79)

/// `neighbours`: NESW (in [`MergeDir`] discriminant order) ...

But MergeDir in ctu.rs:81-89 is North=0, East=1, West=2, South=3 — that's NEWS, not NESW. merge_dir_from_index in predict.rs:191-197 faithfully reflects the discriminant order (slot 2 → West, slot 3 → South), so a caller who reads the docstring literally and supplies neighbours as [N, E, S, W] will get every south/west merge labeled with the wrong direction.

Two ways out: (a) fix the doc to say NEWS (one-character edit), or (b) reorder MergeDir's discriminants — but that breaks the on-wire 2-bit codes pinned by cell_mode_discriminants_match_wire_codes / merge_dir_discriminants_match_wire_codes in ctu.rs and would need a consumer audit. (a) is the right move. While there, drop an explicit table in IntraContext and merge_dir_from_index:

slot 0 → MergeDir::North   (discr 0)
slot 1 → MergeDir::East    (discr 1)
slot 2 → MergeDir::West    (discr 2)
slot 3 → MergeDir::South   (discr 3)

No test exercises slot 2 or slot 3 specifically (the closest, merge_picks_first_hit_in_nesw_order, only checks slots 0 and 1). Add at least one slot-3 = South case.


P2 — header ASCII diagram contradicts implementation (mode.rs:18-26)

The diagram puts M0/M1 at the MSB end (bits 15/14):

MSB                                                    LSB
┌──┬──┬──────────────────────────────┐
│M0│M1│         basin_idx (12)        │

But pack_header is (mode_bits & 0b11) << 12 | (basin & 0xFFF) — the mode lands at bits 12-13, not 14-15. The narrative one paragraph down ("remaining 2 bits at the top of the second byte are reserved for the encoder's future merge_dir overlap") is consistent with the impl, just contradicts the diagram. Roundtrip tests pass because pack/unpack are symmetric. Re-draw the diagram, or swap to << 14 if you actually wanted the top of the u16 (cleaner for future extension).


P2 — pack_leaf requires worst-case 6-byte buffer for every mode (mode.rs:142)

unpack_leaf does per-mode bounds checks (2 / 3 / 3 / 6). pack_leaf bails at out.len() < 6 regardless of mode. Documented, but it breaks the streaming pattern: a buffer pre-sized to sum(packed_byte_len(mode_for_cell)) (the natural shape, since packed_byte_len exists exactly for that) runs out of "headroom" before the last cell can pack a Skip into the remaining 2 bytes. The streaming test happens to dodge this because it sizes at strict worst-case 8 × 6.

Fix:

let needed = packed_byte_len(leaf.mode);
if out.len() < needed { return None; }

Symmetric to what unpack_leaf already does.


P2 — lossy Escape fallback emits CellMode::Delta (predict.rs:179-186)

Re: the PR body's open question #2 — yes, this is worth a design call. Current behavior: encoder silently clamps δ to i8 range and emits Delta(127) (or -128). Valid bytestream, lossy reconstruction.

  • If the basin lookup actually guarantees |δ| ≤ 127, the whole escape_next_idx / lossy-fallback branch is dead — drop it, return LeafCu directly (and P0 above also goes away).
  • If |δ| can genuinely exceed 127, the lossy clamp is a footgun in disguise — Result<LeafCu, EncodeError::EscapeAllocatorMissing> is the honest signature, and callers can opt into a silent-clamp wrapper if they want it.

The "never panics, lossy" stance is fine for a v1 prototype but at minimum the docstring should call out the lie of CellMode::Delta covering a value that overflows i8.


Nits

  • merge_skipped_when_neighbour_mode_is_not_delta puts the Escape neighbor at slot 3 but only asserts the result mode — doesn't probe whether slot 3 truly maps to South. Slot-order coverage gap (same as P1 #3).
  • A2 doc table says "MergeDir 2-bit" tail but pack_merge_dir writes a whole byte; clarify "1 byte (low 2 bits used; high 6 reserved)".
  • is_no_basin is a one-line eq check on a pub const; consumers can write basin == BASIN_NONE themselves. Keep or drop, no strong view — but if you keep it, a doc-test demonstrating the BASIN_NONE sentinel use case would justify the helper.

Suggested action

Block on P0 + P1 (escape allocator) + P1 (NEWS doc) before merge. P2s can land as a follow-up commit on the same branch or a fresh PR. A3-inter / A4-A8 deferred work is correctly scoped out.


Generated by Claude Code

claude added 2 commits May 22, 2026 06:42
… pack_leaf

P1 (codex) — overflow δ no longer aliases to Merge:
  predict_intra previously took `our_delta_u8 = δ as u8` BEFORE
  checking i8 fit, so e.g. δ=200 wrapped to 0xC8 and could match a
  neighbour byte 0xC8 (i8=-56), silently corrupting reconstruction.
  Now the i8 range check gates both the Merge scan and the Delta
  branch; out-of-range δ falls straight through to Escape (or the
  documented lossy clamp).
  + 2 regression tests:
    - overflow_delta_does_not_alias_to_merge
    - overflow_delta_with_allocator_takes_escape

P2 (coderabbit + codex) — pack_leaf accepts mode-sized buffers:
  pack_leaf used a 6-byte minimum for all modes; callers pre-sizing
  by packed_byte_len() got spurious None for Skip(2)/Merge,Delta(3).
  Length check now gates on packed_byte_len(leaf.mode).
  + 1 regression test: pack_leaf_accepts_mode_sized_buffers

P3 (coderabbit) — doctest examples on the public-API surface:
  Added /// runnable examples to MAX_BASIN_IDX, BASIN_NONE,
  unpack_header, pack_merge_dir, unpack_merge_dir, unpack_leaf,
  packed_byte_len, IntraContext, IntraConfig, is_no_basin.
  Removed unused LeafCu import from existing predict_intra doctest.

Gates:
  cargo test --features codec --lib hpc::codec → 48 passed
  cargo test --features codec --doc hpc::codec → 14 passed
  cargo fmt --all -- --check → clean
  cargo clippy --features codec --lib -- -D warnings → clean
…doc + diagram

P1 — escape allocator collision in batched encoding:
  Old: cfg.escape_next_idx: Option<u32> was a static field — every
  Escape leaf in a batch got the same idx, colliding on the escape
  vector slot at decode time.
  New: predict_intra now takes a third arg `escape_next: Option<&mut u32>`
  that the kernel post-increments when Escape fires. Sequential cells
  in the same batch see fresh, non-colliding idxs.
  IntraConfig becomes empty (reserved for future RDO knobs).
  + 1 regression test: escape_allocator_advances_across_batched_calls
  + escape_when_delta_overflows_i8_and_allocator_present extended
    to assert the cursor advances.

P1 — NESW vs NEWS doc/code mismatch:
  MergeDir discriminants are North=0, East=1, West=2, South=3 (NEWS),
  but the doc said "NESW". Fixed in IntraContext docstring + the
  inline Merge-scan comment. Added explicit slot-to-direction table.
  + 1 regression test: merge_slot_2_maps_to_west_and_slot_3_to_south

P2 — ASCII header diagram in mode.rs:
  Old diagram put M0/M1 at MSB bits 15-14, but pack_header uses
  `(mode << 12) | basin`, placing mode at bits 12-13 with bits 14-15
  reserved at zero. Redrew the diagram with explicit bit indices
  and labelled the reserved high bits for future encoder use.

Nits:
- A2 doc table now says "Merge tail: 1 byte (low 2 bits = MergeDir;
  high 6 reserved)" instead of the misleading "MergeDir 2-bit".
- Renamed merge_picks_first_hit_in_nesw_order → ..._in_news_order to
  match the corrected ordering.

Already-addressed (no-op this commit):
- P0 overflow Merge alias → fixed in b39a576
- P2 pack_leaf 6-byte minimum → fixed in b39a576

Gates:
  cargo test --features codec --lib hpc::codec → 50 passed (+2 new)
  cargo test --features codec --doc hpc::codec → 15 passed (+1 new)
  cargo fmt --all -- --check → clean
  cargo clippy --features codec --lib -- -D warnings → clean
Copy link
Copy Markdown
Owner Author

Thanks for the thorough review. Per-finding status after b39a5769 + b44fe59f:

P0 — overflow Merge alias (predict.rs:153) ✅ b39a5769

Gated both the Merge scan and the Delta branch on an explicit (-128..=127).contains(&δ) check. Out-of-range δ now falls straight through to Escape (or the lossy clamp). Regression test overflow_delta_does_not_alias_to_merge uses your exact example (δ=200 against neighbour Delta(_, 0xC8)) and asserts the result is not Merge.

"if the basin-nearest-neighbour search guarantees |δ| ≤ 127, this is theoretical"

Conservative read: the i32 field width + the explicit Escape branch are both intentional. Real Δ-from-quantised-basin can exceed 127 (e.g., when the cell's true value lands far from any basin in a sparse codebook). Keeping the guard.

P1 — IntraConfig::escape_next_idx collision ✅ b44fe59f

You're right — every batched Escape decision got the same idx. Took option (a) but spelled &mut u32 rather than &mut dyn FnMut() -> u32 (zero-cost, no boxing, keeps IntraConfig: Copy). New signature:

pub fn predict_intra(
    ctx: &IntraContext,
    cfg: &IntraConfig,
    escape_next: Option<&mut u32>,
) -> LeafCu

Kernel post-increments the cursor on Escape; None keeps the lossy Delta fallback. IntraConfig is now empty (reserved as a forward-compat anchor for A6 RDO knobs). Regression test escape_allocator_advances_across_batched_calls runs two Escape cells back-to-back and asserts (N, N+1) + cursor at N+2.

P1 — NESW vs NEWS doc/code mismatch ✅ b44fe59f

Doc fix. IntraContext docstring + the Merge-scan inline comment + the field doc all say NEWS now, with the explicit N=0, E=1, W=2, S=3 table inline. Regression test merge_slot_2_maps_to_west_and_slot_3_to_south exercises the south/west slots that had zero coverage before. Did not touch MergeDir's discriminants — cell_mode_discriminants_match_wire_codes pins them.

P2 — header ASCII diagram (mode.rs:18-26) ✅ b44fe59f

Redrew the diagram to show [0, 0, M1, M0, basin_11..0] with explicit bit indices, matching (mode << 12) | basin. Bits 14-15 labelled as reserved at zero for future encoder use. Did NOT change the bit positions (kept at 12-13) to avoid disturbing the existing wire format — but called the reserved bits out as a future extension surface.

P2 — pack_leaf worst-case 6-byte minimum ✅ b39a5769

out.len() < packed_byte_len(leaf.mode) now. Skip(2)/Merge/Delta(3) buffers accept correctly. Regression test pack_leaf_accepts_mode_sized_buffers verifies all three sizes + that a 3-byte buffer still rejects Escape.

P2 — lossy Escape fallback emits CellMode::Delta (predict.rs:179-186) ✅ docs hardened in b44fe59f

Kept the fallback (matches the "never panicking" stance for misconfigured encoders) but the docstring now states explicitly: "clamps δ to i8 range and emits a Delta leaf whose reconstruction is not bit-exact (caller must accept the loss)." The lossy-Delta path mode-tagging is acknowledged in the inline comment. Open to switching to Result<LeafCu, EncodeError> if you prefer — say the word.

Nits ✅ b44fe59f

  • Slot-3 South coverage gap → new test (above).
  • "MergeDir 2-bit" table entry → now "1 byte (low 2 bits = MergeDir; high 6 reserved)".
  • is_no_basin doctest already shipped in b39a5769.

Gates

  • cargo test --features codec --lib hpc::codec → 50 passed (+5 since opening: overflow regression × 2, batched escape, slot 2/3 South, mode-sized buffer)
  • cargo test --features codec --doc hpc::codec → 15 passed
  • cargo fmt --all -- --check clean
  • cargo clippy --features codec --lib -- -D warnings clean

Generated by Claude Code

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/hpc/codec/mode.rs (2)

194-210: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject malformed LeafCus instead of silently rewriting them.

The unwrap_or(...) fallbacks make pack_leaf non-bijective: an invalid Merge/Delta/Escape leaf gets serialized as a different valid leaf (North, 0, or 0). That hides upstream invariant breaks and can corrupt the stream while still returning Some(...).

Suggested fix
     let tail_len = match leaf.mode {
         CellMode::Skip => 0,
         CellMode::Merge => {
-            // Caller guarantees `merge_dir.is_some()` for `Merge` mode
-            // (LeafCu::merge constructor enforces this). Fall back to
-            // North if the invariant is violated, to keep encoder
-            // robustness — the decoder will still produce a valid leaf.
-            out[2] = pack_merge_dir(leaf.merge_dir.unwrap_or(MergeDir::North));
+            out[2] = pack_merge_dir(leaf.merge_dir?);
             1
         }
         CellMode::Delta => {
-            out[2] = leaf.delta.unwrap_or(0);
+            out[2] = leaf.delta?;
             1
         }
         CellMode::Escape => {
-            let idx = leaf.escape_idx.unwrap_or(0);
+            let idx = leaf.escape_idx?;
             out[2..6].copy_from_slice(&idx.to_le_bytes());
             4
         }
     };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hpc/codec/mode.rs` around lines 194 - 210, pack_leaf currently silently
substitutes defaults for missing fields (using unwrap_or) for
CellMode::Merge/Delta/Escape, making serialization non-bijective; change
pack_leaf (the match over leaf.mode) to validate required fields instead of
falling back: for CellMode::Merge require leaf.merge_dir.is_some() and return
None (or propagate error) if missing, using
pack_merge_dir(leaf.merge_dir.unwrap()) when present; for CellMode::Delta
require leaf.delta.is_some() and use leaf.delta.unwrap(); for CellMode::Escape
require leaf.escape_idx.is_some() and copy from
escape_idx.unwrap().to_le_bytes(); remove the unwrap_or fallbacks so malformed
LeafCu values are rejected rather than rewritten.

69-78: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

BASIN_NONE currently collides with a real 12-bit basin value.

MAX_BASIN_IDX is treated as encodable data, but BASIN_NONE is set to that same value. That makes basin 4095 ambiguous: it can round-trip through pack_header, while is_no_basin(4095) also treats it as “no basin”. This leaks a cross-layer contract bug into callers and prediction logic for the highest real basin index. Either reserve 4095 and tighten the valid range/docs/tests accordingly, or move the sentinel out of band.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hpc/codec/mode.rs` around lines 69 - 78, BASIN_NONE currently equals
MAX_BASIN_IDX, making the highest encodable basin (4095) ambiguous; change the
encoding so 4095 is a reserved sentinel by reducing the valid maximum basin
value and keeping BASIN_NONE==4095: set MAX_BASIN_IDX to (1<<12)-2 (4094) and
keep BASIN_NONE as (1<<12)-1 (4095), then update any code and tests that
validate basin ranges — specifically adjust checks in is_no_basin, pack_header,
and any asserts/examples that compared BASIN_NONE to MAX_BASIN_IDX or assumed
4095 was a valid encodable basin. Ensure documentation/comments reflect the
tightened valid range.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/hpc/codec/mode.rs`:
- Around line 194-210: pack_leaf currently silently substitutes defaults for
missing fields (using unwrap_or) for CellMode::Merge/Delta/Escape, making
serialization non-bijective; change pack_leaf (the match over leaf.mode) to
validate required fields instead of falling back: for CellMode::Merge require
leaf.merge_dir.is_some() and return None (or propagate error) if missing, using
pack_merge_dir(leaf.merge_dir.unwrap()) when present; for CellMode::Delta
require leaf.delta.is_some() and use leaf.delta.unwrap(); for CellMode::Escape
require leaf.escape_idx.is_some() and copy from
escape_idx.unwrap().to_le_bytes(); remove the unwrap_or fallbacks so malformed
LeafCu values are rejected rather than rewritten.
- Around line 69-78: BASIN_NONE currently equals MAX_BASIN_IDX, making the
highest encodable basin (4095) ambiguous; change the encoding so 4095 is a
reserved sentinel by reducing the valid maximum basin value and keeping
BASIN_NONE==4095: set MAX_BASIN_IDX to (1<<12)-2 (4094) and keep BASIN_NONE as
(1<<12)-1 (4095), then update any code and tests that validate basin ranges —
specifically adjust checks in is_no_basin, pack_header, and any asserts/examples
that compared BASIN_NONE to MAX_BASIN_IDX or assumed 4095 was a valid encodable
basin. Ensure documentation/comments reflect the tightened valid range.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: cc9f0222-a5ee-45ba-8e39-fba9c0741545

📥 Commits

Reviewing files that changed from the base of the PR and between b39a576 and b44fe59.

📒 Files selected for processing (2)
  • src/hpc/codec/mode.rs
  • src/hpc/codec/predict.rs

722-line knowledge doc capturing the architectural claim that PR-X12's
primitives (LeafCu, pack_leaf, predict_intra, Ctu, eventual rANS) are
not video-codec-specific — they serve four loads on a single
predictive-coder substrate:

  1. Cognitive cells (the original target)
  2. 3D Gaussian Splat coefficient compression
  3. Transformer attention sparsification
  4. Distributed-SGD gradient streaming

Sections:
- The 4-load isomorphism (§ 2) and 16-state (CellMode × MergeDir)
  classification table across all four domains
- Primitive → load mapping matrix (§ 3) with file:line refs
- 10 cross-domain epiphanies (§ 4) — including MergeDir-as-topology,
  predict_intra-as-attention-sparsifier, escape_next-as-allreduce-slot,
  mode-coded LoRA
- 6 integration plans (§ 5) with worker × week + dependency graph
- 8 exploration paths (§ 6) for sprint-scale research
- 6 holy grail outcomes (§ 7) — sub-1-bit/Gaussian 3DGS,
  bit-exact attention, federated SGD at 8-16×, Lance substrate identity
- 10 codec-side debts (§ 8) and 13 stack-side debts (§ 9)
- Sequencing summary (§ 10): critical path = A7 rANS

Companion to pr-x12-codec-x265-design.md (the mechanical design); a
"See also" breadcrumb added there points here.
AdaWorldAPI pushed a commit that referenced this pull request May 22, 2026
Lands `.claude/knowledge/pr-x12-codec-cognitive-substrate-mapping.md`
(~576 lines) as the survives-compaction companion to the as-shipped
`pr-x12-codec-x265-design.md`. Generalises the codec spec across the
four industries it implicitly unifies (HEVC video, Gaussian splat,
cognitive shaders / sparse attention, gradient compression for LLM
training).

# Why this doc

Triggered by the PR #195 review session. PR-X12's mode taxonomy
(Skip/Merge/Delta/Escape) shows up under different names in four
disconnected research communities; each treats its own corner as
the central knob (HEVC RDO, ZeRO buckets, splat sparsity reg,
attention pruning). Without explicit naming of the unification,
future agents will rediscover each corner independently and
reimplement what the codec already provides.

# Structure (citable by section number)

- §0 — the big claim (PR-X12 is the gradient-quantisation substrate
  GenAI training has been missing for two years)
- §1 — four-axis mapping table (x265 / splat / cognitive / gradient)
- §2-§7 — deep mappings (mode taxonomy, CTU quad-tree, palette/basin
  codebook, transform basis, rANS, λ-RDO)
- §8 — 15 numbered epiphanies (E-1..E-15)
- §9 — 7 holy grail claims (H-1..H-7)
- §10 — integration plan per sub-card (A4/A6/A7/A8) + 3 new PRs
  (splat, cognitive, gradient consumers)
- §11 — exploration paths ranked by confidence (15 entries)
- §12 — technical debt inventory (23 items T-1..T-23 across codec,
  ndarray substrate, lance-graph cognitive, cross-repo, PR #195)
- §13 — 6 blasgraph/MKL synergies the HEVC team couldn't reach in
  2013 (block-matched ME, batched DCT, partition tree as tropical-
  GEMM, CABAC replacement, learned deblocking, k-means at frame rate)
- §14 — cross-references (design docs, rules, code paths)
- §15 — how to use this doc (read order per use case)

# Holy grail claims (load-bearing, citable)

- H-1: PR-X12 + cam_pq is the HEVC SCC codec 2013 hardware couldn't
  build (4096-entry codebook at 60 fps)
- H-2: The transform IS the optimiser (DCT-II ↔ Adam ↔ KFAC ↔
  learned conv all share `Δ' = B·Δ`) — most underrated mapping
- H-3: CTU quad-tree is the universal hierarchical-attention
  substrate (Mistral sliding window is depth-3 leaf processing)
- H-4: rANS + k-means achieves Shannon-optimal lossless gradient
  compression (every published scheme is a special case)
- H-5: PR-X12 generalises ZeRO (Merge captures parameter sharing
  that ZeRO's 3-bucket scheme can't)
- H-6: 64×64 CTU is the right unit for both 4K video and 7B LLMs
- H-7: The codec is the substrate; everything else is a renaming

# Technical debt (23 items)

Severity gradient P0/P1/P2/P3. Notable open items:
- T-1, T-2: PR #195 CodeRabbit findings (BASIN_NONE collision +
  unwrap_or non-bijection) — actionable today
- T-10..T-15: ndarray substrate (HPC graduation incomplete, no
  `Transform` trait yet, NEON tile-GEMM stub)
- T-20..T-23: cross-repo coordination (cognitive-substrate-
  convergence-v1.md cross-ref, causal-edge v2 metadata wiring)

# Board hygiene

AGENT_LOG.md prepended with the session entry per CLAUDE.md rule
"every PR adding a finding/agent run must PREPEND an AGENT_LOG
entry in the same commit". Entry cites the doc structure, the 15
epiphanies + 7 holy grails by ID, and the 23 debt items.

# No code changes

Pure architecture / knowledge artifact. `cargo check --lib` clean
(nothing to recheck).

https://claude.ai/code/session_01HbqooFZHAjaUtFEzhA1R2u
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.claude/knowledge/pr-x12-cross-domain-synergies.md:
- Line 682: The fenced code block that begins with the ASCII diagram containing
"Plan A (rANS A7)" is missing a language tag and triggers markdownlint MD040;
update the opening fence (the triple backticks that precede the diagram) to
include a language tag such as text (i.e., change ``` to ```text) so the block
reads as a language-tagged fenced code block and keep the closing fence as-is;
locate the block by searching for the triple backticks immediately before the
"Plan A (rANS A7)" diagram.
- Line 66: The fenced code block at line 66 is missing a language tag, causing
MD040; update the opening fence for that block (the triple-backtick before the
ASCII diagram) to include a language identifier such as "text" (e.g., change ```
to ```text) so the block is properly tagged and the linter warning is resolved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4c69f065-7699-4cde-bc00-528fbdba23af

📥 Commits

Reviewing files that changed from the base of the PR and between b44fe59 and 01c77cc.

📒 Files selected for processing (2)
  • .claude/knowledge/pr-x12-codec-x265-design.md
  • .claude/knowledge/pr-x12-cross-domain-synergies.md
✅ Files skipped from review due to trivial changes (1)
  • .claude/knowledge/pr-x12-codec-x265-design.md


All four loads share the same predictive-coding skeleton:

```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced block at Line 66.

This fence is missing a language identifier, which triggers MD040.

💡 Suggested fix
-```
+```text
                 ┌──────────────────────────────────────┐
                 │ 1. Build basin codebook (offline or  │
                 │    online k-means on the carrier)    │
@@
-```
+```
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/knowledge/pr-x12-cross-domain-synergies.md at line 66, The fenced
code block at line 66 is missing a language tag, causing MD040; update the
opening fence for that block (the triple-backtick before the ASCII diagram) to
include a language identifier such as "text" (e.g., change ``` to ```text) so
the block is properly tagged and the linter warning is resolved.


If we commit to all of this, the order matters:

```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced block at Line 682.

Same markdownlint issue (MD040) here; add a language to keep docs lint-clean.

💡 Suggested fix
-```
+```text
                                       Plan A (rANS A7)
                                          │
                                          ▼
@@
-```
+```
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 682-682: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/knowledge/pr-x12-cross-domain-synergies.md at line 682, The fenced
code block that begins with the ASCII diagram containing "Plan A (rANS A7)" is
missing a language tag and triggers markdownlint MD040; update the opening fence
(the triple backticks that precede the diagram) to include a language tag such
as text (i.e., change ``` to ```text) so the block reads as a language-tagged
fenced code block and keep the closing fence as-is; locate the block by
searching for the triple backticks immediately before the "Plan A (rANS A7)"
diagram.

…ASIN_NONE distinctness

Two real bugs flagged by CodeRabbit's outside-diff review on b44fe59.

P1 — `pack_leaf` was non-bijective via unwrap_or fallbacks (mode.rs:194):
  Hand-constructing a `LeafCu { mode: Merge, merge_dir: None, .. }`
  serialized as `LeafCu::merge(.., MergeDir::North)` — silently rewriting
  malformed input into a different valid leaf and hiding upstream invariant
  breaks. Now uses `?` to short-circuit with None when required fields are
  missing for the mode (Merge needs merge_dir; Delta needs delta; Escape
  needs escape_idx). The `LeafCu::merge/delta/escape` constructors still
  enforce the invariants — only struct-literal bypass callers can hit the
  rejection.
  + 3 regression tests: leaf_pack_rejects_malformed_merge_without_dir,
    leaf_pack_rejects_malformed_delta_without_value,
    leaf_pack_rejects_malformed_escape_without_idx
  + pack_leaf docstring documents the bijective contract.

P1 — `BASIN_NONE` collided with `MAX_BASIN_IDX` at 4095 (mode.rs:69):
  Both equaled `(1 << 12) - 1`, so basin 4095 was ambiguous — a real
  basin and the "no basin" sentinel had the same encoded value.
  Now: MAX_BASIN_IDX = 4094 (highest real basin), BASIN_NONE = 4095
  (sentinel one slot above). Introduced private BASIN_FIELD_MASK = 0x0FFF
  for header packing/unpacking — independent of MAX_BASIN_IDX so BASIN_NONE
  still round-trips through the 12-bit field as a sentinel marker.
  pack_header/unpack_header switched from `& MAX_BASIN_IDX` (would wrongly
  clear bit 0 with 0xFFE) to `& BASIN_FIELD_MASK`.
  + 2 regression tests: basin_none_distinct_from_max_basin_idx,
    header_round_trips_max_basin_idx_and_basin_none_distinctly
  + MAX_BASIN_IDX + BASIN_NONE doctests updated to assert the new
    relationship.

Gates:
  cargo test --features codec --lib hpc::codec → 55 passed (+5)
  cargo test --features codec --doc hpc::codec → 15 passed
  cargo fmt --all -- --check → clean
  cargo clippy --features codec --lib -- -D warnings → clean
Copy link
Copy Markdown
Owner Author

Follow-up review on b44fe59f addressed in 24232985:

P1 — pack_leaf non-bijective via unwrap_or fallbacks (mode.rs:194-210)
Switched to ? short-circuit. Malformed LeafCu (Merge without merge_dir, Delta without delta, Escape without escape_idx) now returns None instead of silently rewriting into a different valid leaf. The LeafCu::* constructors still guard the well-formed path; only struct-literal bypass callers can hit the rejection. + 3 regression tests covering each mode's missing-field case. Docstring documents the bijective contract.

P1 — BASIN_NONE / MAX_BASIN_IDX collision (mode.rs:69-78)
Resolved. MAX_BASIN_IDX = 4094, BASIN_NONE = 4095 — distinct values. Added a private BASIN_FIELD_MASK = 0x0FFF for header pack/unpack so the 12-bit field can still carry BASIN_NONE as a sentinel without conflating the encoder's valid-basin range (0..=MAX_BASIN_IDX) with the wire field width. + 2 regression tests; the MAX_BASIN_IDX and BASIN_NONE doctests now assert the < BASIN_NONE and MAX_BASIN_IDX + 1 == BASIN_NONE relationships explicitly.

Gates:

  • cargo test --features codec --lib hpc::codec55 passed (+5)
  • cargo test --features codec --doc hpc::codec → 15 passed
  • cargo fmt --all -- --check → clean
  • cargo clippy --features codec --lib -- -D warnings → clean

Generated by Claude Code

@AdaWorldAPI AdaWorldAPI merged commit ef955ab into master May 22, 2026
18 checks passed
AdaWorldAPI pushed a commit that referenced this pull request May 22, 2026
621-line synthesis doc consolidating two parallel architecture threads
on PR-X12's predictive-coder substrate:

- Session A: pr-x12-codec-cognitive-substrate-mapping.md (cognitive
  shaders ↔ gradient ↔ HEVC mapping, this branch)
- Session B: pr-x12-cross-domain-synergies.md (x265 ⇄ 3DGS ⇄ attention
  ⇄ SGD, merged to master via PR #195 commit 01c77cc)

Both originals retained for archeology; merged-canon supersedes them
for new content. Structure:

- §1 overlap matrix (epiphanies, holy grails, plans, debt — unique vs
  shared between A and B)
- §2 architectural claim with two-route falsifiability
- §3 ten merge-only epiphanies (M:E-A..M:E-J) — the high-density
  section: universal LinearReduce<Basis> kernel, Morton+raster as
  space-filling curve, PR #195 CodeRabbit findings generalise to all
  four domains, ndarray-codec as 5th architectural category, three
  domain-specific plug-points only
- §4 unified holy grail list (A:H + B:HG + M:H + 2 new M:H-NEW)
- §5 unified plan with resolved critical path (B's A7-first wins,
  front-load A4-design)
- §6 sequencing DAG
- §7 unified technical debt (P0/P1/P2/P3 reranked across both docs)
- §8 resolves 4 disagreements between A and B (each with winner +
  rationale)
- §9 read order for the 4-doc family
- §10 compaction-preservation contract (8 invariants)

Key merge-only outputs:

- Three load-bearing traits: PredictiveSignal, LinearReduce<Basis>,
  CurveOrder<const N>
- M:H-NEW-1: single Rust binary consumes video|splat|gradient|kv-cache
  → compressed Lance column (the falsifiability test, Plan G builds it)
- M:H-NEW-2: ~1.5 KLoC generic + ~200 LoC per domain = ~2 KLoC for
  four industries (25× code-density vs per-domain impls)

Three new integration plans:

- Plan G: multi-domain bench harness (2 weeks, MUST ship before A7)
- Plan H: extract `ndarray-codec` crate (3 days, resolves dep-cycle)
- Plan I: trait PredictiveSignal in protocol (3 days, bit-pattern
  isomorphism with splat without code merge)

Citation IDs preserved from both originals: cite as (A:E-N), (B:E-N),
(M:E-X). No code changes; pure synthesis. Cargo check clean.
AdaWorldAPI pushed a commit that referenced this pull request May 22, 2026
Two CodeRabbit MD040/MD027 findings + one Codex P2 stale-fact finding,
all verified valid against current code.

CodeRabbit (markdownlint):
- pr-x12-codec-cognitive-substrate-mapping.md: 4 untyped code fences
  → ```text (Skip/Merge/Delta block at line 56; depth-0..3 quad-tree
  blocks for spatial/attention/gradient hierarchies at lines 80/91/104)
- pr-x12-substrate-merged-canon.md: collapse `>   - ` to `> - `
  (lines 6-7, MD027 multi-space-after-blockquote), 2 untyped fences
  → ```text (architecture-rule list at line 165, sequencing diagram
  at line 411), and "delta IS" → "delta is" stylistic

Codex P2 (stale doc vs shipped code):
- Doc cited `MAX_BASIN_IDX = 4095` and framed BASIN_NONE collision
  as "pending / not yet merged / not yet pushed" — but PR #195 commit
  2423298 already shipped `MAX_BASIN_IDX = 4094` with `BASIN_NONE
  = 4095` reserved sentinel, plus bijective `pack_leaf` via `?`
  operator with 3 regression tests. Updated:
  - §4.1 line 125: `MAX_BASIN_IDX = 4094` + sentinel-range explainer
  - §4.3 heading + body: "(resolved in PR #195)" with commit cite
  - §"Still open on PR #195" block: → "Resolved in PR #195 follow-up"
  - §12 debt-inventory T-1, T-2: marked ~~RESOLVED~~ with commit ref

No content edits beyond what the findings asked for; citation IDs
unchanged.
AdaWorldAPI pushed a commit that referenced this pull request May 22, 2026
… codebook

PR #195's follow-up commit 2423298 resolved the BASIN_NONE/MAX_BASIN_IDX
collision by shrinking MAX_BASIN_IDX to 4094 with BASIN_NONE = 4095 as a
distinct sentinel. The shrink was the wrong fix — the sentinel itself
was the bug.

Reference (the authoritative source the code drifted from):

  src/hpc/ogit_bridge/assets/cognitive/entities/Leaf.ttl — the HHTL
  ontology defines the codebook as 16 Hips × 16 Twigs × 16 Leaves =
  4096 Leaves per Heel, every Leaf carrying a real basinSignature.
  No slot is reserved for absence; the ontology forbids "no basin".

What BASIN_NONE actually was: authoring-time epistemic uncertainty
("encoder hasn't decided yet for this cell") smuggled into wire-format
ontological uncertainty ("this cell exists but has no basin"). A
category error — Rust's idiom would have surfaced this as Option<u16>
in the encoder's transient scratch state and a plain u16 on the
persistent record. The sentinel value bypassed that hygiene by hiding
optionality inside a magic u16, invisible to the typechecker.

Confirmation that the sentinel was load-bearing for nothing: full audit
of BASIN_NONE producers across src/ found zero — no LeafCu::skip(...),
no basin_idx: BASIN_NONE anywhere. The only references were the const
definition, a re-export, one regression test that constructed it, and
predict.rs:is_no_basin (a predicate with no callers that emit
BASIN_NONE leaves). Defensive infrastructure for a feature the
ontology forbids and the implementation never used.

Changes:

- mode.rs: MAX_BASIN_IDX = (1 << 12) - 1 = 4095 (full 12-bit range,
  4096 real basins). BASIN_NONE deleted. The 2 sentinel-distinctness
  regression tests replaced with one test asserting the full range.
  Docs reframed around the HHTL ontology + the Option<u16>
  scratch-state contract.
- predict.rs: is_no_basin deleted; its only doctest+test deleted.
  BASIN_NONE import dropped.
- mod.rs: BASIN_NONE and is_no_basin removed from re-exports.
- pr-x12-codec-cognitive-substrate-mapping.md §4.1/§4.3/§10.1/§12:
  framing updated. §4.3 now records the full arc (collision → shrink
  → delete) as design archeology rather than as an open issue.

Validation:
- cargo check --features codec --lib → clean
- cargo clippy --features codec --lib -- -D warnings → clean
- cargo test --features codec --lib hpc::codec → 54 tests pass
- cargo test --features codec --doc hpc::codec → 13 doctests pass

Net diff: -77, +38 lines. Pure subtraction modulo doc reframe.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants