v0.0.10
Other
- round-191 (parent-dispatch r191) against ISO/IEC FDIS 18181-1:2021 — Annex E / §H.5.2 Weighted-Predictor oracle test driven by clean-room behavioural trace at noise-64x64-lossless sample 194
- round-190 (parent-dispatch r190) against ISO/IEC FDIS 18181-1:2021 — typed per-pass NonZeros(x, y) grid container above the round-183 per-channel primitive
- round-183 (parent-dispatch r183) against ISO/IEC FDIS 18181-1:2021 — typed per-channel NonZeros(x, y) grid container layered above round-177 single-channel primitive
- round-177 (parent-dispatch r177) against ISO/IEC FDIS 18181-1:2021 — typed NonZeros(x, y) grid bookkeeping + per-varblock decode driver
- round-164 (parent-dispatch r164) against ISO/IEC FDIS 18181-1:2021 — TransformType-driven entry points for the §C.8.3 per-block HF coefficient decode loop
- round-159 (parent-dispatch r159) against ISO/IEC FDIS 18181-1:2021 — §C.8.3 per-block HF coefficient decode loop scaffolding (Listings C.13 + C.14)
- round-150 (parent-dispatch r150) against ISO/IEC FDIS 18181-1:2021 — Annex I.2.3.8 Listing I.13 Inverse AFV transform wired into idct dispatch
- round-147 (parent-dispatch r147) against ISO/IEC FDIS 18181-1:2021 — Annex I.2.2 AFV basis + AFV_IDCT pure-math primitive (Listings I.5 + I.6)
- round-144 (parent-dispatch r144) against ISO/IEC FDIS 18181-1:2021 — Annex J.3 edge-preserving-filter pure-math primitive
- round-141 (parent-dispatch r141) against ISO/IEC FDIS 18181-1:2021 — Annex J.2 Gabor-like-transform pure-math primitive
- round-138 (parent-dispatch r138) against ISO/IEC FDIS 18181-1:2021 — Annex G Chroma-from-Luma pure-math primitive (Listing G.1)
- round-133 (parent-dispatch r133) against ISO/IEC FDIS 18181-1:2021 — §C.7.1 DecodePermutation() for used_orders != 0
- Round 129: per-varblock LF→LLF composition glue (§I.2.5 plumbing)
- Round 126: WP deep-trace plumbing + sample-194 hand-derivation
- round-121 (parent-dispatch r121) against ISO/IEC FDIS 18181-1:2021 — §I.2.5 LLF-from-LF pure-math step (Listings I.15 + I.16)
- round-95 (parent-dispatch r95) against ISO/IEC FDIS 18181-1:2021 — §F.3 HF dequantisation pure-math step
- round-90 (parent-dispatch r90) against ISO/IEC 18181-1:2021 FDIS — HfPass + PassGroup HF structural parsers
- round-89 (parent-dispatch r89) against ISO/IEC 18181-1:2024 — GetDCTQuantWeights + Table I.6 default dequantization-matrix materialisation
- rewrite lf_dequant comment to remove libjxl numeric-defaults citation
- round-77 fixup — inline animation-3frame fixture under crate-local tests/fixtures/
- round-77 (parent-dispatch r17) against ISO/IEC 18181-1:2024 — animation-3frame SPECDIFF audit harness
- round-32 (parent-dispatch r17) against ISO/IEC 18181-1:2024 — noise-64x64-lossless pixel-divergence bisected to WP at first predictor=6 sample with WW/NN both in-image; fix deferred pending libjxl-WP behavioural trace
- round-31 (parent-dispatch r16) against ISO/IEC 18181-1:2024 — §F.3 zero-pad uniformly applied to single-TOC-entry LfGlobal fast path
- round-30 (parent-dispatch r15) against ISO/IEC 18181-1:2024 — bit-depth-16 RGB pixel-correct + 16-bit LE plane-pack convention
- round-29 (parent-dispatch r14) against ISO/IEC 18181-1:2024 — alpha-64x64 RGBA pixel-correct + ISOBMFF FF 0A strip
- round-28 (parent-dispatch r13) against ISO/IEC 18181-1:2024 — non-DCT IDCT helpers (Annex I.9.3..I.9.7)
- round-27 (parent-dispatch r12) against ISO/IEC 18181-1:2024 — IDCT dispatch (Annex I.2.1 + I.2.2 Listing I.4)
- round-26 (parent-dispatch r11) against ISO/IEC 18181-1:2024 — Annex L colour transforms (XYB inverse + YCbCr inverse)
- round-25 (Auditor mode) against ISO/IEC 18181-1:2024 — d1 LfCoefficients per-sample rich-state range dump 22..=79
- round-24 (Auditor mode) against ISO/IEC 18181-1:2024 — d1 per-cluster D[] byte trace + per-call alias-mapping invariant audit
- round-23 (Auditor mode) against ISO/IEC 18181-1:2024 — d1 leaf-pick property dump at Y' sample 22 + WP y=0 boundary audit
- round-22 (Auditor mode) against ISO/IEC 18181-1:2024 — d1 lf_quant sample dump + WP rounding bias toggle
- round-21 (Auditor mode) against ISO/IEC 18181-1:2024 — d1 per-cluster distribution + alias-table self-map audit
- round-20 followup — refresh round-19 trace eprintln with corrected DC_GROUP budget
- round-20 (Auditor pivot) against ISO/IEC 18181-1:2024 — DC_GROUP boundary recount + ANS-final-state oracle
- round-19 (Auditor mode) — d1 cluster + ANS state evolution audit
- round-18 (Auditor mode) against ISO/IEC 18181-1:2024 — per-token bit accounting trace + drift narrowed
Added
- Round 191 (2021-FDIS) — Annex E / §H.5.2 Weighted-Predictor
  oracle test driven by clean-room behavioural trace at sample 194 of
  noise-64x64-lossless. New `tests/r191_wp_trace_oracle.rs` (5
  tests) and new `pub fn modular_fdis::wp_predict_pub` test wrapper
  around the production `wp_predict`. The oracle consumes the
  `docs/image/jpegxl/fixtures/noise-64x64-lossless/wp-trace-sample-194.md`
  trace (provenance recorded alongside as `wp-trace-provenance.md`),
  which records the FDIS-conformant per-listing intermediates an
  instrumented reference decoder produces at the
  `(channel 0, x=2, y=3)` divergence point bisected in rounds 31..126:

  - `r191_wp_predict_matches_trace_at_sample_194` — drives the
    production `wp_predict` with the trace's `WpState` / `Neighbours`
    inputs; asserts the four sub-predictions `[1248, 747, 420, 559]`,
    the final pre-round prediction `709`, and `max_error = 737` all
    reproduce exactly. Result: PASS — proves Annex E.2 Listings
    E.1 (sub-predictions), E.2 (`err_sum_i` + `error2weight`), E.3
    (weighted sum + same-sign clamp), and E.4 (max_error) are
    spec-correct in `wp_predict`, isolating the still-unfixed
    sample-194 wp_pred8 = 717 vs trace 709 off-by-8 divergence to
    upstream state evolution (`set_true_err` / `set_sub_err`
    calls fired across samples 0..193) rather than the predictor
    arithmetic itself.
  - `r191_trace_err_sum_self_consistency` — pure-arithmetic sanity
    check on the trace's `sub_err_{i,N/NE/NW}` table summing to the
    reported `err_sum_i` (`[438, 330, 416, 240]`).
  - `r191_trace_weights_match_error2weight` — hand-derives the
    trace's `weight_i = [495694, 599189, 474830, 825112]` from
    FDIS-literal `error2weight(err_sum_i, wp_w_i)`; documents a
    1-unit inner-Idiv-vs-multiplication-first discrepancy with the
    production reading that does NOT affect sample 194's shifted
    weights (both readings give `[3, 4, 3, 6]` after the Listing E.3
    `>> sh` step).
  - `r191_trace_prediction_matches_listing_e3` — independent
    hand-derivation of `prediction = 709` from Listing E.3 inputs,
    including verification that the same-sign clamp predicate fires
    but is a no-op (pre-clamp 709 ∈ [min(W,N,NE)=584,
    max(W,N,NE)=1232]).
  - `r191_pin_state_evolution_gap` — pins the production-vs-trace
    delta as a roadmap for the next round's bisect: Δ te_w = +21,
    Δ te_nw = -21 (symmetric pair → likely a single upstream
    defect), Δ wp_pred8 = +8 in 8x scale = +1 in un-shifted pixel
    space (matches `r126_first_divergence_scan` dec=35 vs exp=34).

  Spec citations and provenance attestation embedded in the test
  module docstring; references the in-repo FDIS §E.1-E.4 line
  numbers and the trace doc's stated `prediction − true_value`
  sign convention. Trace doc is the newly-staged
  `docs/image/jpegxl/fixtures/noise-64x64-lossless/wp-trace-*.md`
  pair landed alongside this round (tasks #820 + #1077). Issues #6,
  #64, #799.
- Round 190 (2021-FDIS) — typed per-pass `NonZeros(x, y)` grid
  container (FDIS §C.8.3 + Listing C.13 per-pass keying). New
  `per_pass_non_zeros` module that owns one
  `per_channel_non_zeros::PerChannelNonZerosGrids` per pass index
  `p ∈ [0, num_passes)`, layered above the round-183 per-channel
  container. A VarDCT frame is decoded in `num_passes` ordered passes
  (declared in `FrameHeader.passes.num_passes`); each pass scans every
  `PassGroup` once and §C.8.3 specifies that within a pass each
  channel of each varblock maintains its own `NonZeros(x, y)` state.
  Between passes the per-channel bookkeeping is reset because the
  per-pass histogram is selected by `hfp` from the per-pass `HfPass`
  array — a different pass uses a different histogram and the
  prediction recurrence is keyed against the current pass's own
  coefficient counts. The new module captures the per-pass routing
  layer above round 183's per-channel routing layer:

  - `PerPassNonZerosGrids::new(pass_dims: &[&[(u32, u32)]]) -> Result<Self>`
    — per-pass per-channel `(width, height)` slice, validated
    entry-by-entry via `PerChannelNonZerosGrids::new` (zero / oversize
    dims rejected per channel; empty pass-list rejected).
  - `PerPassNonZerosGrids::new_uniform(num_passes, num_channels, width, height) -> Result<Self>`
    — convenience builder for the uniform-per-pass case.
  - `PerPassNonZerosGrids::{num_passes, pass, pass_mut, predicted, get, set, update_after_block, update_after_block_for_transform}`
    — per-pass routing accessors; out-of-range `p` errors cleanly.
  - `PerPassNonZerosGrids::decode_block_at_for_pass_channel(p, c, x, y, t, block_ctx, nb_block_ctx, read_non_zeros, decode_symbol) -> Result<(DecodedHfBlock, u32)>`
    — typed per-pass per-channel driver that wraps the round-183
    `PerChannelNonZerosGrids::decode_block_at_for_channel` with pass
    routing. Caller pre-computes `block_ctx` via
    `pass_group_hf::block_context` with the matching `c`; the
    container is a pure storage + routing primitive and does not
    re-derive `pass_group_hf::block_context` nor materialise the
    per-pass histogram.
  - Per-pass per-channel shapes are independent — ragged per-pass
    channel counts are tolerated.

  41 new tests (28 unit in `per_pass_non_zeros::tests` + 13 integration
  in `tests/round190_per_pass_non_zeros.rs`) pin: empty-pass-list /
  zero-channel-pass / zero-dim rejection; two-pass chroma-subsampled
  construction; `new_uniform` convenience; out-of-range pass index
  errors on every accessor (8 paths); `PredictedNonZeros(0, 0) = 32`
  on every (pass, channel); per-pass write isolation; per-pass
  `predicted` propagation reads back each pass's own history (not
  another pass's); per-pass `update_after_block_for_transform`
  dispatch (raw `non_zeros = 17` → `{17, 5, 2}` at DCT8×8 / DCT16×16 /
  DCT32×32 on three independent passes); per-pass
  `decode_block_at_for_pass_channel` routing; two-pass three-channel
  raster walk at `(0, 0)` / `(1, 0)` with distinct `[4, 8, 12]` /
  `[3, 6, 9]` per-pass per-channel `raw_non_zeros` sequences preserves
  cross-pass isolation; ragged per-pass channel counts (one-channel
  DC-only preview followed by three-channel main); `u32::MAX`
  no-panic saturating-add chain through the per-pass route. Lib
  tests 608 → 636 (+28).
- Round 183 (2021-FDIS) — typed per-channel `NonZeros(x, y)` grid
  container (FDIS §C.8.3 + Listing C.13 channel-keying). New
  `per_channel_non_zeros` module that owns one
  `non_zeros_grid::NonZerosGrid` per channel, layered above the
  round-177 single-channel primitive. Listing C.13's
  `BlockContext()` factors `c` into `(c < 2 ? c ^ 1 : 2) × 13 + s`,
  so the `NonZeros(x, y)` bookkeeping is keyed per-channel because
  chroma subsampling + `TransformType` heterogeneity means each
  channel's varblock-grid shape can differ:

  - `PerChannelNonZerosGrids::new(dims: &[(u32, u32)]) -> Result<Self>`
    — per-channel `(width, height)` slice, validated entry-by-entry
    via `NonZerosGrid::new` (zero / `> 65535` dims rejected; empty
    slice rejected).
  - `PerChannelNonZerosGrids::new_uniform(num_channels, width, height) -> Result<Self>`
    — convenience builder for the unsubsampled 4:4:4-style container.
  - `PerChannelNonZerosGrids::{num_channels, grid, grid_mut, predicted, get, set, update_after_block, update_after_block_for_transform}`
    — per-channel routing accessors; out-of-range `c` errors cleanly.
  - `PerChannelNonZerosGrids::decode_block_at_for_channel(c, x, y, t, block_ctx, nb_block_ctx, read_non_zeros, decode_symbol) -> Result<(DecodedHfBlock, u32)>`
    — typed per-channel driver that wraps the round-177
    `non_zeros_grid::decode_block_at` with channel routing. Caller
    pre-computes `block_ctx` via `pass_group_hf::block_context` with
    the matching `c`; the container is a pure storage + routing
    primitive.
  - `DEFAULT_NUM_CHANNELS = 3` — the YCbCr / XYB canonical channel
    count.
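
The channel factor quoted above from Listing C.13 can be written out directly. This is only the `(c < 2 ? c ^ 1 : 2) × 13 + s` term, not the full `BlockContext()`; the function name is hypothetical and the meaning of `s` (a per-transform order index) is taken as given rather than derived here.

```rust
/// Channel-dependent term of BlockContext(): (c < 2 ? c ^ 1 : 2) * 13 + s.
/// Channels 0 and 1 swap slots, channel 2 (and above) maps to slot 2.
/// Hypothetical helper; the full BlockContext() has further terms.
fn block_context_channel_term(c: u32, s: u32) -> u32 {
    let channel_slot = if c < 2 { c ^ 1 } else { 2 };
    channel_slot * 13 + s
}
```
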

  36 new tests (24 unit in `per_channel_non_zeros::tests` + 12
  integration in `tests/round183_per_channel_non_zeros.rs`) pin:
  empty-channel-list rejection; zero-dim / oversize-dim rejection
  on any channel; three-channel chroma-subsampled construction at
  `[(16, 16), (8, 8), (8, 8)]`; `new_uniform` convenience;
  out-of-range channel index errors on every accessor (8 paths);
  `PredictedNonZeros(0, 0) = 32` on every channel; per-channel
  write isolation; per-channel `predicted` horizontal chain on a
  seeded channel-1 grid; `update_after_block_for_transform`
  dispatch (raw `non_zeros = 17` → `{17, 5, 2}` at DCT8×8 /
  DCT16×16 / DCT32×32 on three independent channels);
  `decode_block_at_for_channel` routes the round-177 typed driver
  per channel; post-update cell feeds the next-position predicted
  value back per-channel; OOB `(x, y)` past the per-channel grid
  errors cleanly; a two-step three-channel raster walk at
  `(0, 0)` / `(1, 0)` with distinct `[4, 12, 20]` /
  `[6, 18, 30]` per-channel raw_non_zeros sequences preserves
  cross-channel isolation.

  Lib tests 584 → 608 (+24). Pure-control-flow primitive in the
  same shape as round-89 `dct_quant_weights`, round-95
  `hf_dequant`, round-121 `llf_from_lf`, round-138
  `chroma_from_luma`, round-141 `gaborish`, round-144 `epf`,
  round-147 `afv_idct`, round-159 / 164 `pass_group_hf`, and
  round-177 `non_zeros_grid` — no bit reads, no spec
  re-derivation. A future round wiring §C.7.2 entropy histograms
  (#799 DOCS-GAP) + the per-LfGroup varblock-shape grid +
  per-channel `BlockContext()` history can drop these helpers in
  as the per-channel step without re-deriving any Listing C.13 /
  C.14 formulae.
- Round 177 (2021-FDIS) — typed `NonZeros(x, y)` grid bookkeeping +
  per-varblock decode driver (FDIS §C.8.3 + Listing C.13 prelude +
  Listing C.14 post-prose). New `non_zeros_grid` module bridging
  round 159 `pass_group_hf::predicted_non_zeros` (the four-branch
  `PredictedNonZeros(x, y)` recurrence) with round 164
  `pass_group_hf::read_non_zeros_and_decode_block_for_transform`
  (the `TransformType`-driven per-block coefficient loop):

  - `NonZerosGrid::new(width, height) -> Result<Self>` — rectangular
    varblock-grid storage of `NonZeros(x, y)` cells. Defensive
    rejection of zero dims + dims `> 65535`.
  - `NonZerosGrid::{get, set, width, height, cells}` — accessors.
  - `NonZerosGrid::predicted(x, y) -> Result<u32>` — delegates to
    `pass_group_hf::predicted_non_zeros` against
    `|xx, yy| self.get(xx, yy).unwrap_or(0)`.
  - `NonZerosGrid::update_after_block(x, y, non_zeros, num_blocks) -> Result<u32>`
    — FDIS post-Listing-C.14 prose formula
    `(non_zeros + num_blocks - 1) Idiv num_blocks` (ceiling-divide
    identity, `saturating_add` at `u32::MAX`).
  - `NonZerosGrid::update_after_block_for_transform(x, y, non_zeros, t)`
    — `num_blocks` from `pass_group_hf::transform_block_params`.
  - `non_zeros_grid::decode_block_at(grid, x, y, t, block_ctx, nb_block_ctx, read_non_zeros, decode_symbol) -> Result<(DecodedHfBlock, u32)>`
    — typed per-varblock driver: computes `predicted`, invokes
    `read_non_zeros_and_decode_block_for_transform`, then calls
    `update_after_block_for_transform` before returning the
    `(DecodedHfBlock, raw_non_zeros)` pair.
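
The prediction recurrence and the post-block update described in the Round 177 list fit in a few lines. The sketch below uses a hypothetical `Grid` type with `Option` returns rather than the crate's `NonZerosGrid` and its `Result` type. It assumes the top-row branch reads the left neighbour and the left-column branch reads the neighbour above, which is how the "horizontal / vertical raster chain" tests in this entry read.

```rust
/// Minimal stand-in for a NonZeros(x, y) grid (hypothetical, not the crate's type).
struct Grid {
    width: usize,
    height: usize,
    cells: Vec<u32>, // row-major, zero-initialised
}

impl Grid {
    /// Rejects zero dimensions and dimensions above 65535.
    fn new(width: usize, height: usize) -> Option<Self> {
        if width == 0 || height == 0 || width > 65535 || height > 65535 {
            return None;
        }
        Some(Self { width, height, cells: vec![0; width * height] })
    }

    fn in_bounds(&self, x: usize, y: usize) -> bool {
        x < self.width && y < self.height
    }

    /// Four-branch PredictedNonZeros(x, y) recurrence.
    fn predicted(&self, x: usize, y: usize) -> Option<u32> {
        if !self.in_bounds(x, y) {
            return None;
        }
        let at = |xx: usize, yy: usize| self.cells[yy * self.width + xx];
        Some(match (x, y) {
            (0, 0) => 32,
            (_, 0) => at(x - 1, 0), // top row: left neighbour
            (0, _) => at(0, y - 1), // left column: neighbour above
            // Interior: rounded average, widened so u32::MAX cells cannot overflow.
            _ => ((at(x, y - 1) as u64 + at(x - 1, y) as u64 + 1) >> 1) as u32,
        })
    }

    /// Stores (non_zeros + num_blocks - 1) Idiv num_blocks and returns it.
    fn update_after_block(&mut self, x: usize, y: usize, non_zeros: u32, num_blocks: u32) -> Option<u32> {
        if !self.in_bounds(x, y) || num_blocks == 0 {
            return None;
        }
        let stored = non_zeros.saturating_add(num_blocks - 1) / num_blocks;
        self.cells[y * self.width + x] = stored;
        Some(stored)
    }
}
```
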

  35 new tests (23 unit in `non_zeros_grid::tests` + 12 integration
  in `tests/round177_non_zeros_grid.rs`) pin: defensive rejection
  of zero / oversize (> 65535) dims and out-of-range `(x, y)`;
  zero-init cells; `PredictedNonZeros(0, 0) = 32` across a sweep
  of grid shapes; the y == 0 and x == 0 border-recurrence branches
  via horizontal / vertical raster chains; the interior
  `(above + left + 1) >> 1` average (odd-sum rounding); the
  `predicted_non_zeros` helper agreement byte-for-byte on a seeded
  3×3 grid; the post-Listing-C.14 ceiling-divide formula at
  `num_blocks ∈ {1, 4, 16}` (DCT8×8 / DCT16×16 / DCT32×32 — the
  `TransformType` dispatch reduces a raw `non_zeros = 17` to
  `{17, 5, 2}` at the three shapes); the typed driver's
  `predicted = 32` at the origin routes through the `predicted >= 8`
  `NonZerosContext` branch
  (`ctx = block_ctx + nb_block_ctx × (4 + 32 Idiv 2) = 67` at
  `(block_ctx, nb_block_ctx) = (7, 3)`);
  `decode_block_at` reads back `(0, 0)`'s post-update cell when
  invoked at `(1, 0)`; OOB positions error cleanly; per-channel
  independence (two grids of the same shape evolve
  independently); row-major `cells()` layout pinned at
  `[0, 10, 20, 30]` after writing `(1,0)=10`, `(0,1)=20`, `(1,1)=30`
  on a 2×2 grid; and pathological `u32::MAX` does not panic.

  Lib tests 561 → 584 (+23). Pure-control-flow primitive in the
  same shape as round-89 `dct_quant_weights`, round-95
  `hf_dequant`, round-121 `llf_from_lf`, round-138
  `chroma_from_luma`, round-141 `gaborish`, round-144 `epf`,
  round-147 `afv_idct`, and round-159 / 164 `pass_group_hf` — no
  bit reads, no spec re-derivation. A future round wiring §C.7.2
  entropy histograms (#799 DOCS-GAP) + the per-LfGroup
  varblock-shape grid + per-channel `BlockContext()` history can
  drop these helpers in as the per-varblock-position step without
  re-deriving any Listing C.13 / C.14 formulae.
- Round 164 (2021-FDIS) — `TransformType`-driven entry points for
  the §C.8.3 per-block HF coefficient decode loop (DCT16×16 /
  DCT16×8 / DCT32×32 dimensions pinned end-to-end). New public API
  in `pass_group_hf`:

  - `transform_block_params(t: TransformType) -> (num_blocks, size)`
    — §I.2.4 opening paragraph + Listing C.14:
    `num_blocks = (bwidth / 8) × (bheight / 8)`,
    `size = bwidth × bheight`.
  - `decode_block_coefficients_for_transform(t, initial_non_zeros, block_ctx, nb_block_ctx, decode_symbol)`
    — typed wrapper that derives `(num_blocks, size, natural_order)`
    from `t` (via [`coeff_order::order_id_for_transform`] +
    [`coeff_order::natural_coeff_order`]) and reduces to the
    round-159 `decode_block_coefficients`.
  - `read_non_zeros_and_decode_block_for_transform(t, predicted, block_ctx, nb_block_ctx, read_non_zeros, decode_symbol)`
    — analogous typed wrapper around
    `read_non_zeros_and_decode_block`.
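
The `(num_blocks, size)` derivation is plain arithmetic on the block dimensions. The sketch below uses a hypothetical four-variant enum covering only the shapes named in this entry, not the crate's `TransformType`, and shows the `num_blocks * 64 == size` invariant and the `size - num_blocks` upper bound on `initial_non_zeros`.

```rust
/// Hypothetical subset of transform shapes, for illustration only.
#[derive(Clone, Copy, Debug)]
enum Transform {
    Dct8x8,
    Dct16x8,
    Dct16x16,
    Dct32x32,
}

/// Block width and height in samples.
fn dims(t: Transform) -> (u32, u32) {
    match t {
        Transform::Dct8x8 => (8, 8),
        Transform::Dct16x8 => (16, 8),
        Transform::Dct16x16 => (16, 16),
        Transform::Dct32x32 => (32, 32),
    }
}

/// num_blocks = (bwidth / 8) * (bheight / 8), size = bwidth * bheight.
fn transform_block_params(t: Transform) -> (u32, u32) {
    let (bwidth, bheight) = dims(t);
    ((bwidth / 8) * (bheight / 8), bwidth * bheight)
}

/// Largest initial non-zero count a block of this shape can carry,
/// since the first `num_blocks` slots are not part of the HF loop.
fn max_initial_non_zeros(t: Transform) -> u32 {
    let (num_blocks, size) = transform_block_params(t);
    size - num_blocks
}
```
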

  20 new tests (8 unit in `pass_group_hf::tests` + 12 integration
  in `tests/round164_dct16x16_block_coefficient_loop.rs`) pin the
  `(num_blocks, size)` derivation for every Table C.16 transform
  (every entry satisfies `num_blocks * 64 == size`); the DCT16×16
  `prev` threshold at `non_zeros == 17` (= size/16 + 1); the typed
  entry point at DCT8×8 reduces to the raw entry point; the typed
  entry point at DCT16×16 walks `(num_blocks=4, size=256)` for
  all-zero / single-non-zero / three-consecutive / full-density
  (252 reads) cases with coefficients landing at
  `natural_coeff_order(Id2)[4..]`; the typed and raw entry points
  agree byte-for-byte on a mixed `[2, 0, 4, 0, 0, 6]` sequence;
  `read_non_zeros_and_decode_block_for_transform` threads the
  `NonZerosContext` value through the first closure; the rectangular
  DCT16×8 / DCT8×16 collapse to the same per-block outcome (they
  share OrderId::Id4); defensive rejection of
  `initial_non_zeros > size - num_blocks` (= 252 max for DCT16×16);
  and one DCT32×32 smoke-test at `(num_blocks=16, size=1024)`. Lib
  tests 553 → 561 (+8). Pure-typed wrapper layer: no new bit reads,
  no spec re-derivation — the round-159 module note ("the primitive
  itself is shape-agnostic and ready for the larger variable-block
  sizes once their parameterisation lands") is now exercised from
  the caller-facing API.
- Round 159 (2021-FDIS) — §C.8.3 per-block HF coefficient decode
  loop scaffolding (Listing C.13 + Listing C.14). New public API in
  `pass_group_hf`:

  - `prev_for_context(k, num_blocks, size, non_zeros, prev_nonzero)`
    — Listing C.14 verbatim
    (`k == num_blocks ? (non_zeros > size / 16 ? 1 : 0) : (prev_nonzero(k - 1) ? 1 : 0)`).
  - `DecodedHfBlock { coeffs, remaining_non_zeros, coeffs_read }` —
    return bundle for the per-block primitive.
  - `decode_block_coefficients(natural_order, num_blocks, size, initial_non_zeros, block_ctx, nb_block_ctx, decode_symbol)`
    — Listing C.14's per-block raster-order loop with the §C.8.3
    "stop when non_zeros reaches 0" early-exit, `UnpackSigned`
    application, and `natural_order[k]` placement. The
    `decode_symbol: FnMut(ctx) -> Result<u32>` closure abstracts
    over the (still un-landed) §C.7.2 entropy histograms — a real
    consumer wraps `EntropyStream` + `HybridUintState` + the
    per-group `histogram_offset`; tests can hand-roll a symbol
    sequence.
  - `read_non_zeros_and_decode_block(.., predicted, .., read_non_zeros, decode_symbol)`
    — convenience wrapper that issues the
    `D[NonZerosContext(predicted) + offset]` read via the first
    closure and drives `decode_block_coefficients` with the result.
    Returns `(DecodedHfBlock, non_zeros)` so the caller can update
    its NonZeros-grid bookkeeping per
    `NonZeros(x, y) = (non_zeros + num_blocks - 1) Idiv num_blocks`.
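
The loop shape described in the Round 159 list can be sketched with a closure standing in for the entropy decoder. This is a simplification, not the crate's `decode_block_coefficients`: the closure receives only the `prev` flag instead of the full coefficient context, `prev_nonzero` is passed as a `bool` rather than a lookup, errors are not modelled, and `natural_order` is assumed to be a valid permutation of `0..size`.

```rust
/// UnpackSigned: even codes map to non-negative values, odd codes to negative ones.
fn unpack_signed(u: u32) -> i32 {
    if u & 1 == 0 {
        (u >> 1) as i32
    } else {
        -((u >> 1) as i32) - 1
    }
}

/// `prev` flag for the coefficient context, following the Listing C.14 expression
/// quoted above. `prev_nonzero` says whether the coefficient at k - 1 was non-zero.
fn prev_for_context(k: usize, num_blocks: usize, size: usize, non_zeros: u32, prev_nonzero: bool) -> u32 {
    if k == num_blocks {
        (non_zeros as usize > size / 16) as u32
    } else {
        prev_nonzero as u32
    }
}

/// Simplified per-block loop: starts at k = num_blocks, stops as soon as
/// `non_zeros` reaches 0. Returns the coefficients and the number of symbol reads.
fn decode_block(
    natural_order: &[usize],
    num_blocks: usize,
    size: usize,
    mut non_zeros: u32,
    mut decode_symbol: impl FnMut(u32) -> u32,
) -> (Vec<i32>, usize) {
    let mut coeffs = vec![0i32; size];
    let mut reads = 0;
    let mut prev_nonzero = false;
    let mut k = num_blocks;
    while k < size && non_zeros > 0 {
        let prev = prev_for_context(k, num_blocks, size, non_zeros, prev_nonzero);
        let ucoeff = decode_symbol(prev);
        reads += 1;
        coeffs[natural_order[k]] = unpack_signed(ucoeff);
        prev_nonzero = ucoeff != 0;
        if ucoeff != 0 {
            non_zeros -= 1;
        }
        k += 1;
    }
    (coeffs, reads)
}
```
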

  Bounded scope: DCT8×8 alone — `num_blocks = 1`, `size = 64`,
  `OrderId::Id0` natural-coefficient order (the simplest case that
  exercises the full state machine). The primitive itself is
  shape-agnostic; the larger variable-block sizes (DCT16×16,
  DCT32×32, AFV0..3, …) need their `num_blocks` / `size` parameters
  threaded through the varblock driver above this primitive.

  11 new unit tests (`pass_group_hf::tests::*`) + 11 integration
  tests (`round159_block_coefficient_loop`) cover: all-zero block
  (no symbol reads); single non-zero at the first HF slot (one
  read, `UnpackSigned(1) = -1` at `natural_order[1]`); three
  consecutive non-zeros (loop stops after three reads); full
  density (63 reads, LLF cell untouched); the size/16 threshold
  for `prev` (crossover at `non_zeros == 5` for DCT8×8); the
  "previous coefficient is zero / non-zero" flag tracking through
  the loop's history; defensive rejection of malformed
  natural-order vectors, zero `num_blocks`, and over-large
  `initial_non_zeros`; closure-threaded end-to-end smoke through
  `read_non_zeros_and_decode_block`. Lib tests 538 → 553 (+15).

  Pure-math / pure-control-flow primitive in the same shape as
  round-89 `dct_quant_weights`, round-95 `hf_dequant`, round-121
  `llf_from_lf`, round-138 `chroma_from_luma`, round-141
  `gaborish`, round-144 `epf`, and round-147 `afv_idct` — a future
  round wiring §C.7.2 histograms into the per-pass entropy stream
  can drop this primitive in as the per-block loop body without
  re-deriving any C.13 / C.14 formulae. The §C.7.2 entropy
  histogram decode (#799 DOCS-GAP), the per-channel (Y / X / B)
  `non_zeros` read in the varblock driver above this primitive,
  the per-pass NonZeros-grid update, and the per-varblock
  `BlockContext()` derivation remain follow-up work for subsequent
  rounds.
- Round 150 (2021-FDIS) — Annex I.2.3.8 / Listing I.13 Inverse AFV
  transform composition (`idct::idct_afv`). Composes the round-147
  `crate::afv::afv_idct` pure-math primitive (Listings I.5 + I.6)
  with two `idct_2d` calls (one at 4×4, one at 4×8) per the
  three-sub-block decomposition of Listing I.13 — yielding the
  full 8×8 sample buffer for `TransformType::Afv0..Afv3`. With
  this wiring the `idct::idct_for_transform` dispatcher routes
  `Afv0..Afv3` to `idct_afv` instead of returning
  `Err(Unsupported)`; all 10 non-DCT branches of Table I.4 are now
  pure-math-complete (Hornuss / DCT2×2 / DCT4×4 / DCT8×4 / DCT4×8 +
  AFV0..AFV3). Each AFV variant's sub-block placement is
  controlled by `flip_x = n & 1` / `flip_y = n >> 1` (§I.2.3.8);
  the AFV sub-block additionally mirrors its read coordinates
  (`flip_x == 1 ? 3 - ix : ix` and the iy dual) per the inner
  loop of Listing I.13. Seven new property-style tests cover:
  rejection of non-AFV transforms / wrong lengths; all-zero
  input → all-zero output for all four variants; DC-only input
  → constant `c(0,0)` output (the three DC patches
  `(c00+c01+c10) × 4`, `c00-c01+c10`, `c00-c01` collapse to `4·1`,
  `1`, `1` respectively, with each sub-block's IDCT mapping a
  DC-only cell to a constant sub-block since `AFVBasis` row 0 =
  `[0.25; 16]` and `IDCT_2D` DC-only is constant); dense-AC input →
  every cell written; AFV0↔AFV1 x-axis flip swaps the AFV
  sub-block column reads; AFV0↔AFV2 y-axis flip swaps the 4×8
  sub-block y-band placement; linearity. Test-count delta: `+7`
  (531 → 538).

  FDIS typo documented in module docs. Listing I.13's final
  source line reads `samples_4×4(ix, iy)` but the inner loop
  iterates `ix ∈ [0..8)` and `samples_4×4` only has columns
  0..3, while the immediately preceding line computes
  `samples_4×8 = IDCT_2D(coeffs_4×8)`. Implementation reads from
  `samples_4×8` per context; the typo is now annotated alongside
  the existing four Annex D / D.3 typos in the project
  FDIS-typo memory.
- Round 147 (2021-FDIS) — Annex I.2.2 AFV basis + `AFV_IDCT`
  pure-math primitive (Listings I.5 + I.6, p. 76). New
  `src/afv.rs` module transcribes the orthonormal `AFVBasis[16][16]`
  table from Listing I.5 verbatim and the Listing I.6 cell-sum
  `samples[i] = sum_j coefficients[j] × AFVBasis[j][i]`. Public
  API:

  - `AFV_CELL_LEN: usize = 16` — the §I.2.2 4×4-as-flat-16 cell.
  - `AFV_BASIS: [[f32; 16]; 16]` — verbatim Listing I.5.
  - `afv_idct(coefficients: &[f32]) -> Result<[f32; 16]>` —
    Listing I.6.

  The 256-float transcription is independently verified at the
  table level: row-0 = `[0.25; 16]` (Listing I.5 line 1); row-4 =
  two non-zero entries at columns 1 and 4, both at ±1/sqrt(2),
  zero elsewhere (Listing I.5 line 5); per-row L2 unit-norm
  (orthonormality diagonal); pairwise zero inner product
  (orthonormality off-diagonal); `afv_idct` is linear; one-hot
  coefficient input recovers `AFVBasis[j]` row-for-row;
  `||samples||_2 == ||coefficients||_2` (L2 conservation, an
  orthonormal-basis property). A single transcription typo in any
  of the 256 entries would fail at least one orthonormality sum.

  10 new unit tests + 9 integration tests
  (`round147_afv_idct`); lib tests 521 → 531. Pure-math primitive
  in the same shape as round-89 `dct_quant_weights`, round-95
  `hf_dequant`, round-121 `llf_from_lf`, round-138
  `chroma_from_luma`, round-141 `gaborish`, and round-144 `epf` —
  a future round wiring §I.2.3.8 Inverse AFV transform (Listing
  I.13) into `idct_for_transform` can drop this helper in without
  re-deriving any I.5 / I.6 cells. The Listing I.13 composition
  (the `coeffs_afv` corner-load, the two `IDCT_2D` 4×4 / 4×8
  sub-blocks, the `flip_x` / `flip_y` AFVn flip) remains
  follow-up work because it depends on `idct_2d` for non-square
  blocks plus the AFVn dispatch wiring; the §I.2.2 arithmetic
  core landed in this round unblocks that follow-up.
- Round 144 (2021-FDIS) — Annex J.3 "Edge-preserving filter"
  pure-math primitive (pages 85–87). New `src/epf.rs` module
  transcribes the four §J.3 listings as a self-contained pure-math
  primitive: given a triple of three-channel f32 planes (the output
  of round-141 Gaborish on the §I.2.5 + Annex G chain), per-call
  scalar parameters (sigma, step_multiplier, zeroflush,
  position_multiplier_border, channel_scale), and a
  [`frame_header::RestorationFilter`] (Table C.9) for
  `epf_quant_mul` / `epf_sharp_lut[..]` / `epf_sigma_for_modular`,
  this module returns the per-pass output planes Listing J.4
  prescribes. Public API:

  - `distance_step_0_and_1(x, y, b, w, h, x, y, cx, cy, scale)` —
    Listing J.1 `DistanceStep0and1` (the five-pixel cross-shape
    three-channel scaled L1 distance for passes 0 and 1).
  - `distance_step_2(...)` — Listing J.1 `DistanceStep2` (the
    single-sample three-channel scaled L1 distance for pass 2,
    under the literal `(ix, iy) == (0, 0)` reading of the
    free-variable bug — see DOCS-GAP).
  - `weight(distance, inv_sigma, position_multiplier, zeroflush)`
    — Listing J.2 `Weight()` decreasing-function-of-distance
    kernel with the `v <= zeroflush` cutoff.
  - `inv_sigma_for_pass(step_multiplier, sigma)` — Listing J.2's
    pre-computed `step_multiplier × 4 × (sqrt(0.5) - 1) / sigma`
    factor (rejects non-finite or non-positive sigma).
  - `vardct_sigma_from_listing_j3(quantization_width, sharpness, &rf)`
    — Listing J.3's per-varblock sigma derivation with the
    `max(1e-4, ..)` clamp; the modular-mode branch uses
    `rf.epf_sigma_for_modular` directly.
  - `is_border_position(x, y)` — Listing J.2's "either coordinate
    of the reference sample is 0 or 7 IMod 8" predicate driving
    the per-pixel `epf_border_sad_mul` selection.
  - `apply_step_5tap(Pass::Pass1 | Pass::Pass2, ..)` — Listing
    J.4's 5-tap cross-shape kernel pass (passes 1 and 2); the
    distance metric is selected by the `Pass` discriminant.
  - `apply_step_13tap(..)` — Listing J.4's 13-tap diamond kernel
    pass 0 (always using `DistanceStep0and1`).
  - `Pass` — enum picking Pass0 / Pass1 / Pass2 for the dispatch.
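
Two of the smaller helpers in the Round 144 list are simple enough to sketch from the descriptions alone. The code below is an illustration under that reading, not the crate's implementation: it uses `Option` where the crate returns its own `Result`, and the coordinate type is an assumption.

```rust
/// True when either coordinate sits on the edge of its 8x8 block,
/// i.e. is 0 or 7 modulo 8.
fn is_border_position(x: usize, y: usize) -> bool {
    let on_edge = |v: usize| matches!(v % 8, 0 | 7);
    on_edge(x) || on_edge(y)
}

/// step_multiplier * 4 * (sqrt(0.5) - 1) / sigma, rejecting sigma values
/// that are non-finite or not strictly positive. The result is negative
/// for positive inputs because sqrt(0.5) < 1.
fn inv_sigma_for_pass(step_multiplier: f32, sigma: f32) -> Option<f32> {
    if !sigma.is_finite() || sigma <= 0.0 {
        return None;
    }
    Some(step_multiplier * 4.0 * (0.5f32.sqrt() - 1.0) / sigma)
}
```
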

  §6.5 `Mirror1D` boundary handling is reused verbatim from
  round-141 `gaborish::mirror1d`. 36 new unit tests + 12 new
  integration tests (`round144_epf`) pin self-distance-is-zero on
  constant planes for both metrics, per-channel-scale linearity,
  offset symmetry for `DistanceStep0and1`, `DistanceStep2`
  hand-derived spatially-varying-plane case
  (`x:1×40 + y:2×5 + b:0×3.5 = 50`), `Weight()` zero-distance
  returns 1.0 / zeroflush cutoff / position-multiplier scaling,
  Listing J.3 sigma at default `rf` sharpness 0 → 1e-4 clamp and
  sharpness 7 → full quant, the `is_border_position` 8×8 grid
  layout, constant-plane invariance across all three passes, and
  the zero-channel-scale collapse to the uniform mean on a centre
  impulse. Lib tests 485 → 521. Pure-math primitive in the same
  shape as round-89 `dct_quant_weights`, round-95 `hf_dequant`,
  round-121 `llf_from_lf`, round-138 `chroma_from_luma`, and
  round-141 `gaborish` — a future round wiring §J.3 into the
  per-frame restoration-filter pipeline can drop these helpers in
  without re-deriving any of the J.1/J.2/J.3/J.4 listings. The
  per-frame loop (calling each pass for each varblock under the
  right `epf_iters` / per-block sigma / position-multiplier
  conditions with output of pass `i` feeding pass `i+1`), the
  `sigma < 0.3` skip-the-block path, and the `epf_iters > 0` skip
  remain caller responsibilities (deferred to follow-up rounds).
  DOCS-GAP observed in FDIS Listing J.1 `DistanceStep2` (free
  `ix` / `iy` variables — adopted `(ix, iy) == (0, 0)`) and Listing
  J.2 `step_multiplier` array (missing comma between
  `epf_pass0_sigma_scale` and `1`); both surfaced in the
  module-level rustdoc with the adopted reading and rationale, and
  the public API sidesteps the indexing ambiguity by accepting
  `step_multiplier: f32` directly so the wiring round can pick the
  resolution without API churn.
- Round 141 (2021-FDIS / 2024-spec) — Annex J.2 "Gabor-like
  transform" pure-math primitive (page 85). New `src/gaborish.rs`
  module transcribes FDIS §J.2 verbatim: given a per-channel plane
  of f32 samples (the output of §I.2.5 LLF/HF reconstruction + the
  round-138 Annex G chroma-from-luma chain) and the per-channel
  `gab_C_weight1` / `gab_C_weight2` weights carried by
  [`frame_header::RestorationFilter`] (Table C.9), the module applies
  the spec's symmetric 3×3 convolution
  `(centre = 1, edges = w1, corners = w2)`, rescaled uniformly so
  the nine kernel entries sum to 1, with §6.5 `Mirror1D` boundary
  handling on out-of-image references. Public API:
  `mirror1d(coord, size)` (Listing 6.1 iterative form),
  `sample_mirror(plane, w, h, x, y)` (direct §6.5 fetch),
  `gab_kernel(w1, w2) -> [f32; 9]` (materialised normalized kernel
  in row-major order), `apply_channel` (out-of-place per-channel
  convolution with an interior fast path + edge-mirror fallback),
  `apply_channel_in_place` (single-buffer scratch convenience), and
  `apply_xyb_planes_in_place(x, y, b, w, h, &rf)` (the three-channel
  XYB-pipeline convenience using `rf.gab_x_weight*` /
  `gab_y_weight*` / `gab_b_weight*`). 23 new
  unit tests + 10 new integration tests (`round141_gaborish`) pin
  `Mirror1D`'s identity / first-reflection / single-row collapse
  cases, the default-weight kernel sum-to-one and centre-tap
  (≈ 0.586) values, the four-edge / four-corner kernel symmetry,
  identity-kernel pass-through, constant-plane invariance, the
  per-channel impulse response on a 3×3 plane, linearity of the
  convolution operator, single-row mirror-collapse, and the
  per-channel dispatch through `apply_xyb_planes_in_place`. Lib
  tests 462 → 485. This is a pure-math primitive in the same shape
  as round-89 `dct_quant_weights`, round-95 `hf_dequant`, round-121
  `llf_from_lf`, and round-138 `chroma_from_luma`: it lands the
  bit-exact arithmetic so a future round wiring §J.2 into the
  per-frame restoration-filter pipeline can drop it in without
  re-deriving the kernel or the mirror semantics. Does NOT
  implement §J.3 (edge-preserving filter) and does NOT honour the
  `rf.gab` skip — both are the caller's responsibility.
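
The kernel and boundary rule from the Round 141 entry can be illustrated as follows. This is a straightforward sketch without the interior fast path, not the crate's `gaborish` module. The mirror rule is written as the usual reflect-without-repeating-the-edge loop, and the default weights used in the checks (`0.115169525` and `0.061248592`) are assumed values that happen to reproduce the ≈ 0.586 centre tap quoted above.

```rust
/// Reflect an out-of-range coordinate back into [0, size). Precondition: size > 0.
fn mirror1d(mut coord: i64, size: i64) -> i64 {
    loop {
        if coord < 0 {
            coord = -coord - 1;
        } else if coord >= size {
            coord = 2 * size - 1 - coord;
        } else {
            return coord;
        }
    }
}

/// Row-major 3x3 kernel (centre = 1, edges = w1, corners = w2),
/// rescaled so the nine entries sum to 1.
fn gab_kernel(w1: f32, w2: f32) -> [f32; 9] {
    let norm = 1.0 / (1.0 + 4.0 * (w1 + w2));
    let (c, e, k) = (norm, w1 * norm, w2 * norm);
    [k, e, k, e, c, e, k, e, k]
}

/// Out-of-place convolution of one channel with mirrored boundaries.
fn apply_channel(plane: &[f32], w: usize, h: usize, kernel: &[f32; 9]) -> Vec<f32> {
    let mut out = vec![0.0f32; w * h];
    for y in 0..h {
        for x in 0..w {
            let mut acc = 0.0f32;
            for dy in -1i64..=1 {
                for dx in -1i64..=1 {
                    let sx = mirror1d(x as i64 + dx, w as i64) as usize;
                    let sy = mirror1d(y as i64 + dy, h as i64) as usize;
                    acc += kernel[((dy + 1) * 3 + (dx + 1)) as usize] * plane[sy * w + sx];
                }
            }
            out[y * w + x] = acc;
        }
    }
    out
}
```
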
- Round 138 (2021-FDIS / 2024-spec) — Annex G "Chroma from luma"
pure-math primitive (Listing G.1). New src/chroma_from_luma.rs
module transcribes FDIS Annex G (page 73) verbatim: given the
per-frame [lf_global::LfChannelCorrelation] bundle (§C.4.4) and,
for HF coefficients, the per-64×64-tile factor samples from
[lf_group::HfMetadata]'s x_from_y / b_from_y channels
(§C.5.4), the module computes the CfL multipliers (kX, kB) and
applies the Listing G.1 reconstruction X = dX + kX × Y,
B = dB + kB × Y, Y = dY per sample. Public API:
kx_kb_raw(base_x, base_b, colour_factor, x_factor, b_factor)
(Listing G.1 lines 1-2), kx_kb_lf(cfl) (LF derivation
x_factor = x_factor_lf - 127, b_factor = b_factor_lf - 127),
kx_kb_hf(cfl, x_factor_hf, b_factor_hf) (HF derivation from the
64×64-tile factor sample), apply_sample / apply_lf_sample /
apply_hf_sample for the per-sample reconstruction, and the
plane-level apply_lf_plane_inplace(dx, dy, db, cfl) (constant
per-frame (kX, kB)) + apply_hf_plane_inplace(dx, dy, db, w, h, x_from_y, b_from_y, cfl) (per-tile tile_x = x/64 / tile_y = y/64
lookup, with a per-tile (kX, kB) cache). 20 new unit tests + 11
new integration tests (round138_chroma_from_luma) pin the
default-bundle multipliers (kX = 1/84, kB = 1 + 1/84), the
Y-identity line, the round-trip against the encoder-side
decorrelation dX = X - kX × Y, multi-tile HF plane lookup
(128×64 → 2 tiles wide, 65×65 → 4 tiles via div_ceil), and the
defensive colour_factor == 0 rejection on both LF and HF paths.
Lib tests 442 → 462. This is a pure-math primitive in the same
shape as round-89 dct_quant_weights, round-95 hf_dequant, and
round-121 llf_from_lf: it lands the bit-exact arithmetic so a
future round wiring §F.3 + Annex G into the per-LfGroup VarDCT
pipeline can drop it in without re-deriving any G.1 formulae.
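
The per-sample arithmetic described in this entry can be sketched roughly as follows. The function names are hypothetical stand-ins rather than the module's actual signatures, and the multiplier formula (base correlation plus factor divided by colour_factor) is an assumption inferred from the default-bundle values quoted above.

```rust
/// Assumed multiplier derivation: k = base + factor / colour_factor.
/// Returns None for colour_factor == 0, mirroring the defensive rejection.
fn cfl_multipliers(
    base_x: f32,
    base_b: f32,
    colour_factor: u32,
    x_factor: i32,
    b_factor: i32,
) -> Option<(f32, f32)> {
    if colour_factor == 0 {
        return None;
    }
    let cf = colour_factor as f32;
    Some((base_x + x_factor as f32 / cf, base_b + b_factor as f32 / cf))
}

/// Reconstruction from the entry: X = dX + kX * Y, B = dB + kB * Y, Y = dY.
fn cfl_reconstruct(dx: f32, dy: f32, db: f32, kx: f32, kb: f32) -> (f32, f32, f32) {
    let y = dy;
    (dx + kx * y, y, db + kb * y)
}
```

With base correlations of 0.0 and 1.0, a colour_factor of 84 and both factors equal to 1 (that is, an LF factor of 128 minus 127), the sketch gives kX = 1/84 and kB = 1 + 1/84, which agrees with the pinned defaults.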
Does not handle subsampled chroma (Annex G excludes that case
outright) and does not drive the per-LfGroup loop (deferred).
- Round 133 (2021-FDIS / 2024-spec) — §C.7.1 DecodePermutation()
for used_orders != 0. HfPass::read now handles the
non-natural coefficient-order path of Listing C.12: the shared
"8 clustered distributions D" are read once into a
modular_fdis::EntropyStream (num_dist = 8) with its ANS state
initialised, then each set used_orders bit runs the §C.3.2
Lehmer-code permutation against that same stream. New public
coeff_order::decode_permutation_from_stream(br, entropy, hybrid, size, skip) factors the §C.3.2 procedure generically (the same
algorithm the TOC permuted_toc path uses); §C.7.1 supplies
size = coefficient_count(order) and skip = size / 64, yielding
order[i] = natural_coeff_order[nat_ord_perm[i]]. HfPass::read
no longer returns Error::Unsupported for used_orders != 0.
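
For readers unfamiliar with Lehmer codes, the following is a minimal sketch of the code-to-permutation step and of the final order[i] = natural_coeff_order[nat_ord_perm[i]] lookup. The names are hypothetical, the entropy-coded read of the Lehmer values is omitted, and the skip prefix is not modelled, so treat it as an illustration of the idea and not as the crate's decode_permutation_from_stream.

```rust
/// Turn a Lehmer code into a permutation of 0..n: each code value picks
/// (and removes) an element from the list of indices not yet used.
/// Returns None when a code value is out of range for the remaining list.
fn lehmer_to_permutation_sketch(lehmer: &[usize]) -> Option<Vec<usize>> {
    let mut remaining: Vec<usize> = (0..lehmer.len()).collect();
    let mut perm = Vec::with_capacity(lehmer.len());
    for &code in lehmer {
        if code >= remaining.len() {
            return None;
        }
        perm.push(remaining.remove(code));
    }
    Some(perm)
}

/// Compose the permutation with a natural order: order[i] = natural[perm[i]].
/// Panics if a permutation entry is not a valid index into `natural`.
fn apply_order(natural: &[u32], perm: &[usize]) -> Vec<u32> {
    perm.iter().map(|&p| natural[p]).collect()
}
```

The Vec::remove call makes this quadratic in the permutation size, which is acceptable for a sketch; a production decoder would likely use a cheaper structure for large orders.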
Adds get_context + lehmer_to_permutation unit coverage and
rewrites the two former hf_pass Unsupported tests to assert the
stream-read path is now taken.
- Round 129 (2021-FDIS / 2024-spec) — per-varblock LF→LLF
composition glue (§I.2.5 plumbing). Three new public functions
in vardct that compose the round-121
[llf_from_lf::llf_from_lf] pure-math step with a single
channel's dequantised LF samples for a single varblock placement:
  - vardct::extract_lf_subblock(lf_samples, lf_width, lf_height, bx, by, t) — extracts the cy × cx LF sub-block at varblock
    origin (bx, by) in row-major order, per FDIS §I.2.5 prose
    "the corresponding X/8 × Y/8 samples from the dequantized LF
    image". Returns Err(InvalidData) on dim-mismatch, origin
    overflow, or varblock extending past the LF grid (defensive
    bounds-checking before the indexing).
  - vardct::compose_lf_to_llf_block(lf_samples, lf_width, lf_height, bx, by, t) — extract_lf_subblock + llf_from_lf
    in one call, returning the cy × cx LLF coefficient block of
    the top-left of an HF varblock.
  - vardct::compose_lf_to_llf_block_3ch(&LfDequantOutput, bx, by, t) — convenience wrapper that invokes the per-channel helper
    once for each of the three colour channels (X, Y, B) when no
    channel is subsampled (the common case where §F.2 adaptive LF
    smoothing is applied); rejects mismatched per-channel dims with a
    clear InvalidData message pointing the caller at the
    per-channel compose_lf_to_llf_block for the subsampled case.
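
The bounds-checked extraction described for the first helper can be sketched as below. This is a hypothetical standalone function: it takes the sub-block size (cx, cy) directly instead of deriving it from a TransformType, assumes bx and by are expressed in LF samples, and reports errors as strings where the crate uses its own error type.

```rust
/// Copy the cy x cx window at (bx, by) out of a row-major LF plane.
/// All bounds are validated before any indexing takes place.
fn extract_subblock_sketch(
    lf: &[f32],
    lf_w: usize,
    lf_h: usize,
    bx: usize,
    by: usize,
    cx: usize,
    cy: usize,
) -> Result<Vec<f32>, String> {
    let total = lf_w
        .checked_mul(lf_h)
        .ok_or_else(|| "LF dimensions overflow".to_string())?;
    if lf.len() != total {
        return Err("LF sample count does not match dimensions".to_string());
    }
    let x_end = bx
        .checked_add(cx)
        .ok_or_else(|| "origin overflow in x".to_string())?;
    let y_end = by
        .checked_add(cy)
        .ok_or_else(|| "origin overflow in y".to_string())?;
    if x_end > lf_w || y_end > lf_h {
        return Err("varblock extends past the LF grid".to_string());
    }
    let mut out = Vec::with_capacity(cx * cy);
    for y in by..y_end {
        let row = y * lf_w;
        out.extend_from_slice(&lf[row + bx..row + x_end]);
    }
    Ok(out)
}
```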
24 new tests (15 unit in
src/vardct.rs + 9 integration in
tests/round129_compose_lf_to_llf.rs). Covers DCT8×8 / DCT16×16
/ DCT32×32 squares, all six DCT16×8-class rectangles (DCT16×8,
DCT8×16, DCT32×8, DCT8×32, DCT32×16, DCT16×32), the nine non-DCT
pass-through transforms (Hornuss / DCT2×2 / DCT4×4 / DCT4×8 /
DCT8×4 / AFV0..AFV3), every kind of out-of-bounds varblock
placement (x-only, y-only, both, and DCT32×32 at the only
fitting origin), LfDequantOutput subsampling rejection, and
byte-exact agreement with the hand-derivable dc * ScaleF(cy, bheight, 0) * ScaleF(cx, bwidth, 0) formula for every
rectangular transform on a constant input.

  This is the geometry glue between rounds 12/13 (per-LfGroup
LF dequant + smoothing) and rounds 91+/95 (HF coefficient ANS
decode + HF dequantisation). A future round wiring the §F.x
pipeline into decode_codestream can drop these helpers in as
the per-varblock loop body without re-deriving any LF→LLF
geometry or §I.2.5 prose mechanics. Total lib tests: 422 → 437
(+15); total integration test files: 41 → 42 (+1).

  Round 129 also intentionally does not chase the
noise-64x64-lossless sample-194 wp_pred8 = 717 vs spec
divergence: the trace doc retired 2026-05-06 still has no
replacement in docs/image/jpegxl/ per the project_jpegxl_pixel_blocked memory note (DOCS-GAP unchanged across r126 and
r129). The deep-trace plumbing from r126 remains the stable
baseline for the future Specifier round.
- **Round 126 (2021-FDIS) — Self-correcting WP deep-trace plumbing +
sample-194 hand-derivation against Listings E.1/E.2/E.3.** New
WP_DEEP_TRACE + WP_DEEP_TRACE_ARMED thread-locals in
modular_fdis capture the 20-entry intermediate snapshot
(subpred[0..4], err_sum[0..4], post-shift weight_shifted[0..4],
sum_weights_pre, log_weight, sh, sum_weights_post, nn8,
ww8, pred_pre_clamp, clamped_flag) for the trace-target
sample. The existing LEAF_PICK_TRACE_WP only exposes
(te_w, te_n, te_nw, te_ne, w8, n8, nw8, ne8, wp_pred8, max_error) — round 126 fills in the missing nn8/ww8 + Listing
E.1/E.2/E.3 internals so a by-hand FDIS re-derivation against
pinned ground-truth is possible.
New test
tests/r126_wp_intermediates_at_194.rs (~150 lines,
2 tests + a docstring with the full hand-derivation). Pins:
wp_pred8 = 717 at the noise-64x64-lossless sample 194
(y=3, x=2, channel 0); the 20-entry deep trace; the 3-plane
first-divergence scan vs expected.png. The hand-derivation
in the module docstring proves that NEITHER the subpred[3]
sign knob NOR the s_init - 1 knob (the two FDIS-vs-current
deviations round 32 swept independently) can produce a
prediction in [709..716] from the captured neighbour state.
The fix must come from somewhere else — most likely a
state-evolution bug in sub_err or a WpHeader parameter
mismatch. Round 126 also tried the FDIS-literal sub_err
formula (abs(((p_i + 3) >> 3) - true_value) per FDIS line
6832 vs the legacy (abs(p_i - tv*8) + 3) >> 3); the noise
fixture's wp_pred8 at sample 194 was unchanged, but the
synth_320 drift-bisect fixture regressed (first drift moved
from y=24,x=14 to y=11,x=104), so the change is reverted in
this round and parked for the docs-collaborator behavioural
trace promised in project_jpegxl_pixel_blocked.

  Net deliverable: deeper diagnostic plumbing + a stable pinned
baseline for the next round to compare hypotheses against.
Seven small lossless fixtures + synth_320 baselines untouched;
the noise fixture's plane[0] first-mismatch boundary remains
at linear index 194 (dec=35 vs exp=34).
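
The two sub_err candidates compared in this round are close but not equivalent. The sketch below transcribes both expressions as quoted in the entry; the function names are hypothetical, and p8 stands for a sub-prediction in the 8x fixed-point scale.

```rust
/// FDIS-literal form quoted above: round the prediction first, then
/// take the absolute difference to the true value.
fn sub_err_fdis_literal(p8: i32, true_value: i32) -> i32 {
    (((p8 + 3) >> 3) - true_value).abs()
}

/// Legacy form quoted above: take the absolute difference in the 8x
/// scale first, then round.
fn sub_err_legacy(p8: i32, true_value: i32) -> i32 {
    ((p8 - true_value * 8).abs() + 3) >> 3
}
```

For p8 = 4 and true_value = 1 the literal form gives 1 while the legacy form gives 0, so the choice can change the accumulated error state even when many inputs agree.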
- Round 121 (2021-FDIS / 2024-spec) — §I.2.5 LLF-from-LF
pure-math step (Listings I.15 + I.16). New src/llf_from_lf.rs
(~500 LOC + 28 unit tests + 16 integration tests in
tests/round121_llf_from_lf.rs) lands the bridge from §F.2's
dequantised+smoothed LF samples into the top-left LLF coefficient
block of each HF varblock — the step the trailing prose of
§F.2 hands off to §I.2.7 (renumbered §I.2.5 in the 2021 FDIS).

  Public API:
scale_i8(n, u), scale_d8(n, u), scale_i(n, u),
scale_d(n, u), scale_c(n_big, n_small, x),
scale_f(n_big, n_small, x) (FDIS Listing I.15 closed-form
helpers); dct_1d(input) -> Result<Vec<f32>> (FDIS §I.2.1
forward 1-D DCT, sizes 1..=32); dct_2d(samples, rows, cols) -> Result<Vec<f32>> (§I.2.2 Listing I.3 forward 2-D DCT, algorithmic
inverse of [idct::idct_2d]); llf_dims(t) -> (u32, u32)
(LF-block dims per TransformType); llf_from_lf(input, t) -> Result<Vec<f32>> (Listing I.16 verbatim, including the non-DCT
pass-through cases for Hornuss / DCT2×2 / DCT4×4 / DCT4×8 /
DCT8×4 / AFV0..3).

  44 new tests pin: (a) the Listing I.15 closed forms — I8(8, 0)
= sqrt(0.5)/2, D8 = 1/(N·I8), the N=8 branch of I/D, C(N, N, x)
= 1, C reciprocal-on-swap, ScaleF(1, 8, 0) = 1.0 (DCT8×8 corner
identity), (b) the §I.2.1 1-D forward DCT formula via the
unit-impulse closed form and the constant-signal DC-only result,
(c) byte-exact LLF blocks for DCT8×8 (single-cell identity),
DCT16×16 with both constant-block and impulse-block inputs
(out[y·2+x] = 0.25 · SF(2,16,y) · SF(2,16,x)),
DCT16×8 / DCT8×16 rectangular paths, DCT32×32 dimension
contract, and the non-DCT pass-through across all nine
single-8×8-block transforms.

  dct_2d ↔ idct::idct_2d round-trip verified at 4×4 to f32
epsilon, confirming the forward DCT is the precise algorithmic
inverse of the round-12 IDCT.
- Round 95 (2021-FDIS / 2024-spec) — §F.3 HF dequantisation
pure-math step. New src/hf_dequant.rs (~310 LOC + 13 unit
tests) implements the FDIS p. 72 Annex F.3 HF coefficient
dequantisation formula verbatim: Listing F.2 bias-adjust
(*= quant_bias[c] for |q| <= 1, -= quant_bias_numerator / quant otherwise), per-block HfMul multiplier, per-channel
0.8^(x_qm_scale - 2) / 0.8^(b_qm_scale - 2) factor (Y
channel exempt), and the §C.6.2 per-(channel, transform_type, coeff_index) dequant-matrix entry from the
round-89 dct_quant_weights::DequantMatrixSet.

  Public API:
bias_adjust(quant: i32, channel: usize, oim: &OpsinInverseMatrix) -> f32, QmScaleFactors::for_frame(&FrameHeader),
QmScaleFactors::for_channel(channel) -> f32,
dequant_hf_coefficient(quant, channel, hf_mul, dequant_matrix_entry, oim, qm) -> f32,
dequant_hf_pre_matrix(...) (partial product helper).

  10 new integration tests
(tests/round35_hf_dequant.rs) pin Listing F.2 branch
boundaries (zero, ±1, |q|>1 subtractive bias sign-preservation),
the FDIS default quant_bias_numerator = 0.145 fixed-point
quant=2 → 1.9275, the 0.8^(u(3) - 2) exponent sweep, and
the cross-module composition against
materialise_default_dequant_set() for X / Y channels at the
DCT8×8 corner cell. Y channel verified to skip the qm-scale
factor; X channel under default x_qm_scale = 3 verified to
pick up a 0.8 factor.

  Made
FrameHeader::default_with pub(crate) (was private) so
the new hf_dequant unit tests can construct a default
FrameHeader without going through bit-stream parsing.

  Round 95 lands the bit-exact F.3 arithmetic so the future
round that wires the per-block ANS coefficient decode (the
round-90 followup blocked on the shared 8-cluster ANS stream +
§C.7.2 histograms) can drop the integer ANS reader on top
without re-deriving any formulae. CfL (Annex G) and IDCT
(Annex I.2) still chain afterwards.
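
The Listing F.2 branch structure and the qm-scale factor described in this entry can be sketched as follows. The functions are hypothetical and take the bias values as plain parameters instead of reading them from an OpsinInverseMatrix, so they are not the module's bias_adjust signature.

```rust
/// Bias adjustment as described above: multiply by the per-channel bias
/// when |q| <= 1, otherwise subtract quant_bias_numerator / q.
fn bias_adjust_sketch(quant: i32, quant_bias: f32, quant_bias_numerator: f32) -> f32 {
    let q = quant as f32;
    if quant.abs() <= 1 {
        q * quant_bias
    } else {
        q - quant_bias_numerator / q
    }
}

/// Per-channel factor 0.8^(qm_scale - 2) applied to X and B (not to Y).
fn qm_scale_factor(qm_scale: u32) -> f32 {
    0.8f32.powi(qm_scale as i32 - 2)
}
```

With quant = 2 and a numerator of 0.145 the sketch gives 2 - 0.0725 = 1.9275, matching the pinned value, and dividing by q keeps the adjustment pointing towards zero for negative coefficients.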
- Round 90 (2021-FDIS / 2024-spec) — HfPass + PassGroup HF
structural parsers. Three new modules surface the §C.7.1 /
§C.7.2 HfPass bundle and the §C.8.3 PassGroup HF entry-points,
preparing the HF coefficient decode pipeline for the per-block
ANS-stream wiring scheduled for round 91+.

  New
src/coeff_order.rs (~430 LOC + 12 tests): §I.2.4 natural
coefficient ordering for every OrderId 0..=12 (Table I.1).
Builds the LLF prefix sorted by y × bwidth + x followed by
the HF tail sorted by (key1, key2) per Listing I.14. Public
API: OrderId, varblock_size_for_order, natural_coeff_order,
coefficient_count, order_id_for_transform,
COEFFICIENTS_PER_ORDER.

  New
src/hf_pass.rs (~290 LOC + 7 tests): §C.7.1 Listing C.12
parser. Reads used_orders = U32(Val(0x5F), Val(0x13), Val(0), Bits(13)). The used_orders == 0 fast path materialises all 13
natural orders directly per the listing's else branch.
used_orders != 0 returns Error::Unsupported — the permutation
reads need the shared 8-cluster ANS stream that §C.7.2 histograms
also feed; wiring that shared stream is round-91 work. Exposes
num_histogram_distributions = 495 × num_hf_presets × nb_block_ctx so the next round knows the §C.7.2 read count
up-front. Also exposes read_hf_pass_sequence for the per-pass
loop.

  New
src/pass_group_hf.rs (~460 LOC + 18 tests): §C.8.3 first
line + Listing C.13. Reads hfp = u(ceil(log2(num_hf_presets))),
validates hfp < num_hf_presets, computes
histogram_offset = 495 × nb_block_ctx × hfp. Verbatim
transcriptions of block_context, non_zeros_context,
coefficient_context, predicted_non_zeros, plus the two
64-element CoeffFreqContext / CoeffNumNonzeroContext ladder
tables as pub const arrays. The actual per-block ANS
coefficient decode loop is deferred to a later round (it requires the
shared per-pass ANS stream from §C.7.2).

  New integration suite
tests/round34_hf_pass_pass_group_hf.rs
(12 tests) exercises the typed surface end-to-end at the
structural level — HfPass used_orders == 0 parse + all 13
natural orders, §C.8.3 hfp range checks, BlockContext default-map
paths, NonZerosContext continuity at the
predicted == 8 boundary, CoefficientContext with the listed
ladder constants, PredictedNonZeros four-arm dispatch table.

  Test delta: +49 tests (332 → 381 lib tests; new integration
suite contributes 12 more). No fixture-level pixel decode
changes; the seven small lossless fixtures continue to decode
pixel-correct, and the two committed VarDCT fixtures still hit
their existing round-13 deferral gate (next round's HF dequant +
per-block decode flips that gate).

  Spec gap: none new. Listing C.12 / Listing C.13 / Listing I.14
/ Table I.1 are unambiguous on the round-90 contract scope.

  Followups (round 91+): (a) shared per-pass 8-cluster ANS stream
init, (b) used_orders != 0 DecodePermutation reads, (c)
§C.7.2 histogram read (495 × num_hf_presets × nb_block_ctx
clustered distributions), (d) per-block coefficient decode loop
per the C.8.3 prose right after Listing C.13, (e) §F.3 HF
dequantisation gluing the round-89 dequant matrices to the
newly decoded coefficients.
- Round 89 (2024-spec) —
GetDCTQuantWeights + Table I.6 default
dequantization-matrix materialisation (parent-dispatch r89). New
src/dct_quant_weights.rs (~1k LOC + tests) transcribes the
ISO/IEC 18181-1:2024 §I.2.4 / §I.2.5 + Table I.4 + Table I.6
listing block from pages 58-60 of the published core PDF:
  - mult(v) — spec Mult piecewise function
    (1+v if v > 0 else 1/(1-v)).
  - interpolate(pos, max, bands) — spec Interpolate with the
    2024 corrected A * pow(B/A, frac_index) form. Includes
    defensive clamping when pos == max (would otherwise index
    past bands.size() - 1).
  - compute_dct_weights(params, x_dim, y_dim) — spec
    GetDCTQuantWeights per the post-typo-fix 2024 listing
    (bands loop closes BEFORE the weights matrix double-loop,
    correcting the FDIS 2021 PDF's nested-loop bug).
  - materialise_weights_for_dct_select(bundle, channel, X, Y) —
    per-mode (DCT, DCT4, DCT2, Hornuss, DCT4x8, AFV)
    weights-matrix dispatch per §I.2.4 page 58 prose +
    Listing C.11 for AFV.
  - materialise_dequant_for_channel(bundle, channel, X, Y) —
    element-wise reciprocal of the weights matrix per
    §I.2.4 last paragraph. Validates the
    "no non-positive or infinity" spec invariant.
  - materialise_default_dequant_set() — the full 17-slot ×
    3-channel default set per Table I.6 (page 60),
    transcribed verbatim including the SeqA / SeqB /
    SeqC abbreviated sequences from the spec footnote and
    the dct4x4_params constant for slots 3 (DCT4×4) and 10
    (AFV).
  - weights_matrix_dims_for_slot(slot) — Table I.4 page 57
    dimensions lookup (0..=16).
  - slot_for_transform(t) — TransformType (Table C.16
    0..=26) → Table I.4 slot (0..=16) mapping; multiple
    transforms share a slot (e.g. DCT16×8 and DCT8×16 both
    map to slot 6).
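
The first two helpers in the list above are small enough to sketch. The code below is an illustration under assumptions, not the module's source: the scaled-position formula pos * (len - 1) / max is my reading of the Interpolate listing, and the index clamp is one possible way to realise the defensive handling of pos == max mentioned in the entry.

```rust
/// Piecewise function from the entry: 1 + v for v > 0, else 1 / (1 - v).
fn mult(v: f32) -> f32 {
    if v > 0.0 {
        1.0 + v
    } else {
        1.0 / (1.0 - v)
    }
}

/// Geometric interpolation between neighbouring bands:
/// A * pow(B / A, frac_index). Band values must be positive.
fn interpolate(pos: f32, max: f32, bands: &[f32]) -> f32 {
    assert!(!bands.is_empty(), "at least one band is required");
    if bands.len() == 1 {
        return bands[0];
    }
    let scaled_pos = pos * (bands.len() - 1) as f32 / max;
    // Clamp so that idx + 1 stays in range when pos == max.
    let idx = (scaled_pos.floor() as usize).min(bands.len() - 2);
    let frac = scaled_pos - idx as f32;
    let (a, b) = (bands[idx], bands[idx + 1]);
    a * (b / a).powf(frac)
}
```

With the clamp in place, pos == max lands on the last pair of bands with frac = 1, so the result is exactly the last band value and no out-of-range read occurs.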

  Test count: 26 new tests (15 unit tests in
src/dct_quant_weights.rs + 11 integration tests in
tests/round33_dct_quant_weights.rs). Every cell of every
channel of every default slot is verified positive-finite per
the §I.2.4 invariant. Spot-checks include:
  - DCT8×8 slot 0 channel 0 (0,0) cell = 1 / 3150.0 (reciprocal of
    Table I.6 row-0 head).
  - Hornuss slot 1 (0,0) cell = 1.0 (spec sets weights(0,0) = 1).
  - AFV slot 10 8×8 fully populated (Listing C.11 covers all 64
    cells across the freqs interpolation + weights4x8 + weights4x4
    fills).

  Spec-listing typo notes (recorded in module doc-comment):
  - FDIS 2021 PDF Listing C.10 has the for (y, x) { ... }
    weights double-loop INSIDE the for (i = 1; i < len; i++)
    bands loop — would compute the matrix len - 1 times. The
    2024 published edition (docs/image/jpegxl/ISO_IEC_18181-1-JPEG-XL-Core-2024.pdf page 58) corrects this.
    Module follows the 2024 form.
  - 2024 Interpolate drops len (uses bands.size()) and
    writes pow(B / A, frac_index) instead of FDIS 2021's
    A * (B / A)^frac_index. Mathematically identical.

  SPECGAP recorded: DCT2 cell (0, 0) is not assigned by the spec
listing block (page 58). Implementation fills it with
params(c, 0) (same value used for i == 0 neighbours) so the
dequant reciprocal is finite. The 6-rectangle assignments cover
62 of 64 cells, plus (1, 1); (0, 0) is the only unmentioned
position. Recommend a spec clarification.

  Unblocks: downstream HF coefficient dequantisation per §F.3 on
the HfGlobal u(1) == 1 default-encoding fast path. The
non-default branch's RAW encoding mode still requires a
modular sub-bitstream decode (deferred to round 90+ alongside
the §F.3 wiring).

  Spec citations: ISO/IEC 18181-1:2024 page 58 (Listing for
Interpolate / Mult / GetDCTQuantWeights), page 59
(Listing C.11 AFV weights + per-mode prose), page 60
(Table I.6 default matrix parameters), page 57 (Table I.4
weights-matrix dimensions). Cross-referenced against ISO/IEC
FDIS 18181-1:2021 PDF (extractable) Listing C.10 / Table C.18
/ Table C.20 (the 2021 equivalents).

  Fixture count remains 7 pixel-correct lossless small fixtures
(no change — round 89 is upstream of the pixel-decode flow;
HfGlobal default-encoding parsing remains unchanged in
behaviour).
- Round 77 (2024-spec) — animation-3frame SPECDIFF audit + docs
citation. Two new audit-grade integration tests
(tests/r77_animation_3frame_specdiff.rs) characterise the
docs/image/jpegxl/fixtures/animation-3frame/input.jxl fixture
(cjxl 0.12.0, 78 B, 3 RGB Regular Modular frames of 32×32 with
have_animation = 1). The probe-level path is correct
(probe_fdis recovers SizeHeader + ImageMetadata with
have_animation = true + AnimationHeader populated); the
decode-level path remains blocked on a real spec-edition split
between ISO/IEC 18181-1:2021 FDIS Table C.9 (which our
RestorationFilter::read follows; no leading all_default
field) and the published 2024 Table J.1 (which prepends an
all_default Bool() to the bundle plus a u(32) "(ignored)"
field after epf_channel_scale). Bit-trace bisect (recorded in
the test file's module docs):
  - The two-bit RF SPECDIFF lifts our FrameHeader bit count from
    39 to 40 for the animation fixture, which lets permuted_toc + pu0
    correctly land the TOC entry U32 at byte 11 of the codestream; that read yields entry value = 16, matching the libjxl trace's total_bytes = 16.
  - The seven currently-pixel-correct lossless fixtures were
    encoded by cjxl 0.11.1 against the 2021 FDIS layout and do
    NOT include the leading all_default bit; landing the
    2024-Table-J.1 fix straightforwardly breaks
    alpha-64x64.jxl. The audit recommendation (recorded in the
    test docs) is to re-encode the seven fixtures with cjxl
    0.12.0+ before applying the 2024-spec fix uniformly. This is
    a docs-collaborator follow-up — there is no codestream-level
    edition tag, so a single-pass parser cannot dispatch between
    the two RF layouts without a heuristic.
  - Spec citations: ISO/IEC 18181-1:2024 Table J.1
    (docs/image/jpegxl/ISO_IEC_18181-1-JPEG-XL-Core-2024.pdf
    page 70) and ISO/IEC FDIS 18181-1:2021 Table C.9
    (pdftotext-extractable lines 4088-4101). Trace fixture at
    docs/image/jpegxl/fixtures/animation-3frame/trace.txt.

  Fixture count remains 7 pixel-correct lossless small fixtures
(no change). Test count grows by 2 (audit harness).
Changed
- Round 32 (2024-spec) —
noise-64x64-lossless pixel-divergence
bisected to the Self-correcting weighted predictor at the first
y >= 2, x >= 2 sample whose predictor == 6; root cause
localised, fix deferred pending a libjxl-trace doc that this
workspace does not yet ship. The fixture count therefore stays
at 7 pixel-correct lossless fixtures (status quo). No source-file
semantic changes this round; the diagnostic harness used to
bisect was removed before commit and the regression set remains
green.

  Round 31 left the noise fixture as a "decodes without EOF, but
pixels diverge from expected.png starting at plane[0] sample
194" follow-up. Round 32 reproduced that divergence and pinned
it down further:
  - The first divergence is at plane[0] (y=3, x=2) — the FIRST
    sample whose predictor is 6 (Self-correcting) and which has
    the full set of WP neighbours N, W, NW, NE, NN, WW populated
    (i.e. y >= 2 && x >= 2). The prior predictor == 6 samples
    in rows y = 0 and y = 1 all decoded pixel-correct because
    their WP path takes the NN does not exist → NN = N
    fall-back. Two predictor == 6 samples on row y = 2 also
    decoded correctly because WW = W was used (the bug requires
    WW ≠ W, i.e. x >= 2).
  - At sample 194 the WP machinery produces wp_pred8 = 717
    (Listing E.3 weighted sum). The spec rounding (wp_pred8 + 3) >> 3
    then yields p = 90, giving v = diff + p = -55 + 90
    = 35 — but expected.png says 34. So wp_pred8 is 1 too high
    modulo the rounding (any value in [709..716] would give p = 89
    and thence v = 34). The MA-tree leaf, the decoded token, the
    diff -55, and wp_max_error all match what the
    neighbour state legitimately implies — the discrepancy is
    purely in the WP weighted sum.
  - Bisected against
    WP_ROUND_BIAS ∈ {0..=7}, s_init ∈ {(sum_weights >> 1) - 1, (sum_weights >> 1), sum_weights, 0},
    the subpred[3] sign (FDIS N + … vs. round-3 code N - …),
    and the clamp condition (<= 0 vs >= 0). Every alternative
    either re-introduces an EARLIER divergence (samples 68, 79,
    142) on the noise fixture, OR breaks one of the seven
    currently-pixel-correct lossless fixtures. So the bug is NOT
    in any of the dimensions our spec text exposes a knob for.
  - Suspected residual root cause: a subtle interaction between
    the FDIS error2weight formula's outer >> shift step (only
    in the 2024 published edition and the round-3 code; absent
    from FDIS 2021 literal text), the four sub-predictor weights,
    and the final s × ((1 << 24) Idiv sum_weights) >> 24
    division. Most likely the libjxl reference uses an s_init
    formula that depends on the shifted vs unshifted
    sum_weights in a way the FDIS spec text does not disclose.
    Resolving this needs either (a) a behavioural trace of the
    libjxl WP path on the noise fixture at sample 194 captured by
    the docs collaborator, or (b) the docs collaborator's
    promised docs/image/jpegxl/libjxl-trace-reverse-engineering.md
    section on §H.5.2 Sub-predictions (referenced in the
    project_jpegxl_pixel_blocked memory note, but the file does
    not yet exist in docs/image/jpegxl/).

  Round-32 scope therefore closes with the bisect finding above
recorded and the regression set green. No .gitignore / Cargo
changes; no API surface deltas. The §F.3 zero-pad fix from
round 31 stays in place and noise-64x64-lossless continues to
decode-complete (just with non-byte-exact pixels).

  Spec citations: FDIS Annex E.1 (Sub-predictions, Listing E.1),
E.2 (Prediction weights, Listing E.2), E.3 (Prediction, Listing
E.3), and Table H.3 row predictor == 6 ((prediction + 3) >> 3).
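
The sample-194 arithmetic from this entry can be replayed in a few lines. The function names are hypothetical; the only formula used is the (wp_pred8 + 3) >> 3 rounding cited above, plus the inverse range it implies for non-negative predictions.

```rust
/// Round an 8x fixed-point WP prediction to an integer prediction.
fn wp_round(wp_pred8: i32) -> i32 {
    (wp_pred8 + 3) >> 3
}

/// Reconstructed sample value: decoded residual plus rounded prediction.
fn reconstruct(diff: i32, wp_pred8: i32) -> i32 {
    diff + wp_round(wp_pred8)
}

/// Inclusive range of wp_pred8 values that round to the prediction `p`.
fn preimage(p: i32) -> (i32, i32) {
    (8 * p - 3, 8 * p + 4)
}
```

With wp_pred8 = 717 and diff = -55 this gives 35, while any wp_pred8 in the range returned by preimage(89), that is 709 to 716, gives the expected 34.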
Added
- Round 31 (2024-spec) — §F.3 zero-pad uniformly applied to the
single-TOC-entry LfGlobal fast path; noise-64x64-lossless now
decodes without EOF (parent-dispatch "r16" option A). One
narrow src/lib.rs::decode_codestream delta:
  - Pre-round-31, when
    num_groups == 1 && passes == 1 && toc.entries.len() == 1, the decoder routed LfGlobal::read
    through the non-padding main BitReader (pad_eof_with_zeros == false). The other LfGlobal path already used
    BitReader::new_section (which implements FDIS §F.3's
    section-bit-budget + zero-pad rule). For six of the seven small
    lossless fixtures the entire LfGlobal section had enough
    trailing slack that the read never touched the padding region;
    noise-64x64-lossless (cjxl -d 0 -e 7, 64×64 high-entropy RGB
    Modular, MA tree nodes=167 leaves=84) does NOT — its
    per-pixel ANS / hybrid-uint refill loop on the final samples
    reaches a few bits past the byte budget that the spec says must
    read as zero. Pre-round-31 the non-padding reader errored
    instead → InvalidData("unexpected end of JXL bitstream").
  - The fix collapses both LfGlobal-read branches into one path
    that always uses BitReader::new_section against the
    toc-declared section byte range. This makes the single-section
    fast path bit-for-bit equivalent to the multi-section path on
    its real-data prefix, and applies §F.3 zero-pad uniformly.
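
The zero-pad behaviour can be illustrated with a toy section reader. This is a hypothetical sketch, not the crate's BitReader::new_section: it assumes bits are consumed least-significant-bit first within each byte and that the slice passed in is exactly the TOC-declared byte range of the section.

```rust
/// Minimal bit reader confined to one section. Reads past the end of the
/// section return zero bits instead of failing.
struct SectionReader<'a> {
    data: &'a [u8],
    bit_pos: usize,
}

impl<'a> SectionReader<'a> {
    fn new(section: &'a [u8]) -> Self {
        Self { data: section, bit_pos: 0 }
    }

    /// Read `n` bits (n <= 32), least-significant bit first.
    fn read_bits(&mut self, n: u32) -> u32 {
        assert!(n <= 32);
        let mut value = 0u32;
        for i in 0..n {
            let byte = self.bit_pos / 8;
            let bit = if byte < self.data.len() {
                (self.data[byte] >> (self.bit_pos % 8)) & 1
            } else {
                0 // past the section budget: zero-pad
            };
            value |= (bit as u32) << i;
            self.bit_pos += 1;
        }
        value
    }
}
```

A reader of this shape never yields an end-of-stream error inside a section, which is the property the noise fixture needed for its final refill.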

  Spec citation: FDIS §F.3 first paragraph — "When decoding a
section, no more bits are read from the codestream than 8 times
the byte size indicated in the TOC; if fewer bits are read, then
the remaining bits of the section all have the value zero."

  Test added:
tests/r31_noise_lossless.rs with two cases —
noise_64x64_lossless_decodes_without_eof_error (locks the
shape of the post-fix VideoFrame: 3 RGB planes, stride=64,
data.len()=4096 each) and pre_round31_seven_lossless_fixtures_still_decode (regression sentinel: the seven pre-round-31
fixtures all decode successfully under the unified path).
Committed fixture pair under tests/fixtures/:
noise_64x64_lossless.jxl (13 505 B) +
noise_64x64_lossless_expected.png (12 505 B, 8-bit RGB PNG).

  Known limitation NOT fixed this round: while
noise-64x64-lossless now decode-completes (vs hard-EOF), the
produced pixels are not yet byte-identical to expected.png.
The first divergence is plane[0] (R) at (2, 3) — i.e. samples
0..193 of plane 0 match, and from sample 194 on ~98 % of samples
diverge. The divergence point is deterministic and well within
the section's real-byte budget, so the §F.3 fix is independent
of the residual pixel-divergence. Suspected root cause: a
latent state-evolution bug in either the MA-tree leaf decode
with num_contexts > 16 (the leaf-stream EntropyStream's
cluster_map is 84 → 3 clusters here, vs ≤ 6 → ≤ 4 in every
other lossless fixture), the Self-correcting WP state on
high-entropy neighbour history, or the hybrid-uint extra-bits
path for large n_extra values. Deferred to round 32 — needs
the round-24-style per-cluster trace replayed against the
cleanroom Python reference at ~30 distinct bit positions across
the 108 kbit symbol stream.

  Docs gap noted:
docs/image/jpegxl-cleanroom/reference-impl/
(referenced in the round-31 brief as the place to bisect
against) does not yet exist; the round-30 deferral note pointed
at it as a future bisect target. The §F.3 fix landed without
needing it — pure spec-text bisect against FDIS §F.3 was
sufficient. The reference-impl directory would still be useful
for the residual pixel-divergence bisect; ask the docs
collaborator to populate it for round 32.
- Round 30 (2024-spec) — bit-depth-16 RGB pixel-correct decode +
16-bit LE plane-pack convention (parent-dispatch "r15" option A).
Lifts the fixture count from 6 to 7 by adding bit-depth-16
(docs/image/jpegxl/fixtures/bit-depth-16/input.jxl, 421 B,
64×64 RGB lossless Modular at bits_per_sample = 16) and
documents the wider-than-8-bit pack convention forced on us by
oxideav-core 0.1.x's bit-depth-less VideoPlane. Two narrow
src/lib.rs::decode_codestream deltas:
  - Bit-depth gate widened. The pre-round-30 hard reject
    metadata.bit_depth.bits_per_sample != 8 now accepts
    bps ∈ 1..=16. The XYB and YCbCr branches (FDIS Annex L.2.2 /
    L.3) still hard-require bps == 8 because their dequantisation
    lattice is calibrated against the 8-bit output range — a
    specific Error::Unsupported("jxl decoder (round 30): XYB high-bit-depth (bps={...}) deferred") now precedes the
    transform call. Float (float_sample == true) and bps > 16
    remain unsupported.
  - Pass-through plane pack dispatches on bps. The previous
    loop unconditionally clamped each i32 sample to [0, 255]
    and pushed one byte per sample with stride == width. The
    new loop:
    - bps ≤ 8 — unchanged: 1 byte/sample, stride == width,
      sample clamped to [0, 2^bps - 1].
    - 9 ≤ bps ≤ 16 — 2 bytes/sample little-endian,
      stride == width × 2, sample clamped to [0, 2^bps - 1],
      packed via u16::to_le_bytes.

    The LE-pack choice is documented in
    crates/oxideav-jpegxl/README.md under "Plane byte layout"
    (new section) so that downstream consumers (cli-convert /
    etc.) know how to reinterpret a wide plane as &[u16]. PNG's
    RFC 2083 §2.1 ships big-endian 16-bit samples; we deliberately
    pick LE so a bytemuck::cast_slice::<u8, u16> on a
    little-endian host is a zero-cost view (vs forcing a per-sample
    swap).
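
A minimal sketch of the pack dispatch described above, using only the standard library. The function name and the (bytes, stride) return shape are hypothetical and do not reflect the decoder's real plane types.

```rust
/// Pack one row-major plane of i32 samples into bytes.
/// Returns the byte buffer and the stride in bytes, or None when the
/// bit depth is outside 1..=16.
fn pack_plane(samples: &[i32], width: usize, bps: u32) -> Option<(Vec<u8>, usize)> {
    if bps == 0 || bps > 16 {
        return None;
    }
    let max = (1i32 << bps) - 1;
    if bps <= 8 {
        // One byte per sample, stride == width.
        let data: Vec<u8> = samples.iter().map(|&s| s.clamp(0, max) as u8).collect();
        Some((data, width))
    } else {
        // Two bytes per sample, little-endian, stride == width * 2.
        let mut data = Vec::with_capacity(samples.len() * 2);
        for &s in samples {
            data.extend_from_slice(&(s.clamp(0, max) as u16).to_le_bytes());
        }
        Some((data, width * 2))
    }
}
```

For a 16-bit plane, the sample 0x1234 is stored as the byte pair 34 12, so a little-endian host can view the buffer as u16 values without swapping.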

  Test count:
tests/round30_bit_depth_16.rs adds 3 tests
(bit_depth_16_rgb_pixel_correct_vs_expected_png — full 64×64×3
16-bit byte-for-byte match against the committed
bit_depth_16_expected.png;
bit_depth_16_le_pack_convention_self_consistent — invariant
check on stride/length/round-trip;
pre_round30_8bit_fixtures_still_byte_packed — regression
sentinel for the four pre-existing 8-bit byte-packed fixtures).
Committed fixture pair under tests/fixtures/:
bit_depth_16.jxl (421 B) + bit_depth_16_expected.png
(375 B, 16-bit RGB PNG).

  Cross-checked against
djxl v0.11.1 as a black-box oracle (PPM
output → byteswap BE→LE → byte-equal to our planes). Crate now
decodes 7 small lossless Modular fixtures pixel-correct vs
expected.png (was 6): pixel-1x1, gray-64x64,
gradient-64x64-lossless, palette-32x32, grey_8x8_lossless,
alpha-64x64, bit-depth-16.

  Spec citations: FDIS Annex A.6 + Table A.22
(bit_depth.bits_per_sample bundle), Annex G.1.3 (Modular
channel-order rule — colour channels share the global
bits_per_sample, no per-channel bit-depth split for kModular
RGB), PNG RFC 2083 §2.1 (PNG ships 16-bit big-endian, so our
reference-PNG read uses u16::from_be_bytes).

  Docs gaps identified probing adjacent fixtures during round 30:
noise-64x64-lossless (13.5 KB, nodes=167 leaves=84 per
trace.txt) still fails inside LfGlobal::read with "unexpected
end of JXL bitstream" — large MA-tree decode path likely
mis-computes a hybrid-uint extra-bits count for a high-context
leaf; deferred to round 31. vardct-256x256-d1/d3 and
noise-feature-256x256 fixtures all hit independent VarDCT
pipeline gaps and are unrelated to round 30.
- Round 29 (2024-spec) — alpha-64x64 RGBA pixel-correct decode +
ISOBMFF signature-strip fix (parent-dispatch "r14" option A).
Two narrow lib-level fixes in src/lib.rs::decode_one_frame /
decode_codestream unblock the docs cleanroom alpha-64x64
4-channel Modular lossless fixture (docs/image/jpegxl/fixtures/alpha-64x64/input.jxl, 86 B) for pixel-exact decode against the
committed expected.png (8-bit RGBA, 64×64):
  - ISOBMFF FF 0A strip. The jxlc/jxlp box payload IS a JXL
    codestream and therefore begins with the 2-byte FF 0A
    codestream signature (FDIS Annex B.1). The RawCodestream branch
    already stripped those 2 bytes before handing off to
    decode_codestream; the ISOBMFF branch did NOT. The result was
    a 16-bit misalignment at the SizeHeader::read parse that
    cascaded into apparently-unrelated downstream failures
    (bit-depth-16 tripped JXL permutation: LZ77-enabled TOC sub-stream not supported because the TOC permuted flag bit
    parsed as 1 instead of 0). Now the ISOBMFF branch validates the
    FF 0A prefix and strips it symmetrically with the raw path. A new
    unit test wraps gradient-64x64-lossless in a minimal ISOBMFF
    (signature + ftyp + jxlc) and asserts plane-by-plane equality
    vs. the raw decode (tests/round29_alpha_rgba_pixel.rs::isobmff_wraps_raw_codestream_decodes_identically).
  - Extra-channel mapping. The post-Modular channel-count check
    n_chans != expected_chans rejected RGBA Modular frames
    because the Modular decoder lays out colour and extra channels
    in a flat array of length expected_chans + num_extra_channels
    (FDIS Annex G.1.3 colour-then-extras channel-order rule). The
    check now also accepts the with-extras length and emits a
    trailing VideoFrame plane per extra channel. For
    alpha-64x64 this maps directly to 4 RGBA planes; for
    hypothetical multi-extra fixtures (depth, spot colour, …) the
    same path extends N-ways. The XYB-encoded / YCbCr branches are
    unchanged — those still require exactly 3 colour channels and
    fall through if extras are present (round-30+ work).

Test count: tests/round29_alpha_rgba_pixel.rs adds 3 tests
(alpha_64x64_rgba_pixel_correct_vs_expected_png: full 64×64×4
byte-for-byte match; five_pre_round29_fixtures_still_pass:
regression sentinel for pixel-1x1 / gray-64x64 / gradient-64x64 /
palette-32x32 / grey_8x8_lossless; isobmff_wraps_raw_codestream_decodes_identically: synthetic ISOBMFF wrap of
gradient-64x64). Committed fixture pair under tests/fixtures/:
alpha_64x64.jxl (86 B) + alpha_64x64_expected.png (283 B).

Crate now decodes 6 small lossless Modular fixtures pixel-correct
vs expected.png (was 5): pixel-1x1, gray-64x64,
gradient-64x64-lossless, palette-32x32, grey_8x8_lossless,
alpha-64x64.

Spec citations: FDIS Annex B.1 (codestream signature),
Annex G.1.3 (channel order), Annex A.6 + A.9 + Table A.22
(ImageMetadata + ExtraChannelInfo).

Docs gaps identified probing adjacent fixtures: bit-depth-16
(421 B) reaches the 8-bit-only post-Modular check (decoder needs
a 16-bit output-pack path before VideoFrame mapping — deferred);
noise-64x64-lossless (13.5 KB) fails inside LfGlobal with
"unexpected end of JXL bitstream", suggesting the high-entropy
random-RGB MA tree exercises a code path not yet covered
(deferred).
- Round 28 (2024-spec): non-DCT IDCT helpers (parent-dispatch
"r13" item 3). Extends src/idct.rs with six new public helpers
(the aux_idct_2x2 butterfly primitive plus five per-transform IDCTs)
that complete the IDCT surface for the non-DCT TransformType
variants:
  - aux_idct_2x2(block, S): Annex I.9.3 Hadamard-style butterfly on
    the top-left S × S cells of an 8×8 buffer (S ∈ {1, 2, 4, 8}).
  - idct_dct2x2(coefficients): Annex I.9.3 closing recipe (chained
    aux_idct_2x2 calls at S=2, 4, 8).
  - idct_dct4x4(coefficients): Annex I.9.4, per-2×2-quadrant 4×4
    IDCT_2D over interleaved coefficient cells with a DC patch from
    aux_idct_2x2(coefficients, 2).
  - idct_hornuss(coefficients): Annex I.9.5, per-quadrant
    block-LF + residual-sum centre cell + neighbour-fill + corner
    corrective.
  - idct_dct8x4(coefficients): Annex I.9.6, column-major Hadamard
    pair into two 4×8 (rows × cols) IDCT_2D halves tiled into rows
    [0..4) and [4..8) of the 8×8 output.
  - idct_dct4x8(coefficients): Annex I.9.7, dual of dct8x4,
    row-major Hadamard pair into two 4×8 halves tiled by row.

idct_for_transform(t, coefficients) now dispatches Hornuss,
Dct2x2, Dct4x4, Dct8x4, Dct4x8 to the dedicated helpers in
addition to the 18 plain-DCT variants from r12. Afv0..Afv3 continue
to return Err(Unsupported) pending an independently verified
256-entry AFVBasis table (deferred to a later round to avoid a
high-risk OCR transcription).

New helper non_dct_pixel_dims(t) returns Some((8, 8)) for the
nine non-DCT TransformType variants and None for plain-DCT; the
output of all five per-transform helpers is always an 8×8 row-major buffer
(length 64), matching the closing entries of Listings I.9.3..I.9.8.

Test count: lib idct::tests 36 → 57 (+21 new: 8 covering
aux_idct_2x2 validation/butterfly/preserve/DC, 6 covering DC-only
per-quadrant correctness for the five helpers, 5 covering length
validation, 2 covering non_dct_pixel_dims); integration tests
+5 in new tests/round13_non_dct_idct.rs plus 1 updated
assertion in tests/round12_idct_dispatch.rs (renamed
idct_for_transform_non_dct_transforms_return_unsupported →
idct_for_transform_afv_only_unsupported_after_round_13,
reflecting that only the AFV variants remain unsupported).
Spec-gap notes inline in the module documentation enumerate the OCR
transcription work deferred for AFVBasis.
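
For readers unfamiliar with the term, a "Hadamard-style butterfly" on a 2×2 group of cells looks roughly like the sketch below. This is the generic textbook form on four values, offered as an assumption-laden illustration: the function name `hadamard_2x2` is hypothetical, and the normative cell layout and arithmetic of aux_idct_2x2 are those of the spec listing, not this block.

```rust
/// Generic 2×2 Hadamard butterfly on four values (textbook form).
/// Output order: sum, horizontal difference, vertical difference,
/// diagonal difference. Not a transcription of the spec listing.
fn hadamard_2x2(c00: f32, c01: f32, c10: f32, c11: f32) -> [f32; 4] {
    [
        c00 + c01 + c10 + c11,
        c00 + c01 - c10 - c11,
        c00 - c01 + c10 - c11,
        c00 - c01 - c10 + c11,
    ]
}
```

A DC-only input (one nonzero value in the first slot) spreads that value evenly over all four outputs, and applying the butterfly twice returns the input scaled by 4. Both properties are convenient for DC-only and round-trip style tests.
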
- Round 27 (2024-spec): IDCT dispatch (parent-dispatch "r12" item
5). New src/idct.rs (~470 LOC including tests) wires the
spec-conformant 1-D inverse DCT (FDIS Annex I.2.1) for power-of-two
sizes s ∈ {1, 2, 4, 8, 16, 32, 64, 128, 256} and the 2-D inverse
DCT (Annex I.2.2 Listing I.4) handling rectangular R × C blocks.

Three public entry points: idct_1d(input) for the bare 1-D form,
idct_2d(coefficients, output_rows, output_cols) for the 2-D form
taking coefficients in the spec's (short × long) row-major
natural-ordering layout (Annex I.2.4) and returning samples in (R × C)
row-major, and idct_for_transform(t, coefficients) which dispatches
on a dct_select::TransformType to the appropriate 2-D IDCT for the
18 plain-DCT transform types in Table C.16 (DCT8x8, DCT16x16,
DCT32x32, DCT16x8, DCT8x16, DCT32x8, DCT8x32, DCT32x16, DCT16x32,
DCT64x64, DCT64x32, DCT32x64, DCT128x128, DCT128x64, DCT64x128,
DCT256x256, DCT256x128, DCT128x256). The 9 non-DCT transforms
(Hornuss, DCT2x2, DCT4x4, DCT4x8, DCT8x4, AFV0..AFV3; Listings
I.7..I.13) return Err(Unsupported) and are deferred to round 13+.

Companion helper dct_pixel_dims(t) returns the (rows, cols)
output shape for plain-DCT TransformType variants and None for the
non-DCT transforms.

31 lib unit tests in
idct::tests (1-D length validation, DC-only
consistency for all 9 supported sizes, 1-D round-trip via private
forward DCT oracle for sizes 8/16/32/64, 1-D AC[1] hand-computed
spec-formula reference, 2-D length / shape validation, 2-D DC-only
consistency for 12 DCT block sizes, 2-D round-trip via 2-D forward
oracle for 8x8/16x8/8x16/16x16/32x32, dispatch validation for
DCT8x8/16x16/32x32/8x16/16x8 + every non-DCT TransformType returning
Unsupported, dct_pixel_dims completeness for both branches); 5
integration tests in tests/round12_idct_dispatch.rs (1-D DC-only
for all sizes, 2-D DC-only for every plain-DCT block size,
Unsupported sentinel for every non-DCT transform, 2-D round-trip for
asymmetric 8x16 and 16x8 via inline forward oracle, five-fixture
Modular regression sentinel). Total test count 345 → 381 (+36 net).

No new fixture coverage: the IDCT lands as a callable primitive that
round 13's PassGroup HF coefficient decode + F.3 dequantisation will
feed. The legacy vardct::idct1d_8 and vardct::idct2d_8x8 (round 8
scaffold, scaled-orthonormal IDCT) are kept untouched for backward
compatibility but are NOT spec-conformant; new HF-decode wiring will
call through idct::idct_for_transform exclusively.
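
The "round-trip via forward DCT oracle" test style mentioned above can be sketched as below. Note the assumptions: this uses the textbook orthonormal DCT-II / DCT-III pair, which is NOT the spec's scaling (the entry itself distinguishes the two), and the names `basis`, `forward_dct` and `inverse_dct` are hypothetical. It shows only the oracle technique, on naive O(n²) loops.

```rust
use std::f64::consts::PI;

/// Orthonormal DCT basis value for sample index `i`, frequency `k`,
/// block length `n` (textbook scaling, not the spec's).
fn basis(i: usize, k: usize, n: usize) -> f64 {
    let scale = if k == 0 { 1.0 / n as f64 } else { 2.0 / n as f64 };
    scale.sqrt() * (PI * (2 * i + 1) as f64 * k as f64 / (2 * n) as f64).cos()
}

/// Forward DCT-II used purely as a test oracle.
fn forward_dct(x: &[f64]) -> Vec<f64> {
    let n = x.len();
    (0..n)
        .map(|k| (0..n).map(|i| x[i] * basis(i, k, n)).sum::<f64>())
        .collect()
}

/// Inverse (DCT-III) under the same orthonormal scaling.
fn inverse_dct(c: &[f64]) -> Vec<f64> {
    let n = c.len();
    (0..n)
        .map(|i| (0..n).map(|k| c[k] * basis(i, k, n)).sum::<f64>())
        .collect()
}
```

A round-trip test feeds arbitrary samples through `forward_dct` then `inverse_dct` and checks equality within a small tolerance; a DC-only test checks that a lone DC coefficient produces a flat output.
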
- Round 26 (2024-spec): Annex L colour transforms (parent-dispatch
"r11"). New src/xyb.rs (~210 LOC) transcribes FDIS §L.2.2 inverse
XYB → linear RGB and §L.3 inverse YCbCr → RGB verbatim from the
ISO/IEC 18181-1:2024 spec text. Three public entry points:
inverse_xyb_to_rgb(x, y, b, oim, tone_mapping),
inverse_ycbcr_to_rgb(cb, y, cr), and the convenience composite
modular_xyb_to_linear_rgb(y_prime, x_prime, b_prime, lf_dequant, oim, tone_mapping), which folds in the §L.2.2 preamble step
(X = X' * m_x_lf_unscaled, Y = Y' * m_y_lf_unscaled,
B = (B' + Y') * m_b_lf_unscaled). Helper linear_rgb_to_u8
clamps + rounds the linear [0, 1] output to 8 bits.

Wired into the decode_codestream modular output stage: when
metadata.xyb_encoded == true AND colour_encoding.colour_space == Rgb (3 colour channels), the per-channel pass-through is replaced
with build_rgb_planes_from_xyb, which walks every pixel through
the inverse transform. A symmetric build_rgb_planes_from_ycbcr
branch handles frame_header.do_ycbcr == true. The original
pass-through path is preserved for the common case
(xyb_encoded=false AND do_ycbcr=false) so all five small lossless
fixtures continue to pixel-correct decode.

9 unit tests in xyb::tests (DC zero-input, spec-listing
hand-computed reproduction, intensity_target linear scaling,
modular preamble multiplier check, YCbCr neutral / red-dominant,
linear→u8 clamping, X-sign-flip symmetry); 6 integration tests
in tests/round11_xyb_inverse.rs (forward→inverse round-trip
for neutral grey AND saturated red using a hand-computed
Cramer's-rule matrix inversion of oim.inv_mat, YCbCr neutral, u8
quantisation reference values, end-to-end zero-input modular wrapper,
and five-fixture pass-through regression sentinel). Total test count
345 → 362 (+17 net: 9 lib + 6 integration + 2 from earlier round-21
recount).

No fixture decoded that didn't decode before: round 11 lays the
colour-transform foundation, but no modular-XYB or modular-YCbCr
fixture is currently committed (cjxl encodes photo-content XYB
inputs as VarDCT by default; the rare modular-XYB path needs a
hand-built minimal trace, deferred to round 12+ or a
docs-collaborator commission). The two committed VarDCT fixtures
(vardct_256x256_d1.jxl, vardct_256x256_d3.jxl) still terminate
at the round-13 "round 14+: HF subband decode + IDCT not yet wired"
Unsupported.

SPECGAP documented in the xyb::linear_rgb_to_u8 doc comment: §L.2.2
outputs linear-domain RGB (NOTE in spec) but the spec doesn't
prescribe a gamma encoding step before display; strict conformance
defers gamma application to a downstream colour-management consumer.
The crate emits linear bytes (clamp + scale by 255 + round); spec
callers needing sRGB-encoded bytes should apply sRGB transfer
themselves.

Wall respected: spec PDF (Annex L pages 82-84 read directly), no
external library source consulted, no libjxl-trace-reverse-engineering.md (retired). OpsinInverseMatrix defaults already
transcribed in metadata_fdis::OpsinInverseMatrix::default()
(round-2) from FDIS Table L.1 independently; the new module
consumes those constants without re-reading the table. Test count
362, fmt + clippy clean against 1.95 toolchain.
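
A minimal sketch of the two small steps this entry spells out in words: the preamble multipliers and the clamp + scale by 255 + round quantiser. The signatures are assumptions for illustration (the real modular_xyb_to_linear_rgb takes lf_dequant, oim and tone_mapping; `modular_preamble` is a hypothetical name), and no sRGB transfer is applied, matching the SPECGAP note.

```rust
/// Preamble step as described in the entry:
/// X = X' * m_x, Y = Y' * m_y, B = (B' + Y') * m_b.
/// The three multipliers are passed in directly here (assumption).
fn modular_preamble(
    y_prime: f32,
    x_prime: f32,
    b_prime: f32,
    m_x_lf_unscaled: f32,
    m_y_lf_unscaled: f32,
    m_b_lf_unscaled: f32,
) -> (f32, f32, f32) {
    let x = x_prime * m_x_lf_unscaled;
    let y = y_prime * m_y_lf_unscaled;
    let b = (b_prime + y_prime) * m_b_lf_unscaled;
    (x, y, b)
}

/// Clamp to [0, 1], scale by 255, round to nearest. Output stays in
/// the linear domain; callers wanting sRGB bytes apply the transfer
/// function themselves.
fn linear_rgb_to_u8(v: f32) -> u8 {
    (v.clamp(0.0, 1.0) * 255.0).round() as u8
}
```
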
- Round 24 (2024-spec, Auditor mode): pursued round-23 candidates
(1) per-cluster ANS distribution byte-trace for clusters 0+1 and
(2) per-call alias-mapping invariant audit. Result: both paths
falsified. Cluster 0 (19 nonzero entries) and cluster 1 (23
nonzero entries) both sum to 4096; the alias table built from each
D[] routes probability mass to symbols identically to the declared
D[] (per-symbol routed-mass divergence = 0 for both clusters);
across the FULL 3072-call ANS trace the spec C.3.2
(symbol, offset) = AliasMapping(state & 0xFFF) invariant holds
bit-for-bit when checked against either cluster 0 or cluster 1's
alias table (0 hard violations; 288 ambiguous calls where both
clusters yield the same (symbol, offset, prob)). Per-call state
arithmetic state = prob * (state >> 12) + offset also reproduces
the trace exactly. Cluster usage breakdown: c0=1755 calls,
c1=1317 calls, unknown=0 (no cross-talk into HFMetadata clusters
2/3/4). The d1 ANS final-state delta of 0x21914271 - 0x00130000 ≈ 562M is therefore NOT caused by a per-cluster D[]
shape mismatch, alias-table self-map / Vose-pump bug,
alias-mapping lookup bug, per-call state-arithmetic bug, or
cluster-routing leakage. Round 25 candidates: (1) D[]-vs-cjxl
reference comparison (a single mismatched count would be the
smoking gun), (2) leaf-pick + cluster-routing audit at samples
beyond sample 22 up to sample 79 (where r23's first ctx-flip was
observed), (3) HFMetadata stream-boundary cross-talk audit. New
diagnostic tests/round24_d1_disttrace.rs (Auditor mode, never
asserts) with two tests:
d1_per_cluster_distribution_byte_trace_round_24(path 1) and
d1_per_call_alias_mapping_invariant_round_24(path 2). Full
audit notes in crates/oxideav-jpegxl/round24-d1-disttrace.md.
Test count 343 → 345 (+2).
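
The per-call state arithmetic audited in this round can be illustrated with the sketch below. One substitution is made and should be read as such: the lookup here is a plain cumulative-frequency search standing in for the spec's AliasMapping, so the (symbol, offset) pairs it yields are not those of the real alias table. Only the 12-bit slot extraction and the state update formula are taken from the entry; the function names are hypothetical and refill is omitted.

```rust
/// Total probability mass of one distribution (12-bit precision).
const ANS_TOTAL: u32 = 1 << 12;

/// Stand-in for AliasMapping: cumulative-frequency lookup returning
/// (symbol, offset, prob) for a 12-bit slot. NOT the alias method.
fn lookup(d: &[u32], slot: u32) -> Option<(usize, u32, u32)> {
    let mut start = 0u32;
    for (symbol, &prob) in d.iter().enumerate() {
        if slot < start + prob {
            return Some((symbol, slot - start, prob));
        }
        start += prob;
    }
    None
}

/// One decode step without refill:
/// state' = prob * (state >> 12) + offset.
/// Returns None if the distribution does not sum to 4096.
fn decode_step(d: &[u32], state: u32) -> Option<(usize, u32)> {
    if d.iter().map(|&p| u64::from(p)).sum::<u64>() != u64::from(ANS_TOTAL) {
        return None;
    }
    let (symbol, offset, prob) = lookup(d, state & 0xFFF)?;
    Some((symbol, prob * (state >> 12) + offset))
}
```

An invariant audit of the kind described replays every recorded call through a function like `decode_step` and compares the resulting state with the traced one.
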
- Round 22 (2024-spec, Auditor mode): pursued round-21 candidates
(a) lf_quant first-256-sample dump per channel and (c) WP (p+3)>>3
rounding bias toggle on the d1 LfCoefficients sub-bitstream. Result:
WP-rounding-bias bug class falsified. Added a runtime atomic
WP_ROUND_BIAS(default 3, spec-conformant per ISO/IEC 18181-1:2024
Table H.3 + FDIS-2021 Listing C.16) so the auditor can sweep biases
without recompile. Sweeps recorded post-decode ANS final state for
bias ∈ {0, 3, 4, 7}: 0 → 0x0042cd42 (|Δ|=3 132 738), 3 → 0x21914271
(|Δ|=561 922 673, spec), 4 → 0x00fd721e (|Δ|=15 364 638), 7 →
0x001214ac (|Δ|=60 244). All four miss the §D.3.3 sentinel
0x00130000; the +7 bias being closest proves the variation is
ANS-chain noise from leaf-flip cascades, not a true rounding bug.
Per-channel lf_quant dump (Y'/X'/B', 1024 samples each, 32×32) shows
smooth low-frequency shape with sane stats (Y' mean=468 min=326
max=644; X' mean=14 min=−125 max=135; B' mean=41 min=−49 max=123),
consistent with a real-image fixture and proving the per-sample
decode loop is producing plausible data — not garbage. WP+3 vs +4
diverges first at Y' sample 22 (row 0, col 22), localising the actual
bug to a specific MA-tree leaf-flip at that sample. New diagnostic
tests/round22_d1_sample_dump.rs (Auditor mode, never asserts) dumps
both the lf_quant table and the bias-sweep final states; full audit
notes in crates/oxideav-jpegxl/round22-d1-sampledump.md. Test count
337 → 338 (+1).
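
A sketch of how a runtime bias toggle and the sentinel-distance bookkeeping from this round might look. The name WP_ROUND_BIAS, its default of 3, the (p + bias) >> 3 form and the 0x00130000 sentinel come from the entry; the AtomicI32 type, the Relaxed ordering and the helper names `wp_round` and `sentinel_distance` are assumptions.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// Runtime-adjustable rounding bias; 3 is the spec-conformant default.
static WP_ROUND_BIAS: AtomicI32 = AtomicI32::new(3);

/// Rounded divide-by-8 with the current bias: (p + bias) >> 3.
fn wp_round(p: i32) -> i32 {
    (p + WP_ROUND_BIAS.load(Ordering::Relaxed)) >> 3
}

/// Expected ANS state after the final symbol of a stream.
const ANS_SENTINEL: u32 = 0x0013_0000;

/// Absolute distance of a post-decode state from the sentinel.
fn sentinel_distance(final_state: u32) -> u32 {
    final_state.abs_diff(ANS_SENTINEL)
}
```

Applied to the four recorded final states, `sentinel_distance` reproduces the |Δ| column of the sweep: 3 132 738, 561 922 673, 15 364 638 and 60 244 for biases 0, 3, 4 and 7 respectively.
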
- Round 21 (2024-spec, Auditor mode): pursued round-20 candidates
(1) per-cluster distribution decode bisect and (2) alias-table
self-map branch audit on the d1 LfCoefficients sub-bitstream.
Result: both paths falsified. The 5 per-cluster ANS distributions
(clusters 0..4) all sum to 4096 with sane shapes (cluster sizes
19/23/5/2/2 nonzero entries out of 64); cluster 1's full 64-entry
alias table reconciles with the round-19 bit-faithful trace at calls
#0 and #1. Critically, none of the five clusters has any D[i] == bucket_size entry, so the alias-table self-map branch (round-3
fix territory) is not triggered for d1. Documented one strict-spec
divergence in AliasTable::build (else vs spec's else if (cutoffs[i] < bucket_size)) that has zero observable effect on
d1; hand-tracing the equal-bucket path confirms output-equivalent
behaviour. New diagnostic tests/round21_d1_dist_alias_dump.rs
(Auditor mode, never asserts) captures per-cluster (cfg, D, alias)
triples + cluster-1 full alias dump as evidence; full bisect notes
in crates/oxideav-jpegxl/round21-d1-distbisect.md. Test count
336 → 337 (+1).
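
The per-cluster sanity checks from this round (sum to 4096, nonzero-entry count, presence of any D[i] == bucket_size entry) could be expressed along these lines. `DistAudit` and `audit_distribution` are hypothetical names, and bucket_size is passed in by the caller rather than derived; for a 64-entry table it is assumed here to be 4096 / 64 = 64.

```rust
/// Total probability mass each distribution must sum to.
const ANS_TOTAL: u32 = 4096;

/// Summary of one distribution's shape (hypothetical struct).
#[derive(Debug, PartialEq)]
struct DistAudit {
    sum_ok: bool,
    nonzero: usize,
    has_full_bucket: bool,
}

/// Audit a distribution D[] against the checks named in the entry.
fn audit_distribution(d: &[u32], bucket_size: u32) -> DistAudit {
    DistAudit {
        sum_ok: d.iter().map(|&p| u64::from(p)).sum::<u64>() == u64::from(ANS_TOTAL),
        nonzero: d.iter().filter(|&&p| p != 0).count(),
        // An entry exactly equal to bucket_size is what would route
        // the alias-table build through its self-map branch.
        has_full_bucket: d.iter().any(|&p| p == bucket_size),
    }
}
```
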
- Round 20 (2024-spec, Auditor mode): re-interpreted cjxl
JXL_TRACE output's bits_consumed field as section-local (not
cumulative file position), invalidating the round-17/18/19 claim of a
267-bit overshoot in LfCoefficients. Empirical proof: in the same
trace, AC_GLOBAL_END bits_consumed=307 while DC_GLOBAL_END=1026,
and 307 < 1026 precludes a cumulative reading. With the corrected
interpretation DC_GROUP is 12754 bits (not 11728), LfCoefficients
fits well within the budget, and HfMetadata's slot is 759 bits.

Identified a stronger oracle for the actual divergence: per FDIS
D.3.3, the ANS state must equal 0x00130000 after the final symbol
in any stream. Wired LATEST_ANS_STATE / LATEST_ANS_CALL_COUNT
thread-locals (in src/ans/symbol.rs) so a test can read the
post-decode state without holding the per-stream MaTreeFdis clone.
On d1's LfCoefficients the final state is 0x21914271 after 3072
decode_symbol calls, proving a structural decode divergence (wrong
per-cluster distribution, wrong alias mapping, wrong sample count, or
wrong read in the per-sample loop). The state never reaches the
sentinel within 3072 calls, so it's not a sample-count off-by-one.

Lifted the previous 30-call cap on STATE_TRACE_BUF so end-of-stream
bisects over multi-thousand-sample LF channels are tractable. Five
new tests in tests/round20_d1_*.rs. See
crates/oxideav-jpegxl/round20-d1-hfmeta.md for the full audit and
the round-21 candidate ranking.
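
One plausible shape for the thread-local oracle described above, as a sketch under assumptions: the names LATEST_ANS_STATE and LATEST_ANS_CALL_COUNT and the 0x00130000 sentinel come from the entry, while the Cell types and the helpers `record_state`, `stream_ended_cleanly` and `latest_call_count` are hypothetical.

```rust
use std::cell::Cell;

/// ANS state required after the final symbol of any stream.
const ANS_FINAL_SENTINEL: u32 = 0x0013_0000;

thread_local! {
    /// Last state seen by the decoder on this thread.
    static LATEST_ANS_STATE: Cell<u32> = Cell::new(0);
    /// Number of decode calls recorded on this thread.
    static LATEST_ANS_CALL_COUNT: Cell<u64> = Cell::new(0);
}

/// Called by the decoder after every symbol (hypothetical hook).
fn record_state(state: u32) {
    LATEST_ANS_STATE.with(|s| s.set(state));
    LATEST_ANS_CALL_COUNT.with(|c| c.set(c.get() + 1));
}

/// Test-side check: did the stream end on the sentinel?
fn stream_ended_cleanly() -> bool {
    LATEST_ANS_STATE.with(|s| s.get()) == ANS_FINAL_SENTINEL
}

/// Test-side read of the call counter.
fn latest_call_count() -> u64 {
    LATEST_ANS_CALL_COUNT.with(|c| c.get())
}
```

Thread-locals keep the hook free of locking and let each test thread observe only its own decode, at the cost of requiring the decode and the check to run on the same thread.
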
- Round 19 (2024-spec, Auditor mode): extended the per-token
trace ring with (ctx, cluster, ans_refill_bits) and added a
STATE_TRACE_BUF recording the first 30 ANS state transitions for
spot-checking against raw codestream bits. New
AnsDecoder::decode_symbol_with_refill reports refill-bit cost. New
tests/round19_d1_cluster.rs drives d1 LfCoefficients under the
extended trace and emits per-cluster / per-ctx histograms plus a
diagnostic eprintln on the leaf-stream EntropyStream::read prelude
bit count. Findings: prelude is bit-exact (602 bits matching cjxl's
num_contexts=16 num_histograms=5 log_alpha_size=6), cluster_map is
bit-exact (16 → 5 distinct clusters), state transitions are
bit-faithful to raw codestream. The 267-bit overshoot remains
unexplained; deferred to round 20 with a cjxl --debug per-call
bit-position trace as the proposed next step. See
crates/oxideav-jpegxl/round19-d1-cluster.md for the full audit.
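
A capped state-transition recorder of the kind described can be sketched as follows. `StateTrace` and its fields are hypothetical; only the idea of keeping the first 30 transitions is taken from the entry. The cap is optional here so the same type also covers an uncapped trace.

```rust
/// Records (state_before, state_after) pairs, optionally keeping only
/// the first `cap` transitions (hypothetical type).
struct StateTrace {
    cap: Option<usize>,
    entries: Vec<(u32, u32)>,
}

impl StateTrace {
    fn new(cap: Option<usize>) -> Self {
        StateTrace { cap, entries: Vec::new() }
    }

    /// Append a transition unless the cap has been reached.
    fn record(&mut self, before: u32, after: u32) {
        let has_room = self.cap.map_or(true, |c| self.entries.len() < c);
        if has_room {
            self.entries.push((before, after));
        }
    }
}
```

Keeping the first N transitions rather than the last N suits start-of-stream spot checks against raw codestream bits; an end-of-stream bisect needs the uncapped form.
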