Other
- round-309 (parent-dispatch r309) against ISO/IEC FDIS 18181-1:2021 — per-LfGroup VarDCT three-channel residual-plane assembly + Annex G chroma-from-luma
- round-306 (parent-dispatch r306) against ISO/IEC FDIS 18181-1:2021 — per-LfGroup VarDCT residual-plane assembly (§C.5.4 placement + §C.8.3 + Table I.4/§I.2.3 pixel-dims)
- cover non-DCT transforms (Hornuss/DCT2x2/DCT4x4/DCT4x8/DCT8x4/AFV) — round 300
- round-293 (parent-dispatch r293) against ISO/IEC FDIS 18181-1:2021 — extend the per-block VarDCT decode walk to every plain separable-DCT transform (rectangular + DCT64..256 family), lifting the round-286 square-only orientation deferral
- Round 286 — per-block VarDCT decode walk to spatial samples (square DCTs)
- round-281 (parent-dispatch r281) against ISO/IEC FDIS 18181-1:2021 — two §C.8.3 decode-walk prose-conformance fixes: per-varblock channel decode order is Y, X, then B (rounds 221..264 advanced the entropy stream X-first; Listing C.13's (c < 2 ? c ^ 1 : 2) mapping independently corroborates Y-first) + NonZeros(x, y) writeback covers every block of the varblock footprint per the 'for each block in the current varblock' prose (rounds 177..264 wrote only the top-left cell, corrupting PredictedNonZeros reads against continuation cells of multi-cell transforms)
- round-278 (parent-dispatch r278) against ISO/IEC FDIS 18181-1:2021 — FIX the rounds-31..272 noise-64x64-lossless WP pixel divergence: Listing E.2 error2weight Idiv-first operand order + true_errNW column-0 N-fallback, both pinned by the staged wp-trace-sample-194.md; noise-64x64-lossless now byte-exact on all three planes and synth_320 pixel-exact (102400/102400)
- name + pin the WP sub_err reading choice (Annex E.1)
- round-264 (parent-dispatch r264) against ISO/IEC FDIS 18181-1:2021 — HfHistogramDecodeContext::decode_lf_group_three_channels_for_pass bundled per-LfGroup raster-walk three-channel decode driver for one pass
- round-260 (parent-dispatch r260) against ISO/IEC FDIS 18181-1:2021 — HfHistogramDecodeContext::decode_three_channel_varblock_for_pass bundled three-channel per-varblock walk composing the round-255 single-channel decode method three times against the round-214 BlockContextResolver per-channel block_ctx derivation
- round-255 (parent-dispatch r255) against ISO/IEC FDIS 18181-1:2021 — HfHistogramDecodeContext::decode_block_for_pass_transform bundled per-varblock decode method composing the round-90 Listing C.14 state machine with the round-252 per-pass histogram routing
- round-252 (parent-dispatch r252) against ISO/IEC FDIS 18181-1:2021 — multi_pass_hf_histogram_decoder::HfHistogramDecodeContext typed bridge wiring the §C.7.2 entropy stream to the §C.8.3 per-pass histogram_offset routing
- drop release-plz.toml — use release-plz defaults across the workspace
- round-247 (parent-dispatch r247) against ISO/IEC FDIS 18181-1:2021 — §C.7.2 HfCoefficientHistograms wrapper performing the actual EntropyStream::read of the 495 × num_hf_presets × nb_block_ctx clustered-distributions block
- round-238 (parent-dispatch r238) against ISO/IEC FDIS 18181-1:2021 — hf_coeff_histogram_size typed sizing primitive for §C.7.2 + §C.8.3 routing offset
- round-232 (parent-dispatch r232) against ISO/IEC FDIS 18181-1:2021 — per-LfGroup multi-pass HF-header + per-pass histogram_offset routing driver (§C.8.3 first paragraph)
- round-228 (parent-dispatch r228) against ISO/IEC FDIS 18181-1:2021 — per-LfGroup multi-pass three-channel varblock decode driver (§C.8.3 + Table C.6 Passes outer pass loop)
- round-221 (parent-dispatch r221) against ISO/IEC FDIS 18181-1:2021 — three-channel per-LfGroup varblock decode driver (§C.8.3 outer-varblock × inner-X/Y/B sweep)
- round-214 (parent-dispatch r214) against ISO/IEC FDIS 18181-1:2021 — per-LfGroup BlockContext() resolver (§C.8.3 Listing C.13 + §I.2.2 HfBlockContext bundle)
- round-208 (parent-dispatch r208) against ISO/IEC FDIS 18181-1:2021 — per-LfGroup varblock-walk driver (§C.5.4 + §C.8.3)
- round-202 (parent-dispatch r202) against ISO/IEC FDIS 18181-1:2021 — full-row WP state-evolution chain validation across noise-64x64-lossless samples 192..=200
- Hat-2 scrub: replace 'libjxl' decorative-attribution lines with neutral terms
- r195 fix — add serial_test for r195 WP trace tests
Added
-
Round 309 — per-LfGroup VarDCT three-channel spatial-reconstruction
layer (src/residual_plane.rs), lifting the round-306 single-channel
assemble_channel_plane to the X / Y / B level and applying Annex G
chroma-from-luma. New public API: ChannelResidualPlanes (the three XYB
residual planes of one LfGroup, all on the shared padded block grid,
channel order 0 = X / 1 = Y / 2 = B per Listing C.13, with x() / y()
/ b() / dims() accessors);
assemble_three_channel_planes(grid, residual_at) (walks the shared
dct_select::DctSelectGrid once per
channel via assemble_channel_plane, invoking the caller's
residual_at(channel, &vb) decode closure — in VarDCT mode all three
channels share one DctSelect grid per §C.5.4, and Annex G CfL "is skipped
if any channel is subsampled," so the three planes are geometrically
identical); apply_chroma_from_luma(planes, x_from_y, b_from_y, cfl)
(applies Annex G Listing G.1 in place via the round-138
chroma_from_luma::apply_hf_plane_inplace: X = dX + kX·Y,
B = dB + kB·Y, with (kX, kB) looked up per the 64×64 tile containing
the sample; after the call X-plane holds final X, B-plane final B, Y
unchanged); and the one-call driver
reconstruct_three_channel_planes(grid, x_from_y, b_from_y, cfl, residual_at)
(= assemble + CfL). 9 unit +
5 integration (round309_three_channel_residual_plane, composing the
real F.3-dequant + I.2.3-IDCT walk across all three channels then the
real Annex G CfL end-to-end) tests. Lib tests 788 → 797 (+9).
Pure-control-flow composition primitive — no bit reads, no spec
re-derivation, no histogram materialisation. Gaborish (Annex J.2) + EPF
(Annex J.3) run on the returned planes and remain caller-side concerns.
Source of truth: ISO/IEC FDIS 18181-1:2021 §C.5.4 (DctSelect placement) +
Annex G (chroma-from-luma, Listing G.1) + §F.3 / §I.2.3 (dequant + IDCT).
-
Round 306 — per-LfGroup VarDCT residual-plane assembly
(src/residual_plane.rs), the spatial-placement layer directly above
the round-286/293/300 block_dequant per-block decode walk. Walks a
dct_select::DctSelectGrid via varblock_walk::VarblockWalk and
writes each varblock's R × C row-major residual block (the
block_dequant::decode_block_to_residual output) into a single-channel
spatial plane at the varblock's pixel origin (bx · 8, by · 8). New
public API: ResidualPlane (row-major f32 plane sized to the padded
block grid width_blocks·8 × height_blocks·8, for_grid / get);
block_pixel_dims(t) (the (R, C) pixel shape from
idct::dct_pixel_dims ∪ non_dct_pixel_dims, covering every
TransformType); place_block(plane, vb, block) (verbatim copy with
length-mismatch + footprint-spill rejection); and
assemble_channel_plane(grid, residual_at) (raster-order grid walk
invoking the caller's per-varblock decode closure once per top-left
cell, continuation cells skipped, residual-Empty cell rejected). The
plane is the padded block grid (no per-edge clamping; caller crops to
lf_w × lf_h). The geometry invariant C == block_dims().0 · 8 /
R == block_dims().1 · 8 is pinned for every transform. The IDCT
output already carries the LLF/DC contribution (§I.2.5) so no separate
DC add at placement; chroma-from-luma / Gaborish / EPF run on the
assembled plane and remain caller-side concerns. 14 unit + 5
integration (round306_residual_plane, composing the real F.3-dequant +
I.2.3-IDCT walk end-to-end) tests. Lib tests 774 → 788 (+14).
Pure-control-flow geometry primitive — no bit reads, no spec
re-derivation, no histogram materialisation. Source of truth:
ISO/IEC FDIS 18181-1:2021 §C.5.4 (DctSelect placement) + §C.8.3 +
Table I.4 / §I.2.3 (pixel-dims).
-
Round 300 — extend the per-block VarDCT decode walk
(src/block_dequant.rs) to the non-DCT transforms: Hornuss,
DCT2×2, DCT4×4, DCT4×8, DCT8×4, and AFV0..AFV3 — i.e. exactly the set
for which idct::non_dct_pixel_dims returns Some (all 8 × 8).
This lifts the round-293 deferral. The deferral worried that the
AFV / DCT2×2 sub-block coefficient extraction "does not reduce to a
flat identity over an 8 × 8 grid", but per ISO/IEC FDIS 18181-1:2021
the §I.2.3 sub-block re-mapping happens inside the inverse-transform
dispatch (idct_afv, idct_dct2x2, …), which the spec applies
after the Annex F.3 dequant. The §F.3 dequant stage is uniform: it
multiplies each stored coefficient by a multiplier keyed on "the
channel, the transform type and the coefficient index inside the
varblock". For every non-DCT transform the varblock is the 8 × 8
OrderId-1 grid (coeff_order::varblock_size_for_order → (8, 8)),
the dequant matrix is the 8 × 8 slot matrix
(weights_matrix_dims_for_slot → (8, 8) for slots 1 / 2 / 3 / 9 /
10), and the decoded block is already in raster index space
(coeffs[natural_order[k]], natural_order[k] = y·bwidth + x), so
the per-cell dequant is the identity raster map — exactly as for the
square / rectangular DCT family, with no orientation subtlety
(bwidth == bheight == 8). covered_grid_dims now returns Some for
every TransformType; require_covered's Unsupported path now only
guards a hypothetical future variant lacking a pixel-dims mapping.
+3 lib tests (non-DCT all-zero residual census; non-DCT single-coeff
per-sample-formula identity; AFV0..AFV3 shared-slot/grid dequant
equality; chained == manual dequant-then-IDCT for the non-DCT path).
Lib tests 771 → 774.
-
Round 293 — extend the per-block VarDCT decode walk
(src/block_dequant.rs) from the three square plain-DCT transforms
to every plain separable-DCT transform: the rectangular
DCT16×8 / DCT8×16 / DCT32×8 / DCT8×32 / DCT32×16 / DCT16×32 family and
the larger DCT64×64 … DCT256×256 family. The round-286 orientation
deferral is lifted by pinning, against ISO/IEC FDIS 18181-1:2021
§I.2.4 + Table I.4 + Annex I.2.3.2, that the decoded coefficient grid
(varblock_size_for_order → (bwidth, bheight), bwidth >= bheight)
and the dequant matrix (weights_matrix_dims_for_slot →
(cols, rows) = (bwidth, bheight)) share one "wide"
bwidth × bheight row-major layout, which is exactly the
(short × long) "spec coefficient layout" idct_for_transform
already consumes. A rectangular transform and its transpose
(e.g. DCT16×8 / DCT8×16) share one coefficient grid and one dequant
matrix; they differ only in the pixel orientation (R, C) the IDCT
emits, so the per-cell dequant is the identity and no transpose is
needed in this stage. New public API
covered_grid_dims(t) -> Option<(bwidth, bheight)> (the full
plain-DCT covered set, keyed off
dct_pixel_dims); covered_square_dim retained for the square
subset; dequant_block_for_transform / decode_block_to_residual
now accept the whole plain-DCT set. The non-DCT transforms
(Hornuss / DCT2×2 / DCT4×4 / DCT4×8 / DCT8×4 / AFV0..AFV3) stay
Error::Unsupported — their dequant matrix is canonicalised to 8×8
while their IDCT path is the §I.2.3 dispatch, so the sub-block
coefficient extraction does not reduce to a flat 8×8 identity.
+4 unit tests (transpose-pair grid/matrix sharing, full plain-DCT
covered-set census, rectangular all-zero + pure-DC residuals);
lib tests 767 → 771.
-
Round 286 — first per-block VarDCT decode-walk stage that reaches
spatial samples (src/block_dequant.rs). Chains the §C.8.3 decoded
quantised-coefficient block through Annex F.3 HF dequantisation and
the Annex I.2.3.2 inverse DCT for the square plain-DCT transforms
(DCT8×8 / DCT16×16 / DCT32×32), where the coefficient grid, the
dequantisation matrix, and the inverse-DCT input all share one
unambiguous dim × dim row-major layout. New public API:
dequant_block_for_transform (Annex F.3 across the whole raster,
per-cell dequant-matrix entry via slot_for_transform),
decode_block_to_residual (dequant → idct_for_transform), and
covered_square_dim. Rectangular / non-DCT transforms return
Error::Unsupported, deferred to a follow-up round so their
coefficient-grid-vs-pixel-block orientation can be pinned
independently. 11 unit tests; lib tests 756 → 767.
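The Round 309 chroma-from-luma step described above (X = dX + kX·Y, B = dB + kB·Y, with the factors looked up per 64×64 tile) can be sketched as follows. This is a minimal illustration, not the crate's apply_chroma_from_luma: the CflTile struct, the apply_cfl_sketch function, and the flat row-major plane layout are assumptions made for the example, and the factors are taken as already-resolved f32 values rather than derived from x_from_y / b_from_y.

```rust
/// Hypothetical per-tile factors (illustration only, not the crate's types).
#[derive(Clone, Copy)]
struct CflTile {
    k_x: f32,
    k_b: f32,
}

/// Applies X = dX + kX*Y and B = dB + kB*Y in place.
/// `x`, `y`, `b` are row-major planes of `width * height` samples; `tiles`
/// holds one entry per 64x64 tile, row-major, `tiles_per_row` entries wide.
fn apply_cfl_sketch(
    x: &mut [f32],
    y: &[f32],
    b: &mut [f32],
    width: usize,
    height: usize,
    tiles: &[CflTile],
    tiles_per_row: usize,
) {
    for py in 0..height {
        for px in 0..width {
            // Factors come from the 64x64 tile containing the sample.
            let tile = tiles[(py / 64) * tiles_per_row + px / 64];
            let i = py * width + px;
            x[i] += tile.k_x * y[i];
            b[i] += tile.k_b * y[i];
        }
    }
}

fn main() {
    // One row of 128 samples spans two tiles with different factors.
    let (width, height) = (128, 1);
    let tiles = [
        CflTile { k_x: 0.5, k_b: 1.0 },
        CflTile { k_x: 1.0, k_b: 0.0 },
    ];
    let y = vec![2.0_f32; width * height];
    let mut x = vec![1.0_f32; width * height];
    let mut b = vec![1.0_f32; width * height];
    apply_cfl_sketch(&mut x, &y, &mut b, width, height, &tiles, 2);
    println!("x[0]={} x[64]={} b[0]={} b[64]={}", x[0], x[64], b[0], b[64]);
}
```

The Y plane is read-only here, matching the entry's statement that Y is unchanged after the call.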
Fixed
-
Round 281 — two §C.8.3 decode-walk prose-conformance fixes against
ISO/IEC FDIS 18181-1:2021, both affecting the (not-yet-wired)
VarDCT HF coefficient path. (1) Per-varblock channel decode
order is Y, X, then B — the §C.8.3 prose reads "for each
varblock it reads channels Y, X, then B"; rounds 221..264 advanced
the entropy stream X-first. Fixed in
block_context_resolver::decode_varblocks_three_channels_with_resolver
(round 221; also feeds the round-228 multi-pass and round-232
HF-header drivers) and
HfHistogramDecodeContext::decode_three_channel_varblock_for_pass
(round 260; also feeds the round-264 per-LfGroup driver). Output
arrays stay indexed 0 = X / 1 = Y / 2 = B per Listing C.13's
"c is the current channel (with 0=X, 1=Y, 2=B)" — only the
stream-advance order changed. The Listing C.13 BlockContext()
channel mapping (c < 2 ? c ^ 1 : 2) (Y → 0, X → 1, B → 2)
independently corroborates Y-first decode order. (2)
NonZeros(x, y) writeback covers every block of the varblock
— the prose reads "The decoder then computes the NonZeros(x, y)
field for each block in the current varblock"; rounds 177..264
wrote only the top-left cell, so a neighbouring varblock's
PredictedNonZeros(x, y) reading a continuation cell of a
multi-cell transform (e.g. the second row/column of a DCT16×16)
saw the zero-init sentinel instead of the varblock's
ceiling-divided value. NonZerosGrid::update_after_block_for_transform
now fills the full TransformType::block_dims() footprint
(rejecting footprints that spill outside the grid); the
per-channel / per-pass wrappers and every typed driver above them
inherit the fix. Ordering + footprint tests rewritten to the
prose readings across round177 / round183 / round190 /
round221 / round228 suites plus the in-module unit tests; new
rectangular-footprint (DCT16×8 1×2-cell vs DCT8×16 2×1-cell) and
footprint-spill rejection pins. Tests 1156 → 1159.
-
Round 278 — the long-standing
noise-64x64-lossless Weighted-Predictor pixel divergence
(rounds 31..272) is FIXED; the fixture
decodes byte-exact on all three planes and the round-10 synth_320
drift is gone (102400/102400 pixels correct). Two FDIS Annex E
readings in modular_fdis::wp_predict, both pinned by the staged
behavioural trace
(docs/image/jpegxl/fixtures/noise-64x64-lossless/wp-trace-sample-194.md):
(1) Listing E.2 error2weight performs the inner
(1 << 24) Idiv ((err_sum >> shift) + 1) division FIRST and
multiplies the truncated quotient by maxweight (the FDIS-2021
parenthesisation) — the trace's 52 full-precision
(err_sum, weight) cells (samples 188..200) all match this
reading while the previous multiply-first form mismatches 18 of
them; (2) the true_errNW read falls back to true_errN when NW
does not exist (x = 0), matching the H.5.2 NW/NE→N edge rule the
err_sum accumulator reads already applied — the previous zero
fallback corrupted every column-0 prediction and produced the
sample-129 Δ = -21 state-evolution divergence. Root-caused via a
from-scratch Annex E state-evolution sweep over the fixture's
known-correct decoded values across every contested reading knob:
exactly one combination reproduces all 13 traced samples plus the
three known row-2 stored true_err cells (737 / -456 / -165), and
it differs from production only in these two readings. The
production 8x-domain sub_err reading (round 272) is confirmed —
the literal reading now breaks the fixture at plane[0] sample 68.
New error2weight_pub oracle + tests/r278_error2weight_trace.rs
(3 tests) pin the 52 trace cells and the operand order; 12
historical divergence-pin tests across 6 files
(r32/round10/r126/r195/r202/r272) promoted to
spec/pixel-exact assertions. Tests 1153 → 1156.
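The operand-order distinction behind the Round 278 error2weight fix can be illustrated in isolation. The sketch below is not Listing E.2 itself: it models only the quotient-and-product core, omits the surrounding shift handling, and the function names and sample operands (maxweight = 13, divisor = 3) are hypothetical. It shows that dividing first and scaling the truncated quotient yields a different integer than multiplying first.

```rust
/// Divide-first reading: truncate (1 << 24) / divisor, then scale.
fn weight_divide_first(maxweight: u64, divisor: u64) -> u64 {
    maxweight * ((1u64 << 24) / divisor)
}

/// Multiply-first reading: scale, then truncate once at the end.
fn weight_multiply_first(maxweight: u64, divisor: u64) -> u64 {
    (maxweight * (1u64 << 24)) / divisor
}

fn main() {
    // Hypothetical operands chosen so the division leaves a remainder.
    let (maxweight, divisor) = (13u64, 3u64);
    let divide_first = weight_divide_first(maxweight, divisor);
    let multiply_first = weight_multiply_first(maxweight, divisor);
    println!(
        "divide-first = {}, multiply-first = {}",
        divide_first, multiply_first
    );
    // The truncation error of the inner quotient is amplified by maxweight.
    assert_ne!(divide_first, multiply_first);
}
```

The two forms agree whenever the divisor divides 1 << 24 exactly, so only some inputs distinguish them.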
Added
-
Round 272 — extracted the Weighted-Predictor post-decode
sub_err_i computation (FDIS Annex E.1 / §H.5.2) into the named
modular_fdis::sub_err_for (8x-domain magnitude-then-round reading,
used on the decode path) plus a modular_fdis::sub_err_fdis_literal
reference oracle for the literal FDIS-2021 listing reading
abs(((prediction_i + 3) >> 3) - true_value). New
tests/r272_sub_err_reading.rs (4 tests) pins the reading choice as
a regression guard: the two readings coincide for every non-negative
sub-prediction (so both reproduce the noise-64x64-lossless
sample-194 trace value sub_err = [122, 59, 18, 36]) but diverge for
negative sub-predictions; and the production decode path must keep
synth_320's round-10 drift anchor at PG[0][0] (y=24, x=14) — the
literal reading moves it EARLIER to (y=11, x=104) (decodes the
fixture less far), confirming the 8x-domain reading is the
bisect-validated one. Round 272 also ruled the sub_err reading OUT
as the cause of the residual noise-64x64-lossless sample-129
Δ = -21 WP state-evolution divergence (switching readings leaves
that fixture's divergence profile unchanged).
-
Round 264 —
multi_pass_hf_histogram_decoder::HfHistogramDecodeContext::decode_lf_group_three_channels_for_pass
bundled per-LfGroup raster-walk three-channel decode driver for one
pass against ISO/IEC FDIS 18181-1:2021 §C.8.3 — one
(br, p, grid, resolver, qdc_at, predicted_at) call walks the
DctSelectGrid in raster order via VarblockWalk, invokes the
caller's per-varblock qdc_at + predicted_at closures once per
varblock to read the shared qdc[3] triple and the per-channel
predicted[3] triple, then composes the round-260
decode_three_channel_varblock_for_pass bundled three-channel walk
to yield one ThreeChannelVarblock per top-left cell. Returns the
in-raster-order Vec<ThreeChannelVarblock> per the round-221 / 228
/ 260 type alias. The driver owns both the raster walk and the
§C.7.2 entropy-stream routing through the round-252 typed decode
context — no read_non_zeros / decode_symbol closures cross the
boundary, only the storage-only qdc_at + predicted_at lookups
do. Per-varblock ordering: qdc_at fires before predicted_at;
per-LfGroup ordering: row-major (DctSelectGrid raster). Defensive
shape: propagates VarblockWalk::next errors (residual Empty
cell), closure errors (qdc_at aborts before predicted_at;
predicted_at error aborts before the inner method runs), and any
inner decode_three_channel_varblock_for_pass error verbatim. On
closure error the per-varblock cursor halts without advancing the
BitReader past the failing call. Empty grid
(width × height == 0) yields an empty output vector. 11 unit + 10
integration
(round264_lf_group_three_channels_for_pass) tests pin: 1×1 DCT8×8
short-circuit; 2×2 / 3×3 uniform raster ordering ((0,0), (1,0),
(0,1), (1,1) — row-major); per-varblock qdc → predicted → decode
ordering; per-pass offset routing matches round-260 cluster_map
indexing for both p = 0 and p = 1; mixed-transform grid
(DCT16×16 single varblock covering 2×2 cells) emits one
varblock with coeffs.len() == 256 per channel; out-of-range pass
index rejected; residual Empty cell rejected (VarblockWalk error
propagated); closure errors (qdc_at / predicted_at) propagated
without advancing the BitReader past the failing call; round-trip
with PerPassHfHeaders::read driven off a real bitstream
preserves per-pass histogram offsets across both passes; empty
grid yields empty vector. Lib tests 742 → 753 (+11).
-
Round 260 —
multi_pass_hf_histogram_decoder::HfHistogramDecodeContext::decode_three_channel_varblock_for_pass
bundled three-channel per-varblock walk against ISO/IEC FDIS
18181-1:2021 §C.8.3 — one
(br, p, vb, resolver, qdc, predicted[3]) call composes the
round-255 single-channel decode_block_for_pass_transform three
times (channel order X = 0 → Y = 1 → B = 2 per the §C.8.3 listing
sequence) against the round-214
BlockContextResolver::resolve(c, vb, qdc) per-channel Listing
C.13 block_ctx derivation, returning the per-channel
([DecodedHfBlock; 3], [u32; 3]) pair (decoded coefficient bundle
plus the un-divided raw_non_zeros triple the caller threads into
the per-channel NonZeros-grid bookkeeping). The nb_block_ctx
invariant is read off resolver.nb_block_ctx() so the caller does
not have to pass it separately; the qdc[3] triple is shared
across the three channels per round-221's per-varblock invariant
(one read, three lookups). Channel ordering is fixed at X → Y → B
— the §C.7.2 entropy stream advances in that order; an error on Y
aborts before B reads, so the B-channel ANS state is not
advanced (matching round-221's error-path invariant). Defensive
shape: propagates any BlockContextResolver::resolve error
(channel > 2, s out-of-range, threshold-table inconsistency)
and any decode_block_for_pass_transform error (out-of-range
pass index, u32-overflow ctx + offset, downstream
EntropyStream error, or non_zeros > size - num_blocks cap)
verbatim. 8 unit + 11 integration
(round260_three_channel_varblock_for_pass) tests pin: DCT8×8 /
DCT16×16 / DCT16×8 / DCT8×16 / DCT4×4 per-channel short-circuit
to raw == [0, 0, 0] → coeffs_read == 0 → all-zero coeffs vector
of the right length; per-pass offset routing matches round-252
cluster_map indexing for both p = 0 and p = 1 against a
2-preset bundle; out-of-range pass index rejected; u32 overflow
on ctx + offset rejected; BitReader cursor unchanged on a
short-circuited three-channel block; round-trip with
PerPassHfHeaders::read driven off a real bitstream preserves
the per-pass histogram offsets across both passes; per-channel
block_ctx values resolved by the BlockContextResolver are < nb_block_ctx
(= 15) for the default-table bundle. Lib tests 734
→ 742 (+8).
-
Round 255 —
multi_pass_hf_histogram_decoder::HfHistogramDecodeContext::decode_block_for_pass_transform
bundled per-varblock decode method closing the round-252 deferred
next-step "per-block raster walk remain caller-side concerns above
this primitive" against ISO/IEC FDIS 18181-1:2021 §C.8.3 + Listing
C.13 + Listing C.14. One
(p, t, predicted, block_ctx, nb_block_ctx) call now wires the
round-90 Listing C.14 state
machine (prev_nonzero[] tracking, non_zeros == 0 early-stop,
non_zeros > size - num_blocks defensive cap) against the
round-252 per-pass histogram routing for one varblock, returning
the round-90 DecodedHfBlock coefficient bundle plus the
un-divided raw_non_zeros for downstream
(raw + num_blocks - 1) Idiv num_blocks NonZeros-grid bookkeeping.
The internal walk is a
single sequential &mut self loop because the two underlying
entry points (non_zeros_at, coefficient_at) each need
&mut self and therefore can't be wrapped into the round-90
read_non_zeros_and_decode_block_for_transform closure pair —
this method is the typed bridge. Defensive shape: rejects
p >= num_passes, ctx + offset > u32::MAX, and num_blocks == 0 /
mismatched natural-order length, all without panicking. 7 unit +
10 integration (round255_decode_block_for_pass_transform) tests
pin: DCT8×8 / DCT16×16 / DCT16×8 / DCT8×16 / DCT4×4 short-circuit
to raw_non_zeros == 0 → coeffs_read == 0 → all-zero coeffs vector
of the right length; per-pass offset routing matches round-252
cluster_map indexing; out-of-range pass index rejected; u32
overflow on ctx + offset rejected; BitReader cursor unchanged on
a short-circuited block; round-trip with PerPassHfHeaders::read
driven off a real bitstream preserves the per-pass histogram
offsets. Lib tests 727 → 734 (+7).
-
Round 252 —
multi_pass_hf_histogram_decoder::HfHistogramDecodeContext typed
bridge that wires the round-247 HfCoefficientHistograms §C.7.2
entropy stream to the round-232 PerPassHfHeaders per-pass
(hfp, histogram_offset) array, closing the round-247 deferred
next-step (the §C.8.3 per-block decode walk through the
freshly-read histograms). Public surface against ISO/IEC FDIS 18181-1:2021:
HfHistogramDecodeContext::new(histograms, headers) validates
per-pass hfp < histograms.num_hf_presets() (defensive
cross-container invariant) + headers.num_passes() ≥ 1, then caches the
per-pass histogram_offset array for a single-array-index
per-symbol path. Three decode entry-points expose the §C.8.3 prose
shape: (1) decode_symbol_for_pass(br, p, ctx) performs the raw
D[ctx + histogram_offset(p)] routing through
EntropyStream::decode_symbol; (2)
non_zeros_at(br, p, predicted, block_ctx, nb_block_ctx) composes
pass_group_hf::non_zeros_context + the per-pass offset routing,
matching the spec's D[NonZerosContext(predicted) + offset] line
exactly; (3)
coefficient_at(br, p, k, non_zeros, num_blocks, size, prev, block_ctx, nb_block_ctx)
composes
pass_group_hf::coefficient_context + the per-pass offset
routing, matching the spec's D[CoefficientContext(...) + offset]
line, and propagates the num_blocks == 0 rejection
without touching the BitReader. The (ctx + offset) sum is
computed in u64 with a defensive u32 overflow check so the
spec-permitted parameter maxima (nb_block_ctx ≤ 256 ×
hfp < num_hf_presets ≤ 2^28) cannot silently truncate. Accessor
surface: num_passes(), histogram_offset(p),
per_pass_offsets() slice. Adds 10 unit tests + 9 integration
tests (tests/round252_multi_pass_hf_histogram_decoder.rs)
pinning: zero-pass rejection (no decode without passes); per-pass
hfp ≥ num_hf_presets cross-container rejection; per-pass offset
caching matches PerPassHfHeaders::histogram_offset independent
read; single-symbol prefix decode for (p, ctx) matrix consumes
zero bits and returns 0; out-of-range pass index rejection;
u32-overflow synthetic histogram_offset rejection;
non_zeros_at composes cleanly with non_zeros_context
(cross-checked against the standalone helper); coefficient_at composes
cleanly with coefficient_context (cross-checked against the
standalone helper); num_blocks == 0 rejection propagation does
not advance the BitReader; round-trip with
PerPassHfHeaders::read against a real bitstream (round-232
derivation) preserves the per-pass offsets. Lib test count
717 → 727 (+10). Pure-control-flow wiring primitive — no spec
re-derivation, no ANS state initialisation, no per-block raster
walk. The per-channel BlockContext() history threading,
per-channel coefficient-order lookup against hf_pass::HfPass, and
the per-block raster walk remain caller-side concerns above this
primitive.
-
Round 247 —
hf_coefficient_histograms::HfCoefficientHistograms
typed wrapper closing the round-238 deferred next-step. Performs
the actual ISO/IEC FDIS 18181-1:2021 §C.7.2 codestream read of the
495 × num_hf_presets × nb_block_ctx clustered-distributions block
by routing HfCoefficientHistogramSize::num_distributions() into
modular_fdis::EntropyStream::read as num_dist. Two entry-points:
read(br, size) for a caller-built sizing descriptor, and
read_after_hf_pass_sequence(br, num_hf_presets, nb_block_ctx)
for the §C.7.1 → §C.7.2 transition convenience (constructs the
sizing descriptor inline so a caller that has just walked
hf_pass::read_hf_pass_sequence can drive the §C.7.2 step against
the same BitReader without a separate constructor call). ANS
state initialisation is deferred to read_ans_state_init per the
round-3 2024-spec correction (the u(32) initialiser is read
between the prelude and the first symbol decode); forwarded
straight through to EntropyStream::read_ans_state_init. Defensive
usize-cap guard on num_distributions() rejects 32-bit overflow
before the EntropyStream::read call. Sizing accessors
(num_distributions, offset_for_hfp, num_hf_presets,
nb_block_ctx) forward through the underlying
HfCoefficientHistogramSize. entropy_mut() exposes the
underlying stream for the downstream §C.8.3 per-block decode loop.
Adds 7 unit tests + 6 integration tests
(tests/round247_hf_coefficient_histograms.rs). Lib test count
710 → 717 (+7). Pure wiring primitive — the per-block decode walk
through the freshly-read histograms (Listing C.13 contexts already
landed by rounds 90 / 214 / 221 / 228 / 232) remains the next
deferred step.
-
Round 238 —
hf_coeff_histogram_size::HfCoefficientHistogramSize
typed sizing primitive for the §C.7.2 HF coefficient histogram
block. Encapsulates the spec line "Let nb_block_ctx be equal to
max(block_ctx_map)+1. The decoder reads a histogram with
495 × num_hf_presets × nb_block_ctx clustered distributions D
from the codestream as specified in D.3." behind a single typed
constructor pair (new(num_hf_presets, nb_block_ctx) and
from_block_ctx_map(map, num_hf_presets)), plus accessors
per_preset() (495 × nb_block_ctx), num_distributions()
(495 × num_hf_presets × nb_block_ctx — the §C.7.2 total),
and offset_for_hfp(hfp) (495 × nb_block_ctx × hfp — the
§C.8.3 per-pass routing offset, with hfp < num_hf_presets
range check). Spec constant published as
PER_PRESET_PER_BLOCK_CTX = 495. Defensive zero-input guards
reject num_hf_presets == 0, nb_block_ctx == 0, and empty
block_ctx_map. The duplicated
495u64 * num_hf_presets * nb_block_ctx and
495u64 * nb_block_ctx * hfp arithmetic in
hf_pass::HfPass::read and pass_group_hf::PassGroupHfHeader::read
is now routed through the primitive so the spec constant has one
home and the per-pass offset shares its nb_block_ctx factor
with the §C.7.2 read-size derivation. Sizing-only — the actual
§C.7.2 EntropyStream::read(br, num_distributions) call against
the clustered-distributions block remains the deferred next step.
Adds 5 unit tests + 6 integration tests
(tests/round238_hf_coeff_histogram_size.rs). Lib test count
705 → 710 (+5). Pure refactor; no wire-format change. (§C.7.2
entropy-stream read itself remains a deferred next step.)
-
Round 232 —
multi_pass_hf_header::PerPassHfHeaders+
decode_multi_pass_with_hf_headers per-LfGroup multi-pass driver
with per-pass hfp reads + per-pass histogram_offset routing
(FDIS §C.8.3 first paragraph). New multi_pass_hf_header module
wraps the round-228
[multi_pass_decode::decode_multi_pass_three_channels_with_resolver]
driver with the §C.8.3 first-paragraph per-pass header read
hfp = u(ceil(log2(num_hf_presets))) and the derived
histogram_offset = 495 × nb_block_ctx × hfp the spec writes as
the offset term in D[NonZerosContext(...) + offset] and
D[CoefficientContext(...) + offset].
PerPassHfHeaders::read(br, num_passes, num_hf_presets, nb_block_ctx)
consumes the
per-pass header sequence by invoking the round-90
[pass_group_hf::PassGroupHfHeader::read] once per pass;
from_headers builds the container from a pre-built Vec.
Accessors expose per-pass hfp + histogram_offset + a
PassHfDigest snapshot. The new driver
decode_multi_pass_with_hf_headers mirrors the round-228 signature
with two augmented closure shapes
read_non_zeros(p, channel, predicted, histogram_offset) /
decode_symbol(p, channel, coeff_ctx, histogram_offset) — the
per-pass histogram_offset is pre-resolved once per pass before the
inner per-varblock walk so the closure body sees a constant offset
across each pass's per-channel calls. Pass count is taken from
headers.num_passes() and verified against nz.num_passes()
(mismatch returns Error::InvalidData). The companion
read_and_decode_multi_pass_with_hf_headers reads the per-pass
header sequence inline from a BitReader and invokes the driver
in one call — the entry-point a future round wiring the §C.7.2
entropy histogram bundle (#799 DOCS-GAP) into a per-pass
EntropyStream will use. 16 unit + 12 integration
(round232_multi_pass_hf_header) tests pin: per-pass header read
withnum_hf_presets ∈ {1, 2, 4, 8}(single-preset zero-bit fast
path, two-preset one-bit-per-pass, four-preset two-bits-per-pass,
eight-preset three-bits-per-pass with 15 bits across 5 passes);
digest round-trip through bits LSB-first;hfp = 0always yielding
histogram_offset = 0regardless ofnb_block_ctx;
histogram_offsetscaling withnb_block_ctx(495 × 100 =
49500);get/histogram_offset/hfpout-of-range errors;
zero-passes degenerate case yielding an empty container;
PassGroupHfHeader::readnum_hf_presets == 0rejection
propagating throughPerPassHfHeaders::read; the driver routing
the per-pass offset uniformly across all three channels (X / Y / B)
within a pass; bothread_non_zerosanddecode_symbolclosures
receiving the matching per-pass offset (378 = 2 × 3 × 63
decode_symbol calls covering the full DCT8×8k ∈ [num_blocks, size)sweep); per-pass error propagation (pass-1 closure failure
aborts the outer driver);num_passesmismatch
(headers.num_passes() != nz.num_passes()) rejected pre-walk;
pass-distinctqdc_atclosure invocation preserving the round-228
per-passqdc[3]propagation; mixed transformDCT16×8 + 2 DCT8×8layout consistency across passes with distinct per-pass
offsets; inlineread_and_decode_multi_pass_with_hf_headers
end-to-end (header bits consumed exactly, decode walk runs, output
shape matches); inline-read error path (empty BitReader yields a
proper Error::InvalidData from read_bit); per-pass-header
offsets-threaded-through-both-closures invariant verifying
decode_symbol calls observe the same per-pass offset as
read_non_zeros across the 2-pass × 3-channel sweep. Lib tests
689 → 705 (+16). Pure-control-flow primitive in the same shape as
round-89 [dct_quant_weights], round-95 [hf_dequant], round-121
[llf_from_lf], round-138 [chroma_from_luma], round-141
[gaborish], round-144 [epf], round-147 [afv::afv_idct],
round-159 / 164 [pass_group_hf], round-177 [non_zeros_grid],
round-183 [per_channel_non_zeros], round-190
[per_pass_non_zeros], round-208 [varblock_walk], round-214
[block_context_resolver], round-221's three-channel driver, and
round-228's multi-pass driver — no bit reads beyond the per-pass
hfp u-read defined by the spec line, no spec re-derivation, no
histogram materialisation, no ANS state setup. A future round
wiring §C.7.2 histograms + per-pass [hf_pass::HfPass] selection
(the select_pass(passes) method on PassGroupHfHeader already
performs the per-pass coefficient-order lookup) can drop this
driver in as the per-LfGroup multi-pass HF-header + histogram-
routing control-flow layer. -
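The per-pass offset arithmetic pinned by the tests above can be sketched as follows. This is a minimal illustration, not the crate's API: `hfp_bits` and `histogram_offset` are hypothetical names, the `ceil(log2(num_hf_presets))` bit width is inferred from the 0 / 1 / 2 / 3-bit cases, and the `495 × nb_block_ctx × hfp` formula is inferred from the `495 × 100 = 49500` and `hfp = 0` cases.

```rust
/// Bits consumed by one per-pass `hfp` read, assumed to be
/// ceil(log2(num_hf_presets)). Returns None for num_hf_presets == 0,
/// which the real header reader rejects.
fn hfp_bits(num_hf_presets: u32) -> Option<u32> {
    match num_hf_presets {
        0 => None,
        1 => Some(0), // single preset: zero-bit fast path
        n => Some(32 - (n - 1).leading_zeros()),
    }
}

/// Assumed offset formula: 495 * nb_block_ctx * hfp, computed in u64
/// with overflow reported as None.
fn histogram_offset(hfp: u32, nb_block_ctx: u32) -> Option<u64> {
    495u64
        .checked_mul(u64::from(nb_block_ctx))?
        .checked_mul(u64::from(hfp))
}
```

Under these assumptions, eight presets cost three bits per pass (15 bits over 5 passes), and `histogram_offset(1, 100)` gives `Some(49500)`. Resolving the offset once per pass keeps the multiplication out of the per-varblock closures.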
Round 228 —
multi_pass_decode::decode_multi_pass_three_channels_with_resolver
per-LfGroup multi-pass three-channel varblock decode driver (FDIS
§C.8.3 + Table C.6 Passes). New multi_pass_decode module lifts
the round-221 single-pass three-channel driver into an outer
per-pass loop that iterates p ∈ [0, num_passes), gathering per-
pass [block_context_resolver::ThreeChannelVarblock] vectors in
pass order: out[p][i] is the i-th varblock (raster order)
decoded in pass p. The driver reads num_passes off
nz.num_passes() (the
[per_pass_non_zeros::PerPassNonZerosGrids] container is the
authoritative pass-count source), walks the
[dct_select::DctSelectGrid] once per pass, invokes the caller's
qdc_at(p, &vb) closure once per varblock per pass (so the
closure may read from a per-pass quantised-LF buffer if the
upstream signal evolves between passes), and threads each
(p, c) call through
[per_pass_non_zeros::PerPassNonZerosGrids::decode_block_at_for_pass_channel].
The per-pass per-channel NonZeros(x, y) bookkeeping is already
isolated by p (round-190 invariant), so the caller does not have
to clear state between passes. The read_non_zeros(p, channel, predicted) /
decode_symbol(p, channel, coeff_ctx) closures take
the pass index as their first argument so the caller can route
each call to the matching per-pass per-channel histogram without
rebinding closures for each pass. The new
MultiPassThreeChannelOutput type alias names the per-LfGroup
output shape; the new count_decoded_blocks(grid, num_passes)
helper returns num_passes × count_varblocks(grid) for callers
that need to size a downstream coefficient buffer ahead of time
(defensive u64 overflow check on the multiplication). 14 unit +
12 integration (round228_multi_pass_decode) tests pin: single-
pass single-DCT8×8 parity with the round-221 inner driver; 4×4
DCT8×8 grid (16 varblocks) preserving raster order in a single
pass; two-pass 2×2 raster-order per-pass walk; per-pass qdc
closure invocation count (3 passes × 4 varblocks = 12 calls, not
36); three-pass per-channel routing isolation with pass-distinct
raw_non_zeros values landing on per-pass writeback cells without
cross-pass leakage; pass error aborts remaining passes (the
outer Vec is discarded on error); pass-0 inner error aborts
before pass-1 starts (pass-1 closure never called); per-pass
predicted invariant (PredictedNonZeros(0, 0) = 32 across every
pass + channel); per-pass qdc[3] value propagation through the
outer loop; mixed-transform (DCT16×8 + 2 DCT8×8) layout
consistency across passes; pass-1 channel routing read from
pass-1 histogram; count_decoded_blocks helper covers
num_passes ∈ {0, 1, 2, 5, u32::MAX}; DCT16×16 single-block
single-pass pass-through; integration coverage of pass-index
threading through both read_non_zeros and decode_symbol
closures; inner-driver mid-varblock error (pass 1, X-channel
decode_symbol failure) propagating through the outer loop.
Lib tests 675 → 689 (+14). Pure-control-flow primitive in the
round-89 / 95 / 121 / 138 / 141 / 144 / 147 / 159 / 164 / 177 /
183 / 190 / 208 / 214 / 221 family; no bit reads, no spec re-
derivation, no histogram materialisation. The follow-up §C.7.2
histogram array (#799 DOCS-GAP) + per-pass hfp selection +
per-channel BlockContext() history threading still apply
unchanged — round 228 is purely the outer-loop control-flow
layer above the round-221 inner three-channel driver. -
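The outer-loop shape described in this entry can be sketched generically. This is an illustration under stated assumptions, not the crate's code: `decode_all_passes` and `total_decoded_blocks` are hypothetical names, the closure stands in for the round-221 inner driver, and the block count takes a plain varblock count rather than a grid.

```rust
/// Generic stand-in for the outer per-pass loop: `decode_pass(p)` plays the
/// role of the inner single-pass driver and returns that pass's varblocks
/// in raster order, so the result is indexed as out[p][i].
fn decode_all_passes<T, E>(
    num_passes: usize,
    mut decode_pass: impl FnMut(usize) -> Result<Vec<T>, E>,
) -> Result<Vec<Vec<T>>, E> {
    let mut out = Vec::with_capacity(num_passes);
    for p in 0..num_passes {
        // `?` discards the partially built `out` and skips all later passes.
        out.push(decode_pass(p)?);
    }
    Ok(out)
}

/// varblocks_per_pass * num_passes in u64; None signals overflow.
fn total_decoded_blocks(varblocks_per_pass: u64, num_passes: u32) -> Option<u64> {
    varblocks_per_pass.checked_mul(u64::from(num_passes))
}
```

Passing the pass index into the closure, rather than building one closure per pass, is what lets a caller route each call to the matching per-pass histogram without rebinding.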
Round 221 —
block_context_resolver::decode_varblocks_three_channels_with_resolver
three-channel per-LfGroup varblock decode driver (FDIS §C.8.3
prose ordering: outer varblock raster, inner X / Y / B channel
sweep). Walks the dct_select::DctSelectGrid once; computes the
shared qdc[3] triple once per varblock; invokes
BlockContextResolver::resolve three times against that shared
qdc (channel order 0 = X → 1 = Y → 2 = B); routes each (p, c)
call through
per_pass_non_zeros::PerPassNonZerosGrids::decode_block_at_for_pass_channel.
Return is Vec<ThreeChannelVarblock> = per-varblock
(Varblock, [DecodedHfBlock; 3], [u32; 3]) triples in raster
order; per-channel ANS closures are
read_non_zeros(channel, predicted) and
decode_symbol(channel, coeff_ctx) so the caller routes
per-channel histograms inside one closure pair. The new
ThreeChannelVarblock type alias names the per-varblock output
triple. 11 unit + 12 integration
(round221_three_channel_resolver) tests pin: single-DCT8×8 with
3 per-channel decodes per varblock; 4×4 DCT8×8 grid (16 varblocks)
preserving raster order; single DCT16×16 (1 varblock); qdc
closure invoked exactly once per varblock (= 4 calls for 4
varblocks, NOT 12); strict X / Y / B channel order at each
read_non_zeros / decode_symbol call site; per-channel
non_zeros writeback at (0, c, 0, 0) with distinct per-channel
raw counts (10 / 20 / 30); per-pass routing (pass = 1 isolated
from pass = 0); qdc error aborts before any per-channel reads;
X-channel error aborts before Y + B reads; mixed-transform
DCT16×8 + 2 DCT8×8 placement preserved; custom HfBlockContext
(qf_threshold = 5) round-trip; DCT16×16 num_blocks = 4
per-channel non_zeros = 4 → 4 decode_symbol calls, (4 + 3) / 4 = 1 stored.
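A minimal sketch of the per-varblock control flow and the writeback arithmetic from this entry. The names `stored_non_zeros` and `decode_varblock` are hypothetical, the writeback is assumed to be a ceiling division by `num_blocks` (consistent with the `(4 + 3) / 4 = 1` case above), and the X, Y, B order shown is the round-221 behaviour; the round-281 entry later changes it to Y, X, B.

```rust
/// Per-block count stored after a varblock decode, assuming the writeback
/// is ceil(non_zeros / num_blocks). Requires num_blocks >= 1.
fn stored_non_zeros(non_zeros: u32, num_blocks: u32) -> u32 {
    (non_zeros + num_blocks - 1) / num_blocks
}

/// Decodes one varblock: the shared qdc triple is fetched once, then the
/// three channels are visited in index order 0, 1, 2.
fn decode_varblock<E>(
    mut qdc: impl FnMut() -> Result<[i32; 3], E>,
    mut decode_channel: impl FnMut(usize, [i32; 3]) -> Result<u32, E>,
) -> Result<[u32; 3], E> {
    let shared = qdc()?; // an error here aborts before any channel read
    let mut raw = [0u32; 3];
    for (c, slot) in raw.iter_mut().enumerate() {
        *slot = decode_channel(c, shared)?; // a channel-0 failure skips 1 and 2
    }
    Ok(raw)
}
```

Fetching `qdc` outside the channel loop is what yields one call per varblock (4 calls for 4 varblocks) rather than one per channel decode.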
-
Round 214 —
block_context_resolver module (per-LfGroup
BlockContext() resolver, FDIS §C.8.3 Listing C.13 + §I.2.2
HfBlockContext bundle). Exposes the borrow-based
BlockContextResolver::new(&HfBlockContext) wrapper with a
per-varblock resolve(channel, &Varblock, qdc) -> Result<u32>
lookup (applies order_id_for_transform for s, threads
hf_mul as qf, forwards qdc[3] + the LfGlobal
qf_thresholds / lf_thresholds / block_ctx_map to the
round-159 pass_group_hf::block_context formula) plus
decode_varblocks_with_resolver(grid, nz, p, c, &resolver, qdc_at,
read_non_zeros, decode_symbol) driver that pairs the
round-208 VarblockWalk raster-order iterator with the
round-190 PerPassNonZerosGrids::decode_block_at_for_pass_channel
per-block primitive. The resolver eliminates the four-argument
(qf_thresholds, lf_thresholds, block_ctx_map, nb_block_ctx)
boilerplate at every per-varblock callsite. 14 unit + 12
integration (round214_block_context_resolver) tests pin:
borrow accessor + nb_block_ctx default-15 pass-through;
default-branch (c=0, s=0) / (c=1, s=0) / (c=2, s=0)
DCT8×8 → block_ctx_map[{13, 0, 26}] = {7, 0, 7}; DCT16×16 /
DCT32×32 / DCT16×8 / DCT8×16 / Hornuss order-id mapping;
default-branch invariance to qdc and hf_mul (empty
thresholds collapse those knobs); custom-branch
qf_threshold perturbation; driver pass-through on
single-DCT8×8 / raster-order 2×2 DCT8×8 / single-DCT16×16
grids; qdc_at closure called once per varblock in walk
order; closure-error propagation. Lib tests 650 → 664 (+14).
Pure-control-flow primitive in the round-89 / 95 / 121 / 138 /
141 / 144 / 147 / 159 / 164 / 177 / 183 / 190 / 208 family; no
bit reads, no spec re-derivation, no histogram materialisation. -
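The index arithmetic behind the `block_ctx_map[{13, 0, 26}]` default-branch case can be sketched as below. This is one reading of Listing C.13, not a verified transcription: `block_ctx_index` is a hypothetical name, the `qf > t` comparison is an assumption, and the LF bucketing is collapsed into the hypothetical `lf_idx` / `num_lf_ctx` parameters instead of being derived from `qdc` and `lf_thresholds`.

```rust
/// Order ids per channel; a stride of 13 reproduces the {13, 0, 26} indices.
const NUM_ORDERS: usize = 13;

/// Sketch of the lookup index: channel remap, order id, qf bucket, LF bucket.
/// With empty thresholds (num_lf_ctx = 1, lf_idx = 0) this collapses to
/// channel_slot * 13 + order_id.
fn block_ctx_index(
    c: usize,
    order_id: usize,
    qf: u32,
    qf_thresholds: &[u32],
    lf_idx: usize,
    num_lf_ctx: usize,
) -> usize {
    // Channel remap (c < 2 ? c ^ 1 : 2): X -> 1, Y -> 0, B -> 2.
    let channel_slot = if c < 2 { c ^ 1 } else { 2 };
    // Number of thresholds strictly below qf (assumed comparison direction).
    let qf_idx = qf_thresholds.iter().filter(|&&t| qf > t).count();
    let mut idx = channel_slot * NUM_ORDERS + order_id;
    idx = idx * (qf_thresholds.len() + 1) + qf_idx;
    idx * num_lf_ctx + lf_idx
}
```

With no thresholds, `qf` and the LF inputs cannot move the index, which is the default-branch invariance to `qdc` and `hf_mul` that the tests pin.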
Round 208 —
varblock_walk module (per-LfGroup varblock-walk
driver, FDIS §C.5.4 + §C.8.3). Exposes the Varblock descriptor
({x, y, transform, hf_mul}), the borrow-based VarblockWalk
raster-order iterator over a dct_select::DctSelectGrid (skips
Continuation cells; residual Empty cell errors cleanly), the
count_varblocks cell-scan helper, and the typed per-pass
per-channel driver decode_varblocks_for_pass_channel that
walks the grid + invokes the caller's block_ctx_for_varblock
closure (Listing C.13 BlockContext() lookup) + threads each
varblock through
per_pass_non_zeros::PerPassNonZerosGrids::decode_block_at_for_pass_channel.
Returns the in-raster-order Vec<(Varblock, DecodedHfBlock, raw_non_zeros)>
triple. 14 unit + 12 integration
(round208_varblock_walk) tests pin single-DCT8×8 / raster-order
4×4 / DCT16×16-covers-2×2 / mixed-transform placement order /
count-vs-walk parity / residual-Empty error / all-Continuation
tolerance / hf_mul top-left read / typed driver per-pass
per-channel routing isolation / closure-error propagation /
DCT16×16 typed-driver pass-through / multi-varblock distinct
hf_mul. Lib tests 636 → 650 (+14). Pure-control-flow primitive
in the round-89 / 95 / 121 / 138 / 141 / 144 / 147 / 159 / 164 /
177 / 183 / 190 family; no bit reads, no spec re-derivation, no
histogram materialisation. -
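The walk semantics described in this entry (raster order, Continuation cells skipped, a leftover Empty cell treated as an error) can be sketched as below. The `Cell` enum, the `walk` function, the `u8` transform id and the `String` error are all placeholders for illustration; the crate's `DctSelectGrid` and `VarblockWalk` types will differ.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Cell {
    Empty,                                // never assigned by the placement step
    Start { transform: u8, hf_mul: u32 }, // top-left cell of a varblock
    Continuation,                         // covered by an earlier Start cell
}

#[derive(Debug, PartialEq)]
struct Varblock {
    x: usize,
    y: usize,
    transform: u8,
    hf_mul: u32,
}

/// Raster-order walk over a row-major grid with `width` columns.
fn walk(cells: &[Cell], width: usize) -> Result<Vec<Varblock>, String> {
    if width == 0 {
        return Err("width must be at least 1".to_string());
    }
    let mut out = Vec::new();
    for (i, cell) in cells.iter().enumerate() {
        let (x, y) = (i % width, i / width);
        match *cell {
            // hf_mul is read from the top-left cell only.
            Cell::Start { transform, hf_mul } => out.push(Varblock { x, y, transform, hf_mul }),
            Cell::Continuation => {} // belongs to an earlier varblock
            Cell::Empty => return Err(format!("empty cell at ({x}, {y})")),
        }
    }
    Ok(out)
}
```

In this model a DCT16×16 varblock covering a 2×2 grid is one Start cell plus three Continuation cells, so the walk yields exactly one varblock.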
Round 202 —
tests/r202_wp_row3_chain.rs (7 tests) widens the
round-191 / round-195 weighted-predictor diagnostic from a
one-sample pin into a full-row chain across noise-64x64-lossless
samples 192..=200, validating the production WP state against the
trace doc's surrounding-sample context table
(wp-trace-sample-194.md lines 130-168). New finding: the WP
divergence is already large at sample 192 (Δ pred8 = -50,
Δ stored = -50), before the round-191-pinned Δ pred8 = +8 at
sample 194. Tests pin in-row + cross-row read chains, sample 192's
left-border zeroing, sample 194's cross-row reads, and the
production decoded value v(194) = 35.