feat(encoder): strategy-aware literal gates (G4 + G5)#182
Conversation
…y possible The previous wording claimed the bail-out routes blocks that "would have been raw-fallbacked by the real emit anyway, by the same threshold". That is true ONLY for the single-partition path the bail-out routes to — it is NOT a proof that a recursive split could never beat the whole-block outcome. In principle both sub-blocks could compress strictly (no raw-fallback in either half) and produce `cost(first) + cost(second) < source_len + 3`. The wider donor band gives at most `min_gain` bytes of theoretical recoverable ratio per block — small, but non-zero. Comment split into two paragraphs: 1. The "no wire-output drift" guarantee (only applies to the single-partition path the bail-out actually takes). 2. The "what we forgo by skipping the split case" caveat — both sub-blocks would need to compress strictly AND beat the whole- block cost; band is bounded by `min_gain`, empirically zero delta in the bench matrix. No behavior change. Doc-only.
Donor `ZSTD_minLiteralsToCompress` (`zstd_compress_literals.c:114-127`) floors literal-compression entry at `8 << shift` bytes with shift derived from strategy index (fast/dfast/greedy/lazy=3, btopt=2, btultra=1, btultra2=0) and 6-byte floor on reused HUF tables. Previously hardcoded `< 8` floor — same lower bound but ignored strategy: btultra2 missed the 8-byte tail compression opportunity, btopt missed 32-byte band, etc. Donor `ZSTD_minGain` (`zstd_compress_internal.h:677-684`) requires compressed literals to beat raw by `(srcSize >> minlog) + 2` margin where minlog=6 for fast..btopt, 7 for btultra, 8 for btultra2. Previously bare `>= raw_section_bytes` — overcompressed micro-wins that fell back to raw on real emit, biasing splitter cost predictions. `CompressState.strategy_tag` plumbed through frame compressor, streaming encoder, and `CompressedBlockScratch::scratch_state` so emit and estimator paths use identical gates — keeps splitter probe costs in sync with actual emit decisions. - Verified 514/514 lib tests pass, clippy + fmt clean - Full ratio matrix (203 cells): 0 deltas vs main — gates donor-equivalent on current fixtures - Block-level `min_gain = (srcSize >> 8) + 2` (btultra2 margin) kept uniform at `SplitEstimator::estimate_subblock_size` / `emit_single_sequence_block` — migration to strategy-aware block-level gain is a separate cleanup Related: #23
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds strategy-aware literal-compression gating: CompressState carries a StrategyTag; new helpers compute per-strategy min-literals and min-gain; encoder and estimator use the same branch order (RLE first, then strategy gate) and donor-style payload-vs-srcSize raw-vs-compressed fallback; tests updated. ChangesStrategy-aware literal compression gating
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
zstd/src/encoding/blocks/compressed.rs (1)
530-537:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEstimator RLE gate does not mirror emitter for short literal runs.
estimate_literals_section_bytesallows RLE on all-identical literals without the emitter’slen >= 8guard. With reused HUF tables (min_lits == 6), 6–7 byte runs can be costed as RLE in estimator but not emitted as RLE, which breaks estimator/emit parity.Proposed parity fix
fn estimate_literals_section_bytes( literals: &[u8], last_huff: &mut Option<huff0_encoder::HuffmanTable>, counts: &mut [usize; 256], strategy: crate::encoding::strategy::StrategyTag, ) -> usize { // Mirror `encode_block_parts_with_sequence_scratch` literal-mode branches. let min_lits = min_literals_to_compress(strategy, last_huff.is_some()); - if literals.len() < min_lits { - *last_huff = None; - return uncompressed_literals_header_bytes(literals.len()) + literals.len(); - } - if all_bytes_identical(literals) { + if literals.len() >= 8 && all_bytes_identical(literals) { *last_huff = None; return uncompressed_literals_header_bytes(literals.len()) + 1; } + if literals.len() < min_lits { + *last_huff = None; + return uncompressed_literals_header_bytes(literals.len()) + literals.len(); + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@zstd/src/encoding/blocks/compressed.rs` around lines 530 - 537, The estimator allows RLE for any all-identical literal slice but the emitter only emits RLE for runs len >= 8, causing parity errors; in estimate_literals_section_bytes (the block using min_literals_to_compress, last_huff, all_bytes_identical, and uncompressed_literals_header_bytes) add the same short-run guard the emitter uses by requiring literals.len() >= 8 before treating an all-identical run as RLE, and otherwise fall through to the non-RLE path (clearing last_huff and returning uncompressed header + length or 1 as currently done).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@zstd/src/encoding/blocks/compressed.rs`:
- Around line 530-537: The estimator allows RLE for any all-identical literal
slice but the emitter only emits RLE for runs len >= 8, causing parity errors;
in estimate_literals_section_bytes (the block using min_literals_to_compress,
last_huff, all_bytes_identical, and uncompressed_literals_header_bytes) add the
same short-run guard the emitter uses by requiring literals.len() >= 8 before
treating an all-identical run as RLE, and otherwise fall through to the non-RLE
path (clearing last_huff and returning uncompressed header + length or 1 as
currently done).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 54afb8a4-0802-4612-b2a5-29a28af9eae1
📒 Files selected for processing (4)
zstd/src/encoding/blocks/compressed.rszstd/src/encoding/frame_compressor.rszstd/src/encoding/levels/fastest.rszstd/src/encoding/streaming_encoder.rs
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Emitter only emits RLE when `len >= 8 && all_identical` (`encode_block_parts_with_sequence_scratch` RLE branch). After G4 landed, estimator's `min_lits` can be 6 on reused HUF tables, so a bare `all_identical` check let estimator predict a 1-byte RLE payload for 6-7 byte all-identical literals that the emitter would actually write as raw — splitter probe costs drift from real emit output. Mirror the same `len >= 8` guard explicitly so estimator and emit stay byte-equivalent on every input. Reported by CodeRabbit on PR #182.
There was a problem hiding this comment.
Pull request overview
This PR threads encoder strategy information into compression state so literal-section emit and estimator paths can apply strategy-aware donor gates for minimum literals and minimum gain.
Changes:
- Adds
strategy_tagtoCompressStateand initializes/syncs it in frame and streaming encoders. - Adds
min_literals_to_compressandmin_gainhelpers in compressed-block encoding. - Applies the new gates to literal compression and estimator paths, including post-split estimator state.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
zstd/src/encoding/streaming_encoder.rs |
Initializes streaming encoder compression state with a strategy tag. |
zstd/src/encoding/levels/fastest.rs |
Updates tests to construct CompressState with the new field. |
zstd/src/encoding/frame_compressor.rs |
Adds and synchronizes CompressState::strategy_tag for frame compression. |
zstd/src/encoding/blocks/compressed.rs |
Implements and applies strategy-aware literal compression gates. |
Comments suppressed due to low confidence (4)
zstd/src/encoding/blocks/compressed.rs:535
- The estimator no longer mirrors the emit path for small RLE literal sections. For strategies whose
min_litsis greater than 8, an 8..63 byte all-identical literal section is emitted as RLE byencode_block_parts_with_sequence_scratch, but this branch returns a raw-literals cost before checkingall_bytes_identical, so splitter decisions are based on the wrong size/state.
let min_lits = min_literals_to_compress(strategy, last_huff.is_some());
if literals.len() < min_lits {
*last_huff = None;
return uncompressed_literals_header_bytes(literals.len()) + literals.len();
}
if all_bytes_identical(literals) {
zstd/src/encoding/blocks/compressed.rs:596
- This does not match the donor
cLitSize >= srcSize - minGaingate quoted below.totalincludes the zstd literals-section header, andraw_section_bytesincludes the raw header, so the fallback boundary is shifted by the header-size difference instead of applyingminGainto the literal source size/HUF payload. That can make the estimator choose raw in cases donor would still use compressed literals.
let raw_section_bytes = uncompressed_literals_header_bytes(literals.len()) + literals.len();
let mg = min_gain(literals.len(), strategy);
if total >= raw_section_bytes.saturating_sub(mg) {
zstd/src/encoding/blocks/compressed.rs:1762
- This fallback still compares the full on-wire literals section against the raw section with
minGainsubtracted, rather than applying the quoted donor check tocLitSizeversussrcSize - minGain. Becausetotal_lenincludes the compressed literals header andraw_section_bytesincludes the raw header, the decision boundary is offset by header sizes and can fall back to raw when donor would emit compressed literals.
let raw_section_bytes = uncompressed_literals_header_bytes(literals.len()) + literals.len();
// Donor `compress_literals` (`zstd_compress_literals.c:187-188`):
// `cLitSize >= srcSize - minGain` → raw fallback. `minGain` is
// strategy-aware (`min_gain` helper above; ~1.5% for fast..btopt,
// ~0.78% for btultra, ~0.39% for btultra2). Previously this check
// used bare `>= raw_section_bytes` (zero margin) which let
// marginally-compressing blocks emit huf when raw + reused table
// would compress better in the next block. Saturating subtraction
// covers tiny inputs where `raw_section_bytes < min_gain`.
let mg = min_gain(literals.len(), strategy);
if total_len >= raw_section_bytes.saturating_sub(mg) {
zstd/src/encoding/blocks/compressed.rs:537
- When a previous HUF table exists,
min_litsis 6, so 6- or 7-byte all-identical literal sections reach this branch and are costed as RLE. The real emit path only takes the RLE shortcut forlen >= 8, andcompress_literalsdoes not have an RLE fallback for these sub-8 inputs, so estimator probes undercount these sections and can choose different splits than the encoder emits.
let min_lits = min_literals_to_compress(strategy, last_huff.is_some());
if literals.len() < min_lits {
*last_huff = None;
return uncompressed_literals_header_bytes(literals.len()) + literals.len();
}
if all_bytes_identical(literals) {
*last_huff = None;
return uncompressed_literals_header_bytes(literals.len()) + 1;
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@zstd/src/encoding/blocks/compressed.rs`:
- Around line 529-544: The estimator currently checks min_literals_to_compress
first which can return raw for 8-byte identical literals when emit would choose
RLE; in the function (mirror of encode_block_parts_with_sequence_scratch) swap
the order so the all_bytes_identical && literals.len() >= 8 RLE check runs
before the min_lits check: if it matches, set *last_huff = None and return
uncompressed_literals_header_bytes(literals.len()) + 1, otherwise fall through
to the min_literals_to_compress check and the existing raw path; keep references
to last_huff, min_literals_to_compress, all_bytes_identical and
uncompressed_literals_header_bytes so behavior matches the emitter exactly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1f9e826e-9867-43a6-addb-5bc2245623c6
📒 Files selected for processing (1)
zstd/src/encoding/blocks/compressed.rs
Emit checks the RLE pre-shortcut (`len >= 8 && all_identical`) BEFORE the strategy-aware `min_lits` gate, so an all-identical 8-byte literal section emits a 2-byte RLE block under every strategy — including fast/dfast/greedy/lazy where `min_lits == 64`. The estimator previously gated `min_lits` first, predicting raw for those inputs while emit wrote RLE — splitter probe costs diverged. Swap the order in `estimate_literals_section_bytes` so both paths agree byte-for-byte; update emit's RLE-pre-check comment to be accurate about the donor divergence. Add unit tests covering: - `min_literals_to_compress` per-strategy values with and without HUF reuse (donor table from `zstd_compress_literals.c:114-127`) - `min_gain` per-strategy margins (`zstd_compress_internal.h:677-684`) including small-source `(src >> minlog) + 2` edge - Estimator/emit parity for boundary literal lengths around each strategy's `min_lits` and across the RLE `>= 8` cliff Verified 531/531 lib tests pass; ratio matrix (203 cells) shows 0 delta vs main.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
zstd/src/encoding/blocks/compressed.rs:1773
- This compares the compressed literals size to
raw_section_bytes - min_gain, but the donor check documented above is againstsrcSize - minGain(the literal payload length, without the raw header). Adding the raw header makes the gate 1–3 bytes too permissive, so marginal Huffman sections can be emitted where the strategy-aware min-gain rule should fall back to raw and clear the table state.
if total_len >= raw_section_bytes.saturating_sub(mg) {
zstd/src/encoding/blocks/compressed.rs:543
- This RLE mirror keeps the old
len >= 8cutoff, but the new strategy gate can attempt Huffman at length 6 whenlast_huffis present. If the emitter is corrected to RLE 6–7 byte all-identical sections with a reusable table, the estimator needs the same threshold or split probes will overestimate those partitions.
if literals.len() >= 8 && all_bytes_identical(literals) {
*last_huff = None;
return uncompressed_literals_header_bytes(literals.len()) + 1;
Two related parity fixes to the literal-section pipeline: 1. min_gain comparison now matches donor formula. Previously compared `total_len >= raw_section_bytes - mg` (compressed lhSize on LHS, raw lhSize on RHS), which skewed the threshold by `compressed_header - raw_header` bytes and forced raw fallback on marginally-winning compressed sections that donor would keep. Switch to donor's `cLitSize >= srcSize - minGain` form (`zstd_compress_literals.c:187-188`): compare `(total_len - compressed_header_len) >= literals.len() - mg`. Applied in both estimator (`estimate_literals_section_bytes`) and emit (`compress_literals`) so probe costs stay in lockstep. 2. RLE pre-check fires for any non-empty all-identical literal section, not just `len >= 8`. The old `len >= 8` floor caused 6/7-byte all-identical sections with a reusable HUF table (`min_lits == 6`) to skip RLE and emit a larger HUF block — donor reaches the same RLE block via `cLitSize == 1` after the `min_lits` gate. Dropping the floor matches donor on `>= min_lits` inputs and produces strictly smaller output than donor on the `< min_lits` all-identical edges (where donor falls through to raw). Applied symmetrically in estimator and emit. 531/531 lib tests pass; ratio matrix (203 cells via REPORT lines) shows 0 delta vs main — both fixes are ratio-safe on current fixtures.
The parity test previously only seeded `last_huff_table: None`, so the floor-6 HUF reuse path (`ZSTD_minLiteralsToCompress` with valid prior HUF table) and 6/7-byte all-identical sections that newly route through the unconditional RLE pre-check were not exercised in estimator/emit parity. Add cases: - Sub-`min_lits` all-identical (`len in 1..=7` at fast strategy): exercises the new "any non-empty all-identical → RLE" path - Seeded HUF table with `Lazy` strategy at 6/7-byte all-identical and 16-byte non-identical: exercises the `min_lits == 6` floor and the HUF-reuse decision under the new RLE pre-check Also update the in-test docstring to drop the stale "RLE `>= 8` cliff" phrasing — the cliff was removed when RLE became unconditional for non-empty all-identical inputs. 531/531 lib tests pass.
…igin `donor_clit` told the reader where the formula came from, not what the value represents. Rename to `huf_section_size` (tree description plus HUF payload, excluding the literals lhSize) — describes the bytes themselves so the call site is readable without recalling reference naming conventions. Same rename in both the emit (`compress_literals`) and estimator (`estimate_literals_section_bytes`) sides. Test names follow the same principle: - `min_literals_to_compress_matches_donor_table` → `min_literals_to_compress_returns_per_strategy_floor` - `min_gain_matches_donor_margin` → `min_gain_returns_per_strategy_margin` Behavior unchanged; 531/531 lib tests pass, clippy/fmt clean, ratio matrix 0 deltas vs main.
Extract `use_raw_literal_fallback(huf_section_size, literals_len, strategy)` helper so `compress_literals` (emit) and `estimate_literals_section_bytes` (probe) share one decision point — neither side can drift back to an on-wire-vs-on-wire comparison that inflates the threshold by the lhSize delta and rejects marginally compressed sections. Add `use_raw_literal_fallback_uses_payload_vs_srcsize_threshold` test that walks the 2-byte gap (`compressed_lhsize - raw_lhsize`) at `literals.len() = 20` to lock in the payload-vs-srcSize semantics. Tighten the RLE shortcut docstring: RLE equals raw at `len == 1` (both 2 bytes), strictly smaller only at `len >= 2`. 532/532 lib tests, clippy/fmt clean.
`min_literals_to_compress` Lazy mapping: drop the "most-aggressive-of-band" phrasing — donor's shift table pins strategies 1..6 to `shift = 3` uniformly, so Lazy → 64-byte floor regardless of which donor index in the 4..6 band we'd nominally pick. No aggressiveness gradient exists within this band. `min_gain` doc: split the donor formula (gates both block-level and literal-section in donor) from the current crate usage (helper is wired into literal-section gates only; block-level emit and probe still use uniform `(source_len >> 8) + 2`). Previous wording conflated the two and read as if the helper were already plumbed everywhere.
`StreamingEncoder::ensure_frame_started` resets the matcher and clears last_huff / fse_tables for the new frame but did not refresh `state.strategy_tag` from the active compression level. That left the literal-compression gates (`min_literals_to_compress`, `min_gain`) reading the construction-time strategy on every subsequent frame, which would silently leak the wrong gate floor if the encoder were ever extended with a between-frame level change. Mirror `FrameCompressor::compress`'s reset-time sync at the streaming reset site so both entry points are byte-equivalent at the gate level. Tighten the `CompressState.strategy_tag` invariant docstring to enumerate the two owned reset sites explicitly. Add a regression test that exercises the streaming reset across four levels and asserts the synced tag matches `StrategyTag::for_compression_level`. 533/533 lib tests pass; clippy and fmt clean.
…y_tag The streaming reset-time sync test only exercises a single fixed level per encoder, which would pass even with the reset-time sync deleted because the construction-time tag already matches. Add a regression test that drives the level-change path explicitly: construct `FrameCompressor` at `CompressionLevel::Fastest`, call `set_compression_level(Best)` to cross from the Fast band into the BtUltra2 band, run a full `compress()` cycle, then verify `state.strategy_tag` reflects the new level's `StrategyTag::for_compression_level` resolution. Includes an `assert_ne!` fixture invariant that the two chosen levels resolve to different `StrategyTag` variants, so a future renumbering that collapses the bands would be loud about it rather than turning the test into a no-op. 534/534 lib tests pass; clippy and fmt clean.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
zstd/src/encoding/blocks/compressed.rs:566
- This repeats the small-section byte formula even though the estimator branch applies to all literal lengths; for lengths 32+ the headers are not 1 byte, so the comment should qualify the comparison while keeping the correct "RLE is smaller" conclusion.
// (any non-empty section) BEFORE the `min_lits` gate — on
// all-identical inputs RLE is equal to raw at `len == 1` (both
// 2 bytes) and strictly smaller for `len >= 2`, so it is never
// worse than raw and is selected regardless of strategy. Estimator
`min_literals_to_compress` doc: replace the "≥30 bytes" HUF tree-description lower bound with neutral phrasing — small alphabets with low max symbols can serialize substantially smaller descriptions. The exact byte cost depends on alphabet size and max symbol, not a fixed minimum; the surrounding sentence already conveys that overhead-dominates-payload is what drives the per-strategy floor. RLE-shortcut docstrings (emit and estimator branches): replace "2 bytes vs `1 + len`" with the header-helper-aware form. Both RLE and raw use the same `uncompressed_literals_header_bytes(len)` (1/2/3/5 bytes by length tier), so the size relationship is RLE = lhSize + 1, raw = lhSize + len → equal at `len == 1`, RLE smaller by `len - 1` for `len >= 2`. The previous wording was only correct for the `len < 32` tier where lhSize = 1. No code changes; clippy and fmt clean.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (3)
zstd/src/encoding/blocks/compressed.rs:414
- The literals header size tiers listed here include a 5-byte case, but
write_uncompressed_literals_header/uncompressed_literals_header_bytesin this file only encode raw/RLE literal headers as 1, 2, or 3 bytes. Keeping the extra tier in the explanation makes the RLE-vs-raw size argument harder to verify.
// reuse). RLE and raw share the same lhSize for a given `len`
// (both use `uncompressed_literals_header_bytes`), so RLE = lhSize + 1
// and raw = lhSize + len. That makes RLE equal to raw on `len == 1`
// and smaller by exactly `len - 1` bytes for `len >= 2`, regardless of
// the lhSize tier (1 / 2 / 3 / 5 bytes). Our pre-check fires for ANY
zstd/src/encoding/blocks/compressed.rs:572
- This duplicates the same inaccurate header-size tier list: raw/RLE literal headers in this file are 1, 2, or 3 bytes, not 1/2/3/5. Please align the estimator comment with
uncompressed_literals_header_bytesso future parity work does not reason from the wrong size table.
// (any non-empty section) BEFORE the `min_lits` gate — RLE and raw
// share `uncompressed_literals_header_bytes(len)` (1/2/3/5 bytes by
// length tier), so on all-identical inputs RLE = lhSize + 1 equals
// raw = lhSize + len at `len == 1` and is smaller by `len - 1` for
// `len >= 2`. RLE is never worse than raw, so it is selected
zstd/src/encoding/blocks/compressed.rs:2003
- This repeats the inaccurate historical claim for the 6/7-byte HUF-reuse cases. Before this PR, the hardcoded
len >= 8gate meant these cases did not enter HUF compression; they were emitted raw, so the comment should describe the new lowered-floor scenario rather than prior behavior.
// HUF reuse path: floor drops to 6, so 6/7-byte sections
// previously hit the HUF compress path. With the RLE
// pre-check now non-empty-only, all-identical 6/7-byte
// sections must route through RLE and stay byte-equivalent
…out the sync Three factual cleanups so the literal-gate tests provably guard the reset-time `strategy_tag` sync and the docstrings around the RLE pre-check stop misrepresenting historical behavior. - `set_compression_level_then_compress_refreshes_strategy_tag`: switch from `CompressionLevel::Best` to `CompressionLevel::Level(20)` so the post-switch tag genuinely crosses into the BtUltra2 band. `Best` resolves to `Lazy` today, which keeps `min_literals_to_compress` in the same `shift=3 → 64-byte` band as `Fast` and weakens the signal. Add an explicit `assert_eq!(expected, StrategyTag::BtUltra2)` fixture invariant so future renumbering breaks loudly. - `ensure_frame_started_refreshes_stale_strategy_tag_at_reset` (renamed): the previous version constructed the encoder at the same level it asserted, so `StreamingEncoder::new`'s own init satisfied the assertion even if the reset-time sync were deleted. Now the test deliberately corrupts `state.strategy_tag` to a sentinel (`BtUltra2`) before the first write and asserts the reset path restores the legitimate tag — a missing sync leaves the sentinel visible and trips the assertion. - RLE pre-check docstring on the emit side: drop the "6/7 bytes with HUF reuse" example. With HUF reuse `min_lits == 6`, so 6/7-byte sections are NOT below the gate; donor would HUF-encode them and reach RLE via `cLitSize == 1`. The justification for the pre-check is the genuine under-gate range (e.g. 8..63 bytes at fast/dfast/greedy/lazy without HUF reuse), not the HUF-reuse band. - Estimator/emit parity test prose: the historical behavior for 6/7-byte all-identical sections under the prior hardcoded `len >= 8` gate was RAW (not HUF). Correct both the per-case rationale and the block-level docstring; the test cases themselves are unchanged. 534/534 lib tests pass; clippy and fmt clean.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
Summary
Donor-parity
ZSTD_minLiteralsToCompress(G4) andZSTD_minGain(G5)wired into literal-section gates on both emit and estimator paths.
Companion to G3 (whole-block bail-out, #181).
G4 —
ZSTD_minLiteralsToCompress(donorzstd_compress_literals.c:114-127)Strategy-aware floor
8 << shift, 6-byte floor on reused HUF tables:Previously hardcoded
< 8— same lower bound as btultra2 but ignoredstrategy.
G5 —
ZSTD_minGain(donorzstd_compress_internal.h:677-684)Strategy-aware expansion margin
(srcSize >> minlog) + 2:G5 comparison uses donor formula
cLitSize >= srcSize - minGainform(payload + tree_desc vs literal payload length, no headers on either
side). Previously bare
>= raw_section_bytes(zero margin) and earlieriteration of
>= raw_section_bytes - mg(which skewed by header diff)forced fallback on marginally-winning compressed sections that donor
would keep.
Plumbing + estimator/emit parity
CompressState.strategy_tagthreaded throughFrameCompressor::new/new_with_matcher/resetand synced intoCompressedBlockScratch'sscratch_state. Reset-time sync happens at both reset sites:FrameCompressor::compressandStreamingEncoder::ensure_frame_started.Centralized
use_raw_literal_fallbackhelper shared by emit(
compress_literals) and probe (estimate_literals_section_bytes) —single decision point, no drift.
Emit's RLE pre-shortcut fires for any non-empty all-identical literal
section (was gated by
len >= 8). RLE and raw share the same lhSizefor a given
len, so RLE = lhSize + 1 equals raw = lhSize + len atlen == 1and is smaller bylen - 1bytes forlen >= 2. Thepre-check now correctly covers:
len < min_litsall-identical (donor falls through to raw — we win)len >= min_litsall-identical (donor reaches the same RLE blockvia
cLitSize == 1after compress — same wire output)The pre-check runs BEFORE the
min_litsgate, andestimate_literals_section_bytesmirrors this order so splitter probecosts match real emit byte-for-byte.
Verification
min_literals_to_compress_returns_per_strategy_floor— per-strategyfloor + HUF reuse
min_gain_returns_per_strategy_margin— per-strategy margin +small-source edge
use_raw_literal_fallback_uses_payload_vs_srcsize_threshold—walks the 2-byte gap between the donor payload-vs-srcSize formula
and the on-wire-vs-on-wire form at
literals.len() = 20estimator_literals_section_mirrors_emit_for_short_inputs—byte-equivalent estimator/emit on boundary lengths around each
strategy's
min_lits, sub-min_litsall-identical (RLEpre-check), and the HUF-reuse path with seeded
last_huff_tableensure_frame_started_refreshes_stale_strategy_tag_at_reset—streaming reset-time sync: deliberately corrupts
state.strategy_tagto a sentinel before the first write,asserts
ensure_frame_startedrestores the correct tagset_compression_level_then_compress_refreshes_strategy_tag—FrameCompressor::set_compression_levelfromFastesttoLevel(20)(Fast band → BtUltra2 band) followed bycompress()must produce the new band's resolved tag
-D warnings— cleanvs main (all p > 0.4 in latest clean run)
Rust ~10× faster than FFI; fast-skip preserved
separately as next-up lazy investigation, not introduced by this
PR)
Notes / Follow-ups
SplitEstimator::estimate_subblock_size/emit_single_sequence_blockstill uses uniform(source_len >> 8) + 2(btultra2 value applied across all strategies). The
min_gainhelperis currently wired only into literal-section gates; migration to
the block-level sites is a separate cleanup.
HUF_flags_preferRepeat), G1(superblock shared entropy), G2 (cost-budgeted carve), G7 (extend
post-split to lazy band) — tracked in feat: block splitting for improved compression ratio #23.
Related: #23
Summary by CodeRabbit