Stabilize run_client delete test timestamps #2065
Merged
This was referenced May 5, 2026
oferchen added a commit that referenced this pull request on May 6, 2026
* feat(match): add zsync-inspired seq-match extend-run
Coalesces consecutive matched basis blocks at the DeltaScript layer into
a single fat `DeltaToken::Copy { len = run * block_length }`, mirroring
zsync's `next_match` shortcut from `librcksum/rsum.c:262`. The wire
layer (`script_to_wire_delta`) expands fat Copy tokens back into one
DeltaOp per basis block, so the wire byte stream stays byte-identical
to the no-coalesce baseline (closes #2065).
Adds `DeltaSignatureIndex::extend_run(start_block_index, target,
max_blocks)` as the public helper and pins the post-coalesce token
stream with a golden-byte regression test in
`crates/match/tests/seq_match_golden.rs` (closes #2066). Updates
shifted-insertion and sparse-match fixtures to expand fat Copy runs
before asserting per-block contiguity.
Wire-compat invariants from `docs/design/zsync-seq-match.md`:
- One write_int(-(block_index + 1)) per basis block on the wire.
- CPRES_ZLIB dictionary sync still feeds one block per match call.
- `apply_delta` and `compute_file_checksum` already honour `len`,
no semantic changes downstream.
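The coalesce-then-expand round trip described in this commit message can be sketched as follows. This is a minimal illustration, not the crate's code: `Token`, `coalesce`, and `to_wire_ints` are hypothetical stand-ins for `DeltaToken`, the seq-match coalescing step, and `script_to_wire_delta`, and the only detail taken from the text is the per-block wire value `-(block_index + 1)`.

```rust
/// Hypothetical stand-in for the script-level token; the real `DeltaToken` may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Token {
    Literal(Vec<u8>),
    /// Copy `len` bytes of basis data starting at block `block_index`.
    Copy { block_index: u32, len: u64 },
}

/// Merge runs of consecutive basis-block copies into a single fat copy.
fn coalesce(tokens: &[Token], block_len: u64) -> Vec<Token> {
    let mut out: Vec<Token> = Vec::new();
    for tok in tokens {
        if let Token::Copy { block_index, len } = tok {
            if let Some(Token::Copy { block_index: start, len: run_len }) = out.last_mut() {
                // Extend only if the run so far is whole blocks and this
                // block is the one immediately after the run.
                let run_blocks = *run_len / block_len;
                if *run_len % block_len == 0
                    && u64::from(*start) + run_blocks == u64::from(*block_index)
                {
                    *run_len += *len;
                    continue;
                }
            }
        }
        out.push(tok.clone());
    }
    out
}

/// Expand every copy back into one signed integer per basis block,
/// using the `-(block_index + 1)` encoding. Literals are skipped here.
fn to_wire_ints(tokens: &[Token], block_len: u64) -> Vec<i32> {
    let mut wire = Vec::new();
    for tok in tokens {
        if let Token::Copy { block_index, len } = tok {
            // Round up so a short trailing block still gets its own entry.
            let blocks = (*len + block_len - 1) / block_len;
            for i in 0..blocks {
                let idx = i64::from(*block_index) + i as i64;
                wire.push(-(idx + 1) as i32);
            }
        }
    }
    wire
}
```

Under these assumptions, feeding either the per-block script or its coalesced form to `to_wire_ints` produces the same integer sequence, which is the byte-identical property the commit relies on.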
* fix(match): pad extend_run test data to multiple of DEFAULT_BLOCK_SIZE
* fix(test): assert copied bytes instead of token count for repeated blocks
The seq-match extend_run helper coalesces consecutive matching blocks
into a single COPY token, so the prior token-count assertion no longer
holds. Asserting that all input bytes are covered by COPY tokens (and
zero literal bytes) preserves the test's intent without depending on
the internal coalescing strategy.
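A byte-coverage assertion of the kind this commit describes might look like the sketch below. The `Token` enum and the `byte_coverage` helper are hypothetical; the real test presumably works on the crate's own token type.

```rust
/// Hypothetical token type for illustration only.
enum Token {
    Literal(Vec<u8>),
    Copy { len: u64 },
}

/// Return `(copied_bytes, literal_bytes)` for a token stream.
/// The totals are independent of how many COPY tokens the matcher emits.
fn byte_coverage(tokens: &[Token]) -> (u64, u64) {
    let mut copied = 0u64;
    let mut literal = 0u64;
    for tok in tokens {
        match tok {
            Token::Copy { len } => copied += *len,
            Token::Literal(bytes) => literal += bytes.len() as u64,
        }
    }
    (copied, literal)
}
```

A test can then assert that `copied` equals the input length and `literal` is zero, which holds whether the matcher emits many per-block copies or one coalesced copy.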
* fix(test): size synthetic basis to exact multiple of DEFAULT_BLOCK_SIZE
Both seq_match_emits_single_fat_copy_for_full_basis_run and
seq_match_matched_bytes_match_baseline assume every basis block is
full-length so that extend_run can walk the whole basis in a single
fat-copy and matched bytes equal block_count * block_length. With
65 536-byte input and 700-byte blocks, the trailing 436-byte partial
block defeated those invariants. Sizing to 700 * 94 = 65 800 bytes
keeps the basis well under the 700^2-byte threshold (so the layout
still picks 700) and removes the partial trailing block.
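The sizing arithmetic in this commit can be checked with a few lines. The helper names below are hypothetical, and the `len < block_len^2` rule is taken from the commit message's description of the layout rather than from the crate itself.

```rust
/// Length of the short trailing block, or 0 if every block is full.
fn trailing_partial_len(input_len: u64, block_len: u64) -> u64 {
    input_len % block_len
}

/// Round `input_len` up to the next exact multiple of `block_len`.
fn padded_len(input_len: u64, block_len: u64) -> u64 {
    let blocks = (input_len + block_len - 1) / block_len;
    blocks * block_len
}

/// Assumed layout rule from the commit message: inputs shorter than
/// `block_len^2` bytes keep the minimum block length.
fn keeps_min_block_len(input_len: u64, block_len: u64) -> bool {
    input_len < block_len * block_len
}
```

For a 65 536-byte input and 700-byte blocks, `padded_len` gives 94 full blocks (65 800 bytes), `trailing_partial_len` of the padded size is zero, and the padded size is still far below 700^2 = 490 000.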
oferchen
added a commit
that referenced
this pull request
May 18, 2026
Summary
`run_client_delete_removes_extraneous_entries` pins source and destination mtimes so the fresh content is always recopied before pruning extraneous files
Testing
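For the mtime pinning described in the summary, a minimal sketch using only the standard library is shown below. It is an assumption about the approach, not the test's actual code: `write_with_mtime` and `mtime_secs` are hypothetical helpers, and the real test may use a different mechanism for setting timestamps. `File::set_modified` requires Rust 1.75 or newer.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::Path;
use std::time::{Duration, UNIX_EPOCH};

/// Write `contents` to `path`, then pin its mtime to `secs` seconds
/// after the Unix epoch so the test does not depend on wall-clock timing.
fn write_with_mtime(path: &Path, contents: &[u8], secs: u64) -> io::Result<()> {
    fs::write(path, contents)?;
    let file = OpenOptions::new().write(true).open(path)?;
    file.set_modified(UNIX_EPOCH + Duration::from_secs(secs))
}

/// Read a file's mtime back as whole seconds since the Unix epoch.
fn mtime_secs(path: &Path) -> io::Result<u64> {
    let modified = fs::metadata(path)?.modified()?;
    let since_epoch = modified
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;
    Ok(since_epoch.as_secs())
}
```

Giving the destination an older fixed mtime and the source a newer one means a size-and-mtime comparison always sees the source as changed, even when both files have the same length and are created within the same clock tick.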