feat: add buffer recycling to checksum pipeline #3979

Merged
oferchen merged 2 commits into master from feat/double-buffer-pipeline-1759
May 13, 2026

@oferchen
Owner

Summary

  • Add a recycle channel (Sender<Vec<u8>>) from the main thread back to the I/O thread so consumed buffers are reused instead of allocating fresh ones on every block read
  • In pipelined mode, the I/O thread calls try_recv on the recycle channel before each read; if a buffer is available it is reused, otherwise a fresh buffer is allocated as a fallback so the I/O thread never stalls waiting for one
  • In synchronous mode, a single pre-allocated buffer is reclaimed from current_block on each call, cycling through idle/in-use/reclaimed without allocation after initial setup
  • Add comprehensive tests covering various file sizes (empty, sub-block, exact boundary, partial last block, 100 blocks) and property tests verifying pipelined and sequential checksums always match for random data

Closes #1759

Test plan

  • CI passes on all platforms (Linux, macOS, Windows)
  • pipelined_matches_sequential_various_sizes covers 9 size variants including edge cases
  • many_blocks_pipelined_recycling exercises 100 blocks of buffer recycling
  • sync_mode_reuses_buffer verifies synchronous buffer reuse
  • Property test pipelined_equals_sequential verifies equivalence for random data up to 512 KiB
  • Property test reader_collects_all_data verifies no data loss for random data in both modes
  • Existing comprehensive tests and benchmarks pass unchanged (API is backward-compatible)

Add a recycle channel from the main thread back to the I/O thread so
consumed buffers are reused instead of allocating fresh ones on every
read. This avoids per-block heap allocations on the hot path whenever
a recycled buffer is available.

In pipelined mode, the I/O thread calls `try_recv` on the recycle
channel before each read. If a buffer is available it is reused;
otherwise a fresh buffer is allocated as a fallback, avoiding any
stall when the computation thread is slower than I/O.
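
The pipelined recycling described above can be sketched with the standard library's `mpsc` channels. This is a minimal illustration under stated assumptions, not the code from this PR: `pipelined_sum`, `BLOCK_SIZE`, the byte-sum "checksum", and the channel bound of 2 are all hypothetical names and choices made for the example.

```rust
use std::io::{self, Read};
use std::sync::mpsc::{channel, sync_channel};
use std::thread;

/// Hypothetical block size for this sketch (not taken from the PR).
const BLOCK_SIZE: usize = 4096;

/// Reads `reader` block by block on an I/O thread and sums the bytes on the
/// calling thread. Returns (byte sum, number of fresh buffer allocations).
fn pipelined_sum<R: Read + Send + 'static>(mut reader: R) -> io::Result<(u64, usize)> {
    // Filled blocks travel I/O thread -> consumer; bounded to two in flight.
    let (block_tx, block_rx) = sync_channel::<Vec<u8>>(2);
    // Consumed buffers travel consumer -> I/O thread for reuse.
    let (recycle_tx, recycle_rx) = channel::<Vec<u8>>();

    let io_thread = thread::spawn(move || -> io::Result<usize> {
        let mut allocations = 0usize;
        loop {
            // Non-blocking check: reuse a returned buffer if one is ready,
            // otherwise allocate so the I/O thread never waits for one.
            let mut buf = match recycle_rx.try_recv() {
                Ok(recycled) => recycled,
                Err(_) => {
                    allocations += 1;
                    Vec::with_capacity(BLOCK_SIZE)
                }
            };
            buf.clear();
            buf.resize(BLOCK_SIZE, 0);

            // Fill the buffer until it is full or the reader hits EOF.
            let mut filled = 0;
            while filled < BLOCK_SIZE {
                let n = reader.read(&mut buf[filled..])?;
                if n == 0 {
                    break;
                }
                filled += n;
            }
            buf.truncate(filled);

            // Stop at EOF or when the consumer has gone away.
            if filled == 0 || block_tx.send(buf).is_err() {
                return Ok(allocations);
            }
        }
    });

    let mut sum = 0u64;
    for block in block_rx.iter() {
        sum += block.iter().map(|&b| u64::from(b)).sum::<u64>();
        // Hand the buffer back; an error only means the I/O thread has exited.
        let _ = recycle_tx.send(block);
    }

    let allocations = io_thread.join().expect("I/O thread panicked")?;
    Ok((sum, allocations))
}
```

Because `try_recv` never blocks, the number of fresh allocations depends on timing: a fast consumer keeps it near the channel depth, while a slow consumer pushes the I/O thread onto the allocation fallback more often.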

In synchronous mode, a single pre-allocated buffer is reclaimed from
`current_block` on each call, cycling through idle -> in-use ->
reclaimed without any allocation after the initial setup.
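
A rough sketch of the synchronous idle -> in-use -> reclaimed cycle follows. Only the `current_block` name comes from the description above; `SyncBlockReader`, `spare`, `next_block`, and the overall shape are assumptions made for illustration, and the real implementation may differ.

```rust
use std::io::{self, Read};

/// Hypothetical synchronous reader that cycles a single buffer.
struct SyncBlockReader<R> {
    reader: R,
    block_size: usize,
    /// Idle buffer waiting for the next read.
    spare: Option<Vec<u8>>,
    /// Block most recently handed to the caller (in use).
    current_block: Option<Vec<u8>>,
}

impl<R: Read> SyncBlockReader<R> {
    fn new(reader: R, block_size: usize) -> Self {
        Self {
            reader,
            block_size,
            // The only allocation on the normal path happens here.
            spare: Some(Vec::with_capacity(block_size)),
            current_block: None,
        }
    }

    /// Returns the next block, or `None` at end of input.
    fn next_block(&mut self) -> io::Result<Option<&[u8]>> {
        // Reclaim the buffer handed out by the previous call; on the first
        // call, take the pre-allocated spare instead.
        let mut buf = self
            .current_block
            .take()
            .or_else(|| self.spare.take())
            .unwrap_or_default();
        buf.clear();
        buf.resize(self.block_size, 0);

        // Fill until the block is full or the reader hits EOF.
        let mut filled = 0;
        while filled < self.block_size {
            let n = self.reader.read(&mut buf[filled..])?;
            if n == 0 {
                break;
            }
            filled += n;
        }
        buf.truncate(filled);

        if filled == 0 {
            // Nothing read: park the buffer as idle again.
            self.spare = Some(buf);
            return Ok(None);
        }
        self.current_block = Some(buf);
        Ok(self.current_block.as_deref())
    }
}
```

In this sketch `clear`, `resize`, and `truncate` never exceed the buffer's original capacity, so the same heap allocation backs every block. A read error drops the buffer, and the next call would then allocate a replacement.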

Tests cover various file sizes (empty, sub-block, exact boundary,
partial last block, many blocks) and property tests verify that
pipelined and sequential checksums always match for random data.
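
The size-variant equivalence check can be approximated with the standard library alone. The sketch below is not the PR's test suite: it uses a fixed-seed generator in place of a property-testing framework, an Adler-32 style sum as a stand-in for the project's digest, and hypothetical names (`BLOCK`, `block_checksum`, `sequential`, `pipelined`, `check_sizes`).

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical block size for this sketch.
const BLOCK: usize = 1024;

/// Stand-in per-block checksum (Adler-32 style), not the project's digest.
fn block_checksum(block: &[u8]) -> u32 {
    let (mut a, mut b) = (1u32, 0u32);
    for &byte in block {
        a = (a + u32::from(byte)) % 65_521;
        b = (b + a) % 65_521;
    }
    (b << 16) | a
}

/// Checksums every block in order on the calling thread.
fn sequential(data: &[u8]) -> Vec<u32> {
    data.chunks(BLOCK).map(block_checksum).collect()
}

/// Produces blocks on a second thread and checksums them as they arrive.
fn pipelined(data: Vec<u8>) -> Vec<u32> {
    let (tx, rx) = sync_channel::<Vec<u8>>(2);
    let producer = thread::spawn(move || {
        for chunk in data.chunks(BLOCK) {
            if tx.send(chunk.to_vec()).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });
    let sums: Vec<u32> = rx.iter().map(|block| block_checksum(&block)).collect();
    producer.join().expect("producer thread panicked");
    sums
}

/// Deterministic pseudo-random bytes from a small LCG (a stand-in for the
/// random inputs a property test would generate).
fn pseudo_random(len: usize, seed: u64) -> Vec<u8> {
    let mut state = seed;
    (0..len)
        .map(|_| {
            state = state
                .wrapping_mul(6_364_136_223_846_793_005)
                .wrapping_add(1_442_695_040_888_963_407);
            (state >> 56) as u8
        })
        .collect()
}

/// Compares both modes across edge-case sizes; returns how many were checked.
fn check_sizes() -> usize {
    let sizes = [0, 1, BLOCK - 1, BLOCK, BLOCK + 1, 2 * BLOCK, 2 * BLOCK + 7, 100 * BLOCK];
    for (i, &len) in sizes.iter().enumerate() {
        let data = pseudo_random(len, i as u64);
        assert_eq!(sequential(&data), pipelined(data.clone()), "mismatch at len {len}");
    }
    sizes.len()
}
```

The sizes are chosen around block boundaries (empty, sub-block, exact boundary, partial last block, many blocks), since those are where an off-by-one in block handling would most likely surface.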
github-actions bot added the enhancement label on May 13, 2026
@oferchen oferchen merged commit 6bb5cab into master May 13, 2026
39 checks passed
@oferchen oferchen deleted the feat/double-buffer-pipeline-1759 branch May 13, 2026 11:06
oferchen added a commit that referenced this pull request May 18, 2026
* feat: add buffer recycling to double-buffered checksum pipeline

* style: remove unused RollingDigest import in property tests