bench(fast_io): IOCP vs stdio write throughput on Windows (#1899) #4241

Merged
oferchen merged 1 commit into master from bench/iocp-vs-stdio-1899 on May 17, 2026

@oferchen
Owner

Summary

  • Adds crates/fast_io/benches/iocp_vs_stdio.rs, a Criterion bench that pits the Windows IOCP writer path (IocpDiskBatch) against std::fs and BufWriter<File> baselines.
  • Four cells per payload size (4 KiB, 64 KiB, 1 MiB) for 12 measurement points: iocp_default, iocp_concurrent_ops_8, stdfs, bufwriter_64k. Each cell writes 1000 files into a fresh tempfile::TempDir per iteration.
  • Closes the "wired but never measured" gap from #1868 ("Ensure release flow builds distributable artifacts") with repeatable data on whether the IOCP path actually helps under typical and high-concurrency loads (task #1899, "Fix RPM artifact paths and cargo xtask alias").
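
As a rough illustration of the two non-IOCP cells described above, the following std-only sketch writes a batch of files once with `File::create` + `write_all` and once through a 64 KiB `BufWriter`. It is not the code from `iocp_vs_stdio.rs`: the function names, the `fresh_dir` helper (a stand-in for `tempfile::TempDir`), the constant-fill payload, and the reduced file count are all assumptions made for the example, and the Criterion timing harness is omitted.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};

/// `stdfs`-style cell: create each file and write the payload in one `write_all`.
fn write_files_stdfs(dir: &Path, count: usize, payload: &[u8]) -> io::Result<()> {
    for i in 0..count {
        let mut file = File::create(dir.join(format!("f{}.bin", i)))?;
        file.write_all(payload)?;
    }
    Ok(())
}

/// `bufwriter_64k`-style cell: same files, routed through a 64 KiB `BufWriter`.
fn write_files_bufwriter(dir: &Path, count: usize, payload: &[u8]) -> io::Result<()> {
    for i in 0..count {
        let file = File::create(dir.join(format!("f{}.bin", i)))?;
        let mut writer = BufWriter::with_capacity(64 * 1024, file);
        writer.write_all(payload)?;
        // Flush explicitly so write errors are reported rather than lost on drop.
        writer.flush()?;
    }
    Ok(())
}

/// Hypothetical helper: an empty scratch directory under the system temp dir.
/// The PR uses `tempfile::TempDir`; this stand-in keeps the sketch dependency-free.
fn fresh_dir(tag: &str) -> io::Result<PathBuf> {
    let name = format!("iocp_sketch_{}_{}", tag, std::process::id());
    let dir = std::env::temp_dir().join(name);
    let _ = fs::remove_dir_all(&dir); // ignore "not found" from a clean start
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

fn main() -> io::Result<()> {
    // Constant fill for brevity; the PR seeds payloads from an LCG instead.
    let payload = vec![0xA5u8; 4 * 1024];
    // The bench writes 1000 files per iteration; a small count is used here.
    let count = 16;

    let dir = fresh_dir("stdfs")?;
    write_files_stdfs(&dir, count, &payload)?;
    println!("stdfs wrote {} files", fs::read_dir(&dir)?.count());
    fs::remove_dir_all(&dir)?;

    let dir = fresh_dir("bufwriter_64k")?;
    write_files_bufwriter(&dir, count, &payload)?;
    println!("bufwriter_64k wrote {} files", fs::read_dir(&dir)?.count());
    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

Creating a fresh directory per run mirrors the per-iteration `TempDir` in the bench, so no cell benefits from files left behind by a previous one.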

Implementation notes

  • Bench is cfg-gated to target_os = "windows" + the iocp feature. On Linux and macOS the file compiles to a stub fn main() that prints a skip line, so cargo bench -p fast_io stays cheap on every other host.
  • IocpDiskBatch is the only writer surface that consumes IocpConfig::concurrent_ops (via submit_write_batch in crates/fast_io/src/iocp/disk_batch.rs). The bench drives it through the same begin_file -> write_data -> commit_file sequence the receiver disk-commit thread uses.
  • bufwriter_64k matches the default IocpConfig::buffer_size so the comparison isolates overlapped submission from buffering.
  • Payloads are seeded by a deterministic LCG to defeat NTFS sparse-region folding without introducing a random-number crate dependency.
  • See the file's module doc for the run recipe and the outcome -> action mapping that feeds back into #1899 ("Fix RPM artifact paths and cargo xtask alias") and #1929 ("Cache x86 checksum SIMD feature detection") / #1930 ("Optimize vectored rolling checksum aggregation").
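
The deterministic payload seeding mentioned above could look roughly like the sketch below. The PR does not show its generator, so the function name `lcg_payload`, the seed value, and the multiplier/increment (taken from Knuth's MMIX LCG) are assumptions chosen for illustration.

```rust
/// Fill a buffer of `len` bytes from a 64-bit linear congruential generator.
/// Hypothetical helper; the constants are Knuth's MMIX parameters, not
/// necessarily the ones the bench uses.
fn lcg_payload(seed: u64, len: usize) -> Vec<u8> {
    const A: u64 = 6364136223846793005;
    const C: u64 = 1442695040888963407;
    let mut state = seed;
    let mut out = Vec::with_capacity(len);
    for _ in 0..len {
        // Wrapping arithmetic gives the implicit modulus 2^64.
        state = state.wrapping_mul(A).wrapping_add(C);
        // Take the top byte: the high bits of an LCG are the least regular.
        out.push((state >> 56) as u8);
    }
    out
}

fn main() {
    // The three payload sizes from the PR: 4 KiB, 64 KiB, 1 MiB.
    for size in [4 * 1024, 64 * 1024, 1024 * 1024] {
        let payload = lcg_payload(0x1899, size);
        println!("{} bytes, first four: {:?}", payload.len(), &payload[..4]);
    }
}
```

Because the same seed always yields the same bytes, runs stay comparable across hosts, while the content is varied enough that it is not a run of zeros or a single repeated byte.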

Test plan

  • cargo bench -p fast_io --bench iocp_vs_stdio on a Windows host produces three throughput rows (one per payload size) for each of the iocp_default, iocp_concurrent_ops_8, stdfs, and bufwriter_64k cells.
  • cargo build -p fast_io --bench iocp_vs_stdio on Linux and macOS hosts compiles to a no-op stub that runs and prints the skip line.
  • CI fmt+clippy passes on every platform in the matrix.

Adds a Criterion bench that compares the Windows IOCP writer path
(IocpDiskBatch) against std::fs and BufWriter baselines across 4 KiB,
64 KiB, and 1 MiB payloads. Closes the "wired but never measured"
gap from #1868 with repeatable data on whether the IOCP path actually
helps under typical and high-concurrency loads.

Four cells per payload size:
- iocp_default: IocpDiskBatch with default concurrent_ops (4)
- iocp_concurrent_ops_8: IocpDiskBatch with concurrent_ops = 8
- stdfs: std::fs::File::create + write_all baseline
- bufwriter_64k: BufWriter<File> with a 64 KiB buffer

The bench is cfg-gated to target_os = "windows" + iocp feature; on
Linux and macOS the file compiles to a stub main that prints a skip
line, so cargo bench -p fast_io is cheap on every other host.
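
The gating described above can be sketched with two mutually exclusive `main` functions. This is a minimal illustration, not the contents of `iocp_vs_stdio.rs`: the skip text, the `skip_message` helper, and the placeholder body of the Windows `main` are assumptions, and the real file would hand control to Criterion instead of printing.

```rust
/// Hypothetical skip text; the actual wording in the bench is not shown in the PR.
#[allow(dead_code)] // unused when the Windows + `iocp` variant is compiled
fn skip_message() -> &'static str {
    "iocp_vs_stdio: skipped (requires target_os = \"windows\" and the `iocp` feature)"
}

// Compiled only on Windows with the `iocp` feature enabled.
#[cfg(all(target_os = "windows", feature = "iocp"))]
fn main() {
    // Placeholder: the real bench would register and run its Criterion groups here.
    println!("iocp_vs_stdio: running IOCP vs stdio cells");
}

// Compiled in every other configuration: a cheap stub that only prints.
#[cfg(not(all(target_os = "windows", feature = "iocp")))]
fn main() {
    println!("{}", skip_message());
}
```

Since exactly one of the two `cfg` predicates holds for any build, the file always has a single `main`, and non-Windows hosts never compile the IOCP code paths.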

@oferchen force-pushed the bench/iocp-vs-stdio-1899 branch from a864384 to eaa321c on May 17, 2026, 19:43
@oferchen merged commit e484fb5 into master on May 17, 2026 (13 checks passed)
@oferchen deleted the bench/iocp-vs-stdio-1899 branch on May 17, 2026, 19:43
oferchen added a commit that referenced this pull request May 18, 2026
oferchen added a commit that referenced this pull request May 18, 2026