Enable SIMD rolling checksums on x86 #1928

Merged
oferchen merged 1 commit into master from implement-missing-rsync-components-for-oc-rsync on Nov 4, 2025
@oferchen oferchen commented Nov 4, 2025

Summary

  • enable the existing SIMD rolling-checksum implementation on 32-bit x86 targets by wiring the architecture dispatcher to the shared fast path
  • share the x86 SSE2/AVX2 intrinsics between x86 and x86_64 and extend the parity tests to cover both targets
  • document the broader CPU coverage in internal docs
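
As a rough illustration of the first two bullets, the sketch below shows one way a single intrinsics module can serve both `x86` and `x86_64`, with a runtime dispatcher that falls back to scalar code. It is not taken from the PR's diff: the function names (`sums_scalar`, `sums_sse2`, `rolling_checksum`) and the module name `x86` are hypothetical, and the checksum is assumed to be the rsync-style pair of running sums packed as `(s1 & 0xffff) | (s2 << 16)`. Only the SSE2 path is shown; an AVX2 variant would follow the same pattern.

```rust
/// Scalar reference: s1 is the byte sum, s2 is the sum of running s1 values.
fn sums_scalar(data: &[u8]) -> (u32, u32) {
    let (mut s1, mut s2) = (0u32, 0u32);
    for &b in data {
        s1 = s1.wrapping_add(b as u32);
        s2 = s2.wrapping_add(s1);
    }
    (s1, s2)
}

// One module compiled for both 32-bit and 64-bit x86 targets.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
mod x86 {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    /// Processes 16 bytes per step. For a block b[0..16]:
    ///   s2 += 16 * s1 + sum((16 - j) * b[j]);  s1 += sum(b[j])
    #[target_feature(enable = "sse2")]
    pub unsafe fn sums_sse2(data: &[u8]) -> (u32, u32) {
        let zero = _mm_setzero_si128();
        // Arguments run from the highest lane to the lowest: byte 0 gets weight 16.
        let w_lo = _mm_set_epi16(9, 10, 11, 12, 13, 14, 15, 16);
        let w_hi = _mm_set_epi16(1, 2, 3, 4, 5, 6, 7, 8);
        let (mut s1, mut s2) = (0u32, 0u32);
        let mut chunks = data.chunks_exact(16);
        for c in &mut chunks {
            let v = _mm_loadu_si128(c.as_ptr() as *const __m128i);
            // Plain byte sum: two partial sums, one per 64-bit half.
            let sad = _mm_sad_epu8(v, zero);
            let block_sum = _mm_cvtsi128_si32(sad) as u32
                + _mm_cvtsi128_si32(_mm_srli_si128::<8>(sad)) as u32;
            // Weighted byte sum: widen to 16 bits, multiply-add, then reduce.
            let lo = _mm_unpacklo_epi8(v, zero);
            let hi = _mm_unpackhi_epi8(v, zero);
            let prod = _mm_add_epi32(_mm_madd_epi16(lo, w_lo), _mm_madd_epi16(hi, w_hi));
            let t = _mm_add_epi32(prod, _mm_srli_si128::<8>(prod));
            let t = _mm_add_epi32(t, _mm_srli_si128::<4>(t));
            let weighted = _mm_cvtsi128_si32(t) as u32;
            s2 = s2.wrapping_add(s1.wrapping_mul(16)).wrapping_add(weighted);
            s1 = s1.wrapping_add(block_sum);
        }
        // Tail bytes that do not fill a whole block use the scalar recurrence.
        for &b in chunks.remainder() {
            s1 = s1.wrapping_add(b as u32);
            s2 = s2.wrapping_add(s1);
        }
        (s1, s2)
    }
}

/// Dispatcher: take the shared x86 fast path when SSE2 is detected at runtime.
fn sums(data: &[u8]) -> (u32, u32) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse2") {
            // SAFETY: SSE2 support was just verified at runtime.
            return unsafe { x86::sums_sse2(data) };
        }
    }
    sums_scalar(data)
}

/// Packs both sums into one 32-bit value (assumed rsync-style layout).
pub fn rolling_checksum(data: &[u8]) -> u32 {
    let (s1, s2) = sums(data);
    (s1 & 0xffff) | (s2 << 16)
}
```

A parity test in this style compares `sums` against `sums_scalar` for lengths on either side of the 16-byte block boundary, which is the kind of check the second bullet describes extending to both targets.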

Testing

  • cargo test

https://chatgpt.com/codex/tasks/task_e_6909a2caa6a08323acf3149f63e0f1f7

@oferchen oferchen merged commit b8ff57a into master Nov 4, 2025
@oferchen oferchen deleted the implement-missing-rsync-components-for-oc-rsync branch November 4, 2025 07:09
oferchen added a commit that referenced this pull request May 1, 2026
Add CompletionPump: a shared completion-port driver that owns a worker
thread looping on GetQueuedCompletionStatusEx and dispatches each
completion entry to a per-operation handler keyed by the OVERLAPPED
pointer. This is the Windows analogue of an io_uring CQE drain and is
the foundational building block for the upcoming batched-write API
(#1898), the concurrent file/socket paths (#1928-#1932), and the IOCP
benchmark harness (#1899).

Existing IocpReader/IocpWriter create a private port and inline-call
GetQueuedCompletionStatus per operation, so the pump introduces a
process-wide proactor without disturbing those single-op paths.

Cross-platform: real implementation lives behind
#[cfg(all(target_os = "windows", feature = "iocp"))]; non-Windows
targets get an Unsupported-returning stub so the crate still compiles
on Linux and macOS.

All unsafe stays inside fast_io per the workspace unsafe-code policy.
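
The cross-platform arrangement described in this commit message can be sketched roughly as follows. This is a minimal illustration of the stub side only, not the crate's actual code: the `new` constructor and the error text are assumptions, and the real Windows implementation (compiled under the opposite `cfg` predicate) is omitted.

```rust
// Compiled whenever the real IOCP implementation is not available,
// i.e. on non-Windows targets or when the `iocp` feature is off.
#[cfg(not(all(target_os = "windows", feature = "iocp")))]
mod iocp_stub {
    use std::io;

    /// Placeholder with the same name as the real pump so callers compile
    /// unchanged on every platform.
    pub struct CompletionPump;

    impl CompletionPump {
        /// Hypothetical constructor: always reports that IOCP is unsupported.
        pub fn new() -> io::Result<Self> {
            Err(io::Error::new(
                io::ErrorKind::Unsupported,
                "IOCP requires Windows and the `iocp` feature",
            ))
        }
    }
}

#[cfg(not(all(target_os = "windows", feature = "iocp")))]
pub use iocp_stub::CompletionPump;
```

Callers on Linux or macOS would then see an `Unsupported` error at construction time and can fall back to another I/O path instead of failing to compile.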
oferchen added a commit that referenced this pull request May 1, 2026
Adds an IOCP-backed async socket reader/writer that mirrors the
io_uring socket surface and dispatches completions through the shared
CompletionPump from #1897. Each WSARecv/WSASend issues an OVERLAPPED
operation; synchronous completions and WSA_IO_PENDING paths both wait
on a oneshot handler registered with the pump, so file I/O and socket
I/O share a single drain thread.

Mirrors upstream rsync's socket I/O semantics:
- recv treats graceful peer close (WSAEDISCON, WSAESHUTDOWN,
  WSAENETRESET, WSAECONNRESET, WSAECONNABORTED) and STATUS_END_OF_FILE
  as Ok(0), matching safe_read breaking on n == 0
  (target/interop/upstream-src/rsync-3.4.1/io.c:276).
- send maps WSAESHUTDOWN/WSAECONNRESET/WSAECONNABORTED to BrokenPipe
  and returns short counts so the caller's loop re-issues, matching
  safe_write's partial-write retry (io.c:316-336).
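
The two mappings above can be sketched as small pure functions. This is an illustration under stated assumptions rather than the code in `socket.rs`: the helper names (`map_recv_error`, `map_send_error`, `send_all`) are hypothetical, the Winsock error codes are defined locally so the example compiles on any platform, and the `STATUS_END_OF_FILE` case is left out.

```rust
use std::io;

// Winsock error codes, defined locally for portability of the sketch.
const WSAENETRESET: i32 = 10052;
const WSAECONNABORTED: i32 = 10053;
const WSAECONNRESET: i32 = 10054;
const WSAESHUTDOWN: i32 = 10058;
const WSAEDISCON: i32 = 10101;

/// recv side: peer-close conditions become a clean EOF (`Ok(0)`).
fn map_recv_error(code: i32) -> io::Result<usize> {
    match code {
        WSAEDISCON | WSAESHUTDOWN | WSAENETRESET | WSAECONNRESET | WSAECONNABORTED => Ok(0),
        other => Err(io::Error::from_raw_os_error(other)),
    }
}

/// send side: a closed connection is reported as `BrokenPipe`.
fn map_send_error(code: i32) -> io::Error {
    match code {
        WSAESHUTDOWN | WSAECONNRESET | WSAECONNABORTED => {
            io::Error::new(io::ErrorKind::BrokenPipe, "peer closed the connection")
        }
        other => io::Error::from_raw_os_error(other),
    }
}

/// Caller-side loop that re-issues `send` until the whole buffer is written,
/// which is why the writer may return short counts.
fn send_all(
    mut send: impl FnMut(&[u8]) -> io::Result<usize>,
    mut buf: &[u8],
) -> io::Result<()> {
    while !buf.is_empty() {
        match send(buf)? {
            0 => return Err(io::ErrorKind::WriteZero.into()),
            n => buf = &buf[n..],
        }
    }
    Ok(())
}
```

Keeping the mapping separate from the OVERLAPPED plumbing makes it testable without a socket, for example by driving `send_all` with a closure that accepts only a few bytes per call.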

Files:
- crates/fast_io/src/iocp/socket.rs (new): IocpSocketReader,
  IocpSocketWriter, SharedPump alias, six tests covering localhost
  TCP roundtrip, partial 64KB send accounting, peer-shutdown EOF,
  empty-buffer fast paths, and completion-key override.
- crates/fast_io/src/iocp/mod.rs: pub mod socket; (single line)
- crates/fast_io/src/iocp_stub.rs: matching stub module so the
  workspace cross-compiles on Linux and macOS, all methods returning
  io::ErrorKind::Unsupported.
- crates/fast_io/Cargo.toml: add Win32_Networking_WinSock to the
  windows-sys feature list for WSARecv/WSASend bindings.

Cross-platform compile preserved: the real implementation is gated
behind cfg(all(target_os = "windows", feature = "iocp")) at the
fast_io::iocp module level; non-Windows builds fall through to the
stub.
oferchen added a commit that referenced this pull request May 2, 2026
* feat(fast_io): IOCP socket I/O via WSARecv/WSASend (#1928)

Adds an IOCP-backed async socket reader/writer that mirrors the
io_uring socket surface and dispatches completions through the shared
CompletionPump from #1897. Each WSARecv/WSASend issues an OVERLAPPED
operation; synchronous completions and WSA_IO_PENDING paths both wait
on a oneshot handler registered with the pump, so file I/O and socket
I/O share a single drain thread.

Mirrors upstream rsync's socket I/O semantics:
- recv treats graceful peer close (WSAEDISCON, WSAESHUTDOWN,
  WSAENETRESET, WSAECONNRESET, WSAECONNABORTED) and STATUS_END_OF_FILE
  as Ok(0), matching safe_read breaking on n == 0
  (target/interop/upstream-src/rsync-3.4.1/io.c:276).
- send maps WSAESHUTDOWN/WSAECONNRESET/WSAECONNABORTED to BrokenPipe
  and returns short counts so the caller's loop re-issues, matching
  safe_write's partial-write retry (io.c:316-336).

Files:
- crates/fast_io/src/iocp/socket.rs (new): IocpSocketReader,
  IocpSocketWriter, SharedPump alias, six tests covering localhost
  TCP roundtrip, partial 64KB send accounting, peer-shutdown EOF,
  empty-buffer fast paths, and completion-key override.
- crates/fast_io/src/iocp/mod.rs: pub mod socket; (single line)
- crates/fast_io/src/iocp_stub.rs: matching stub module so the
  workspace cross-compiles on Linux and macOS, all methods returning
  io::ErrorKind::Unsupported.
- crates/fast_io/Cargo.toml: add Win32_Networking_WinSock to the
  windows-sys feature list for WSARecv/WSASend bindings.

Cross-platform compile preserved: the real implementation is gated
behind cfg(all(target_os = "windows", feature = "iocp")) at the
fast_io::iocp module level; non-Windows builds fall through to the
stub.

* style(transport): cargo fmt