windows: Fix polly and utils crates to compile on Windows #660

Open
lstocchi wants to merge 4 commits into containers:main from lstocchi:polly_win

@lstocchi

This is the second PR for porting libkrun to Windows. Nothing fancy, and relatively small.
I'm going back through the code I wrote for my tests over the last few weeks, cleaning it up, and pushing it with a bit more organization. Next will be the WHP crate.

As @slp confirmed, I'm going to push to main.
As you can see, the first two commits are the same as in #609, because main does not contain them yet. If you prefer, we can merge them separately and rebase this PR.

Overall, it's just a bunch of instructions to gate Unix-specific functions that won't be used on Windows and to replace the missing ones with Windows code.

This introduces an epoll-compatible polling abstraction for Windows by
leveraging I/O Completion Ports (IOCP) as a central event multiplexer.
By bypassing the standard `WaitForMultipleObjects` 64-handle limit, this
architecture gives us true O(1) wake-ups with no handle-count limitations.

When an event is added to the epoll, we heap-allocate a `Watch` struct
and attach a Wait Completion Packet (WCP) to it. The raw memory pointer
of this struct is passed to the kernel as the completion key. When the
event signals, the kernel pushes the WCP to the IOCP queue. The waiting
thread pops the packet and reads the pointer, allowing us to process
events with zero heap allocations and a completely lock-free hot path.
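
The pointer-as-completion-key idea can be sketched in portable Rust. This is only an illustration under stated assumptions: the `Watch` fields, `FakeCompletionPort`, and the `register`/`resolve`/`free` helpers are hypothetical names, and a plain queue stands in for the kernel-managed IOCP so that no Windows API is called.

```rust
use std::collections::VecDeque;

/// Hypothetical per-registration state; the real `Watch` layout is not shown here.
struct Watch {
    token: u64,  // user data handed back to the caller
    events: u32, // interest mask
}

/// Stand-in for the kernel's IOCP queue: it only carries completion keys.
struct FakeCompletionPort {
    queue: VecDeque<usize>,
}

impl FakeCompletionPort {
    fn new() -> Self {
        Self { queue: VecDeque::new() }
    }

    /// Simulates the kernel pushing a completion packet when the event signals.
    fn post(&mut self, key: usize) {
        self.queue.push_back(key);
    }

    /// Simulates the waiting thread popping the next packet.
    fn pop(&mut self) -> Option<usize> {
        self.queue.pop_front()
    }
}

/// Heap-allocate a `Watch` and leak it; its address becomes the completion key.
fn register(token: u64, events: u32) -> usize {
    Box::into_raw(Box::new(Watch { token, events })) as usize
}

/// Hot path: turn the key back into a reference. No allocation, no lock.
///
/// # Safety
/// `key` must come from `register` and must not have been passed to `free`.
unsafe fn resolve<'a>(key: usize) -> &'a Watch {
    unsafe { &*(key as *const Watch) }
}

/// Reclaim the allocation once no packet can reference it any more.
///
/// # Safety
/// `key` must come from `register` and must be freed at most once.
unsafe fn free(key: usize) {
    unsafe { drop(Box::from_raw(key as *mut Watch)) };
}
```

The unsafe contract on `resolve` is the reason deletion needs care: a key that is still queued must not be freed yet.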

When deleting an event, the WCP is closed and the `Watch` memory is moved
to a "zombie" list with a 5-second garbage collection delay. Because the
IOCP queue is managed asynchronously by the Windows kernel, a deleted
event might still have a completion packet in-flight to a worker thread.
This GC window gives those ghost packets time to drain before the memory is freed,
preventing use-after-free crashes.
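
A minimal sketch of such a delayed-free list follows. It is not the code from this PR: `ZombieList`, `retire`, and `collect` are hypothetical names, the `Watch` is held as a `Box` rather than a raw pointer, and the clock is passed in explicitly so the grace period can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the per-registration state.
struct Watch {
    token: u64,
}

/// Retired watches wait here until the grace period has elapsed.
struct ZombieList {
    grace: Duration,
    zombies: Vec<(Instant, Box<Watch>)>,
}

impl ZombieList {
    fn new(grace: Duration) -> Self {
        Self { grace, zombies: Vec::new() }
    }

    /// Called on delete: keep the memory alive instead of freeing it right away.
    fn retire(&mut self, watch: Box<Watch>, now: Instant) {
        self.zombies.push((now, watch));
    }

    /// Frees every zombie retired at least `grace` ago; returns how many were freed.
    fn collect(&mut self, now: Instant) -> usize {
        let before = self.zombies.len();
        let grace = self.grace;
        // Dropping the `Box` inside `retain` is what releases the memory.
        self.zombies
            .retain(|(retired_at, _)| now.duration_since(*retired_at) < grace);
        before - self.zombies.len()
    }

    fn len(&self) -> usize {
        self.zombies.len()
    }
}
```

With `ZombieList::new(Duration::from_secs(5))`, a watch retired at time `t` survives every `collect` call made before `t + 5s` and is dropped by the first call at or after it.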

One calculated tradeoff is the reliance on the undocumented Windows NT
native API (`NtAssociateWaitCompletionPacket`). While unofficial, it has
been stable in the Windows kernel for over a decade and is currently the
only way to achieve VMM-grade, O(1) polling performance for arbitrary handles.

Additional details:
- EventSet bit values mirror the macOS implementation for cross-platform portability.
- Adds `windows-sys` as a Windows-only dependency for OS APIs.

Signed-off-by: lstocchi <lstocchi@redhat.com>

Emulate eventfd on Windows using a manual-reset kernel Event object paired with a Mutex-protected counter.

To maximize VMM throughput, the `write` path only triggers the event when the counter transitions from `0 -> non-zero`.
If a virtual device rapid-fires multiple interrupts before the vCPU wakes up,
we accumulate the data in user-space RAM and skip the redundant kernel syscalls entirely.

`read` and `wait_timeout` maintain strict level-triggered synchronization.
The kernel event is only reset (`ResetEvent`) when the internal counter is fully drained,
preventing the IOCP epoll loop from entering an infinite busy-wait cycle.
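
The counter and event interplay described above can be sketched in portable Rust. This is an illustration under stated assumptions, not the PR's implementation: `FakeEvent` replaces the real manual-reset kernel Event with an atomic flag, it counts `set` calls so the skipped signals are visible, and overflow is handled by saturating rather than blocking.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Mutex;

/// Stand-in for a manual-reset kernel Event (hypothetical, no OS calls).
struct FakeEvent {
    signaled: AtomicBool,
    set_calls: AtomicUsize, // how many times the "syscall" was made
}

impl FakeEvent {
    fn set(&self) {
        self.signaled.store(true, Ordering::SeqCst);
        self.set_calls.fetch_add(1, Ordering::SeqCst);
    }

    fn reset(&self) {
        self.signaled.store(false, Ordering::SeqCst);
    }
}

/// eventfd-like object: a mutex-protected counter plus a level-triggered event.
struct EventFd {
    counter: Mutex<u64>,
    event: FakeEvent,
}

impl EventFd {
    fn new() -> Self {
        Self {
            counter: Mutex::new(0),
            event: FakeEvent {
                signaled: AtomicBool::new(false),
                set_calls: AtomicUsize::new(0),
            },
        }
    }

    /// Adds to the counter; signals only on the 0 -> non-zero transition.
    fn write(&self, value: u64) {
        let mut counter = self.counter.lock().unwrap();
        let was_zero = *counter == 0;
        *counter = (*counter).saturating_add(value);
        if was_zero && *counter != 0 {
            self.event.set();
        }
    }

    /// Drains the counter; the event is reset only once the counter is empty.
    fn read(&self) -> Option<u64> {
        let mut counter = self.counter.lock().unwrap();
        if *counter == 0 {
            return None;
        }
        let value = std::mem::take(&mut *counter);
        self.event.reset();
        Some(value)
    }
}
```

Three back-to-back writes of 1, 2, and 3 call `set` once, and the next `read` returns 6 and clears the event, so a level-triggered poller does not spin on an empty counter.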

Signed-off-by: lstocchi <lstocchi@redhat.com>

cfg-gate dependencies and implementations based on the target OS so that
the utils crate also compiles on Windows
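
As a rough illustration of this kind of gating, a platform-specific module can sit behind `#[cfg(...)]` while callers use a single un-gated entry point. The module and function names below are hypothetical and chosen only for the example; they are not taken from the utils crate.

```rust
// Hypothetical names; only the `cfg` pattern itself is the point here.

#[cfg(unix)]
mod imp {
    /// Unix build: fd-based code would live in this module.
    pub fn backend_name() -> &'static str {
        "unix"
    }
}

#[cfg(windows)]
mod imp {
    /// Windows build: handle-based replacements would live in this module.
    pub fn backend_name() -> &'static str {
        "windows"
    }
}

#[cfg(not(any(unix, windows)))]
mod imp {
    /// Fallback so the example still compiles on other targets.
    pub fn backend_name() -> &'static str {
        "unsupported"
    }
}

/// Single entry point; exactly one `imp` module exists per target.
pub fn backend_name() -> &'static str {
    imp::backend_name()
}
```

Only the module matching the compilation target is built, so Unix-only imports inside the `cfg(unix)` module never reach the Windows compiler.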

Signed-off-by: lstocchi <lstocchi@redhat.com>

Make the polly event manager compile and pass tests on Windows by
replacing Unix-specific types and APIs with platform-gated equivalents

Signed-off-by: lstocchi <lstocchi@redhat.com>