wayland: treat flush WouldBlock as recoverable back-pressure #4564

Closed
johannesCmayer wants to merge 1 commit into rust-windowing:master from johannesCmayer:wayland-flush-eagain

Conversation

@johannesCmayer

Summary

Fix a long-standing wayland event-loop bug where any error from
connection.flush() is treated as fatal. WaylandError::Io(WouldBlock)
returned from EAGAIN on the wayland socket is recoverable
back-pressure, not a protocol error, and should be retried on the next
loop iteration rather than calling set_exit_code(1).

This matches the pattern already used by calloop-wayland-source's
flush_queue and is consistent with libwayland's documented contract for
wl_display_flush.
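
The distinction the summary draws can be sketched as a small classification step. This is a minimal illustration, not the code from this PR: `FlushError`, `FlushOutcome`, and `classify_flush` are hypothetical names, and `FlushError` is a local stand-in that only models the `Io(..)` case named above plus one non-I/O case.

```rust
use std::io;

/// Hypothetical stand-in for the error returned by a flush call.
/// Only the `Io` case is taken from the PR text; `Protocol` is an
/// assumed placeholder for "any other failure".
#[derive(Debug)]
enum FlushError {
    Io(io::Error),
    Protocol(String),
}

/// What the event loop should do after a flush attempt (hypothetical).
#[derive(Debug, PartialEq)]
enum FlushOutcome {
    /// Everything was written to the socket.
    Flushed,
    /// The socket send buffer is full; try again on the next iteration.
    RetryNextIteration,
    /// A real failure; the loop should stop.
    Fatal,
}

/// Treat `WouldBlock` as back-pressure and every other error as fatal.
fn classify_flush(result: Result<(), FlushError>) -> FlushOutcome {
    match result {
        Ok(()) => FlushOutcome::Flushed,
        Err(FlushError::Io(err)) if err.kind() == io::ErrorKind::WouldBlock => {
            FlushOutcome::RetryNextIteration
        }
        Err(_) => FlushOutcome::Fatal,
    }
}
```

Under these assumptions, only the `Fatal` outcome would lead to `set_exit_code(1)`; `RetryNextIteration` leaves the pending data queued for the next pass.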

Reproducer

A client that issues many wayland requests in a tight burst — e.g.
~4000 xdg_toplevel.set_title / set_size calls within 60 ms from
multiple threads — saturates the kernel SO_SNDBUF (~208 KiB by default).
The next wl_display.flush() returns WaylandError::Io(WouldBlock)
and the event loop currently exits with code 1.

After this fix, the same workload completes cleanly: the next loop
iteration's poll(POLLOUT) drains the buffer and the next flush
succeeds.
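
The back-pressure behaviour itself can be observed without a compositor. The sketch below is not the stress workload from this PR: it uses a plain non-blocking `UnixStream` pair from the standard library as a stand-in for the Wayland socket, and the helper names are hypothetical. It fills the send side until the kernel reports `WouldBlock`, drains the peer (playing the role of the compositor reading requests), and then retries the write.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;

/// Write fixed-size chunks until the socket reports back-pressure.
/// Returns the number of bytes accepted before `WouldBlock`.
fn fill_until_would_block(tx: &mut UnixStream) -> io::Result<usize> {
    let chunk = [0u8; 4096];
    let mut written = 0;
    loop {
        match tx.write(&chunk) {
            Ok(n) => written += n,
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => return Ok(written),
            Err(e) => return Err(e),
        }
    }
}

/// Read everything currently queued on the peer side.
/// Returns the number of bytes drained.
fn drain(rx: &mut UnixStream) -> io::Result<usize> {
    let mut buf = [0u8; 65536];
    let mut total = 0;
    loop {
        match rx.read(&mut buf) {
            Ok(0) => return Ok(total), // peer closed
            Ok(n) => total += n,
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => return Ok(total),
            Err(e) => return Err(e),
        }
    }
}

/// Saturate the socket, let the peer drain it, then retry a small write.
/// Returns (bytes queued, bytes drained, bytes written on retry).
fn saturate_then_recover() -> io::Result<(usize, usize, usize)> {
    let (mut tx, mut rx) = UnixStream::pair()?;
    tx.set_nonblocking(true)?;
    rx.set_nonblocking(true)?;

    let queued = fill_until_would_block(&mut tx)?;
    let drained = drain(&mut rx)?;
    // With the buffer emptied, the write that would have failed goes through.
    let retried = tx.write(b"retry")?;
    Ok((queued, drained, retried))
}
```

The point of the sketch is that `WouldBlock` says nothing about the health of the connection: once the reader catches up, the same write goes through. A real event loop would wait for writability with poll(POLLOUT) rather than draining the peer itself.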

Test plan

  • Applied to a downstream embedder (winit consumer); the same
    stress workload that previously killed the loop in ~80 ms now
    runs 25,000 requests in under 1 s without errors.

Notes

  • Verified with cargo +nightly fmt --check (nightly rustfmt 1.9.0):
    the added code is fmt-clean. There is some pre-existing import-order
    drift in upstream master unrelated to this fix; not touched here to
    keep the PR focused.

Checklist (winit PR template)

  • Tested on Wayland (only changes Wayland code path)
  • Added an entry to the changelog module
  • Documentation updated — N/A, no API surface change
  • Created or updated an example program — N/A, this is an internal
    bug fix

When the kernel socket send buffer (`SO_SNDBUF`) saturates under client
write bursts, `wl_display.flush()` returns `WaylandError::Io(WouldBlock)`.
libwayland documents this as recoverable: poll(POLLOUT) on the display
fd and retry on the next iteration.

Previously every flush error — including this transient back-pressure
case — was treated as fatal and called `set_exit_code(1)`, killing the
event loop on inputs that should have been absorbed naturally.

Match on the flush result, ignore `Io(WouldBlock)`, propagate other
errors. Mirrors the pattern in Smithay/calloop-wayland-source's
flush_queue, which already handles this distinction correctly.

Reproduces with high client→server request bursts (e.g. ~4000
xdg_toplevel.set_title/set_size requests in 60ms from multiple threads).
After the fix, the same workload succeeds without exiting the loop.
@johannesCmayer
Author

The two failing checks (Test stable Linux 32bit and Test stable Linux 64bit) are unrelated to this PR. They fail on a clippy iter_kv_map lint in winit-x11/src/event_processor.rs:1760, which this PR doesn't touch. The same failures appear on other current PRs against master (#4561, #4563), so master itself looks broken by a recent clippy update with stricter lints rather than by anything I introduced.

All Wayland-specific checks (stable / 1.85 / nightly) pass.

@kchibisov
Member

Well, if you submit AI generated code, please at least state so.

Also, I'll let you know that there's another way to fix that.

@kchibisov closed this on Apr 28, 2026

2 participants