fix(gateway): add accept-loop backoff to prevent CPU spin #853

Open
manelsen wants to merge 3 commits into nullclaw:main from manelsen:fix/issue-851-accept-backoff

@manelsen
Contributor

Fixes #851

Summary

EN:

  • Hardened the gateway accept loop to apply bounded backoff on non-WouldBlock accept errors, preventing tight CPU spin under repeated transient failures.
  • Added explicit backoff helpers/constants (ACCEPT_ERROR_BACKOFF_MAX_MS, log interval control) so retry behavior is deterministic and observable.
  • Added regression coverage in src/gateway.zig for the backoff progression and cap behavior.
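
The backoff progression and cap described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code in src/gateway.zig: the PR does not state the growth rule or the concrete values, so the doubling step, the 10 ms starting delay, and the 1000 ms cap are all assumptions. `ACCEPT_ERROR_BACKOFF_INITIAL_MS` and both function names are hypothetical; only the name `ACCEPT_ERROR_BACKOFF_MAX_MS` comes from the PR.

```python
# Assumed values for illustration; the PR does not publish the real ones.
ACCEPT_ERROR_BACKOFF_INITIAL_MS = 10   # hypothetical constant
ACCEPT_ERROR_BACKOFF_MAX_MS = 1000     # name from the PR, value assumed


def next_accept_backoff_ms(current_ms):
    """Return the delay to apply after one more consecutive accept error.

    Starts at the initial delay, doubles on each further failure
    (doubling is an assumption), and never exceeds the cap.
    """
    if current_ms <= 0:
        return ACCEPT_ERROR_BACKOFF_INITIAL_MS
    return min(current_ms * 2, ACCEPT_ERROR_BACKOFF_MAX_MS)


def backoff_progression(failures):
    """Return the sequence of delays for a run of consecutive failures."""
    delays = []
    delay = 0
    for _ in range(failures):
        delay = next_accept_backoff_ms(delay)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    print(backoff_progression(9))
```

A regression test in this style would check two things: that each delay is no smaller than the previous one, and that the sequence stays at the cap once it reaches it. Resetting the delay to zero after a successful accept keeps the normal connection path unaffected.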

Validation

  • zig build test --summary all

Notes

  • This change is intentionally scoped to gateway accept-loop error handling and does not alter successful connection flow.

@rijuyuezhu

I hit the same problem on Arch Linux with the latest dev version of nullclaw.

@mark-os
Contributor

mark-os commented Apr 20, 2026

I posted a root-cause analysis on #851: the WouldBlock branch is unreachable on Zig 0.16 because Io/Threaded.zig maps EAGAIN to error.Unexpected via errnoBug(). This PR's backoff on the else branch is the right short-term fix, but the compat shim should probably remap Unexpected back to WouldBlock when the socket is known to be non-blocking.
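
The remapping idea in this comment can be illustrated with a small classifier. This is a Python analogy under stated assumptions, not the Zig compat shim itself: the function name and the two return labels are hypothetical, and the real fix would operate on Zig error values rather than on `errno` codes.

```python
import errno


def classify_accept_error(err, socket_is_nonblocking):
    """Decide how an accept loop should react to a failed accept call.

    Returns "would_block" when the error only means that no connection is
    pending on a non-blocking socket, so the loop should go back to waiting
    for readiness. Returns "backoff" for every other error, so the loop
    sleeps for a bounded delay instead of retrying immediately.
    """
    no_pending_connection = err.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
    if no_pending_connection and socket_is_nonblocking:
        return "would_block"
    return "backoff"


if __name__ == "__main__":
    pending = OSError(errno.EAGAIN, "no pending connection")
    print(classify_accept_error(pending, socket_is_nonblocking=True))
```

The socket-mode check matters because EAGAIN on a socket believed to be blocking is unexpected, so that case is deliberately left on the backoff path rather than treated as routine.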

Development

Successfully merging this pull request may close these issues.

gateway: Busy-loop on accept4() returning EAGAIN pegs CPU core

3 participants