fix: schedule discord heartbeat checks after sends #78087
byungskers
left a comment
Good catch on the heartbeat timing! Switching from "setInterval" to "setTimeout" makes the ACK check accurately measure one full interval after the actual heartbeat send, which aligns better with Discord Gateway expectations. The fake-timer tests clearly demonstrate the fix.
I'm the original reporter of #77668, happy to provide real-behavior proof to help unblock the
Deterministic repro signature on this host
Configuration that rules out other contributing factors
Offer
If this PR lands, I will:
That should give you the empirical data needed to clear the
Thanks @bryce-d-greybeard for digging into this, same heartbeat-lifecycle race that @wena369 / @mfbergmann / @holgergruenhagen have been narrowing down on the issue thread.
7de7eb7 to bf239b8
Landed via squash onto
Thanks @bryce-d-greybeard and @NikolaFC.
Summary
Fixes the Discord gateway heartbeat scheduler so ACK timeout checks are measured from the actual heartbeat send time, not from the HELLO-time fixed interval.
The previous scheduler randomized the first heartbeat but started a fixed interval immediately. If the first heartbeat fired late in that interval, or the event loop was delayed, the next interval tick could check lastHeartbeatAck too soon after the send and trigger a false "Gateway heartbeat ACK timeout"/reconnect cycle while the Discord channel was still awaiting readiness.
Changes
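To make the timing argument in the summary concrete, here is a minimal sketch of scheduling each ACK check one full interval after the heartbeat it covers. It is not the code from gateway-lifecycle.ts: the names `SetTimer`, `realTimer`, `startHeartbeat`, and `firstCheckGapWithFixedInterval` are hypothetical, and the exact shape of the real scheduler is an assumption.

```typescript
// Illustrative sketch only; all names here are hypothetical.

/** Schedules fn after ms milliseconds and returns a cancel function. */
type SetTimer = (fn: () => void, ms: number) => () => void;

/** Default timer built on the standard setTimeout/clearTimeout. */
const realTimer: SetTimer = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

/**
 * Old behavior as described in the summary: a fixed interval starts at
 * HELLO, while the first heartbeat is sent at intervalMs * jitter.
 * Returns the gap (ms) between that first send and the first interval tick.
 * For intervalMs = 1000 and jitter = 0.75 the gap is only 250 ms.
 */
function firstCheckGapWithFixedInterval(intervalMs: number, jitter: number): number {
  const firstSendAt = intervalMs * jitter;
  const firstTickAt = intervalMs;
  return firstTickAt - firstSendAt;
}

interface HeartbeatOptions {
  intervalMs: number;
  jitter: number; // in [0, 1): randomizes only the first heartbeat
  send: () => void;
  onAckTimeout: () => void;
  setTimer?: SetTimer; // injectable so tests can use a fake clock
}

/**
 * Recursive-timeout scheduler: every ACK check happens a full intervalMs
 * after the previous send, never earlier.
 */
function startHeartbeat(opts: HeartbeatOptions): { ack(): void; stop(): void } {
  const setTimer = opts.setTimer ?? realTimer;
  let acked = true; // nothing is outstanding before the first send
  let cancel: (() => void) | undefined;

  const beat = (): void => {
    if (!acked) {
      // The previous heartbeat had a full interval to be acknowledged.
      opts.onAckTimeout();
      return;
    }
    acked = false;
    opts.send();
    // Schedule the next check relative to this send, not to HELLO.
    cancel = setTimer(beat, opts.intervalMs);
  };

  cancel = setTimer(beat, opts.intervalMs * opts.jitter);

  return {
    ack: () => { acked = true; },
    stop: () => { cancel?.(); },
  };
}
```

With a fixed interval the first check can land a fraction of an interval after the first send. In the sketch above the first check always comes `intervalMs` after the send. Injecting `setTimer` is a design choice for this sketch so the timing can be exercised with a fake clock.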
Real Behavior Proof
Behavior or issue addressed: Discord gateway heartbeat ACK timeout race causing false reconnect loops and intermittent "awaiting gateway readiness" hangs (#77668).
Real environment tested: OpenClaw 2026.5.5 from commit 43dcdcd9 on WSL2 Ubuntu 22.04, Node.js v22.22.0, Discord bot runtime.
Exact steps or command run after this patch:
pnpm build in the OpenClaw source checkout.
@openclaw/discord runtime bundle with the recursive timeout heartbeat logic.
systemctl --user restart openclaw-gateway.
journalctl --user -u openclaw-gateway -f for 2+ minutes.
Evidence after fix: Runtime log excerpt from the real Discord gateway run:
Observed result after fix: Zero "Gateway heartbeat ACK timeout" entries; Discord reached READY and stayed connected with repeated heartbeat ACKs instead of reconnecting.
What was not tested: Long-running 24h+ stability, multi-guild load, and Windows native runtime were not covered by this after-fix run. The original #77668 reporter offered to rerun 5 macOS launchd restart cycles once this lands.
Testing
pnpm exec oxfmt --write --threads=1 CHANGELOG.md extensions/discord/src/internal/gateway-lifecycle.ts extensions/discord/src/internal/gateway-lifecycle.test.ts extensions/discord/src/internal/gateway.test.ts: passed.
pnpm test extensions/discord/src/internal/gateway-lifecycle.test.ts extensions/discord/src/internal/gateway.test.ts: passed, 2 files / 24 tests.
pnpm exec oxfmt --check --threads=1 CHANGELOG.md extensions/discord/src/internal/gateway-lifecycle.ts extensions/discord/src/internal/gateway-lifecycle.test.ts extensions/discord/src/internal/gateway.test.ts: passed.
git diff --check: passed.
Fixes #77668
Supersedes #77956