fix(plugins): add default timeout for before_compaction/after_compaction hooks #84153
Codex review: passed. Workflow note: Future ClawSweeper reviews update this same comment in place.
Review details
Best possible solution: Land the central hook-runner default if maintainers accept 30 seconds as the lifecycle hook budget, while preserving per-hook …
Do we have a high-confidence way to reproduce the issue? Yes, from source inspection: current main has no defaults for these two void hooks, and awaited …
Is this the best way to solve the issue? Yes, the central default table is the narrowest maintainable place to bound all callers while preserving the existing per-registration override. The only open solution-fit decision is whether maintainers accept 30 seconds as the fail-open default budget.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 3bc728eaa993.
@clawsweeper automerge
jalehman
left a comment
Reviewed locally with the maintainer review workflow. No actionable findings from me; approving.
8297a29 to e85c9d6
76cf75d to 1a046cc
…efault timeout

DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK only listed agent_end, so the before_compaction and after_compaction void hooks ran fully unbounded when a plugin supplied no hook.timeoutMs. In the codex agent harness these hooks fire on the serialized notification queue, so a slow or hung handler froze processing of every later codex notification (including turn/completed), hanging the whole agent turn.

Add defensive default timeout entries for both hooks, mirroring the existing agent_end pattern. The budget matches agent_end's 30s rather than the tighter modifying-hook defaults because compaction hooks can legitimately do real work (e.g. a memory flush). The runner is fail-open for void hooks, so a timed-out handler is logged and compaction proceeds.
1a046cc to 41fa5fe
Merged via squash.
Thanks @100yenadmin!
Bug
DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK in src/plugins/hooks.ts listed only agent_end. The before_compaction and after_compaction plugin hooks are void hooks (runBeforeCompaction / runAfterCompaction → runVoidHook(...)), and runVoidHook applies a timeout only when getVoidHookTimeoutMs returns a value. With no table entry and no plugin-supplied hook.timeoutMs, both hooks ran fully unbounded.
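The lookup gap described above can be illustrated with a minimal sketch. The types, the table contents other than the agent_end entry, and the signature of getVoidHookTimeoutMs here are simplified assumptions for illustration; the real definitions in src/plugins/hooks.ts may differ.

```typescript
// Hypothetical, simplified stand-ins; not the actual code in src/plugins/hooks.ts.
type VoidHookName = "agent_end" | "before_compaction" | "after_compaction";

interface HookRegistration {
  pluginId: string;
  timeoutMs?: number; // optional per-registration override
}

// State of the table before the fix, per the description: only agent_end has a default.
const DEFAULTS_BEFORE_FIX: Partial<Record<VoidHookName, number>> = {
  agent_end: 30_000,
};

// Assumed lookup order: explicit override first, then the per-hook default.
function getVoidHookTimeoutMs(
  defaults: Partial<Record<VoidHookName, number>>,
  hookName: VoidHookName,
  hook: HookRegistration,
): number | undefined {
  return hook.timeoutMs ?? defaults[hookName];
}

const plain: HookRegistration = { pluginId: "example-plugin" };

// agent_end resolves to its default; before_compaction resolves to nothing,
// so a runner that only applies a timeout when a value exists leaves it unbounded.
console.log(getVoidHookTimeoutMs(DEFAULTS_BEFORE_FIX, "agent_end", plain)); // 30000
console.log(getVoidHookTimeoutMs(DEFAULTS_BEFORE_FIX, "before_compaction", plain)); // undefined
```

Under these assumptions, adding the two missing keys to the table is enough to make every caller bounded without touching any call site.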
In the codex agent harness these hooks fire on the serialized notification queue: handleItemStarted in extensions/codex/src/app-server/event-projector.ts awaits runAgentHarnessBeforeCompactionHook (and the matching after_compaction call site) for a contextCompaction item. A slow or hung compaction hook therefore freezes processing of every later codex notification, including turn/completed, so the whole agent turn hangs.
Fix
Add before_compaction and after_compaction entries to DEFAULT_VOID_HOOK_TIMEOUT_MS_BY_HOOK, mirroring the defensive-default pattern already applied to agent_end. The budget is 30s, matching agent_end (the closest like-for-like precedent, a void lifecycle hook) rather than the tighter 15s modifying-hook defaults, because compaction hooks can legitimately do real work such as a memory flush. The runner is fail-open for void hooks, so a timed-out handler is logged and compaction proceeds without it.
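To show why a bounded, fail-open hook unblocks a serialized queue, here is a scaled-down model. Every name in it (withTimeout, runVoidHookFailOpen, processQueue) is hypothetical, the log format is invented, and 50ms stands in for the 30s default; it is a sketch of the general technique, not the OpenClaw runner.

```typescript
// Hypothetical names throughout; a scaled-down model, not the OpenClaw implementation.
const log: string[] = [];

// Race a handler against a timer; reject with a timeout error if the timer wins.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Fail-open void hook: a failure or timeout is logged and the caller continues.
async function runVoidHookFailOpen(
  hookName: string,
  handler: () => Promise<void>,
  timeoutMs: number,
): Promise<void> {
  try {
    await withTimeout(handler(), timeoutMs);
  } catch (err) {
    log.push(`${hookName} handler failed: ${(err as Error).message}`);
  }
}

// Serialized queue: each notification is fully handled before the next one starts.
async function processQueue(notifications: string[], hookTimeoutMs: number): Promise<void> {
  const hungHandler = () => new Promise<void>(() => {}); // never settles
  for (const n of notifications) {
    if (n === "contextCompaction") {
      await runVoidHookFailOpen("before_compaction", hungHandler, hookTimeoutMs);
    }
    log.push(`processed ${n}`);
  }
}

// 50ms stands in for the 30s default so the demo finishes quickly.
const done = processQueue(["contextCompaction", "turn/completed"], 50);
done.then(() => console.log(log));
```

Without the timeout, the await inside the loop would never resolve and turn/completed would never be reached; with it, the hung handler costs at most one timeout budget.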
A plugin can still override the default per-registration via hook.timeoutMs.
Related
Related to #84077, the same compaction-stall investigation. That issue covers the host's missing safety timeout on plugin-owned compact(); this is a separate, independent mechanism (unbounded void hooks freezing the codex notification queue), hence "Related" rather than "Fixes".
Test plan
- pnpm tsgo:core and pnpm tsgo:core:test: typecheck passes
- oxlint on changed files: 0 warnings / 0 errors
- src/plugins/hooks.compaction-timeout.test.ts:
  - before_compaction handler is bounded by the default timeout (no override) and logged, rather than hanging
  - after_compaction
  - before_compaction handler completes without a false-positive timeout
- vitest (plugins project): new tests + hooks.correlation.test.ts + wired-hooks-compaction.test.ts: 15/15 pass
Real behavior proof
Behavior addressed: hung plugin before_compaction and after_compaction handlers no longer stall compaction forever. The hook runner now applies the default fail-open timeout when no per-hook override is configured.
Real environment tested: local OpenClaw checkout /Volumes/LEXAR/repos/openclaw-fix-compaction-hook-timeout, branch fix/compaction-hook-default-timeout, Node via node --import tsx, actual src/plugins/hooks.ts plugin hook runner.
Exact steps or command run after the patch: ran a live script that registers a plugin whose before_compaction and after_compaction hooks never settle, then invokes runHookTimeout for both hooks without per-hook timeout overrides.
Evidence after fix:
    {
      "branch": "fix/compaction-hook-default-timeout",
      "behavior": "hung before_compaction and after_compaction hooks return via default fail-open timeout",
      "elapsedMs": 30033,
      "eventCount": 2,
      "events": [
        "[hooks] before_compaction handler from live-proof-plugin failed: timed out after 30000ms",
        "[hooks] after_compaction handler from live-proof-plugin failed: timed out after 30000ms"
      ]
    }
Observed result after fix: both hung hook handlers returned through the default timeout path in about 30 seconds each, emitted fail-open timeout events, and did not block the caller indefinitely.
What was not tested: a real external plugin process wedged in production; the proof uses the in-process hook runner with intentionally never-settling handlers.
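A miniature of this kind of proof can be sketched as follows. It is not the script that produced the evidence above: runBounded and collectEvidence are hypothetical names, the 40ms budget stands in for the 30000ms default, and running the handlers concurrently is a simplification chosen for brevity. It also exercises the fast-handler case from the test plan, which should add no timeout event.

```typescript
// Hypothetical miniature of a proof script; names and the 40ms budget are assumptions.
type Handler = () => Promise<void>;

const DEMO_TIMEOUT_MS = 40; // stands in for the 30000ms default

async function runBounded(hookName: string, handler: Handler, events: string[]): Promise<void> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${DEMO_TIMEOUT_MS}ms`)),
      DEMO_TIMEOUT_MS,
    );
  });
  try {
    await Promise.race([handler(), timeout]);
  } catch (err) {
    // Fail-open: record the failure and return normally.
    events.push(`[hooks] ${hookName} handler failed: ${(err as Error).message}`);
  } finally {
    clearTimeout(timer);
  }
}

async function collectEvidence() {
  const events: string[] = [];
  const neverSettles: Handler = () => new Promise<void>(() => {});
  const fast: Handler = async () => {};
  const startedAt = Date.now();
  await Promise.all([
    runBounded("before_compaction", neverSettles, events),
    runBounded("after_compaction", neverSettles, events),
    runBounded("before_compaction", fast, events), // fast handler: no event expected
  ]);
  return { elapsedMs: Date.now() - startedAt, eventCount: events.length, events };
}

const evidence = collectEvidence();
evidence.then((e) => console.log(JSON.stringify(e, null, 2)));
```

The resulting object has the same shape as the evidence record (elapsed time, event count, event messages), which makes it easy to compare a scaled-down run against a full 30-second one.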