fix(acpx): await startup probe before gateway ready #80187
Codex review: needs real behavior proof before merge.

Summary

Reproducibility: yes, from source plus issue logs. Current main awaits plugin service startup before gateway ready, but ACPX currently launches its probe in a detached async block and can resolve startup before the probe reports ready or failed. I did not live-reproduce a Windows gateway boot in this read-only review.

Review details

Best possible solution: Land the awaited-probe fix once maintainers accept the default-behavior change and add redacted cold-start proof; keep broader sidecar parallelization tracked separately in #79625.

Do we have a high-confidence way to reproduce the issue? Yes, from source plus issue logs, as described under Reproducibility above.

Is this the best way to solve the issue? Yes, with maintainer approval for the default change: moving the ACPX probe into the awaited service-start path is the narrow owner-boundary fix for the reported readiness race while preserving explicit lazy-start opt-outs.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 0a17339a7064.
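The readiness race described in the review can be illustrated with a small, self-contained sketch. It is not the acpx code: `runProbe`, `startDetached`, `startAwaited`, and the log labels are hypothetical stand-ins, and the probe is simulated with a timer. The point is only the ordering: a detached probe lets startup resolve first, while an awaited probe forces startup to wait.

```typescript
// Hypothetical stand-ins; none of these names come from the acpx extension.
type ProbeResult = "ready" | "failed";

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Simulated probe that takes a while before it reports.
async function runProbe(log: string[]): Promise<ProbeResult> {
  await sleep(20);
  log.push("probe:ready");
  return "ready";
}

// Detached: the probe is launched, but start() resolves without waiting for it.
async function startDetached(log: string[]): Promise<void> {
  void (async () => {
    await runProbe(log);
  })();
  log.push("start:resolved");
}

// Awaited: start() resolves only after the probe has reported.
async function startAwaited(log: string[]): Promise<void> {
  await runProbe(log);
  log.push("start:resolved");
}

async function demo(): Promise<{ detached: string[]; awaited: string[] }> {
  const detached: string[] = [];
  await startDetached(detached);
  detached.push("gateway:ready");
  await sleep(40); // let the detached probe finish so the log is complete

  const awaited: string[] = [];
  await startAwaited(awaited);
  awaited.push("gateway:ready");

  return { detached, awaited };
}
```

In the detached variant the log records `gateway:ready` before `probe:ready`; in the awaited variant the probe entry comes first.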
Summary
- `OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE=0` opts back into lazy startup.
- Await the startup probe in `service.start()`, so Gateway sidecar readiness waits until acpx is usable or has reported a bounded probe failure.
- `OPENCLAW_SKIP_ACPX_RUNTIME_PROBE=1` fast path for E2E scripts and update docs/changelog for the new default.

Closes #79596.
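As a rough illustration of the behavior summarized above, the following sketch combines the two environment switches with a time-bounded probe. It assumes semantics inferred only from this summary (`=0` disables the startup probe, `=1` skips it); `shouldProbeAtStartup`, `boundedProbe`, `start`, and the `ProbeOutcome` type are hypothetical names, not the extension's actual API.

```typescript
type ProbeOutcome =
  | { status: "ready" }
  | { status: "failed"; reason: string }
  | { status: "skipped" };

type Env = Record<string, string | undefined>;

// Assumed semantics, taken from the summary's wording only:
//   OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE=0 -> lazy startup, no probe during start
//   OPENCLAW_SKIP_ACPX_RUNTIME_PROBE=1    -> skip the probe (E2E fast path)
function shouldProbeAtStartup(env: Env): boolean {
  if (env.OPENCLAW_SKIP_ACPX_RUNTIME_PROBE === "1") return false;
  if (env.OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE === "0") return false;
  return true;
}

// Bound the probe so a hung agent yields a failure instead of blocking startup.
async function boundedProbe(
  probe: () => Promise<void>,
  timeoutMs: number,
): Promise<ProbeOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<ProbeOutcome>((resolve) => {
    timer = setTimeout(
      () => resolve({ status: "failed", reason: `probe timed out after ${timeoutMs}ms` }),
      timeoutMs,
    );
  });
  const attempt = probe().then(
    (): ProbeOutcome => ({ status: "ready" }),
    (err: unknown): ProbeOutcome => ({ status: "failed", reason: String(err) }),
  );
  try {
    // Whichever settles first decides the outcome.
    return await Promise.race([attempt, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// start() resolves only once the probe is ready, has failed, or was skipped.
async function start(
  env: Env,
  probe: () => Promise<void>,
  timeoutMs: number,
): Promise<ProbeOutcome> {
  if (!shouldProbeAtStartup(env)) return { status: "skipped" };
  return boundedProbe(probe, timeoutMs);
}
```

Because the timeout resolves to a failure outcome rather than rejecting, a caller that awaits `start()` can still proceed to ready after a bounded probe failure.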
Real behavior proof
- `OPENCLAW_HOME`, `OPENCLAW_STATE_DIR`, and `OPENCLAW_CONFIG_PATH` under `/tmp/openclaw-acpx-proof.7B2Plt`.
- `plugins.entries.acpx.enabled=true`, `plugins.entries.acpx.config.timeoutSeconds=0.001`, and `plugins.entries.acpx.config.agents.codex.command="node"` with `args=["-e", "setInterval(() => {}, 1000)"]`, then ran `OPENCLAW_HOME=/tmp/openclaw-acpx-proof.7B2Plt/home OPENCLAW_STATE_DIR=/tmp/openclaw-acpx-proof.7B2Plt/state OPENCLAW_CONFIG_PATH=/tmp/openclaw-acpx-proof.7B2Plt/state/openclaw.json pnpm openclaw gateway run --auth none --bind loopback --port 19876 --verbose --allow-unconfigured`.
- `ready` instead of hanging.

Verification
- `pnpm exec oxfmt --check --threads=1 extensions/acpx/register.runtime.ts extensions/acpx/src/service.ts extensions/acpx/src/service.test.ts docs/tools/acp-agents-setup.md CHANGELOG.md`
- `pnpm test extensions/acpx/src/service.test.ts`
- `env -u OPENCLAW_TESTBOX -u OPENCLAW_TESTBOX_ID pnpm check:changed`
- `pnpm build`

Additional suite check:
`pnpm test extensions/acpx` still fails in `extensions/acpx/src/claude-agent-acp-completion.test.ts:178` (`resolved` becomes `true` after idle). This file is not touched by this PR, and the focused ACPX startup test passes.
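For readers reproducing the setup under Real behavior proof, the dotted config keys listed there can be expanded into a nested object as sketched below. The nesting is inferred from the key paths alone, and `args` is assumed to sit next to `command`; the real `openclaw.json` schema may differ. `setPath` and `buildProofConfig` are hypothetical helpers written for this illustration.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// Set a dotted key path on a plain object, creating intermediate objects as needed.
function setPath(root: JsonObject, path: string, value: Json): void {
  const keys = path.split(".");
  const last = keys.pop() as string;
  let node = root;
  for (const key of keys) {
    const next = node[key];
    if (typeof next !== "object" || next === null || Array.isArray(next)) {
      node[key] = {};
    }
    node = node[key] as JsonObject;
  }
  node[last] = value;
}

// Values copied from the proof description; the structure is an assumption.
function buildProofConfig(): JsonObject {
  const config: JsonObject = {};
  setPath(config, "plugins.entries.acpx.enabled", true);
  setPath(config, "plugins.entries.acpx.config.timeoutSeconds", 0.001);
  setPath(config, "plugins.entries.acpx.config.agents.codex.command", "node");
  setPath(config, "plugins.entries.acpx.config.agents.codex.args", [
    "-e",
    "setInterval(() => {}, 1000)",
  ]);
  return config;
}

// Print the object in the form it would take when written to a JSON file.
console.log(JSON.stringify(buildProofConfig(), null, 2));
```

The agent command here only keeps a Node process alive, and the timeout is very short, so the proof exercises the bounded probe failure rather than a successful probe.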