[#579-1] Add waitForAgentChattrReady after every AC spawn #580

@realproject7

Context

Follow-up from #579 (Symptom 1 & 2). On fresh installs where AC takes several seconds to bind port 8300, the health monitor's first 30s tick sees port down → triggers auto-restart → causes ghost agent cascade. Existing installs are unaffected because AC starts in ~2s — but the fix is needed for slow-start environments.

Problem

Both bin/quadwork.js (cmdStart) and server/index.js (spawnChattr) declare AC success the moment spawn() returns a PID. AC takes several seconds to actually bind 8300 (templates load, MCP servers start, FastAPI lifespan). The health monitor and dashboard history-snapshot fetch check before that's done → 502 + false-down → unnecessary auto-restart cascade.

Fix

Add await waitForAgentChattrReady(port, 30000) after every AC spawn, before declaring success. The function already exists in server/index.js — reuse it.
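
A minimal sketch of the gating described above, not the actual patch. `stateAfterSpawn` is a hypothetical helper, and `waitReady` stands in for `waitForAgentChattrReady`; this issue does not show that function's return value, so the sketch assumes it resolves to a truthy value when the port is up and a falsy value (or rejects) on timeout.

```javascript
// Hypothetical helper: decide which state to record after spawning AC.
// ASSUMPTION: waitReady(port, timeoutMs) resolves truthy when AC is
// reachable and falsy (or rejects) when the timeout elapses.
async function stateAfterSpawn(child, port, waitReady, timeoutMs = 30000) {
  // A missing PID means the spawn itself failed; nothing to wait for.
  if (!child || !child.pid) return "error";

  let ready = false;
  try {
    ready = Boolean(await waitReady(port, timeoutMs));
  } catch {
    // Treat a rejected readiness check the same as a timeout.
    ready = false;
  }

  // Only report "running" once the port actually answers.
  return ready ? "running" : "error";
}
```

A call site could then record `state: await stateAfterSpawn(child, chattrPort, waitForAgentChattrReady)` instead of hard-coding `"running"`.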

Call sites to patch

  1. server/index.js, spawnChattr() function (around the setProc({ process: child, state: "running" }) line)

    • After confirming child.pid is valid, before setting state to "running"
    • await waitForAgentChattrReady(chattrPort, 30000)
    • If not ready after 30s, set state to "error" instead of "running"
  2. bin/quadwork.js, cmdStart() AC spawn loop (around the ok("AgentChattr started for ...") line)

    • After confirming acProc.pid is valid
    • Import or inline a simple port-check loop (bin/ doesn't have access to waitForAgentChattrReady from server/)
    • Can be a simple fetch poll: retry http://127.0.0.1:${port}/api/health every 2s for up to 30s
  3. server/routes.js, /api/agentchattr/:id/restart endpoint: see #579-3 ("Fresh 1.13.1 install: AC startup race → triple auto-reset → ghost agent identities (head-N, re1-N) + Restart button leaves AC dead") for this one
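
The fetch poll suggested for bin/quadwork.js could look roughly like the sketch below. `waitForHealth` is a hypothetical name, not an existing function in the repo. The sketch assumes Node 18+ (global `fetch` and `AbortSignal.timeout`) and assumes that any 2xx response from `/api/health` means AC is ready.

```javascript
// Hypothetical inline helper for bin/quadwork.js.
// Polls AC's health endpoint every `intervalMs` until it answers with a
// 2xx status or `timeoutMs` elapses. Resolves true when ready, false on timeout.
async function waitForHealth(port, timeoutMs = 30000, intervalMs = 2000) {
  const url = `http://127.0.0.1:${port}/api/health`;
  const deadline = Date.now() + timeoutMs;

  while (Date.now() < deadline) {
    try {
      // Cap each request so a hung connection cannot stall the loop.
      const res = await fetch(url, { signal: AbortSignal.timeout(intervalMs) });
      if (res.ok) return true;
    } catch {
      // Connection refused or request timed out: AC is not listening yet.
    }
    // Wait before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

In cmdStart(), the existing ok("AgentChattr started for ...") call would then run only when `await waitForHealth(port)` resolves to `true`. On a fast start the first attempt succeeds, so the loop adds no delay beyond AC's own bind time.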

Safety

When AC starts fast (existing installs, pinned commit), the wait resolves in 1-2s. No behavioral change. Only prevents false-down detection on slow starts.

Acceptance criteria

  • spawnChattr() waits for port before setting state to "running"
  • cmdStart() waits for port before printing success
  • Existing fast-start environments see no delay beyond AC's actual bind time
  • npm run build passes

Fixes part of #579

Labels: agent/dev (Assigned to Dev agent), bug (Something isn't working)
