Skip to content

Fix #274: Reorder Tower startup to prevent architect terminal loss#275

Merged
waleedkadous merged 3 commits intomainfrom
builder/bugfix-274-architect-terminal-should-surv
Feb 15, 2026
Merged

Fix #274: Reorder Tower startup to prevent architect terminal loss#275
waleedkadous merged 3 commits intomainfrom
builder/bugfix-274-architect-terminal-should-surv

Conversation

@waleedkadous
Copy link
Contributor

Summary

Architect terminals were lost on Tower restart due to a startup race condition between terminal reconciliation and API request processing. Reorders startup so reconciliation completes before the instance module enables request handling.

Fixes #274

Root Cause

In Tower's startup sequence, initInstances() was called before reconcileTerminalSessions(). This enabled dashboard polls (getInstances()getTerminalsForProject()) to arrive during reconciliation. Both getTerminalsForProject()'s on-the-fly reconnection and reconcileTerminalSessions() would attempt to connect to the same shellper socket. The shellper's single-connection model (new connection replaces old) caused the first client to be disconnected, triggering removeDeadSession() which deleted the socket file and corrupted the session — permanently losing the architect terminal.

Builder terminals were unaffected because getInstances() explicitly skips /.builders/ paths, so their getTerminalsForProject() was never called during the race window.

Fix

Moved reconcileTerminalSessions() to run before initInstances() in tower-server.ts. During reconciliation, getInstances() returns [] (since _deps is null), preventing any on-the-fly reconnection from interfering.

Test Plan

  • Regression test added (bugfix-274-architect-persistence.test.ts)
  • Build passes
  • All 1292 tests pass (13 skipped tunnel E2E — expected)

The architect terminal was lost on Tower restart due to a race condition
in the startup sequence. initInstances() was called before
reconcileTerminalSessions(), which allowed dashboard polls arriving
during reconciliation to trigger on-the-fly shellper reconnection via
getTerminalsForProject(). Both paths competed for the same shellper
socket — the shellper's single-connection model replaced the first
client, triggering removeDeadSession() which deleted the socket file
and corrupted the session. Builder terminals were unaffected because
getInstances() skips /.builders/ paths.

Fix: Move reconcileTerminalSessions() before initInstances() so
getInstances() returns [] during reconciliation (since _deps is null),
preventing any on-the-fly reconnection from interfering.
CMAP review (Codex) identified that /project/<path>/api/state calls
getTerminalsForProject() directly, bypassing the getInstances() guard.
A browser tab left open from before the restart could trigger on-the-fly
shellper reconnection during reconciliation via this path.

Fix: Add _reconciling flag set during reconcileTerminalSessions() that
blocks on-the-fly reconnection in getTerminalsForProject(). This closes
both race paths: getInstances (startup ordering) and direct API access
(_reconciling guard).
@waleedkadous waleedkadous merged commit 82567a3 into main Feb 15, 2026
7 checks passed
@waleedkadous waleedkadous deleted the builder/bugfix-274-architect-terminal-should-surv branch February 15, 2026 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Architect terminal should survive Tower restarts (shellper persistence)

1 participant