Skip to content

Conversation

@ammar-agent
Copy link
Collaborator

@ammar-agent ammar-agent commented Oct 27, 2025

Problem

SSH workspaces require asynchronous initialization (rsync, checkout, init hook) before tools can execute. Without proper synchronization, tools may race with initialization and fail with confusing errors.

Solution

Promise-based init waiting: Tools wait for workspace initialization to complete before executing. If init fails or times out (5 minutes), tools proceed anyway and fail naturally with their own errors (better UX than blocking).

Architecture

InitStateManager - Tracks init state with promise-based waiting:

  • startInit(workspaceId, hookPath) - Creates completion promise
  • endInit(workspaceId, exitCode) - Resolves promise (success or failure)
  • waitForInit(workspaceId) - Never throws. Returns immediately if no init needed, init already done, or timeout (5 min)

wrapWithInitWait() - Wrapper for runtime-dependent tools:

  • Wraps: bash, file_read, file_edit_insert, file_edit_replace_string
  • Does NOT wrap: propose_plan, todo_*, web_search (execute immediately)

Clean separation: initStateManager only passed to wrapper, not to individual tools. Each tool receives clean config without init-related fields.

Before/After

Before After
Tools execute immediately Runtime tools wait for init first
Race conditions on SSH Init completes before tools run
Tools called waitForInit() inline Centralized in wrapWithInitWait()
initStateManager in all tool configs Only in wrapper layer

Changes

Core:

  • src/services/initStateManager.ts (362 lines) - Promise-based state tracking
  • src/services/tools/wrapWithInitWait.ts (38 lines) - Init wait wrapper
  • src/utils/tools/tools.ts - Apply wrapper to runtime tools only

Tests:

  • tests/ipcMain/initWorkspace.test.ts - Extended integration tests (7 tests, all pass)
  • tests/ipcMain/helpers.ts - New helpers: collectInitEvents(), waitForInitEnd()
  • Deduplicated ~100 lines of test code

Integration Test Results

Verified behavior:

  1. Init hook output streams correctly (not batched)
  2. Tools wait for init before executing (both success and failure cases)
  3. Workspace remains usable after init failure (tools work, fail naturally)
  4. Init state persists across page reloads
  5. Works correctly for both local and SSH runtimes
✓ 7 integration tests pass (initWorkspace.test.ts)
✓ 831 unit tests pass
✓ Init wait adds ~5 seconds on first tool use (expected)
✓ Subsequent tools execute immediately (no redundant waiting)

Key Design Decision

Why waitForInit() never throws: Init hook is optional setup, not a prerequisite. If init fails or times out, tools should proceed and fail naturally with their own errors (e.g., "file not found"). This provides better error messages than blocking on init.

Generated with cmux

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Replace event-based waiting with promise-based state map in InitStateManager.
This eliminates race conditions between timeout and event firing, simplifies
cleanup, and handles workspace deletion correctly.

Changes:
- Add initPromises map to store completion promises for running inits
- startInit() creates promise, endInit() resolves it
- waitForInit() uses Promise.race() for clean timeout handling
- clearInMemoryState() rejects orphaned promises on workspace deletion
- No event listener cleanup needed (promises auto-resolve)

Extended integration tests:
- Verify tools wait for init completion before executing
- Send second message after init completes to verify state persistence
- Use failed init (exit code 1) to verify tools proceed regardless of status
- Both messages succeed, second is faster (no init wait)

Benefits:
- Multiple tools share same promise (no duplicate listeners)
- Promise.race() handles timeout cleanly (no manual cleanup)
- Workspace deletion properly cancels pending waits
- Simpler code, fewer edge cases, easier to reason about

Tests: All 7 integration tests pass (19s), 111 tool tests pass (3.5s)
- waitForInit() now never throws - tools proceed regardless of init outcome
- If init fails or times out, tools will fail naturally with their own errors
- Simplified toolHelpers wrapper (no longer needs error handling)
- Updated all 4 tools to use simplified helper
- Added test helpers: collectInitEvents() and waitForInitEnd()
- Refactored all 6 tests in initWorkspace.test.ts to use helpers
- Reduced test duplication by ~100 lines
- Updated InitStateManager test to reflect new non-throwing behavior
- Created wrapWithInitWait() wrapper to centralize init waiting
- Removed waitForWorkspaceInit() calls from bash, file_read, file_edit_insert, file_edit_operation
- Deleted toolHelpers.ts (no longer needed)
- Updated tools.ts to wrap runtime tools (bash, file_read, file_edit_*)
- Non-runtime tools (propose_plan, todo, web_search) unchanged
- Clearer separation: explicit list of which tools wait for init
- Net: -36 lines of code, better architecture
@ammario ammario changed the title 🤖 Eliminate race conditions in init state with promise-based waiting 🤖 Suspend tool calls until workspace init completes Oct 27, 2025
Remove initStateManager from ToolConfiguration interface - it's only needed
by wrapWithInitWait, not by individual tools.

Changes:
- Remove workspaceId and initStateManager from ToolConfiguration
- Update wrapWithInitWait to accept these as separate parameters
- Update getToolsForModel signature to pass them separately
- Update all call sites in aiService.ts and ipcMain.ts
- Update test helpers to match new config interface

This makes the separation clearer:
- Tools only receive the config they need (cwd, runtime, secrets, etc.)
- Init state management is isolated to the wrapper layer
- Non-runtime tools never see init-related fields

All tests pass (806 unit, 7 integration).
Add local 'wrap' helper to eliminate repetition of workspaceId and
initStateManager parameters across all runtime tool wrapper calls.

Before: wrapWithInitWait(tool, workspaceId, initStateManager) x4
After: wrap(tool) x4
TypeScript has a bug with duplicate Tool types from node_modules that
makes type inference fail with spread operator. The 'as' assertion is
necessary here and is actually safer than building the object manually.

Disable the ESLint rule for this specific case.
Explicitly instruct the model which tool to use in prompts to reduce
flakiness from tool selection variance. Also relax file_edit_insert test
to accept either file_edit_insert or file_edit_replace_string since both
are valid ways to accomplish the task.

This fixes intermittent CI failures where the model chose a different
tool than expected.
@ammario ammario merged commit 368aed4 into main Oct 27, 2025
13 checks passed
@ammario ammario deleted the rt-init-queue branch October 27, 2025 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants