Fix orchestration stale manifest writes after external bundle swap #365

Closed
cursor[bot] wants to merge 1 commit into main from cursor/critical-bug-detection-98b7

cursor (Bot, Contributor) commented May 26, 2026

Bug and impact

Stale manifest overwrite (data corruption)
When manifest.json on disk is replaced externally with a different runId (e.g. git checkout on the lane worktree), the file watcher marked the run suspended but left the previous in-memory manifest cached. loadIntoRuntime short-circuited on that cache, so manifestPatch, claimTask, releaseTask, and related paths could persist the stale manifest and overwrite the foreign bundle on disk.

Root cause

  • handleExternalChange set suspended on runId mismatch but did not clear runtime.manifest.
  • Mutation APIs did not check runtime.suspended.
  • loadIntoRuntime threw on disk runId mismatch instead of suspending, which could surface as an unhandled error.
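
The interaction between these three causes can be reproduced in a small model. The sketch below is illustrative only: the `Runtime` and `Manifest` shapes, the in-memory `disk` object standing in for manifest.json, the `clone` helper, and all function bodies are assumptions rather than the project's actual code. Only the function and field names are taken from the description above.

```typescript
// Hypothetical, simplified model of the pre-fix behavior.
interface Manifest {
  runId: string;
  tasks: Record<string, { claimedBy?: string }>;
}

interface Runtime {
  runId: string;
  manifest: Manifest | null;
  suspended: boolean;
}

// Stand-in for manifest.json on disk (assumption: the real code uses the filesystem).
const disk: { manifest: Manifest } = {
  manifest: { runId: "run-A", tasks: { t1: {} } },
};

function clone<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

function loadIntoRuntime(rt: Runtime): Manifest {
  // Bug: returns the cached manifest even after the run was suspended.
  if (rt.manifest) return rt.manifest;
  const onDisk = disk.manifest;
  // Bug: throws instead of suspending.
  if (onDisk.runId !== rt.runId) throw new Error("runId mismatch");
  rt.manifest = clone(onDisk);
  return rt.manifest;
}

function handleExternalChange(rt: Runtime): void {
  if (disk.manifest.runId !== rt.runId) {
    rt.suspended = true; // Bug: rt.manifest is left in place.
  }
}

function claimTask(rt: Runtime, taskId: string, worker: string): void {
  // Bug: rt.suspended is never checked before mutating and persisting.
  const manifest = loadIntoRuntime(rt);
  manifest.tasks[taskId] = { claimedBy: worker };
  disk.manifest = clone(manifest);
}

const runtime: Runtime = { runId: "run-A", manifest: null, suspended: false };
loadIntoRuntime(runtime); // caches the run-A manifest

// External swap, e.g. git checkout on the lane worktree.
disk.manifest = { runId: "run-B", tasks: { other: {} } };
handleExternalChange(runtime); // marks suspended, keeps the stale cache

claimTask(runtime, "t1", "worker-1");
console.log(disk.manifest.runId); // "run-A": the foreign bundle was overwritten
```

In this model the run is flagged as suspended, yet the write still goes through because nothing on the mutation path consults the flag or invalidates the cache.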

Fix

  • Clear runtime.manifest / runtime.planMd when the watcher detects a foreign runId.
  • Suspend (not throw) when loadIntoRuntime reads a mismatched manifest from disk.
  • Reject all mutation entry points while suspended via assertRunWritable / explicit suspended checks.
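
A minimal sketch of how the three changes might fit together, using an assumed in-memory model. The `suspend` and `clone` helpers, the `disk` object, the `null` return from `loadIntoRuntime`, and the signature of `assertRunWritable` are all hypothetical; the PR text names `assertRunWritable` but does not show its shape.

```typescript
// Hypothetical, simplified model of the post-fix behavior.
interface Manifest {
  runId: string;
  tasks: Record<string, { claimedBy?: string }>;
}

interface Runtime {
  runId: string;
  manifest: Manifest | null;
  planMd: string | null;
  suspended: boolean;
}

// Stand-in for manifest.json on disk (assumption).
const disk: { manifest: Manifest } = {
  manifest: { runId: "run-A", tasks: { t1: {} } },
};

function clone<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Hypothetical helper: mark the run suspended and drop cached state.
function suspend(rt: Runtime): void {
  rt.suspended = true;
  rt.manifest = null;
  rt.planMd = null;
}

function handleExternalChange(rt: Runtime): void {
  if (disk.manifest.runId !== rt.runId) suspend(rt);
}

function loadIntoRuntime(rt: Runtime): Manifest | null {
  if (rt.suspended) return null;
  if (rt.manifest) return rt.manifest;
  const onDisk = disk.manifest;
  if (onDisk.runId !== rt.runId) {
    suspend(rt); // suspend instead of throwing
    return null;
  }
  rt.manifest = clone(onDisk);
  return rt.manifest;
}

// Assumed shape: returns the writable manifest or throws while suspended.
function assertRunWritable(rt: Runtime): Manifest {
  const manifest = loadIntoRuntime(rt);
  if (rt.suspended || manifest === null) {
    throw new Error(`run ${rt.runId} is suspended; refusing to write`);
  }
  return manifest;
}

function claimTask(rt: Runtime, taskId: string, worker: string): void {
  const manifest = assertRunWritable(rt);
  manifest.tasks[taskId] = { claimedBy: worker };
  disk.manifest = clone(manifest);
}

const runtime: Runtime = { runId: "run-A", manifest: null, planMd: null, suspended: false };
loadIntoRuntime(runtime);

// External swap to a foreign runId, then the watcher fires.
disk.manifest = { runId: "run-B", tasks: { other: {} } };
handleExternalChange(runtime);

let rejected = false;
try {
  claimTask(runtime, "t1", "worker-1");
} catch {
  rejected = true;
}
console.log(rejected, disk.manifest.runId); // true "run-B"
```

Routing every mutation through a single guard keeps the suspended check in one place, so a new mutation entry point cannot silently skip it as long as it obtains the manifest from the guard.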

Validation

  • npx vitest run src/main/services/orchestration/orchestrationService.test.ts -t "external manifest runId"
  • npx vitest run src/main/services/orchestration/orchestrationService.test.ts src/main/services/orchestration/patchPolicy.test.ts

This PR does not overlap with open PRs #363 (CRR alter / remote switch binding) or #364 (plan approval bypass / openRepo binding).


When git checkout replaces manifest.json with a different runId, the
watcher marked the run suspended but kept a stale in-memory manifest.
Subsequent manifestPatch/claimTask calls could overwrite the foreign
manifest on disk.

Clear the runtime cache on runId mismatch, treat mismatches in
loadIntoRuntime as suspended, and reject all mutation paths while
suspended. Add a regression test for the watcher + patch interaction.

Co-authored-by: Arul Sharma <arul28@users.noreply.github.com>
vercel (Bot) commented May 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| ade | Ignored | Ignored | May 26, 2026 8:14am |

arul28 (Owner) commented May 28, 2026

Closing in favor of #382. I validated the stale-manifest bug and folded the corrected version into the combined orchestration hardening lane.
