Harden orchestration approval and manifest state by cursor[bot] · Pull Request #382 · arul28/ADE

cursor · 2026-05-28T09:11:48Z

Summary

- Require explicit plan approval decisions and block lead manifest patches that can self-approve the planning phase.
- Suspend runs instead of writing through foreign or stale bundles, and reject on-disk manifest conflicts before the atomic commit.
- Relocate orchestration bundle paths for managed and persisted sessions after lane placement changes, including restarting the active watcher.
- Restore the previous project binding when openRepo is cancelled or fails.
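
The binding restore in the last item can be sketched roughly as below. This is only an illustration under assumptions: `ProjectBinding`, `BindingStore`, and `openRepoWithRestore` are hypothetical names, and the idea that the binding is cleared while the open dialog runs is a guess about the preload code, which is not shown on this page.

```typescript
// Hypothetical shapes; the real preload API is not shown in this PR page.
type ProjectBinding = { rootPath: string } | null;

interface BindingStore {
  current: ProjectBinding;
}

// Runs `openRepo` and puts the previous binding back when the call is
// cancelled (falsy result) or throws.
async function openRepoWithRestore(
  store: BindingStore,
  openRepo: () => Promise<ProjectBinding>,
): Promise<ProjectBinding> {
  const previous = store.current;
  store.current = null; // assumption: the binding is cleared while opening
  try {
    const next = await openRepo();
    if (!next) {
      store.current = previous; // cancelled: restore the old binding
      return previous;
    }
    store.current = next;
    return next;
  } catch (err) {
    store.current = previous; // failed: restore the old binding, then rethrow
    throw err;
  }
}
```

Restoring in both the falsy-result and the thrown-error branches is what keeps the store from ending up with a null binding after an aborted open.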

Validation

npx vitest run src/main/services/orchestration/patchPolicy.test.ts
npx vitest run src/main/services/ai/tools/orchestrationTools.test.ts
npx vitest run src/main/services/orchestration/orchestrationService.test.ts
npx vitest run src/preload/preload.test.ts
npx vitest run src/main/services/chat/agentChatService.test.ts -t "orchestration|lane launch directives|openRepo"
npm --prefix apps/desktop run typecheck
npm --prefix apps/desktop run lint (warnings only, existing repo-wide warnings)
npm --prefix apps/desktop run build

Related

Folds validated fixes from "Fix orchestration plan approval bypass and openRepo binding leak" (#364), "Fix orchestration stale manifest writes after external bundle swap" (#365), "Fix orchestration manifest clobber on external disk updates" (#375), and "Fix stale orchestration bundle path after lane VM placement change" (#376) into this combined orchestration lane.

Greptile Summary

This PR hardens the orchestration service against self-approval exploits, concurrent disk conflicts, and stale bundle paths after lane relocation. It also restores the project binding on a cancelled openRepo.

Conflict detection: persistManifest now double-checks the on-disk manifest before and inside atomicWrite via rejectIfDiskAdvanced; foreign-runId or advanced-generation states suspend the run rather than overwriting, and the temp file is cleaned up on any pre-commit failure.
Self-approval hardening: /leadState, /phases/{id:planning}/status, /phases/{id:planning}/completedAt, and /leadState/planApprovalSummary are added to LEAD_DENY_PATTERNS; the regex-based free-text approval check is replaced with an explicit decision field; and isOrchestrationPlanApproved no longer accepts a planning-phase status shortcut.
Bundle relocation: relocateRunBundle resets the runtime and restarts the watcher when a lane's worktree moves; handleLanePlacementChanged is made async and propagates relocations for both managed and cold (persisted) sessions with an unbounded limit: null query instead of the previous 500-row cap.
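
The review's file overview says deny-pattern matching is exact rather than prefix-based. A minimal sketch of what that implies follows. It is not the project's code: the list holds only the four patterns named in this summary, the `{id:planning}` selector is treated as an opaque literal instead of being resolved against a phase list, and `ADDED_LEAD_DENY_PATTERNS`, `isDeniedForLead`, and the `/leadState/notes` path are hypothetical.

```typescript
// Only the four patterns this summary names; the real LEAD_DENY_PATTERNS
// list presumably contains more entries.
const ADDED_LEAD_DENY_PATTERNS: readonly string[] = [
  "/leadState",
  "/leadState/planApprovalSummary",
  "/phases/{id:planning}/status",
  "/phases/{id:planning}/completedAt",
];

// Exact match: a pattern blocks only the identical path, not its sub-paths.
function isDeniedForLead(path: string): boolean {
  return ADDED_LEAD_DENY_PATTERNS.indexOf(path) !== -1;
}

// A prefix check, shown for contrast, would also block every child path.
function isDeniedByPrefix(path: string): boolean {
  return ADDED_LEAD_DENY_PATTERNS.some(
    (pattern) => path === pattern || path.indexOf(pattern + "/") === 0,
  );
}
```

Under exact matching, replacing `/leadState` wholesale is blocked while a child path such as the hypothetical `/leadState/notes` stays writable, which is why `/leadState/planApprovalSummary` has to be listed separately.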

Confidence Score: 3/5

Safe to merge after the missing conflict-error handler in recordValidation is addressed; the rest of the changes are well-structured.

The new OrchestrationPersistConflictError type introduced in this PR is handled in externalManifestPatch and agentHeartbeat, but the recordValidation function's catch block only covers OrchestrationRunSuspendedError and rethrows everything else. A concurrent disk write during a validation record operation will propagate as an unhandled exception instead of the structured error response the function contract implies. Multiple agents recording validation results simultaneously — a normal multi-agent scenario — can trigger this path.

apps/desktop/src/main/services/orchestration/orchestrationService.ts — specifically the recordValidation try-catch around directPatch (lines ~1392–1408)

Important Files Changed

Filename	Overview
apps/desktop/src/main/services/orchestration/orchestrationService.ts	Core change: adds disk-conflict detection (`rejectIfDiskAdvanced`), suspension on foreign-runId, `relocateRunBundle`, and `assertRunWritable` guards. Missing `OrchestrationPersistConflictError` catch in `recordValidation` means concurrent writes surface as unhandled exceptions there.
apps/desktop/src/main/services/orchestration/patchPolicy.ts	Adds `/leadState`, `/leadState/planApprovalSummary`, `/phases/{id:planning}/status`, `/phases/{id:planning}/completedAt` to `LEAD_DENY_PATTERNS` to close self-approval bypass. Pattern matching is exact (not prefix), so sub-path mutability is preserved correctly.
apps/desktop/src/main/services/orchestration/runtimeProfile.ts	Removes the planning-phase-status shortcut from `isOrchestrationPlanApproved`; approval is now exclusively gated on `planApprovedAt` being set via the controlled `approvePlan` path.
apps/desktop/src/main/services/chat/agentChatService.ts	Adds bundle-path relocation for managed and cold sessions when a lane's worktree moves. Cold-session path now uses `limit: null` (no cap) and synchronous file I/O to rewrite the metadata JSON directly. Previously flagged `limit: 500` cap is resolved.
apps/desktop/src/main/services/sessions/sessionService.ts	Accepts `limit: null` to remove the SQLite LIMIT clause; default (undefined) still falls through to 200. Logic is correct.
apps/desktop/src/main/services/ai/tools/orchestrationTools.ts	Removes free-text regex approval check; plan approval now requires an explicit `decision: "accept"` or `"accept_for_session"` field to prevent false-positives on negation phrases.
apps/desktop/src/preload/preload.ts	Restores the previous project binding when `openRepo` returns falsy or throws, preventing a null project-binding state after a cancelled or failed open.
apps/desktop/src/main/services/lanes/laneService.ts	Makes `onPlacementChanged` callback async-aware so that the relocation await chain completes before `emitPlacementChanged` returns, avoiding fire-and-forget relocation races.
apps/desktop/src/main/main.ts	Awaits `handleLanePlacementChanged` in the `onPlacementChanged` callback to correctly propagate the now-async relocation chain.
apps/desktop/src/shared/types/sessions.ts	Widens `limit` in `ListSessionsArgs` from `number

Sequence Diagram

sequenceDiagram
    participant Caller
    participant persistManifest
    participant atomicWrite
    participant disk as Disk (manifest.json)
    participant runtime as RunRuntime

    Caller->>persistManifest: write next manifest
    persistManifest->>disk: rejectIfDiskAdvanced() [pre-check]
    alt runId mismatch on disk
        disk-->>persistManifest: foreign runId
        persistManifest->>runtime: "suspended=true, manifest=null"
        persistManifest-->>Caller: throw OrchestrationRunSuspendedError
    else disk generation advanced
        disk-->>persistManifest: newer serverGeneration
        persistManifest->>runtime: "manifest = onDisk"
        persistManifest-->>Caller: throw OrchestrationPersistConflictError
    else no conflict
        disk-->>persistManifest: ok / ENOENT
        persistManifest->>runtime: markSelfWrite()
        persistManifest->>atomicWrite: "write tmp, beforeCommit=rejectIfDiskAdvanced"
        atomicWrite->>disk: rejectIfDiskAdvanced() [post-write check]
        alt disk advanced between pre-check and write
            disk-->>atomicWrite: conflict
            atomicWrite->>disk: unlink(tmp)
            atomicWrite-->>persistManifest: throw
            persistManifest->>runtime: "recentSelfWriteUntil=0"
            persistManifest-->>Caller: throw
        else still ok
            atomicWrite->>disk: rename(tmp to manifest.json)
            persistManifest->>disk: writeServerGeneration(.gen)
            persistManifest->>runtime: "manifest = next"
            persistManifest-->>Caller: ok
        end
    end
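
The flow in the diagram can be approximated with the in-memory sketch below. It is a simplification under assumptions: the `Disk` object stands in for `manifest.json` and its temp file, `markSelfWrite` and the `.gen` write are omitted, "advanced" is taken to mean a `serverGeneration` greater than the one the runtime last saw, and the `interleave` parameter is a hypothetical hook that exists only to simulate a concurrent writer.

```typescript
// Names mirror identifiers from the review; their shapes are assumed.
interface Manifest { runId: string; serverGeneration: number }
interface RunRuntime { manifest: Manifest | null; suspended: boolean }
// In-memory stand-in for manifest.json and its temp file.
interface Disk { manifest: Manifest | null; tmp: Manifest | null }

class OrchestrationRunSuspendedError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype); // keep instanceof working on ES5 targets
  }
}
class OrchestrationPersistConflictError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

function rejectIfDiskAdvanced(disk: Disk, runtime: RunRuntime, base: Manifest): void {
  const onDisk = disk.manifest;
  if (onDisk === null) return; // ENOENT: nothing to conflict with
  if (onDisk.runId !== base.runId) {
    runtime.suspended = true; // foreign bundle: suspend instead of overwriting
    runtime.manifest = null;
    throw new OrchestrationRunSuspendedError("foreign runId on disk");
  }
  if (onDisk.serverGeneration > base.serverGeneration) {
    runtime.manifest = onDisk; // adopt the newer on-disk state
    throw new OrchestrationPersistConflictError("disk generation advanced");
  }
}

function persistManifest(
  disk: Disk,
  runtime: RunRuntime,
  next: Manifest,
  interleave?: () => void,
): void {
  const base = runtime.manifest;
  if (runtime.suspended || base === null) {
    throw new OrchestrationRunSuspendedError("run is suspended");
  }
  rejectIfDiskAdvanced(disk, runtime, base); // pre-check
  disk.tmp = next; // write the temp file
  if (interleave) interleave(); // hypothetical hook: simulate a concurrent writer
  try {
    rejectIfDiskAdvanced(disk, runtime, base); // beforeCommit check
  } catch (err) {
    disk.tmp = null; // unlink(tmp) on any pre-commit failure
    throw err;
  }
  disk.manifest = next; // rename(tmp to manifest.json)
  disk.tmp = null;
  runtime.manifest = next;
}
```

The second check narrows the window between the pre-check and the rename; it does not close it entirely, so this remains a best-effort guard against time-of-check/time-of-use conflicts rather than a lock.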

Comments Outside Diff (1)

apps/desktop/src/main/services/orchestration/orchestrationService.ts, line 486-522 (link)

loadIntoRuntime does not clear suspended on successful reload

When a branch is restored after a foreign-runId swap, loadIntoRuntime reads the correct manifest and sets runtime.manifest, but never resets runtime.suspended = false. Every caller then checks if (runtime.suspended) or calls assertRunWritable before inspecting runtime.manifest, so all API operations (bundleRead, manifestPatch, etc.) return a false "suspended" error even though the correct bundle is already loaded.

The watcher path (handleExternalChange, lines 628–636) does reset the flag when it processes the same file-change event, but there is a race: if any API call enters the mutex before the debounced watcher task does, it will see suspended = true and return an error even though the correct manifest is sitting in runtime.manifest. Fix: add runtime.suspended = false; at the end of the successful-load path in loadIntoRuntime.
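
A rough sketch of the suggested fix, assuming a much simpler `loadIntoRuntime` than the real one: the `readManifest` callback, the `runId` field on the runtime, and the boolean return value are hypothetical, and only the placement of `runtime.suspended = false` reflects the comment.

```typescript
// Assumed shapes; the real runtime carries more state than this.
interface Manifest { runId: string }
interface RunRuntime { runId: string; manifest: Manifest | null; suspended: boolean }

// Reloads the manifest and, on success, clears the suspended flag so callers
// that check `runtime.suspended` first do not report a stale suspension.
function loadIntoRuntime(
  runtime: RunRuntime,
  readManifest: () => Manifest | null,
): boolean {
  const onDisk = readManifest();
  if (onDisk === null || onDisk.runId !== runtime.runId) {
    return false; // missing or foreign bundle: leave the runtime as it was
  }
  runtime.manifest = onDisk;
  runtime.suspended = false; // the one-line fix proposed in the comment
  return true;
}
```

Clearing the flag here means an API call that wins the race against the debounced watcher task sees a consistent pair: the restored manifest and `suspended = false`.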

Reviews (5): Last reviewed commit: "Harden orchestration approval and manife..."

Greptile also left 1 inline comment on this PR.

vercel · 2026-05-28T09:11:50Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
ade	Ignored	Preview	May 29, 2026 12:30am

capy-ai · 2026-05-28T23:01:10Z

Capy auto-review is paused for this organization because the monthly auto-review limit has been reached. Increase the limit or turn it off in billing settings to resume automatic reviews.

cursor

PR Review

Scope: 11 file(s), +742 / −32
Verdict: Minor issues

This PR tightens orchestration correctness: explicit plan approval (no regex or planning-phase shortcuts), lead self-approval blocked in patch policy, manifest suspend/conflict handling on external bundle changes, bundle-path repointing when lane placement/worktree moves, and preload project-binding restore on cancelled openRepo. The changes are well-tested; one edge case in cold-session repointing is worth fixing.

🐛 Functionality

[Medium] Cold orchestration sessions can keep a stale bundle path after lane placement changes

File: apps/desktop/src/main/services/chat/agentChatService.ts:23807-23836
Issue: When repointing persisted (non-managed) orchestration sessions, the handler scans sessionService.list({ limit: 500 }) ordered by started_at desc. Any chat session with an orchestrationRunId that is not in the newest 500 rows is skipped, so its orchestrationBundlePath stays on the pre-move path after VM detach or worktree relocation.
Repro: Create 501+ chat sessions in a project; start an orchestration lead run, dispose the session (cold), change the lane worktree or detach from Mac VM, then reopen the cold session and invoke orchestration tools — metadata still points at the old bundle directory.
Fix: Query only sessions that carry orchestration metadata (e.g. filter persisted rows / metadata files with orchestrationRunId), or pass laneId into sessionService.list and drop the fixed 500 cap for this path.
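
The merged change, per the file overview, lets `sessionService.list` accept `limit: null` to drop the cap. One way such a query builder could treat the three cases is sketched below. The `sessions` table name, the `buildListQuery` helper, and the `limit?: number | null` type are assumptions for illustration; the default of 200 and the `started_at desc` ordering come from the review text.

```typescript
// Assumed argument shape: null means "no cap", undefined means "use the default".
interface ListSessionsArgs {
  limit?: number | null;
}

const DEFAULT_LIMIT = 200;

// Builds the SQL text and bound parameters for a session listing.
function buildListQuery(args: ListSessionsArgs = {}): { sql: string; params: number[] } {
  const base = "SELECT * FROM sessions ORDER BY started_at DESC"; // hypothetical table name
  if (args.limit === null) {
    return { sql: base, params: [] }; // unbounded: omit the LIMIT clause entirely
  }
  const limit = args.limit === undefined ? DEFAULT_LIMIT : args.limit;
  return { sql: base + " LIMIT ?", params: [limit] };
}
```

Distinguishing `null` from `undefined` keeps existing callers on the default cap while letting the relocation path opt into scanning every row.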

Notes

Good hardening overall: requestPlanApproval requiring decision === "accept" | "accept_for_session", isOrchestrationPlanApproved no longer treating planning phase done as approval, persistManifest TOCTOU guard + suspend on runId mismatch, and watcher callbacks serialized under the run mutex.
Bundle relocation assumes orchestration files already exist at the destination worktree (mirror sync / manual copy); that matches VM mirror behavior (.ade/orchestration is not rsync-excluded).
VM detach still depends on mirror→lane flush before the share directory is removed; stopMirrorSyncForLane is optional on the Mac VM service — worth verifying separately if detach-related manifest loss is reported.
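
To illustrate why an explicit decision field is safer than scanning free text, here is a small sketch. The regex in `naiveTextApproval` is a hypothetical stand-in, since the removed check is not shown on this page, and `isExplicitApproval` is an illustrative helper rather than the actual `requestPlanApproval` logic.

```typescript
// Hypothetical stand-in for a free-text approval check.
function naiveTextApproval(reply: string): boolean {
  return /\bapproved?\b/i.test(reply);
}

// Approval only when the structured decision is one of the two accepted values.
function isExplicitApproval(decision: unknown): boolean {
  return decision === "accept" || decision === "accept_for_session";
}

// A rejection phrased with the word "approve" fools the text check but not
// the structured one.
const rejection = "I do not approve this plan";
const textSaysApproved = naiveTextApproval(rejection);
const fieldSaysApproved = isExplicitApproval(rejection);
```

With a closed set of accepted values, anything else, including missing or free-form input, is treated as not approved.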

Sent by Cursor Automation: BUGBOT in Versic

Fold the validated orchestration fixes into one lane: require explicit plan approval, block lead self-approval patches, suspend on foreign bundle swaps, detect on-disk manifest conflicts, relocate run bundles after lane placement changes, and restore preload project bindings after cancelled openRepo.

greptile-apps · 2026-05-29T00:36:19Z

+      } catch (err) {
+        if (err instanceof OrchestrationRunSuspendedError) {
+          return {
+            ok: false,
+            error: "validation_failed",
+            message: RUN_SUSPENDED_MESSAGE,
+          };
+        }
+        throw err;
+      }


Missing OrchestrationPersistConflictError handler in recordValidation

directPatch → persistManifest now throws OrchestrationPersistConflictError (new in this PR) when the second rejectIfDiskAdvanced check inside atomicWrite.beforeCommit detects a concurrent write. The catch block here only handles OrchestrationRunSuspendedError and rethrows everything else. Any concurrent manifest mutation during a recordValidation call will therefore surface as an uncaught exception to the IPC caller instead of returning the structured { ok: false, error: "etag_conflict" } response that the function's return type implies. The sibling call-sites externalManifestPatch and agentHeartbeat both handle OrchestrationPersistConflictError explicitly — this one was missed.
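
A possible shape for the missing handler is sketched below, under assumptions: `recordValidation` is reduced to a synchronous wrapper around a `directPatch` callback, the `ValidationResult` type and both message strings are placeholders, and only the `"etag_conflict"` error code and the two error class names are taken from the comment.

```typescript
class OrchestrationRunSuspendedError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype); // keep instanceof working on ES5 targets
  }
}
class OrchestrationPersistConflictError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Placeholder text; the real constant's value is not shown in the diff.
const RUN_SUSPENDED_MESSAGE = "Run is suspended.";

// Assumed result shape for illustration.
type ValidationResult =
  | { ok: true }
  | { ok: false; error: "validation_failed" | "etag_conflict"; message: string };

function recordValidation(directPatch: () => void): ValidationResult {
  try {
    directPatch();
    return { ok: true };
  } catch (err) {
    if (err instanceof OrchestrationRunSuspendedError) {
      return { ok: false, error: "validation_failed", message: RUN_SUSPENDED_MESSAGE };
    }
    if (err instanceof OrchestrationPersistConflictError) {
      // New branch: report the concurrent write as a structured conflict.
      return { ok: false, error: "etag_conflict", message: "Manifest changed on disk; retry." };
    }
    throw err; // anything else is still unexpected
  }
}
```

Returning a structured conflict lets a caller retry against the refreshed manifest instead of handling an exception across the IPC boundary.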

arul28 changed the title ~~Fix orchestration plan approval false positives from rejection text~~ Harden orchestration approval and manifest state May 28, 2026

arul28 force-pushed the cursor/critical-correctness-bugs-5028 branch from 9e250d3 to 888598e Compare May 28, 2026 23:00

arul28 marked this pull request as ready for review May 28, 2026 23:00

arul28 self-requested a review as a code owner May 28, 2026 23:00

cursor Bot commented May 28, 2026

greptile-apps Bot reviewed May 28, 2026

Comment thread apps/desktop/src/main/services/orchestration/orchestrationService.ts Outdated

Comment thread apps/desktop/src/main/services/orchestration/orchestrationService.ts Outdated

Comment thread apps/desktop/src/main/services/chat/agentChatService.ts Outdated

arul28 force-pushed the cursor/critical-correctness-bugs-5028 branch from 888598e to 8d0384b Compare May 28, 2026 23:28

greptile-apps Bot reviewed May 28, 2026

Comment thread apps/desktop/src/main/services/orchestration/orchestrationService.ts

Comment thread apps/desktop/src/main/services/orchestration/orchestrationService.ts Outdated

arul28 force-pushed the cursor/critical-correctness-bugs-5028 branch from 8d0384b to a1763c5 Compare May 28, 2026 23:52

greptile-apps Bot reviewed May 29, 2026

Comment thread apps/desktop/src/main/services/orchestration/orchestrationService.ts

arul28 force-pushed the cursor/critical-correctness-bugs-5028 branch 2 times, most recently from 9c69b32 to ce54f9e Compare May 29, 2026 00:13

arul28 force-pushed the cursor/critical-correctness-bugs-5028 branch from ce54f9e to e919c82 Compare May 29, 2026 00:30

greptile-apps Bot reviewed May 29, 2026

arul28 merged commit 4a44e4a into main May 29, 2026
27 checks passed

arul28 deleted the cursor/critical-correctness-bugs-5028 branch May 29, 2026 01:25
