fix: real-SDK integration tests — skip plan-mode test, fix subagent busy-spin#311993
Merged
roblourens merged 2 commits intomainfrom Apr 22, 2026
Merged
fix: real-SDK integration tests — skip plan-mode test, fix subagent busy-spin#311993roblourens merged 2 commits intomainfrom
roblourens merged 2 commits intomainfrom
Conversation
- Skip 'planning-mode session-state writes are auto-approved in default mode' because @github/copilot-sdk lacks agentMode:'plan' on MessageOptions and respondToExitPlanMode() on the session. Add a TODO comment tracking what needs to change in the SDK before this can be re-enabled. - Fix busy-spin in the subagent approval loop: track confirmed tool call IDs in a Set and skip re-approving ones already handled. waitForNotification returns matching entries from the queue without consuming them, so without this guard the loop would repeatedly re-approve the same tool call. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Adjusts the real Copilot SDK agent-host integration tests to avoid a currently-unimplementable “plan mode” scenario and to prevent a subagent approval helper loop from repeatedly matching the same queued notification.
Changes:
- Skips the
planning-mode session-state writes are auto-approved in default modetest with a TODO explaining missing public SDK capabilities. - Updates the subagent test’s background approval loop to avoid re-processing already-seen
session/toolCallReadynotifications.
Show a summary per file
| File | Description |
|---|---|
| src/vs/platform/agentHost/test/node/protocol/toolApprovalRealSdk.integrationTest.ts | Skips a plan-mode test blocked by public SDK gaps; adds de-duping to the subagent tool approval loop to prevent busy-spin. |
Copilot's findings
Comments suppressed due to low confidence (1)
src/vs/platform/agentHost/test/node/protocol/toolApprovalRealSdk.integrationTest.ts:704
approvalLoopis started as a background async task, but it is only stopped/awaited on the success path (after the parentturnComplete). If an assertion throws beforeapprovalsActive = false, the loop continues running and its internal timeouts can keep the test process alive/hanging. Wrap the body after starting the loop in atry/finallythat flipsapprovalsActive = falseandawaitsapprovalLoop.
let approvalsActive = true;
let approvalSeq = 1000;
const confirmed = new Set<string>();
const approvalLoop = (async () => {
while (approvalsActive) {
try {
const ready = await client.waitForNotification(n => {
if (!isActionNotification(n, 'session/toolCallReady')) {
return false;
}
const a = getActionEnvelope(n).action as { toolCallId: string; confirmed?: string };
return !a.confirmed && !confirmed.has(a.toolCallId);
}, 2_000);
const action = getActionEnvelope(ready).action as { session: string; turnId: string; toolCallId: string; confirmed?: string };
if (!action.confirmed && !confirmed.has(action.toolCallId)) {
confirmed.add(action.toolCallId);
client.notify('dispatchAction', {
clientSeq: ++approvalSeq,
action: {
type: 'session/toolCallConfirmed',
session: action.session,
turnId: action.turnId,
toolCallId: action.toolCallId,
approved: true,
},
});
}
} catch {
// Timeout — re-poll. Loop exits when approvalsActive flips.
}
}
})();
// Encourage the model to delegate via the `task` subagent tool. The exact
// behaviour is non-deterministic — if the model declines we fail the test
// with a clear message rather than silently passing.
dispatchTurn(client, sessionUri, 'turn-sa',
'Use the `task` tool to spawn a subagent to list the files in the current working directory. ' +
'The subagent should call a single read-only tool (e.g. `view` or `bash` with `ls`) to enumerate the directory. ' +
'Do not enumerate the directory yourself — delegate to the subagent.',
1);
// Wait for the parent's `task` tool call to expose a Subagent content
// block carrying the subagent session URI.
const subagentContentNotif = await client.waitForNotification(n => {
if (!isActionNotification(n, 'session/toolCallContentChanged')) {
return false;
}
const action = getActionEnvelope(n).action as { session: string; content: readonly IToolResultContent[] };
return action.session === sessionUri && action.content.some(c => c.type === ToolResultContentType.Subagent);
}, 120_000);
const parentContent = (getActionEnvelope(subagentContentNotif).action as { content: readonly IToolResultContent[] }).content;
const subagentRef = parentContent.find((c): c is IToolResultSubagentContent => c.type === ToolResultContentType.Subagent)!;
const subagentSessionUri = subagentRef.resource as unknown as string;
assert.ok(typeof subagentSessionUri === 'string' && isSubagentSession(subagentSessionUri),
`subagent session URI should be subagent-shaped, got: ${JSON.stringify(subagentSessionUri)}`);
// Subscribe so we receive the subagent session's own action broadcasts.
await client.call<ISubscribeResult>('subscribe', { resource: subagentSessionUri });
// Wait for the parent turn to complete (with a generous timeout — the
// subagent's turn must finish first).
await client.waitForNotification(n => {
if (!isActionNotification(n, 'session/turnComplete')) {
return false;
}
return (getActionEnvelope(n).action as { session: string }).session === sessionUri;
}, 150_000);
approvalsActive = false;
await approvalLoop;
- Files reviewed: 1/1 changed files
- Comments generated: 1
The same toolCallId can appear in multiple notifications (e.g. a legitimate re-confirmation while the tool is running). Using toolCallId as the dedup key would silently drop those. Switch to serverSeq, which is unique per notification, so each distinct toolCallReady event is processed exactly once. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
Yoyokrazy
approved these changes
Apr 22, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two fixes to the real-SDK agent host integration tests. (Written by Copilot)
Changes
Skip
planning-modetestThe
planning-mode session-state writes are auto-approved in default modetest is skipped with aTODOcomment. The public@github/copilot-sdkcurrently lacks:agentMode: 'plan'onMessageOptions(no way to put the session into plan mode)respondToExitPlanMode()on the session class (no way to respond when the event fires)Without these the model never calls
exit_plan_modeandsawInputRequestnever becomestrue, so the test would always fail.Fix subagent approval loop busy-spin
The approval loop in the subagent test was busy-spinning because
waitForNotificationreturns matching entries from the notification queue without consuming them. Once a tool call was approved, the sametoolCallReadynotification kept matching, causing the loop to re-approve the same tool call repeatedly.Fix: track approved tool call IDs in a
Setand filter them out of both thewaitForNotificationpredicate and the dispatch guard.