Skip to content

fix: real-SDK integration tests — skip plan-mode test, fix subagent busy-spin#311993

Merged
roblourens merged 2 commits intomainfrom
roblou/agents/fix-sdk-agent-host-tests
Apr 22, 2026
Merged

fix: real-SDK integration tests — skip plan-mode test, fix subagent busy-spin#311993
roblourens merged 2 commits intomainfrom
roblou/agents/fix-sdk-agent-host-tests

Conversation

@roblourens
Copy link
Copy Markdown
Member

Two fixes to the real-SDK agent host integration tests. (Written by Copilot)

Changes

Skip planning-mode test

The planning-mode session-state writes are auto-approved in default mode test is skipped with a TODO comment. The public @github/copilot-sdk currently lacks:

  • agentMode: 'plan' on MessageOptions (no way to put the session into plan mode)
  • respondToExitPlanMode() on the session class (no way to respond when the event fires)

Without these the model never calls exit_plan_mode and sawInputRequest never becomes true, so the test would always fail.

Fix subagent approval loop busy-spin

The approval loop in the subagent test was busy-spinning because waitForNotification returns matching entries from the notification queue without consuming them. Once a tool call was approved, the same toolCallReady notification kept matching, causing the loop to re-approve the same tool call repeatedly.

Fix: track approved tool call IDs in a Set and filter them out of both the waitForNotification predicate and the dispatch guard.

- Skip 'planning-mode session-state writes are auto-approved in default mode'
  because @github/copilot-sdk lacks agentMode:'plan' on MessageOptions and
  respondToExitPlanMode() on the session. Add a TODO comment tracking what
  needs to change in the SDK before this can be re-enabled.

- Fix busy-spin in the subagent approval loop: track confirmed tool call IDs
  in a Set and skip re-approving ones already handled. waitForNotification
  returns matching entries from the queue without consuming them, so without
  this guard the loop would repeatedly re-approve the same tool call.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 22, 2026 20:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adjusts the real Copilot SDK agent-host integration tests to avoid a currently-unimplementable “plan mode” scenario and to prevent a subagent approval helper loop from repeatedly matching the same queued notification.

Changes:

  • Skips the planning-mode session-state writes are auto-approved in default mode test with a TODO explaining missing public SDK capabilities.
  • Updates the subagent test’s background approval loop to avoid re-processing already-seen session/toolCallReady notifications.
Show a summary per file
File Description
src/vs/platform/agentHost/test/node/protocol/toolApprovalRealSdk.integrationTest.ts Skips a plan-mode test blocked by public SDK gaps; adds de-duping to the subagent tool approval loop to prevent busy-spin.

Copilot's findings

Comments suppressed due to low confidence (1)

src/vs/platform/agentHost/test/node/protocol/toolApprovalRealSdk.integrationTest.ts:704

  • approvalLoop is started as a background async task, but it is only stopped/awaited on the success path (after the parent turnComplete). If an assertion throws before approvalsActive = false, the loop continues running and its internal timeouts can keep the test process alive/hanging. Wrap the body after starting the loop in a try/finally that flips approvalsActive = false and awaits approvalLoop.
		let approvalsActive = true;
		let approvalSeq = 1000;
		const confirmed = new Set<string>();
		const approvalLoop = (async () => {
			while (approvalsActive) {
				try {
					const ready = await client.waitForNotification(n => {
						if (!isActionNotification(n, 'session/toolCallReady')) {
							return false;
						}
						const a = getActionEnvelope(n).action as { toolCallId: string; confirmed?: string };
						return !a.confirmed && !confirmed.has(a.toolCallId);
					}, 2_000);
					const action = getActionEnvelope(ready).action as { session: string; turnId: string; toolCallId: string; confirmed?: string };
					if (!action.confirmed && !confirmed.has(action.toolCallId)) {
						confirmed.add(action.toolCallId);
						client.notify('dispatchAction', {
							clientSeq: ++approvalSeq,
							action: {
								type: 'session/toolCallConfirmed',
								session: action.session,
								turnId: action.turnId,
								toolCallId: action.toolCallId,
								approved: true,
							},
						});
					}
				} catch {
					// Timeout — re-poll. Loop exits when approvalsActive flips.
				}
			}
		})();

		// Encourage the model to delegate via the `task` subagent tool. The exact
		// behaviour is non-deterministic — if the model declines we fail the test
		// with a clear message rather than silently passing.
		dispatchTurn(client, sessionUri, 'turn-sa',
			'Use the `task` tool to spawn a subagent to list the files in the current working directory. ' +
			'The subagent should call a single read-only tool (e.g. `view` or `bash` with `ls`) to enumerate the directory. ' +
			'Do not enumerate the directory yourself — delegate to the subagent.',
			1);

		// Wait for the parent's `task` tool call to expose a Subagent content
		// block carrying the subagent session URI.
		const subagentContentNotif = await client.waitForNotification(n => {
			if (!isActionNotification(n, 'session/toolCallContentChanged')) {
				return false;
			}
			const action = getActionEnvelope(n).action as { session: string; content: readonly IToolResultContent[] };
			return action.session === sessionUri && action.content.some(c => c.type === ToolResultContentType.Subagent);
		}, 120_000);

		const parentContent = (getActionEnvelope(subagentContentNotif).action as { content: readonly IToolResultContent[] }).content;
		const subagentRef = parentContent.find((c): c is IToolResultSubagentContent => c.type === ToolResultContentType.Subagent)!;
		const subagentSessionUri = subagentRef.resource as unknown as string;
		assert.ok(typeof subagentSessionUri === 'string' && isSubagentSession(subagentSessionUri),
			`subagent session URI should be subagent-shaped, got: ${JSON.stringify(subagentSessionUri)}`);

		// Subscribe so we receive the subagent session's own action broadcasts.
		await client.call<ISubscribeResult>('subscribe', { resource: subagentSessionUri });

		// Wait for the parent turn to complete (with a generous timeout — the
		// subagent's turn must finish first).
		await client.waitForNotification(n => {
			if (!isActionNotification(n, 'session/turnComplete')) {
				return false;
			}
			return (getActionEnvelope(n).action as { session: string }).session === sessionUri;
		}, 150_000);

		approvalsActive = false;
		await approvalLoop;

  • Files reviewed: 1/1 changed files
  • Comments generated: 1

The same toolCallId can appear in multiple notifications (e.g. a legitimate
re-confirmation while the tool is running). Using toolCallId as the
dedup key would silently drop those. Switch to serverSeq, which is unique
per notification, so each distinct toolCallReady event is processed
exactly once.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Screenshot Changes

Base: f6c99af3 Current: 82d10b99

Changed (1)

chat/aiCustomizations/aiCustomizationManagementEditor/McpBrowseMode/Light
Before After
before after

@roblourens roblourens enabled auto-merge (squash) April 22, 2026 20:33
@roblourens roblourens merged commit c10232d into main Apr 22, 2026
26 checks passed
@roblourens roblourens deleted the roblou/agents/fix-sdk-agent-host-tests branch April 22, 2026 20:37
@vs-code-engineering vs-code-engineering Bot added this to the 1.118.0 milestone Apr 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants