E2E test infrastructure improvements by SteveSandersonMS · Pull Request #26 · github/copilot-sdk

SteveSandersonMS · 2026-01-16T00:34:59Z

Summary

Two improvements to E2E test infrastructure:

1. Skip writing snapshots on test failure

Prevents corrupted snapshots from being written when tests fail. Each language uses its native test framework hooks:

Node.js: vitest \onTestFailed()\ hook
Python: \pytest_runtest_makereport\ hook
Go: \ .Failed()\ check in cleanup
.NET: Checks \CI\ env var (xUnit lacks failure detection hooks)

2. Update snapshots for Anthropic extended thinking

The CLI now coalesces tool calls into single assistant messages for Anthropic extended thinking compatibility. This updates all affected snapshots to match the new format.

Pull request overview

This PR improves E2E test infrastructure with two main enhancements: preventing corrupted snapshot writes on test failures and updating snapshots to reflect the CLI's new behavior of coalescing tool calls into single assistant messages for Anthropic extended thinking compatibility.

Changes:

Implements test failure detection across all SDK languages to prevent writing corrupted snapshots
Updates snapshot files to include both old (separate assistant messages) and new (coalesced tool calls) conversation formats
Modifies proxy stop endpoints to accept a skipWritingCache parameter

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
test/snapshots/tools/invokes_built_in_tools.yaml	Adds new conversation format with coalesced tool calls and updates response content
test/snapshots/session/should_create_session_with_custom_tool.yaml	Removes intermediate assistant content message
test/snapshots/permissions/*.yaml	Adds new conversation snapshots with coalesced tool calls for all permission tests
test/harness/replayingCapiProxy.ts	Adds skipWritingCache query parameter support to /stop endpoint
python/e2e/testharness/proxy.py	Adds skip_writing_cache parameter to stop() method with documentation
python/e2e/testharness/context.py	Adds test_failed parameter to teardown() to control snapshot writing
python/e2e/conftest.py	Implements pytest hook to track test failures globally
nodejs/test/e2e/harness/sdkTestContext.ts	Implements vitest onTestFailed hook to track failures
nodejs/test/e2e/harness/CapiProxy.ts	Adds skipWritingCache parameter to stop() method
go/e2e/testharness/proxy.go	Adds StopWithOptions() method with skipWritingCache support
go/e2e/testharness/context.go	Uses t.Failed() to detect test failures in cleanup
dotnet/test/Harness/E2ETestContext.cs	Uses CI environment variable as workaround for test failure detection
dotnet/test/Harness/CapiProxy.cs	Adds skipWritingCache parameter to StopAsync() method

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-16T00:37:39Z

+# Track if any test failed to avoid writing corrupted snapshots
+_any_test_failed = False


The global _any_test_failed variable is not thread-safe. If pytest runs tests in parallel (e.g., with pytest-xdist), multiple test processes could have race conditions. Consider using pytest's built-in mechanisms or a thread-safe approach to track test failures.

Copilot · 2026-01-16T00:37:39Z

+        // Skip writing snapshots in CI to avoid corrupting them on test failures
+        var isCI = !string.IsNullOrEmpty(Environment.GetEnvironmentVariable("CI"));
+        await _proxy.StopAsync(skipWritingCache: isCI);


The .NET implementation uses a CI environment variable check as a workaround for lack of test failure detection, while other languages (Go, Python, Node.js) properly detect actual test failures. This means snapshots will always be skipped in CI even for passing tests, which differs from the behavior in other languages and may not align with the PR's stated goal of only skipping on failure. Consider using a more sophisticated approach like capturing test status through ITestOutputHelper or other xUnit extensibility points.

Suggested change

// Skip writing snapshots in CI to avoid corrupting them on test failures

var isCI = !string.IsNullOrEmpty(Environment.GetEnvironmentVariable("CI"));

await _proxy.StopAsync(skipWritingCache: isCI);

await _proxy.StopAsync(skipWritingCache: false);

Copilot · 2026-01-16T00:37:39Z

+          options.requestOptions.path?.startsWith("/stop") &&
          options.requestOptions.method === "POST"
        ) {
+          const skipWritingCache = options.requestOptions.path.includes("skipWritingCache=true");


Using string .includes() to parse query parameters is fragile and could match unintended patterns (e.g., ?foo=skipWritingCache=true or ?skipWritingCache=true&bar=1). Consider using a proper URL query parameter parser like new URLSearchParams() for more robust parameter extraction.

Suggested change

const skipWritingCache = options.requestOptions.path.includes("skipWritingCache=true");

let skipWritingCache = false;

const path = options.requestOptions.path;

const queryIndex = path?.indexOf("?");

if (queryIndex !== undefined && queryIndex !== -1) {

const searchParams = new URLSearchParams(path.substring(queryIndex + 1));

skipWritingCache = searchParams.get("skipWritingCache") === "true";

}

* Expose permissions requests in SDK client * bump copilot version * update package.json * Fix go client * fix lint and formatting * Add dotnet support * format python * Generate stubs * Make the tests work * reformat python * formatting * linting * Remove resume test * Backslashes

Add 10 Java scenario implementations covering: - callbacks: hooks (pre/post tool use, session start/end), permissions - prompts: attachments - sessions: concurrent-sessions, session-resume - tools: custom-agents, tool-overrides, mcp-servers, skills - auth: gh-app (OAuth device flow) All scenarios compile successfully with mvn compile. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SteveSandersonMS added 2 commits January 15, 2026 16:34

Skip writing snapshots when tests fail

643e69f

Update snapshots for new runtime Anthropic client behavior (coalesces…

103a399

… tool calls)

SteveSandersonMS requested a review from a team as a code owner January 16, 2026 00:35

Copilot AI review requested due to automatic review settings January 16, 2026 00:35

Copilot started reviewing on behalf of SteveSandersonMS January 16, 2026 00:35 View session

SteveSandersonMS closed this Jan 16, 2026

SteveSandersonMS deleted the stevesa/e2e-infra-snapshots branch January 16, 2026 00:36

Copilot AI reviewed Jan 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

E2E test infrastructure improvements#26

E2E test infrastructure improvements#26
SteveSandersonMS wants to merge 2 commits into
mainfrom
stevesa/e2e-infra-snapshots

SteveSandersonMS commented Jan 16, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 16, 2026

Uh oh!

Copilot AI Jan 16, 2026

Uh oh!

Copilot AI Jan 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# Track if any test failed to avoid writing corrupted snapshots
		_any_test_failed = False

-          const skipWritingCache = options.requestOptions.path.includes("skipWritingCache=true");
+          let skipWritingCache = false;
+          const path = options.requestOptions.path;
+          const queryIndex = path?.indexOf("?");
+          if (queryIndex !== undefined && queryIndex !== -1) {
+            const searchParams = new URLSearchParams(path.substring(queryIndex + 1));
+            skipWritingCache = searchParams.get("skipWritingCache") === "true";
+          }

Conversation

SteveSandersonMS commented Jan 16, 2026

Summary

1. Skip writing snapshots on test failure

2. Update snapshots for Anthropic extended thinking

Related

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants