E2E test infrastructure improvements #26

Closed
SteveSandersonMS wants to merge 2 commits into main from stevesa/e2e-infra-snapshots

@SteveSandersonMS
Contributor

Summary

Two improvements to E2E test infrastructure:

1. Skip writing snapshots on test failure

Prevents corrupted snapshots from being written when tests fail. Each language uses its native test framework hooks:

  • Node.js: vitest onTestFailed() hook
  • Python: pytest_runtest_makereport hook
  • Go: t.Failed() check in cleanup
  • .NET: checks the CI env var (xUnit lacks failure detection hooks)
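
As a rough illustration of the Python item above, here is a minimal sketch of failure tracking in a conftest.py. It uses pytest's pytest_runtest_logreport hook rather than the pytest_runtest_makereport hook the PR names, only because it needs no pytest import; should_skip_writing_cache is a hypothetical helper, not something taken from the PR.

```python
# conftest.py-style sketch. Only pytest_runtest_logreport is a real pytest
# hook name; should_skip_writing_cache is a hypothetical helper.

# Flips to True as soon as any test phase reports a failure.
_any_test_failed = False


def pytest_runtest_logreport(report):
    """pytest calls this with the setup, call and teardown report of each test."""
    global _any_test_failed
    if report.failed:
        _any_test_failed = True


def should_skip_writing_cache():
    """Teardown code can ask this before telling the proxy to stop."""
    return _any_test_failed
```

Teardown code would then pass the result of should_skip_writing_cache() to whatever stops the proxy, so a failed run leaves the existing snapshots untouched.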

2. Update snapshots for Anthropic extended thinking

The CLI now coalesces tool calls into single assistant messages for Anthropic extended thinking compatibility. This updates all affected snapshots to match the new format.
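
To make "coalescing" concrete, the sketch below merges consecutive assistant messages into one message that carries all the tool calls. It assumes OpenAI-style chat messages (dicts with role, content and tool_calls keys); the actual snapshot schema and the CLI's exact merging rule are not shown in this PR, so treat both as assumptions.

```python
def coalesce_tool_calls(messages):
    """Merge an assistant message with tool calls into a preceding assistant message.

    Assumes OpenAI-style chat messages; the real snapshot format may differ.
    """
    merged = []
    for msg in messages:
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.get("role") == "assistant"
            and msg.get("role") == "assistant"
            and msg.get("tool_calls")
        ):
            # Fold this message's tool calls (and any text) into the previous one.
            prev["tool_calls"] = list(prev.get("tool_calls") or []) + list(msg["tool_calls"])
            if msg.get("content"):
                prev["content"] = (prev.get("content") or "") + msg["content"]
            continue
        # Shallow-copy so the caller's input list and dicts are not mutated.
        merged.append(dict(msg))
    return merged
```

Under these assumptions, an assistant text message followed by two separate tool-call messages becomes a single assistant message with both tool calls, which matches the "intermediate assistant content message" disappearing from the snapshots.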

Related

Mirrors changes from https://github.com/github/copilot-agent-runtime/pull/1508

SteveSandersonMS requested a review from a team as a code owner January 16, 2026 00:35
Copilot AI review requested due to automatic review settings January 16, 2026 00:35
SteveSandersonMS deleted the stevesa/e2e-infra-snapshots branch January 16, 2026 00:36
Contributor

Copilot AI left a comment

Pull request overview

This PR improves E2E test infrastructure with two main enhancements: preventing corrupted snapshot writes on test failures and updating snapshots to reflect the CLI's new behavior of coalescing tool calls into single assistant messages for Anthropic extended thinking compatibility.

Changes:

  • Implements test failure detection across all SDK languages to prevent writing corrupted snapshots
  • Updates snapshot files to include both old (separate assistant messages) and new (coalesced tool calls) conversation formats
  • Modifies proxy stop endpoints to accept a skipWritingCache parameter
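
For illustration, a client-side call to such a stop endpoint might look like the sketch below. The /stop path, the POST method and the skipWritingCache=true query parameter come from the review thread; the function name and the proxy URL are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_stop_request(proxy_url, skip_writing_cache=False):
    """Build (but do not send) the POST request that stops the proxy.

    build_stop_request is a hypothetical helper, not the harness's own API.
    """
    url = proxy_url.rstrip("/") + "/stop"
    if skip_writing_cache:
        # Encode the flag as a real query parameter instead of concatenating text.
        url += "?" + urlencode({"skipWritingCache": "true"})
    return Request(url, data=b"", method="POST")
```

The request can be sent with urllib.request.urlopen(req). Keeping URL construction separate from sending makes the flag handling easy to check without a running proxy.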

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/snapshots/tools/invokes_built_in_tools.yaml | Adds new conversation format with coalesced tool calls and updates response content |
| test/snapshots/session/should_create_session_with_custom_tool.yaml | Removes intermediate assistant content message |
| test/snapshots/permissions/*.yaml | Adds new conversation snapshots with coalesced tool calls for all permission tests |
| test/harness/replayingCapiProxy.ts | Adds skipWritingCache query parameter support to /stop endpoint |
| python/e2e/testharness/proxy.py | Adds skip_writing_cache parameter to stop() method with documentation |
| python/e2e/testharness/context.py | Adds test_failed parameter to teardown() to control snapshot writing |
| python/e2e/conftest.py | Implements pytest hook to track test failures globally |
| nodejs/test/e2e/harness/sdkTestContext.ts | Implements vitest onTestFailed hook to track failures |
| nodejs/test/e2e/harness/CapiProxy.ts | Adds skipWritingCache parameter to stop() method |
| go/e2e/testharness/proxy.go | Adds StopWithOptions() method with skipWritingCache support |
| go/e2e/testharness/context.go | Uses t.Failed() to detect test failures in cleanup |
| dotnet/test/Harness/E2ETestContext.cs | Uses CI environment variable as workaround for test failure detection |
| dotnet/test/Harness/CapiProxy.cs | Adds skipWritingCache parameter to StopAsync() method |

Comment thread python/e2e/conftest.py
Comment on lines +9 to +10
# Track if any test failed to avoid writing corrupted snapshots
_any_test_failed = False
Copilot AI Jan 16, 2026

The global _any_test_failed variable is not thread-safe. If pytest runs tests in parallel (e.g., with pytest-xdist), multiple test processes could have race conditions. Consider using pytest's built-in mechanisms or a thread-safe approach to track test failures.

Copilot uses AI. Check for mistakes.
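
One possible direction for the concern above, sketched under assumptions: track failures per test behind a lock instead of in a single module-level flag. FailureTracker and its methods are hypothetical names. Note that pytest-xdist distributes tests across worker processes, each with its own module globals, so the lock only matters when several threads in one process report results.

```python
import threading


class FailureTracker:
    """Records which tests failed; safe to call from multiple threads.

    Hypothetical helper, not part of pytest or of this PR.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._failed = set()

    def record(self, test_id, failed):
        """Remember test_id if (and only if) it failed."""
        if failed:
            with self._lock:
                self._failed.add(test_id)

    def has_failed(self, test_id):
        """Return True if a failure was recorded for test_id."""
        with self._lock:
            return test_id in self._failed
```

Tracking by test id also means one failing test does not suppress snapshot writes for unrelated tests that passed in the same session.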
Comment thread dotnet/test/Harness/E2ETestContext.cs
Comment on lines +104 to +106
// Skip writing snapshots in CI to avoid corrupting them on test failures
var isCI = !string.IsNullOrEmpty(Environment.GetEnvironmentVariable("CI"));
await _proxy.StopAsync(skipWritingCache: isCI);
Copilot AI Jan 16, 2026

The .NET implementation uses a CI environment variable check as a workaround for lack of test failure detection, while other languages (Go, Python, Node.js) properly detect actual test failures. This means snapshots will always be skipped in CI even for passing tests, which differs from the behavior in other languages and may not align with the PR's stated goal of only skipping on failure. Consider using a more sophisticated approach like capturing test status through ITestOutputHelper or other xUnit extensibility points.

Suggested change
- // Skip writing snapshots in CI to avoid corrupting them on test failures
- var isCI = !string.IsNullOrEmpty(Environment.GetEnvironmentVariable("CI"));
- await _proxy.StopAsync(skipWritingCache: isCI);
+ await _proxy.StopAsync(skipWritingCache: false);
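
To show how the two policies differ, here is a small sketch written in Python rather than C#. skip_in_ci mirrors the !string.IsNullOrEmpty check on the CI variable, and skip_on_failure mirrors the failure-based behaviour described for the other languages. All three function names are hypothetical.

```python
import os


def skip_in_ci(environ=None):
    """Mirror of the .NET workaround: skip whenever CI is set and non-empty."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("CI"))


def skip_on_failure(test_failed):
    """Failure-based policy: skip only after a test has failed."""
    return bool(test_failed)


def policies_disagree(test_failed, environ=None):
    """True when the two policies would make different decisions."""
    return skip_in_ci(environ) != skip_on_failure(test_failed)
```

The policies disagree in exactly two situations: a passing test in CI (the CI check skips the write anyway) and a failing test run locally (the CI check still writes the snapshot).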

Comment thread test/harness/replayingCapiProxy.ts
    options.requestOptions.path?.startsWith("/stop") &&
    options.requestOptions.method === "POST"
) {
    const skipWritingCache = options.requestOptions.path.includes("skipWritingCache=true");
Copilot AI Jan 16, 2026

Using string .includes() to parse query parameters is fragile and could match unintended patterns (e.g., ?foo=skipWritingCache=true or ?skipWritingCache=true&bar=1). Consider using a proper URL query parameter parser like new URLSearchParams() for more robust parameter extraction.

Suggested change
- const skipWritingCache = options.requestOptions.path.includes("skipWritingCache=true");
+ let skipWritingCache = false;
+ const path = options.requestOptions.path;
+ const queryIndex = path?.indexOf("?");
+ if (queryIndex !== undefined && queryIndex !== -1) {
+     const searchParams = new URLSearchParams(path.substring(queryIndex + 1));
+     skipWritingCache = searchParams.get("skipWritingCache") === "true";
+ }
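
The same idea can be sketched in Python with the standard library's URL parser, which makes the difference from substring matching easy to see. The function names are hypothetical; only the /stop path and the skipWritingCache parameter come from this thread.

```python
from urllib.parse import parse_qs, urlsplit


def skip_requested_by_substring(path):
    """The fragile approach: matches the text anywhere in the path."""
    return "skipWritingCache=true" in path


def skip_requested(path):
    """Return True only if the skipWritingCache query parameter equals "true"."""
    values = parse_qs(urlsplit(path).query).get("skipWritingCache", [])
    # Like URLSearchParams.get(), look at the first occurrence only.
    return bool(values) and values[0] == "true"
```

For /stop?foo=skipWritingCache=true or /stop?skipWritingCache=trueish, the substring check returns True while the parsed check returns False. A path such as /stop?skipWritingCache=true&bar=1 is accepted by both, which is the intended result since the parameter really is set there.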

jmoseley pushed a commit that referenced this pull request Jan 21, 2026
* Expose permissions requests in SDK client

* bump copilot version

* update package.json

* Fix go client

* fix lint and formatting

* Add dotnet support

* format python

* Generate stubs

* Make the tests work

* reformat python

* formatting

* linting

* Remove resume test

* Backslashes
edburns added a commit that referenced this pull request May 26, 2026
Add 10 Java scenario implementations covering:
- callbacks: hooks (pre/post tool use, session start/end), permissions
- prompts: attachments
- sessions: concurrent-sessions, session-resume
- tools: custom-agents, tool-overrides, mcp-servers, skills
- auth: gh-app (OAuth device flow)

All scenarios compile successfully with mvn compile.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>