.NET: Python: Add dotnet integration test report to CI #5515
Merged
Conversation
Pull request overview
This PR adds CI visibility for .NET integration test outcomes by publishing JUnit XML from the existing dotnet-test matrix legs and reusing the existing Python trend-aggregation script to generate a Job Summary report with cached history.
Changes:
- Update the .NET integration test step to emit JUnit XML into a dedicated `IntegrationTestResults/` directory and upload those XML files as per-matrix artifacts.
- Add a new `dotnet-integration-test-report` job that downloads the artifacts, aggregates them into a trend report, posts it to the GitHub Actions Job Summary, and caches history.
- Refactor `python/scripts/flaky_report/aggregate.py` to discover both `pytest.xml` and `*.junit.xml`, derive dotnet "provider" labels, and avoid nodeid collisions across providers.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `python/scripts/flaky_report/aggregate.py` | Extends report discovery/parsing to support dotnet xUnit JUnit XML and multi-provider collision handling. |
| `.github/workflows/dotnet-build-and-test.yml` | Generates/uploads dotnet integration JUnit XML and adds a reporting job to aggregate and publish a trend report. |
- Add `--report-junit` flag to dotnet integration test step to generate JUnit XML alongside TRX, with explicit `--results-directory` to centralize output in `IntegrationTestResults/`
- Upload JUnit XML artifacts from each matrix leg (net10.0/ubuntu, net472/windows) as `dotnet-test-results-{framework}-{os}`
- Add `dotnet-integration-test-report` job that downloads artifacts, runs the existing `aggregate.py` script, posts markdown to Job Summary, and saves trend history via `actions/cache`
- Refactor `aggregate.py` to discover JUnit XML files recursively, supporting both pytest (`pytest.xml`) and xunit (`*.junit.xml`) layouts
- Handle provider name derivation for the dotnet artifact naming convention
- Fix nodeid collision when the same test runs under multiple frameworks by qualifying keys with provider when collisions are detected
- Improve module extraction for dotnet C# classnames (recognizes IntegrationTests/UnitTests namespace segments)
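The module-extraction heuristic in the last bullet might look roughly like this in Python. This is an illustrative sketch only: the function name, fallback behavior, and sample classnames are assumptions, not the actual `aggregate.py` implementation.

```python
def extract_module(classname: str) -> str:
    """Derive a short module label from a C# fully-qualified classname.

    Sketch of the heuristic: if a namespace segment such as
    'IntegrationTests' or 'UnitTests' appears, the segment after it
    (when one exists before the class name) is used as the module;
    otherwise fall back to the namespace segment containing the class.
    """
    parts = classname.split(".")
    for marker in ("IntegrationTests", "UnitTests"):
        if marker in parts:
            idx = parts.index(marker)
            # A segment must exist between the marker and the class name itself
            if idx + 1 < len(parts) - 1:
                return parts[idx + 1]
            return marker
    # Fallback: the namespace segment directly containing the class
    return parts[-2] if len(parts) >= 2 else classname


# Hypothetical classnames, for illustration only:
print(extract_module("Contoso.Agents.IntegrationTests.Samples.ConsoleAppSamplesValidation"))
# → Samples
```

The point of the marker-based lookup is that dotnet classnames carry deep namespaces, so taking the raw namespace prefix as the "module" would produce long, noisy labels in the report.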
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
xUnit v3 generates files with .junit extension, not .junit.xml. Update upload glob and aggregate.py discovery to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
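The corrected discovery logic could be sketched with a `pathlib` walk. The helper name mirrors the `_discover_xml_files()` mentioned later in the description, but the body here is an assumption, not the real script:

```python
from pathlib import Path


def discover_xml_files(root: str) -> list[Path]:
    """Recursively find JUnit-style reports under 'root':
    pytest's 'pytest.xml' plus xUnit v3's '*.junit' files
    (v3 emits a bare .junit extension, not .junit.xml)."""
    base = Path(root)
    # Sort each group so results are deterministic across runs
    return sorted(base.rglob("pytest.xml")) + sorted(base.rglob("*.junit"))
```

A glob of `*.junit.xml` would silently match nothing against xUnit v3 output, which is why the upload glob and the discovery pattern both had to change together.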
Always prefix dotnet test keys with provider (e.g. net10.0 (ubuntu)::TestName) to ensure stable, comparable counts across runs regardless of file parse order. Also show Executed (passed+failed) instead of Total in summary table. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
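The key-prefixing and "Executed" changes can be sketched as follows. The helper names are illustrative; only the naming convention (`dotnet-test-results-{framework}-{os}`) and the `provider::test` key shape come from the PR itself:

```python
def provider_label(artifact_dir: str) -> str:
    """Map an artifact directory name like 'dotnet-test-results-net10.0-ubuntu'
    to a provider label like 'net10.0 (ubuntu)'."""
    rest = artifact_dir.removeprefix("dotnet-test-results-")
    framework, _, os_name = rest.rpartition("-")
    return f"{framework} ({os_name})"


def qualified_key(provider: str, test_name: str) -> str:
    """Always prefix with the provider so keys are stable and comparable
    across runs, regardless of file parse order."""
    return f"{provider}::{test_name}"


def executed(passed: int, failed: int) -> int:
    """'Executed' column: passed + failed (skipped tests excluded)."""
    return passed + failed


print(qualified_key(provider_label("dotnet-test-results-net10.0-ubuntu"),
                    "LongRunningToolsSampleValidationAsync"))
# → net10.0 (ubuntu)::LongRunningToolsSampleValidationAsync
```

Prefixing unconditionally, rather than only on detected collisions, avoids a subtle instability: with collision-only qualification, the same test could be keyed differently depending on which report file happened to be parsed first.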
The LongRunningToolsSampleValidationAsync test in the AzureFunctions integration tests was failing in CI with TimeoutException at the 'Content published notification is logged' step. The 90-second timeouts are too tight for CI environments where LLM calls and orchestration overhead can be slow. Increased all three WaitForConditionAsync timeouts from 90s to 180s:
- Waiting for human feedback notification
- Waiting for publish notification (the step that was failing)
- Waiting for orchestration completion

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
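The polling pattern behind a helper like `WaitForConditionAsync` can be sketched in Python. The C# helper itself is not shown in this PR excerpt, so the name, signature, and defaults below are assumptions; only the 180s figure comes from the commit:

```python
import time


def wait_for_condition(condition, timeout_s: float = 180.0, poll_s: float = 0.5) -> None:
    """Poll 'condition' until it returns True or 'timeout_s' elapses.
    Python analog of the C# WaitForConditionAsync pattern; the 180s
    default mirrors the relaxed CI timeout described above."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return
        time.sleep(poll_s)
    raise TimeoutError(f"Condition not met within {timeout_s}s")
```

With this shape, raising the timeout only changes how long a *failing* run waits; a passing run returns as soon as the condition holds, so the change costs nothing on green runs.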
Merge upstream/main which renamed scripts/flaky_report/ to scripts/integration_test_report/ (from Python PR #5454). Update the dotnet-build-and-test workflow to reference the new path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
These tests interact with LLMs via stdin/stdout (DurableTask) or HTTP (AzureFunctions) and are inherently non-deterministic. Unlike the Python side, which uses pytest-retry, the dotnet tests had no retry mechanism, and a single transient failure would fail the entire CI run. Changes:
- Switch [Fact] to [RetryFact(2, 5000)] on all LLM-dependent tests across ConsoleAppSamplesValidation, ExternalClientTests, WorkflowConsoleAppSamplesValidation, and AzureFunctions SamplesValidation
- Add re-prompt mechanism to LongRunningToolsSampleValidationAsync: if the LLM doesn't invoke the tool within 60s, re-send the prompt (up to 2 retries) instead of burning the full timeout
- Reduce LongRunningTools timeout from 240s to 180s (re-prompt makes the extra buffer unnecessary)
- Leave simple/deterministic tests as [Fact] (SingleAgent, unit tests)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
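The retry semantics of `[RetryFact(2, 5000)]` (two attempts, 5000ms delay between them) can be sketched as a Python analog. This is a hand-rolled illustration, not the xUnit attribute's actual implementation:

```python
import time


def run_with_retry(test_fn, max_attempts: int = 2, delay_s: float = 5.0):
    """Analog of [RetryFact(2, 5000)]: re-run a flaky test up to
    'max_attempts' times total, sleeping 'delay_s' between attempts.
    Assumes max_attempts >= 1; re-raises the last failure if all
    attempts fail."""
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            return test_fn()
        except Exception as exc:  # e.g. a transient LLM failure
            last_exc = exc
            if attempt < max_attempts:
                time.sleep(delay_s)
    raise last_exc
```

The re-prompt mechanism described above applies the same idea at a finer granularity: instead of restarting the whole test after a full timeout, it re-sends only the prompt when the tool call fails to appear within a shorter window.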
Matches the convention used by other checkout steps in this workflow to avoid leaving GITHUB_TOKEN credentials in the local git config. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
lokitoth
approved these changes
May 5, 2026
westey-m
reviewed
May 6, 2026
westey-m
approved these changes
May 6, 2026
peibekwe
approved these changes
May 6, 2026
Motivation and Context
Improve .NET integration test reliability and add visibility into test results across CI runs. This PR adds retry attributes (`[RetryFact]`) to flaky LLM-dependent integration tests to reduce spurious failures.

Description
Integration Test Report (`dotnet-build-and-test.yml`)
- Added `--report-junit` flag to integration test steps to generate JUnit XML output
- Added `--results-directory ../IntegrationTestResults/` to centralize output (separate from unit test TRX results)
- Uploads per-matrix artifacts (net10.0/ubuntu, net472/windows)
- Added a `dotnet-integration-test-report` job that aggregates results, generates a trend report, and posts to Job Summary
- Added `persist-credentials: false` to the checkout step for security consistency

The report job is not in the merge gate (`dotnet-build-and-test-check` doesn't depend on it) and only runs on non-PR events.

RetryFact for Flaky Integration Tests
- DurableTask (`ConsoleAppSamplesValidation.cs`, `ExternalClientTests.cs`, `WorkflowConsoleAppSamplesValidation.cs`): All LLM-dependent tests now use `[RetryFact(2, 5000)]` — retries once after 5s delay on transient LLM failures
- AzureFunctions (`SamplesValidation.cs`): Same retry pattern applied to all 7 active tests; LongRunningTools timeouts increased from 90s to 180s
python/scripts/integration_test_report/aggregate.py)pytest.xml) and xunit (*.junit) layouts via_discover_xml_files()dotnet-test-results-{framework}-{os})Key Design Decisions
- Used `--report-junit` instead of parsing TRX — xunit v3 supports native JUnit generation, allowing reuse of the existing Python report script
- A separate cache key prefix (`dotnet-integration-report-history-`) prevents dotnet/Python history from interleaving

Contribution Checklist