Add AI-powered build failure analysis with NuGet MCP for VMR insertion PRs#6711

Open
YuliiaKovalova wants to merge 4 commits into
dotnet:mainfrom
YuliiaKovalova:feature/build-failure-analysis

@YuliiaKovalova YuliiaKovalova commented May 20, 2026

Summary

Adds AI-powered build failure analysis for the dotnet/dotnet VMR using a GitHub Actions agentic workflow — the same architecture as microsoft/testfx.

When a Maestro insertion PR fails the build, the AzDO pipeline automatically dispatches a GitHub Actions workflow where an AI agent (claude-opus-4.6) downloads the binlog, diagnoses the root cause, and posts a summary comment with inline suggestion blocks. The agent also has access to the NuGet MCP Server for resolving package version conflicts.

How it runs

Maestro insertion PR (darc-* branch)
  |
  v
AzDO: VMR build stages (existing)
  |
  +-- Succeeds -> nothing happens
  |
  +-- Fails -> TriggerBuildFailureAnalysis stage (new)
                |
                v  curl -> GitHub API workflow_dispatch
                |    (pr-number + azdo-build-id)
                v
              GitHub Actions: build-failure-analysis.md
                +-- Download failed job's BuildLogs from AzDO API
                +-- Select top-level VMR orchestration binlog
                +-- Dump errors via C# MCP client (ModelContextProtocol SDK)
                +-- Delegate to build-failure-analyst agent
                     +-- Root cause analysis (VMR-specific)
                     +-- NuGet conflict resolution via NuGet MCP Server
                     +-- PR comment with error table
                     +-- Inline suggestion blocks on diff lines
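
The dispatch step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual script (which uses curl): it only builds the request for GitHub's "create a workflow dispatch event" REST endpoint, using the two inputs named above (pr-number and azdo-build-id). The helper name, the `ref` default, and the workflow file name passed in the usage example are assumptions.

```python
import json


def build_dispatch_request(owner, repo, workflow_file, pr_number, azdo_build_id, ref="main"):
    """Return (url, headers, body) for a workflow_dispatch call.

    Hypothetical helper: only the endpoint shape and the two input names
    come from the PR description. The caller adds the Authorization header.
    """
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    headers = {
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    # workflow_dispatch inputs are sent as strings.
    body = json.dumps({
        "ref": ref,
        "inputs": {
            "pr-number": str(pr_number),
            "azdo-build-id": str(azdo_build_id),
        },
    })
    return url, headers, body


if __name__ == "__main__":
    # "example-workflow.yml" is a placeholder, not the real workflow file name.
    url, headers, body = build_dispatch_request(
        "dotnet", "dotnet", "example-workflow.yml", 1234, 5678
    )
    print(url)
    print(body)
```

Keeping request construction separate from sending makes the payload easy to check without a token or network access.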

MCP Servers available to the agent

| Server | Tools | Use case |
| --- | --- | --- |
| binlog-mcp | binlog_overview, binlog_errors, binlog_warnings | Extract build errors from MSBuild binary logs |
| NuGet MCP Server | nuget_fix_vulnerable_packages, nuget_update-package, nuget_get-latest-package-version | Resolve NU1605/NU1608 version conflicts with concrete remediation plans |
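
For readers unfamiliar with MCP: tools such as those listed above are invoked through JSON-RPC 2.0 `tools/call` requests. The sketch below shows what a call to binlog_errors might look like on the wire. The argument name `binlog_file` is an assumption made for illustration; the tool's real input schema is not given in this PR description.

```python
import json
from itertools import count

_request_ids = count(1)


def make_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "binlog_file" is a hypothetical argument name.
    request = make_tool_call(
        "binlog_errors",
        {"binlog_file": "artifacts/log/Release/Build.binlog"},
    )
    print(json.dumps(request, indent=2))
```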

Trigger conditions

The AzDO stage runs only when:

  • The build runs in the public project (where System.PullRequest.PullRequestNumber is the GitHub PR number)
  • The build is PR-triggered from a darc-* or release-pr-* branch
  • At least one build stage failed
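
The three conditions above amount to a simple predicate. The sketch below expresses them in Python for clarity only; the real check lives in the AzDO stage condition, and the function and parameter names here are hypothetical.

```python
from fnmatch import fnmatchcase

INSERTION_BRANCH_PATTERNS = ("darc-*", "release-pr-*")


def should_trigger_analysis(is_public_project, is_pr_build, source_branch, stage_results):
    """Return True only when all trigger conditions hold.

    stage_results maps a stage name to its result string,
    for example "succeeded" or "failed" (assumed values).
    """
    is_insertion_branch = any(
        fnmatchcase(source_branch, pattern) for pattern in INSERTION_BRANCH_PATTERNS
    )
    any_stage_failed = any(result == "failed" for result in stage_results.values())
    return is_public_project and is_pr_build and is_insertion_branch and any_stage_failed
```

For example, a PR build from a branch named `darc-main-abc` with one failed stage would qualify, while the same failure on a feature branch would not.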

Also available manually: gh workflow run build-failure-analysis.md -f pr-number=1234 -f azdo-build-id=5678

VMR-specific agent expertise

  • Maestro/Darc insertion patterns (eng/Version.Details.xml, eng/Versions.props)
  • Multi-pass builds (runtime pass 1/2, aspnetcore pass 1/2)
  • Source-build compatibility, TFM mismatches
  • NuGet conflict resolution via MCP (not just error reporting)
  • When fixes belong in the upstream component repo vs dotnet/dotnet

Key design decisions

| Decision | Rationale |
| --- | --- |
| AzDO -> GitHub dispatch | VMR builds in AzDO; analysis runs in GitHub Actions where gh-aw agents are available |
| Download binlogs | Reuses existing AzDO build artifacts instead of re-building the VMR |
| Failed job detection | Queries AzDO timeline API to find the failed job and download only its artifact |
| Top-level binlog | artifacts/log/Release/Build.binlog aggregates all errors across repos |
| C# MCP client | Uses ModelContextProtocol NuGet package natively (no Node.js) |
| NuGet MCP Server | Agent can resolve version conflicts, not just report them |
| Agentic workflow | Multi-turn reasoning with source file access and tool use |
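
Two of the decisions above (failed job detection and top-level binlog selection) can be illustrated with a short sketch. It assumes the timeline response has already been fetched and parsed, and that it contains a "records" list whose entries carry "type", "name", and "result" fields. The fallback used when the top-level binlog is missing is an assumption, not something the PR describes.

```python
TOP_LEVEL_BINLOG = "artifacts/log/Release/Build.binlog"


def find_failed_jobs(timeline):
    """Return the names of timeline records that are jobs with a failed result."""
    return [
        record["name"]
        for record in timeline.get("records", [])
        if record.get("type") == "Job" and record.get("result") == "failed"
    ]


def select_top_level_binlog(paths):
    """Prefer the VMR orchestration binlog; otherwise pick the shallowest .binlog."""
    normalized = [p.replace("\\", "/") for p in paths]
    binlogs = [p for p in normalized if p.lower().endswith(".binlog")]
    for path in binlogs:
        if path.endswith(TOP_LEVEL_BINLOG):
            return path
    # Fallback heuristic (assumption): the path with the fewest segments wins.
    return min(binlogs, key=lambda p: p.count("/"), default=None)
```

Filtering on the job record type keeps failed tasks and stages from being reported as separate failures when only the job's artifact is needed.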

Files

| File | Purpose |
| --- | --- |
| eng/pipelines/pr.yml | New TriggerBuildFailureAnalysis stage |
| .github/workflows/build-failure-analysis.md | Agentic workflow: download binlogs, dump, delegate |
| .github/workflows/shared/build-failure-analysis-shared.md | Shared agent delegation |
| .github/agents/build-failure-analyst.agent.md | VMR-specific agent with NuGet MCP integration |
| .github/workflows/scripts/DumpBinlog/ | C# MCP client (ModelContextProtocol SDK) |

Prerequisites

  • A GitHubCommentToken secret variable in the public AzDO pipeline, holding a token with issues:write, pull-requests:write, and actions:write permissions

Validated end-to-end

Tested on a simulation PR in a fork, using real AzDO build artifacts from failed builds. The agent successfully downloaded binlogs, analyzed errors, and posted diagnostic comments.

Prior art

@YuliiaKovalova YuliiaKovalova force-pushed the feature/build-failure-analysis branch 2 times, most recently from 6afc8e5 to 6f5cabb Compare May 20, 2026 14:33
@YuliiaKovalova YuliiaKovalova changed the title Add AI-powered build failure analysis for insertion PRs Add AI-powered build failure analysis for Maestro insertion PRs May 20, 2026
@YuliiaKovalova YuliiaKovalova marked this pull request as ready for review May 20, 2026 14:47
@YuliiaKovalova YuliiaKovalova requested review from a team as code owners May 20, 2026 14:47

Copilot AI left a comment

Pull request overview

Adds an Azure DevOps “Build Failure Analysis” stage intended to run on failed Maestro/Darc insertion PR builds, extract/aggregate MSBuild errors (preferably from binlogs), optionally analyze them with GitHub Models, and post a consolidated diagnostic comment plus inline fix suggestions back to the corresponding GitHub PR.

Changes:

  • Introduces a new failure-only pipeline stage + step templates to download build logs/binlogs, extract/dedupe errors, run AI analysis, and post PR feedback.
  • Adds Node-based helper scripts (MCP client, merge/dedupe, AI prompt/report generator, GitHub comment + review suggestion posters) plus a small unit-test script.
  • Adds a reusable template for publishing binlogs/logs as pipeline artifacts and a GitHub Actions “simulation” workflow for end-to-end validation.
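
The merge/dedupe step mentioned in the changes above could look roughly like the sketch below. It is not the reviewed merge-errors.js script. The choice of (code, file, line, message) as the identity of an error is an assumption.

```python
def merge_errors(extractions):
    """Merge error lists from several binlogs, dropping exact repeats.

    Each extraction is a list of error dicts. Two errors are treated as the
    same when code, file, line, and message all match (an assumed key).
    """
    seen = set()
    merged = []
    for errors in extractions:
        for error in errors:
            key = (
                error.get("code"),
                error.get("file"),
                error.get("line"),
                error.get("message"),
            )
            if key not in seen:
                seen.add(key)
                merged.append(error)  # first occurrence wins; order is preserved
    return merged
```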

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| eng/pipelines/pr.yml | Wires the new Build Failure Analysis stage into the PR pipeline. |
| eng/pipelines/templates/stages/build-failure-analysis.yml | Defines the analysis stage, artifact download, and GitHub PR resolution. |
| eng/pipelines/templates/steps/build-failure-analysis.yml | Implements binlog discovery, extraction, fallback parsing, AI analysis, and PR posting steps. |
| eng/pipelines/templates/steps/publish-binlogs.yml | Adds a reusable step template for publishing binlogs/logs as pipeline artifacts. |
| eng/build-failure-analysis/scripts/package.json | Defines Node dependencies for the analysis scripts. |
| eng/build-failure-analysis/scripts/package-lock.json | Locks transitive dependencies for reproducible installs. |
| eng/build-failure-analysis/scripts/.gitignore | Ignores node_modules for the scripts directory. |
| eng/build-failure-analysis/scripts/extract-binlog-errors.js | MCP client to extract overview/errors/warnings from a binlog. |
| eng/build-failure-analysis/scripts/merge-errors.js | Merges and deduplicates errors across multiple binlog extractions. |
| eng/build-failure-analysis/scripts/analyze-errors.js | Builds the prompt, calls GitHub Models (with fallback), and generates a markdown report. |
| eng/build-failure-analysis/scripts/post-pr-comment.js | Posts/updates a single diagnostic PR comment (marker-based update-in-place). |
| eng/build-failure-analysis/scripts/post-suggestions-azdo.js | Posts inline fix suggestions via GitHub Reviews API using PR diff-line mapping. |
| eng/build-failure-analysis/scripts/test-utils.js | Simple unit tests for shared path/linkification + merge behavior. |
| .github/workflows/build-failure-analysis-simulation.yml | Adds a GitHub Actions workflow to simulate a failure and exercise the scripts. |
Files not reviewed (1)
  • eng/build-failure-analysis/scripts/package-lock.json: Language not supported
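
The "marker-based update-in-place" behavior described for post-pr-comment.js can be sketched as below. This is an illustration of the general technique, not the script itself. The marker string and function name are hypothetical, and the comment dicts are assumed to carry "id" and "body" fields as in GitHub's issue comment listings.

```python
MARKER = "<!-- build-failure-analysis -->"  # hypothetical marker string


def plan_comment_upsert(existing_comments, report_markdown, marker=MARKER):
    """Decide whether to create a comment or update the one carrying the marker.

    Returns ("update", comment_id, body) when a marked comment exists,
    otherwise ("create", None, body). The marker is always embedded in the
    body so that the next run can find the comment again.
    """
    body = f"{marker}\n{report_markdown}"
    for comment in existing_comments:
        if marker in (comment.get("body") or ""):
            return ("update", comment["id"], body)
    return ("create", None, body)
```

Because the marker is an HTML comment, it stays invisible in the rendered PR while keeping repeated runs from piling up duplicate diagnostics.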

Comment thread eng/pipelines/templates/stages/build-failure-analysis.yml Outdated
Comment thread eng/pipelines/templates/stages/build-failure-analysis.yml Outdated
Comment thread eng/pipelines/templates/stages/build-failure-analysis.yml Outdated
Comment thread eng/pipelines/templates/steps/build-failure-analysis.yml Outdated
Comment thread .github/workflows/build-failure-analysis-simulation.yml Outdated
Comment thread eng/pipelines/templates/steps/publish-binlogs.yml Outdated
Comment thread .github/workflows/build-failure-analysis-simulation.yml Outdated
Comment thread .github/workflows/scripts/package-lock.json Outdated
Comment thread eng/pipelines/templates/stages/build-failure-analysis.yml Outdated
@YuliiaKovalova YuliiaKovalova changed the title Add AI-powered build failure analysis for Maestro insertion PRs Add AI-powered build failure analysis for insertion PRs (agentic workflow) May 20, 2026
Comment thread .github/workflows/scripts/dump-binlog.js Outdated
Comment thread .github/workflows/build-failure-analysis.md Outdated
Comment thread .github/workflows/build-failure-analysis.md
@YuliiaKovalova YuliiaKovalova changed the title Add AI-powered build failure analysis for insertion PRs (agentic workflow) Add AI-powered build failure analysis for VMR (agentic workflow) May 21, 2026
@YuliiaKovalova YuliiaKovalova changed the title Add AI-powered build failure analysis for VMR (agentic workflow) Add AI-powered build failure analysis for VMR insertion PRs May 21, 2026
@YuliiaKovalova YuliiaKovalova force-pushed the feature/build-failure-analysis branch from a670cdd to 0c003e4 Compare May 21, 2026 12:42
Adds an agentic workflow that automatically analyzes failed Maestro
insertion PR builds. When a darc-*/release-pr-* PR fails in the
public AzDO pipeline, a new stage dispatches a GitHub Actions
workflow where an AI agent (claude-opus-4.6) downloads the binlog,
diagnoses root causes, and posts a PR comment with inline suggestions.

Architecture:
- eng/pipelines/pr.yml: TriggerBuildFailureAnalysis stage that
  dispatches the GH Actions workflow via curl on build failure
- .github/workflows/build-failure-analysis.md: gh-aw agentic
  workflow that downloads binlogs from AzDO and delegates to agent
- .github/agents/build-failure-analyst.agent.md: VMR-specific
  build failure analyst with insertion pattern expertise
- .github/workflows/scripts/DumpBinlog/: C# MCP client using
  ModelContextProtocol SDK (no Node.js dependency)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@YuliiaKovalova YuliiaKovalova force-pushed the feature/build-failure-analysis branch from 0c003e4 to 906666b Compare May 21, 2026 12:45
YuliiaKovalova and others added 2 commits May 21, 2026 14:47
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@YuliiaKovalova YuliiaKovalova force-pushed the feature/build-failure-analysis branch from 3070ec6 to b6acbcd Compare May 21, 2026 15:47

YuliiaKovalova commented May 21, 2026

The fresh output result is here: YuliiaKovalova#3 (comment)

@YuliiaKovalova YuliiaKovalova changed the title Add AI-powered build failure analysis for VMR insertion PRs Add AI-powered build failure analysis with NuGet MCP for VMR insertion PRs May 21, 2026
@YuliiaKovalova YuliiaKovalova force-pushed the feature/build-failure-analysis branch 2 times, most recently from f6d1dfd to 90e650c Compare May 21, 2026 17:15
Integrates the official NuGet.Mcp.Server (v1.4.3) giving the agent
access to nuget_fix_vulnerable_packages, nuget_update-package, and
other tools for resolving NU1605/NU1608 version conflicts with
concrete remediation plans.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@YuliiaKovalova YuliiaKovalova force-pushed the feature/build-failure-analysis branch from 9818a70 to 48f0741 Compare May 21, 2026 17:40