Refactor MSBuild agentic workflows to share expertise via reusable agents by Evangelink · Pull Request #8389 · microsoft/testfx

Evangelink · 2026-05-20T13:51:15Z

Summary

Refactors the two recently-merged MSBuild-related workflows so they extend the canonical expert-reviewer agent and share their domain expertise via reusable subagents under .github/agents/. Each surface (PR review, scheduled scan, build-failure analysis) keeps its own trigger/limits but draws from the same catalog of rules.

Before

Workflow	Form	Lines	Notes
msbuild-quality-review.md	Agentic, weekly	223	All 5 rule categories inlined in the workflow prompt. Expert reviewer is unaware MSBuild authoring exists.
build-failure-analysis.yml + 3 JS scripts + `package.json / package-lock.json`	Traditional GH Actions, PR-triggered	~660	Calls `binlog-mcp` via Node + custom GH Models API calls for analysis and inline `suggestion` blocks. No reuse of the gh-aw safe-output / agent infrastructure.

After

Layer	File	What it does
Reusable agent	.github/agents/msbuild-reviewer.agent.md (new)	Full 5-category rule catalog. Two modes: diff (read-only, called by expert-reviewer, bounded to 10 findings) and scan (called by the weekly workflow, self-posts).
Reusable agent	.github/agents/build-failure-analyst.agent.md (new)	Reads pre-dumped binlog JSON, groups errors by root cause, posts one summary add-comment plus up to 10 inline `create-pull-request-review-comment` `suggestion` blocks.
Expert-reviewer hook	.github/agents/expert-reviewer.agent.md (modified)	New "Supplemental review: MSBuild authoring" subsection after Wave 1 delegates to msbuild-reviewer in diff mode when the PR diff touches `*.props`, `*.targets`, `Directory.Build.*`, `Directory.Packages.props`, or anything under `build/` / `buildTransitive/` / `buildMultiTargeting/`. Framed as a *supplemental* review, NOT a 22nd dimension, so the existing "21 dimensions" counters and summary table stay correct. Also extends the Knowledge Areas + Folder Hotspot Mapping tables.
Workflow shell	.github/workflows/msbuild-quality-review.md (slimmed 223 → 36 lines)	Weekly trigger + `imports: shared/msbuild-review-shared.md`. All rules moved to the agent.
Workflow shell	.github/workflows/build-failure-analysis.md (new, replaces `.yml`)	`pull_request` trigger; pre-agent steps build with `--binaryLog`, install `binlog-mcp`, dump JSON via `scripts/dump-binlog.js`, then delegate to `build-failure-analyst`. Advisory — does NOT fail the workflow on build failure (gh-aw has no post-agent step hook; deterministic build gate stays in `azure-pipelines.yml`).
Workflow shell	.github/workflows/build-failure-analysis-command.md (new)	Sibling workflow for the `/analyze-build-failure` slash-command (gh-aw forbids mixing `slash_command` with `pull_request` in one `on:` block).
Shared body	.github/workflows/shared/msbuild-review-shared.md (new)	Permissions, tools, safe-outputs, and the "launch agent in background then `noop`" prompt. Mirrors the `shared/review-shared.md` pattern used by the existing expert-review trio.
Shared body	.github/workflows/shared/build-failure-analysis-shared.md (new)	Same for the build-failure pair.
Helper	.github/workflows/scripts/dump-binlog.js (renamed from `extract-binlog-errors.js`)	~110 lines. Calls `binlog-mcp` via stdio MCP and writes `overview`/`errors`/`warnings` JSON to `/tmp/binlog-data/`. Scope tightened — no LLM calls.
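
For readers who want the expert-reviewer hook's file filter spelled out, here is a minimal sketch of the path test described in the table above. The function names and the exact matching rules are assumptions for illustration; the real delegation is driven by the agent prompt, not by this code.

```javascript
// Hypothetical helper: decide whether a PR diff touches MSBuild authoring files.
// Patterns mirror the list in the "Expert-reviewer hook" row; names are illustrative.
const MSBUILD_DIRS = ["build", "buildTransitive", "buildMultiTargeting"];

function isMsbuildAuthoringPath(path) {
  const segments = path.replace(/\\/g, "/").split("/");
  const fileName = segments[segments.length - 1];
  const dirs = segments.slice(0, -1);

  // *.props and *.targets (this also covers Directory.Packages.props)
  if (/\.(props|targets)$/i.test(fileName)) return true;
  // Directory.Build.* with any extension
  if (/^Directory\.Build\./i.test(fileName)) return true;
  // anything under build/, buildTransitive/ or buildMultiTargeting/
  return dirs.some((dir) => MSBUILD_DIRS.includes(dir));
}

function touchesMsbuildAuthoring(changedPaths) {
  return changedPaths.some(isMsbuildAuthoringPath);
}
```

With this sketch, a diff that only changes `.cs` or `.csproj` files outside the listed folders would not trigger the supplemental review.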

Deleted

.github/workflows/build-failure-analysis.yml
.github/workflows/scripts/analyze-errors.js
.github/workflows/scripts/extract-binlog-errors.js
.github/workflows/scripts/post-suggestions.js

(≈470 lines of hand-rolled GH Models API plumbing replaced by the agent + gh-aw safe-outputs.)

Key design choices (validated with a rubber-duck critique before implementation)

MSBuild expertise extracted to a subagent, not merged into expert-reviewer — keeps the rule catalog reusable from the scheduled scan, the PR review, and any future trigger.
Diff mode is strictly read-only — parent reviewer owns posting, so MSBuild findings flow through Wave 2 validation and share the same max: 30 comment budget instead of bypassing it.
Slash-command and pull_request split into sibling workflows — gh-aw rejects them in the same on: block. The shared body file is the existing repo idiom.
Binlog data dumped to JSON in a pre-agent step, not exposed via an MCP server — the gh-aw MCP gateway rejects uncontainerized stdio servers, so binlog-mcp (a dotnet global tool) is run once up front and the agent cats the JSON.
Build-failure workflow is advisory, not gating — gh-aw has no post-agent step hook to re-surface the build failure after the agent posts. Azure Pipelines remains the deterministic build gate. Documented inline.
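
To make the "one summary plus up to 10 inline suggestions" output shape concrete, here is a rough sketch. It is illustrative only: the analyst is an agent driven by a prompt, the record fields (`code`, `file`, `line`) are a hypothetical shape for the dumped errors JSON, and grouping by error code is just a stand-in for root-cause grouping.

```javascript
// Illustrative only: not the agent's implementation.
// The error record fields are assumed; the real dump-binlog.js output may differ.
const MAX_INLINE_SUGGESTIONS = 10; // cap stated in the PR description

function groupErrorsByCode(errors) {
  const groups = new Map();
  for (const error of errors) {
    const key = error.code ?? "UNKNOWN"; // error code as a proxy for "root cause"
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(error);
  }
  return groups;
}

function planComments(errors) {
  const groups = groupErrorsByCode(errors);
  // One summary line per group, largest group first.
  const summary = [...groups.entries()]
    .sort((a, b) => b[1].length - a[1].length)
    .map(([code, items]) => `${code}: ${items.length} occurrence(s)`);
  // At most one inline suggestion per group, anchored at its first occurrence,
  // and only when that occurrence carries a file and line to attach to.
  const inline = [...groups.values()]
    .map((items) => items[0])
    .filter((e) => e.file && Number.isInteger(e.line))
    .slice(0, MAX_INLINE_SUGGESTIONS);
  return { summary, inline };
}
```

Errors without a file or line only show up in the summary, which matches the idea that inline suggestions need a concrete diff location.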

Compile + lock files

All three changed/new .md sources compile cleanly with gh aw compile --strict (verified locally with gh-aw v0.74.4: 0 errors, 12 warnings — all benign shell-injection auto-fixes plus the standing centralized-routing suggestion).

This commit deliberately does not regenerate any .lock.yml. My current dev environment cannot reach github/gh-aw-actions to resolve SHA pins for github/gh-aw-actions/setup (SAML), so the generated lock files would fall back to tag pins (@v0.74.4) instead of SHA pins — degrading supply-chain pinning. Before this PR can be merged, somebody with auth needs to run:

gh aw compile --strict

That will produce build-failure-analysis.lock.yml, build-failure-analysis-command.lock.yml, and refresh msbuild-quality-review.lock.yml with proper SHA pins. Until then the existing msbuild-quality-review.lock.yml (compiled from the old inline body) keeps running, and the new build-failure workflows simply do not exist on Actions yet — so this PR is a no-op at runtime until the lock files are committed.
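
For whoever regenerates the lock files, a quick sanity check of the pinning could look like the sketch below. It is not part of this PR and not a gh-aw feature; it simply treats a `uses:` reference as SHA-pinned when the part after `@` is a full 40-character hex commit SHA, and it ignores `docker://` references and other edge cases.

```javascript
// A reference such as "owner/repo/path@<ref>" counts as SHA-pinned only when
// <ref> is a full 40-character hexadecimal commit SHA (a tag like v0.74.4 is not).
function isShaPinned(usesRef) {
  const at = usesRef.lastIndexOf("@");
  if (at === -1) return false;
  return /^[0-9a-f]{40}$/i.test(usesRef.slice(at + 1));
}

// Scan the text of a workflow/lock file and return every `uses:` value
// that is not SHA-pinned. Line-based matching, no YAML parser involved.
function findUnpinnedActions(lockYamlText) {
  const unpinned = [];
  for (const line of lockYamlText.split(/\r?\n/)) {
    const match = line.match(/^\s*(?:-\s*)?uses:\s*["']?([^"'\s#]+)/);
    if (!match) continue;
    const ref = match[1];
    if (ref.startsWith("./")) continue; // local actions have no ref to pin
    if (!isShaPinned(ref)) unpinned.push(ref);
  }
  return unpinned;
}
```

An empty result from `findUnpinnedActions` on a regenerated `.lock.yml` would suggest the compile resolved SHA pins rather than falling back to tag pins.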

Out of scope (intentionally not in this PR)

Re-pinning every other .lock.yml in the repo to gh-aw v0.74.4 (unrelated to MSBuild expertise).
Re-enabling a PR-blocking gate for build failures (would need a separate non-agentic build.yml — happy to do as a follow-up if the team wants it).

cc @YuliiaKovalova as author of the original two workflows being refactored.

…ents

Extracts MSBuild authoring expertise and build-failure-analysis expertise from the two recently-merged workflows on main (msbuild-quality-review and build-failure-analysis) into reusable subagents under .github/agents/, and hooks them into the canonical expert-reviewer pipeline so they extend it rather than duplicate review logic.

Goals:
* Same MSBuild expertise reachable from three triggers (PR review via expert-reviewer, weekly scheduled scan, manual slash command).
* Build-failure analysis replaces 470+ lines of hand-rolled Node.js that called the GitHub Models API directly with a small data-extraction helper + an agent that uses gh-aw safe-outputs for posting.
* Each surface (scheduled scan, PR-level review, build-failure analysis) can evolve its triggers/limits independently without touching the shared rule catalog.

New reusable subagents:
* .github/agents/msbuild-reviewer.agent.md
  - Two operating modes: 'diff' (called from expert-reviewer; read-only; emits ISSUE/LGTM blocks; bounded to 10 findings) and 'scan' (called from the weekly workflow; self-posts via create-issue / safe auto-fix create-pull-request).
  - Full rule catalog (5 categories, A-E) extracted verbatim from the original msbuild-quality-review body, plus explicit severity mapping to expert-reviewer's BLOCKING/MODERATE/NIT taxonomy.
* .github/agents/build-failure-analyst.agent.md
  - Reads pre-dumped binlog JSON files produced by the workflow's pre-agent steps. Groups errors by root cause, proposes minimal fixes, posts one summary comment + up to 10 inline 'suggestion' blocks via gh-aw safe-outputs.
  - The gh-aw MCP gateway does not support uncontainerized stdio MCP servers, so the binlog-mcp dotnet global tool is invoked once from a pre-agent step (via scripts/dump-binlog.js) to emit overview/errors/warnings JSON the agent can simply 'cat'.

expert-reviewer extension:
* Adds a 'Supplemental review: MSBuild authoring' section after Wave 1 that delegates to msbuild-reviewer in diff mode when the PR diff touches *.props, *.targets, Directory.Build.*, Directory.Packages.props, or anything under build/, buildTransitive/, buildMultiTargeting/.
* Framed as a *supplemental* review, NOT a 22nd dimension, so the existing '21 dimensions' counters and summary table stay correct.
* Adds an entry to TestFx-Specific Knowledge Areas and the Folder Hotspot Mapping pointing at the new agent.

Workflow shells (each thin and trigger-focused):
* .github/workflows/msbuild-quality-review.md: slimmed from 223 to 36 lines; imports shared/msbuild-review-shared.md; weekly trigger.
* .github/workflows/build-failure-analysis.md: replaces build-failure-analysis.yml. Runs build with binlog, dumps JSON, delegates to build-failure-analyst agent. Advisory: posts comments and inline suggestions but does NOT fail the workflow on build failure (gh-aw has no post-agent step hook; the deterministic build gate lives in azure-pipelines.yml).
* .github/workflows/build-failure-analysis-command.md: slash-command sibling (/analyze-build-failure on PR comments) for maintainer-initiated reruns.
* .github/workflows/shared/msbuild-review-shared.md: shared permissions/tools/safe-outputs + the 'launch agent in background then noop' prompt (mirrors shared/review-shared.md).
* .github/workflows/shared/build-failure-analysis-shared.md: same for the build-failure pair.

Deleted (replaced by agent + 1 small helper script):
* .github/workflows/build-failure-analysis.yml
* .github/workflows/scripts/analyze-errors.js
* .github/workflows/scripts/extract-binlog-errors.js
* .github/workflows/scripts/post-suggestions.js

Kept and renamed:
* .github/workflows/scripts/dump-binlog.js (renamed from extract-binlog-errors.js, scope tightened to data extraction only).

Lock files:

The gh-aw extension cannot be installed in the dev environment used to produce this commit (SAML enforcement on the github/gh-aw repo blocks 'gh extension install'). All .md sources compile cleanly with the downloaded gh-aw binary in strict mode (0 errors, 12 warnings, all benign shell-injection auto-fixes and a centralized-slash-routing suggestion). However, fully-resolved action SHAs for gh-aw-actions/setup cannot be looked up in this env, so the generated .lock.yml files would fall back to tag pins (@v0.74.4) instead of SHA pins. To avoid degrading supply-chain pinning in committed lock files, this commit does NOT include regenerated .lock.yml files. Before pushing, run:

gh aw compile --strict

This will regenerate the lock files (build-failure-analysis.lock.yml, build-failure-analysis-command.lock.yml) and update msbuild-quality-review.lock.yml to match the new .md sources, with proper SHA pins resolved against actions-lock.json.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Refactors MSBuild-related GitHub agentic workflows to reuse domain expertise via dedicated sub-agents, simplifying workflow shells and replacing custom Node/LLM plumbing with reusable agents and shared workflow bodies.

Changes:

Added reusable agents for MSBuild authoring review and build-failure analysis, and wired them into existing/updated workflows.
Replaced the legacy build-failure-analysis GH Actions workflow + scripts with gh-aw workflow shells plus a small binlog JSON dump helper.
Extracted common workflow configuration into shared imported markdown files.

Show a summary per file

File	Description
.github/workflows/shared/msbuild-review-shared.md	New shared import for MSBuild scan workflow (permissions/tools/safe-outputs + background-agent launch instructions).
.github/workflows/shared/build-failure-analysis-shared.md	New shared import that delegates to build-failure analyst agent (background mode + noop behavior).
.github/workflows/scripts/package.json	Points script entry at new dump helper and clarifies its purpose.
.github/workflows/scripts/dump-binlog.js	New helper to dump binlog overview/errors/warnings to JSON for the agent to consume.
.github/workflows/msbuild-quality-review.md	Slimmed workflow shell now importing shared MSBuild review body.
.github/workflows/build-failure-analysis.yml	Deletes legacy non-agentic workflow implementation.
.github/workflows/build-failure-analysis.md	New gh-aw workflow shell to run build, dump binlog JSON, then delegate to agent.
.github/workflows/build-failure-analysis-command.md	New slash-command companion workflow sharing the same agent delegation body.
.github/workflows/scripts/extract-binlog-errors.js	Removed (superseded by dump-binlog.js).
.github/workflows/scripts/analyze-errors.js	Removed (analysis moved into the agent).
.github/workflows/scripts/post-suggestions.js	Removed (inline suggestions moved into the agent).
.github/agents/msbuild-reviewer.agent.md	New reusable MSBuild authoring reviewer agent (diff + scan modes).
.github/agents/build-failure-analyst.agent.md	New reusable build-failure analyst agent (posts summary + inline suggestions).
.github/agents/expert-reviewer.agent.md	Delegates supplemental MSBuild authoring review to msbuild-reviewer agent when relevant files change.

Copilot's findings

Files reviewed: 14/14 changed files
Comments generated: 6

Address review comment on #8389: For workflow_dispatch runs in build-failure-analysis.md that pass inputs.pr-number, and for the slash-command sibling build-failure-analysis-command.md (triggered by pull_request_comment), github.sha is the default-branch tip rather than the PR head. That breaks permalinks in the analysis comment and inline review-comment placement. Add a 'Resolve PR head SHA' step that uses 'gh api' to fetch the real PR head SHA whenever the PR number is known but the github.event.pull_request payload is not present, and feed it into GH_AW_PR_HEAD_SHA via the existing fall-back chain. The pull_request trigger path is unchanged (still uses github.event.pull_request.head.sha, which is correct for that event). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
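
The fall-back order described in this commit message can be sketched as follows. This is an assumption-laden illustration, not the workflow step itself: the parameter names are hypothetical, and `fetchPrHeadSha` stands in for the `gh api` lookup.

```javascript
// Sketch of the head-SHA resolution order, under the assumptions stated above.
// - eventPullRequestHeadSha: github.event.pull_request.head.sha, when the event has it
// - prNumber: PR number from workflow_dispatch input or the comment event
// - fetchPrHeadSha: hypothetical callback standing in for the `gh api` lookup
// - githubSha: github.sha, which may be the default-branch tip
function resolvePrHeadSha({ eventPullRequestHeadSha, prNumber, fetchPrHeadSha, githubSha }) {
  // pull_request trigger: the payload already carries the correct head SHA.
  if (eventPullRequestHeadSha) return eventPullRequestHeadSha;
  // PR number known but no pull_request payload: look the head SHA up.
  if (prNumber && typeof fetchPrHeadSha === "function") {
    const fetched = fetchPrHeadSha(prNumber);
    if (fetched) return fetched;
  }
  // Last resort: fall back to github.sha.
  return githubSha;
}
```

The point of the ordering is that `github.sha` is only used when nothing better is available, since for dispatch and comment triggers it does not identify the PR head.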

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI review requested due to automatic review settings May 20, 2026 13:51

Copilot AI reviewed May 20, 2026


Evangelink marked this pull request as ready for review May 20, 2026 15:02

Copilot AI review requested due to automatic review settings May 20, 2026 15:02

Copilot started reviewing on behalf of Evangelink May 20, 2026 15:02 View session

Evangelink enabled auto-merge (squash) May 20, 2026 15:21

Copilot AI reviewed May 20, 2026


YuliiaKovalova approved these changes May 20, 2026


Evangelink merged commit dd13fca into main May 20, 2026
80 of 84 checks passed

Evangelink deleted the dev/amauryleve/msbuild-workflows-refactor branch May 20, 2026 15:26

This was referenced May 20, 2026

Improve inline suggestion instructions for build failure analyst agent #8392

Merged

Add AI-powered build failure analysis with NuGet MCP for VMR insertion PRs dotnet/dotnet#6711

Open

Evangelink merged 2 commits into main from dev/amauryleve/msbuild-workflows-refactor

3 participants
