Context
The "Mobile Platform Failure Scanner" agentic workflow (.github/workflows/mobile-scan.md, run 25430707131) auto-files tracking issues for CI failures. Issue #127859 (filed by this scanner against runtime-diagnostics def 309) is a useful case study because it exhibits three concrete defects that are likely to recur:
- Sample window too narrow. The issue cited 5 builds and "past week"; the failure has actually been 100% red on main since at least build 1390492 (Apr 21), over 2 weeks. The scanner's caveat ("computed within the scanned window and may not be the true origin") is correct, but the upstream look-back of ~20 builds undersamples persistent failures.
- Recommended fix doesn't reflect existing code. The issue's "preferred" fix was "split the Helix payload into per-platform jobs", but cdac-dump-xplat-test-helix.proj already does exactly that, and the file's header comment says so explicitly. The scanner did not read the cited file before recommending a fix.
- Mis-routed area label. The failure is in Microsoft.DotNet.Helix.Sdk / arcade payload upload (MemoryStream 2 GiB ceiling in DirectoryPayload.DoUploadAsync), not a cDAC product issue, but the issue is labeled area-Diagnostics-coreclr. The scanner has no notion of "this stack trace points at arcade infrastructure, not the test under test".
Bonus oddity: this issue was filed by mobile-scan even though runtime-diagnostics (def 309) is not a mobile pipeline.
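To make the sample-window defect concrete, here is a minimal sketch of a streak computation that reports whether the true origin was actually inside the scanned window. It assumes the scanner can supply build results newest-first as (build id, passed) pairs; the `Streak` type and `failing_streak` function are hypothetical names, not part of the existing workflow.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Streak:
    length: int                    # consecutive failing builds, counted from the newest
    oldest_failing: Optional[int]  # oldest build id seen in the streak
    origin_found: bool             # True only if a passing build bounded the streak


def failing_streak(results: Iterable[Tuple[int, bool]]) -> Streak:
    """Walk builds newest-first and stop at the first passing build.

    `results` yields (build_id, passed) pairs. If the iterable is exhausted
    before a pass is seen, the streak is at least as long as the window and
    the real origin is unknown.
    """
    length = 0
    oldest = None
    for build_id, passed in results:
        if passed:
            return Streak(length, oldest, True)
        length += 1
        oldest = build_id
    return Streak(length, oldest, False)
```

When `origin_found` is False, the scanner could widen the look-back and retry rather than reporting a first-failure build that is only the edge of the window.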
Proposal
Pilot the dotnet-dnceng skills plugin as a second-opinion verifier before the workflow files an issue:
- ci-analysis would catch the area mis-label by running stack-trace → owner mapping.
- pipeline-investigation is the correct route for non-Helix-test errors (build-time MSBuild task failures like this one) and would have surfaced the "Send cDAC X-Plat Dump Tests to Helix (Unix)" timeline record with its succeededWithIssues result and proper recordId/logId.
- known-issue-history would compute a real failure-rate baseline by mining the build-analysis bot's hit-count edits, instead of approximating from a 20-build slice.
- The CiInvestigator agent at plugins/dotnet-dnceng/agents/CiInvestigator.agent.md already encodes the routing this scanner is missing.
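As an illustration of what stack-trace → owner mapping could look like, here is a rough sketch. It is not how ci-analysis is implemented; the rule table, the `route_area` function, and the `infra-placeholder` label are all assumptions made for this example, and only the two match strings come from the failure described above.

```python
from typing import Sequence, Tuple

# Hypothetical routing rules: (substring to look for in a stack frame, label).
# "infra-placeholder" stands in for whatever label the repo uses for
# arcade/Helix infrastructure; it is not a real label name.
INFRA_RULES: Sequence[Tuple[str, str]] = (
    ("Microsoft.DotNet.Helix.Sdk", "infra-placeholder"),
    ("DirectoryPayload.DoUploadAsync", "infra-placeholder"),
)


def route_area(stack_trace: str, default_label: str,
               rules: Sequence[Tuple[str, str]] = INFRA_RULES) -> str:
    """Return the label of the first rule matching any stack frame.

    Only lines that look like .NET stack frames ("at ...") are inspected.
    If no rule matches, the caller's default (product) label is kept.
    """
    for line in stack_trace.splitlines():
        frame = line.strip()
        if not frame.startswith("at "):
            continue
        for needle, label in rules:
            if needle in frame:
                return label
    return default_label
```

With a rule like this in front of labeling, a trace whose frames land in the Helix SDK would be routed away from area-Diagnostics-coreclr before the issue is filed.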
Obstacles to integration
The skills plugin isn't a drop-in: it depends on MCP servers (hlx, maestro, mcp-binlog-tool, mihubot) and the gh/az CLIs, while the gh-aw runtime today has a strict bash allowlist (no gh, no pwsh, no python, no $(...)) and doesn't currently load MCP servers. Two paths:
- (A) Wire MCP servers into the gh-aw engine config.
- (B) Port the skill scripts under plugins/dotnet-dnceng/skills/*/scripts/ to the workflow's allowlist (most are bash + curl + jq today).
Recommended next step
Run ci-analysis + pipeline-investigation against the proposed issue body before the workflow files it. Even if the workflow keeps producing the body itself, this gate would have rejected #127859 for the second and third defects above (the fix that already exists in the cited file, and the mis-routed area label). This is lower effort than a full migration and gives a concrete signal on whether deeper integration is worth doing.
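A minimal sketch of the gate's shape, assuming each verifier can be wrapped as a function that takes the draft issue and returns a list of objections. The `Draft` type, the `gate` function, and the example verifier are hypothetical; how the skills would actually be invoked from gh-aw is exactly the open question in the obstacles above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Draft:
    title: str
    body: str
    labels: List[str] = field(default_factory=list)


# A verifier inspects a draft and returns zero or more human-readable objections.
Verifier = Callable[[Draft], List[str]]


def gate(draft: Draft, verifiers: List[Verifier]) -> List[str]:
    """Collect objections from every verifier; an empty list means 'file it'."""
    objections: List[str] = []
    for verify in verifiers:
        objections.extend(verify(draft))
    return objections


def infra_trace_with_product_label(draft: Draft) -> List[str]:
    """Toy stand-in for a label check: flags a Helix SDK trace under a product label."""
    if ("Microsoft.DotNet.Helix.Sdk" in draft.body
            and "area-Diagnostics-coreclr" in draft.labels):
        return ["stack trace points at Helix SDK, but the label is a product area"]
    return []
```

Returning objections rather than a boolean lets the workflow either drop the issue or attach the objections for a human to review, whichever the pilot finds more useful.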
cc @steveisok @dotnet/runtime-infrastructure