Add gh-aw-lockfile skill for auto-compiling agentic workflow lock files #681
Closed
YuliiaKovalova wants to merge 5 commits into
Experiment branch to isolate the impact of "invoke prompted subagents more often" from the impact of "make those subagents do richer work." Same baseline as dev/ykovalova/cta-prompt-tuning (main = 66628b6), but strips out every content/quality rule and keeps only the dispatch plumbing.

Comparison branch: dev/ykovalova/cta-prompt-tuning (HEAD: bd530be) which contains both the dispatch mechanics AND content/quality rules.

Files modified (3 vs 5 in cta-prompt-tuning):
- code-testing-generator.agent.md +110 lines
- code-testing-implementer.agent.md +11 lines
- code-testing-fixer.agent.md +1 line
- code-testing-researcher.agent.md UNTOUCHED (baseline)
- code-testing-planner.agent.md UNTOUCHED (baseline)

KEPT (invocation / dispatch mechanics):

Generator:
- Rule 1: every task() call MUST use agent_type "dotnet-test:code-testing-..." (without this, calls dispatch generic built-ins and never reach the named CTA agents)
- Rule 2: routing table -- which named agent for which job
- Rule 3: prefer one named-agent dispatch over many tool calls
- Rule 4: orchestrator MUST NOT edit/create test files itself (forces implementer dispatch)
- Rule 5: orchestrator MUST NOT run builds/tests via terminal (forces builder/tester dispatch)
- Rule 6: every run MUST dispatch the planner (no exceptions for "small" scope; Direct still goes through planner)
- Rule 7: every build/test failure MUST dispatch the fixer
- Step 1b: mandatory initial researcher dispatch (every strategy)
- Direct strategy rewritten: dispatches planner -> implementer -> builder -> tester -> fixer -> linter (was "Skip Steps 3-5, write tests inline")
- All Step 3/4/5/6/7/8/9 dispatches converted from runSubagent({agent:...}) to task({ agent_type: "dotnet-test:code-testing-...", name:..., prompt:...})
- Step 9 validator dispatch (forces builder dispatch for cleanup)
- Steps 6/7 mandatory builder/tester dispatch wrapper

Implementer:
- Section 5: "you MUST dispatch fixer for build errors" + no-inline-edit block (forces fixer dispatch on build failures)
- Section 6: "you MUST dispatch fixer for test failures" + no-inline-edit block (forces fixer dispatch on test failures)
- Section 7: "Format Code (mandatory if a lint command exists)" (was "Optional"; mandatory firing of linter)
- Rule 6: never declare SUCCESS while build/tests fail (gates SUCCESS on fixer dispatch)
- Rule 7: no inline test-file edits between failed dispatch and fixer

Fixer:
- Frontmatter description widened to advertise handling of failing tests (without this, the orchestrator's routing logic does not select the fixer for test failures, so even Rule 7's mandate produces no firing -- this is the change that took fixer firing from 0.00/inst to 0.39/inst in earlier iterations)

DROPPED (content / quality rules -- in cta-prompt-tuning, NOT here):

Generator:
- Test-strength rules embedded in implementer dispatch prompt
- Test-design rules embedded in implementer dispatch prompt (OFAT, mutation self-check, never mock subject under test)
- File-location rules embedded in implementer dispatch prompt
- TARGET ENTITIES / PHASE CHECKLIST / TEST TRACEABILITY blocks in implementer dispatch prompt
- CHECKLIST format spec in planner dispatch prompt
- Step 9 validator's detailed cleanup classification

Implementer:
- Section 4b "Verify CHECKLIST coverage" pre-completion check
- Section 8 "CHECKLIST COVERAGE" report block
- "Honor the CHECKLIST" rule

Fixer:
- "Process -- Failing Tests" section (5-step diagnosis flow)
- All anti-weakening / anti-skipping rules
- "Re-derive expected from production source" guidance

Planner:
- CHECKLIST format ("one item per TARGET BEHAVIOR, Source/Variants/Expected mandatory")
- "Test name from research.md conventions" rule
- "At least 2 phases" rule

Researcher:
- Section 8 "Extract Local Test Naming & Style Conventions"
- TARGET ENTITIES / TARGET BEHAVIORS / TEST INFRASTRUCTURE structure in research.md
- Test naming pattern extraction

WHAT THE SUBAGENTS WILL ACTUALLY DO:

The researcher / planner / implementer / fixer all operate at baseline behavior -- they receive the same prompts they receive in the upstream "vanilla" runs. The only difference vs vanilla is that the orchestrator ACTUALLY DISPATCHES THEM (where vanilla often inlines the work or skips sub-agent dispatch entirely).

EXPECTED COMPARISON:

If quality on this branch is similar to or higher than dev/ykovalova/cta-prompt-tuning (bd530be), then "more dispatches" is the dominant quality lever and the content/quality rules in cta-prompt-tuning are adding marginal or noise-level value.

If quality on this branch is materially lower than cta-prompt-tuning, then the content/quality rules are doing the heavy lifting and the dispatch mechanics alone are insufficient.

If quality on this branch matches or exceeds vanilla but trails cta-prompt-tuning, then the dispatch mechanics provide a baseline lift and the content rules add an incremental quality layer on top.

Rubber-duck check passed (validated dispatch-vs-content classification; fixer frontmatter is routing metadata not a runtime gate; surviving dispatch prompts contain no dangling references to removed CHECKLIST / TARGET ENTITIES / TEST STRENGTH / naming-convention concepts).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
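The dispatch requirement described in Rules 1 and 2 of this commit message can be illustrated with a small sketch. It is not taken from the agent files: `build_task_call`, `ROUTES`, and the job labels are hypothetical, and the sketch assumes that each named agent's `agent_type` is the `dotnet-test:` prefix followed by the base name of its `.agent.md` file.

```python
# Hypothetical sketch of Rules 1-2: every dispatch carries a fully qualified
# agent_type so that it reaches a named CTA agent rather than a generic built-in.
AGENT_PREFIX = "dotnet-test:code-testing-"

# Assumed job -> agent routing; the real table lives in the generator prompt.
ROUTES = {
    "research": "researcher",
    "plan": "planner",
    "write_tests": "implementer",
    "build_or_test_failure": "fixer",
}


def build_task_call(job: str, name: str, prompt: str) -> dict:
    """Return the arguments for a task() dispatch for the given job."""
    if job not in ROUTES:
        # Refuse to fall back to an unnamed agent: that is the failure mode
        # Rule 1 is meant to prevent.
        raise ValueError(f"no named agent is routed for job {job!r}")
    return {
        "agent_type": AGENT_PREFIX + ROUTES[job],
        "name": name,
        "prompt": prompt,
    }


call = build_task_call("build_or_test_failure", "fix-build", "Fix the failing build.")
print(call["agent_type"])  # dotnet-test:code-testing-fixer
```

Raising on an unrouted job, instead of defaulting to some generic agent, mirrors the intent of the rule: a dispatch either names a CTA agent or does not happen.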
- Step 1b prompt: explicitly request unit-under-test (file:line) and behaviors so the verification gate rarely needs re-dispatch.
- Step 3: rename to 'Deep Research Phase', mark skipped for Direct strategy, and switch from overwriting research.md to extending it (no double-research).
- Step references: update '6-9' -> '6-10' and 'Step 9' -> 'Steps 9-10' in the strategy table and the All-strategies-MUST line, since reporting is Step 10.
- Step 6 builder prompt: drop '*.sln' glob (could expand to multiple args); use 'dotnet build --no-incremental' (auto-discovers .sln) per dotnet.md.
- Step 9: stop overloading the builder agent; perform diff/cleanup directly in the orchestrator (Rule 5 forbids inline build/test, not git/fs hygiene).
- Fixer agent: update mission text to cover failing tests and assertion correction (front-matter description already mentioned this; body now matches), with explicit no-Ignore/no-Skip/no-production-rewrite guardrails.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
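The Step 6 change is easier to see with a concrete expansion. The sketch below only imitates how a shell would expand the '*.sln' glob into arguments; `build_argv` and the solution file names are hypothetical, and nothing here invokes `dotnet`.

```python
import glob
import os
import tempfile


def build_argv(use_glob: bool, cwd: str) -> list[str]:
    """Return the argument vector a builder would run inside cwd."""
    argv = ["dotnet", "build"]
    if use_glob:
        # Imitates shell expansion of '*.sln': one argument per matching file.
        matches = glob.glob(os.path.join(cwd, "*.sln"))
        argv += sorted(os.path.basename(m) for m in matches)
    argv.append("--no-incremental")
    return argv


with tempfile.TemporaryDirectory() as repo:
    # Two solution files in one directory (hypothetical names).
    for name in ("A.sln", "B.sln"):
        open(os.path.join(repo, name), "w").close()
    with_glob = build_argv(True, repo)
    without_glob = build_argv(False, repo)

print(with_glob)     # ['dotnet', 'build', 'A.sln', 'B.sln', '--no-incremental']
print(without_glob)  # ['dotnet', 'build', '--no-incremental']
```

With the glob, the number of solution arguments depends on the directory contents; without it, the command line is fixed and solution discovery is left to `dotnet build`, as the commit message describes.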
Adds a skill that activates whenever .github/workflows/*.md or .github/agents/*.agent.md files are edited. It reminds and guides the agent to run 'gh aw compile --strict' and commit the regenerated .lock.yml files in the same change.

This prevents the common 'ERR_CONFIG: Lock file is outdated' error that occurs when .md workflow sources are edited without recompiling.

Discovered from: microsoft/testfx#8456, microsoft/testfx#8455

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
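The "same change" requirement can be checked mechanically. The following is a minimal sketch, not the skill's implementation: `lock_path_for` and `missing_locks` are hypothetical helpers, and the sketch assumes that `name.md` compiles to `name.lock.yml` in the same directory. It is a conservative heuristic, since some edits may legitimately leave the lock file unchanged.

```python
from pathlib import PurePosixPath

WORKFLOWS_DIR = PurePosixPath(".github/workflows")


def lock_path_for(md_path: str) -> str:
    """Assumed convention: name.md compiles to name.lock.yml next to it."""
    p = PurePosixPath(md_path)
    return str(p.with_name(p.stem + ".lock.yml"))


def missing_locks(changed: list[str]) -> list[str]:
    """List lock files that should accompany the changed workflow sources."""
    changed_set = set(changed)
    missing = []
    for path in changed:
        p = PurePosixPath(path)
        # Only top-level workflow sources have a one-to-one lock file here.
        if p.parent == WORKFLOWS_DIR and p.suffix == ".md":
            lock = lock_path_for(path)
            if lock not in changed_set:
                missing.append(lock)
    return missing


changed_files = [
    ".github/workflows/triage.md",
    ".github/workflows/review.md",
    ".github/workflows/review.lock.yml",
    "README.md",
]
print(missing_locks(changed_files))  # ['.github/workflows/triage.lock.yml']
```

A change set that edits `triage.md` without touching `triage.lock.yml` is flagged, which is the situation the skill is meant to catch before the change is committed.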
Summary
Adds a new skill `gh-aw-lockfile` to the `dotnet-ai` plugin that automatically detects and compiles stale GitHub Agentic Workflow lock files.

Problem

When editing `.github/workflows/*.md` or `.github/agents/*.agent.md` files, the compiled `.lock.yml` files must be regenerated with `gh aw compile --strict` and committed in the same change. Forgetting this causes `ERR_CONFIG: Lock file is outdated` failures at runtime.

This is a common mistake; it happened in:

Solution

The skill activates whenever agentic workflow `.md` files are edited and guides the agent to:
- Run `gh aw compile <workflow-id> --strict` for each affected workflow
- Commit `.lock.yml` files alongside `.md` changes

Skill activation triggers
- `.github/workflows/*.md` files
- `.github/workflows/shared/*.md` files
- `.github/agents/*.agent.md` files
- `ERR_CONFIG: Lock file is outdated` errors
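The file-based triggers and the per-workflow compile step can be sketched as below. This is an illustration under stated assumptions rather than the skill's actual logic: `is_trigger` and `compile_commands` are hypothetical, the workflow ID is assumed to be the `.md` file name without its extension, and edits to shared or agent files fall back to compiling everything because the sketch does not resolve which workflows depend on them. The error-message trigger is not covered.

```python
from pathlib import PurePosixPath

WORKFLOWS = PurePosixPath(".github/workflows")
SHARED = WORKFLOWS / "shared"
AGENTS = PurePosixPath(".github/agents")


def is_trigger(path: str) -> bool:
    """True if editing this path should activate the skill."""
    p = PurePosixPath(path)
    if p.parent in (WORKFLOWS, SHARED):
        return p.suffix == ".md"
    if p.parent == AGENTS:
        return p.name.endswith(".agent.md")
    return False


def compile_commands(changed: list[str]) -> list[list[str]]:
    """Return the gh aw compile invocations suggested for a set of edits."""
    triggers = [PurePosixPath(c) for c in changed if is_trigger(c)]
    if not triggers:
        return []
    # Shared and agent files may feed several workflows; this sketch does not
    # work out which ones, so it recompiles everything in that case.
    if any(p.parent != WORKFLOWS for p in triggers):
        return [["gh", "aw", "compile", "--strict"]]
    # Assumption: the workflow ID is the file name without ".md".
    return [["gh", "aw", "compile", p.stem, "--strict"] for p in triggers]


print(compile_commands([".github/workflows/triage.md", "src/App.cs"]))
# [['gh', 'aw', 'compile', 'triage', '--strict']]
print(compile_commands([".github/workflows/shared/tools.md"]))
# [['gh', 'aw', 'compile', '--strict']]
```

Each returned list is an argument vector that could be passed to a process runner; the sketch stops short of executing anything so that it stays independent of a local `gh` installation.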