Add validate-audit skill: parallel post-audit finding validation#528
Conversation
- Update test_skills.py: add validate-audit to BUNDLED_SKILLS, AUDIT_SKILL_NAMES, and test_all_audit_skills_have_audit_category; bump skill count assertions to 58/60 - Create tests/skills/test_validate_audit_contracts.py with T-VAL-006 through T-VAL-015 contract tests for the forthcoming validate-audit SKILL.md (all tests fail until Part B creates the skill directory) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Create src/autoskillit/skills_extended/validate-audit/SKILL.md implementing post-audit finding validation with 9–10 parallel subagents. Supports all three audit formats (arch, tests, cohesion), produces validated_report and contested_findings output files, and handles interactive/headless mode. Update docs skill count from 60 to 61 throughout. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek
left a comment
There was a problem hiding this comment.
AutoSkillit PR Review — Verdict: changes_requested (10 blocking findings)
| class TestValidateAuditContent: | ||
| # T-VAL-008 | ||
| def test_validated_true_marker_required(self) -> None: | ||
| assert "validated: true" in _skill_text() |
There was a problem hiding this comment.
[warning] tests: T-VAL-008: asserting 'validated: true' is a raw string-in-file check that will silently pass if the string appears anywhere (e.g. in a comment or example), providing no structural guarantee about SKILL.md frontmatter. Consider parsing the YAML frontmatter and asserting the field value.
There was a problem hiding this comment.
Investigated — this is intentional. T-VAL-008 is a contract test on the SKILL.md text itself, not on any output YAML file. It verifies the skill spec contains the 'validated: true' requirement (SKILL.md line 49 and output template). Parsing YAML frontmatter of the skill description would be wrong — that frontmatter has no 'validated' key.
| assert "validated: true" in _skill_text() | ||
|
|
||
| # T-VAL-009 | ||
| def test_parallel_single_message_dispatch(self) -> None: |
There was a problem hiding this comment.
[warning] tests: T-VAL-009: asserting 'single message' and 'parallel' appear anywhere in lowercased text is too broad. Both terms appear in many unrelated contexts. The test cannot distinguish the intended dispatch contract from incidental mentions.
There was a problem hiding this comment.
Investigated — this is intentional. 'single message' is a domain-specific phrase appearing only in the Step 3 SINGLE MESSAGE section header. These terms are tightly scoped contract markers for the parallel dispatch contract, matching the established pattern for SKILL.md contract tests in this codebase.
|
|
||
| ### Step 6 — Interactive vs Headless Approval | ||
|
|
||
| Detect headless mode: run `echo "${AUTOSKILLIT_HEADLESS:-0}"` via Bash. Output `1` means |
There was a problem hiding this comment.
[warning] defense: Headless mode detection relies on AUTOSKILLIT_HEADLESS env var but no guard is documented for unexpected values. The skill should specify the safe default (interactive or headless) when the variable check produces neither '0' nor '1'.
There was a problem hiding this comment.
Investigated — this is intentional. The bash ${AUTOSKILLIT_HEADLESS:-0} expression IS the guard: it normalizes unset/empty to '0' (safe default). Any unexpected non-empty value (e.g. 'true', '2') does not equal '1' and naturally falls to interactive mode — the safe default. The behavior is safe and unambiguous for all possible inputs.
| class TestValidateAuditContent: | ||
| # T-VAL-008 | ||
| def test_validated_true_marker_required(self) -> None: | ||
| assert "validated: true" in _skill_text() |
There was a problem hiding this comment.
[warning] tests: T-VAL-008: asserting 'validated: true' is a raw string-in-file check that will silently pass if the string appears anywhere (e.g. in a comment or example), providing no structural guarantee about SKILL.md frontmatter. Consider parsing the YAML frontmatter and asserting the field value.
There was a problem hiding this comment.
Investigated — this is intentional. T-VAL-008 is a contract test on the SKILL.md text itself, not on any output YAML file. It verifies the skill spec contains the 'validated: true' requirement (SKILL.md line 49 and output template). Parsing YAML frontmatter of the skill description would be wrong — that frontmatter has no 'validated' key.
| assert "validated: true" in _skill_text() | ||
|
|
||
| # T-VAL-009 | ||
| def test_parallel_single_message_dispatch(self) -> None: |
There was a problem hiding this comment.
[warning] tests: T-VAL-009: asserting 'single message' and 'parallel' appear anywhere in lowercased text is too broad. Both terms appear in many unrelated contexts. The test cannot distinguish the intended dispatch contract from incidental mentions.
There was a problem hiding this comment.
Investigated — this is intentional. 'single message' is a domain-specific phrase appearing only in the Step 3 SINGLE MESSAGE section header. These terms are tightly scoped contract markers for the parallel dispatch contract, matching the established pattern for SKILL.md contract tests in this codebase.
|
|
||
| ### Step 6 — Interactive vs Headless Approval | ||
|
|
||
| Detect headless mode: run `echo "${AUTOSKILLIT_HEADLESS:-0}"` via Bash. Output `1` means |
There was a problem hiding this comment.
[warning] defense: Headless mode detection relies on AUTOSKILLIT_HEADLESS env var but no guard is documented for unexpected values. The skill should specify the safe default (interactive or headless) when the variable check produces neither '0' nor '1'.
There was a problem hiding this comment.
Investigated — this is intentional. The bash ${AUTOSKILLIT_HEADLESS:-0} expression IS the guard: it normalizes unset/empty to '0' (safe default). Any unexpected non-empty value (e.g. 'true', '2') does not equal '1' and naturally falls to interactive mode — the safe default. The behavior is safe and unambiguous for all possible inputs.
Trecek
left a comment
There was a problem hiding this comment.
AutoSkillit review found 10 blocking issues. See inline comments. (Note: GitHub prevents requesting changes on your own PR — submitting as comment.)
…3/T-VAL-015 assertions - Add @functools.cache to _skill_text() to eliminate 7+ redundant file-system reads across the test class (reviewer finding: 7 redundant reads per run) - T-VAL-013: replace broad 'headless'/'interactive' substring checks with the specific contract phrase 'Interactive vs Headless' from Step 6 section header - T-VAL-015: replace weak 'history' check with specific phrase 'history research agent' that unambiguously targets the named agent contract
- Add no-files fallback: when audit_report_path is omitted and no files exist under any of the three temp directories, print error and exit non-zero - Add unrecognized format guard: when title/preamble matches none of the three detection patterns, print a specific error message and exit non-zero - Fix logic contradiction in Step 2: merging clusters cannot increase the group count from <8 to ≥8; replace with 'assign each area its own batch' rule
CLAUDE.md specifies 42 MCP tools: 40 kitchen-tagged + 1 headless (test_check, also kitchen-tagged) + 2 free-range (open_kitchen, close_kitchen).
Summary
Implements the
validate-auditskill — a parallel subagent audit validator that examines findings fromaudit-arch,audit-tests, andaudit-cohesionagainst actual code, git history, and design intent. Dispatches 9–10 agents simultaneously (8–9 code validators + 1 history researcher) and produces a validated report with avalidated: truemarker plus a separate contested findings file. This PR covers both the test infrastructure (15 T-VAL contract tests) and the SKILL.md implementation that makes them all pass.Individual Group Plans
Group 1: validate-audit Skill — Post-Audit Finding Validation with Parallel Subagents — PART A ONLY
Create the
validate-auditskill that validates audit findings fromaudit-arch,audit-tests,and
audit-cohesionagainst actual code, git history, and design intent. The skill dispatches9–10 parallel subagents in a single message — 8–9 code validation agents (one per thematic
finding batch, grouped by code area) and 1 history research agent. Verdicts are VALID,
VALID BUT EXCEPTION WARRANTED, or CONTESTED. Output: a validated report with
validated: trueheader marker (CONTESTED findings removed, exceptions documented, severity adjustments noted)
plus a separate contested findings file. Interactive sessions prompt for approval; headless
auto-writes.
Part A covers: Failing test infrastructure (update
tests/workspace/test_skills.py+ createtests/skills/test_validate_audit_contracts.py). After Part A, all new tests fail as expectedbecause the skill directory does not yet exist.
Group 2: validate-audit Skill — Post-Audit Finding Validation with Parallel Subagents — PART B ONLY
Part B covers the primary deliverable: creating
src/autoskillit/skills_extended/validate-audit/SKILL.md. After this part, all 15 T-VAL testsfrom Part A pass.
Part A (separate task) covered: updating
tests/workspace/test_skills.pyand creatingtests/skills/test_validate_audit_contracts.py. Do NOT re-do those changes here; assume theyare already in place and the 15 tests are failing, awaiting this SKILL.md.
Requirements
PARSE — Report Parsing
audit-arch,audit-tests, andaudit-cohesionskills.AGENT — Subagent Orchestration
git log/git blame, checkgh pr list --state=merged, and verify caller dependencies.OUTPUT — Report Generation
validated: truemarker in the Validation Status section header. This marker signals to downstream tools and recipes that the report has been through the validation process. Unvalidated reports must NOT have this marker.temp/directory adjacent to the input report.VERDICT — Classification Criteria
POST — Post-Validation Actions
contested_findings_<date>.md) adjacent to the original report, preserving original finding text alongside the reason each was contested./prepare-issuewith the validated report as the issue description, routing it asrecipe:implementation.Architecture Impact
Process Flow Diagram
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%% flowchart TB classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff; classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff; classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff; classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff; START([START: audit_report_path]) END([END: output files written]) subgraph Sequential ["★ MAIN THREAD — Sequential"] direction TB DETECT{"★ Detect Format<br/>━━━━━━━━━━<br/>audit-arch?<br/>audit-tests?<br/>audit-cohesion?"} PARSE["★ Parse Findings<br/>━━━━━━━━━━<br/>Extract ID, severity,<br/>file:line, category"] BATCH["★ Thematic Batcher<br/>━━━━━━━━━━<br/>Group by code area<br/>8–9 batches + history slot"] FORK["★ Single-Message Fork<br/>━━━━━━━━━━<br/>ALL Task calls<br/>in one message"] BARRIER["★ Barrier / Collect<br/>━━━━━━━━━━<br/>Wait for all agents<br/>Merge verdicts + context"] SYNTH["★ Synthesize<br/>━━━━━━━━━━<br/>Tally N_valid, N_exception,<br/>N_contested; upgrade findings<br/>if open PR/issue found"] GATE{"★ Interactive<br/>or Headless?<br/>━━━━━━━━━━<br/>AUTOSKILLIT_HEADLESS"} PROMPT["★ Approval Prompt<br/>━━━━━━━━━━<br/>Show verdict table<br/>Write? [Y/n]<br/>Offer prepare-issue"] WRITE["★ Write Output Files<br/>━━━━━━━━━━<br/>validated_report_{source}_{ts}.md<br/>contested_findings_{source}_{ts}.md"] end subgraph Workers ["★ PARALLEL SUBAGENT POOL — Single Message"] direction LR CV1["★ Code Validator 1<br/>━━━━━━━━━━<br/>Batch 1: pipeline/<br/>Read code + git log<br/>VALID / EXCEPTION / CONTESTED"] CV2["★ Code Validator 2<br/>━━━━━━━━━━<br/>Batch 2: execution/<br/>Read code + git log"] CVN["★ Code Validators 3–9<br/>━━━━━━━━━━<br/>One per code area<br/>8–9 total agents"] HR["★ History Researcher<br/>━━━━━━━━━━<br/>ALL findings<br/>git log 90d + gh issue list<br/>Surface in-progress fixes"] end START --> DETECT DETECT -->|"arch/tests/cohesion"| PARSE PARSE --> BATCH BATCH --> FORK FORK --> CV1 & CV2 & CVN & HR CV1 & CV2 & CVN & HR --> BARRIER BARRIER --> SYNTH SYNTH --> GATE GATE -->|"headless"| WRITE GATE -->|"interactive"| PROMPT PROMPT -->|"Y / confirmed"| WRITE WRITE --> END class START,END terminal; class DETECT,GATE stateNode; class PARSE,BATCH phase; class FORK,BARRIER detector; class CV1,CV2,CVN,HR,SYNTH,PROMPT,WRITE newComponent;Color Legend:
Operational Diagram
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%% flowchart TB classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff; classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff; classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff; classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff; classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff; classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff; subgraph Invocation ["INVOCATION PATHS"] direction TB COOK["autoskillit cook<br/>━━━━━━━━━━<br/>Interactive cook session<br/>All 61 skills via --plugin-dir"] HEADLESS["run_skill<br/>━━━━━━━━━━<br/>Automated headless session<br/>--add-dir skills_extended/"] CMD["★ /autoskillit:validate-audit<br/>━━━━━━━━━━<br/>[audit_report_path]<br/>categories: [audit]"] end subgraph Config ["CONFIGURATION"] direction TB HEADLESS_FLAG["AUTOSKILLIT_HEADLESS<br/>━━━━━━━━━━<br/>0 = interactive (default)<br/>1 = headless auto-write"] AUTO_DETECT["★ Auto-detect input<br/>━━━━━━━━━━<br/>Most recent mtime from:<br/>temp/audit-arch/<br/>temp/audit-tests/<br/>temp/audit-cohesion/"] end subgraph Discovery ["SKILL DISCOVERY"] direction TB RESOLVER["SkillResolver.list_all()<br/>━━━━━━━━━━<br/>Scans skills/ + skills_extended/<br/>Reads YAML frontmatter"] SKILLSEXT["★ skills_extended/<br/>validate-audit/SKILL.md<br/>━━━━━━━━━━<br/>SkillSource.BUNDLED_EXTENDED<br/>categories: [audit]"] WRITERECIPE["● write-recipe/SKILL.md<br/>━━━━━━━━━━<br/>validate-audit added to<br/>canonical skills list"] end subgraph Upstream ["UPSTREAM AUDIT SKILLS"] direction LR AUDITARCH["audit-arch<br/>━━━━━━━━━━<br/>Produces arch audit reports"] AUDITTESTS["audit-tests<br/>━━━━━━━━━━<br/>Produces test audit reports"] AUDITCOH["audit-cohesion<br/>━━━━━━━━━━<br/>Produces cohesion audit reports"] end subgraph Output ["OUTPUT ARTIFACTS"] direction TB VALRPT["★ validated_report_{source}_{ts}.md<br/>━━━━━━━━━━<br/>.autoskillit/temp/validate-audit/<br/>First line: validated: true"] CONTESTED["★ contested_findings_{source}_{ts}.md<br/>━━━━━━━━━━<br/>.autoskillit/temp/validate-audit/<br/>Written only when N_contested > 0"] PREPISSUE["prepare-issue<br/>━━━━━━━━━━<br/>Optional: file contested findings<br/>as GitHub issue (interactive only)"] end COOK --> CMD HEADLESS --> CMD CMD --> AUTO_DETECT CMD --> HEADLESS_FLAG AUDITARCH & AUDITTESTS & AUDITCOH -.->|"produces input for"| AUTO_DETECT AUTO_DETECT --> CMD RESOLVER --> SKILLSEXT SKILLSEXT --> CMD CMD --> VALRPT CMD --> CONTESTED CONTESTED -.->|"offer interactive"| PREPISSUE class COOK,HEADLESS,CMD cli; class HEADLESS_FLAG,AUTO_DETECT phase; class RESOLVER stateNode; class SKILLSEXT,VALRPT,CONTESTED newComponent; class WRITERECIPE handler; class AUDITARCH,AUDITTESTS,AUDITCOH detector; class PREPISSUE output;Color Legend:
Closes #429
Implementation Plan
Plan files:
/home/talon/projects/autoskillit-runs/impl-20260326-164058-424464/.autoskillit/temp/make-plan/validate_audit_skill_plan_2026-03-26_000100_part_a.md/home/talon/projects/autoskillit-runs/impl-20260326-164058-424464/.autoskillit/temp/make-plan/validate_audit_skill_plan_2026-03-26_000100_part_b.md🤖 Generated with Claude Code via AutoSkillit