Skip to content

Implementation Plan: Add Exception Whitelists to Project-Local Audit Skills + Refine P12 Composition-Boundary Rules#736

Merged
Trecek merged 10 commits intointegrationfrom
add-exception-whitelists-to-project-local-audit-skills-refin/723
Apr 11, 2026
Merged

Implementation Plan: Add Exception Whitelists to Project-Local Audit Skills + Refine P12 Composition-Boundary Rules#736
Trecek merged 10 commits intointegrationfrom
add-exception-whitelists-to-project-local-audit-skills-refin/723

Conversation

@Trecek
Copy link
Copy Markdown
Collaborator

@Trecek Trecek commented Apr 11, 2026

Summary

Project-local overrides for audit-arch, audit-cohesion, and audit-tests already exist in .claude/skills/, but recurring false positives across three architecture audit rounds and two cohesion audit rounds revealed that:

  1. audit-arch Principle 12 (Composition Root) has no concept of legitimate composition boundaries beyond server/_factory.py, generating ~5 contested findings per run.
  2. None of the project-local audit skills carry the documented exception whitelist (GE-1..GE-17, PS-1..PS-8) that the audit-validate rounds keep re-applying by hand.
  3. The project-local audit-arch lost two quality mechanisms (Pre-Flight Verification Checklist, Self-Validation Pass) that exist in the bundled version (src/autoskillit/skills_extended/audit-arch/SKILL.md) and originally cut the contest rate from 31% → 18%.
  4. The "Principle Suggestion (Optional)" workflow is unbounded — sub-principle subagents can invent and enforce new principles mid-audit.
  5. validate-audit has no project-local override, so its known-exceptions table is rebuilt from scratch each run.
  6. All three project-local audit skills hardcode temp/ instead of {{AUTOSKILLIT_TEMP}} and lack categories: [audit] frontmatter.

This plan changes only project-local skill markdown files under .claude/skills/ and adds one new test file. It does not touch the bundled skills_extended/ versions, source code, or recipes.

Architecture Impact

Process Flow Diagram

%%{init: {'flowchart': {'nodeSpacing': 45, 'rankSpacing': 55, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;

    %% TERMINALS %%
    START([START])
    END_AUDIT([AUDIT REPORT WRITTEN])
    END_VALIDATE([★ VALIDATE-AUDIT COMPLETE])

    %% ─────────────────────────────────── %%
    %% PHASE A: AUDIT SKILL EXECUTION      %%
    %% ─────────────────────────────────── %%
    subgraph PhaseA ["● Audit Skill Execution  (audit-arch | audit-cohesion | audit-tests)"]
        direction TB

        A1["● Invoke Audit Skill<br/>━━━━━━━━━━<br/>/audit-arch, /audit-cohesion,<br/>or /audit-tests"]
        A2["Code-Index Init<br/>━━━━━━━━━━<br/>set_project_path(repo_root)<br/>Fall back to Grep/Glob"]
        A3["Launch Parallel Subagents<br/>━━━━━━━━━━<br/>One per principle (P1–P14)<br/>or dimension (C1–C10)<br/>Each MUST NOT invent extra principles"]
        A4["Subagent: Examine Code<br/>━━━━━━━━━━<br/>Read file:line references<br/>git log --oneline -- {file}"]

        A5{"● Pre-Flight<br/>Verification Gate<br/>━━━━━━━━━━<br/>audit-arch only:<br/>Missing export? Read __init__.py<br/>Missing decorator? Check class def<br/>Enforcement gap? Grep tests/<br/>Duplication? Compare full body<br/>Misplaced file? Check git log"}

        A6["DROP Finding<br/>━━━━━━━━━━<br/>Verification failed —<br/>evidence does not hold"]

        A7{"● Exception<br/>Whitelist Check<br/>━━━━━━━━━━<br/>audit-arch: GE-1,2,7,9,13,16,17<br/>+ PS-1,2,4,7,8<br/>audit-cohesion: GE-10,11,12,13,15<br/>+ PS-3,5,6"}

        A8["Mark: VALID BUT<br/>EXCEPTION WARRANTED<br/>━━━━━━━━━━<br/>Attach matching GE/PS ID<br/>as exception note"]

        A9{"● Self-Validation<br/>Pass (audit-arch)<br/>━━━━━━━━━━<br/>a. HIGH/CRITICAL re-read<br/>b. Concrete-class check<br/>c. Enforcement-search<br/>d. Internal CONFIRMED/REVISED note"}

        A10["Apply Severity Gate<br/>━━━━━━━━━━<br/>CRITICAL requires data loss,<br/>security bypass, or correctness bug<br/>Downgrade if bar not met"]

        A11["★ Principle Suggestion<br/>━━━━━━━━━━<br/>ONE dedicated subagent<br/>After all P1–P14 consolidated<br/>MUST NOT generate findings<br/>MUST NOT be enforced"]

        A12["Write Report<br/>━━━━━━━━━━<br/>{{AUTOSKILLIT_TEMP}}/audit-{source}/<br/>arch_audit_{ts}.md<br/>cohesion_audit_{ts}.md<br/>test_audit_{ts}.md"]
    end

    %% ─────────────────────────────────── %%
    %% PHASE B: VALIDATE-AUDIT PIPELINE    %%
    %% ─────────────────────────────────── %%
    subgraph PhaseB ["★ validate-audit Pipeline  (new project-local skill)"]
        direction TB

        B1["★ Invoke /validate-audit<br/>━━━━━━━━━━<br/>audit_report_path arg OR<br/>most-recent file under<br/>temp/audit-arch|tests|cohesion/"]

        B2{"Detect Audit Format<br/>━━━━━━━━━━<br/>Title: 'Architectural Audit'?<br/>Title: 'Test Suite Audit'?<br/>Title: 'Cohesion Audit'?"}

        B3["Parse Findings<br/>━━━━━━━━━━<br/>Extract: ID, Text, Severity,<br/>Location (file:line), Category"]

        B4["Group into Thematic Batches<br/>━━━━━━━━━━<br/>8–9 code-area batches<br/>by top-level package touched<br/>1 slot reserved for history agent<br/>Merge smallest if >9 areas"]

        B5["★ Launch ALL Subagents<br/>(SINGLE MESSAGE)<br/>━━━━━━━━━━<br/>8–9 Code Validation Agents<br/>+ 1 History Research Agent<br/>model: sonnet, parallel"]

        B6["Code Validation Agent<br/>━━━━━━━━━━<br/>Read source file:line<br/>Check git log -10 -- {file}<br/>Assign VALID | EXCEPTION | CONTESTED"]

        B7{"★ Known Project<br/>Exceptions Check<br/>━━━━━━━━━━<br/>MUST consult before verdict<br/>PS-1: smoke_utils.py at root<br/>PS-2: hook_registry.py at root<br/>PS-3: test_check dual-tag<br/>PS-4: _llm_triage.py at root<br/>PS-5: CLAUDE.md → #713<br/>PS-6: remove_clone booleans<br/>PS-7: SkillResolver naming<br/>PS-8: L3 server/cli exclusion"}

        B8["History Research Agent<br/>━━━━━━━━━━<br/>Receives ALL findings<br/>git log last 90 days<br/>gh issue list --search keyword<br/>Returns: context text only, no files"]

        B9["Synthesize All Results<br/>━━━━━━━━━━<br/>Merge code verdict + history<br/>Open PR/issue → EXCEPTION<br/>Tally N_valid, N_exception, N_contested<br/>Collect severity adjustments"]

        B10["★ Write Output Files<br/>━━━━━━━━━━<br/>validated_report_{source}_{ts}.md<br/>first line: 'validated: true'<br/>contested_findings_{source}_{ts}.md<br/>(only when N_contested > 0)"]

        B11{"Detect Headless Mode<br/>━━━━━━━━━━<br/>echo ${AUTOSKILLIT_HEADLESS:-0}<br/>== '1'?"}

        B12["Headless: Write Immediately<br/>━━━━━━━━━━<br/>Print: Valid / Exceptions / Contested<br/>Print: Report path(s)<br/>No user prompt"]

        B13["Interactive: Show Status Table<br/>━━━━━━━━━━<br/>Display verdict counts<br/>Ask: 'Write files? [Y/n]'<br/>On Y: write files<br/>Offer prepare-issue if N_contested > 0"]

        B14["UNRECOGNIZED FORMAT<br/>━━━━━━━━━━<br/>Print error message<br/>Exit non-zero"]
    end

    %% ─────────────────────────────────── %%
    %% PHASE C: TEST COMPLIANCE GUARD      %%
    %% ─────────────────────────────────── %%
    subgraph PhaseC ["★ Test Compliance Guard  (test_project_local_audit_skill_content.py)"]
        direction LR

        C1["★ pytest parameterized<br/>audit-arch / cohesion / tests<br/>/ validate-audit<br/>━━━━━━━━━━<br/>Parse YAML frontmatter<br/>Strip fenced code blocks"]

        C2{"Check: categories=[audit]<br/>AUTOSKILLIT_TEMP placeholder<br/>No literal temp/{skill}/ path<br/>GE/PS exception IDs present<br/>Pre-Flight & Self-Validation<br/>Principle Suggestion constraints"}

        C3["PASS — Skill content<br/>compliant with issue #723"]

        C4["FAIL — pytest assertion<br/>with descriptive message<br/>pinpointing missing content"]
    end

    %% ─────────────────────────────────── %%
    %% CONNECTIONS                         %%
    %% ─────────────────────────────────── %%
    START --> A1
    A1 --> A2
    A2 --> A3
    A3 --> A4
    A4 --> A5
    A5 -->|"evidence holds"| A7
    A5 -->|"evidence fails"| A6
    A7 -->|"GE/PS match"| A8
    A7 -->|"no match"| A9
    A8 --> A9
    A9 -->|"confirmed / revised"| A10
    A10 --> A11
    A11 --> A12
    A12 --> END_AUDIT

    END_AUDIT --> B1
    B1 --> B2
    B2 -->|"format recognized"| B3
    B2 -->|"unknown format"| B14
    B3 --> B4
    B4 --> B5
    B5 --> B6
    B5 --> B8
    B6 --> B7
    B7 -->|"PS-1..PS-8 match"| B6
    B7 -->|"no exception match"| B9
    B8 --> B9
    B9 --> B10
    B10 --> B11
    B11 -->|"headless=1"| B12
    B11 -->|"headless=0"| B13
    B12 --> END_VALIDATE
    B13 --> END_VALIDATE

    END_AUDIT -.->|"CI gate: validates SKILL.md content"| C1
    C1 --> C2
    C2 -->|"all checks pass"| C3
    C2 -->|"any check fails"| C4

    %% CLASS ASSIGNMENTS %%
    class START terminal;
    class END_AUDIT,END_VALIDATE terminal;
    class A1 handler;
    class A2 phase;
    class A3,A4 handler;
    class A5,A7,A9 detector;
    class A6 detector;
    class A8 stateNode;
    class A10 phase;
    class A11 newComponent;
    class A12 output;
    class B1 newComponent;
    class B2 detector;
    class B3,B4 phase;
    class B5 newComponent;
    class B6 handler;
    class B7 detector;
    class B8 handler;
    class B9 phase;
    class B10 newComponent;
    class B11 detector;
    class B12,B13 handler;
    class B14 detector;
    class C1 newComponent;
    class C2 detector;
    class C3 output;
    class C4 detector;
Loading

Closes #723

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260411-114314-366490/.autoskillit/temp/make-plan/audit_whitelists_p12_refine_plan_2026-04-11_115420.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

Step uncached output cache_read cache_write count time
plan 199 14.9k 962.4k 97.7k 1 4m 15s
verify 132 14.9k 774.0k 57.9k 1 4m 6s
implement 498 31.0k 4.2M 82.2k 1 15m 46s
prepare_pr 76 4.2k 248.0k 35.0k 1 1m 22s
run_arch_lenses 2.4k 14.2k 449.5k 96.9k 2 5m 36s
compose_pr 59 6.7k 187.6k 23.7k 1 1m 57s
Total 3.4k 85.9k 6.8M 393.4k 33m 4s

Copy link
Copy Markdown
Collaborator Author

@Trecek Trecek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoSkillit PR Review — Verdict: changes_requested

9 actionable findings (warnings) require fixes before merging. 15 info findings for awareness. See inline comments.

Comment thread .claude/skills/validate-audit/SKILL.md Outdated

Before assigning a final verdict, code validation agents MUST consult this table. A finding that matches a Known Project Exception receives verdict **VALID BUT EXCEPTION WARRANTED** with the matching PS ID as the exception note.

| ID | Exception | Rationale |
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] cohesion: The Known Project Exceptions table mixes PS IDs from two skill domains (PS-1/2/4/7/8 from audit-arch; PS-3/5/6 from audit-cohesion) with no column indicating origin. Add an 'Origin' column (arch/cohesion) so reviewers know which skill each exception comes from.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigated — this is intentional. validate-audit is an explicit cross-domain post-processor that handles output from all three audit skills (arch, tests, cohesion). The unified PS table was deliberately designed as a single authoritative cross-skill lookup that agents consult unconditionally — adding an Origin column would imply per-skill filtering, which contradicts the design intent. The Related Skills section (lines 296-299) already documents which PS IDs originate from which skill.

Comment thread .claude/skills/audit-cohesion/SKILL.md
Comment thread .claude/skills/audit-cohesion/SKILL.md Outdated
SKILLS_DIR = REPO_ROOT / ".claude" / "skills"


def _read_skill(skill_name: str) -> str:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] defense: _read_skill lets a bare FileNotFoundError propagate without context. If the file is missing, the OS error shows only the path, not which skill_name was requested. Add a descriptive wrapper to re-raise with the skill name included.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigated — this is intentional. The path.read_text() OS error already includes the full constructed path (SKILLS_DIR / skill_name / 'SKILL.md'), which contains skill_name verbatim. A re-raise would add no diagnostic information beyond what the stdlib error surfaces.

content = _read_skill(skill_name)

# Strip fenced code blocks so we don't false-positive on examples
stripped = re.sub(r"```.*?```", "", content, flags=re.DOTALL)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] tests: The fenced-block stripping regex does not strip inline single-backtick spans. If a skill file uses a backtick-quoted path, the literal-path check could false-positive. Currently no such pattern exists, but it is a latent fragility.

# ---------------------------------------------------------------------------


def test_validate_audit_project_local_override_exists() -> None:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] tests: path.exists() assertion has a minimal error message. Consider: assert path.exists(), f'.claude/skills/validate-audit/SKILL.md not found at {path}' for clearer diagnostics.



def test_validate_audit_uses_autoskillit_temp_placeholder() -> None:
path = SKILLS_DIR / "validate-audit" / "SKILL.md"
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] slop: path.exists() and path.read_text() are repeated in test_validate_audit_uses_autoskillit_temp_placeholder (L235-L237), duplicating boilerplate from test_validate_audit_project_local_override_exists (L212-L214). Extract a shared helper or drop the redundant guard.

```

- `audit_report_path` — absolute path to an audit report produced by `audit-arch`,
`audit-tests`, or `audit-cohesion`. If omitted, use the most recent file under
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] defense: The 'most recent mtime wins' fallback does not cover the case where directories exist but contain no .md files. Add a guard: if directories exist but have no markdown files, apply the same 'no files exist' error.

- **audit-tests**: Title contains "Test Suite Audit" or findings reference issue categories
- **audit-cohesion**: Title contains "Cohesion Audit" or findings reference "Dimension C{N}"

If none of the three patterns match, print:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[info] defense: The unrecognized-format error says 'exit with a non-zero status', which is ambiguous for a skill. Add explicit instructions: 'stop immediately; do not proceed to Step 2 or later steps' to prevent a skill agent from printing the error but continuing.

Trecek and others added 2 commits April 11, 2026 15:44
…gex GE assertion, PS comment

- Wrap content.index('---', 3) in try/except with descriptive message for malformed frontmatter
- Tighten GE assertion to re.search(r'\*\*GE-N\*\*', content) to prevent substring false-matches
- Add comment explaining PS-3/5/6 are cohesion-specific, tested elsewhere

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tnotes + heading consistency

validate-audit/SKILL.md:
- Remove duplicate paragraph at line 79 (near-verbatim repeat of lines 66-67)

audit-cohesion/SKILL.md:
- Add *Source:* footnotes to GE-12 and GE-13 (matching GE-10/11/15 pattern)
- Align Project-Specific Exceptions heading to '### Project-Specific Exceptions (PS)'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Trecek Trecek enabled auto-merge April 11, 2026 22:59
@Trecek Trecek disabled auto-merge April 11, 2026 23:00
@Trecek Trecek added this pull request to the merge queue Apr 11, 2026
Merged via the queue into integration with commit ebadb71 Apr 11, 2026
2 checks passed
@Trecek Trecek deleted the add-exception-whitelists-to-project-local-audit-skills-refin/723 branch April 11, 2026 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant