
Create validate-review-decisions Skill #2224

Merged
Trecek merged 4 commits into develop from create-validate-review-decisions-skill-review-decision-valid/1962 on May 8, 2026

Conversation

Trecek (Collaborator) commented May 8, 2026

Summary

Create a new validate-review-decisions skill at skills_extended/validate-review-decisions/SKILL.md that adds mandatory intent analysis and seven evidence-gathering rules to the validation workflow for review-decisions audit reports. Update full-audit.yaml to route review-decisions validation through this new skill instead of the generic validate-audit. Add contract tests and update all test manifests.

The skill follows the architecture established by validate-test-audit (domain-specific semantic rules + intent analysis) while preserving full output compatibility with validate-audit (same directory, naming convention, validated: true sentinel, AUTOSKILLIT_AUDIT_RUN_DIR support).

Requirements

REQ-VRD-1: Skill structure

The skill MUST be placed at skills_extended/validate-review-decisions/SKILL.md with categories: [audit] frontmatter.
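A contract test for this requirement could be sketched roughly as follows. It assumes the frontmatter is a `---`-delimited block at the top of SKILL.md and that `categories` is written as an inline list; the helper `read_categories` and the `name:` field in the sample are hypothetical, not taken from the repository.

```python
import re


def read_categories(skill_text: str) -> list[str]:
    """Return the inline `categories: [...]` list from the frontmatter, or []."""
    # Assumption: frontmatter is the first block enclosed by two '---' lines.
    match = re.match(r"\A---\n(.*?)\n---\n", skill_text, flags=re.DOTALL)
    if match is None:
        return []
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "categories":
            inner = value.strip().strip("[]")
            return [item.strip() for item in inner.split(",") if item.strip()]
    return []


# Hypothetical SKILL.md content used only for illustration.
sample = "---\nname: validate-review-decisions\ncategories: [audit]\n---\n# Body\n"
assert read_categories(sample) == ["audit"]
```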

REQ-VRD-2: Intent analysis as mandatory step

Code validation subagents MUST perform intent analysis (docstring check, git provenance, test coverage, contract analysis, architectural constraint check, behavioral simulation) before assigning a verdict to ANY finding. This is not optional — every finding must have an intent analysis section in the subagent's reasoning.
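One way to picture the gate this requirement describes is the sketch below: a verdict is only allowed once the intent analysis text mentions all six techniques. The technique names come from the requirement; the function names and the plain substring matching are assumptions made for illustration, not how the skill enforces the rule.

```python
# The six techniques named in REQ-VRD-2.
REQUIRED_TECHNIQUES = (
    "docstring check",
    "git provenance",
    "test coverage",
    "contract analysis",
    "architectural constraint check",
    "behavioral simulation",
)


def missing_techniques(intent_analysis: str) -> list[str]:
    """List the required techniques not mentioned in the analysis text."""
    lowered = intent_analysis.lower()
    return [name for name in REQUIRED_TECHNIQUES if name not in lowered]


def can_assign_verdict(intent_analysis: str) -> bool:
    """A verdict is allowed only when no required technique is missing."""
    return not missing_techniques(intent_analysis)


# Hypothetical, incomplete analysis: four techniques are still missing.
partial = "Docstring check: matches. Git provenance: introduced deliberately."
assert not can_assign_verdict(partial)
```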

REQ-VRD-3: Evidence-gathering rules in subagent instructions

The skill MUST include the seven evidence-gathering rules in the code validation subagent prompt. Rules MUST be generalizable (no references to specific finding IDs).

REQ-VRD-4: Output compatibility

Output files MUST use the same directory, naming convention, and format as validate-audit.

REQ-VRD-5: Standalone invocability

The skill MUST accept an {audit_report_path} argument and auto-discover the most recent review-decisions audit report when omitted.
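The fallback behaviour can be sketched as below. The search directory, the glob pattern, and the use of modification time to decide "most recent" are all assumptions for illustration; the PR does not state how the skill discovers reports.

```python
from pathlib import Path
from typing import Optional


def resolve_audit_report(
    audit_report_path: Optional[str],
    search_dir: Path,
    pattern: str = "*review-decisions*.md",  # hypothetical naming pattern
) -> Path:
    """Use the explicit path when given, else pick the newest matching report."""
    if audit_report_path:
        return Path(audit_report_path)
    candidates = [p for p in search_dir.glob(pattern) if p.is_file()]
    if not candidates:
        raise FileNotFoundError(f"no report matching {pattern!r} in {search_dir}")
    # Assumption: "most recent" means the latest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Called with an explicit path the function returns it unchanged, so standalone invocation and recipe-driven invocation share one entry point.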

REQ-VRD-6: Full-audit recipe routing

Update full-audit.yaml to dispatch review-decisions audit validation to validate-review-decisions and all other audit types to validate-audit (or validate-test-audit for tests).
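The routing rule amounts to a lookup with a default, sketched here in Python rather than in the recipe's YAML. The skill names come from this PR; the audit-type keys (`"review-decisions"`, `"tests"`, `"arch"`) are assumed labels and may not match the identifiers used in full-audit.yaml.

```python
DEFAULT_VALIDATOR = "validate-audit"

# Audit types with a specialized validation skill (keys are assumed labels).
SPECIALIZED_VALIDATORS = {
    "review-decisions": "validate-review-decisions",
    "tests": "validate-test-audit",
}


def validator_for(audit_type: str) -> str:
    """Return the validation skill for an audit type, defaulting to the generic one."""
    return SPECIALIZED_VALIDATORS.get(audit_type, DEFAULT_VALIDATOR)


assert validator_for("review-decisions") == "validate-review-decisions"
assert validator_for("tests") == "validate-test-audit"
assert validator_for("arch") == "validate-audit"  # hypothetical other audit type
```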

REQ-VRD-7: No pack changes

The skill MUST use the existing audit pack.

REQ-VRD-8: Consider generalizing intent analysis to validate-audit (follow-up)

After validate-review-decisions and validate-test-audit both establish intent analysis patterns, evaluate merging the common intent analysis rules back into the generic validate-audit skill. This is a follow-up, not a blocker.

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260508-010014-405487/.autoskillit/temp/make-plan/validate_review_decisions_skill_plan_2026-05-08_010600.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|---|---|---|---|---|---|---|---|---|---|
| plan | claude-opus-4-6 | 1 | 2.6k | 14.9k | 1.8M | 93.1k | 105 | 79.9k | 9m 1s |
| verify | claude-opus-4-6 | 1 | 1.7k | 11.6k | 1.4M | 63.3k | 116 | 50.4k | 6m 22s |
| implement* | MiniMax-M2.7-highspeed | 1 | 3.6M | 22.5k | 1.9M | 28.7k | 180 | 16.2k | 12m 59s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 102.6k | 4.2k | 261.0k | 34.0k | 24 | 27.4k | 1m 37s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 57.5k | 1.8k | 227.0k | 28.7k | 16 | 15.1k | 49s |
| review_pr | claude-sonnet-4-6 | 3 | 414 | 101.0k | 2.9M | 96.5k | 162 | 236.7k | 23m 19s |
| resolve_review | claude-sonnet-4-6 | 1 | 273 | 21.6k | 1.9M | 76.8k | 90 | 63.8k | 7m 37s |
| Total | | | 3.7M | 177.5k | 10.4M | 96.5k | | 489.5k | 1h 1m |

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

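| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|---|---|---|---|---|
| plan | 0 | | | |
| verify | 0 | | | |
| implement | 766 | 2508.8 | 21.2 | 29.4 |
| prepare_pr | 0 | | | |
| compose_pr | 0 | | | |
| review_pr | 0 | | | |
| resolve_review | 9 | 208657.2 | 7084.0 | 2395.3 |
| Total | 775 | 13399.1 | 631.6 | 229.1 |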

Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|---|---|---|---|---|---|---|
| claude-opus-4-6 | 2 | 4.3k | 26.5k | 3.1M | 130.2k | 15m 24s |
| MiniMax-M2.7-highspeed | 3 | 3.7M | 28.5k | 2.4M | 58.8k | 15m 25s |
| claude-sonnet-4-6 | 1 | 174 | 7.5k | 874.3k | 42.9k | 3m 33s |

Trecek and others added 4 commits May 8, 2026 01:28
- Create new validate-review-decisions skill with 7 evidence-gathering rules:
  Rule 1: Docstring-as-contract recognition
  Rule 2: Deliberate-change detection
  Rule 3: Test-as-intent-signal
  Rule 4: Consumer-impact verification
  Rule 5: Architectural feasibility check
  Rule 6: Behavioral simulation
  Rule 7: Symmetry-as-design recognition

- Add mandatory intent analysis requiring 6 techniques before verdict:
  docstring check, git provenance, test coverage, contract analysis,
  architectural constraint check, behavioral simulation

- Route review-decisions validation through new specialized skill
  in full-audit.yaml (previously routed through generic validate-audit)

- Add 33 contract tests (T-VRD-001 through T-VRD-033)

- Update all test manifests and documentation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T-VRD-021: Remove trivial 'or "test" in text' disjunction — 'test' appears in
any skill document, making the second branch always pass and the first branch
dead. Assert only 'test coverage' which is the meaningful phrase being checked.

T-VRD-027: Replace three separate assertions (including trivially-true 'test' in
text) with a single compound assertion on 'test-as-intent-signal', the exact
concept name used in SKILL.md Rule 3 (line 157).
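The problem with the original T-VRD-021 assertion can be reproduced in isolation. The sketch below uses a made-up skill text and hypothetical helper names; it only illustrates why the disjunction could never fail on a document that mentions "test" anywhere.

```python
def weak_check(text: str) -> bool:
    """Original shape: the second branch subsumes the first."""
    return "test coverage" in text or "test" in text


def strict_check(text: str) -> bool:
    """Fixed shape: assert only the meaningful phrase."""
    return "test coverage" in text


# Hypothetical skill text that never mentions "test coverage".
skill_text = "Run the test suite, then validate each finding."

assert weak_check(skill_text)        # passes although the phrase is absent
assert not strict_check(skill_text)  # the strict form catches the omission
```

Because any text containing "test coverage" also contains "test", the weak form is equivalent to checking for "test" alone, which is why the first branch was dead.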
…e-* entries in catalog

Placing it on its own line between validate-test-audit and audit-claims broke
the cohesive validate-* grouping in the Audit suite section.
Trecek added this pull request to the merge queue May 8, 2026
Merged via the queue into develop with commit 24ddb46 May 8, 2026
2 checks passed
Trecek deleted the create-validate-review-decisions-skill-review-decision-valid/1962 branch May 8, 2026 09:30