Implementation Plan: Citation Integrity Gates for Research Pipeline #661
… gate routing
Adds two new Tier 3 research skills and restructures the research recipe PR+Review phase so both read-only audit gates (review_research_pr and audit_claims) complete before any resolution step begins. Replaces check_escalations with merge_escalations which consumes rerun flags from both resolution skills.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… skills
- Remove stale test_review_research_pr_routes_to_begin_archival and test_resolve_research_review_routes_to_begin_archival (superseded by deferred routing to audit_claims/route_claims_resolve)
- Add skill_contracts.yaml entries for review-research-pr, audit-claims, and resolve-claims-review to satisfy undeclared-capture-key rule
- Add pseudocode allowlist entries for audit-claims and resolve-claims-review bash placeholder tokens (mirrors review-research-pr/resolve-research-review patterns already in the allowlist)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek
left a comment
AutoSkillit PR Review — Verdict: changes_requested
25 actionable findings (critical/warning) requiring fixes before merge. See inline comments.
        assert step.on_failure == "begin_archival"
        assert step.on_context_limit == "begin_archival"

    def test_research_no_issue_number_ingredient(self, recipe) -> None:
[warning] tests: Deleted test test_research_review_step_routes_to_begin_archival_on_any_outcome asserted review_research_pr.on_failure and on_context_limit. The new routing targets ('audit_claims') are not directly tested on the review_research_pr step — only the downstream audit_claims step's own routes are tested. Add assertions for review_research_pr.on_failure == 'audit_claims' and review_research_pr.on_context_limit == 'audit_claims'.
        assert matching, "No condition with changes_requested"
        assert matching[0] == "resolve_research_review"

    def test_research_validates_cleanly(self, recipe) -> None:
[warning] tests: Deleted test test_review_research_pr_has_on_result_routing asserted changes_requested routes to resolve_research_review; deleted test test_review_research_pr_routes_to_begin_archival asserted the default on_result route. No new test verifies the on_result default route of review_research_pr now points to audit_claims.
    expected_output_patterns:
    - "verdict\\s*=\\s*(approved|changes_requested|needs_human)"
    pattern_examples:
    - "verdict = approved\n%%ORDER_UP%%"
[warning] cohesion: audit-claims contract's pattern_examples omits the 'needs_human' example. The skill emits three verdict values (approved, changes_requested, needs_human) per allowed_values, but pattern_examples only covers two. The review-pr contract serves as the established template and includes all three examples.
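A minimal sketch of how the coverage gap described in this finding could be checked mechanically. The regular expression is the one from the contract excerpt above (with YAML escaping removed); the `changes_requested` and `needs_human` example strings and the helper names `covered_verdicts` and `missing_verdicts` are hypothetical, assumed to follow the shape of the one example visible in the diff.

```python
import re

# Pattern from the audit-claims contract excerpt, YAML escaping removed.
VERDICT_PATTERN = re.compile(r"verdict\s*=\s*(approved|changes_requested|needs_human)")

# Verdict values named in the review comment (per allowed_values).
ALLOWED_VERDICTS = ("approved", "changes_requested", "needs_human")

# Hypothetical full example set: only the first string appears in the diff;
# the other two are assumed to follow the same shape.
PATTERN_EXAMPLES = [
    "verdict = approved\n%%ORDER_UP%%",
    "verdict = changes_requested\n%%ORDER_UP%%",
    "verdict = needs_human\n%%ORDER_UP%%",
]


def covered_verdicts(examples):
    """Return the set of verdict values exercised by at least one example."""
    found = set()
    for example in examples:
        match = VERDICT_PATTERN.search(example)
        if match:
            found.add(match.group(1))
    return found


def missing_verdicts(examples, allowed=ALLOWED_VERDICTS):
    """Return the allowed verdict values that no example covers, sorted."""
    return sorted(set(allowed) - covered_verdicts(examples))
```

With only the first two examples present, `missing_verdicts` reports `needs_human`, which is the omission this comment points at.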
    If `pr_url` is missing or positional args are insufficient, abort with:
    `"Usage: /autoskillit:audit-claims <worktree_path> <base_branch> <pr_url>"`

    ### Step 0.5 — Code-Index Initialization (required before any code-index tool call)
[warning] slop: Step 0.5 (L77-L94) is generic code-index boilerplate repeated verbatim from CLAUDE.md and other skills. The path-format examples add no skill-specific value. Note: CLAUDE.md mandates this step — the fix is to trim the examples to skill-relevant paths only, not to remove the step.
    - `decision_findings` — requires_decision=true (any severity)
    - `info_findings` — severity == "info" AND requires_decision=false

    ### Step 4.5: Echo Primary Obligation
[critical] slop: Step 4.5 'Echo Primary Obligation' is an AI prompt-engineering directive instructing the executor to narrate its own obligation aloud. This is scaffolding metadata embedded in the skill spec. If this is intentional (mirrors review-pr SKILL.md), it should be acknowledged; otherwise remove.
Investigated: this is intentional (false_positive_intentional_pattern). Step 4.5 mirrors review-pr/SKILL.md:302-308, which has identical Step 4.5 and Step 6.5 sections. The pattern is intentional prompt-engineering scaffolding, carried over from the source skill, that prevents the AI executor from skipping inline comment posting.
    gh pr review {pr_number} --comment --body "{summary_markdown}"
    ```

    ### Step 6.5: Post-Completion Confirmation
[warning] slop: Step 6.5 'Post-Completion Confirmation' instructs the executor to recite a confirmation string aloud after posting comments. This is an AI meta-prompt artifact. If mirroring review-pr SKILL.md intentionally, document why; otherwise remove.
    exit(1)
    # Analyze failures, revert/adjust problematic commit, retry
    ```
[warning] slop: Prose paragraph after the validation code block (lines 283-284) fully restates what the code already shows; AI-generated filler that adds no information.
…ee, scope gh repo view, mkdir before diff write, filter invalid line values in COMMENTS_JSON, track Tier 1 fallback success/failure counts
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t rev-parse, null-check PR_NUMBER, discard partial subagent JSON, clarify timeout semantics, document claims_review config, add Step 0.5 code-index init
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_verdict, audit_verdict, review_needs_rerun, claims_needs_rerun in routing step notes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…te_claims_resolve/merge_escalations; add needs_human pattern_examples to review-research-pr and audit-claims contracts
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Add an optional citation-integrity gate to the research pipeline as two new Tier 3 skills:
- `audit-claims`: parallel subagent-driven claim extraction and evidence matching that emits `verdict = {approved|changes_requested|needs_human}`. Mirrors the `review-research-pr` pattern but focused on citation integrity across four claim types: `experimental`, `external`, `methodological`, `comparative`.
- `resolve-claims-review`: ACCEPT/REJECT/DISCUSS intent validation with five fix strategies (`add_citation`, `qualify_claim`, `remove_claim`, `rerun_required` [escalated], `design_flaw` [escalated]). Mirrors the `resolve-research-review` pattern.

Recipe changes restructure the PR+Review phase so both read-only analysis gates (`review_research_pr` and `audit_claims`) complete before any resolution step begins. This prevents double re-runs when both gates request changes. A new `merge_escalations` route step (replacing `check_escalations`) triggers at most one `re_run_experiment` if either resolution skill escalates `needs_rerun = true`.
Architecture Impact
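As a rough illustration of the deferred routing described in the Summary, the three route steps can be modeled as pure functions over the captured values. This is a sketch under assumptions, not the recipe's actual implementation: it assumes captures arrive as strings, that a skipped step leaves its capture unset (`None`), and that an unset rerun flag counts as false. The step names come from the PR description; the function signatures and the `_is_true` helper are hypothetical.

```python
from typing import Optional


def _is_true(flag: Optional[str]) -> bool:
    """Assumption: a rerun flag is set only when it is the string 'true'."""
    return flag is not None and flag.strip().lower() == "true"


def route_review_resolve(review_verdict: Optional[str]) -> str:
    """Deferred gate 1: resolve review findings only on changes_requested."""
    if review_verdict == "changes_requested":
        return "resolve_research_review"
    return "route_claims_resolve"


def route_claims_resolve(audit_verdict: Optional[str]) -> str:
    """Deferred gate 2: resolve claim findings only on changes_requested."""
    if audit_verdict == "changes_requested":
        return "resolve_claims_review"
    return "merge_escalations"


def merge_escalations(review_needs_rerun: Optional[str],
                      claims_needs_rerun: Optional[str]) -> str:
    """Trigger at most one re-run if either resolution skill escalated."""
    if _is_true(review_needs_rerun) or _is_true(claims_needs_rerun):
        return "re_run_experiment"
    return "re_push_research"
```

Because both flags feed a single OR, two escalations still produce one `re_run_experiment`, which is the double re-run the restructuring is meant to avoid.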
Process Flow Diagram
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;

    %% TERMINALS %%
    START([START])
    RERUN([re_run_experiment])
    PUSH([re_push_research])

    %% PHASE 1: REVIEW GATE %%
    subgraph ReviewGate ["● Review Gate (research.yaml)"]
        ReviewPR["● review_research_pr<br/>━━━━━━━━━━<br/>review-research-pr skill<br/>skip_when_false: inputs.review_pr<br/>captures: review_verdict"]
    end

    %% PHASE 2: AUDIT CLAIMS (NEW) %%
    subgraph AuditClaimsGate ["★ Audit Claims Gate (audit-claims/SKILL.md — NEW)"]
        AuditEntry["★ audit_claims<br/>━━━━━━━━━━<br/>gated: inputs.audit_claims<br/>retries: 1<br/>captures: audit_verdict"]
        GetDiff["Get PR Diff<br/>━━━━━━━━━━<br/>gh pr diff<br/>save: diff_{pr}.txt"]
        ClaimExtract["★ Phase 1: Claim Extraction<br/>━━━━━━━━━━<br/>Parallel subagents per diff section<br/>(Executive Summary, Results,<br/>Methodology, Discussion, …)<br/>→ claims_{pr}.json"]
        EvidenceMatch["★ Phase 2: Evidence Matching<br/>━━━━━━━━━━<br/>Parallel subagents per claim type<br/>external / methodological / comparative<br/>experimental → skipped (self-evidencing)<br/>→ findings_{pr}.json"]
        AuditAggregate["Aggregate & Deduplicate<br/>━━━━━━━━━━<br/>Deduplicate by (file, line)<br/>Bucket: actionable / decision / info"]
        AuditVerdictDet{"Verdict<br/>Determination<br/>actionable_findings?<br/>decision_findings?"}
        PostInlineReview["★ Post Inline Review<br/>━━━━━━━━━━<br/>GitHub Reviews API<br/>APPROVE / COMMENT / REQUEST_CHANGES<br/>Fallback: individual → summary dump"]

        AuditEntry --> GetDiff
        GetDiff --> ClaimExtract
        ClaimExtract --> EvidenceMatch
        EvidenceMatch --> AuditAggregate
        AuditAggregate --> AuditVerdictDet
        AuditVerdictDet -->|"actionable → changes_requested"| PostInlineReview
        AuditVerdictDet -->|"decision only → needs_human"| PostInlineReview
        AuditVerdictDet -->|"none → approved"| PostInlineReview
    end

    %% PHASE 3: RESOLUTION ROUTING %%
    subgraph ResolutionRouting ["● Resolution Routing (research.yaml — MODIFIED)"]
        RouteReviewResolve{"● route_review_resolve<br/>review_verdict?"}
        ResolveResearch["resolve_research_review<br/>━━━━━━━━━━<br/>retries: 2<br/>captures: review_needs_rerun"]
        RouteClaimsResolve{"● route_claims_resolve<br/>audit_verdict?"}

        RouteReviewResolve -->|"review == changes_requested"| ResolveResearch
        RouteReviewResolve -->|"else (approved / needs_human)"| RouteClaimsResolve
        ResolveResearch -->|"any exit → review_needs_rerun captured"| RouteClaimsResolve
    end

    %% PHASE 4: RESOLVE CLAIMS REVIEW (NEW) %%
    subgraph ResolveClaimsGate ["★ Resolve Claims Review (resolve-claims-review/SKILL.md — NEW)"]
        FetchComments["Fetch Review Comments<br/>━━━━━━━━━━<br/>REST: inline comments + reviews<br/>GraphQL: thread node IDs<br/>(cursor-paginated, skip resolved)"]
        ParseDimGroup["Parse & Dimension-Group<br/>━━━━━━━━━━<br/>Extract [severity] dimension:<br/>citations / methodology /<br/>comparisons / unknown"]
        IntentVal["★ Parallel Intent Validation<br/>━━━━━━━━━━<br/>Subagents per dimension group<br/>→ ACCEPT / REJECT / DISCUSS<br/>+ fix_strategy + escalate flag"]
        FixRoute{"fix_strategy?<br/>ACCEPT / REJECT / DISCUSS"}
        ApplyEdits["Apply Edits<br/>━━━━━━━━━━<br/>add_citation / qualify_claim /<br/>remove_claim<br/>→ pre-commit + git commit"]
        EscalateFind["★ Escalate Finding<br/>━━━━━━━━━━<br/>rerun_required / design_flaw<br/>→ escalation_records_{pr}.json"]
        SkipFix["Skip (REJECT / DISCUSS)<br/>━━━━━━━━━━<br/>No code change<br/>Record skip"]
        RunValCmd["Validation Command<br/>━━━━━━━━━━<br/>claims_review.validation_command<br/>max 3 iterations | null → SKIPPED"]
        ThreadResolve["Resolve Threads + Post Replies<br/>━━━━━━━━━━<br/>GraphQL resolveReviewThread<br/>+ inline reply per comment<br/>Best-effort (never affects exit code)"]
        RerunCheck{"Any rerun_required<br/>in escalation_records?"}

        FetchComments --> ParseDimGroup
        ParseDimGroup --> IntentVal
        IntentVal --> FixRoute
        FixRoute -->|"add_citation / qualify_claim / remove_claim"| ApplyEdits
        FixRoute -->|"rerun_required / design_flaw"| EscalateFind
        FixRoute -->|"REJECT / DISCUSS"| SkipFix
        ApplyEdits --> RunValCmd
        EscalateFind --> ThreadResolve
        SkipFix --> ThreadResolve
        RunValCmd --> ThreadResolve
        ThreadResolve --> RerunCheck
    end

    %% PHASE 5: MERGE ESCALATIONS %%
    MergeEsc{"● merge_escalations<br/>review_needs_rerun<br/>OR claims_needs_rerun?"}

    %% TOP-LEVEL FLOW %%
    START --> ReviewPR
    ReviewPR -->|"any exit (success / failure / context_limit)"| AuditEntry
    PostInlineReview -->|"verdict = approved / changes_requested / needs_human"| RouteReviewResolve
    RouteClaimsResolve -->|"audit == changes_requested"| FetchComments
    RouteClaimsResolve -->|"else (approved / needs_human)"| MergeEsc
    RerunCheck -->|"needs_rerun = true / false"| MergeEsc
    MergeEsc -->|"true"| RERUN
    MergeEsc -->|"false"| PUSH

    %% CLASS ASSIGNMENTS %%
    class START,RERUN,PUSH terminal;
    class ReviewPR,ResolveResearch handler;
    class AuditEntry,ClaimExtract,EvidenceMatch,PostInlineReview,IntentVal,EscalateFind newComponent;
    class GetDiff,AuditAggregate,FetchComments,ParseDimGroup,ApplyEdits,ThreadResolve output;
    class AuditVerdictDet,RouteReviewResolve,RouteClaimsResolve,FixRoute,RerunCheck,MergeEsc stateNode;
    class RunValCmd detector;
    class SkipFix gap;
Scenarios Diagram
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 55, 'curve': 'basis'}}}%%
flowchart LR
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    %% ─────────────────────────────────────── %%
    subgraph S1 ["SCENARIO 1: Citation Audit Passes (Happy Path)"]
        direction LR
        S1_PR["review_research_pr<br/>━━━━━━━━━━<br/>reads: worktree diff<br/>writes: review_verdict"]
        S1_AUDIT["★ audit_claims<br/>━━━━━━━━━━<br/>reads: PR diff via gh<br/>writes: claims JSON, findings JSON<br/>emits: verdict=approved"]
        S1_ROUTE_REV["route_review_resolve<br/>━━━━━━━━━━<br/>reads: review_verdict<br/>routes: approved → route_claims_resolve"]
        S1_ROUTE_CL["route_claims_resolve<br/>━━━━━━━━━━<br/>reads: audit_verdict=approved<br/>routes: → merge_escalations"]
        S1_MERGE["merge_escalations<br/>━━━━━━━━━━<br/>reads: needs_rerun flags<br/>routes: → re_push_research"]
        S1_PUSH["re_push_research<br/>━━━━━━━━━━<br/>writes: origin HEAD push<br/>routes: → begin_archival"]
    end
    S1_PR -->|"verdict captured"| S1_AUDIT
    S1_AUDIT -->|"all paths → route"| S1_ROUTE_REV
    S1_ROUTE_REV -->|"approved/skipped"| S1_ROUTE_CL
    S1_ROUTE_CL -->|"approved → skip resolve"| S1_MERGE
    S1_MERGE -->|"no rerun needed"| S1_PUSH

    %% ─────────────────────────────────────── %%
    subgraph S2 ["SCENARIO 2: Citation Fixes Required (changes_requested)"]
        direction LR
        S2_AUDIT["★ audit_claims<br/>━━━━━━━━━━<br/>reads: PR diff sections<br/>Phase 1: extract claims by section<br/>Phase 2: match against evidence<br/>emits: verdict=changes_requested"]
        S2_ROUTE["route_claims_resolve<br/>━━━━━━━━━━<br/>reads: audit_verdict=changes_requested<br/>routes: → resolve_claims_review"]
        S2_RESOLVE["★ resolve_claims_review<br/>━━━━━━━━━━<br/>reads: inline PR comments<br/>groups by dimension: citations/methodology/comparisons<br/>launches: parallel intent-validation subagents<br/>writes: classification_map, fixes"]
        S2_INTENT["Intent Validation Subagents<br/>━━━━━━━━━━<br/>reads: actual file content ±30 lines<br/>classifies: ACCEPT / REJECT / DISCUSS<br/>assigns: fix_strategy per finding"]
        S2_FIX["Apply Fixes<br/>━━━━━━━━━━<br/>add_citation / qualify_claim / remove_claim<br/>writes: git commits in worktree<br/>resolves: review threads via GraphQL"]
        S2_SUMMARY["★ resolve-claims-review summary<br/>━━━━━━━━━━<br/>writes: report_{pr}_{ts}.md<br/>emits: needs_rerun=false"]
    end
    S2_AUDIT -->|"changes_requested"| S2_ROUTE
    S2_ROUTE -->|"routes to resolve"| S2_RESOLVE
    S2_RESOLVE -->|"per-dimension subagents"| S2_INTENT
    S2_INTENT -->|"classification_map"| S2_FIX
    S2_FIX -->|"commits applied"| S2_SUMMARY

    %% ─────────────────────────────────────── %%
    subgraph S3 ["SCENARIO 3: Rerun Escalation (rerun_required fix_strategy)"]
        direction LR
        S3_INTENT["Intent Validation Subagents<br/>━━━━━━━━━━<br/>reads: claim at flagged line<br/>classifies: ACCEPT<br/>fix_strategy=rerun_required"]
        S3_ESCALATE["★ resolve_claims_review<br/>━━━━━━━━━━<br/>escalate=true for rerun_required<br/>writes: escalation_records_{pr}.json<br/>emits: needs_rerun=true"]
        S3_MERGE["merge_escalations<br/>━━━━━━━━━━<br/>reads: claims_needs_rerun=true<br/>routes: → re_run_experiment"]
        S3_RERUN["re_run_experiment<br/>━━━━━━━━━━<br/>runs: run-experiment --adjust<br/>writes: updated results_path"]
        S3_REPORT["re_write_report<br/>━━━━━━━━━━<br/>reads: updated experiment_results<br/>writes: new research report"]
        S3_PUSH["re_push_research<br/>━━━━━━━━━━<br/>pushes: updated branch<br/>routes: → begin_archival"]
    end
    S3_INTENT -->|"rerun_required found"| S3_ESCALATE
    S3_ESCALATE -->|"needs_rerun=true"| S3_MERGE
    S3_MERGE -->|"rerun triggered"| S3_RERUN
    S3_RERUN -->|"new results"| S3_REPORT
    S3_REPORT -->|"updated report"| S3_PUSH

    %% ─────────────────────────────────────── %%
    subgraph S4 ["SCENARIO 4: Dual Gate Sequencing (both gates before resolution)"]
        direction LR
        S4_REV["● review_research_pr<br/>━━━━━━━━━━<br/>skip_when_false: inputs.review_pr<br/>on_context_limit/failure: audit_claims<br/>on_result: → audit_claims"]
        S4_AUDIT["★ audit_claims<br/>━━━━━━━━━━<br/>skip_when_false: inputs.audit_claims<br/>on_exhausted/failure: route_review_resolve<br/>on_result: → route_review_resolve"]
        S4_GATE1["route_review_resolve<br/>━━━━━━━━━━<br/>reads: review_verdict<br/>deferred gate 1: review changes?"]
        S4_GATE2["route_claims_resolve<br/>━━━━━━━━━━<br/>reads: audit_verdict<br/>deferred gate 2: claims changes?"]
        S4_RESOLVE_REV["resolve_research_review<br/>━━━━━━━━━━<br/>conditional: review_verdict=changes_requested<br/>on_success: → route_claims_resolve"]
    end
    S4_REV -->|"all paths"| S4_AUDIT
    S4_AUDIT -->|"all paths"| S4_GATE1
    S4_GATE1 -->|"changes_requested"| S4_RESOLVE_REV
    S4_GATE1 -->|"other verdicts"| S4_GATE2
    S4_RESOLVE_REV -->|"after resolve"| S4_GATE2

    %% ─────────────────────────────────────── %%
    subgraph S5 ["SCENARIO 5: Graceful Degradation (audit disabled or gh unavailable)"]
        direction LR
        S5_INPUT["● research.yaml ingredient<br/>━━━━━━━━━━<br/>audit_claims: false (default)<br/>reads: inputs.audit_claims"]
        S5_SKIP["★ audit_claims<br/>━━━━━━━━━━<br/>skip_when_false: inputs.audit_claims<br/>OR gh unavailable:<br/>verdict=approved, exit 0"]
        S5_ROUTE["route_claims_resolve<br/>━━━━━━━━━━<br/>reads: audit_verdict=approved/empty<br/>routes: → merge_escalations (bypass)"]
        S5_END["merge_escalations → re_push<br/>━━━━━━━━━━<br/>no claims_needs_rerun flag set<br/>proceeds directly to archival"]
    end
    S5_INPUT -->|"false → skip"| S5_SKIP
    S5_SKIP -->|"approved emitted"| S5_ROUTE
    S5_ROUTE -->|"no resolve step"| S5_END

    %% CLASS ASSIGNMENTS %%
    class S1_PR,S4_REV handler;
    class S1_AUDIT,S2_AUDIT,S5_SKIP newComponent;
    class S1_ROUTE_REV,S1_ROUTE_CL,S1_MERGE,S4_GATE1,S4_GATE2,S5_ROUTE phase;
    class S1_PUSH,S3_PUSH output;
    class S2_RESOLVE,S3_ESCALATE newComponent;
    class S2_ROUTE,S3_MERGE phase;
    class S2_INTENT,S3_INTENT handler;
    class S2_FIX,S3_RERUN,S3_REPORT handler;
    class S2_SUMMARY output;
    class S3_INTENT detector;
    class S4_AUDIT newComponent;
    class S4_RESOLVE_REV handler;
    class S5_INPUT stateNode;
    class S5_END output;
State Lifecycle Diagram
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    START([RESEARCH RECIPE STATE])

    %% ─── INIT_PRESERVE: Recipe Context Fields ─── %%
    subgraph InitPreserve ["INIT_PRESERVE — Recipe Context (● modified)"]
        direction LR
        WT["● worktree_path<br/>━━━━━━━━━━<br/>Set at create_worktree<br/>Never re-derived"]
        PRURL["● pr_url<br/>━━━━━━━━━━<br/>Set at compose_research_pr<br/>Passed to audit-claims"]
        BB["● base_branch<br/>━━━━━━━━━━<br/>From ingredient<br/>Never changes"]
    end

    %% ─── DEFERRED GATE PATTERN ─── %%
    subgraph DeferredGates ["DEFERRED GATE ROUTING (● research.yaml) — both gates complete before routing"]
        direction TB
        RVW_GATE{"skip_when_false<br/>inputs.review_pr<br/>━━━━━━━━━━<br/>INIT_ONLY per recipe run"}
        RVW_RESULT["● review_verdict<br/>━━━━━━━━━━<br/>MUTABLE<br/>approved | changes_requested | needs_human"]
        AUD_GATE{"★ skip_when_false<br/>inputs.audit_claims<br/>━━━━━━━━━━<br/>INIT_ONLY per recipe run"}
        AUD_RESULT["★ audit_verdict<br/>━━━━━━━━━━<br/>MUTABLE<br/>approved | changes_requested | needs_human"]
        ROUTE_RVW["● route_review_resolve<br/>━━━━━━━━━━<br/>DERIVED routing<br/>reads: review_verdict"]
        ROUTE_CLM["★ route_claims_resolve<br/>━━━━━━━━━━<br/>DERIVED routing<br/>reads: audit_verdict"]
    end

    %% ─── AUDIT-CLAIMS STATE MACHINE ─── %%
    subgraph AuditClaimsBlock ["★ audit-claims Skill — Claim Extraction State"]
        direction TB
        AC_PH1["★ Phase 1: Extract Claims<br/>━━━━━━━━━━<br/>Parallel subagents by section<br/>INIT_ONLY: pr_url, diff_{pr}.txt"]
        AC_CLAIMS["★ claims_{pr}.json<br/>━━━━━━━━━━<br/>APPEND_ONLY<br/>Built from Phase 1 subagents<br/>{file, line, claim_text, claim_type}"]
        AC_PH2["★ Phase 2: Evidence Match<br/>━━━━━━━━━━<br/>Parallel subagents by claim_type<br/>experimental → always skip"]
        AC_FINDINGS["★ findings_{pr}.json<br/>━━━━━━━━━━<br/>APPEND_ONLY (then deduped)<br/>Dedup: keep highest severity<br/>per (file, line) pair"]
        AC_VERDICT{"★ Verdict Derivation<br/>━━━━━━━━━━<br/>DERIVED (never stored mutable)<br/>actionable? → changes_requested<br/>decision_only? → needs_human<br/>none? → approved"}
        AC_OUT["★ verdict = {value}<br/>━━━━━━━━━━<br/>Structured output token<br/>Captured as audit_verdict"]
    end

    %% ─── RESOLVE-CLAIMS-REVIEW STATE MACHINE ─── %%
    subgraph ResolveBlock ["★ resolve-claims-review Skill — Fix Application State"]
        direction TB
        RC_CFG["★ claims_review config<br/>━━━━━━━━━━<br/>INIT_ONLY for session<br/>validation_command (null=skip)<br/>validation_timeout (default: 120)"]
        RC_THREADS["★ Thread + Comment Fetch<br/>━━━━━━━━━━<br/>inline_comments_{pr}.json<br/>threads_{pr}.json<br/>dimension_groups_{pr}.json"]
        RC_INTENT["★ Intent Validation Gate<br/>━━━━━━━━━━<br/>BEFORE any code changes<br/>Parallel by dimension group<br/>Classifies: ACCEPT | REJECT | DISCUSS"]
        RC_CLASSMAP["★ classification_map_{pr}.json<br/>━━━━━━━━━━<br/>MUTABLE (built from validation)<br/>Keys: comment_id → verdict_entry<br/>fix_strategy per ACCEPT"]
        RC_ADDR["★ addressed_thread_ids<br/>━━━━━━━━━━<br/>APPEND_ONLY (in-memory)<br/>Grows per ACCEPT fix applied<br/>Consumed at Step 6: thread resolve"]
        RC_ESC["★ escalation_records<br/>━━━━━━━━━━<br/>APPEND_ONLY (in-memory)<br/>strategy: rerun_required | design_flaw<br/>Consumed at output determination"]
        RC_VGATE{"★ Validation Command Gate<br/>━━━━━━━━━━<br/>null → SKIP entirely<br/>else: max 3 retry iterations<br/>FAIL after 3 → exit non-zero"}
        RC_DERIVE{"★ needs_rerun Derivation<br/>━━━━━━━━━━<br/>DERIVED from escalation_records<br/>any strategy==rerun_required → true<br/>all design_flaw or none → false"}
        RC_OUT["★ needs_rerun = {true|false}<br/>━━━━━━━━━━<br/>Structured output token<br/>Captured as claims_needs_rerun"]
    end

    %% ─── MERGE ESCALATIONS ─── %%
    subgraph MergeBlock ["● merge_escalations — Combined Rerun Gate (● modified)"]
        direction TB
        MERGE_GATE{"● review_needs_rerun == true<br/>OR claims_needs_rerun == true<br/>━━━━━━━━━━<br/>DERIVED routing (reads 2 MUTABLE fields)"}
        RERUN["re_run_experiment"]
        PUSH["re_push_research"]
    end

    %% ─── RESUME DETECTION ─── %%
    subgraph ResumeBlock ["RESUME DETECTION STRATEGY"]
        direction LR
        R1["Tier 1: worktree_path present<br/>━━━━━━━━━━<br/>Resume from execution phase"]
        R2["Tier 2: pr_url present<br/>━━━━━━━━━━<br/>Resume from PR review phase"]
        R3["Tier 3: audit_verdict present<br/>━━━━━━━━━━<br/>Skip audit → route_claims_resolve"]
        R4["Tier 4: claims_needs_rerun present<br/>━━━━━━━━━━<br/>Skip resolution → merge_escalations"]
    end

    %% ─── CONNECTIONS ─── %%
    START --> InitPreserve
    WT --> RVW_GATE
    BB --> RVW_GATE
    PRURL --> AUD_GATE
    RVW_GATE -->|"skip/run"| RVW_RESULT
    RVW_RESULT --> ROUTE_RVW
    AUD_GATE -->|"skip (audit_claims=false)"| AUD_RESULT
    AUD_GATE -->|"run (audit_claims=true)"| AC_PH1
    AC_PH1 --> AC_CLAIMS --> AC_PH2 --> AC_FINDINGS --> AC_VERDICT --> AC_OUT
    AC_OUT --> AUD_RESULT
    AUD_RESULT --> ROUTE_CLM
    ROUTE_RVW -->|"changes_requested"| RC_CFG
    ROUTE_RVW -->|"other → skip resolve_research_review"| ROUTE_CLM
    ROUTE_CLM -->|"changes_requested"| RC_CFG
    ROUTE_CLM -->|"approved | needs_human | skipped"| MERGE_GATE
    RC_CFG --> RC_THREADS --> RC_INTENT --> RC_CLASSMAP
    RC_CLASSMAP -->|"ACCEPT: add_citation, qualify_claim, remove_claim"| RC_ADDR
    RC_CLASSMAP -->|"ACCEPT: rerun_required, design_flaw → ESCALATE"| RC_ESC
    RC_ADDR --> RC_VGATE
    RC_ESC --> RC_VGATE
    RC_VGATE -->|"SKIP or PASS"| RC_DERIVE
    RC_VGATE -->|"FAIL (3 retries exhausted)"| FAIL_EXIT([exit non-zero])
    RC_DERIVE --> RC_OUT --> MERGE_GATE
    MERGE_GATE -->|"either true"| RERUN
    MERGE_GATE -->|"both false"| PUSH
    R1 --> R2 --> R3 --> R4

    %% ─── CLASS ASSIGNMENTS ─── %%
    class START terminal;
    class WT,PRURL,BB stateNode;
    class RVW_GATE,AUD_GATE detector;
    class RVW_RESULT,AUD_RESULT stateNode;
    class ROUTE_RVW phase;
    class ROUTE_CLM newComponent;
    class AC_PH1,AC_PH2 newComponent;
    class AC_CLAIMS,AC_FINDINGS output;
    class AC_VERDICT,AC_OUT newComponent;
    class RC_CFG stateNode;
    class RC_THREADS handler;
    class RC_INTENT newComponent;
    class RC_CLASSMAP stateNode;
    class RC_ADDR,RC_ESC handler;
    class RC_VGATE newComponent;
    class RC_DERIVE,RC_OUT newComponent;
    class MERGE_GATE phase;
    class RERUN,PUSH output;
    class R1,R2,R3,R4 cli;
    class FAIL_EXIT terminal;
Closes #657
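The derivation rules named in the State Lifecycle Diagram (dedup per `(file, line)`, verdict derivation, `needs_rerun` derivation) can be sketched roughly as follows. This is an illustration under assumptions, not the skills' actual code: findings and escalation records are modeled as plain dicts, the severity ordering `info < warning < critical` is assumed, and "actionable" is assumed to mean a non-info finding with `requires_decision` false. All function names are hypothetical.

```python
# Assumed severity ordering; the PR page does not state it explicitly.
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


def dedupe_findings(findings):
    """Keep the highest-severity finding for each (file, line) pair."""
    best = {}
    for finding in findings:
        key = (finding["file"], finding["line"])
        current = best.get(key)
        if (current is None
                or SEVERITY_RANK[finding["severity"]] > SEVERITY_RANK[current["severity"]]):
            best[key] = finding
    return list(best.values())


def derive_verdict(findings):
    """actionable -> changes_requested; decision only -> needs_human; else approved."""
    deduped = dedupe_findings(findings)
    decision = [f for f in deduped if f["requires_decision"]]
    # Assumption: actionable means non-info severity with no decision required.
    actionable = [f for f in deduped
                  if not f["requires_decision"] and f["severity"] != "info"]
    if actionable:
        return "changes_requested"
    if decision:
        return "needs_human"
    return "approved"


def derive_needs_rerun(escalation_records):
    """True when any escalation uses rerun_required; design_flaw alone is False."""
    return any(r["strategy"] == "rerun_required" for r in escalation_records)
```

Both outputs are recomputed from the append-only records instead of being stored as mutable state, matching the DERIVED labels in the diagram.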
Implementation Plan
Plan file: /home/talon/projects/autoskillit-runs/impl-20260407-144555-361346/.autoskillit/temp/make-plan/audit_claims_plan_2026-04-07_120100.md
🤖 Generated with Claude Code via AutoSkillit
Token Usage Summary