Rectify: Sub-Skill Refusal Handling and Diagram Styling Contract Enforcement #640
Merged
Trecek merged 11 commits into integration on Apr 6, 2026
… palette contract
Adds three tests that expose two gaps: (1) no refusal handler documented in open-research-pr/open-pr SKILL.md for when the Skill tool gates exp-lens/arch-lens sub-skills, and (2) no canonical classDef palette embedded in open-research-pr. All three tests fail before SKILL.md changes are applied (red phase).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… open-research-pr and open-pr
When exp-lens/arch-lens sub-skills are gated (disable-model-invocation), the Skill tool refuses the invocation and the model previously improvised freehand, producing gray unstyled diagrams with invented class names.
- open-research-pr Step 4: document the refusal handler (do NOT write freehand; discard the lens iteration silently; emit a diagnostic file if all lenses refused)
- open-research-pr: add canonical 9-class classDef palette as reference to prevent freehand improvisation from inventing non-canonical styling
- open-pr Step 5: apply identical refusal handler for arch-lens invocations
Closes all three contract test failures introduced in the previous commit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
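The "invented class names" failure described in this commit can be detected mechanically. The following is a minimal sketch, not the project's implementation: the nine class names are copied from the `classDef` lines in this PR's diagrams, while the function name and both regular expressions are assumptions made for illustration.

```python
import re

# Canonical class names, copied from the classDef lines in this PR's diagrams.
CANONICAL_CLASSES = frozenset({
    "cli", "stateNode", "handler", "phase", "newComponent",
    "output", "detector", "gap", "terminal",
})

# Assumed pattern: "classDef <name> ..." style definitions.
CLASSDEF_RE = re.compile(r"\bclassDef\s+([A-Za-z_][\w-]*)")
# Assumed pattern: "class NODE_A,NODE_B <name>;" assignments.
CLASS_ASSIGN_RE = re.compile(r"\bclass\s+[\w,]+\s+([A-Za-z_][\w-]*)\s*;?")


def non_canonical_classes(mermaid_source: str) -> set[str]:
    """Return class names that are defined or assigned but not canonical."""
    defined = set(CLASSDEF_RE.findall(mermaid_source))
    assigned = set(CLASS_ASSIGN_RE.findall(mermaid_source))
    return (defined | assigned) - CANONICAL_CLASSES
```

An empty result means every class the diagram defines or applies comes from the palette. The patterns are line-oriented heuristics, so a node label that happens to contain the word `class` could produce a false positive.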
… handling and mermaid palette compliance
Adds two self-updating contract test files that enforce architectural ratchets:
- test_sub_skill_refusal_contracts.py: parametrized over all SKILL.md files that invoke sub-skills via the Skill tool; fails CI for any qualifying skill that lacks refusal handler documentation. Fixes 38 skills (all arch-lens, all exp-lens, make-experiment-diag, make-plan, migrate-recipes, open-integration-pr, rectify, setup-project, write-recipe).
- test_mermaid_palette_contracts.py: parametrized over all diagram-generating SKILL.md files; requires either ≥7 canonical classDef names or a mermaid skill delegation phrase. Fixes 4 skills: make-arch-diag, open-integration-pr, open-pr, verify-diag.
Both tests scan the filesystem at collection time, so no manual skill enumeration is required. New sub-skill-calling or diagram-generating skills that omit the required language fail CI immediately.
Closes the latent class of bugs exposed by issue #637.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
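A self-updating ratchet of the kind this commit describes could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `test_sub_skill_refusal_contracts.py`: the three regular expressions and all function names are hypothetical, chosen to reflect the `/autoskillit:{lens-slug}` invocation form and the refusal and action signals quoted elsewhere on this page.

```python
import re
from pathlib import Path

# Assumed heuristic: a skill "invokes a sub-skill" if it names /autoskillit:<slug>.
INVOKES_SUB_SKILL = re.compile(r"/autoskillit:[\w{}-]+")
# Assumed refusal signal: the skill mentions how a gated sub-skill is reported.
REFUSAL_SIGNAL = re.compile(r"disable-model-invocation|cannot be used")
# Assumed action signal: the skill says what to do instead of improvising.
ACTION_SIGNAL = re.compile(r"proceed without|do NOT write freehand", re.IGNORECASE)


def has_refusal_handler(text: str) -> bool:
    """True when both a refusal signal and an action signal are documented."""
    return bool(REFUSAL_SIGNAL.search(text) and ACTION_SIGNAL.search(text))


def discover_sub_skill_callers(root: Path) -> list[Path]:
    """Scan the filesystem for SKILL.md files that invoke a sub-skill."""
    return sorted(
        path
        for path in root.rglob("SKILL.md")
        if INVOKES_SUB_SKILL.search(path.read_text(encoding="utf-8"))
    )


def find_violations(root: Path) -> list[Path]:
    """Return sub-skill callers that document no refusal handler."""
    return [
        path
        for path in discover_sub_skill_callers(root)
        if not has_refusal_handler(path.read_text(encoding="utf-8"))
    ]
```

In a pytest suite, the result of `discover_sub_skill_callers(...)` could be passed to `pytest.mark.parametrize` so that each SKILL.md becomes its own test case at collection time. That is what lets a newly added skill fail CI without anyone editing a list.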
…write-exit false positive
The phrase 'skip the diagram step' matched the semantic rule's
`\bskip\b.{0,30}\bstep\b` pattern, causing two test failures.
Rewording to 'proceed without the architectural diagram' preserves
both the refusal signal (disable-model-invocation) and action signal
(proceed without) required by the refusal contracts test.
Also applies ruff formatting fixes to the two new contract test files.
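The false positive is easy to reproduce in isolation. This short check uses the pattern exactly as quoted in the commit message; whether the real semantic rule applies extra flags (such as case-insensitivity) is not stated here, so none are assumed.

```python
import re

# Pattern quoted in the commit message above, compiled without extra flags.
SKIP_STEP_RE = re.compile(r"\bskip\b.{0,30}\bstep\b")

original = "skip the diagram step"
reworded = "proceed without the architectural diagram"

print(bool(SKIP_STEP_RE.search(original)))  # True: "skip" ... "step" within 30 chars
print(bool(SKIP_STEP_RE.search(reworded)))  # False: neither word is present
```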
Trecek (Collaborator, Author) commented on Apr 6, 2026:
AutoSkillit PR Review — Verdict: changes_requested
12 blocking issues found (see inline comments). Running on own PR — using COMMENT instead of REQUEST_CHANGES.
Trecek (Collaborator, Author) commented on Apr 6, 2026:
AutoSkillit review found 12 blocking issues. See inline comments. Note: REQUEST_CHANGES was downgraded to COMMENT due to own-PR restriction — verdict is changes_requested.
…ding in sub-skill refusal contracts
…o match tighter action pattern
…ette check to classDef context, add utf-8 encoding
…icate palette test, trim docstring narrative
…in open-pr contracts
…— belongs at call site
…oad mandate in open-research-pr
Summary
When `open-research-pr` is invoked headlessly, the exp-lens sub-skills it depends on remain gated (`disable-model-invocation: true`) because `activate_tier2()` un-gates exactly one skill per session by design. The SKILL.md provides no instruction for when the Skill tool refuses a sub-skill invocation, so the model improvises freehand, inventing non-canonical class names and never applying them to nodes. All nodes render gray.

The direct bugs are: (1) no refusal handler in `open-research-pr` Step 4, and (2) no canonical palette embedded as a reference fallback. The arch-lens equivalent skill `open-pr` has the identical gap in Step 5. The root architectural weakness is that all contract tests use vocabulary-only assertions: they check whether a word appears, not whether a mechanism exists.

Part A adds the two focused failing tests for `open-research-pr` and `open-pr`, then fixes both SKILL.md files. Part B (separate task) adds the cross-skill parametrized ratchet tests that enforce the same contract across all sub-skill-calling skills in the codebase.

Architecture Impact
Error/Resilience Diagram
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    subgraph Callers ["● MODIFIED SKILL CALLERS"]
        direction LR
        ORP["● open-research-pr/SKILL.md<br/>━━━━━━━━━━<br/>Step 4: exp-lens invocation<br/>+refusal handler added"]
        OP["● open-pr/SKILL.md<br/>━━━━━━━━━━<br/>Step 5: arch-lens invocation<br/>+refusal handler added"]
    end

    subgraph LensGate ["LENS INVOCATION GATE"]
        SKILL_CALL["Skill tool call<br/>━━━━━━━━━━<br/>/autoskillit:{lens-slug}"]
        GATE{"Error contains<br/>'disable-model-invocation'<br/>or 'cannot be used'?"}
    end

    subgraph RefusalPath ["● REFUSAL HANDLER (added to both skills)"]
        direction TB
        DISCARD["Discard lens silently<br/>━━━━━━━━━━<br/>do NOT write freehand<br/>Continue to next lens"]
        ALL_REFUSED{"All lens invocations<br/>refused?"}
        DIAG["● lens_unavailable_{ts}.txt<br/>━━━━━━━━━━<br/>Diagnostic artifact<br/>open-research-pr only"]
        EMPTY["validated_diagrams = []<br/>━━━━━━━━━━<br/>Diagram section omitted<br/>from PR body"]
    end

    subgraph HappyPath ["UNAFFECTED SUCCESS PATH"]
        direction TB
        EXECUTE["Lens executes<br/>━━━━━━━━━━<br/>Canonical palette applied"]
        VALIDATE["Marker validation<br/>━━━━━━━━━━<br/>★ / ● symbols checked"]
        VALIDATED["validated_diagrams<br/>━━━━━━━━━━<br/>Styled mermaid blocks added"]
    end

    subgraph ContractRatchet ["★ NEW CI CONTRACT ENFORCEMENT"]
        direction TB
        REFUSAL_RATCHET["★ test_sub_skill_refusal_contracts.py<br/>━━━━━━━━━━<br/>Auto-discovers ALL sub-skill callers<br/>Enforces refusal handler at CI time"]
        PALETTE_RATCHET["★ test_mermaid_palette_contracts.py<br/>━━━━━━━━━━<br/>Auto-discovers diagram generators<br/>Enforces palette or mermaid-load"]
        ORP_TESTS["● test_open_research_pr_contracts.py<br/>━━━━━━━━━━<br/>+test_handles_skill_tool_refusal_for_exp_lens<br/>+test_embeds_canonical_classdef_palette"]
        OP_TESTS["● test_open_pr_contracts.py<br/>━━━━━━━━━━<br/>+test_handles_skill_tool_refusal_for_arch_lens"]
    end

    T_PR_STYLED([PR: styled diagram section])
    T_PR_CLEAN([PR: section omitted — clean])
    T_CI_FAIL([CI FAIL: contract violation])
    T_CI_PASS([CI PASS: contracts satisfied])

    ORP --> SKILL_CALL
    OP --> SKILL_CALL
    SKILL_CALL --> GATE
    GATE -->|"no — ungated"| EXECUTE
    GATE -->|"yes — gated/refused"| DISCARD
    DISCARD --> ALL_REFUSED
    ALL_REFUSED -->|"no — more lenses"| SKILL_CALL
    ALL_REFUSED -->|"yes — all refused"| DIAG
    DIAG --> EMPTY
    EMPTY --> T_PR_CLEAN
    EXECUTE --> VALIDATE
    VALIDATE -->|"contains ★ or ●"| VALIDATED
    VALIDATED --> T_PR_STYLED
    ORP_TESTS --> REFUSAL_RATCHET
    OP_TESTS --> REFUSAL_RATCHET
    REFUSAL_RATCHET -->|"no handler found"| T_CI_FAIL
    PALETTE_RATCHET -->|"no palette found"| T_CI_FAIL
    REFUSAL_RATCHET -->|"all callers compliant"| T_CI_PASS
    PALETTE_RATCHET -->|"all generators compliant"| T_CI_PASS

    class ORP,OP handler;
    class SKILL_CALL phase;
    class GATE detector;
    class DISCARD,ALL_REFUSED stateNode;
    class DIAG gap;
    class EMPTY output;
    class EXECUTE,VALIDATE phase;
    class VALIDATED output;
    class REFUSAL_RATCHET,PALETTE_RATCHET newComponent;
    class ORP_TESTS,OP_TESTS handler;
    class T_PR_STYLED,T_PR_CLEAN,T_CI_FAIL,T_CI_PASS terminal;

Process/Execution Flow Diagram
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    START([run_skill open-research-pr / open-pr])

    subgraph LensLoop ["● LENS ITERATION LOOP (Step 4 / Step 5)"]
        direction TB
        ITER["For each lens_slug<br/>━━━━━━━━━━<br/>Iterate exp-lens / arch-lens list"]
        INVOKE["Skill tool call<br/>━━━━━━━━━━<br/>/autoskillit:{lens-slug}"]
        GATED{"Skill tool response<br/>contains 'disable-model-invocation'<br/>or 'cannot be used'?"}
    end

    subgraph RefusalRouting ["● REFUSAL ROUTING (added to both skills)"]
        direction TB
        SKIP["Discard silently<br/>━━━━━━━━━━<br/>do NOT write freehand"]
        MORE{"More lens slugs<br/>remaining?"}
        WRITE_DIAG["Write diagnostic<br/>━━━━━━━━━━<br/>lens_unavailable_{ts}.txt<br/>(open-research-pr only)"]
        SET_EMPTY["validated_diagrams = []<br/>━━━━━━━━━━<br/>Propagate to composition step"]
    end

    subgraph SuccessRouting ["UNMODIFIED SUCCESS ROUTING"]
        direction TB
        EXECUTE["Lens executes<br/>━━━━━━━━━━<br/>Diagram generated<br/>Canonical palette applied"]
        CHECK_MARKERS{"Block contains<br/>★ or ● markers?"}
        APPEND["Append to<br/>validated_diagrams<br/>━━━━━━━━━━<br/>Styled block collected"]
    end

    subgraph Composition ["PR BODY COMPOSITION (Step 6)"]
        direction TB
        DIAGS_EMPTY{"validated_diagrams<br/>empty?"}
        INCLUDE["Include diagram section<br/>━━━━━━━━━━<br/>## Architecture Impact<br/>## Experiment Design"]
        OMIT["Omit section entirely<br/>━━━━━━━━━━<br/>No placeholder in PR body"]
    end

    subgraph CIRatchet ["★ NEW CI CONTRACT RATCHET"]
        direction TB
        DISCOVER["★ Scan skills_extended/<br/>━━━━━━━━━━<br/>Auto-discover sub-skill callers<br/>and diagram generators"]
        CHECK_REFUSAL{"★ All callers have<br/>refusal handler?"}
        CHECK_PALETTE{"★ All generators have<br/>palette or mermaid-load?"}
    end

    T_PR_STYLED([PR with styled diagram section])
    T_PR_CLEAN([PR without diagram section])
    T_CI_PASS([CI PASS])
    T_CI_FAIL([CI FAIL])

    START --> ITER
    ITER --> INVOKE
    INVOKE --> GATED
    GATED -->|"yes — refused"| SKIP
    GATED -->|"no — executes"| EXECUTE
    SKIP --> MORE
    MORE -->|"yes"| ITER
    MORE -->|"no — all refused"| WRITE_DIAG
    WRITE_DIAG --> SET_EMPTY
    EXECUTE --> CHECK_MARKERS
    CHECK_MARKERS -->|"yes — valid"| APPEND
    CHECK_MARKERS -->|"no — invalid"| ITER
    APPEND --> ITER
    SET_EMPTY --> DIAGS_EMPTY
    ITER -->|"loop exhausted"| DIAGS_EMPTY
    DIAGS_EMPTY -->|"yes — empty"| OMIT
    DIAGS_EMPTY -->|"no — has diagrams"| INCLUDE
    INCLUDE --> T_PR_STYLED
    OMIT --> T_PR_CLEAN
    DISCOVER --> CHECK_REFUSAL
    DISCOVER --> CHECK_PALETTE
    CHECK_REFUSAL -->|"all compliant"| T_CI_PASS
    CHECK_REFUSAL -->|"gap found"| T_CI_FAIL
    CHECK_PALETTE -->|"all compliant"| T_CI_PASS
    CHECK_PALETTE -->|"gap found"| T_CI_FAIL

    class START terminal;
    class ITER,INVOKE phase;
    class GATED,MORE,CHECK_MARKERS,DIAGS_EMPTY detector;
    class SKIP,WRITE_DIAG gap;
    class SET_EMPTY,APPEND stateNode;
    class EXECUTE handler;
    class INCLUDE,OMIT output;
    class DISCOVER,CHECK_REFUSAL,CHECK_PALETTE newComponent;
    class T_PR_STYLED,T_PR_CLEAN,T_CI_PASS,T_CI_FAIL terminal;

Closes #637
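The lens loop in the Process/Execution Flow diagram can also be read as ordinary control flow. The actual handler is prose in SKILL.md that instructs the model, so the sketch below only mirrors the routing. `LensRun`, `collect_diagrams`, and the `invoke` callable are hypothetical names standing in for the Skill tool call and its result.

```python
from dataclasses import dataclass, field
from typing import Callable

# Refusal markers and change markers as named in the diagrams above.
REFUSAL_MARKERS = ("disable-model-invocation", "cannot be used")
CHANGE_MARKERS = ("★", "●")


@dataclass
class LensRun:
    validated_diagrams: list[str] = field(default_factory=list)
    # True means the caller should write the lens_unavailable_{ts}.txt diagnostic.
    all_refused: bool = False


def collect_diagrams(lens_slugs: list[str], invoke: Callable[[str], str]) -> LensRun:
    """Iterate lenses, discarding refusals instead of improvising a diagram."""
    run = LensRun()
    refused = 0
    for slug in lens_slugs:
        response = invoke(slug)  # hypothetical stand-in for the Skill tool call
        if any(marker in response for marker in REFUSAL_MARKERS):
            refused += 1  # discard silently; never fall back to freehand
            continue
        if any(marker in response for marker in CHANGE_MARKERS):
            run.validated_diagrams.append(response)
    run.all_refused = bool(lens_slugs) and refused == len(lens_slugs)
    return run
```

When `validated_diagrams` comes back empty, the composition step omits the diagram section entirely rather than inserting a placeholder, which matches the "section omitted" terminal in both diagrams.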
Implementation Plan
Plan file: `/home/talon/projects/autoskillit-runs/remediation-20260405-212016-569394/.autoskillit/temp/rectify/rectify_sub-skill-refusal-and-palette-contracts_2026-04-05_175800_part_a.md`

🤖 Generated with Claude Code via AutoSkillit
Token Usage Summary