Rectify: Skill Contracts Pattern-Example Binding — PART A ONLY by Trecek · Pull Request #425 · TalonT-Org/AutoSkillit

Trecek · 2026-03-17T05:28:56Z

Summary

skill_contracts.yaml holds expected_output_patterns — regexes used at runtime to validate
session output. These regexes are authored manually, independently from the SKILL.md output
specs they're meant to validate. The audit-impl entry used (GO|NO_GO) (underscore) while
audit-impl/SKILL.md line 340 mandates verdict = NO GO (space). When the skill ran to
completion with a NO GO verdict, _check_expected_patterns() returned False, the session was
classified as ContentState.CONTRACT_VIOLATION, and the pipeline hard-stopped instead of routing
to remediation.

Part A fixes the immediate mismatch, expands the test scope guard that should have caught it,
and corrects the seven test locations that propagated the wrong pattern.

Architecture Impact

Process Flow Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef modified fill:#0d47a1,stroke:#42a5f5,stroke-width:3px,color:#fff;

    %% TERMINALS %%
    START([run_skill called])
    PIPELINE_OK([pipeline continues])
    PIPELINE_RETRY([pipeline retries])
    PIPELINE_FAIL([on_failure: escalate_stop])

    subgraph ContractLookup ["Contract Lookup"]
        direction TB
        Contracts["● skill_contracts.yaml<br/>━━━━━━━━━━<br/>expected_output_patterns:<br/>- verdict\\s*=\\s*(GO|NO GO)"]
        Resolver["● tools_execution.py<br/>━━━━━━━━━━<br/>output_pattern_resolver(skill_command)<br/>→ list[str]"]
    end

    subgraph SessionExec ["Session Execution"]
        direction TB
        Headless["● headless.py<br/>━━━━━━━━━━<br/>run_headless_core()<br/>subprocess emits NDJSON"]
        SubResult["SubprocessResult<br/>━━━━━━━━━━<br/>stdout / returncode<br/>termination / channel_confirmation"]
    end

    subgraph Recovery ["Recovery Attempts"]
        direction TB
        Rec1{"Separate Marker?<br/>━━━━━━━━━━<br/>marker in standalone msg?"}
        Rec2{"Pattern Recovery?<br/>━━━━━━━━━━<br/>patterns match assistant_messages<br/>+ channel != UNMONITORED?"}
    end

    subgraph OutcomeComputation ["Outcome Computation"]
        direction TB
        PatternCheck["_check_expected_patterns()<br/>━━━━━━━━━━<br/>re.search(pattern, result)<br/>AND across all patterns"]
        ContentState{"_evaluate_content_state()<br/>━━━━━━━━━━<br/>COMPLETE / ABSENT<br/>CONTRACT_VIOLATION / SESSION_ERROR"}
        Contradiction{"Contradiction guard<br/>━━━━━━━━━━<br/>success=True AND retry=True?"}
        DeadEnd{"Dead-end guard<br/>━━━━━━━━━━<br/>failed + no-retry<br/>+ channel confirmed?"}
    end

    subgraph Outcomes ["SkillResult"]
        direction LR
        Succeeded["SUCCEEDED<br/>━━━━━━━━━━<br/>success=True"]
        Retriable["RETRIABLE<br/>━━━━━━━━━━<br/>needs_retry=True"]
        Failed["FAILED<br/>━━━━━━━━━━<br/>subtype=adjudicated_failure"]
    end

    %% FLOW %%
    START --> Contracts
    Contracts -->|"patterns resolved"| Resolver
    Resolver -->|"patterns injected"| Headless
    Headless -->|"output captured"| SubResult
    SubResult --> Rec1
    Rec1 -->|"yes: combine messages"| PatternCheck
    Rec1 -->|"no"| Rec2
    Rec2 -->|"yes: reconstruct result"| PatternCheck
    Rec2 -->|"no"| PatternCheck
    PatternCheck -->|"all match → COMPLETE"| Contradiction
    PatternCheck -->|"any fail → CONTRACT_VIOLATION"| ContentState
    ContentState -->|"ABSENT → promote"| Retriable
    ContentState -->|"CONTRACT_VIOLATION<br/>SESSION_ERROR → terminal"| Failed
    Contradiction -->|"demote: retry wins"| Retriable
    Contradiction -->|"no conflict"| DeadEnd
    DeadEnd -->|"ABSENT → promote"| Retriable
    DeadEnd -->|"FAILED terminal"| Failed
    DeadEnd -->|"SUCCEEDED"| Succeeded
    Succeeded --> PIPELINE_OK
    Retriable --> PIPELINE_RETRY
    Failed --> PIPELINE_FAIL

    %% CLASS ASSIGNMENTS %%
    class START,PIPELINE_OK,PIPELINE_RETRY,PIPELINE_FAIL terminal;
    class Contracts,Resolver,Headless modified;
    class SubResult stateNode;
    class Rec1,Rec2 phase;
    class PatternCheck handler;
    class ContentState,Contradiction,DeadEnd detector;
    class Succeeded,Retriable,Failed stateNode;

State Lifecycle Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
    classDef modified fill:#0d47a1,stroke:#42a5f5,stroke-width:3px,color:#fff;

    subgraph ContractFields ["SKILL CONTRACT FIELDS (SkillContract)"]
        direction LR
        InitOnly["inputs / outputs<br/>━━━━━━━━━━<br/>INIT_ONLY<br/>Set at YAML load"]
        Patterns["● expected_output_patterns<br/>━━━━━━━━━━<br/>MUTABLE: corrected value<br/>verdict\\s*=\\s*(GO|NO GO)"]
        Examples["pattern_examples<br/>━━━━━━━━━━<br/>MUTABLE: examples list<br/>Binding cross-check target"]
    end

    subgraph Layer1 ["LAYER 1 — Semantic Rules (run_semantic_rules)"]
        direction TB
        MissingPatterns["missing-output-patterns<br/>━━━━━━━━━━<br/>WARNING if file_path output<br/>has no expected_output_patterns"]
        MissingExamples["missing-pattern-examples<br/>━━━━━━━━━━<br/>WARNING if patterns exist<br/>but pattern_examples is empty"]
        PatternMatch["pattern-examples-match<br/>━━━━━━━━━━<br/>ERROR if re.search(pattern, example)<br/>fails for ALL examples → blocks recipe"]
    end

    subgraph Layer2 ["LAYER 2 — Test Suite (pytest invariants)"]
        direction TB
        RegressionGuard["● test_audit_impl_no_go_pattern_matches_literal_output<br/>━━━━━━━━━━<br/>re.search(pattern, 'verdict = NO GO') must succeed<br/>Regression guard for issue #418"]
        AllPatternsTest["● test_every_pattern_example_matches_its_patterns<br/>━━━━━━━━━━<br/>ALL skills: every pattern must match<br/>at least one declared example"]
        EmitConsistency["● test_every_declared_output_has_emit_instruction<br/>━━━━━━━━━━<br/>skills/ + skills_extended/ scanned<br/>SKILL.md must have output_name = ..."]
    end

    subgraph Layer3 ["LAYER 3 — Runtime Enforcement (session.py)"]
        direction TB
        CheckPatterns["_check_expected_patterns()<br/>━━━━━━━━━━<br/>re.search(pattern, result)<br/>AND across all patterns"]
        ContentOutcome{"ContentState<br/>━━━━━━━━━━<br/>COMPLETE if all match<br/>CONTRACT_VIOLATION if any fail"}
    end

    subgraph Outcomes ["PIPELINE OUTCOME"]
        direction LR
        GoodRoute["SUCCEEDED<br/>━━━━━━━━━━<br/>→ on_result routes"]
        BadRoute["FAILED<br/>━━━━━━━━━━<br/>→ escalate_stop<br/>(was bug path)"]
    end

    %% FLOW %%
    InitOnly -->|"loaded once"| MissingPatterns
    Patterns -->|"validated by"| MissingPatterns
    Patterns -->|"cross-checked"| PatternMatch
    Examples -->|"cross-checked"| PatternMatch
    Examples -->|"validated by"| MissingExamples

    MissingPatterns -->|"WARNING only"| MissingExamples
    MissingExamples -->|"WARNING only"| PatternMatch
    PatternMatch -->|"ERROR → blocks recipe"| RegressionGuard

    RegressionGuard -->|"FAIL if NO_GO pattern"| AllPatternsTest
    AllPatternsTest -->|"FAIL if example mismatch"| EmitConsistency
    EmitConsistency -->|"FAIL if no emit line in SKILL.md"| CheckPatterns

    CheckPatterns -->|"match result"| ContentOutcome
    ContentOutcome -->|"COMPLETE"| GoodRoute
    ContentOutcome -->|"CONTRACT_VIOLATION"| BadRoute

    %% CLASS ASSIGNMENTS %%
    class InitOnly detector;
    class Patterns modified;
    class Examples phase;
    class MissingPatterns,MissingExamples gap;
    class PatternMatch stateNode;
    class RegressionGuard,AllPatternsTest,EmitConsistency modified;
    class CheckPatterns handler;
    class ContentOutcome stateNode;
    class GoodRoute output;
    class BadRoute detector;

Closes #418

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-418-20260316-202127-835503/temp/rectify/rectify_skill_contracts_pattern_examples_2026-03-16_000000_part_a.md

Token Usage Summary

Token Summary\n\n(No token data recorded for this pipeline run)

🤖 Generated with Claude Code via AutoSkillit

Trecek

AutoSkillit PR Review — Verdict: changes_requested (5 actionable findings)

src/autoskillit/recipe/rules_contracts.py

tests/contracts/test_skill_contracts.py

Trecek · 2026-03-17T05:38:08Z

tests/contracts/test_skill_contracts.py

+    Regression guard for issue #418: pattern had NO_GO (underscore) while SKILL.md
+    mandates NO GO (space). Must stay RED until skill_contracts.yaml is fixed.
+    """
+    import re


[info] tests: import re inside the test function body is repeated across multiple test functions (lines 67 and 85). Move to module-level imports per Python convention.

Trecek · 2026-03-17T05:38:08Z

tests/contracts/test_skill_contracts.py

+    """For every skill with expected_output_patterns and pattern_examples,
+    every pattern must re.search-match at least one example.
+
+    Permanent architectural guard: pattern/SKILL.md divergence fails CI before production.


[info] slop: Permanent architectural guard is an unnecessary phrase in the docstring — all tests are permanent until removed. Reads as AI-generated justification prose; remove.

Trecek · 2026-03-17T05:38:08Z

tests/recipe/test_rules_contracts.py

    )
+
+
+def test_pattern_examples_match_rule_fires_on_mismatch(monkeypatch) -> None:


[info] cohesion: Rule-level tests live in tests/recipe/test_rules_contracts.py while data-invariant tests live in tests/contracts/test_skill_contracts.py. These are symmetric pairs for the same feature; grouping strategy is inconsistent. Consider whether rule tests or data tests belong together.

Trecek

AutoSkillit review found 5 blocking issues (warning severity). See inline comments for details.

Findings summary:

src/autoskillit/recipe/rules_contracts.py L90 [defense]: Unguarded re.search — re.error from malformed patterns crashes the rule pass
src/autoskillit/recipe/rules_contracts.py L72, L112 [slop]: Function docstrings restate decorator descriptions verbatim
tests/contracts/test_skill_contracts.py L73 [tests]: Test asserts ALL patterns match NO GO output, not just the relevant one
tests/contracts/test_skill_contracts.py L61 [cohesion]: Specific test_audit_impl_no_go_pattern_matches_literal_output is made redundant by the new parametric test_every_pattern_example_matches_its_patterns

Fixes issue #418: `skill_contracts.yaml` had `(GO|NO_GO)` (underscore) while `audit-impl/SKILL.md` mandates `verdict = NO GO` (space). This caused any NO GO verdict to trigger CONTRACT_VIOLATION instead of routing to remediation. Changes: - skill_contracts.yaml: fix expected_output_patterns for audit-impl to use `(GO|NO GO)` instead of `(GO|NO_GO)` - tests/contracts/test_skill_contracts.py: add regression guard `test_audit_impl_no_go_pattern_matches_literal_output` that fails if the pattern stops matching `verdict = NO GO` - tests/recipe/test_skill_emit_consistency.py: expand scan from `skills/` (3 skills) to `skills/ + skills_extended/` (60 skills) - tests/execution/test_session_adjudication.py: fix 4 hardcoded `(GO|NO_GO)` patterns to `(GO|NO GO)` - tests/execution/test_headless.py: fix 2 hardcoded patterns - tests/execution/test_session_debug_logging.py: fix 1 hardcoded pattern Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds four tests that are RED on the post-Part-A codebase: - test_every_pattern_example_matches_its_patterns: CI guard that all expected_output_patterns match at least one declared example via re.search - test_every_skill_with_patterns_has_examples: every skill with patterns must also have pattern_examples - test_pattern_examples_match_rule_fires_on_mismatch: semantic rule ERROR fires when pattern matches none of the examples - test_missing_pattern_examples_rule_fires_when_examples_absent: semantic rule WARNING fires when patterns exist but examples are absent Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Extends SkillContract with pattern_examples: list[str], loaded from skill_contracts.yaml by get_skill_contract() and serialized by generate_recipe_card(). Backward-compatible: defaults to [] when absent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

For every skill with expected_output_patterns, add a companion pattern_examples list with canonical literal emit strings drawn from SKILL.md output specifications. All 38 skills covered; all patterns verified to re.search-match at least one of their examples. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…c rules Two new semantic rules in rules_contracts.py: - pattern-examples-match (ERROR): fires when an expected_output_patterns regex matches none of the declared pattern_examples. Closes the static validation gap where a pattern could be syntactically valid but never match any real skill output. - missing-pattern-examples (WARNING): fires when a skill has expected_output_patterns but no pattern_examples, signaling that static validation cannot occur. Together with the test invariants and YAML updates, these rules create a triangular binding between patterns, examples, and actual skill output. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…rbatim

…xample_matches_its_patterns

Trecek commented Mar 17, 2026

View reviewed changes

Trecek enabled auto-merge March 17, 2026 06:11

Trecek and others added 9 commits March 17, 2026 07:58

style: apply ruff format to test_skill_contracts.py

038cef1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(review): guard _re.search against re.error from malformed patterns

c46c6e9

fix(review): remove docstrings that restate decorator descriptions ve…

36d687f

…rbatim

fix(review): remove redundant test superseded by test_every_pattern_e…

b1688d0

…xample_matches_its_patterns

Trecek force-pushed the skill-contracts-yaml-no-go-pattern-mismatch-causes-legitimat/418 branch from 6af2b27 to b1688d0 Compare March 17, 2026 15:12

Trecek added this pull request to the merge queue Mar 17, 2026

Merged via the queue into integration with commit 17ce2d6 Mar 17, 2026
2 checks passed

Trecek deleted the skill-contracts-yaml-no-go-pattern-mismatch-causes-legitimat/418 branch March 17, 2026 15:16

Trecek mentioned this pull request Mar 19, 2026

Promote integration → main (56 PRs, 46 issues) #438

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rectify: Skill Contracts Pattern-Example Binding — PART A ONLY#425

Rectify: Skill Contracts Pattern-Example Binding — PART A ONLY#425
Trecek merged 9 commits intointegrationfrom
skill-contracts-yaml-no-go-pattern-mismatch-causes-legitimat/418

Trecek commented Mar 17, 2026

Uh oh!

Trecek left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Trecek Mar 17, 2026

Uh oh!

Trecek Mar 17, 2026

Uh oh!

Trecek Mar 17, 2026

Uh oh!

Trecek left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		)


		def test_pattern_examples_match_rule_fires_on_mismatch(monkeypatch) -> None:

Conversation

Trecek commented Mar 17, 2026

Summary

Architecture Impact

Process Flow Diagram

State Lifecycle Diagram

Implementation Plan

Token Usage Summary

Token Summary\n\n(No token data recorded for this pipeline run)

Uh oh!

Trecek left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Trecek Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Trecek Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Trecek Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Trecek left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant