
Promote develop to main (200 PRs, 160+ issues, 179 fixes, 480 features, 27 refactors, 22 infra)#2213

Merged
Trecek merged 1205 commits into main from develop
May 8, 2026

@Trecek Trecek commented May 8, 2026

Promotion: develop to main

This release is a major expansion of AutoSkillit, with 1,459 commits across 200 PRs and three new core packages: fleet/ for campaign dispatch, workspace/ for clone management, and planner/ for progressive resolution planning. Performance work adopted orjson, regex, uvloop, CDumper for YAML writes, CSafeLoader (11.1x speedup), and pre-compiled recipe YAML to reduce hot-path latency. The L2/L3 orchestration layers gained activity-aware fleet dispatch timeouts, resumable quota sleep exits, L2 session resume for context exhaustion, and a full fleet campaign retry and preview system. Reliability was hardened through a dozen targeted Rectify fixes addressing resume gate boot failures, review loop counter bypasses, merge conflict semantic guards, and token summary schema drift. The recipe schema, hook structure, and CLI surface were extended with new kinds, fields, validation commands, and the sous-chef → admiral rename for L3 terminology clarity.

Stats: 1,550 files changed, +287,137 / -46,998 lines | 1,459 commits | 200 PRs | 179 fixes, 480 features, 27 refactors, 22 infra, 50 tests, 9 docs

Highlights

  • Three new packages introduced: fleet/ (campaign dispatch + semaphore + sidecar + liveness), workspace/ (clone management + worktrees + skill resolution), and planner/ (progressive resolution planner with phases, assignments, and work packages)
  • Performance suite: orjson on JSON hot paths, regex drop-in package, uvloop event loop, CDumper for YAML writes, CSafeLoader (11.1x speedup), pre-compiled bundled recipe YAML to JSON, and mtime/lru_cache for registry loaders — applied across the critical execution path
  • Fleet orchestration maturity: activity-aware dispatch timeout, quota sleep resumable exit, campaign retry unblocking from terminal FAILURE state, pre-launch dispatch preview, and L2 session resume for context exhaustion and API disconnects
  • Recipe schema extended with RecipeKind, RecipeBlock, and CampaignDispatch; migration engine gains AdvisoryMigrationAdapter; hooks restructured into guards/ (17 scripts) and formatters/ (5 files); 9 new runtime dependencies added
  • L3 orchestrator terminology clarified: sous-chef renamed to admiral throughout; import layer (IL-N) vs orchestration level (L-N) notation formally disambiguated in docs and CLAUDE.md

Release Notes

New Features

  • Fleet campaign dispatch — new fleet/ package with campaign dispatch, semaphore, sidecar, liveness probes, and state persistence; includes pre-launch preview and retry unblocking from terminal FAILURE state
  • Workspace and planner packages — workspace/ (clone management, worktrees, skill resolution) and planner/ (progressive resolution with phases, assignments, WPs, validation) added as first-class packages
  • Activity-aware fleet dispatch timeout + resumable quota sleep exit — fleet sessions now respect activity signals before timing out; quota sleep is resumable across process restarts
  • L2 session resume — headless sessions can resume after context exhaustion or API disconnects without losing orchestration state
  • BEM pre-step gate for multi-issue dispatch — blocks dispatch until batch-eligibility conditions are met, preventing premature multi-issue fan-out
  • Local review rounds before PR creation — recipe-driven local review loop executes before any PR is opened, reducing rework on remote branches
  • Trigger-evaluation ordering mechanism — deterministic ordering of trigger evaluation across recipe steps
  • Content-aware cascade downgrade for additive-only changes — automatically downgrades review intensity when diffs are purely additive
  • Batch issue creation via GraphQL aliases — multi-issue creation in a single API round-trip using aliased mutations
  • User-config validation CLI — autoskillit config validate command surfaces misconfiguration before runtime
  • --profile CLI flag for provider selection at invocation time
  • Token summary per-step model column — token usage breakdown now includes the model used per step
  • Ingredient table display in fleet campaign sessions — campaigns surface the resolved ingredient table for operator inspection
  • Skip push_branch when output_mode == local — avoids unnecessary remote pushes in local-only workflows
  • Review-design handling for all-silent types — silent-type constructs now have explicit review design rules
  • Research-recipe smoke test — new smoke test covering the research-family recipe end-to-end
  • Bound unbounded routing loops in research-family recipes — loop guards prevent infinite routing under edge conditions
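
The batch issue creation item above relies on GraphQL aliases, which let one request document repeat the same mutation field under different names. The following is a minimal sketch of that idea, not the project's implementation: `build_batch_create_issues` is a hypothetical helper, and it assumes GitHub's `createIssue` mutation with a `CreateIssueInput` carrying `repositoryId`, `title`, and `body`.

```python
from typing import Any


def build_batch_create_issues(
    repository_id: str, issues: list[dict[str, str]]
) -> tuple[str, dict[str, Any]]:
    """Return (query, variables) that create every issue in one request.

    Hypothetical helper for illustration only.
    """
    if not issues:
        raise ValueError("at least one issue is required")
    declarations: list[str] = []
    selections: list[str] = []
    variables: dict[str, Any] = {}
    for index, issue in enumerate(issues):
        var = f"input{index}"
        declarations.append(f"${var}: CreateIssueInput!")
        # The alias (issue0, issue1, ...) allows createIssue to appear repeatedly.
        selections.append(
            f"  issue{index}: createIssue(input: ${var}) {{ issue {{ number url }} }}"
        )
        variables[var] = {
            "repositoryId": repository_id,
            "title": issue["title"],
            "body": issue.get("body", ""),
        }
    query = (
        "mutation(" + ", ".join(declarations) + ") {\n"
        + "\n".join(selections)
        + "\n}"
    )
    return query, variables
```

Passing titles and bodies as variables rather than interpolating them into the document avoids string-escaping problems; each alias then appears as its own key in the response data.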

Bug Fixes

  • Food Truck resume gate boot — resume gate failed to initialize on cold boot in certain fleet configurations
  • Review loop counter bypass — review iteration counter could be bypassed, allowing unbounded review rounds
  • Merge failure domain routing — failures during merge were routed to the wrong error domain, masking root cause
  • Merge-PR conflict detection semantic validation guards — conflict detection now enforces semantic correctness, not just syntactic presence
  • Artifact-dependent routing immunity + batch failure gate — artifact-dependent routes were incorrectly skipped; batch failure gate now enforced
  • Early-stop worktree routing blindness — early-stop signals were not propagated to worktree routing, causing stale sessions
  • Token summary schema drift — token summary output schema drifted from consumer expectations; realigned
  • Fleet tool visibility breach — fleet tools were incorrectly visible outside an open kitchen session
  • Completion marker architectural immunity — completion markers were not immune to architecture-level route overrides
  • Idle stall watchdog immunity — idle stall watchdog was bypassed by certain session states
  • Provider fields silent omission — provider configuration fields were silently dropped when unrecognized
  • run_python type-erasure boundary — type information was erased at the run_python IL boundary, causing downstream type errors

Performance

  • orjson adopted for all JSON hot paths — faster serialization/deserialization across pipeline and fleet layers
  • regex drop-in package replaces stdlib re on hot paths — the third-party regex module, adopted for throughput gains on complex patterns
  • uvloop event loop enabled — replaces asyncio default loop for lower per-call overhead in headless sessions
  • CDumper for YAML writes — C-extension YAML dumper replaces Python dumper on all write paths
  • CSafeLoader — 11.1x speedup on YAML load benchmarks; applied to all bundled recipe loading
  • Pre-compiled bundled recipe YAML to JSON — recipes are compiled to JSON at install time; runtime load bypasses YAML parsing entirely
  • mtime/lru_cache for registry loaders — registry files are re-parsed only when mtime changes; session-scope cache added
  • Lazy-import igraph — igraph import deferred to first use, shaving startup time for sessions that never touch the graph layer

Refactoring

  • Decompose tools_execution.py (924 lines) — split into focused modules; no behavioral change
  • Decompose cli/_prompts.py (819 lines) — prompt logic extracted into cohesive sub-modules
  • Split _type_results.py — result types separated by domain for cleaner import boundaries
  • Sous-chef renamed to admiral — L3 orchestrator role renamed throughout code, docs, and skills for terminology clarity
  • Restructure headless prompt — prompt assembly refactored for maintainability and testability
  • Clarify session type labels — session type identifiers normalized across CLI, logs, and telemetry
  • Disambiguate IL-N vs L-N notation — import layer levels (IL-0…IL-3) and orchestration levels (L0…L3) formally separated in all docs and CLAUDE.md
  • docs/CLAUDE.md hub-and-spoke reorganization — top-level CLAUDE.md now acts as a hub linking to per-package CLAUDE.md files

Infrastructure

  • Hooks restructured — hook scripts reorganized into guards/ (17 scripts) and formatters/ (5 files) subdirectories; HOOK_REGISTRY and RETIRED_SCRIPT_BASENAMES updated accordingly
  • 9 new dependencies: 8 runtime (orjson, regex, uvloop, lazy-loader, markdown-it-py, pathspec, pygments, pyjwt) plus api-simulator as a dev dependency
  • Recipe schema additions: RecipeKind enum, RecipeBlock, CampaignDispatch type; new fields on Recipe and RecipeStep
  • AdvisoryMigrationAdapter — new migration engine adapter; diagrams are now advisory-only and no longer block migration
  • Quota guard dual-window — quota guard now enforces both short and long observation windows independently
  • safety.protected_branches updated: integration replaced by develop
  • New review config section — recipe-level review configuration surface added to AutomationConfig
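
The dual-window quota guard item above says the short and long observation windows are enforced independently. A minimal sketch of that check might look like the following; `WindowUsage`, its fields, and `quota_blocked` are hypothetical names chosen for illustration, and the real configuration fields may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowUsage:
    """Usage of one observation window (hypothetical shape)."""

    used_fraction: float  # share of the window's quota already consumed, 0.0 to 1.0
    threshold: float      # fraction at which this window blocks new work

    def exceeded(self) -> bool:
        return self.used_fraction >= self.threshold


def quota_blocked(short: WindowUsage, long: WindowUsage) -> bool:
    # Each window is judged against its own threshold; either one can block.
    return short.exceeded() or long.exceeded()
```

For example, `quota_blocked(WindowUsage(0.95, 0.9), WindowUsage(0.2, 0.8))` is `True` because the short window alone is over its threshold.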

Breaking Changes

  • Sous-chef skill renamed to admiral — any external references to /autoskillit:sous-chef must be updated to /autoskillit:admiral
  • Hook script paths changed — hooks previously at hooks/*.py now live under hooks/guards/ or hooks/formatters/; any external hook registrations must be updated
  • safety.protected_branches — the branch name integration has been replaced by develop; update any config files that reference the old value
  • Recipe schema — RecipeKind, RecipeBlock, and CampaignDispatch are new schema types, and Recipe and RecipeStep gain new fields; existing recipes without them are processed via AdvisoryMigrationAdapter, but authors should update their recipes

Attention Required

  • Worktree setup procedure changed — never run autoskillit init from within a git worktree; use task install-worktree instead
  • Session log paths use hyphens — log directory and session folder names are hyphen-separated; any scripts constructing log paths with underscores will silently fail to find sessions
  • Subagent invocations require CLAUDE_CODE_EXIT_AFTER_STOP_DELAY=120000 — without this env var, subagents may not exit cleanly when finished
  • Pre-commit hooks must be re-run after this upgrade — hook script relocations mean existing local hook symlinks may point to deleted paths

Merged PRs

| PR | Title | Author | Labels |
|----|-------|--------|--------|
| #2212 | Rectify: Food Truck Resume Gate Boot | Trecek | |
| #2211 | Activity-Aware Fleet Dispatch Timeout + Quota Sleep Resumable Exit | Trecek | |
| #2210 | Adopt orjson for JSON Hot Paths | Trecek | |
| #2209 | Bound Unbounded Routing Loops in Research-Family Recipes | Trecek | |
| #2207 | Add BEM pre-step gate to fleet dispatcher for multi-issue parallel dispatch | Trecek | |
| #2206 | Perf — Drop-in Regex Package + Enable uvloop for MCP Server | Trecek | |
| #2204 | Rectify: Review Loop Counter Bypass via Approved-Verdict Catch-All Routing | Trecek | |
| #2203 | Display ingredient table in fleet campaign sessions | Trecek | |
| #2202 | Clarify session type labels, docstrings, and variable naming | Trecek | |
| #2200 | Perf — Add CDumper for YAML Writes + Hoist re.compile() to Module Level | Trecek | |
| #2199 | Mock asyncio.sleep in TestBulkCloseIssues to eliminate rate-limit delays | Trecek | |
| #2195 | Research-Recipe Smoke Test (Two Fixtures) | Trecek | |
| #2194 | Sweep stale documentation: counts, retired names, constraint scopes | Trecek | |
| #2187 | Rename sous-chef to admiral for L3 orchestrator terminology | Trecek | |
| #2184 | Tests for methodology-tradition expansion | Trecek | |
| #2183 | Tests for Experiment-Type Registry Expansion | Trecek | |
| #2181 | Disambiguate IL-N Import Layers from L-N Orchestration Levels | Trecek | |
| #2180 | Document Guard Fail-Mode Matrix (Fail-Open vs Fail-Closed) | Trecek | |
| #2178 | Fix batch_create_issues Validation Summary Append | Trecek | |
| #2177 | No-Mandatory-Figures Path in vis-lens-methodology-norms | Trecek | |
| #2176 | Review-Design Handling of All-Silent Types | Trecek | |
| #2175 | Rectify: Merge Failure Domain Routing — Rebase Misrouted to resolve-failures | Trecek | |
| #2174 | Audit-trail artifact in worktree | Trecek | |
| #2172 | Rectify: Merge-PR Conflict Detection — Semantic Validation Guards | Trecek | |
| #2171 | Add type field to RecipeIngredient and enforce integer-default consistency | Trecek | |
| #2170 | Pre-compile Bundled Recipe YAML to JSON for Faster Loading | Trecek | |
| #2169 | perf: lazy-import igraph inside build_recipe_graph | Trecek | |
| #2168 | fix: exempt subagents via agent_id and fix flag path via ancestor walk | Trecek | |
| #2167 | Fix API-Simulator Wheel Caching in CI | Trecek | |
| #2165 | Clarify recipe validation API naming and contract suffix conventions | Trecek | |
| #2164 | Fix misleading names in session/gating layer | Trecek | |
| #2163 | Rename mcp_health_guard.py to mcp_health_advisor.py | Trecek | |
| #2162 | Document the tag-visibility vs application-gate split for MCP tools | Trecek | |
| #2160 | Thread experiment_type and methodology_tradition through research recipe | Trecek | |
| #2159 | Skill-Rename Migration Note | Trecek | |
| #2157 | Cache DefaultSkillResolver Results and Share Across Rule Functions | Trecek | |
| #2156 | Low-risk housekeeping — sort utility, package gateway, and all gaps | Trecek | |
| #2155 | Recipe completeness guard in _parse_recipe | Trecek | |
| #2154 | Perf — Eliminate Redundant I/O in load_and_validate() | Trecek | |
| #2153 | perf: add mtime/lru_cache to uncached registry loaders | Trecek | |
| #2152 | Rectify: Artifact-Dependent Routing Immunity + Batch Failure Gate | Trecek | |
| #2151 | P5-A3-WP2 — Test dispatch_id_filter in audit/tokens/timings consumers | Trecek | |
| #2150 | perf: switch YAML loading to CSafeLoader (11.1x speedup) | Trecek | |
| #2149 | Perf — Session-scope _resolve_test_config cache (stop cache thrash) | Trecek | |
| #2148 | P5-A5-WP3 — Wire normalization + campaign_id into fleet _api.py dispatch paths | Trecek | |
| #2147 | Decompose cli/_prompts.py (819 lines) | Trecek | |
| #2146 | fleet/state.py — Size reduction and DispatchRecord Factory | Trecek | |
| #2143 | Decompose tools_execution.py (924 lines) | Trecek | |
| #2130 | Add Content-Aware Cascade Downgrade for Additive-Only Changes | Trecek | |
| #2129 | Split _type_results.py: Extract Execution-Scoped Types | Trecek | |
Show all 200 PRs

The full list of 200 merged PRs includes additional features, fixes, tests, and infrastructure changes spanning v0.7.0 through v0.9.562. See the commit history for the complete changelog.

Linked Issues

| Issue | Title | Status | Labels |
|-------|-------|--------|--------|
| #720 | Add model: field to all run_skill recipe steps | OPEN | bug, recipe:implementation, staged |
| #830 | Auto-init git repo in create_worktree | OPEN | recipe:implementation, staged |
| #831 | Skip push_branch when output_mode == local | OPEN | recipe:implementation, staged |
| #832 | Revise causal_inference trigger to require manipulation | OPEN | recipe:implementation, staged |
| #833 | Land 7 new experiment-type YAML files | OPEN | recipe:implementation, staged |
| #834 | Trigger-evaluation ordering mechanism (priority field) | OPEN | recipe:implementation, staged |
| #835 | Review-design handling of all-silent types | OPEN | recipe:implementation, staged |
| #836 | Tests for experiment-type registry expansion | OPEN | recipe:implementation, staged |
| #837 | Audit deep-research citations (human-in-the-loop) | OPEN | recipe:implementation, staged |
| #838 | Design spec for dedicated environment-setup skill | OPEN | recipe:implementation, staged |
| #839 | Implement environment-setup skill | OPEN | recipe:implementation, staged |
| #840 | Decouple Docker concern from implement-experiment | OPEN | recipe:implementation, staged |
| #841 | Rename vis-lens-domain-norms → vis-lens-methodology-norms | OPEN | recipe:implementation, staged |
| #842 | Land 12 methodology-tradition entries | OPEN | recipe:implementation, staged |
| #845 | Migrate ML sub-areas as conditional-branching venue appendices | OPEN | recipe:implementation, staged |
| #847 | Tests for methodology-tradition expansion | OPEN | recipe:implementation, staged |
| #848 | Generate research.yaml contract card | OPEN | recipe:implementation, staged |
| #849 | Generate research.yaml pre-rendered diagram | OPEN | recipe:implementation, staged |
| #850 | Research-recipe smoke test (two fixtures) | OPEN | recipe:implementation, staged |
| #851 | Extend stage-data network probes for biology databases | OPEN | recipe:implementation, staged |
160+ additional linked issues

This promotion carries forward closing references for 160+ additional issues across all domains. See the closing references section below.

Attention Required

  • Recipe schema breaking changes — RecipeKind, RecipeBlock, and CampaignDispatch are new types; Recipe and RecipeStep have many new validated fields; validate_recipe renamed to validate_recipe_structure
  • Migration engine refactor — DiagramMigrationAdapter is now advisory-only; diagrams will no longer auto-regenerate on migration
  • Hooks reorganization — three scripts renamed, 17+ new guards, HOOK_REGISTRY and RETIRED_SCRIPT_BASENAMES must be verified
  • 9 new runtime dependencies — including private api-simulator via uv.sources; lockfile and install verification required
  • Config changes — safety.protected_branches changed from integration to develop; quota_guard dual-window fields added
  • Version jump — 0.7.0 → 0.9.562 (562 patch versions accumulated on develop)

Architecture Impact

Module Dependency (Structural — "How are modules coupled?")

%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
graph TB
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;

    subgraph IL3 ["IL-3 — APPLICATION"]
        direction LR
        SERVER["● server/<br/>━━━━━━━━━━<br/>FastMCP server<br/>★ tools/ subpackage<br/>● _factory.py, _state.py"]
        CLI["● cli/<br/>━━━━━━━━━━<br/>CLI entry points<br/>★ doctor/, fleet/, session/, ui/<br/>★ 11 new modules"]
    end

    subgraph IL2 ["IL-2 — DOMAIN"]
        direction LR
        RECIPE["● recipe/<br/>━━━━━━━━━━<br/>Schema + validation<br/>★ rules/ subpackage<br/>● schema.py — RecipeKind, RecipeBlock<br/>Fan-in: 24"]
        MIGRATION["● migration/<br/>━━━━━━━━━━<br/>Versioned migration engine<br/>● engine.py — AdvisoryAdapter"]
        FLEET["★ fleet/<br/>━━━━━━━━━━<br/>Campaign dispatch<br/>semaphore, sidecar, liveness<br/>11 modules"]
    end

    subgraph IL1 ["IL-1 — INFRASTRUCTURE"]
        direction LR
        CONFIG["● config/<br/>━━━━━━━━━━<br/>AutomationConfig + Dynaconf<br/>● defaults.yaml<br/>Fan-in: 29"]
        PIPELINE["● pipeline/<br/>━━━━━━━━━━<br/>ToolContext DI, gate<br/>● tokens.py, telemetry_fmt.py"]
        EXECUTION["● execution/<br/>━━━━━━━━━━<br/>Headless sessions, CI<br/>★ process/, headless/<br/>★ session/, merge_queue/<br/>35 modules → core"]
        WORKSPACE["★ workspace/<br/>━━━━━━━━━━<br/>Clone mgmt, worktrees<br/>skill resolution<br/>9 modules"]
        PLANNER["★ planner/<br/>━━━━━━━━━━<br/>Progressive resolution<br/>phases, assignments, WPs<br/>5 modules"]
    end

    subgraph IL0 ["IL-0 — FOUNDATION"]
        direction LR
        CORE["● core/<br/>━━━━━━━━━━<br/>★ types/ subpackage<br/>★ runtime/ subpackage<br/>✕ _type_*.py replaced<br/>Fan-in: 200"]
    end

    subgraph HOOKS_LAYER ["HOOKS — Cross-cutting"]
        direction LR
        HOOKS["● hooks/<br/>━━━━━━━━━━<br/>★ guards/ — 17 scripts<br/>★ formatters/ — 5 files<br/>★ _dispatch, _hook_settings"]
        HOOKREG["● hook_registry.py<br/>━━━━━━━━━━<br/>Registry + retired names"]
    end

    SERVER -->|"imports"| RECIPE
    SERVER -->|"imports"| MIGRATION
    CLI -->|"imports"| RECIPE
    SERVER -->|"imports"| CONFIG
    SERVER -->|"imports"| PIPELINE
    SERVER -->|"imports"| EXECUTION
    SERVER -->|"imports"| WORKSPACE
    CLI -->|"imports"| CONFIG
    CLI -->|"imports"| EXECUTION
    CLI -->|"imports"| WORKSPACE
    SERVER -->|"imports"| HOOKS
    CLI -->|"imports"| HOOKS
    CLI -->|"imports"| HOOKREG
    RECIPE -->|"imports"| CORE
    MIGRATION -->|"imports"| CORE
    FLEET -->|"imports"| CORE
    FLEET -.->|"lateral: _prompts"| HOOKS
    CONFIG -->|"imports"| CORE
    PIPELINE -->|"imports"| CORE
    EXECUTION -->|"imports"| CORE
    WORKSPACE -->|"imports"| CORE
    PLANNER -->|"imports"| CORE
    PIPELINE -.->|"runtime: AutomationConfig"| CONFIG
    HOOKS -->|"imports"| CORE
    HOOKREG -->|"imports"| CORE
    SERVER -->|"imports"| CORE
    CLI -->|"imports"| CORE

    class SERVER,CLI cli;
    class RECIPE,MIGRATION phase;
    class FLEET newComponent;
    class CONFIG,PIPELINE handler;
    class EXECUTION handler;
    class WORKSPACE,PLANNER newComponent;
    class CORE stateNode;
    class HOOKS,HOOKREG detector;

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | IL-3 Apps | Application layer — server and CLI |
| Purple | IL-2 Domain | Recipe schema, migration engine |
| Green | New Packages | fleet/, workspace/, planner/ (new in this promotion) |
| Orange | IL-1 Infra | config, pipeline, execution |
| Teal | IL-0 Foundation | core/ (fan-in: 200 files) |
| Red | Hooks | Cross-cutting hook scripts and registry |

Process Flow (Physiological — "How does it behave?")

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 55, 'curve': 'basis'}}}%%
flowchart TB
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;

    START([MCP Tool Call])
    COMPLETE([SkillResult JSON])
    ERROR([Gate Error / Crash])

    subgraph Visibility ["● Session-Type Tag Visibility"]
        direction TB
        STYPE{"Session Type?<br/>━━━━━━━━━━<br/>env: SESSION_TYPE"}
        FLEET_VIS["★ Fleet Tags<br/>━━━━━━━━━━<br/>fleet-dispatch"]
        ORCH_VIS["Orchestrator Tags<br/>━━━━━━━━━━<br/>kitchen-core + packs"]
        SKILL_VIS["Skill Tags<br/>━━━━━━━━━━<br/>headless only"]
    end

    subgraph GatePhase ["● Gate Lifecycle"]
        direction TB
        OPEN_K["● open_kitchen<br/>━━━━━━━━━━<br/>gate.enable + hook_config<br/>quota cache prime"]
        GATE{"Gate<br/>Enabled?"}
        LOAD_R["● load_and_validate<br/>━━━━━━━━━━<br/>Cache → YAML → sub-recipe<br/>→ semantic rules → contract"]
        VALID{"Recipe<br/>Valid?"}
    end

    subgraph Dispatch ["● run_skill Dispatch"]
        direction TB
        GUARDS["● Guard Chain<br/>━━━━━━━━━━<br/>1. orchestrator_check<br/>2. gate_enabled<br/>3. skill_command_valid<br/>4. cwd_absolute<br/>5. dry_walkthrough"]
        PROVIDER["● Provider Resolution<br/>━━━━━━━━━━<br/>step override → recipe<br/>→ YAML field → config"]
        SKILL_SETUP["● Skill Session Setup<br/>━━━━━━━━━━<br/>resolve namespace<br/>compute closure<br/>init ephemeral dir"]
    end

    subgraph Headless ["● Headless Session"]
        direction TB
        LAUNCH["● _execute_claude_headless<br/>━━━━━━━━━━<br/>subprocess + PTY<br/>completion marker"]
        STALE_CHK{"Termination<br/>Reason?"}
        RECOVER["● Recovery Paths<br/>━━━━━━━━━━<br/>Channel B drain-race<br/>marker search<br/>pattern recovery"]
    end

    subgraph Adjudication ["● Result Adjudication"]
        direction TB
        OUTCOME["● _compute_outcome<br/>━━━━━━━━━━<br/>success gate chain<br/>+ retry FSM"]
        RETRY{"needs_retry?"}
        BUDGET{"Budget<br/>Exhausted?"}
        FALLBACK["● Provider Fallback<br/>━━━━━━━━━━<br/>inject fallback env<br/>re-launch session"]
        POST["● Post-Session<br/>━━━━━━━━━━<br/>flush log, record tokens<br/>refresh quota cache"]
    end

    START --> STYPE
    STYPE -->|"FLEET"| FLEET_VIS
    STYPE -->|"ORCHESTRATOR"| ORCH_VIS
    STYPE -->|"SKILL"| SKILL_VIS
    FLEET_VIS --> OPEN_K
    ORCH_VIS --> OPEN_K
    SKILL_VIS --> OPEN_K
    OPEN_K --> GATE
    GATE -->|"closed"| ERROR
    GATE -->|"open"| LOAD_R
    LOAD_R --> VALID
    VALID -->|"errors"| ERROR
    VALID -->|"valid"| GUARDS
    GUARDS -->|"any guard fails"| ERROR
    GUARDS -->|"pass"| PROVIDER
    PROVIDER --> SKILL_SETUP
    SKILL_SETUP --> LAUNCH
    LAUNCH --> STALE_CHK
    STALE_CHK -->|"STALE / IDLE_STALL"| RECOVER
    STALE_CHK -->|"TIMED_OUT"| OUTCOME
    STALE_CHK -->|"COMPLETED / NATURAL_EXIT"| OUTCOME
    RECOVER --> OUTCOME
    OUTCOME --> RETRY
    RETRY -->|"no"| POST
    RETRY -->|"yes"| BUDGET
    BUDGET -->|"exhausted"| POST
    BUDGET -->|"remaining"| FALLBACK
    FALLBACK --> LAUNCH
    POST --> COMPLETE

    class START,COMPLETE,ERROR terminal;
    class STYPE,GATE,VALID,STALE_CHK,RETRY,BUDGET stateNode;
    class OPEN_K,LOAD_R,GUARDS,PROVIDER,SKILL_SETUP phase;
    class LAUNCH,RECOVER,OUTCOME handler;
    class FLEET_VIS,ORCH_VIS,SKILL_VIS newComponent;
    class FALLBACK,POST output;

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Terminal | Entry and exit points |
| Teal | Decision | Routing and branching decisions |
| Purple | Phase | Configuration, validation, and guard chains |
| Orange | Handler | Execution, recovery, and adjudication |
| Green | New | New session-type visibility components |
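
The adjudication portion of the process flow (launch, compute outcome, check needs_retry, check budget, fall back, re-launch) can be read as a bounded retry loop. The sketch below illustrates only that control flow under simplifying assumptions: `Outcome`, `run_with_retry`, and the integer budget are hypothetical stand-ins, and the real retry FSM and provider fallback are not shown in this PR description.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Outcome:
    """Simplified session result (hypothetical shape)."""

    success: bool
    needs_retry: bool


def run_with_retry(
    launch: Callable[[int], Outcome], budget: int
) -> tuple[Outcome, int]:
    """Launch a session, re-launching while a retry is requested and budget remains.

    Returns the final outcome and the number of retries consumed.
    """
    retries = 0
    outcome = launch(retries)
    while outcome.needs_retry and retries < budget:
        retries += 1                # consume one unit of retry budget
        outcome = launch(retries)   # stands in for "inject fallback env, re-launch"
    # Both the "no retry needed" and "budget exhausted" paths reach post-session work.
    return outcome, retries
```

Both exits from the loop converge on the same return, mirroring how the diagram routes the "no" and "exhausted" edges to the Post-Session node.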

Closes #1622
Closes #1695
Closes #1699
Closes #1700
Closes #1701
Closes #1702
Closes #1703
Closes #1706
Closes #1707
Closes #1708
Closes #1709
Closes #1710
Closes #1712
Closes #1716
Closes #1717
Closes #1718
Closes #1719
Closes #1722
Closes #1723
Closes #1724
Closes #1725
Closes #1726
Closes #1727
Closes #1728
Closes #1729
Closes #1735
Closes #1745
Closes #1747
Closes #1748
Closes #1749
Closes #1751
Closes #1752
Closes #1753
Closes #1754
Closes #1755
Closes #1756
Closes #1772
Closes #1773
Closes #1774
Closes #1775
Closes #1776
Closes #1777
Closes #1778
Closes #1779
Closes #1780
Closes #1798
Closes #1802
Closes #1803
Closes #1804
Closes #1805
Closes #1806
Closes #1825
Closes #1831
Closes #1834
Closes #1835
Closes #1837
Closes #1838
Closes #1849
Closes #1851
Closes #1852
Closes #1853
Closes #1860
Closes #1861
Closes #1862
Closes #1863
Closes #1875
Closes #1877
Closes #1879
Closes #1880
Closes #1881
Closes #1882
Closes #1883
Closes #1884
Closes #1885
Closes #1886
Closes #1887
Closes #1888
Closes #1897
Closes #1898
Closes #1899
Closes #1900
Closes #1901
Closes #1902
Closes #1903
Closes #1905
Closes #1906
Closes #1910
Closes #1918
Closes #1924
Closes #1928
Closes #1932
Closes #1936
Closes #1943
Closes #1944
Closes #1945
Closes #1954
Closes #1955
Closes #1963
Closes #1964
Closes #1965
Closes #1966
Closes #1975
Closes #1976
Closes #1980
Closes #1986
Closes #1987
Closes #2005
Closes #2007
Closes #2008
Closes #2009
Closes #2020
Closes #2029
Closes #2035
Closes #2036
Closes #2039
Closes #2043
Closes #2044
Closes #2045
Closes #2047
Closes #2048
Closes #2049
Closes #2051
Closes #2061
Closes #2063
Closes #2097
Closes #2133
Closes #2134
Closes #2136
Closes #2137
Closes #2138
Closes #2139
Closes #2140
Closes #2141
Closes #2158
Closes #2173
Closes #2182
Closes #2188
Closes #2190
Closes #2196
Closes #2197
Closes #2205
Closes #2208
Closes #720
Closes #830
Closes #831
Closes #832
Closes #833
Closes #834
Closes #835
Closes #836
Closes #837
Closes #838
Closes #839
Closes #840
Closes #841
Closes #842
Closes #845
Closes #847
Closes #848
Closes #849
Closes #850
Closes #851
Closes #852
Closes #857
Closes #858

Generated with Claude Code via AutoSkillit

Trecek and others added 30 commits May 7, 2026 22:12
…wn (#1961)

## Summary

Add a `Model` column to the per-step token summary table and a new
per-model aggregate breakdown table. The model identity is sourced from
the `model_breakdown` dict already parsed by `extract_token_usage()` in
`_session_model.py`. The change threads model identity through three
paths: (1) in-memory accumulation via `TokenEntry.model`, (2) on-disk
persistence via a new `model_identifier` field in `token_usage.json`,
and (3) formatting in `TelemetryFormatter`, the stdlib-only hook, and
the compact PostToolUse formatter.

No cost estimation is included — the acceptance criteria require token
counts only, and no pricing infrastructure exists.
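
As a rough illustration of the per-model aggregate described above, the sketch below sums token counts by model. Only `TokenEntry.model` is named in this summary; the other fields and the `aggregate_by_model` function are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenEntry:
    """Per-step token record; fields other than `model` are assumed."""

    step: str
    model: str
    input_tokens: int
    output_tokens: int


def aggregate_by_model(entries: list[TokenEntry]) -> dict[str, dict[str, int]]:
    """Sum token counts per model (counts only, no cost estimation)."""
    totals: dict[str, dict[str, int]] = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0}
    )
    for entry in entries:
        totals[entry.model]["input_tokens"] += entry.input_tokens
        totals[entry.model]["output_tokens"] += entry.output_tokens
    return dict(totals)
```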

Closes #1906

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-135150-264911/.autoskillit/temp/make-plan/token_summary_model_column_plan_2026-05-05_135500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ll-Count Tests (#1967)

## Summary

Remove redundant and brittle test code across three test modules:
parametrize the three identical kitchen_rules rejection tests, delete
the duplicate session-type warning test, and convert exact skill-count
assertions to lower-bound checks. No production code changes.

Closes #1886

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-180602-586205/.autoskillit/temp/make-plan/deduplicate_session_type_kitchen_rules_skill_count_tests_plan_2026-05-05_180602.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…1969)

## Summary

The `resolve-failures` SKILL.md has an ambiguous verdict decision flow
that allows an LLM executor to emit `ci_only_failure` even after
successfully applying a fix. The fix restructures the Step 2d decision
tree to make the override rule explicit: **any time a code change is
committed and tests pass, the verdict is `real_fix`, regardless of
`failure_subtype`**. The Step 2d table is clarified to apply ONLY to the
"no fix applied" path, and a post-fix-loop verdict override is added to
prevent re-evaluation through the wrong decision path.

## Requirements

- REQ-RF-001: When `resolve-failures` applies a code change AND the
subsequent CI run passes, the verdict MUST be `real_fix`, not
`ci_only_failure`
- REQ-RF-002: `ci_only_failure` should only be emitted when no fix was
applied or when the applied fix did not resolve the CI failure
- REQ-RF-003: The fix must not break the existing `ci_only_failure` path
for genuinely unfixable CI failures
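
The override rule in REQ-RF-001 through REQ-RF-003 can be expressed as a small decision function. This is a sketch of the rule as stated, not the skill's actual logic (which lives in SKILL.md prose): `resolve_verdict` and its parameters are hypothetical, and `table_verdict` stands for whatever the Step 2d table yields on the no-fix path.

```python
def resolve_verdict(
    fix_committed: bool, tests_pass: bool, table_verdict: str = "ci_only_failure"
) -> str:
    """Apply the post-fix override before consulting the Step 2d table result."""
    if fix_committed and tests_pass:
        # REQ-RF-001: a committed change with passing tests is always a real fix.
        return "real_fix"
    # REQ-RF-002 / REQ-RF-003: otherwise the table-derived verdict stands.
    return table_verdict
```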

Closes #1954

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-180603-520539/.autoskillit/temp/make-plan/resolve_failures_ci_only_failure_verdict_fix_plan_2026-05-05_181000.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…itHub API Usage) (#1970)

## Summary

Add a `local_review_rounds` configuration and plumbing so that the
existing review loop steps (`annotate_pr_diff`, `review_pr`,
`resolve_review`, `check_review_loop`) receive a `review_mode` context
value (`"local"` or `"github"`) computed per-iteration. Part A covers
the config dataclass, defaults, ingredient bridge, callable
modification, recipe YAML wiring for all three looping recipes, and
tests for all of the above. Part B will cover the skill SKILL.md
behavioral changes (how `review-pr` and `resolve-review` branch on
`mode=local` vs `mode=github`) — implement as a separate task.
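
One plausible reading of the per-iteration `review_mode` computation is that the first `local_review_rounds` iterations run locally and later ones go to GitHub. The sketch below encodes that assumption only; the summary does not state the exact rule, and the `review_mode` function here is hypothetical.

```python
def review_mode(iteration: int, local_review_rounds: int) -> str:
    """Return "local" for the first N review iterations, then "github".

    Assumes zero-based iteration numbering; this rule is an assumption.
    """
    if iteration < 0 or local_review_rounds < 0:
        raise ValueError("iteration and local_review_rounds must be non-negative")
    return "local" if iteration < local_review_rounds else "github"
```

Under this reading, setting `local_review_rounds` to 0 would send every iteration to GitHub.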

Closes #1945

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-165914-350121/.autoskillit/temp/make-plan/local_review_rounds_plan_2026-05-05_170500_part_a.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…ng (#1971)

## Summary

Update three SKILL.md files (`compose-pr`, `prepare-pr`, `diagnose-ci`)
to use directive description language and fix a step numbering gap in
`diagnose-ci`. Then uncomment four step overrides in
`.autoskillit/config.yaml` to route `retry_worktree`, `compose_pr`,
`diagnose_ci`, and `prepare_pr` to the MiniMax M2.7-highspeed profile.

## Requirements

### REQ-1: Update 3 SKILL.md descriptions to directive language

PR #1937 updated 6 SKILL.md files to use directive language (per the
Seleznov 650-trial study: 94-100% activation vs 37-77% for passive
descriptions). Three of the four new MiniMax target skills still use
passive descriptions.

**Files to update:**

| File | Current description style | Required change |
|------|--------------------------|-----------------|
| `src/autoskillit/skills_extended/compose-pr/SKILL.md` | Passive prose: "Reads the PR prep file and validated arch-lens diagrams..." | Directive: "PR composition executor. ALWAYS invoke this skill when instructed to compose a PR. Do not read prep files or create PRs directly — use this skill first to load the composition workflow." |
| `src/autoskillit/skills_extended/prepare-pr/SKILL.md` | Passive prose: "Reads plan(s), runs git diff, classifies changed files..." | Directive: "PR preparation executor. ALWAYS invoke this skill when instructed to prepare PR metadata. Do not read plans or classify files directly — use this skill first to load the preparation workflow." |
| `src/autoskillit/skills_extended/diagnose-ci/SKILL.md` | No `description:` field at all | Add directive: "CI diagnosis executor. ALWAYS invoke this skill when instructed to diagnose CI failures. Do not fetch CI logs directly — use this skill first to load the diagnosis workflow." |

`retry-worktree` already has directive language (updated in PR #1937).

**Why this matters:** The hook-based skill load guard (PR #1937)
enforces Skill tool loading regardless of description language. But
directive descriptions improve voluntary model compliance —
belt-and-suspenders. MiniMax's "thoughtful disobedience" pattern means
every compliance signal helps.

### REQ-2: Fix diagnose-ci step numbering gap

The `diagnose-ci` SKILL.md workflow skips from Step 1 to Step 3 — there
is no Step 2. This is likely accidental (no logical reason for the gap).
MiniMax may attempt to invent a Step 2 from training priors, causing
unintended tool calls.

**File:** `src/autoskillit/skills_extended/diagnose-ci/SKILL.md`

**Current numbering:** Step 1, Step 3, Step 4, Step 5, Step 5a, Step 6,
Step 7 (no Step 2).

**Fix:** Renumber to sequential: Step 1, Step 2, Step 3, Step 4, Step
4a, Step 5, Step 6.

| Old | New |
|-----|-----|
| Step 3: Fetch Failure Summary | Step 2: Fetch Failure Summary |
| Step 4: Fetch Per-Job Logs | Step 3: Fetch Per-Job Logs |
| Step 5: Classify Failure | Step 4: Classify Failure |
| Step 5a: Subtype Classification | Step 4a: Subtype Classification |
| Step 6: Write Diagnosis Report | Step 5: Write Diagnosis Report |
| Step 7: Emit Output Tokens | Step 6: Emit Output Tokens |

**Bonus fix:** Line 80 says "proceed to Step 5 (write minimal
diagnosis)" — but old Step 5 is Classify Failure, not Write Diagnosis.
After renumbering, "proceed to Step 5" correctly resolves to Write
Diagnosis Report for the first time. This fixes a latent cross-reference
bug.
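
A numbering gap like this can be detected mechanically. The sketch below assumes step headings are Markdown headings of the form `## Step N: Title` (with an optional letter suffix such as `5a`); the actual heading format in `diagnose-ci/SKILL.md` may differ, and `find_step_gaps` is a hypothetical helper.

```python
import re

# Matches headings such as "## Step 3: Fetch Failure Summary" or "### Step 5a: ...".
STEP_RE = re.compile(r"^#+\s*Step\s+(\d+)([a-z]?)\b", re.MULTILINE)


def find_step_gaps(markdown: str) -> list[int]:
    """Return the step numbers missing between 1 and the highest step found."""
    present = {int(match.group(1)) for match in STEP_RE.finditer(markdown)}
    if not present:
        return []
    return [n for n in range(1, max(present) + 1) if n not in present]
```

Run against the old numbering (1, 3, 4, 5, 5a, 6, 7) this reports step 2 as missing; after the renumbering it reports nothing.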

### REQ-3: Verify step override activation

After REQ-1 and REQ-2 are complete, uncomment the 4 step overrides in
`.autoskillit/.secrets.yaml` and run one implementation pipeline to
verify:
1. Each step receives the FIRST ACTION directive (check session JSONL
for "FIRST ACTION" in the prompt)
2. Each step calls the Skill tool as its first action (check JSONL for
`"name":"Skill"` as the first tool call)
3. Each step completes successfully with structured output tokens
emitted correctly

Closes #1966

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-185537-800888/.autoskillit/temp/make-plan/prepare_4_run_skill_steps_for_minimax_m27_routing_plan_2026-05-05_190500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…#1972)

## Summary

The `run_python` MCP tool dispatches callables through
`_import_and_call` using a `dict[str, object]` parameter — a
type-erasure boundary. Direct MCP tools get Pydantic lax-mode validation
via FastMCP (which coerces `str→int` automatically), but `run_python`
callables receive raw unvalidated values. The dispatcher already calls
`inspect.signature(func)` but only uses it for `None→default` coercion
(PR #1602), never reading `param.annotation`. This creates an asymmetry
where `str`-typed callable parameters receiving JSON integers crash at
subprocess boundaries, f-string operations, or Path construction.

The architectural fix: extend `_import_and_call`'s existing coercion
loop to read `param.annotation` and coerce primitive scalar types,
closing the type-safety gap between the two dispatch paths.
Defense-in-depth: add `str()` guards at each subprocess call site.
Structural tests: a parametrized test matrix that exercises every
callable × every param type combination through `_import_and_call`.
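
The annotation-driven coercion described above can be sketched as follows. This is not the project's `_import_and_call`; `coerce_kwargs` is a hypothetical stand-alone helper, and it assumes annotations are real type objects (with `from __future__ import annotations` they would be strings and would need resolving first).

```python
import inspect
from typing import Any, Callable

# Primitive scalar annotations this sketch is willing to coerce to.
_SCALAR_TYPES = (str, int, float)


def coerce_kwargs(func: Callable[..., Any], kwargs: dict[str, object]) -> dict[str, object]:
    """Return a copy of kwargs coerced toward func's scalar annotations."""
    coerced = dict(kwargs)
    for name, param in inspect.signature(func).parameters.items():
        if name not in coerced:
            continue
        value = coerced[name]
        if value is None:
            # Existing behaviour described above: None falls back to the default.
            if param.default is not inspect.Parameter.empty:
                coerced[name] = param.default
            continue
        target = param.annotation
        # bool values are left alone so True/False never become "True" or 1.
        if target in _SCALAR_TYPES and not isinstance(value, (target, bool)):
            coerced[name] = target(value)
    return coerced
```

A value that cannot be converted (for example `int("abc")`) raises `ValueError` here, so a bad argument surfaces at the dispatch boundary instead of deep inside a subprocess call.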

Closes #1965

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/remediation-20260505-183247-696032/.autoskillit/temp/rectify/rectify_run_python_type_coercion_2026-05-05_183800.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
… State (#1968)

## Summary

When a fleet campaign dispatch fails, the `FAILURE` status is terminal —
`_ALLOWED_TRANSITIONS[FAILURE]` is `frozenset()` with no outgoing
transitions. This blocks explicit user retry (`--resume`) because both
the `has_failed_dispatch()` halt guard in `dispatch_food_truck` and
Phase 2 of `resume_campaign_from_state` unconditionally reject campaigns
with any FAILURE record.

The fix adds a `FAILURE → PENDING` transition, a
`reset_failed_dispatch()` function, and modifies the two halt check
sites to distinguish between **automatic continuation** (should still
halt) and **explicit user retry** (should reset the failed dispatch and
re-execute).

## Requirements

- REQ-RETRY-001: A failed dispatch MUST be retryable without manual
state file edits
- REQ-RETRY-002: The halt-on-failure guard MUST still prevent automatic
continuation to subsequent dispatches after an unacknowledged failure
- REQ-RETRY-003: Retry of a failed dispatch MUST reset its state and
re-execute it from scratch (not resume)
- REQ-RETRY-004: The retry mechanism MUST be safe under concurrent
access (respect existing `_resume_lock` + `fcntl.LOCK_EX` pattern)
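
The transition change and the two halt behaviours can be illustrated with a small state machine. Only `FAILURE`, `PENDING`, `_ALLOWED_TRANSITIONS`, and the name `reset_failed_dispatch` come from the summary; the other statuses, the function signatures, and `should_halt` are assumptions, and the locking required by REQ-RETRY-004 is deliberately omitted.

```python
from enum import Enum


class DispatchStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"   # assumed intermediate state
    SUCCESS = "success"   # assumed terminal state
    FAILURE = "failure"


_ALLOWED_TRANSITIONS: dict[DispatchStatus, frozenset[DispatchStatus]] = {
    DispatchStatus.PENDING: frozenset({DispatchStatus.RUNNING}),
    DispatchStatus.RUNNING: frozenset({DispatchStatus.SUCCESS, DispatchStatus.FAILURE}),
    DispatchStatus.SUCCESS: frozenset(),
    # Previously frozenset(): FAILURE was terminal. The new edge enables retry.
    DispatchStatus.FAILURE: frozenset({DispatchStatus.PENDING}),
}


def transition(current: DispatchStatus, new: DispatchStatus) -> DispatchStatus:
    """Return new if the move is allowed, otherwise raise ValueError."""
    if new not in _ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


def reset_failed_dispatch(statuses: dict[str, DispatchStatus]) -> list[str]:
    """Reset every FAILURE record to PENDING and return the affected names."""
    reset = []
    for name, status in statuses.items():
        if status is DispatchStatus.FAILURE:
            statuses[name] = transition(status, DispatchStatus.PENDING)
            reset.append(name)
    return reset


def should_halt(statuses: dict[str, DispatchStatus], explicit_retry: bool) -> bool:
    """Halt automatic continuation on failure, but let an explicit retry through."""
    has_failure = any(s is DispatchStatus.FAILURE for s in statuses.values())
    return has_failure and not explicit_retry
```

In this sketch an explicit retry would call `reset_failed_dispatch` before re-executing, so the dispatch restarts from `PENDING` instead of resuming.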

Closes #1695

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-180604-269785/.autoskillit/temp/make-plan/fleet_campaign_retry_blocked_by_terminal_failure_state_plan_2026-05-05_181500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…1973)

## Summary

Create `src/autoskillit/recipes/research-review.yaml` as a standalone
sub-recipe containing the 22-step PR/review phase extracted from
`research.yaml`. The recipe receives campaign-injected hidden
ingredients (`worktree_path`, `research_dir`, `report_path`,
`experiment_plan`, `experiment_results`, `experiment_type`,
`scope_report`, `visualization_plan_path`), lifts all review steps
verbatim-and-adapted, replaces the archival phase with dual terminal
stops (`review_pr_complete` for PR mode, `review_local_complete` for
local mode), and corrects routing targets to terminate within this
sub-recipe rather than routing to archival or non-existent steps.

Closes #1702

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-184855-274993/.autoskillit/temp/make-plan/p2_wp3_create_research_review_yaml_plan_2026-05-05_185500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…1974)

## Summary

`_INFRA_UNCONDITIONAL_FILES` in `tests/_test_filter.py` contains 9
filenames, but 3 of them (`test_hook_executability.py`,
`test_hook_registration_coverage.py`, `test_hook_registry.py`) live in
`tests/hooks/`, not `tests/infra/`. The path construction loop at line
1271 resolves all 9 under `tests/infra/`, silently dropping the 3 hook
tests from every tiered conservative filter run. The guard test in
`test_test_filter_tiered_always_run.py` checks only basenames
(`p.name`), masking the bug. The bug was introduced by commit 26c8059
(#1734), which moved the files without updating the constant.

The fix splits the constant into two frozen sets with correct directory
mappings, adds a second path construction loop, and strengthens all
guard tests to assert parent directories.
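
A rough sketch of the split, assuming one constant per directory. `_HOOKS_UNCONDITIONAL_FILES`, `unconditional_paths`, and `parents_by_basename` are hypothetical names, and the single infra entry is a placeholder because the remaining filenames are not listed in this summary.

```python
from pathlib import Path

# Placeholder entry: the real infra filenames are not listed in this summary.
_INFRA_UNCONDITIONAL_FILES = frozenset({"test_placeholder_infra.py"})
# Hypothetical constant name; the three filenames are taken from the summary.
_HOOKS_UNCONDITIONAL_FILES = frozenset(
    {
        "test_hook_executability.py",
        "test_hook_registration_coverage.py",
        "test_hook_registry.py",
    }
)


def unconditional_paths(tests_root: Path) -> set[Path]:
    """Build the always-run paths, one loop per directory."""
    paths: set[Path] = set()
    for name in _INFRA_UNCONDITIONAL_FILES:
        paths.add(tests_root / "infra" / name)
    for name in _HOOKS_UNCONDITIONAL_FILES:
        paths.add(tests_root / "hooks" / name)
    return paths


def parents_by_basename(paths: set[Path]) -> dict[str, str]:
    """Map each basename to its parent directory name, for guard assertions."""
    return {p.name: p.parent.name for p in paths}
```

A guard test that asserts on `p.parent.name` as well as `p.name` would have failed as soon as the files moved, which is the property the strengthened tests are after.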

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-203325-376042/.autoskillit/temp/make-plan/fix_test_filter_hook_test_path_mismatch_plan_2026-05-05_203500.md`

## Changed Files

- tests/_test_filter.py
- tests/test_test_filter.py
- tests/test_test_filter_coverage_map.py
- tests/test_test_filter_tiered_always_run.py

Closes #1875

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…1977)

## Summary

Fix two validated audit findings (C4-1 and C4-3) by replacing weak test
assertions with precise positive checks. No production code changes are
required.

**Finding C4-1** — `tests/server/test_tools_dispatch_halt.py`: Five
tests that verify dispatch proceeds past the halt gate use `assert
result.get("error") != "fleet_campaign_halted"`. Each test calls
`_setup_standard_dispatch()` which wires a valid recipe and executor, so
the expected outcome is that the dispatch proceeds past the halt gate —
`assert "dispatch_id" in result` is the correct assertion.

**Finding C4-3** — `tests/cli/test_doctor.py` lines 448 and 464: Two
tests assert `checks[0]["severity"] in ("warning", "error")`. Source
inspection of `_check_plugin_cache_exists` and
`_check_installed_plugins_entry` confirms both return `Severity.WARNING`
unconditionally under the test conditions.
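
To see why the negative assertion in C4-1 is weak, consider the sketch below. The two result dictionaries and the error string `"some_other_error"` are made up for illustration; only `"fleet_campaign_halted"` and the `dispatch_id` key come from the finding.

```python
def passes_weak_check(result: dict) -> bool:
    """The original style: only rules out one specific error."""
    return result.get("error") != "fleet_campaign_halted"


def passes_strong_check(result: dict) -> bool:
    """The replacement style: requires positive evidence of a dispatch."""
    return "dispatch_id" in result


# Hypothetical payloads, not real tool output.
unrelated_failure = {"error": "some_other_error"}
successful_dispatch = {"dispatch_id": "d-1"}
```

The weak check accepts `unrelated_failure` even though nothing was dispatched, while the strong check accepts only `successful_dispatch`.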

Closes #1887

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/strengthen-assertions-20260505-211450-512860/.autoskillit/temp/make-plan/strengthen_assertions_plan_2026-05-05_211450.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

Add a validate-only pre-commit hook (`scripts/check_sub_claude_md.py`)
that checks every sub-CLAUDE.md file table mentions all `.py` files in
its directory. This catches the gap at commit time (during `pre-commit
run --all-files`) instead of at CI time, preventing the systematic 5+ CI
round-trip failures observed since PR #1820. The script replicates the
coverage logic from `test_sub_claude_md_covers_all_py_files` and
`test_tests_sub_claude_md_covers_all_py_files`, using the same
`EXPECTED_SUB_CLAUDE_MDS` lists. A new `.pre-commit-config.yaml` stanza
triggers it on `.py` file changes under `src/autoskillit/` and `tests/`.

## Requirements

- New script: `scripts/check_sub_claude_md.py` (validate-only, exits 1
with structured message on mismatch)
- New stanza in `.pre-commit-config.yaml` triggered on `files:
^(tests/|src/autoskillit/).*\.py$`
- Must check both `tests/<subdir>/CLAUDE.md` and
`src/autoskillit/<subdir>/CLAUDE.md` file tables
- Must use the same `EXPECTED_SUB_CLAUDE_MDS` lists as the test files
(or derive from disk)
- No auto-fix — the agent must manually add the row with a meaningful
Purpose description
- Pattern: `pass_filenames: false` (like existing `doc-counts` hook)
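
The core of such a validate-only check might look like the sketch below. It is not the contents of `scripts/check_sub_claude_md.py`: the function names are hypothetical, and plain substring matching is an assumption that would wrongly accept `a.py` if the table only mentioned `data.py`.

```python
from pathlib import Path


def missing_rows(directory: Path) -> list[str]:
    """Return .py filenames in directory that its CLAUDE.md never mentions."""
    claude_md = directory / "CLAUDE.md"
    table_text = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    return sorted(p.name for p in directory.glob("*.py") if p.name not in table_text)


def main(directories: list[Path]) -> int:
    """Validate-only entry point: report gaps and return a process exit code."""
    failed = False
    for directory in directories:
        for name in missing_rows(directory):
            print(f"{directory / 'CLAUDE.md'}: missing row for {name}")
            failed = True
    return 1 if failed else 0
```

Returning 1 without touching any file matches the "no auto-fix" requirement: the author still has to write the row and its Purpose description by hand.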

## Changed Files

### New (★):
★ scripts/check_sub_claude_md.py
★ tests/docs/test_check_sub_claude_md_script.py

### Modified (●):
● .pre-commit-config.yaml
● tests/docs/CLAUDE.md
● tests/infra/test_ci_dev_config.py

Closes #1975

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-211945-173795/.autoskillit/temp/make-plan/add_pre_commit_hook_sub_claude_md_plan_2026-05-05_213000.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

Replace `test_config_resolution_fleet_enabled_via_experimental` in
`tests/config/test_fleet_config.py` (lines 140–155) with an isolated
version that uses `tmp_path` to create a synthetic `config.yaml`
containing `experimental_enabled: true`, then calls
`load_config(tmp_path)` to exercise the full config loading pipeline.
The live-disk read (`Path(__file__).parents[2] / ".autoskillit" /
"config.yaml"`) and its CI skip guard are removed entirely.

The test is **rewritten** (not deleted) because
`test_is_feature_enabled_fleet_defaults_false` (in
`tests/core/test_type_constants.py`) only exercises
`is_feature_enabled()` in isolation — it does not cover the
`load_config()` → Dynaconf layer merge →
`AutomationConfig.from_dynaconf()` → `is_feature_enabled()` pipeline.
The rewrite provides genuine CI coverage of that end-to-end path, which
the delete option would leave uncovered.

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-221515-520014/.autoskillit/temp/make-plan/remove_live_config_read_plan_2026-05-05_221745.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

Co-authored-by: Trecek <trecek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…1981)

## Summary

C4-6 (`load_config(tmp_path / "settings.toml")`, wrong argument type) was
already resolved by a prior split commit. C4-9 (three separate
`load_config()` calls for one-liner assertions in `TestWorkspaceConfig`)
is the sole remaining change: consolidate the three test methods into a
single `test_workspace_config_defaults` method that calls
`load_config(tmp_path)` once and asserts all fields together. Drop the
`hasattr` check as redundant.

## Requirements

### C4-6 — load_config() argument bug (ALREADY RESOLVED)
Fix `load_config(tmp_path / "settings.toml")` → `load_config(tmp_path)`.
Passing a `.toml` file path causes silent fallback to defaults.
(Resolved in commit a22cb18.)

### C4-9 — Consolidate TestWorkspaceConfig assertions (REQUIRES ACTION)
Consolidate `TestWorkspaceConfig`'s three separate
`load_config(tmp_path)` calls into a single
`test_workspace_config_defaults` test checking all three fields at once.
Drop the `hasattr` check.

## Changed Files

### Modified (●):
● tests/config/test_config.py

Closes #1888

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-221514-570647/.autoskillit/temp/make-plan/fix_load_config_argument_bug_and_consolidate_workspace_assertions_plan_2026-05-05_222100.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

Two test files contain stale docstrings and assertion messages that
reference old names or misstate what the assertions actually check:

- **C5-2** (`tests/workspace/test_skills.py`): the
  `test_bundled_skills_list_matches_filesystem` docstring and failure
  message still say `make-script-skill`, although the skill was renamed
  to `write-recipe`.
- **C5-5** (`tests/execution/test_process_submodules.py`): per-symbol
  test docstrings say `"exports X"`, but the assertions verify
  `__module__` (definition origin), not `__all__` membership. Each
  docstring should say `"is defined in X submodule"`.

Both fixes are pure string edits in test files. No logic changes, no new
fixtures, no isolation concerns.
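
The distinction behind C5-5 is easy to see with the standard library, used here purely as an illustration unrelated to the project's modules: a name can be exported by a package while being defined in a submodule.

```python
import json

# JSONDecoder is exported by the json package (listed in __all__)...
exported = "JSONDecoder" in json.__all__

# ...but its definition origin, reported by __module__, is the submodule.
defined_in = json.JSONDecoder.__module__

print(exported, defined_in)
```

An assertion on `__module__` therefore proves where a symbol is defined, not that the package exports it, which is why the docstring wording matters.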

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-221516-303979/.autoskillit/temp/make-plan/fix_stale_docstrings_workspace_execution_tests_plan_2026-05-05_000000.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…1979)

## Summary

Create `src/autoskillit/recipes/research-archive.yaml` as a standalone
sub-recipe that extracts the 9-step archival phase from `research.yaml`
(lines 855–963). The critical change from the parent recipe: all
ingredient-sourced values (`pr_url`, `worktree_path`, `research_dir`,
`base_branch`) use `inputs.X` references instead of `context.X`, since
these are declared ingredients in the standalone recipe rather than
step-captured context variables. Step-captured values
(`experiment_branch`, `artifact_branch`, `artifact_pr_url`,
`archive_tag`) correctly remain as `context.X`. Pack declarations are
`[github, ci]` (the archival phase only needs GitHub CLI and CI tools,
not the full `research` pack). No `autoskillit_version` field —
consistent with bundled recipe policy (removed in #1950). Only 4
ingredients: 3 campaign-sourced hidden (`worktree_path`, `research_dir`,
`pr_url`) and 1 user-input (`base_branch`). The
`report_path_after_finalize` and `source_dir` ingredients from the
parent recipe are omitted because no archival step references them.

Closes #1703

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260505-213323-163152/.autoskillit/temp/make-plan/p2_wp4_create_research_archive_yaml_sub_recipe_plan_2026-05-05_213600.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

The hook guard system lacks structural enforcement that
command-inspecting guards must cover all tool variants that execute
shell commands. Guards are written independently with ad-hoc extraction
of command text, and the test suite validates each guard only against
the tool format it was designed for — not against all tools it should
logically intercept. The fix adds a structural meta-test that makes it
impossible to register a command-inspecting guard without covering both
the `Bash` native tool and `run_cmd` MCP tool, plus a parametrized test
helper that forces every such guard to prove it blocks dangerous
commands through either tool pathway.

This closes a gap where `unsafe_install_guard.py` and
`pr_create_guard.py` only read `tool_input.cmd` (from `run_cmd`) but
ignore `tool_input.command` (from `Bash`), allowing headless agents to
bypass these guards entirely when using the native Bash tool instead of
the MCP wrapper.
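
A guard that reads the command text from either tool shape could use a helper along these lines. The key names `command` (native `Bash`) and `cmd` (`run_cmd`) come from the description above; `extract_command` itself is a hypothetical helper, not the project's code.

```python
def extract_command(tool_input: dict) -> str:
    """Return the shell command text from a Bash or run_cmd tool input."""
    # Bash supplies "command"; the run_cmd MCP tool supplies "cmd".
    for key in ("command", "cmd"):
        value = tool_input.get(key)
        if isinstance(value, str) and value:
            return value
    return ""
```

Routing every command-inspecting guard through one extraction point means a new tool variant only needs to be added in a single place.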

Closes #1980

## Implementation Plan

Plan file:
`.autoskillit/temp/rectify/rectify_unsafe-install-guard-bash-tool-coverage-gap_2026-05-05_223500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@Trecek Trecek left a comment

AutoSkillit PR Review (individual)

@@ -436,3 +450,129 @@ async def batch_cleanup_clones(
except Exception as exc:
logger.warning("batch_cleanup_clones failed", exc_info=True)
return json.dumps({"deleted": [], "preserved": [], "error": str(exc)})

[warning] defense: batch_cleanup_clones error handler returns result missing delete_failures key. Callers get KeyError.

@Trecek Trecek left a comment

AutoSkillit PR Review (individual)

parts.append(b.get("text", ""))
# Non-text blocks (thinking, tool_use, etc.) contribute no text
self.result = "\n".join(parts)
elif not isinstance(self.result, str):

[warning] slop: Unreachable elif branch in post_init: elif not isinstance(self.result, str) is dead code.

@Trecek Trecek left a comment

AutoSkillit PR Review (batch 4)

description="L3 Fleet Orchestrator — multi-session campaign dispatch",
tool_tags=frozenset({"fleet"}),
skill_categories=frozenset({"fleet"}),
import_package="autoskillit.fleet",

[critical] arch: IL-0 module stores import_package='autoskillit.fleet' and 'autoskillit.planner' as string data in FEATURE_REGISTRY. No actual import at module load. But importlib.import_module usage would violate layer boundary.

This finding requires a human decision — the correct path is ambiguous.

ended_at = time.time()

# --- Timeout pre-check: short-circuit before result-block parsing ---
if skill_result.subtype == "timeout":

[warning] bugs: skill_result.subtype compared as string literal == 'timeout'. Should use CliSubtype.TIMEOUT enum for type safety.


try:
new_version = importlib.metadata.version("autoskillit")
except Exception:

[warning] defense: _verify_update_result catches bare except Exception and falls back to new_version=current. Cannot distinguish version unchanged from infrastructure error.


if Version(latest) > Version(current):
return Signal("binary", f"New release: {latest} (you have {current})")
except Exception:

[warning] defense: _binary_signal catches bare except Exception and returns None. Missing packaging import silently returns no-signal.

if not isinstance(conditions, list):
return False
return condition in conditions
except Exception:

[warning] defense: _is_dismissed catches bare except Exception and returns False. Malformed dismissed_version causes repeated prompts.

args=["autoskillit", "install"], returncode=0
)
with terminal_guard():
subprocess.run(cmd, check=False, env=skip_env)

[warning] defense: _run_update_sequence: upgrade subprocess return code not inspected. Failed upgrade silently ignored.

extras: dict[str, str] = {
"AUTOSKILLIT_HEADLESS": "1",
"AUTOSKILLIT_SESSION_TYPE": SESSION_TYPE_SKILL,
"MAX_MCP_OUTPUT_TOKENS": _MAX_MCP_OUTPUT_TOKENS_VALUE,

[warning] defense: MAX_MCP_OUTPUT_TOKENS and MCP_CONNECTION_NONBLOCKING duplicated between extras dict and _SESSION_BASELINE_ENV. Silent maintenance hazard if canonical values change.

extras[KITCHEN_SESSION_ID_ENV_VAR] = kitchen_session_id
if allowed_write_prefix:
extras["AUTOSKILLIT_ALLOWED_WRITE_PREFIX"] = allowed_write_prefix
# Layer caller env_extras (campaign vars) UNDER the mandatory keys.

[warning] defense: build_food_truck_cmd: caller env_extras keys silently dropped if they match mandatory keys. No warning logged.

f"Recipe '{recipe}' could not be loaded: {exc}",
)

_DISPATCHABLE_KINDS = frozenset({"standard", "food-truck"})

[warning] defense: _DISPATCHABLE_KINDS defined inside _run_dispatch (recreated every call). Should be module-level constant.

except Exception as exc:
logger.error("run_skill unhandled exception", exc_info=True)
return SkillResult.crashed(
exception=exc,
skill_command=skill_command,
order_id=order_id,
).to_json()
except BaseException:

[warning] defense: CancelledError (BaseException) raised inside inner try falls through to except Exception and returns crashed SkillResult rather than being re-raised.

This finding requires a human decision — the correct path is ambiguous.

@Trecek Trecek left a comment

AutoSkillit PR Review (individual)

if new_version != current:
return True

from autoskillit.cli._install_info import upgrade_command

[warning] slop: Redundant import of upgrade_command inside _verify_update_result — already imported at module level.

@Trecek Trecek left a comment

AutoSkillit PR Review (individual)

except Exception as exc:
logger.warning("load_recipe failed for '%s'", recipe, exc_info=True)
return fleet_error(
FleetErrorCode.FLEET_RECIPE_NOT_FOUND,

[warning] slop: _DISPATCHABLE_KINDS defined as frozenset inside _run_dispatch on every call. Should be module-level constant.

@Trecek Trecek left a comment

AutoSkillit PR Review (individual)

@@ -170,7 +183,19 @@ async def run_python(
return json.dumps({"success": False, "error": f"{type(exc).__name__}: {exc}"})


@mcp.tool(tags={"autoskillit", "kitchen"}, annotations={"readOnlyHint": True})
def _persist_run_skill_state(skill_result: SkillResult, project_dir: Path) -> None:

[warning] slop: _persist_run_skill_state and _clear_run_skill_state are unnecessary single-line wrapper functions. Deferred imports could be inlined.

(L403 — outside diff hunk) [warning] bugs: _check_merge_base_unpublished accesses step.on_result.routes.values() without guard. If routes is None, raises AttributeError.

(L367 — outside diff hunk) [warning] defense: output_mode validation for research recipe is hardcoded name-based special case in generic open_kitchen handler. Should be driven by recipe schema.

This finding requires a human decision — the correct path is ambiguous.

(L275 — outside diff hunk) [warning] slop: _check_always_has_no_write_exit has multi-line docstring whose first sentence repeats the @semantic_rule description exactly.

(L270 — outside diff hunk) [warning] slop: install_result initialized to dummy CompletedProcess only to satisfy type checker before with block overwrites it. Use Optional typing instead.

@Trecek Trecek left a comment

AutoSkillit PR Review — Verdict: approved_with_comments

Scope: Top 20 most-changed Python source files out of 1,551 total changed files (287K LoC added).

37 warning-level findings across 15 files. No blocking changes required. See inline comments.

Finding Summary by Dimension

| Dimension | Findings | Key Pattern |
|-----------|----------|-------------|
| defense | 17 | Broad except Exception handlers swallowing errors silently (headless flush, update checks, rule loaders) |
| bugs | 9 | Assert-as-validation (disabled under -O), string-vs-enum comparisons, token log ordering |
| slop | 8 | Dead code branches, redundant imports, unnecessary wrapper functions, duplicate logic |
| cohesion | 2 | Asymmetric SkillResult construction across recovery paths |
| arch | 1 | IL-0 FEATURE_REGISTRY storing IL-2 import paths as string data |

Top Patterns

  1. Silent exception swallowing (10 findings): except Exception: pass or except Exception: return [] patterns suppress infrastructure failures. Most common in headless/__init__.py (flush paths), _update_checks.py, and recipe rule loaders. Recommended: narrow exception types or raise log level to WARNING.

  2. Assert-as-validation (2 findings): assert x is not None used for runtime invariants that should be if x is None: raise RuntimeError(...) — asserts are stripped under -O.

  3. Dead code / unnecessary wrappers (5 findings): Unreachable branches (_session_model.py:82), redundant one-line delegation functions (_cmd_rpc.py:127,170), duplicate logic (tools_git.py:24).

  4. Type safety (2 findings): String literal comparison == "timeout" instead of enum CliSubtype.TIMEOUT in fleet/_api.py.
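
A minimal sketch of the shapes recommended for patterns 1, 2, and 4. `CliSubtype` below is a stand-in enum (only the `TIMEOUT` member comes from the review), and `load_rules`, `require_session`, and `is_timeout` are hypothetical helpers, not functions from the reviewed files.

```python
import json
import logging
from enum import Enum
from typing import Optional

logger = logging.getLogger(__name__)


class CliSubtype(str, Enum):
    """Stand-in enum; the real definition is not shown in this review."""
    TIMEOUT = "timeout"
    SUCCESS = "success"  # hypothetical second member for contrast


def load_rules(raw: str) -> list:
    """Pattern 1: catch only the expected failure and log it at WARNING."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        logger.warning("rule file is not valid JSON: %s", exc)
        return []


def require_session(session_id: Optional[str]) -> str:
    """Pattern 2: an explicit raise still runs under `python -O`."""
    if session_id is None:
        raise RuntimeError("session_id must be set before resume")
    return session_id


def is_timeout(subtype: CliSubtype) -> bool:
    """Pattern 4: compare against the enum member, not a string literal."""
    return subtype is CliSubtype.TIMEOUT
```

Narrowing the `except` clause keeps unexpected failures (for example a `PermissionError` while reading the file) visible to the caller instead of turning them into an empty result.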

Files NOT Reviewed

This review covers only the top 20 source files by addition count. Notable files NOT reviewed include:

  • All test files (1,400+ lines of test changes)
  • Recipe YAML/JSON contracts
  • Skills and SKILL.md files
  • Configuration, documentation, and CI files
  • The remaining ~1,500 changed files

Trecek and others added 15 commits May 8, 2026 07:02
## Summary

Add `slots=True` to all 63 `@dataclass(frozen=True)` definitions (across
33 files in 11 packages) that currently lack it. Setting `slots=True` on
a frozen dataclass eliminates the per-instance `__dict__` and stores
fields in slot descriptors instead, reducing memory usage and improving
attribute access speed. The project's `requires-python = ">=3.11"` means
`slots=True` on `@dataclass` (introduced in 3.10) is fully supported.

Two frozen dataclasses already have `slots=True` (`result_parser.py`,
`_hook_settings.py`) and
serve as the established pattern for this change. Every affected
dataclass was verified to be
safe: no inheritance chains between dataclasses, no manually defined
`__slots__`, and no
non-dataclass parent that would conflict.

A new architecture compliance test is introduced first to establish the
invariant and prevent
future regressions.

Closes #2192

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260507-215724-550140/.autoskillit/temp/make-plan/perf_add_slots_true_frozen_dataclasses_plan_2026-05-07_000001.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 115 | 10.6k | 396.2k | 45.2k | 82 | 51.2k | 6m 58s |
| verify | claude-sonnet-4-6 | 1 | 204 | 21.8k | 1.2M | 59.7k | 66 | 48.2k | 5m 3s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.1M | 15.0k | 1.0M | 35.1k | 166 | 52.5k | 4m 45s |
| fix | claude-sonnet-4-6 | 1 | 360 | 17.2k | 2.7M | 90.0k | 108 | 79.7k | 15m 58s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 58.8k | 4.2k | 149.0k | 29.8k | 16 | 42.3k | 1m 26s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 49.7k | 1.5k | 175.9k | 29.8k | 15 | 15.0k | 50s |
| **Total** | | | 1.3M | 70.3k | 5.7M | 90.0k | | 289.0k | 35m 1s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 183 | 5727.6 | 287.1 | 81.7 |
| fix | 16 | 170299.2 | 4982.6 | 1077.5 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **199** | 28542.7 | 1452.1 | 353.0 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 3 | 679 | 49.6k | 4.3M | 179.1k | 27m 59s |
| MiniMax-M2.7-highspeed | 3 | 1.3M | 20.6k | 1.4M | 109.9k | 7m 1s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…#2217)

## Summary

Replace `return asdict(self)` with explicit field dicts in the
`to_dict()` method of four hot-path dataclasses: `DispatchRecord`,
`TokenEntry`, `TimingEntry`, and `FailureRecord`. For
`DispatchRecord.token_usage` (the single non-primitive field across all
four), use `dict(self.token_usage)` for a shallow copy that avoids the
deep-copy overhead of `asdict` while preserving safety for current
callers. Remove `asdict` from the `dataclasses` import in all four files
once it is no longer used. The cold-path `AutomationConfig`
serialization at `cli/app.py:302` is explicitly out of scope and must
remain unchanged.

## Requirements

## Acceptance Criteria

- [ ] All 4 hot-path `to_dict()` methods use explicit field dicts
- [ ] `DispatchRecord.token_usage` uses shallow copy
(`dict(self.token_usage)`)
- [ ] `AutomationConfig` at `cli/app.py:302` keeps `asdict()`
- [ ] All existing tests pass
- [ ] JSON output is identical (verified by round-trip tests)

## Changed Files

### New (★):
tests/fleet/test_state_schema.py
tests/pipeline/test_audit.py

### Modified (●):
src/autoskillit/core/types/_type_results.py
src/autoskillit/fleet/state_types.py
src/autoskillit/pipeline/timings.py
src/autoskillit/pipeline/tokens.py

Closes #2193

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260507-215724-984456/.autoskillit/temp/make-plan/perf_replace_asdict_plan_2026-05-07_220015.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 107 | 10.0k | 340.2k | 61.7k | 40 | 32.2k | 4m 19s |
| verify | claude-sonnet-4-6 | 1 | 68 | 8.6k | 256.7k | 46.2k | 42 | 33.3k | 4m 32s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.2M | 6.5k | 831.6k | 29.8k | 77 | 16.3k | 2m 48s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 73.3k | 3.7k | 205.7k | 29.8k | 20 | 15.3k | 1m 34s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 28.6k | 1.5k | 146.1k | 29.8k | 14 | 15.1k | 45s |
| **Total** | | | 1.3M | 30.5k | 1.8M | 61.7k | | 112.2k | 14m 1s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 136 | 6114.4 | 119.6 | 48.1 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **136** | 13090.7 | 825.1 | 223.9 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 2 | 175 | 18.7k | 596.9k | 65.5k | 8m 52s |
| MiniMax-M2.7-highspeed | 3 | 1.3M | 11.8k | 1.2M | 46.7k | 5m 8s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

Change validate-audit (project-local), validate-test-audit, and the
extended validate-audit SKILL.md files from writing flat timestamped
files into a shared `validate-audit/` directory to creating a per-run
timestamped subdirectory (`validate-audit-{YYYY-MM-DD_HHMMSS}/`) with
timestamp-free filenames inside. This matches the established pattern
used by `validate-team`. Update downstream path references in
`skill_contracts.yaml` and all affected tests.
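
As an illustration of the naming scheme, the sketch below builds a per-run directory path from a timestamp. The helper `audit_run_dir`, the `temp` base directory, and the `report.md` filename are assumptions made for the example; only the `validate-audit-{YYYY-MM-DD_HHMMSS}` pattern comes from the summary.

```python
from datetime import datetime
from pathlib import Path


def audit_run_dir(base: Path, now: datetime) -> Path:
    """Build the per-run path validate-audit-{YYYY-MM-DD_HHMMSS} under base."""
    stamp = now.strftime("%Y-%m-%d_%H%M%S")
    return base / f"validate-audit-{stamp}"


run_dir = audit_run_dir(Path("temp"), datetime(2026, 5, 7, 22, 54, 40))
# Files inside the run directory carry no timestamp of their own.
report = run_dir / "report.md"  # hypothetical filename
```

Keeping the timestamp in the directory name means every file from one run sorts together, and filenames stay stable across runs.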

Closes #1960

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260507-225440-809232/.autoskillit/temp/make-plan/validate-audit_adopt_per-run_subdirectory_output_pattern_plan_2026-05-07_225500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| build_execution_map | claude-sonnet-4-6 | 1 | 76 | 12.2k | 347.2k | 52.1k | 31 | 39.9k | 4m 22s |
| plan | claude-opus-4-6 | 1 | 79 | 19.2k | 755.8k | 79.2k | 57 | 85.7k | 7m 20s |
| verify | claude-sonnet-4-6 | 1 | 29 | 15.7k | 447.8k | 54.6k | 77 | 41.5k | 7m 44s |
| implement* | MiniMax-M2.7-highspeed | 1 | 3.8M | 20.0k | 2.3M | 29.8k | 178 | 72.2k | 7m 30s |
| fix | claude-sonnet-4-6 | 1 | 190 | 10.7k | 1.1M | 60.6k | 61 | 50.3k | 8m 29s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 89.7k | 4.2k | 208.5k | 29.8k | 20 | 42.2k | 1m 32s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 68.6k | 1.8k | 172.3k | 28.7k | 16 | 41.0k | 55s |
| **Total** | | | 3.9M | 83.7k | 5.3M | 79.2k | | 372.7k | 37m 54s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| build_execution_map | 0 | — | — | — |
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 162 | 13957.0 | 446.0 | 123.2 |
| fix | 2 | 532814.5 | 25159.0 | 5347.5 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **164** | 32062.6 | 2272.9 | 510.5 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 3 | 295 | 38.6k | 1.9M | 131.6k | 20m 35s |
| claude-opus-4-6 | 1 | 79 | 19.2k | 755.8k | 85.7k | 7m 20s |
| MiniMax-M2.7-highspeed | 3 | 3.9M | 25.9k | 2.6M | 155.4k | 9m 57s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

The recipe validation pipeline has an asymmetric structural gap: the
feature-gate axis has a complete `undeclared-feature-requirement` rule
(ERROR severity) that statically cross-references every `run_skill`
step's categories against `FEATURE_REGISTRY`, but the pack-gate axis has
no equivalent rule. This gap caused three incidents over five weeks. The
fix adds an `undeclared-pack-requirement` semantic rule mirroring the
feature-gate pattern, makes `unknown-required-pack` ERROR-severity,
fixes `research-design.yaml` to declare `vis-lens`, and updates the test
that locked in the incorrect `requires_packs` value.
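
A rough sketch of what a static pack cross-reference could look like. The recipe shape (`steps`, `tool`, `skill`), the `skill_packs` mapping, and the function name are all assumptions for illustration; the actual rule in `rules_packs.py` is not shown here and may be structured differently.

```python
def undeclared_pack_requirements(recipe: dict, skill_packs: dict) -> list:
    """Report run_skill steps whose skill needs a pack the recipe omits.

    recipe: hypothetical parsed recipe with `requires_packs` and `steps`.
    skill_packs: hypothetical mapping of skill name to owning pack.
    """
    declared = set(recipe.get("requires_packs", []))
    findings = []
    for name, step in recipe.get("steps", {}).items():
        if step.get("tool") != "run_skill":
            continue  # only skill invocations can pull in a pack
        pack = skill_packs.get(step.get("skill", ""))
        if pack is not None and pack not in declared:
            findings.append(
                f"step '{name}' uses pack '{pack}' not listed in requires_packs"
            )
    return findings
```

Because the check reads only the recipe and a static skill-to-pack mapping, it can run at validation time without launching any session.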

## Requirements

## Conflict Resolution Decisions

The following files had merge conflicts that were automatically
resolved.

## Changed Files

### Modified (●):

- `src/autoskillit/recipe/rules/rules_packs.py`
- `src/autoskillit/recipe/rules/rules_skills.py`
- `src/autoskillit/recipes/research-design.json`
- `src/autoskillit/recipes/research-design.yaml`
- `tests/recipe/test_bundled_recipes_general.py`
- `tests/recipe/test_bundled_recipes_research_design.py`
- `tests/recipe/test_rules_packs.py`

Closes #2220

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/remediation-20260507-235311-480573/.autoskillit/temp/rectify/rectify_undeclared_pack_requirement_2026-05-08_000100.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| rectify | claude-sonnet-4-6 | 1 | 6.6k | 13.1k | 907.1k | 118.7k | 197 | 67.7k | 8m 41s |
| dry_walkthrough | claude-sonnet-4-6 | 1 | 40 | 9.6k | 547.7k | 53.1k | 99 | 40.0k | 4m 56s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.9M | 16.7k | 1.6M | 70.4k | 160 | 129.5k | 7m 54s |
| assess | claude-opus-4-6 | 1 | 84 | 9.4k | 2.1M | 71.5k | 88 | 60.9k | 6m 41s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 83.5k | 2.9k | 206.4k | 34.0k | 22 | 52.2k | 1m 25s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 44.4k | 1.2k | 198.2k | 28.7k | 14 | 15.0k | 40s |
| **Total** | | | 2.1M | 52.9k | 5.5M | 118.7k | | 365.3k | 30m 20s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| rectify | 0 | — | — | — |
| dry_walkthrough | 0 | — | — | — |
| implement | 234 | 6951.9 | 553.6 | 71.2 |
| assess | 21 | 98049.5 | 2898.2 | 446.5 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **255** | 21745.4 | 1432.6 | 207.3 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 2 | 6.7k | 22.7k | 1.5M | 107.7k | 13m 38s |
| MiniMax-M2.7-highspeed | 3 | 2.1M | 20.8k | 2.0M | 196.7k | 9m 59s |
| claude-opus-4-6 | 1 | 84 | 9.4k | 2.1M | 60.9k | 6m 41s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

The `fleet campaign` session launches without an `initial_message`, so
Claude has no first user turn to trigger the `FIRST ACTION` block that
displays the ingredient table. The `fleet dispatch` path and `order`
path both correctly pass a greeting as `initial_message`. The fix adds a
`_FLEET_CAMPAIGN_GREETINGS` list and wires it through the campaign
launch path, mirroring the existing dispatch pattern.

Closes #2214

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260508-010012-804577/.autoskillit/temp/make-plan/fleet_campaign_session_missing_initial_message_greeting_trigger_plan_2026-05-08_010012.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-opus-4-6 | 1 | 75 | 9.1k | 933.1k | 62.0k | 71 | 52.3k | 5m 8s |
| verify | claude-opus-4-6 | 1 | 37 | 7.3k | 734.2k | 46.7k | 59 | 33.5k | 4m 3s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.1M | 12.7k | 1.0M | 34.0k | 100 | 56.4k | 5m 43s |
| fix | claude-sonnet-4-6 | 1 | 374 | 30.8k | 3.4M | 105.9k | 118 | 92.9k | 11m 35s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 169.7k | 4.1k | 402.1k | 28.7k | 37 | 41.2k | 1m 56s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 52.2k | 1.2k | 227.0k | 28.7k | 16 | 15.1k | 42s |
| **Total** | | | 1.3M | 65.4k | 6.7M | 105.9k | | 291.5k | 29m 9s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 153 | 6606.7 | 368.5 | 83.3 |
| fix | 20 | 168857.8 | 4646.4 | 1541.3 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **173** | 38638.6 | 1684.9 | 378.0 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-opus-4-6 | 2 | 112 | 16.5k | 1.7M | 85.8k | 9m 11s |
| MiniMax-M2.7-highspeed | 3 | 1.3M | 18.1k | 1.6M | 112.8k | 8m 22s |
| claude-sonnet-4-6 | 1 | 374 | 30.8k | 3.4M | 92.9k | 11m 35s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…2222)

## Summary

The `research-campaign.yaml` declares 8 campaign-level ingredients but
only forwards a subset to each dispatch's `ingredients:` block. Several
ingredients (`task`, `review_design`, `output_mode`, `review_pr`,
`audit_claims`) that sub-recipes declare are never forwarded — they
silently fall back to sub-recipe defaults (or to the `_run_dispatch`
auto-injection in the case of `task`), ignoring the user's
campaign-level values.

This plan fixes the YAML forwarding gaps and adds a new static
validation rule (`campaign-dangling-ingredient`) that catches this class
of bug at authoring time.
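
One way such a static check could work, sketched under assumed data shapes. The campaign layout (`ingredients`, `dispatches`, `recipe`, `name`), the `sub_recipe_ingredients` mapping, and the function name are hypothetical; the real `campaign-dangling-ingredient` rule may differ.

```python
def dangling_ingredients(campaign: dict, sub_recipe_ingredients: dict) -> list:
    """Find campaign ingredients a sub-recipe declares but never receives.

    campaign: hypothetical parsed campaign with top-level `ingredients`
        and a list of `dispatches`, each with its own `ingredients` block.
    sub_recipe_ingredients: hypothetical mapping of recipe name to the
        set of ingredient names that recipe declares.
    """
    declared = set(campaign.get("ingredients", {}))
    findings = []
    for dispatch in campaign.get("dispatches", []):
        accepted = sub_recipe_ingredients.get(dispatch["recipe"], set())
        forwarded = set(dispatch.get("ingredients", {}))
        # Declared at campaign level, accepted downstream, yet not forwarded.
        for name in sorted((declared & accepted) - forwarded):
            findings.append((dispatch["name"], name))
    return findings
```

Ingredients that a sub-recipe does not declare are ignored, so a campaign-only ingredient never triggers a finding.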

## Requirements

(Embedded in the issue body — issue #2215 describes the problem and
solution in detail)

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260508-010013-712390/.autoskillit/temp/make-plan/research_campaign_ingredient_forwarding_plan_2026-05-08_010600.md`

## Changed Files

### Modified (●):
● src/autoskillit/recipe/rules/rules_campaign.py
● src/autoskillit/recipes/campaigns/research-campaign.json
● src/autoskillit/recipes/campaigns/research-campaign.yaml
● tests/recipe/test_campaign_loader.py
● tests/recipe/test_research_campaign_rules.py
● tests/recipe/test_rules_campaign.py

Closes #2215

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-opus-4-6 | 1 | 78 | 13.1k | 649.7k | 68.6k | 78 | 59.9k | 6m 11s |
| verify | claude-opus-4-6 | 1 | 897 | 6.2k | 625.9k | 70.5k | 81 | 57.3k | 4m 27s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.1M | 10.7k | 974.1k | 28.7k | 83 | 51.5k | 7m 50s |
| fix | claude-sonnet-4-6 | 1 | 174 | 8.1k | 929.2k | 61.4k | 53 | 48.3k | 3m 47s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 133.1k | 3.0k | 341.7k | 28.7k | 24 | 15.2k | 1m 21s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 37.0k | 1.4k | 169.5k | 28.7k | 14 | 15.1k | 44s |
| **Total** | | | 1.2M | 42.4k | 3.7M | 70.5k | | 247.4k | 24m 21s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 297 | 3279.6 | 173.5 | 35.9 |
| fix | 33 | 28156.7 | 1463.2 | 246.1 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **330** | 11181.9 | 749.7 | 128.6 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-opus-4-6 | 2 | 975 | 19.3k | 1.3M | 117.2k | 10m 38s |
| MiniMax-M2.7-highspeed | 3 | 1.2M | 15.0k | 1.5M | 81.9k | 9m 55s |
| claude-sonnet-4-6 | 1 | 174 | 8.1k | 929.2k | 48.3k | 3m 47s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
## Summary

Create a new `validate-review-decisions` skill at
`skills_extended/validate-review-decisions/SKILL.md` that adds mandatory
intent analysis and seven evidence-gathering rules to the validation
workflow for review-decisions audit reports. Update `full-audit.yaml` to
route review-decisions validation through this new skill instead of the
generic `validate-audit`. Add contract tests and update all test
manifests.

The skill follows the architecture established by `validate-test-audit`
(domain-specific semantic rules + intent analysis) while preserving full
output compatibility with `validate-audit` (same directory, naming
convention, `validated: true` sentinel, `AUTOSKILLIT_AUDIT_RUN_DIR`
support).

## Requirements

### REQ-VRD-1: Skill structure
The skill MUST be placed at
`skills_extended/validate-review-decisions/SKILL.md` with `categories:
[audit]` frontmatter.

### REQ-VRD-2: Intent analysis as mandatory step
Code validation subagents MUST perform intent analysis (docstring check,
git provenance, test coverage, contract analysis, architectural
constraint check, behavioral simulation) before assigning a verdict to
ANY finding. This is not optional — every finding must have an intent
analysis section in the subagent's reasoning.

### REQ-VRD-3: Evidence-gathering rules in subagent instructions
The skill MUST include the seven evidence-gathering rules in the code
validation subagent prompt. Rules MUST be generalizable (no references
to specific finding IDs).

### REQ-VRD-4: Output compatibility
Output files MUST use the same directory, naming convention, and format
as `validate-audit`.

### REQ-VRD-5: Standalone invocability
The skill MUST accept an `{audit_report_path}` argument and
auto-discover the most recent review-decisions audit report when
omitted.

### REQ-VRD-6: Full-audit recipe routing
Update `full-audit.yaml` to dispatch review-decisions audit validation
to `validate-review-decisions` and all other audit types to
`validate-audit` (or `validate-test-audit` for tests).
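
The routing decision can be pictured as a lookup with a default. This is only a Python illustration of the rule; the actual routing lives in `full-audit.yaml`, and the audit-type keys and the `validator_for` helper are assumptions.

```python
_DEFAULT_VALIDATOR = "validate-audit"

# Hypothetical audit-type keys; the skill names come from REQ-VRD-6.
_VALIDATOR_BY_AUDIT_TYPE = {
    "review-decisions": "validate-review-decisions",
    "tests": "validate-test-audit",
}


def validator_for(audit_type: str) -> str:
    """Pick the validation skill for an audit type, falling back to generic."""
    return _VALIDATOR_BY_AUDIT_TYPE.get(audit_type, _DEFAULT_VALIDATOR)
```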

### REQ-VRD-7: No pack changes
The skill MUST use the existing `audit` pack.

### REQ-VRD-8: Consider generalizing intent analysis to validate-audit (follow-up)
After `validate-review-decisions` and `validate-test-audit` both
establish intent analysis patterns, evaluate merging the common intent
analysis rules back into the generic `validate-audit` skill. This is a
follow-up, not a blocker.

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260508-010014-405487/.autoskillit/temp/make-plan/validate_review_decisions_skill_plan_2026-05-08_010600.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-opus-4-6 | 1 | 2.6k | 14.9k | 1.8M | 93.1k | 105 | 79.9k | 9m 1s |
| verify | claude-opus-4-6 | 1 | 1.7k | 11.6k | 1.4M | 63.3k | 116 | 50.4k | 6m 22s |
| implement* | MiniMax-M2.7-highspeed | 1 | 3.6M | 22.5k | 1.9M | 28.7k | 180 | 16.2k | 12m 59s |
| fix | claude-sonnet-4-6 | 1 | 174 | 7.5k | 874.3k | 53.2k | 47 | 42.9k | 3m 33s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 102.6k | 4.2k | 261.0k | 34.0k | 24 | 27.4k | 1m 37s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 57.5k | 1.8k | 227.0k | 28.7k | 16 | 15.1k | 49s |
| **Total** | | | 3.7M | 62.5k | 6.4M | 93.1k | | 232.0k | 34m 23s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 766 | 2508.8 | 21.2 | 29.4 |
| fix | 2 | 437171.0 | 21471.5 | 3770.5 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **768** | 8375.0 | 302.1 | 81.3 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-opus-4-6 | 2 | 4.3k | 26.5k | 3.1M | 130.2k | 15m 24s |
| MiniMax-M2.7-highspeed | 3 | 3.7M | 28.5k | 2.4M | 58.8k | 15m 25s |
| claude-sonnet-4-6 | 1 | 174 | 7.5k | 874.3k | 42.9k | 3m 33s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…queued` State (#2225)

## Summary

Formalize the issue label lifecycle as a first-class discrete system by
introducing an `IssueLabelState` enum, a `LabelDef` registry with
per-state metadata (color, description, swap semantics), and a
transition table with validation — modeled after the fleet
`DispatchStatus` pattern. Add `queued` as the fifth lifecycle state.
Refactor `claim_issue`, `release_issue`, and `claim_and_resolve_issue`
to derive colors, descriptions, and swap-remove sets from the registry
instead of hardcoding them. Update `process-issues` Phase 0.5 to apply
`queued` (not `in-progress`) during upfront claiming, and add a `queued
→ in-progress` swap at recipe pickup.

## Requirements

### R1: Formalize Label Lifecycle as a Discrete System

Create an architectural component (enum + registry or Protocol-based
approach) where each **lifecycle label** (as distinct from
classification labels like `recipe:implementation`) is a first-class
entity that declares:

1. **Label name** — the GitHub label string (e.g., `"queued"`,
`"in-progress"`, `"staged"`, `"fail"`)
2. **Color** — hex color for `ensure_label` (currently hardcoded:
`fbca04` for in-progress, `0075ca` for staged, `d73a4a` for fail)
3. **Description** — human-readable description for `ensure_label`
4. **Swap semantics** — which labels to remove when entering this state
(e.g., entering `in-progress` removes `queued` and `fail`)
5. **Valid transitions** — which states this label can transition to
(modeled after `_ALLOWED_TRANSITIONS` in fleet)

When a new lifecycle label is added, the system must enforce that all of
these are defined. No hanging labels — every label in the lifecycle
participates in the tracking history.

### R2: Add `queued` Lifecycle State

Add a `queued` label with the following lifecycle position:

```
[unlabeled / triaged]
        │
        │ upfront claim (Phase 0.5)
        ▼
    [queued]          ← claimed by orchestrator, not yet processing
        │
        │ recipe session begins
        ▼
  [in-progress]      ← recipe actively executing
        │
        ├─ success → [staged]
        ├─ failure → [fail]
        └─ bare    → [unlabeled]
```

Transitions for `queued`:
- **Entry**: from unlabeled/fail (upfront claim swaps `fail` → `queued`)
- **Exit**: to `in-progress` (recipe pickup), or to unlabeled (fatal
cleanup release)

### R3: Integrate with `process-issues` Workflow

Update `process-issues/SKILL.md` Phase 0.5 and Step 3:

- **Phase 0.5 (upfront claiming)**: `claim_issue` should apply `queued`
instead of `in-progress`
- **Step 3b.1 (recipe pickup)**: Before loading the recipe, swap
`queued` → `in-progress` for the specific issue being processed
- **Fatal failure cleanup**: Release all `queued` issues (not just
`in-progress`) back to unlabeled
- **Recipe `claim_and_resolve` step**: When `upfront_claimed=true`, the
issue arrives with `queued` (not `in-progress`), so the reentry path
needs to handle the `queued` → `in-progress` swap

### R4: Integration Points

These components need updates:

| Component | File | Change |
|-----------|------|--------|
| `GitHubConfig` | `config/_config_dataclasses.py` | Add `queued_label` field (or migrate to registry) |
| `defaults.yaml` | `config/defaults.yaml` | Add `queued` to `allowed_labels` and label config |
| `claim_issue` | `server/tools/tools_issue_lifecycle.py` | Support claiming with `queued` label |
| `claim_and_resolve_issue` | `server/tools/tools_issue_composite.py` | Handle `queued` → `in-progress` transition |
| `release_issue` | `server/tools/tools_issue_lifecycle.py` | Remove `queued` in cleanup paths |
| `swap_labels` | `execution/github.py` | No change needed (generic) |
| `process-issues` | `skills_extended/process-issues/SKILL.md` | Phase 0.5 uses `queued`, Step 3 swaps to `in-progress` |
| `build-execution-map` | `skills_extended/build-execution-map/SKILL.md` | Query `queued` label alongside `in-progress` for conflict detection |

### R5: Classification vs. Lifecycle Label Distinction

The system should distinguish between:
- **Lifecycle labels** (`queued`, `in-progress`, `staged`, `fail`) —
managed by the state machine, subject to atomic swaps and transition
validation
- **Classification labels** (`recipe:implementation`,
`recipe:remediation`, `bug`, `enhancement`, `autoreported`) — applied by
triage/reporting, never removed by the pipeline, not part of the state
machine

This distinction should be encoded in the architecture, not just
documented.

## Implementation Plan

Plan files:
- `/home/talon/projects/autoskillit-runs/impl-20260508-023057-647423/.autoskillit/temp/make-plan/formalize_issue_label_lifecycle_plan_2026-05-08_023500_part_a.md`
- `/home/talon/projects/autoskillit-runs/impl-20260508-023057-647423/.autoskillit/temp/make-plan/formalize_issue_label_lifecycle_plan_2026-05-08_023500_part_b.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 80 | 32.2k | 840.0k | 81.9k | 100 | 82.2k | 18m 26s |
| verify | claude-sonnet-4-6 | 2 | 1.9k | 27.0k | 1.9M | 77.8k | 210 | 108.7k | 15m 53s |
| implement* | MiniMax-M2.7-highspeed | 2 | 5.8M | 44.2k | 4.9M | 87.5k | 374 | 272.0k | 25m 15s |
| fix | claude-sonnet-4-6 | 1 | 394 | 21.5k | 3.3M | 98.0k | 126 | 98.3k | 17m 13s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 91.2k | 8.3k | 206.6k | 34.2k | 25 | 60.5k | 2m 7s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 44.8k | 2.3k | 169.6k | 28.7k | 14 | 15.1k | 55s |
| **Total** | | | 6.0M | 135.5k | 11.2M | 98.0k | | 636.9k | 1h 19m |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 604 | 8054.5 | 450.4 | 73.2 |
| fix | 52 | 62593.5 | 1891.2 | 413.6 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **656** | 17128.3 | 970.9 | 206.5 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 3 | 2.3k | 80.7k | 6.0M | 289.2k | 51m 34s |
| MiniMax-M2.7-highspeed | 3 | 6.0M | 54.8k | 5.2M | 347.7k | 28m 17s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@Trecek Trecek merged commit 45c1ba6 into main May 8, 2026
2 checks passed
@Trecek Trecek deleted the develop branch May 8, 2026 15:32
@Trecek Trecek restored the develop branch May 8, 2026 15:35