Various fixes: integration rollup into main by Trecek · Pull Request #366 · TalonT-Org/AutoSkillit

Trecek · 2026-03-12T22:27:50Z

Summary

Branch protection: Protected-branch validation for merge and push operations with hooks and structural tests
Pipeline observability: Canonical TelemetryFormatter, quota events, wall-clock timing, drift fix
PR pipeline gates: Mergeability gate, review cycle, fidelity checks, CI gating, review-first enforcement
Release CI: Version bump automation, branch sync, force-push integration back-sync, release workflows
Skill hardening: Anti-prose guards for loop constructs, loop-boundary detector, skill compliance tests, end-turn hazard documentation
Recipe validation: Unknown-skill-command semantic rule, arch import violation fixes
PostToolUse hook: Pretty output formatter with exception boundary and data-loss bug fixes
Dry walkthrough: Test command genericization, Step 4.5 historical regression check
prepare-issue: Duplicate detection and broader triggers
Display output: Terminal targets consolidation (Part A)
Pre-release stability: Test failures, arch exemptions, init idempotency, CLAUDE.md corrections
Documentation: Release docs sprint, getting-started, CLI reference, architecture, configuration, installation guides
readOnlyHint: Added to all MCP tools for parallel execution support
review-pr skill: Restored bundled skill and reverted erroneous deletions

Test plan

CI passes (Preflight checks + Tests on ubuntu-latest)
No regressions in existing test suite
New tests for branch protection, telemetry formatter, pretty output, release workflows, skill compliance all pass

🤖 Generated with Claude Code

Add is_protected_branch() guard in core/branch_guard.py, enforced in perform_merge() and push_to_remote() to prevent accidental merges or pushes to main, integration, or stable. Configurable via SafetyConfig.protected_branches with defaults.yaml defaults. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add branch_protection_guard.py PreToolUse hook for merge_worktree and push_to_remote - Register branch_protection_guard.py and headless_orchestration_guard.py in HOOK_REGISTRY - Add structural test: every hook script in hooks/ must be in HOOK_REGISTRY - Add structural test: every destructive tool must have PreToolUse hook - Add cwd assertions to 3 run_skill tests (MockSubprocessRunner[N][1]) - Fix BranchingConfig.default_base_branch: 'integration' -> 'main' (match defaults.yaml) - Add test verifying Python default matches defaults.yaml Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add branch_protection_guard.py to _PRINT_EXEMPT (standalone hooks use print for JSON) - Document branch_protection_guard.py in CLAUDE.md Architecture section - Fix test_init_idempotent_no_duplicates: check for duplicate matchers instead of counting matchers containing "run_skill" (which now legitimately appears in both the run_skill matcher and the headless_orchestration_guard matcher) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…splay Eliminates the dual-formatter anti-pattern where two independent formatting implementations (server-side bullet lists and hook-side Markdown-KV) produced incompatible outputs for the same telemetry data. - Create TelemetryFormatter in pipeline layer with format_token_table(), format_timing_table(), and format_compact_kv() methods - Add format parameter to get_token_summary and get_timing_summary MCP tools (format="json" default, format="table" returns pre-formatted markdown) - Rewrite write_telemetry_files to use TelemetryFormatter and merge wall_clock_seconds (previously bypassed, root cause of field never reaching file output) - Delete _format_token_summary() and _format_timing_summary() server-side formatters - Update PostToolUse hook to prefer wall_clock_seconds, add dedicated _fmt_get_timing_summary formatter, handle pre-formatted responses - Simplify recipe TOKEN SUMMARY instructions from 5-step manual chain to single get_token_summary(format=table) call - Add telemetry-before-open-pr semantic rule (WARNING) - Fix stale open-pr skill_contracts.yaml entry - Add comprehensive tests: TelemetryFormatter unit tests, format parameter tests, output-equivalence test (hook ≡ canonical), format-structural assertions replacing weak presence-only checks, semantic rule tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds a new semantic rule that validates skill_command references in run_skill steps resolve to actual bundled skills on disk. Mirrors the existing unknown-tool rule pattern. Prevents silent recipe breakage when skills are deleted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Import SkillResolver from workspace __init__ (not submodule) per REQ-ARCH-001 - Detect dynamic template expressions in skill name portion only, not in arguments (e.g. /autoskillit:audit-${{...}} is dynamic, /autoskillit:review-pr ${{...}} is not) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

All 6 failing tests correctly detect the same issue: bundled recipes reference /autoskillit:review-pr which was erroneously deleted. Filters are tagged TODO(Part-B) for removal once review-pr skill is restored. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extends the text-then-tool compliance suite with _check_loop_boundary() that detects "For each" loops containing tool invocations without an anti-prose guard. Adds 3 detector unit test fixtures and wires the new check into the project-wide test_no_text_then_tool_in_any_step scan. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds CRITICAL guard instructions to "For each" loops containing tool invocations in open-pr, create-review-pr, process-issues, and setup-project. Prevents stochastic end_turn windows at loop iteration boundaries where the model would otherwise emit progress prose. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Explains why text output between tool calls kills headless sessions, the two anti-pattern classes (text-then-tool, loop-boundary), how to reproduce the issue, why recipes are immune, and what upstream fixes would make the guards unnecessary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Commit 5baa6b9 erroneously deleted the review-pr skill and made cascading changes to tests and recipes. This restores the skill, its tests, and reverts the degradation logic — making the codebase consistent so the unknown-skill-command rule passes for all bundled recipes. - Restore review-pr/SKILL.md and its two test files from main - Revert audit-and-fix.yaml: remove check_review_pr_available gate, use /autoskillit:review-pr, restore on_failure → resolve_review - Remove all 6 TODO(Part-B) suppression filters and xfail markers - Update BUNDLED_SKILLS list, write-recipe skill list, CLAUDE.md (52→53) - Replace pytest.skip with pytest.fail in review_pr_text fixture - Genericize code-index path example in review-pr SKILL.md (REQ-GEN-004) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…, Headless Isolation (#404) ## Summary Integration rollup of **43 PRs** (#293–#406) consolidating **62 commits** across **291 files** (+27,909 / −6,040 lines). This release advances AutoSkillit from v0.2.0 to v0.3.1 with GitHub merge queue integration, sub-recipe composition, a PostToolUse output reformatter, headless session isolation guards, and comprehensive pipeline observability — plus 24 new bundled skills, 3 new MCP tools, and 47 new test files. --- ## Major Features ### GitHub Merge Queue Integration (#370, #362, #390) - New `wait_for_merge_queue` MCP tool — polls a PR through GitHub's merge queue until merged, ejected, or timed out (default 600s). Uses REST + GraphQL APIs with stuck-queue detection and auto-merge re-enrollment - New `DefaultMergeQueueWatcher` L1 service (`execution/merge_queue.py`) — never raises; all outcomes are structured results - `parse_merge_queue_response()` pure function for GraphQL queue entry parsing - New `auto_merge` ingredient in `implementation.yaml` and `remediation.yaml` — enrolls PRs in the merge queue after CI passes - Full queue-mode path added to `merge-prs.yaml`: detect queue → enqueue → wait → handle ejections → re-enter - `analyze-prs` skill gains Step 0.5 (merge queue detection) and Step 1.5 (CI/review eligibility filtering) ### Sub-Recipe Composition (#380) - Recipe steps can now reference sub-recipes via `sub_recipe` + `gate` fields — lazy-loaded and merged at validation time - Composition engine in `recipe/_api.py`: `_merge_sub_recipe()` inlines sub-recipe steps with safe name-prefixing and route remapping (`done` → parent's `on_success`, `escalate` → parent's `on_failure`) - `_build_active_recipe()` evaluates gate ingredients against overrides/defaults; dual validation runs on both active and combined recipes - First sub-recipe: `sprint-prefix.yaml` — triage → plan → confirm → dispatch workflow, gated by `sprint_mode` ingredient (hidden, default false) - Both `implementation.yaml` and `remediation.yaml` gain `sprint_entry` placeholder step - New semantic rules: `unknown-sub-recipe` (ERROR), `circular-sub-recipe` (ERROR) with DFS cycle detection ### PostToolUse Output Reformatter (#293, #405) - `pretty_output.py` — new 671-line PostToolUse hook that rewrites raw MCP JSON responses to Markdown-KV before Claude consumes them (30–77% token overhead reduction) - Dedicated formatters for 11 high-traffic tools (`run_skill`, `run_cmd`, `test_check`, `merge_worktree`, `get_token_summary`, etc.) plus a generic KV formatter for remaining tools - Pipeline vs. interactive mode detection via hook config file - Unwraps Claude Code's `{"result": "<json-string>"}` envelope before dispatching - 1,516-line test file with 40+ behavioral tests ### Headless Session Isolation (#359, #393, #397, #405, #406) - **Env isolation**: `build_sanitized_env()` strips `AUTOSKILLIT_PRIVATE_ENV_VARS` from subprocess environments, preventing `AUTOSKILLIT_HEADLESS=1` from leaking into test runners - **CWD path contamination defense**: `_inject_cwd_anchor()` anchors all relative paths to session CWD; `_validate_output_paths()` checks structured output tokens against CWD prefix; `_scan_jsonl_write_paths()` post-session scanner catches actual Write/Edit/Bash tool calls outside CWD - **Headless orchestration guard**: new PreToolUse hook blocks `run_skill`/`run_cmd`/`run_python` when `AUTOSKILLIT_HEADLESS=1`, enforcing Tier 1/Tier 2 nesting invariant - **`_require_not_headless()` server-side guard**: blocks 10 orchestration-only tools from headless sessions at the handler layer - **Unified error response contract**: `headless_error_result()` produces consistent 9-field responses; `_build_headless_error_response()` canonical builder for all failure paths in `tools_integrations.py` ### Cook UX Overhaul (#375, #363) - `open_kitchen` now accepts optional `name` + `overrides` — opens kitchen AND loads recipe in a single call - Pre-launch terminal preview with ANSI-colored flow diagram and ingredients table via new `cli/_ansi.py` module - `--dangerously-skip-permissions` warning banner with interactive confirmation prompt - Randomized session greetings from themed pools - Orchestrator prompt rewritten: recipe YAML no longer injected via `--append-system-prompt`; session calls `open_kitchen('{recipe_name}')` as first action - Conversational ingredient collection replaces mechanical per-field prompting --- ## New MCP Tools | Tool | Gate | Description | |------|------|-------------| | `wait_for_merge_queue` | Kitchen | Polls PR through GitHub merge queue (REST + GraphQL) | | `set_commit_status` | Kitchen | Posts GitHub Commit Status to a SHA for review-first gating | | `get_quota_events` | Ungated | Surfaces quota guard decisions from `quota_events.jsonl` | --- ## Pipeline Observability (#318, #341) - **`TelemetryFormatter`** (`pipeline/telemetry_fmt.py`) — single source of truth for all telemetry rendering; replaces dual-formatter anti-pattern. Four rendering modes: Markdown table, terminal table, compact KV (for PostToolUse hook) - `get_token_summary` and `get_timing_summary` gain `format` parameter (`"json"` | `"table"`) - `wall_clock_seconds` merged into token summary output — see duration alongside token counts in one call - **Telemetry clear marker**: `write_telemetry_clear_marker()` / `read_telemetry_clear_marker()` prevent token accounting drift on MCP server restart after `clear=True` - **Quota event logging**: `quota_check.py` hook now writes structured JSONL events (`cache_miss`, `parse_error`, `blocked`, `approved`) to `quota_events.jsonl` --- ## CI Watcher & Remote Resolution Fixes (#395, #406) - **`CIRunScope` value object** — carries `workflow` + `head_sha` scope; replaces bare `head_sha` parameter across all CI watcher signatures - **Workflow filter**: `wait_for_ci` and `get_ci_status` accept `workflow` parameter (falls back to project-level `config.ci.workflow`), preventing unrelated workflows (version bumps, labelers) from satisfying CI checks - **`FAILED_CONCLUSIONS` expanded**: `failure` → `{failure, timed_out, startup_failure, cancelled}` - **Canonical remote resolver** (`execution/remote_resolver.py`): `resolve_remote_repo()` with `REMOTE_PRECEDENCE = (upstream, origin)` — correctly resolves `owner/repo` after `clone_repo` sets `origin` to `file://` isolation URL - **Clone isolation fix**: `clone_repo` now always clones from remote URL (never local path); sets `origin=file:///<clone>` for isolation and `upstream=<real_url>` for push/CI operations --- ## PR Pipeline Gates (#317, #343) - **`pipeline/pr_gates.py`**: `is_ci_passing()`, `is_review_passing()`, `partition_prs()` — partitions PRs into eligible/CI-blocked/review-blocked with human-readable reasons - **`pipeline/fidelity.py`**: `extract_linked_issues()` (Closes/Fixes/Resolves patterns), `is_valid_fidelity_finding()` schema validation - **`check_pr_mergeable`** now returns `mergeable_status` field alongside boolean - **`release_issue`** gains `target_branch` + `staged_label` parameters for staged issue lifecycle on non-default branches (#392) --- ## Recipe System Changes ### Structural - `RecipeIngredient.hidden` field — excluded from ingredients table (used for internal flags like `sprint_mode`) - `Recipe.experimental` flag parsed from YAML - `_TERMINAL_TARGETS` moved to `schema.py` as single source of truth - `format_ingredients_table()` with sorted display order (required → auto-detect → flags → optional → constants) - Diagram rendering engine (~670 lines) removed from `diagrams.py` — rendering now handled by `/render-recipe` skill; format version bumped to v7 ### Recipe YAML Changes - **Deleted**: `audit-and-fix.yaml`, `batch-implementation.yaml`, `bugfix-loop.yaml` - **Renamed**: `pr-merge-pipeline.yaml` → `merge-prs.yaml` - **`implementation.yaml`**: merge queue steps, `auto_merge`/`sprint_mode` ingredients, `base_branch` default → `""` (auto-detect), CI workflow filter, `extract_pr_number` step - **`remediation.yaml`**: `topic` → `task` rename, merge queue steps, `dry_walkthrough` retries:3 with forward-only routing, `verify` → `test` rename - **`merge-prs.yaml`**: full queue-mode path, `open-integration-pr` step (replaces `create-review-pr`), post-PR mergeability polling, review cycle with `resolve-review` retries ### New Semantic Rules - `missing-output-patterns` (WARNING) — flags `run_skill` steps without `expected_output_patterns` - `unknown-sub-recipe` (ERROR) — validates sub-recipe references exist - `circular-sub-recipe` (ERROR) — DFS cycle detection - `unknown-skill-command` (ERROR) — validates skill names against bundled set - `telemetry-before-open-pr` (WARNING) — ensures telemetry step precedes `open-pr` --- ## New Skills (24) ### Architecture Lens Family (13) `arch-lens-c4-container`, `arch-lens-concurrency`, `arch-lens-data-lineage`, `arch-lens-deployment`, `arch-lens-development`, `arch-lens-error-resilience`, `arch-lens-module-dependency`, `arch-lens-operational`, `arch-lens-process-flow`, `arch-lens-repository-access`, `arch-lens-scenarios`, `arch-lens-security`, `arch-lens-state-lifecycle` ### Audit Family (5) `audit-arch`, `audit-bugs`, `audit-cohesion`, `audit-defense-standards`, `audit-tests` ### Planning & Diagramming (3) `elaborate-phase`, `make-arch-diag`, `make-req` ### Bug/Guard Lifecycle (2) `design-guards`, `verify-diag` ### Pipeline (1) `open-integration-pr` — creates integration PRs with per-PR details, arch-lens diagrams, carried-forward `Closes #N` references, and auto-closes collapsed PRs ### Sprint Planning (1 — gated by sub-recipe) `sprint-planner` — selects a focused, conflict-free sprint from a triage manifest --- ## Skill Modifications (Highlights) - **`analyze-prs`**: merge queue detection, CI/review eligibility filtering, queue-mode ordering - **`dry-walkthrough`**: Step 4.5 Historical Regression Check (git history mining + GitHub issue cross-reference) - **`review-pr`**: deterministic diff annotation via `diff_annotator.py`, echo-primary-obligation step, post-completion confirmation, degraded-mode narration - **`collapse-issues`**: content fidelity enforcement — per-issue `fetch_github_issue` calls, copy-mode body assembly (#388) - **`prepare-issue`**: multi-keyword dedup search, numbered candidate selection, extend-existing-issue flow - **`resolve-review`**: GraphQL thread auto-resolution after addressing findings (#379) - **`resolve-merge-conflicts`**: conflict resolution decision report with per-file log (#389) - **Cross-skill**: output tokens migrated to `key = value` format; code-index paths made generic with fallback notes; arch-lens references fully qualified; anti-prose guards at loop boundaries --- ## CLI & Hooks ### New CLI Commands - `autoskillit install` — plugin installation + cache refresh - `autoskillit upgrade` — `.autoskillit/scripts/` → `.autoskillit/recipes/` migration ### CLI Changes - `doctor`: plugin-aware MCP check, PostToolUse hook scanning, `--fix` flag removed - `init`: GitHub repo prompt, `.secrets.yaml` template, plugin-aware registration - `chefs-hat`: pre-launch banner, `--dangerously-skip-permissions` confirmation - `recipes render`: repurposed from generator to viewer (delegates to `/render-recipe`) - `serve`: server import deferred to after `configure_logging()` to prevent stdout corruption ### New Hooks - `branch_protection_guard.py` (PreToolUse) — denies `merge_worktree`/`push_to_remote` targeting protected branches - `headless_orchestration_guard.py` (PreToolUse) — blocks orchestration tools in headless sessions - `pretty_output.py` (PostToolUse) — MCP JSON → Markdown-KV reformatter ### Hook Infrastructure - `HookDef.event_type` field — registry now handles both PreToolUse and PostToolUse - `generate_hooks_json()` groups entries by event type - `_evict_stale_autoskillit_hooks` and `sync_hooks_to_settings` made event-type-agnostic --- ## Core & Config ### New Core Modules - `core/branch_guard.py` — `is_protected_branch()` pure function - `core/github_url.py` — `parse_github_repo()` + `normalize_owner_repo()` canonical parsers ### Core Type Expansions - `AUTOSKILLIT_PRIVATE_ENV_VARS` frozenset - `WORKER_TOOLS` / `HEADLESS_BLOCKED_UNGATED_TOOLS` split from `UNGATED_TOOLS` - `TOOL_CATEGORIES` — categorized listing for `open_kitchen` response - `CIRunScope` — immutable scope for CI watcher calls - `MergeQueueWatcher` protocol - `SkillResult.cli_subtype` + `write_path_warnings` fields - `SubprocessRunner.env` parameter ### Config - `safety.protected_branches`: `[main, integration, stable]` - `github.staged_label`: `"staged"` - `ci.workflow`: workflow filename filter (e.g., `"tests.yml"`) - `branching.default_base_branch`: `"integration"` → `"main"` - `ModelConfig.default`: `str | None` → `str = "sonnet"` --- ## Infrastructure & Release ### Version - `0.2.0` → `0.3.1` across `pyproject.toml`, `plugin.json`, `uv.lock` - FastMCP dependency: `>=3.0.2` → `>=3.1.1,<4.0` (#399) ### CI/CD Workflows - **`version-bump.yml`** (new) — auto patch-bumps `main` on integration PR merge, force-syncs integration branch one patch ahead - **`release.yml`** (new) — minor version bump + GitHub Release on merge to `stable` - **`codeql.yml`** (new) — CodeQL analysis for `stable` PRs (Python + Actions) - **`tests.yml`** — `merge_group:` trigger added; multi-OS now only for `stable` ### PyPI Readiness - `pyproject.toml`: `readme`, `license`, `authors`, `keywords`, `classifiers`, `project.urls`, `hatch.build.targets.sdist` inclusion list ### readOnlyHint Parallel Execution Fix - All MCP tools annotated `readOnlyHint=True` — enables Claude Code parallel tool execution (~7x speedup). One deliberate exception: `wait_for_merge_queue` uses `readOnlyHint=False` (actually mutates queue state) ### Tool Response Exception Boundary - `track_response_size` decorator catches unhandled exceptions and serializes them as `{"success": false, "subtype": "tool_exception"}` — prevents FastMCP opaque error wrapping ### SkillResult Subtype Normalization (#358) - `_normalize_subtype()` gate eliminates dual-source contradiction between CLI subtype and session outcome - Class 2 upward: `SUCCEEDED + error_subtype → "success"` (drain-race artifact) - Class 1 downward: `non-SUCCEEDED + "success" → "empty_result"` / `"missing_completion_marker"` / `"adjudicated_failure"` --- ## Test Coverage **47 new test files** (+12,703 lines) covering: | Area | Key Tests | |------|-----------| | Merge queue watcher state machine | `test_merge_queue.py` (226 lines) | | Clone isolation × CI resolution | `test_clone_ci_contract.py`, `test_remote_resolver.py` | | PostToolUse hook | `test_pretty_output.py` (1,516 lines, 40+ cases) | | Branch protection + headless guards | `test_branch_protection_guard.py`, `test_headless_orchestration_guard.py` | | Sub-recipe composition | 5 test files (schema, loading, validation, sprint mode × 2) | | Telemetry formatter | `test_telemetry_formatter.py` (281 lines) | | PR pipeline gates | `test_analyze_prs_gates.py`, `test_review_pr_fidelity.py` | | Diff annotator | `test_diff_annotator.py` (242 lines) | | Skill compliance | Output token format, genericization, loop-boundary guards | | Release workflows | Structural contracts for `version-bump.yml`, `release.yml` | | Issue content fidelity | Body-assembling skills must call `fetch_github_issue` per-issue | | CI watcher scope | `test_ci_params.py` — workflow_id query param composition | --- ## Consolidated PRs #293, #295, #314, #315, #316, #317, #318, #319, #323, #332, #336, #337, #338, #339, #341, #343, #351, #358, #359, #360, #361, #362, #363, #366, #368, #370, #375, #377, #378, #379, #380, #388, #389, #390, #391, #392, #393, #395, #396, #397, #399, #405, #406 --- 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Trecek and others added 11 commits March 12, 2026 08:10

Trecek changed the base branch from main to integration March 12, 2026 22:33

Trecek added this pull request to the merge queue Mar 12, 2026

Trecek removed this pull request from the merge queue due to a manual request Mar 12, 2026

Trecek added this pull request to the merge queue Mar 12, 2026

Merged via the queue into integration with commit 302cea8 Mar 12, 2026
2 checks passed

Trecek deleted the various_fixes branch March 12, 2026 22:59

Trecek mentioned this pull request Mar 15, 2026

Integration v0.3.1: Merge Queue, Sub-Recipes, PostToolUse Reformatter, Headless Isolation #404

Merged

Trecek mentioned this pull request Mar 19, 2026

Promote integration → main (56 PRs, 46 issues) #438

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Various fixes: integration rollup into main#366

Various fixes: integration rollup into main#366
Trecek merged 11 commits intointegrationfrom
various_fixes

Trecek commented Mar 12, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant