[FIX] Remove release_issue_success from implementation and remediation recipes#361

Merged
Trecek merged 9 commits into integration from
bug-release-issue-success-removes-in-progress-label-when-pr/335
Mar 12, 2026

Conversation

@Trecek Trecek commented Mar 12, 2026

Summary

Both implementation.yaml and remediation.yaml call release_issue on the success path
(release_issue_success step), which removes the in-progress label from the GitHub issue
after CI passes and the PR is opened. This is incorrect — the in-progress label should stay on
the issue while there is an open PR. Only the failure path should release the label.

The fix removes the release_issue_success step from both recipes and routes ci_watch.on_success
directly to confirm_cleanup. The release_issue_failure step and all failure-path routing
remain unchanged.

Requirements

RECIPE

  • REQ-RECIPE-001: The implementation.yaml recipe must NOT call release_issue on the success path when a PR has been opened.
  • REQ-RECIPE-002: The remediation.yaml recipe must NOT call release_issue on the success path when a PR has been opened.
  • REQ-RECIPE-003: The ci_watch step on_success must route directly to confirm_cleanup (bypassing release_issue_success) in both recipes.
  • REQ-RECIPE-004: The release_issue_failure step must remain unchanged — the in-progress label is still released on pipeline failure.
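
The four requirements can be read as a reachability property over the step graph. The following is a minimal sketch, assuming each step is a mapping with `on_success` / `on_failure` keys. The `STEPS` dict is a hand-written stand-in that mirrors the process flow in this PR, not the contents of implementation.yaml, and `reachable` is a hypothetical helper.

```python
# Hand-written stand-in for the routing described in this PR (not the real recipe file).
# Assumption: resolve_ci routes to re_push on both outcomes.
STEPS = {
    "ci_watch": {"on_success": "confirm_cleanup", "on_failure": "diagnose_ci"},
    "confirm_cleanup": {"on_success": "delete_clone", "on_failure": "done"},
    "diagnose_ci": {"on_success": "resolve_ci", "on_failure": "resolve_ci"},
    "resolve_ci": {"on_success": "re_push", "on_failure": "re_push"},
    "re_push": {"on_success": "ci_watch", "on_failure": "release_issue_failure"},
    "release_issue_failure": {"on_success": "cleanup_failure", "on_failure": "cleanup_failure"},
    "cleanup_failure": {"on_success": "escalate_stop", "on_failure": "escalate_stop"},
}


def reachable(steps, start):
    """Return every step or terminal name reachable from `start` via any route."""
    seen, stack = set(), [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        step = steps.get(name, {})  # terminals have no outgoing routes
        for target in (step.get("on_success"), step.get("on_failure")):
            if target:
                stack.append(target)
    return seen


# REQ-RECIPE-001/002: the success-path release step does not exist.
success_step_absent = "release_issue_success" not in STEPS
# REQ-RECIPE-003: ci_watch hands off directly to confirm_cleanup.
direct_route = STEPS["ci_watch"]["on_success"] == "confirm_cleanup"
# Nothing reachable after a green CI run touches a release_issue step.
success_side = reachable(STEPS, "confirm_cleanup")
label_kept = not any(name.startswith("release_issue") for name in success_side)
# REQ-RECIPE-004: the failure path still reaches release_issue_failure.
failure_releases = "release_issue_failure" in reachable(STEPS, "diagnose_ci")
```

Checking reachability rather than a single `on_success` value also catches a release step reintroduced further down the success path.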

Architecture Impact

Process Flow Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    %% TERMINALS %%
    UPSTREAM(["upstream steps<br/>(review_pr / re_push / re_push_review)"])
    DONE(["done<br/>(clone preserved)"])
    DELETE_CLONE(["delete_clone<br/>(clone removed)"])
    ESCALATE(["escalate_stop"])

    %% CI WATCH — MODIFIED %%
    CW["● ci_watch<br/>━━━━━━━━━━<br/>wait_for_ci<br/>branch: merge_target<br/>timeout: 300s<br/>skip when open_pr=false"]

    subgraph SuccessPath ["SUCCESS PATH (label stays on issue — PR still open)"]
        direction TB
        CC["confirm_cleanup<br/>━━━━━━━━━━<br/>action: confirm<br/>Delete clone now?"]
    end

    subgraph FailureDiagnosis ["FAILURE PATH — CI diagnosis & retry loop"]
        direction TB
        DC["diagnose_ci<br/>━━━━━━━━━━<br/>run_skill: diagnose-ci<br/>writes diagnosis_path"]
        RC["resolve_ci<br/>━━━━━━━━━━<br/>run_skill: resolve-failures<br/>fixes CI issues"]
        RP["re_push<br/>━━━━━━━━━━<br/>push_to_remote<br/>triggers new CI run"]
    end

    subgraph FailureCleanup ["FAILURE CLEANUP — label released only on failure"]
        direction TB
        RIF["release_issue_failure<br/>━━━━━━━━━━<br/>release_issue tool<br/>skip when no issue_url<br/>removes in-progress label"]
        CLF["cleanup_failure<br/>━━━━━━━━━━<br/>remove_clone keep=true<br/>preserves clone on disk"]
    end

    %% FLOW %%
    UPSTREAM --> CW
    CW -->|"on_success"| CC
    CW -->|"on_failure"| DC

    CC -->|"yes (on_success)"| DELETE_CLONE
    CC -->|"no (on_failure)"| DONE

    DC -->|"on_success / on_failure"| RC
    RC --> RP
    RP -->|"on_success"| CW

    RP -->|"on_failure / max retries"| RIF
    RIF -->|"on_success / on_failure"| CLF
    CLF -->|"on_success / on_failure"| ESCALATE

    %% CLASS ASSIGNMENTS %%
    class UPSTREAM,DONE,DELETE_CLONE,ESCALATE terminal;
    class CW handler;
    class CC phase;
    class DC,RC,RP handler;
    class RIF,CLF detector;

Color Legend:

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Terminal | Start/end states (upstream, done, escalate) |
| Orange | Handler | Recipe tool-call steps (ci_watch, diagnose_ci, resolve_ci, re_push) |
| Purple | Phase | Control/confirmation steps (confirm_cleanup) |
| Red | Detector | Failure guards and cleanup steps (release_issue_failure, cleanup_failure) |

Modification Key: ● = Modified by this PR

Closes #335

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/fix-release-label-20260311-214045-907910/temp/make-plan/fix_release_label_plan_2026-03-11_214500.md

Token Usage Summary

Token Summary

fix

  • input_tokens: 617
  • output_tokens: 241470
  • cache_creation_input_tokens: 802321
  • cache_read_input_tokens: 45488590
  • invocation_count: 11
  • elapsed_seconds: 5789.439858307993

resolve_review

  • input_tokens: 391
  • output_tokens: 168303
  • cache_creation_input_tokens: 449347
  • cache_read_input_tokens: 21658363
  • invocation_count: 7
  • elapsed_seconds: 4158.092358417003

audit_impl

  • input_tokens: 2744
  • output_tokens: 159180
  • cache_creation_input_tokens: 545076
  • cache_read_input_tokens: 4174746
  • invocation_count: 12
  • elapsed_seconds: 4455.282969588996

open_pr

  • input_tokens: 213
  • output_tokens: 137769
  • cache_creation_input_tokens: 422276
  • cache_read_input_tokens: 6866150
  • invocation_count: 8
  • elapsed_seconds: 3639.0721166939948

review_pr

  • input_tokens: 177
  • output_tokens: 244448
  • cache_creation_input_tokens: 479046
  • cache_read_input_tokens: 6204544
  • invocation_count: 7
  • elapsed_seconds: 4626.790592216992

plan

  • input_tokens: 1665
  • output_tokens: 187361
  • cache_creation_input_tokens: 655841
  • cache_read_input_tokens: 11017819
  • invocation_count: 8
  • elapsed_seconds: 3779.893521053007

verify

  • input_tokens: 8274
  • output_tokens: 151282
  • cache_creation_input_tokens: 664012
  • cache_read_input_tokens: 12450916
  • invocation_count: 10
  • elapsed_seconds: 3138.836677256011

implement

  • input_tokens: 785
  • output_tokens: 269097
  • cache_creation_input_tokens: 876108
  • cache_read_input_tokens: 61001114
  • invocation_count: 10
  • elapsed_seconds: 6131.576118467008

analyze_prs

  • input_tokens: 15
  • output_tokens: 29915
  • cache_creation_input_tokens: 67472
  • cache_read_input_tokens: 374939
  • invocation_count: 1
  • elapsed_seconds: 524.8599999569997

merge_pr

  • input_tokens: 166
  • output_tokens: 56815
  • cache_creation_input_tokens: 314110
  • cache_read_input_tokens: 4466380
  • invocation_count: 9
  • elapsed_seconds: 1805.746022643012

create_review_pr

  • input_tokens: 27
  • output_tokens: 20902
  • cache_creation_input_tokens: 62646
  • cache_read_input_tokens: 856008
  • invocation_count: 1
  • elapsed_seconds: 444.64073076299974

resolve_merge_conflicts

  • input_tokens: 88
  • output_tokens: 52987
  • cache_creation_input_tokens: 122668
  • cache_read_input_tokens: 7419558
  • invocation_count: 1
  • elapsed_seconds: 975.0682389580033

Timing Summary

fix

  • total_seconds: 5789.520738078008
  • invocation_count: 11

resolve_review

  • total_seconds: 4158.092358417003
  • invocation_count: 7

audit_impl

  • total_seconds: 4455.326102384002
  • invocation_count: 12

open_pr

  • total_seconds: 3639.0721166939948
  • invocation_count: 8

review_pr

  • total_seconds: 4626.790592216992
  • invocation_count: 7

plan

  • total_seconds: 3780.017026652018
  • invocation_count: 8

verify

  • total_seconds: 3138.9388441649935
  • invocation_count: 10

implement

  • total_seconds: 6131.693511301022
  • invocation_count: 10

analyze_prs

  • total_seconds: 524.8599999569997
  • invocation_count: 1

merge_pr

  • total_seconds: 1805.746022643012
  • invocation_count: 9

create_review_pr

  • total_seconds: 444.64073076299974
  • invocation_count: 1

resolve_merge_conflicts

  • total_seconds: 975.0682389580033
  • invocation_count: 1

clone

  • total_seconds: 14.340457123005763
  • invocation_count: 3

update_ticket

  • total_seconds: 1.0304174909979338
  • invocation_count: 1

capture_base_sha

  • total_seconds: 0.00960076300543733
  • invocation_count: 3

create_branch

  • total_seconds: 1.9964898860052926
  • invocation_count: 3

push_merge_target

  • total_seconds: 2.7216036769968923
  • invocation_count: 3

test

  • total_seconds: 332.84049233200494
  • invocation_count: 5

merge

  • total_seconds: 395.920931871995
  • invocation_count: 3

push

  • total_seconds: 0.9490148440090707
  • invocation_count: 1

🤖 Generated with Claude Code via AutoSkillit

Trecek and others added 5 commits March 11, 2026 21:56
Both test_ip_ci_watch_routing and test_if_ci_watch_routing now assert
on_success == "confirm_cleanup" (not release_issue_success) and assert
that release_issue_success step is absent from the recipe.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rl_pipeline.py

- Add RECIPES_WITH_RELEASE_SUCCESS and RECIPES_WITHOUT_RELEASE_SUCCESS constants
- test_release_issue_steps_present: assert success step absent for implementation/remediation
- test_release_issue_success_routes_to_confirm_cleanup: narrow to recipes with the step
- test_ci_watch_routes_to_release_issue_success -> test_ci_watch_on_success_routing: split
  assertions by recipe set (confirm_cleanup vs release_issue_success)
- test_release_issue_success_with_args_contains_issue_url: narrow to recipes with the step

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ci_watch.on_success now routes directly to confirm_cleanup. The
release_issue_success step is removed entirely — the in-progress label
should remain on the issue while a PR is open. Only the failure path
(release_issue_failure) releases the label.

Updated ci_watch note to explain the routing rationale.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ci_watch.on_success now routes directly to confirm_cleanup. The
release_issue_success step is removed entirely — the in-progress label
should remain on the issue while a PR is open. Only the failure path
(release_issue_failure) releases the label.

Updated ci_watch note to explain the routing rationale.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Shorten docstrings and assertion messages to stay within 99-char limit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@Trecek Trecek left a comment

AutoSkillit PR Review — Verdict: changes_requested

@@ -352,16 +354,23 @@ def test_claim_issue_routes_to_create_branch_on_true(self):

[warning] tests: test_release_issue_steps_present loads YAML for overlapping recipe sets in three separate loops. Recipes in RECIPES_WITH_RELEASE_SUCCESS and RECIPES_WITHOUT_RELEASE_SUCCESS are also in RECIPES, so they get loaded twice. Cache parsed YAML per recipe name to avoid redundant I/O.
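
One way to act on this finding is to memoize the loader by recipe name. The sketch below assumes a hypothetical `load_recipe` helper. An in-memory dict stands in for reading and parsing the YAML files so that the example runs without PyYAML; the real tests would call `yaml.safe_load` inside the cached function.

```python
from functools import lru_cache

# Hypothetical in-memory stand-in for the recipe files on disk.
_FAKE_FILES = {
    "implementation": {"steps": {"ci_watch": {"on_success": "confirm_cleanup"}}},
    "remediation": {"steps": {"ci_watch": {"on_success": "confirm_cleanup"}}},
}
_read_count = {"n": 0}  # counts how often a "file" is actually parsed


@lru_cache(maxsize=None)
def load_recipe(name: str) -> dict:
    """Parse a recipe once per name; later calls return the cached object."""
    _read_count["n"] += 1
    return _FAKE_FILES[name]


# Three overlapping passes over the same recipes, as in the flagged test.
for _ in range(3):
    for recipe_name in ("implementation", "remediation"):
        load_recipe(recipe_name)

reads = _read_count["n"]  # one parse per distinct recipe name
```

Because the cached dict is shared between callers, tests using this pattern should treat the parsed recipe as read-only.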

@@ -376,8 +385,13 @@ def test_release_issue_failure_routes_to_cleanup_failure(self):
f"{name}: release_issue_failure.on_success should be cleanup_failure"

[warning] tests: test_ci_watch_on_success_routing iterates RECIPES_WITHOUT_RELEASE_SUCCESS and RECIPES_WITH_RELEASE_SUCCESS in two separate loops, each calling yaml.safe_load independently. Merge into one loop or share a fixture to eliminate redundant disk reads.

@Trecek Trecek left a comment

AutoSkillit review found 2 actionable issues (warning severity). See inline comments for details.

…_present and merge dual-loop in test_ci_watch_on_success_routing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@Trecek Trecek left a comment

AutoSkillit PR Review — Verdict: changes_requested

Skipped when open_pr=false — no remote feature branch is opened in that mode.
On success, the pipeline is done — cleanup_success removes the clone.
On success, routes directly to confirm_cleanup — the in-progress label is NOT
released here because an open PR still represents active work. The label is

[critical] bugs: When ci_watch succeeds, the pipeline routes directly to confirm_cleanup without releasing the in-progress label. The comment claims the label is only released on the failure path (release_issue_failure), meaning for a fully successful run (CI passes) in this recipe, the in-progress label is never removed. This leaves the GitHub issue permanently labeled as in-progress after a successful completion.

(conclusion + failed job names) for downstream resolve_ci.
Skipped when open_pr=false — no remote feature branch is opened in that mode.
On success, the pipeline is done — cleanup_success removes the clone.
On success, routes directly to confirm_cleanup — the in-progress label is NOT

[critical] bugs: Same as implementation.yaml: when ci_watch succeeds, no release_issue step is called before confirm_cleanup. The in-progress label is never released on the success path, leaving the issue permanently marked in-progress after a successful run.

"""Claim/release issue gate steps are present and correctly wired in all 4 recipes."""

RECIPES = ["implementation", "implementation-groups", "remediation", "audit-and-fix"]
RECIPES_WITH_RELEASE_SUCCESS = ["implementation-groups", "audit-and-fix"]

[warning] tests: RECIPES_WITH_RELEASE_SUCCESS and RECIPES_WITHOUT_RELEASE_SUCCESS are unguarded class constants with no test asserting they are exhaustive and disjoint relative to RECIPES. A recipe added to RECIPES but not to either split list silently avoids checks. Add: assert set(RECIPES_WITH_RELEASE_SUCCESS) | set(RECIPES_WITHOUT_RELEASE_SUCCESS) == set(RECIPES).
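
A guard along these lines could look like the sketch below. The three lists use the values from this PR; `split_problems` is a hypothetical helper that reports overlap and stray names in addition to the union check suggested above.

```python
RECIPES = ["implementation", "implementation-groups", "remediation", "audit-and-fix"]
RECIPES_WITH_RELEASE_SUCCESS = ["implementation-groups", "audit-and-fix"]
RECIPES_WITHOUT_RELEASE_SUCCESS = ["implementation", "remediation"]


def split_problems(all_names, with_step, without_step):
    """Return human-readable problems; an empty list means the split is sound."""
    problems = []
    overlap = set(with_step) & set(without_step)
    if overlap:
        problems.append(f"in both lists: {sorted(overlap)}")
    missing = set(all_names) - set(with_step) - set(without_step)
    if missing:
        problems.append(f"in neither list: {sorted(missing)}")
    extra = (set(with_step) | set(without_step)) - set(all_names)
    if extra:
        problems.append(f"not in RECIPES: {sorted(extra)}")
    return problems


current = split_problems(
    RECIPES, RECIPES_WITH_RELEASE_SUCCESS, RECIPES_WITHOUT_RELEASE_SUCCESS
)
# Simulate the drift the review warns about: a recipe added to RECIPES only.
drifted = split_problems(
    RECIPES + ["new-recipe"],
    RECIPES_WITH_RELEASE_SUCCESS,
    RECIPES_WITHOUT_RELEASE_SUCCESS,
)
```

Returning the list of problems instead of asserting inside the helper keeps the failure message specific about which recipe drifted.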

**{name: "confirm_cleanup" for name in self.RECIPES_WITHOUT_RELEASE_SUCCESS},
**{name: "release_issue_success" for name in self.RECIPES_WITH_RELEASE_SUCCESS},
}
for name, expected_route in expected.items():

[warning] tests: test_ci_watch_on_success_routing never asserts that the expected dict covers all RECIPES. If a recipe is added to RECIPES but omitted from both split lists, it is silently skipped. Add: assert set(expected) == set(self.RECIPES).
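
The merged loop with the suggested coverage assertion might be sketched as follows. It assumes parsed recipes expose `steps -> ci_watch -> on_success`; `build_expected_routes`, `routing_mismatches`, and the `PARSED` stand-in are hypothetical names used only for illustration.

```python
RECIPES = ["implementation", "implementation-groups", "remediation", "audit-and-fix"]
RECIPES_WITH_RELEASE_SUCCESS = ["implementation-groups", "audit-and-fix"]
RECIPES_WITHOUT_RELEASE_SUCCESS = ["implementation", "remediation"]


def build_expected_routes():
    """Map each recipe to the step ci_watch.on_success should point at."""
    expected = {
        **{name: "confirm_cleanup" for name in RECIPES_WITHOUT_RELEASE_SUCCESS},
        **{name: "release_issue_success" for name in RECIPES_WITH_RELEASE_SUCCESS},
    }
    # Coverage guard: a recipe missing from both split lists fails loudly here.
    assert set(expected) == set(RECIPES), "expected routes do not cover RECIPES"
    return expected


def routing_mismatches(parsed, expected):
    """Return the recipe names whose ci_watch.on_success differs from expected."""
    return [
        name
        for name, route in expected.items()
        if parsed[name]["steps"]["ci_watch"]["on_success"] != route
    ]


# Stand-in for parsed recipe files, built to match the expected routing.
PARSED = {
    name: {"steps": {"ci_watch": {"on_success": route}}}
    for name, route in build_expected_routes().items()
}
clean = routing_mismatches(PARSED, build_expected_routes())
```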

on_success: confirm_cleanup
on_failure: confirm_cleanup

release_issue_failure:

[warning] arch: implementation.yaml and remediation.yaml skip release_issue_success on the success path, while implementation-groups.yaml and audit-and-fix.yaml retain it. Verify that all exit paths (including open_pr=false mode) in implementation-groups and audit-and-fix do not release the label prematurely.

@Trecek Trecek left a comment

AutoSkillit review found 4 blocking issues (2 critical, 2 warning). See inline comments.

Verdict: changes_requested

Critical

  • implementation.yaml:482 — in-progress label never released on the success path (ci_watch → confirm_cleanup skips release_issue_success entirely)
  • remediation.yaml:464 — same issue: the label permanently stays on the issue after a successful run

Warning

  • test_issue_url_pipeline.py:330 — RECIPES_WITH_RELEASE_SUCCESS / RECIPES_WITHOUT_RELEASE_SUCCESS have no exhaustiveness assertion vs RECIPES
  • test_issue_url_pipeline.py:391 — test_ci_watch_on_success_routing silently skips any recipe absent from both sub-lists

Trecek and others added 3 commits March 12, 2026 13:25
…ntation and remediation recipes

- Add release_issue_success step to implementation.yaml and remediation.yaml so the
  in-progress label is released when ci_watch succeeds, matching the pattern already
  used in implementation-groups.yaml and audit-and-fix.yaml
- Update ci_watch.on_success route from confirm_cleanup → release_issue_success in both
  recipes; release_issue_success then routes to confirm_cleanup
- Move implementation and remediation into RECIPES_WITH_RELEASE_SUCCESS, clearing
  RECIPES_WITHOUT_RELEASE_SUCCESS (now empty)
- Add test_split_lists_are_exhaustive to guard RECIPES_WITH_RELEASE_SUCCESS |
  RECIPES_WITHOUT_RELEASE_SUCCESS == RECIPES against future drift
- Add set(expected) == set(self.RECIPES) assertion in test_ci_watch_on_success_routing
  to prevent silently skipping any recipe added to RECIPES

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…mentation and remediation

Update test_ip_ci_watch_routing and test_if_ci_watch_routing to assert
ci_watch.on_success == "release_issue_success" and that the step exists,
matching the restored label-release behavior on the success path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…uccess

Per issue #335, the in-progress label must NOT be released on the success
path in implementation and remediation recipes — it stays on the issue while
the PR is open. The resolve-review step incorrectly re-added release_issue_success
to both recipes and updated tests to match the wrong behavior.

- Remove release_issue_success step from implementation.yaml
- Remove release_issue_success step from remediation.yaml
- Set ci_watch.on_success → confirm_cleanup in both recipes
- Move implementation and remediation to RECIPES_WITHOUT_RELEASE_SUCCESS in tests
- Revert test_bundled_recipes.py to assert confirm_cleanup and step absence

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Trecek Trecek enabled auto-merge March 12, 2026 23:33
@Trecek Trecek disabled auto-merge March 12, 2026 23:44
@Trecek Trecek added this pull request to the merge queue Mar 12, 2026
Merged via the queue into integration with commit bcb54b5 Mar 12, 2026
2 checks passed
@Trecek Trecek deleted the bug-release-issue-success-removes-in-progress-label-when-pr/335 branch March 12, 2026 23:47
Trecek added a commit that referenced this pull request Mar 15, 2026
…, Headless Isolation (#404)

## Summary

Integration rollup of **43 PRs** (#293–#406) consolidating **62
commits** across **291 files** (+27,909 / −6,040 lines). This release
advances AutoSkillit from v0.2.0 to v0.3.1 with GitHub merge queue
integration, sub-recipe composition, a PostToolUse output reformatter,
headless session isolation guards, and comprehensive pipeline
observability — plus 24 new bundled skills, 3 new MCP tools, and 47 new
test files.

---

## Major Features

### GitHub Merge Queue Integration (#370, #362, #390)
- New `wait_for_merge_queue` MCP tool — polls a PR through GitHub's
merge queue until merged, ejected, or timed out (default 600s). Uses
REST + GraphQL APIs with stuck-queue detection and auto-merge
re-enrollment
- New `DefaultMergeQueueWatcher` L1 service (`execution/merge_queue.py`)
— never raises; all outcomes are structured results
- `parse_merge_queue_response()` pure function for GraphQL queue entry
parsing
- New `auto_merge` ingredient in `implementation.yaml` and
`remediation.yaml` — enrolls PRs in the merge queue after CI passes
- Full queue-mode path added to `merge-prs.yaml`: detect queue → enqueue
→ wait → handle ejections → re-enter
- `analyze-prs` skill gains Step 0.5 (merge queue detection) and Step
1.5 (CI/review eligibility filtering)

### Sub-Recipe Composition (#380)
- Recipe steps can now reference sub-recipes via `sub_recipe` + `gate`
fields — lazy-loaded and merged at validation time
- Composition engine in `recipe/_api.py`: `_merge_sub_recipe()` inlines
sub-recipe steps with safe name-prefixing and route remapping (`done` →
parent's `on_success`, `escalate` → parent's `on_failure`)
- `_build_active_recipe()` evaluates gate ingredients against
overrides/defaults; dual validation runs on both active and combined
recipes
- First sub-recipe: `sprint-prefix.yaml` — triage → plan → confirm →
dispatch workflow, gated by `sprint_mode` ingredient (hidden, default
false)
- Both `implementation.yaml` and `remediation.yaml` gain `sprint_entry`
placeholder step
- New semantic rules: `unknown-sub-recipe` (ERROR),
`circular-sub-recipe` (ERROR) with DFS cycle detection

### PostToolUse Output Reformatter (#293, #405)
- `pretty_output.py` — new 671-line PostToolUse hook that rewrites raw
MCP JSON responses to Markdown-KV before Claude consumes them (30–77%
token overhead reduction)
- Dedicated formatters for 11 high-traffic tools (`run_skill`,
`run_cmd`, `test_check`, `merge_worktree`, `get_token_summary`, etc.)
plus a generic KV formatter for remaining tools
- Pipeline vs. interactive mode detection via hook config file
- Unwraps Claude Code's `{"result": "<json-string>"}` envelope before
dispatching
- 1,516-line test file with 40+ behavioral tests

### Headless Session Isolation (#359, #393, #397, #405, #406)
- **Env isolation**: `build_sanitized_env()` strips
`AUTOSKILLIT_PRIVATE_ENV_VARS` from subprocess environments, preventing
`AUTOSKILLIT_HEADLESS=1` from leaking into test runners
- **CWD path contamination defense**: `_inject_cwd_anchor()` anchors all
relative paths to session CWD; `_validate_output_paths()` checks
structured output tokens against CWD prefix; `_scan_jsonl_write_paths()`
post-session scanner catches actual Write/Edit/Bash tool calls outside
CWD
- **Headless orchestration guard**: new PreToolUse hook blocks
`run_skill`/`run_cmd`/`run_python` when `AUTOSKILLIT_HEADLESS=1`,
enforcing Tier 1/Tier 2 nesting invariant
- **`_require_not_headless()` server-side guard**: blocks 10
orchestration-only tools from headless sessions at the handler layer
- **Unified error response contract**: `headless_error_result()`
produces consistent 9-field responses;
`_build_headless_error_response()` canonical builder for all failure
paths in `tools_integrations.py`
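
The env-isolation idea can be illustrated with a short sketch. This is not the project's implementation: the signature of `build_sanitized_env` and the contents of the frozenset are assumptions, and only `AUTOSKILLIT_HEADLESS` is named in the text above.

```python
import os

# Assumption: the real AUTOSKILLIT_PRIVATE_ENV_VARS may contain more names.
PRIVATE_ENV_VARS = frozenset({"AUTOSKILLIT_HEADLESS"})


def build_sanitized_env(base=None):
    """Return a copy of the environment without the private variables."""
    source = os.environ if base is None else base
    return {key: value for key, value in source.items() if key not in PRIVATE_ENV_VARS}


parent_env = {"PATH": "/usr/bin", "AUTOSKILLIT_HEADLESS": "1"}
child_env = build_sanitized_env(parent_env)  # suitable for subprocess.run(..., env=child_env)
```

Building a new dict rather than mutating `os.environ` keeps the headless flag visible to the parent session while hiding it from child processes such as test runners.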

### Cook UX Overhaul (#375, #363)
- `open_kitchen` now accepts optional `name` + `overrides` — opens
kitchen AND loads recipe in a single call
- Pre-launch terminal preview with ANSI-colored flow diagram and
ingredients table via new `cli/_ansi.py` module
- `--dangerously-skip-permissions` warning banner with interactive
confirmation prompt
- Randomized session greetings from themed pools
- Orchestrator prompt rewritten: recipe YAML no longer injected via
`--append-system-prompt`; session calls `open_kitchen('{recipe_name}')`
as first action
- Conversational ingredient collection replaces mechanical per-field
prompting

---

## New MCP Tools

| Tool | Gate | Description |
|------|------|-------------|
| `wait_for_merge_queue` | Kitchen | Polls PR through GitHub merge queue (REST + GraphQL) |
| `set_commit_status` | Kitchen | Posts GitHub Commit Status to a SHA for review-first gating |
| `get_quota_events` | Ungated | Surfaces quota guard decisions from `quota_events.jsonl` |

---

## Pipeline Observability (#318, #341)

- **`TelemetryFormatter`** (`pipeline/telemetry_fmt.py`) — single source
of truth for all telemetry rendering; replaces dual-formatter
anti-pattern. Four rendering modes: Markdown table, terminal table,
compact KV (for PostToolUse hook)
- `get_token_summary` and `get_timing_summary` gain `format` parameter
(`"json"` | `"table"`)
- `wall_clock_seconds` merged into token summary output — see duration
alongside token counts in one call
- **Telemetry clear marker**: `write_telemetry_clear_marker()` /
`read_telemetry_clear_marker()` prevent token accounting drift on MCP
server restart after `clear=True`
- **Quota event logging**: `quota_check.py` hook now writes structured
JSONL events (`cache_miss`, `parse_error`, `blocked`, `approved`) to
`quota_events.jsonl`

---

## CI Watcher & Remote Resolution Fixes (#395, #406)

- **`CIRunScope` value object** — carries `workflow` + `head_sha` scope;
replaces bare `head_sha` parameter across all CI watcher signatures
- **Workflow filter**: `wait_for_ci` and `get_ci_status` accept
`workflow` parameter (falls back to project-level `config.ci.workflow`),
preventing unrelated workflows (version bumps, labelers) from satisfying
CI checks
- **`FAILED_CONCLUSIONS` expanded**: `failure` → `{failure, timed_out,
startup_failure, cancelled}`
- **Canonical remote resolver** (`execution/remote_resolver.py`):
`resolve_remote_repo()` with `REMOTE_PRECEDENCE = (upstream, origin)` —
correctly resolves `owner/repo` after `clone_repo` sets `origin` to
`file://` isolation URL
- **Clone isolation fix**: `clone_repo` now always clones from remote
URL (never local path); sets `origin=file:///<clone>` for isolation and
`upstream=<real_url>` for push/CI operations
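
The precedence rule can be sketched as below. `resolve_owner_repo` and the regular expression are hypothetical stand-ins rather than the actual `resolve_remote_repo()` code; only the `(upstream, origin)` ordering and the `file://` origin come from the notes above.

```python
import re

REMOTE_PRECEDENCE = ("upstream", "origin")

# Matches both https://github.com/owner/repo(.git) and git@github.com:owner/repo(.git).
_GITHUB_URL = re.compile(
    r"github\.com[:/](?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)


def resolve_owner_repo(remotes):
    """Return 'owner/repo' from the first remote in precedence order that parses."""
    for name in REMOTE_PRECEDENCE:
        match = _GITHUB_URL.search(remotes.get(name, ""))
        if match:
            return f"{match['owner']}/{match['repo']}"
    return None


# After an isolated clone, origin is a local file URL and upstream is the real remote.
isolated = {
    "origin": "file:///tmp/clone",
    "upstream": "https://github.com/acme/widgets.git",
}
plain = {"origin": "git@github.com:acme/widgets.git"}
```

Trying `upstream` first means an isolated clone still resolves to the real repository, while an ordinary checkout with only `origin` falls through to the second entry.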

---

## PR Pipeline Gates (#317, #343)

- **`pipeline/pr_gates.py`**: `is_ci_passing()`, `is_review_passing()`,
`partition_prs()` — partitions PRs into
eligible/CI-blocked/review-blocked with human-readable reasons
- **`pipeline/fidelity.py`**: `extract_linked_issues()`
(Closes/Fixes/Resolves patterns), `is_valid_fidelity_finding()` schema
validation
- **`check_pr_mergeable`** now returns `mergeable_status` field
alongside boolean
- **`release_issue`** gains `target_branch` + `staged_label` parameters
for staged issue lifecycle on non-default branches (#392)
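
Linked-issue extraction of the kind described for `extract_linked_issues()` could be sketched as follows. The pattern and the `linked_issues` name are assumptions; the notes above only say that Closes/Fixes/Resolves patterns are recognized.

```python
import re

# Assumption: only the three keywords named above, case-insensitive, followed by "#<number>".
_LINK_PATTERN = re.compile(r"\b(?:closes|fixes|resolves)\s+#(\d+)", re.IGNORECASE)


def linked_issues(body):
    """Return linked issue numbers in first-seen order, without duplicates."""
    found = []
    for number in _LINK_PATTERN.findall(body):
        issue = int(number)
        if issue not in found:
            found.append(issue)
    return found


pr_body = "Closes #335\nAlso fixes #12 and, again, Closes #335."
issues = linked_issues(pr_body)
```

The leading `\b` keeps words that merely end in a keyword, such as "discloses", from being treated as a link.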

---

## Recipe System Changes

### Structural
- `RecipeIngredient.hidden` field — excluded from ingredients table
(used for internal flags like `sprint_mode`)
- `Recipe.experimental` flag parsed from YAML
- `_TERMINAL_TARGETS` moved to `schema.py` as single source of truth
- `format_ingredients_table()` with sorted display order (required →
auto-detect → flags → optional → constants)
- Diagram rendering engine (~670 lines) removed from `diagrams.py` —
rendering now handled by `/render-recipe` skill; format version bumped
to v7

### Recipe YAML Changes
- **Deleted**: `audit-and-fix.yaml`, `batch-implementation.yaml`,
`bugfix-loop.yaml`
- **Renamed**: `pr-merge-pipeline.yaml` → `merge-prs.yaml`
- **`implementation.yaml`**: merge queue steps,
`auto_merge`/`sprint_mode` ingredients, `base_branch` default → `""`
(auto-detect), CI workflow filter, `extract_pr_number` step
- **`remediation.yaml`**: `topic` → `task` rename, merge queue steps,
`dry_walkthrough` retries:3 with forward-only routing, `verify` → `test`
rename
- **`merge-prs.yaml`**: full queue-mode path, `open-integration-pr` step
(replaces `create-review-pr`), post-PR mergeability polling, review
cycle with `resolve-review` retries

### New Semantic Rules
- `missing-output-patterns` (WARNING) — flags `run_skill` steps without
`expected_output_patterns`
- `unknown-sub-recipe` (ERROR) — validates sub-recipe references exist
- `circular-sub-recipe` (ERROR) — DFS cycle detection
- `unknown-skill-command` (ERROR) — validates skill names against
bundled set
- `telemetry-before-open-pr` (WARNING) — ensures telemetry step precedes
`open-pr`

---

## New Skills (24)

### Architecture Lens Family (13)
`arch-lens-c4-container`, `arch-lens-concurrency`,
`arch-lens-data-lineage`, `arch-lens-deployment`,
`arch-lens-development`, `arch-lens-error-resilience`,
`arch-lens-module-dependency`, `arch-lens-operational`,
`arch-lens-process-flow`, `arch-lens-repository-access`,
`arch-lens-scenarios`, `arch-lens-security`, `arch-lens-state-lifecycle`

### Audit Family (5)
`audit-arch`, `audit-bugs`, `audit-cohesion`, `audit-defense-standards`,
`audit-tests`

### Planning & Diagramming (3)
`elaborate-phase`, `make-arch-diag`, `make-req`

### Bug/Guard Lifecycle (2)
`design-guards`, `verify-diag`

### Pipeline (1)
`open-integration-pr` — creates integration PRs with per-PR details,
arch-lens diagrams, carried-forward `Closes #N` references, and
auto-closes collapsed PRs

### Sprint Planning (1 — gated by sub-recipe)
`sprint-planner` — selects a focused, conflict-free sprint from a triage
manifest

---

## Skill Modifications (Highlights)

- **`analyze-prs`**: merge queue detection, CI/review eligibility
filtering, queue-mode ordering
- **`dry-walkthrough`**: Step 4.5 Historical Regression Check (git
history mining + GitHub issue cross-reference)
- **`review-pr`**: deterministic diff annotation via
`diff_annotator.py`, echo-primary-obligation step, post-completion
confirmation, degraded-mode narration
- **`collapse-issues`**: content fidelity enforcement — per-issue
`fetch_github_issue` calls, copy-mode body assembly (#388)
- **`prepare-issue`**: multi-keyword dedup search, numbered candidate
selection, extend-existing-issue flow
- **`resolve-review`**: GraphQL thread auto-resolution after addressing
findings (#379)
- **`resolve-merge-conflicts`**: conflict resolution decision report
with per-file log (#389)
- **Cross-skill**: output tokens migrated to `key = value` format;
code-index paths made generic with fallback notes; arch-lens references
fully qualified; anti-prose guards at loop boundaries

---

## CLI & Hooks

### New CLI Commands
- `autoskillit install` — plugin installation + cache refresh
- `autoskillit upgrade` — `.autoskillit/scripts/` →
`.autoskillit/recipes/` migration

### CLI Changes
- `doctor`: plugin-aware MCP check, PostToolUse hook scanning, `--fix`
flag removed
- `init`: GitHub repo prompt, `.secrets.yaml` template, plugin-aware
registration
- `chefs-hat`: pre-launch banner, `--dangerously-skip-permissions`
confirmation
- `recipes render`: repurposed from generator to viewer (delegates to
`/render-recipe`)
- `serve`: server import deferred to after `configure_logging()` to
prevent stdout corruption

### New Hooks
- `branch_protection_guard.py` (PreToolUse) — denies
`merge_worktree`/`push_to_remote` targeting protected branches
- `headless_orchestration_guard.py` (PreToolUse) — blocks orchestration
tools in headless sessions
- `pretty_output.py` (PostToolUse) — MCP JSON → Markdown-KV reformatter

### Hook Infrastructure
- `HookDef.event_type` field — registry now handles both PreToolUse and
PostToolUse
- `generate_hooks_json()` groups entries by event type
- `_evict_stale_autoskillit_hooks` and `sync_hooks_to_settings` made
event-type-agnostic

---

## Core & Config

### New Core Modules
- `core/branch_guard.py` — `is_protected_branch()` pure function
- `core/github_url.py` — `parse_github_repo()` +
`normalize_owner_repo()` canonical parsers

### Core Type Expansions
- `AUTOSKILLIT_PRIVATE_ENV_VARS` frozenset
- `WORKER_TOOLS` / `HEADLESS_BLOCKED_UNGATED_TOOLS` split from
`UNGATED_TOOLS`
- `TOOL_CATEGORIES` — categorized listing for `open_kitchen` response
- `CIRunScope` — immutable scope for CI watcher calls
- `MergeQueueWatcher` protocol
- `SkillResult.cli_subtype` + `write_path_warnings` fields
- `SubprocessRunner.env` parameter

### Config
- `safety.protected_branches`: `[main, integration, stable]`
- `github.staged_label`: `"staged"`
- `ci.workflow`: workflow filename filter (e.g., `"tests.yml"`)
- `branching.default_base_branch`: `"integration"` → `"main"`
- `ModelConfig.default`: `str | None` → `str = "sonnet"`

---

## Infrastructure & Release

### Version
- `0.2.0` → `0.3.1` across `pyproject.toml`, `plugin.json`, `uv.lock`
- FastMCP dependency: `>=3.0.2` → `>=3.1.1,<4.0` (#399)

### CI/CD Workflows
- **`version-bump.yml`** (new) — auto patch-bumps `main` on integration
PR merge, force-syncs integration branch one patch ahead
- **`release.yml`** (new) — minor version bump + GitHub Release on merge
to `stable`
- **`codeql.yml`** (new) — CodeQL analysis for `stable` PRs (Python +
Actions)
- **`tests.yml`** — `merge_group:` trigger added; multi-OS now only for
`stable`

### PyPI Readiness
- `pyproject.toml`: `readme`, `license`, `authors`, `keywords`,
`classifiers`, `project.urls`, `hatch.build.targets.sdist` inclusion
list

### readOnlyHint Parallel Execution Fix
- All MCP tools annotated `readOnlyHint=True` — enables Claude Code
parallel tool execution (~7x speedup). One deliberate exception:
`wait_for_merge_queue` uses `readOnlyHint=False` (actually mutates queue
state)

### Tool Response Exception Boundary
- `track_response_size` decorator catches unhandled exceptions and
serializes them as `{"success": false, "subtype": "tool_exception"}` —
prevents FastMCP opaque error wrapping
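
An exception boundary of this kind can be sketched as a decorator. The name `exception_boundary`, the synchronous handler, and the extra `error` field are assumptions made for illustration; only the `success` and `subtype` keys come from the description above.

```python
import functools
import json


def exception_boundary(func):
    """Convert any unhandled exception into a structured JSON failure response."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # deliberate catch-all at the tool boundary
            return json.dumps(
                {"success": False, "subtype": "tool_exception", "error": str(exc)}
            )

    return wrapper


@exception_boundary
def flaky_tool(value):
    """Hypothetical tool handler that fails on negative input."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return json.dumps({"success": True, "value": value})


ok_response = json.loads(flaky_tool(3))
error_response = json.loads(flaky_tool(-1))
```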

### SkillResult Subtype Normalization (#358)
- `_normalize_subtype()` gate eliminates dual-source contradiction
between CLI subtype and session outcome
- Class 2 upward: `SUCCEEDED + error_subtype → "success"` (drain-race
artifact)
- Class 1 downward: `non-SUCCEEDED + "success" → "empty_result"` /
`"missing_completion_marker"` / `"adjudicated_failure"`

---

## Test Coverage

**47 new test files** (+12,703 lines) covering:

| Area | Key Tests |
|------|-----------|
| Merge queue watcher state machine | `test_merge_queue.py` (226 lines) |
| Clone isolation × CI resolution | `test_clone_ci_contract.py`, `test_remote_resolver.py` |
| PostToolUse hook | `test_pretty_output.py` (1,516 lines, 40+ cases) |
| Branch protection + headless guards | `test_branch_protection_guard.py`, `test_headless_orchestration_guard.py` |
| Sub-recipe composition | 5 test files (schema, loading, validation, sprint mode × 2) |
| Telemetry formatter | `test_telemetry_formatter.py` (281 lines) |
| PR pipeline gates | `test_analyze_prs_gates.py`, `test_review_pr_fidelity.py` |
| Diff annotator | `test_diff_annotator.py` (242 lines) |
| Skill compliance | Output token format, genericization, loop-boundary guards |
| Release workflows | Structural contracts for `version-bump.yml`, `release.yml` |
| Issue content fidelity | Body-assembling skills must call `fetch_github_issue` per-issue |
| CI watcher scope | `test_ci_params.py` — workflow_id query param composition |

---

## Consolidated PRs

#293, #295, #314, #315, #316, #317, #318, #319, #323, #332, #336, #337,
#338, #339, #341, #343, #351, #358, #359, #360, #361, #362, #363, #366,
#368, #370, #375, #377, #378, #379, #380, #388, #389, #390, #391, #392,
#393, #395, #396, #397, #399, #405, #406

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>