Implementation Plan: analyze-prs Merge Queue Detection#362

Merged
Trecek merged 13 commits into integration from
test-github-merge-queue-as-alternative-to-sequential-pr-merg/347
Mar 12, 2026
Trecek commented Mar 12, 2026

Summary

Add GitHub merge queue awareness to the analyze-prs skill. When a merge queue is
active on the target branch, the skill uses queue-position ordering and tags all
MERGEABLE queue entries as simple. When the queue is absent or empty it falls back
to the existing file-overlap / topological-sort path. The PR manifest format is
identical in both modes.

Two files change: src/autoskillit/execution/github.py gains parse_merge_queue_response(), a pure function that extracts and normalises merge queue entries from a raw GraphQL response dict; and src/autoskillit/skills/analyze-prs/SKILL.md gains Step 0.5 (merge queue detection) and branches the existing Steps 1–4 so they are skipped or simplified when QUEUE_MODE = true.

Requirements

INFRA — Merge Queue Infrastructure

  • REQ-INFRA-001: The integration branch ruleset must have merge queue enabled.
  • REQ-INFRA-002: The CI workflow must include a merge_group trigger so status checks fire on gh-readonly-queue/* temporary branches.

DETECT — Merge Queue Detection

  • REQ-DETECT-001: The analyze-prs skill must query the GitHub GraphQL API to determine whether a merge queue is enabled on the target branch.
  • REQ-DETECT-002: The analyze-prs skill must retrieve all merge queue entries with their position, state, and associated PR metadata when a queue is detected.
  • REQ-DETECT-003: The system must fall back to existing file-overlap analysis when the merge queue is not enabled or returns no entries.
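The detection query behind REQ-DETECT-001 and REQ-DETECT-002 could be issued roughly as follows. This is a minimal sketch, not the skill's actual command: the helper name `build_merge_queue_command`, the `first: 50` page size, and the selected pull request fields are assumptions, while `mergeQueue(branch)` and the `position`, `state`, and `pullRequest` entry fields are taken from the plan.

```python
# GraphQL document for the merge queue lookup (page size of 50 is an assumption).
MERGE_QUEUE_QUERY = """
query($owner: String!, $repo: String!, $branch: String!) {
  repository(owner: $owner, name: $repo) {
    mergeQueue(branch: $branch) {
      entries(first: 50) {
        nodes {
          position
          state
          pullRequest { number title }
        }
      }
    }
  }
}
"""


def build_merge_queue_command(owner: str, repo: str, branch: str) -> list[str]:
    """Return the argv for a `gh api graphql` call (hypothetical helper)."""
    return [
        "gh", "api", "graphql",
        "-f", f"query={MERGE_QUEUE_QUERY}",
        "-f", f"owner={owner}",
        "-f", f"repo={repo}",
        "-f", f"branch={branch}",
    ]
```

The returned list can be handed to `subprocess.run(..., capture_output=True, text=True)` and the captured stdout decoded with `json.loads` before parsing.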

ORDER — Queue-Position Ordering

  • REQ-ORDER-001: When merge queue entries are available, the PR manifest must list PRs in queue-position order (ascending position field).
  • REQ-ORDER-002: PRs with state: MERGEABLE in the queue must be tagged as simple in the manifest.
  • REQ-ORDER-003: The manifest format must remain identical regardless of whether queue ordering or computed ordering was used.
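The ordering and tagging rules in REQ-ORDER-001 and REQ-ORDER-002 can be sketched as below. The function name and the `number` and `complexity` keys are illustrative assumptions; `overlap_with_pr_numbers` and the `needs_check` tag for non-MERGEABLE states follow the plan's own wording elsewhere in this PR.

```python
from typing import Any


def build_queue_manifest_entries(
    queue_entries: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Order entries by queue position and tag complexity (illustrative only)."""
    ordered = sorted(queue_entries, key=lambda e: e["position"])
    return [
        {
            "number": e["pr_number"],  # assumed manifest key
            # MERGEABLE entries are simple; any other state needs a check.
            "complexity": "simple" if e["state"] == "MERGEABLE" else "needs_check",
            # Queue mode skips overlap analysis, so this list is always empty.
            "overlap_with_pr_numbers": [],
        }
        for e in ordered
    ]
```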

FALLBACK — Conflict Resolution Fallback

  • REQ-FALLBACK-001: If a queue-ordered PR unexpectedly conflicts during merge, the pipeline must fall back to the existing conflict resolution cycle.
  • REQ-FALLBACK-002: The fallback path must not require any changes to the pr-merge-pipeline.yaml recipe structure.

Architecture Impact

Process Flow Diagram

%%{init: {'flowchart': {'nodeSpacing': 45, 'rankSpacing': 55, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal    fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode   fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler     fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase       fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output      fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector    fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    %% TERMINALS %%
    START([START])
    DONE([COMPLETE])

    %% STEP 0 %%
    subgraph S0 ["Step 0 — Authenticate & List PRs"]
        direction TB
        ListPRs["gh pr list<br/>━━━━━━━━━━<br/>--base branch --state open<br/>--json number,title,headRefName,..."]
        ZeroPRs{"0 PRs?"}
    end

    %% STEP 0.5 %%
    subgraph S05 ["● Step 0.5 — Detect GitHub Merge Queue"]
        direction TB
        ResolveRepo["● gh repo view<br/>━━━━━━━━━━<br/>extract OWNER + REPO"]
        GraphQL["● gh api graphql<br/>━━━━━━━━━━<br/>mergeQueue(branch) entries<br/>position · state · pullRequest"]
        ParseFn["● parse_merge_queue_response()<br/>━━━━━━━━━━<br/>github.py — pure function<br/>returns sorted entry list"]
        QueueDecision{"MERGEABLE<br/>entries?"}
    end

    %% QUEUE PATH %%
    subgraph QP ["Queue Path (QUEUE_MODE = true)"]
        direction TB
        FetchMeta["gh pr view (parallel ≤8)<br/>━━━━━━━━━━<br/>headRefName · files · additions<br/>deletions · changedFiles<br/>(no diffs, no body)"]
        QueueOrder["Order by position ASC<br/>━━━━━━━━━━<br/>from QUEUE_ENTRIES<br/>overlap_with = [] for all"]
        TagSimple["Tag complexity<br/>━━━━━━━━━━<br/>MERGEABLE → simple<br/>other states → needs_check"]
    end

    %% EXISTING PATH %%
    subgraph EP ["● Existing Path (QUEUE_MODE = false)"]
        direction TB
        FetchDiffs["● Step 1: Fetch PR diffs (parallel ≤8)<br/>━━━━━━━━━━<br/>gh pr diff + gh pr view files<br/>+ requirements section"]
        CIReview["● Step 1.5: CI & Review gate<br/>━━━━━━━━━━<br/>gh pr checks · gh pr view reviews<br/>→ ELIGIBLE / CI_BLOCKED / REVIEW_BLOCKED"]
        Overlap["● Step 2: File overlap matrix<br/>━━━━━━━━━━<br/>shared_files per PR pair<br/>conflict if shared_files ≠ []"]
        TopoSort["● Step 3: Topological sort<br/>━━━━━━━━━━<br/>no-overlap first (additions ASC)<br/>then overlap graph order"]
        TagComplex["● Step 4: Tag complexity<br/>━━━━━━━━━━<br/>simple / needs_check<br/>per overlap + size rules"]
    end

    %% STEP 5 %%
    subgraph S5 ["Step 5 — Write Outputs (identical format both modes)"]
        direction TB
        WriteJSON["pr_order_{ts}.json<br/>━━━━━━━━━━<br/>integration_branch · base_branch<br/>prs · ci_blocked_prs · review_blocked_prs"]
        WriteMD["pr_analysis_plan_{ts}.md<br/>━━━━━━━━━━<br/>human-readable plan<br/>notes queue source when QUEUE_MODE"]
    end

    S6["Step 6: Verify & Report<br/>━━━━━━━━━━<br/>validate JSON · emit output tokens"]

    %% FLOW %%
    START --> ListPRs
    ListPRs --> ZeroPRs
    ZeroPRs -->|"yes — empty manifest"| DONE
    ZeroPRs -->|"no"| ResolveRepo
    ResolveRepo --> GraphQL
    GraphQL -->|"parse response"| ParseFn
    ParseFn --> QueueDecision

    QueueDecision -->|"yes → QUEUE_MODE=true"| FetchMeta
    QueueDecision -->|"no → QUEUE_MODE=false"| FetchDiffs
    GraphQL -->|"error (auth/rate/network)"| FetchDiffs

    FetchMeta --> QueueOrder --> TagSimple
    FetchDiffs --> CIReview --> Overlap --> TopoSort --> TagComplex

    TagSimple  --> WriteJSON
    TagComplex --> WriteJSON
    WriteJSON  --> WriteMD
    WriteMD    --> S6
    S6         --> DONE

    %% CLASS ASSIGNMENTS %%
    class START,DONE terminal;
    class ZeroPRs,QueueDecision stateNode;
    class ListPRs,FetchDiffs,CIReview,Overlap,TopoSort,TagComplex handler;
    class ResolveRepo,GraphQL,ParseFn newComponent;
    class FetchMeta,QueueOrder,TagSimple newComponent;
    class WriteJSON,WriteMD output;
    class S6 phase;

Color Legend:

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Terminal | START / COMPLETE states |
| Teal | State | Decision points (0-PR exit, QUEUE_MODE branch) |
| Orange | Handler | Existing processing steps |
| Green (●) | Modified | Merge queue detection steps — github.py + SKILL.md |
| Dark Teal | Output | Written manifest files |
| Purple | Phase | Verify & Report |

Closes #347

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/merge-queue-20260311-214047-666786/temp/make-plan/analyze_prs_merge_queue_plan_2026-03-11_120000.md

Token Usage Summary

Token Summary

fix

  • input_tokens: 629
  • output_tokens: 243257
  • cache_creation_input_tokens: 826397
  • cache_read_input_tokens: 45631198
  • invocation_count: 12
  • elapsed_seconds: 5893.771816905999

resolve_review

  • input_tokens: 391
  • output_tokens: 168303
  • cache_creation_input_tokens: 449347
  • cache_read_input_tokens: 21658363
  • invocation_count: 7
  • elapsed_seconds: 4158.092358417003

audit_impl

  • input_tokens: 2756
  • output_tokens: 166320
  • cache_creation_input_tokens: 581154
  • cache_read_input_tokens: 4335466
  • invocation_count: 13
  • elapsed_seconds: 4878.894512576997

open_pr

  • input_tokens: 239
  • output_tokens: 155616
  • cache_creation_input_tokens: 470597
  • cache_read_input_tokens: 7709497
  • invocation_count: 9
  • elapsed_seconds: 3999.4891979619897

review_pr

  • input_tokens: 177
  • output_tokens: 244448
  • cache_creation_input_tokens: 479046
  • cache_read_input_tokens: 6204544
  • invocation_count: 7
  • elapsed_seconds: 4626.790592216992

plan

  • input_tokens: 1665
  • output_tokens: 187361
  • cache_creation_input_tokens: 655841
  • cache_read_input_tokens: 11017819
  • invocation_count: 8
  • elapsed_seconds: 3779.893521053007

verify

  • input_tokens: 8274
  • output_tokens: 151282
  • cache_creation_input_tokens: 664012
  • cache_read_input_tokens: 12450916
  • invocation_count: 10
  • elapsed_seconds: 3138.836677256011

implement

  • input_tokens: 785
  • output_tokens: 269097
  • cache_creation_input_tokens: 876108
  • cache_read_input_tokens: 61001114
  • invocation_count: 10
  • elapsed_seconds: 6131.576118467008

analyze_prs

  • input_tokens: 15
  • output_tokens: 29915
  • cache_creation_input_tokens: 67472
  • cache_read_input_tokens: 374939
  • invocation_count: 1
  • elapsed_seconds: 524.8599999569997

merge_pr

  • input_tokens: 166
  • output_tokens: 56815
  • cache_creation_input_tokens: 314110
  • cache_read_input_tokens: 4466380
  • invocation_count: 9
  • elapsed_seconds: 1805.746022643012

create_review_pr

  • input_tokens: 27
  • output_tokens: 20902
  • cache_creation_input_tokens: 62646
  • cache_read_input_tokens: 856008
  • invocation_count: 1
  • elapsed_seconds: 444.64073076299974

resolve_merge_conflicts

  • input_tokens: 88
  • output_tokens: 52987
  • cache_creation_input_tokens: 122668
  • cache_read_input_tokens: 7419558
  • invocation_count: 1
  • elapsed_seconds: 975.0682389580033

Timing Summary

fix

  • total_seconds: 5893.887897783006
  • invocation_count: 12

resolve_review

  • total_seconds: 4158.092358417003
  • invocation_count: 7

audit_impl

  • total_seconds: 4878.975284532004
  • invocation_count: 13

open_pr

  • total_seconds: 3999.5348451210048
  • invocation_count: 9

review_pr

  • total_seconds: 4626.790592216992
  • invocation_count: 7

plan

  • total_seconds: 3780.017026652018
  • invocation_count: 8

verify

  • total_seconds: 3138.9388441649935
  • invocation_count: 10

implement

  • total_seconds: 6131.693511301022
  • invocation_count: 10

analyze_prs

  • total_seconds: 524.8599999569997
  • invocation_count: 1

merge_pr

  • total_seconds: 1805.746022643012
  • invocation_count: 9

create_review_pr

  • total_seconds: 444.64073076299974
  • invocation_count: 1

resolve_merge_conflicts

  • total_seconds: 975.0682389580033
  • invocation_count: 1

clone

  • total_seconds: 14.340457123005763
  • invocation_count: 3

update_ticket

  • total_seconds: 1.0304174909979338
  • invocation_count: 1

capture_base_sha

  • total_seconds: 0.00960076300543733
  • invocation_count: 3

create_branch

  • total_seconds: 1.9964898860052926
  • invocation_count: 3

push_merge_target

  • total_seconds: 2.7216036769968923
  • invocation_count: 3

test

  • total_seconds: 394.38787825701
  • invocation_count: 6

merge

  • total_seconds: 395.920931871995
  • invocation_count: 3

push

  • total_seconds: 1.8540772910200758
  • invocation_count: 2

🤖 Generated with Claude Code via AutoSkillit

Trecek added 5 commits March 11, 2026 21:55
Adds a module-level pure function that extracts and normalises merge
queue entries from a raw GraphQL response dict, sorted by position
ascending. Returns empty list on null queue, empty queue, GraphQL
errors, or unexpected structure. Used by analyze-prs skill in
queue-detection step 0.5.

Nine tests covering: sorted-by-position, entry field extraction,
all-states preserved, empty nodes, null queue, GraphQL errors,
missing data key, malformed node (None pr_number), and pure function
idempotency.

Inserts Step 0.5 between Step 0 and Step 1: queries the GitHub GraphQL
API for active merge queue entries. When MERGEABLE entries are found,
sets QUEUE_MODE=true and branches all subsequent steps:
- Step 1: lightweight metadata fetch only (no diffs)
- Step 1.5: skipped (MERGEABLE implies CI+review passed)
- Step 2: skipped (queue proves compatibility; overlap_with_pr_numbers=[])
- Step 3: queue position order (ascending)
- Step 4: all MERGEABLE entries tagged simple
- Step 5 .md: notes queue source when QUEUE_MODE=true

Manifest format is identical in both modes.
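The mode switch described in this commit message reduces to a single predicate over the parsed queue entries. A minimal sketch, assuming the parsed entries are dicts with a `state` key; the name `detect_queue_mode` is hypothetical:

```python
from typing import Any


def detect_queue_mode(queue_entries: list[dict[str, Any]]) -> bool:
    """Return True when at least one queue entry is MERGEABLE."""
    return any(e.get("state") == "MERGEABLE" for e in queue_entries)
```

An empty list (queue absent, empty, or query failed) yields `False`, which selects the existing file-overlap path.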
…in SKILL.md to avoid callable-contract false-positive on module path
"pr_title": pr.get("title", ""),
}
)
return sorted(entries, key=lambda e: e["position"])

[warning] arch: The except clause catches only AttributeError and TypeError, but an error in any single node iteration silently drops ALL subsequent entries rather than skipping just the bad node. Per-iteration error handling would preserve valid entries.

    """
    try:
        queue = data.get("data", {}).get("repository", {}).get("mergeQueue") or {}
        nodes = queue.get("entries", {}).get("nodes") or []

[warning] bugs: node.get("position", 0) defaults to 0 for missing position keys. Multiple entries with missing positions all receive position=0, making their relative sort order undefined (unstable). Use a distinct sentinel (e.g. float('inf')) to make missing-position entries sort to the end.

Each subagent fetches:
- `gh pr diff {number}` — full unified diff
- `gh pr view {number} --json files` — structured file list with additions/deletions per file
- `gh pr view {number} --json body -q .body` — PR body to extract `## Requirements` section if present

[warning] arch: In QUEUE_MODE, Step 1 fetches gh pr view {number} --json headRefName,files,... but the files field returns an array of objects (each with path, additions, deletions), not a list of plain paths. The instruction omits the .files[].path extraction step, so files_changed will be a list of dicts rather than strings, breaking downstream overlap and tagging steps.

### Step 1: Fetch PR Data

- **If `QUEUE_MODE = false`**: **ALWAYS launch subagents in parallel** — never process PRs
sequentially. Launch one Explore subagent per PR (up to 8 simultaneously; batch in groups

[warning] cohesion: QUEUE_MODE and QUEUE_ENTRIES are introduced in Step 0.5 but have no single authoritative definition or variable declaration section. Their semantics are repeated inline across Steps 1, 1.5, 2, 3, and 4, scattering the feature definition rather than defining once and referencing.

{requirements_section from PR body}

{repeat for each PR}


[warning] cohesion: The Integration Strategy template uses curly-brace literal syntax {If QUEUE_MODE = true, add: ...} while all other conditional branches throughout the skill use markdown bold - **If QUEUE_MODE = true**: notation. The mixed notation styles are inconsistent.

Trecek left a comment

AutoSkillit review found 5 blocking issues. See inline comments.

Trecek and others added 4 commits March 12, 2026 09:54
…n parse_merge_queue_response

Per reviewer: wrap loop body in per-iteration try/except so a bad node skips
that entry without dropping subsequent valid entries.  Use float('inf') as
position sentinel so missing-position entries sort to the end rather than
all colliding at position 0 with undefined relative order.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
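The two fixes in the commit above (per-node error handling and an infinite position sentinel) can be illustrated as follows. This is a reconstruction under assumptions, not the merged code: the helper name `parse_entries` is hypothetical, the output keys mirror the fragments quoted in the review, and treating an explicit `None` position as missing is an extra assumption.

```python
import math
from typing import Any


def parse_entries(nodes: list[Any]) -> list[dict[str, Any]]:
    """Normalise queue nodes, skipping only the malformed ones."""
    entries: list[dict[str, Any]] = []
    for node in nodes:
        try:
            pr = node.get("pullRequest") or {}
            position = node.get("position")
            entries.append(
                {
                    # Missing positions sort last instead of colliding at 0.
                    "position": math.inf if position is None else position,
                    "state": node.get("state"),
                    "pr_number": pr.get("number"),
                    "pr_title": pr.get("title", ""),
                }
            )
        except (AttributeError, TypeError):
            # A non-dict node (for example None) is skipped; later nodes survive.
            continue
    return sorted(entries, key=lambda e: e["position"])
```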
Per reviewer: `gh pr view --json files` returns an array of objects
{path, additions, deletions}, not plain strings.  Add explicit -q
'[.files[].path]' extraction step for both QUEUE_MODE=false subagents
and QUEUE_MODE=true parallel fetches so files_changed is a list of
strings rather than a list of dicts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
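For readers consuming the JSON in Python rather than through `-q`, the same extraction as the `[.files[].path]` filter looks like this. A small sketch; the function name `extract_changed_paths` is hypothetical and the sample payload in the usage note is invented for illustration.

```python
import json


def extract_changed_paths(gh_json: str) -> list[str]:
    """Mirror the jq filter `[.files[].path]` on `gh pr view --json files` output."""
    payload = json.loads(gh_json)
    # Each element of "files" is an object; keep only its "path" string.
    return [f["path"] for f in payload.get("files", [])]
```

For example, a payload of `{"files": [{"path": "a.py", "additions": 3, "deletions": 1}]}` yields a list containing only the string `a.py`.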
…nitions to analyze-prs

Per reviewer: QUEUE_MODE and QUEUE_ENTRIES had no single definition section;
their semantics were scattered across Steps 1-4.  Add a state variable table
at the end of Step 0.5 that serves as the canonical definition point, so
downstream step references can point to one source of truth.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t of SKILL.md

Per reviewer: the Integration Strategy template used curly-brace literal
syntax {If QUEUE_MODE = true, add: ...} while every other conditional
branch uses markdown bold `- **If \`QUEUE_MODE = true\`**:` notation.
Replace with consistent markdown bold form.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Trecek left a comment

AutoSkillit PR Review — Verdict: changes_requested

Note: GitHub prevents requesting changes on your own PR. Posting as COMMENT instead.

6 blocking findings across defense, cohesion, bugs, and tests dimensions. See inline comments.

return "\n".join(lines)


def parse_merge_queue_response(data: dict) -> list[dict]:

[warning] defense: parse_merge_queue_response uses bare dict for both the data parameter and return type. Use dict[str, object] or a TypedDict to enforce shape at the boundary and provide type safety for nested access patterns.

- the response contains GraphQL errors
- the response structure is unexpected
"""
try:

[warning] bugs: Mutable {} literal used as the default in queue.get("entries", {}). While harmless here (never mutated), idiomatic Python uses None with or {} for clarity and to avoid accidental mutation in future refactors.

        try:
            pr = node.get("pullRequest") or {}
            entries.append(
                {

[warning] defense: GraphQL errors in the response (e.g. a {"errors": [...]} response) are silently swallowed — the function returns [] with no indication an error occurred vs. a legitimately empty queue. Callers cannot distinguish a permission error from an empty queue, masking access failures.

.conclusion != null and
.conclusion != "success" and
.conclusion != "skipped" and
.conclusion != "neutral"

[warning] cohesion: In the QUEUE_MODE = true branch of Step 1.5, CI_BLOCKED_PRS and REVIEW_BLOCKED_PRS are described as empty but not explicitly initialized (no bash array assignment shown), while the QUEUE_MODE = false branch initializes them with explicit CI_BLOCKED_PRS=() declarations. Step 5 manifest writing may reference unset variables in queue mode.

{requirements_section from PR body}

{repeat for each PR}


[warning] cohesion: The QUEUE_MODE = true conditional for Integration Strategy is placed before the template prose placeholder ({2-3 sentences...}), breaking the consistent inline-branch pattern used throughout Steps 1-4. Move it inside the template section or document why this placement differs.


def test_parse_merge_queue_response_bad_node_skips_not_drops_rest():
    """A None node (or other non-dict) skips that entry but preserves valid entries."""
    data = {

[warning] bugs: test_parse_merge_queue_response_malformed_node_skipped asserts that a node with a missing pullRequest key is included with pr_number=None. However, the parse_merge_queue_response docstring states the function returns an empty list for unexpected structure. Either update the docstring to document that missing keys yield null fields, or fix the implementation to skip such nodes (and assert len(entries) == 0).

data = {
"data": {
"repository": {
"mergeQueue": {

[warning] tests: Test name test_parse_merge_queue_response_malformed_node_skipped says the entry is skipped, but the assertion shows it IS included with pr_number=None. The comment contradicts the test name. Rename to test_parse_merge_queue_response_missing_pullrequest_included_with_none to accurately describe the behavior.

Trecek added 3 commits March 12, 2026 13:27
…raphQL error logging

- Change signature from bare `dict` to `dict[str, Any]` for both parameter and return type
- Replace mutable `{}` default with `queue.get("entries") or {}` idiom
- Detect and log GraphQL errors at WARNING level before parsing; return [] early
- Update docstring to document null-pr_number behavior for missing pullRequest key
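The early-return on GraphQL errors listed above might look like the following. This is a sketch under assumptions: the helper name `has_graphql_errors` and the log message text are hypothetical, and the real function may inline this check instead of using a helper.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def has_graphql_errors(data: dict[str, Any]) -> bool:
    """Log and report a top-level GraphQL `errors` list (hypothetical helper)."""
    errors = data.get("errors")
    if errors:
        # WARNING level lets callers tell a permission failure from an empty queue.
        logger.warning("merge queue query returned GraphQL errors: %s", errors)
        return True
    return False
```

A parser would call this first and return `[]` when it reports `True`, so the log line is the only trace separating an access failure from a genuinely empty queue.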
…gration Strategy placement

- Add explicit CI_BLOCKED_PRS=() and REVIEW_BLOCKED_PRS=() bash assignments in the
  QUEUE_MODE=true branch of Step 1.5 so Step 5 can reference both arrays unconditionally
- Move the QUEUE_MODE=true conditional in the Integration Strategy template inside the
  template section (after the prose placeholder) to match the inline-branch pattern used
  throughout Steps 1-4
…e inclusion behavior

Test name test_parse_merge_queue_response_malformed_node_skipped was contradicted by
its own assertion (entry IS included with pr_number=None). Rename to
test_parse_merge_queue_response_missing_pullrequest_included_with_none and update
the inline comment for clarity.
Trecek added this pull request to the merge queue Mar 12, 2026
Trecek removed this pull request from the merge queue due to a manual request Mar 12, 2026
The merge queue requires this event trigger to run status checks on
gh-readonly-queue/* temporary branches. Without it the queue sits at
AWAITING_CHECKS indefinitely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Trecek merged commit 5bf0814 into integration Mar 12, 2026
2 checks passed
Trecek deleted the test-github-merge-queue-as-alternative-to-sequential-pr-merg/347 branch March 12, 2026 22:52
Trecek added a commit that referenced this pull request Mar 15, 2026
…, Headless Isolation (#404)

## Summary

Integration rollup of **43 PRs** (#293–#406) consolidating **62
commits** across **291 files** (+27,909 / −6,040 lines). This release
advances AutoSkillit from v0.2.0 to v0.3.1 with GitHub merge queue
integration, sub-recipe composition, a PostToolUse output reformatter,
headless session isolation guards, and comprehensive pipeline
observability — plus 24 new bundled skills, 3 new MCP tools, and 47 new
test files.

---

## Major Features

### GitHub Merge Queue Integration (#370, #362, #390)
- New `wait_for_merge_queue` MCP tool — polls a PR through GitHub's
merge queue until merged, ejected, or timed out (default 600s). Uses
REST + GraphQL APIs with stuck-queue detection and auto-merge
re-enrollment
- New `DefaultMergeQueueWatcher` L1 service (`execution/merge_queue.py`)
— never raises; all outcomes are structured results
- `parse_merge_queue_response()` pure function for GraphQL queue entry
parsing
- New `auto_merge` ingredient in `implementation.yaml` and
`remediation.yaml` — enrolls PRs in the merge queue after CI passes
- Full queue-mode path added to `merge-prs.yaml`: detect queue → enqueue
→ wait → handle ejections → re-enter
- `analyze-prs` skill gains Step 0.5 (merge queue detection) and Step
1.5 (CI/review eligibility filtering)

### Sub-Recipe Composition (#380)
- Recipe steps can now reference sub-recipes via `sub_recipe` + `gate`
fields — lazy-loaded and merged at validation time
- Composition engine in `recipe/_api.py`: `_merge_sub_recipe()` inlines
sub-recipe steps with safe name-prefixing and route remapping (`done` →
parent's `on_success`, `escalate` → parent's `on_failure`)
- `_build_active_recipe()` evaluates gate ingredients against
overrides/defaults; dual validation runs on both active and combined
recipes
- First sub-recipe: `sprint-prefix.yaml` — triage → plan → confirm →
dispatch workflow, gated by `sprint_mode` ingredient (hidden, default
false)
- Both `implementation.yaml` and `remediation.yaml` gain `sprint_entry`
placeholder step
- New semantic rules: `unknown-sub-recipe` (ERROR),
`circular-sub-recipe` (ERROR) with DFS cycle detection
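The DFS cycle detection behind `circular-sub-recipe` can be illustrated with a three-colour depth-first search. This is a generic sketch, not the project's rule implementation: the function name and the `graph` shape (recipe name mapped to the sub-recipes it references) are assumptions.

```python
from typing import Optional


def find_sub_recipe_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a list of recipe names, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in graph}
    stack: list[str] = []

    def visit(name: str) -> Optional[list[str]]:
        color[name] = GREY
        stack.append(name)
        for child in graph.get(name, []):
            state = color.get(child, WHITE)
            if state == GREY:
                # Back edge: the cycle is the path from child to here, closed.
                return stack[stack.index(child):] + [child]
            if state == WHITE:
                found = visit(child)
                if found:
                    return found
        stack.pop()
        color[name] = BLACK
        return None

    for name in graph:
        if color[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None
```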

### PostToolUse Output Reformatter (#293, #405)
- `pretty_output.py` — new 671-line PostToolUse hook that rewrites raw
MCP JSON responses to Markdown-KV before Claude consumes them (30–77%
token overhead reduction)
- Dedicated formatters for 11 high-traffic tools (`run_skill`,
`run_cmd`, `test_check`, `merge_worktree`, `get_token_summary`, etc.)
plus a generic KV formatter for remaining tools
- Pipeline vs. interactive mode detection via hook config file
- Unwraps Claude Code's `{"result": "<json-string>"}` envelope before
dispatching
- 1,516-line test file with 40+ behavioral tests
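The envelope unwrapping step can be sketched as below. The function name `unwrap_result_envelope` and the choice to leave non-JSON strings untouched are assumptions for illustration; only the `{"result": "<json-string>"}` shape comes from the notes above.

```python
import json
from typing import Any


def unwrap_result_envelope(payload: Any) -> Any:
    """Return the decoded inner object if payload is {"result": "<json-string>"}."""
    if (
        isinstance(payload, dict)
        and set(payload) == {"result"}
        and isinstance(payload["result"], str)
    ):
        try:
            return json.loads(payload["result"])
        except json.JSONDecodeError:
            return payload  # inner string is not JSON; leave the envelope as-is
    return payload
```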

### Headless Session Isolation (#359, #393, #397, #405, #406)
- **Env isolation**: `build_sanitized_env()` strips
`AUTOSKILLIT_PRIVATE_ENV_VARS` from subprocess environments, preventing
`AUTOSKILLIT_HEADLESS=1` from leaking into test runners
- **CWD path contamination defense**: `_inject_cwd_anchor()` anchors all
relative paths to session CWD; `_validate_output_paths()` checks
structured output tokens against CWD prefix; `_scan_jsonl_write_paths()`
post-session scanner catches actual Write/Edit/Bash tool calls outside
CWD
- **Headless orchestration guard**: new PreToolUse hook blocks
`run_skill`/`run_cmd`/`run_python` when `AUTOSKILLIT_HEADLESS=1`,
enforcing Tier 1/Tier 2 nesting invariant
- **`_require_not_headless()` server-side guard**: blocks 10
orchestration-only tools from headless sessions at the handler layer
- **Unified error response contract**: `headless_error_result()`
produces consistent 9-field responses;
`_build_headless_error_response()` canonical builder for all failure
paths in `tools_integrations.py`
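The env isolation bullet above amounts to filtering a mapping before it reaches a subprocess. A minimal sketch, assuming `AUTOSKILLIT_PRIVATE_ENV_VARS` is a set of names that includes `AUTOSKILLIT_HEADLESS`; the signature shown here is illustrative and may differ from the real `build_sanitized_env()`.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Assumed contents: the notes only imply that AUTOSKILLIT_HEADLESS is private.
AUTOSKILLIT_PRIVATE_ENV_VARS = frozenset({"AUTOSKILLIT_HEADLESS"})


def build_sanitized_env(base: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Copy the environment without private AutoSkillit variables."""
    source = os.environ if base is None else base
    return {k: v for k, v in source.items() if k not in AUTOSKILLIT_PRIVATE_ENV_VARS}
```

The result can be passed as `env=` to `subprocess.run` so a test runner started from a headless session does not inherit the headless flag.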

### Cook UX Overhaul (#375, #363)
- `open_kitchen` now accepts optional `name` + `overrides` — opens
kitchen AND loads recipe in a single call
- Pre-launch terminal preview with ANSI-colored flow diagram and
ingredients table via new `cli/_ansi.py` module
- `--dangerously-skip-permissions` warning banner with interactive
confirmation prompt
- Randomized session greetings from themed pools
- Orchestrator prompt rewritten: recipe YAML no longer injected via
`--append-system-prompt`; session calls `open_kitchen('{recipe_name}')`
as first action
- Conversational ingredient collection replaces mechanical per-field
prompting

---

## New MCP Tools

| Tool | Gate | Description |
|------|------|-------------|
| `wait_for_merge_queue` | Kitchen | Polls PR through GitHub merge queue (REST + GraphQL) |
| `set_commit_status` | Kitchen | Posts GitHub Commit Status to a SHA for review-first gating |
| `get_quota_events` | Ungated | Surfaces quota guard decisions from `quota_events.jsonl` |

---

## Pipeline Observability (#318, #341)

- **`TelemetryFormatter`** (`pipeline/telemetry_fmt.py`) — single source
of truth for all telemetry rendering; replaces dual-formatter
anti-pattern. Four rendering modes: Markdown table, terminal table,
compact KV (for PostToolUse hook)
- `get_token_summary` and `get_timing_summary` gain `format` parameter
(`"json"` | `"table"`)
- `wall_clock_seconds` merged into token summary output — see duration
alongside token counts in one call
- **Telemetry clear marker**: `write_telemetry_clear_marker()` /
`read_telemetry_clear_marker()` prevent token accounting drift on MCP
server restart after `clear=True`
- **Quota event logging**: `quota_check.py` hook now writes structured
JSONL events (`cache_miss`, `parse_error`, `blocked`, `approved`) to
`quota_events.jsonl`

---

## CI Watcher & Remote Resolution Fixes (#395, #406)

- **`CIRunScope` value object** — carries `workflow` + `head_sha` scope;
replaces bare `head_sha` parameter across all CI watcher signatures
- **Workflow filter**: `wait_for_ci` and `get_ci_status` accept
`workflow` parameter (falls back to project-level `config.ci.workflow`),
preventing unrelated workflows (version bumps, labelers) from satisfying
CI checks
- **`FAILED_CONCLUSIONS` expanded**: `failure` → `{failure, timed_out,
startup_failure, cancelled}`
- **Canonical remote resolver** (`execution/remote_resolver.py`):
`resolve_remote_repo()` with `REMOTE_PRECEDENCE = (upstream, origin)` —
correctly resolves `owner/repo` after `clone_repo` sets `origin` to
`file://` isolation URL
- **Clone isolation fix**: `clone_repo` now always clones from remote
URL (never local path); sets `origin=file:///<clone>` for isolation and
`upstream=<real_url>` for push/CI operations
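The remote precedence rule can be illustrated as a lookup over configured remotes. This is a sketch only: `pick_remote_url` is a hypothetical name, the `remotes` mapping (remote name to URL) is an assumed input shape, and skipping `file://` URLs is an assumption added to show why `upstream` wins after clone isolation.

```python
from typing import Optional

REMOTE_PRECEDENCE = ("upstream", "origin")


def pick_remote_url(remotes: dict[str, str]) -> Optional[str]:
    """Return the first non-local remote URL in precedence order."""
    for name in REMOTE_PRECEDENCE:
        url = remotes.get(name)
        # A file:// origin is the isolation clone, not the real GitHub remote.
        if url and not url.startswith("file://"):
            return url
    return None
```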

---

## PR Pipeline Gates (#317, #343)

- **`pipeline/pr_gates.py`**: `is_ci_passing()`, `is_review_passing()`,
`partition_prs()` — partitions PRs into
eligible/CI-blocked/review-blocked with human-readable reasons
- **`pipeline/fidelity.py`**: `extract_linked_issues()`
(Closes/Fixes/Resolves patterns), `is_valid_fidelity_finding()` schema
validation
- **`check_pr_mergeable`** now returns `mergeable_status` field
alongside boolean
- **`release_issue`** gains `target_branch` + `staged_label` parameters
for staged issue lifecycle on non-default branches (#392)
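The Closes/Fixes/Resolves matching mentioned for `extract_linked_issues()` can be approximated with one regular expression. This is an illustrative sketch, not the code in `pipeline/fidelity.py`: the exact pattern, case-insensitivity, the return type, and order-preserving de-duplication are all assumptions.

```python
import re

# Matches "Closes #12", "fixes #3", "Resolves #40" (exact keyword forms only).
_LINK_RE = re.compile(r"\b(?:closes|fixes|resolves)\s+#(\d+)", re.IGNORECASE)


def extract_linked_issues(body: str) -> list[int]:
    """Return referenced issue numbers in first-seen order, without duplicates."""
    seen: dict[int, None] = {}
    for match in _LINK_RE.finditer(body):
        seen.setdefault(int(match.group(1)), None)
    return list(seen)
```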

---

## Recipe System Changes

### Structural
- `RecipeIngredient.hidden` field — excluded from ingredients table
(used for internal flags like `sprint_mode`)
- `Recipe.experimental` flag parsed from YAML
- `_TERMINAL_TARGETS` moved to `schema.py` as single source of truth
- `format_ingredients_table()` with sorted display order (required →
auto-detect → flags → optional → constants)
- Diagram rendering engine (~670 lines) removed from `diagrams.py` —
rendering now handled by `/render-recipe` skill; format version bumped
to v7

### Recipe YAML Changes
- **Deleted**: `audit-and-fix.yaml`, `batch-implementation.yaml`,
`bugfix-loop.yaml`
- **Renamed**: `pr-merge-pipeline.yaml` → `merge-prs.yaml`
- **`implementation.yaml`**: merge queue steps,
`auto_merge`/`sprint_mode` ingredients, `base_branch` default → `""`
(auto-detect), CI workflow filter, `extract_pr_number` step
- **`remediation.yaml`**: `topic` → `task` rename, merge queue steps,
`dry_walkthrough` retries:3 with forward-only routing, `verify` → `test`
rename
- **`merge-prs.yaml`**: full queue-mode path, `open-integration-pr` step
(replaces `create-review-pr`), post-PR mergeability polling, review
cycle with `resolve-review` retries

### New Semantic Rules
- `missing-output-patterns` (WARNING) — flags `run_skill` steps without
`expected_output_patterns`
- `unknown-sub-recipe` (ERROR) — validates sub-recipe references exist
- `circular-sub-recipe` (ERROR) — DFS cycle detection
- `unknown-skill-command` (ERROR) — validates skill names against
bundled set
- `telemetry-before-open-pr` (WARNING) — ensures telemetry step precedes
`open-pr`
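
For readers unfamiliar with the technique behind `circular-sub-recipe`, the following is a generic sketch of DFS cycle detection over a sub-recipe reference graph. The function name and the `dict[str, list[str]]` graph shape are assumptions for illustration, not the rule's actual implementation.

```python
from typing import Optional

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def find_sub_recipe_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a list of recipe names, or None if there is none."""
    color = {name: WHITE for name in graph}
    path: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GRAY
        path.append(node)
        for child in graph.get(node, []):
            state = color.get(child, WHITE)
            if state == GRAY:
                # Back edge: the child is already on the current path.
                return path[path.index(child):] + [child]
            if state == WHITE:
                found = visit(child)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for name in graph:
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


CYCLIC = {"a": ["b"], "b": ["c"], "c": ["a"]}
ACYCLIC = {"a": ["b", "c"], "b": ["c"], "c": []}
```

Returning the offending path, rather than a bare boolean, lets a validator name the recipes involved in its error message.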

---

## New Skills (24)

### Architecture Lens Family (13)
`arch-lens-c4-container`, `arch-lens-concurrency`,
`arch-lens-data-lineage`, `arch-lens-deployment`,
`arch-lens-development`, `arch-lens-error-resilience`,
`arch-lens-module-dependency`, `arch-lens-operational`,
`arch-lens-process-flow`, `arch-lens-repository-access`,
`arch-lens-scenarios`, `arch-lens-security`, `arch-lens-state-lifecycle`

### Audit Family (5)
`audit-arch`, `audit-bugs`, `audit-cohesion`, `audit-defense-standards`,
`audit-tests`

### Planning & Diagramming (3)
`elaborate-phase`, `make-arch-diag`, `make-req`

### Bug/Guard Lifecycle (2)
`design-guards`, `verify-diag`

### Pipeline (1)
`open-integration-pr` — creates integration PRs with per-PR details,
arch-lens diagrams, carried-forward `Closes #N` references, and
auto-closes collapsed PRs

### Sprint Planning (1 — gated by sub-recipe)
`sprint-planner` — selects a focused, conflict-free sprint from a triage
manifest

---

## Skill Modifications (Highlights)

- **`analyze-prs`**: merge queue detection, CI/review eligibility
filtering, queue-mode ordering
- **`dry-walkthrough`**: Step 4.5 Historical Regression Check (git
history mining + GitHub issue cross-reference)
- **`review-pr`**: deterministic diff annotation via
`diff_annotator.py`, echo-primary-obligation step, post-completion
confirmation, degraded-mode narration
- **`collapse-issues`**: content fidelity enforcement — per-issue
`fetch_github_issue` calls, copy-mode body assembly (#388)
- **`prepare-issue`**: multi-keyword dedup search, numbered candidate
selection, extend-existing-issue flow
- **`resolve-review`**: GraphQL thread auto-resolution after addressing
findings (#379)
- **`resolve-merge-conflicts`**: conflict resolution decision report
with per-file log (#389)
- **Cross-skill**: output tokens migrated to `key = value` format;
code-index paths made generic with fallback notes; arch-lens references
fully qualified; anti-prose guards at loop boundaries

---

## CLI & Hooks

### New CLI Commands
- `autoskillit install` — plugin installation + cache refresh
- `autoskillit upgrade` — `.autoskillit/scripts/` →
`.autoskillit/recipes/` migration

### CLI Changes
- `doctor`: plugin-aware MCP check, PostToolUse hook scanning, `--fix`
flag removed
- `init`: GitHub repo prompt, `.secrets.yaml` template, plugin-aware
registration
- `chefs-hat`: pre-launch banner, `--dangerously-skip-permissions`
confirmation
- `recipes render`: repurposed from generator to viewer (delegates to
`/render-recipe`)
- `serve`: server import deferred to after `configure_logging()` to
prevent stdout corruption

### New Hooks
- `branch_protection_guard.py` (PreToolUse) — denies
`merge_worktree`/`push_to_remote` targeting protected branches
- `headless_orchestration_guard.py` (PreToolUse) — blocks orchestration
tools in headless sessions
- `pretty_output.py` (PostToolUse) — MCP JSON → Markdown-KV reformatter

### Hook Infrastructure
- `HookDef.event_type` field — registry now handles both PreToolUse and
PostToolUse
- `generate_hooks_json()` groups entries by event type
- `_evict_stale_autoskillit_hooks` and `sync_hooks_to_settings` made
event-type-agnostic
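
The grouping by event type can be pictured with a small sketch. `HookEntry` is a hypothetical stand-in for `HookDef` (only `event_type` is named in this description), and the output shape is simplified; it is not claimed to match what `generate_hooks_json()` emits.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class HookEntry:
    """Hypothetical stand-in for HookDef; `script` is an assumed field."""
    script: str
    event_type: str  # "PreToolUse" or "PostToolUse"


def group_hooks_by_event(hooks):
    """Map each event type to its hook scripts, keeping registry order."""
    grouped = defaultdict(list)
    for hook in hooks:
        grouped[hook.event_type].append(hook.script)
    return dict(grouped)


REGISTRY = [
    HookEntry("branch_protection_guard.py", "PreToolUse"),
    HookEntry("headless_orchestration_guard.py", "PreToolUse"),
    HookEntry("pretty_output.py", "PostToolUse"),
]
GROUPED = group_hooks_by_event(REGISTRY)
```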

---

## Core & Config

### New Core Modules
- `core/branch_guard.py` — `is_protected_branch()` pure function
- `core/github_url.py` — `parse_github_repo()` +
`normalize_owner_repo()` canonical parsers
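
A pure function such as `is_protected_branch()` might look like the sketch below. The default tuple mirrors the `safety.protected_branches` value listed under Config; the `refs/heads/` stripping and the exact-match rule are assumptions, since the real matching behavior is not described.

```python
# Mirrors the safety.protected_branches default from the Config section.
DEFAULT_PROTECTED = ("main", "integration", "stable")


def is_protected_branch(branch: str, protected=DEFAULT_PROTECTED) -> bool:
    """Return True when `branch` names a protected branch (exact match)."""
    name = branch.strip()
    # Assumption: accept fully qualified refs as well as short names.
    prefix = "refs/heads/"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return name in protected
```

Keeping the check free of I/O means a guard hook and a recipe validator could share it without touching the repository.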

### Core Type Expansions
- `AUTOSKILLIT_PRIVATE_ENV_VARS` frozenset
- `WORKER_TOOLS` / `HEADLESS_BLOCKED_UNGATED_TOOLS` split from
`UNGATED_TOOLS`
- `TOOL_CATEGORIES` — categorized listing for `open_kitchen` response
- `CIRunScope` — immutable scope for CI watcher calls
- `MergeQueueWatcher` protocol
- `SkillResult.cli_subtype` + `write_path_warnings` fields
- `SubprocessRunner.env` parameter

### Config
- `safety.protected_branches`: `[main, integration, stable]`
- `github.staged_label`: `"staged"`
- `ci.workflow`: workflow filename filter (e.g., `"tests.yml"`)
- `branching.default_base_branch`: `"integration"` → `"main"`
- `ModelConfig.default`: `str | None` → `str = "sonnet"`

---

## Infrastructure & Release

### Version
- `0.2.0` → `0.3.1` across `pyproject.toml`, `plugin.json`, `uv.lock`
- FastMCP dependency: `>=3.0.2` → `>=3.1.1,<4.0` (#399)

### CI/CD Workflows
- **`version-bump.yml`** (new) — auto patch-bumps `main` on integration
PR merge, force-syncs integration branch one patch ahead
- **`release.yml`** (new) — minor version bump + GitHub Release on merge
to `stable`
- **`codeql.yml`** (new) — CodeQL analysis for `stable` PRs (Python +
Actions)
- **`tests.yml`** — `merge_group:` trigger added; multi-OS now only for
`stable`

### PyPI Readiness
- `pyproject.toml`: `readme`, `license`, `authors`, `keywords`,
`classifiers`, `project.urls`, `hatch.build.targets.sdist` inclusion
list

### readOnlyHint Parallel Execution Fix
- All MCP tools but one are annotated `readOnlyHint=True`, which enables
Claude Code parallel tool execution (~7x speedup). The deliberate
exception is `wait_for_merge_queue`, which uses `readOnlyHint=False`
because it actually mutates queue state

### Tool Response Exception Boundary
- `track_response_size` decorator catches unhandled exceptions and
serializes them as `{"success": false, "subtype": "tool_exception"}` —
prevents FastMCP opaque error wrapping
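
The exception boundary can be sketched as a plain decorator. This is an illustration only: the decorator name, the synchronous wrapper, and the extra `error` field are assumptions, and the real `track_response_size` also tracks response size and may wrap async tools.

```python
import functools
import json


def tool_exception_boundary(func):
    """Turn unhandled exceptions into a structured JSON failure string."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            payload = {
                "success": False,
                "subtype": "tool_exception",
                # Assumed extra field, added here for debuggability.
                "error": f"{type(exc).__name__}: {exc}",
            }
            return json.dumps(payload)

    return wrapper


@tool_exception_boundary
def flaky_tool(value: int) -> str:
    """Hypothetical tool used only to exercise the boundary."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return json.dumps({"success": True, "value": value})
```

Because the failure is returned as an ordinary tool response, the caller sees the `tool_exception` subtype instead of a framework-level error wrapper.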

### SkillResult Subtype Normalization (#358)
- `_normalize_subtype()` gate eliminates dual-source contradiction
between CLI subtype and session outcome
- Class 2 upward: `SUCCEEDED + error_subtype → "success"` (drain-race
artifact)
- Class 1 downward: `non-SUCCEEDED + "success" → "empty_result"` /
`"missing_completion_marker"` / `"adjudicated_failure"`
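
The two normalization classes can be read as a small decision function. The sketch below is an assumption-heavy illustration: outcomes are modeled as plain strings, `"FAILED"` in the usage is a made-up outcome name, and the choice among the three downward subtypes is passed in as `downgrade_to` because the selection criteria are not given here.

```python
# The three downward targets named in the changelog.
DOWNGRADE_SUBTYPES = frozenset(
    {"empty_result", "missing_completion_marker", "adjudicated_failure"}
)


def normalize_subtype(
    outcome: str, cli_subtype: str, downgrade_to: str = "empty_result"
) -> str:
    """Reconcile the CLI subtype with the session outcome."""
    if downgrade_to not in DOWNGRADE_SUBTYPES:
        raise ValueError(f"unsupported downgrade target: {downgrade_to}")
    if outcome == "SUCCEEDED":
        # Class 2 upward: a successful session overrides an error subtype.
        return "success"
    if cli_subtype == "success":
        # Class 1 downward: a failed session cannot report "success".
        return downgrade_to
    return cli_subtype  # already consistent with the outcome
```

With a single gate like this, the outcome is the authority and the subtype is adjusted to agree with it, so the two sources can no longer contradict each other.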

---

## Test Coverage

**47 new test files** (+12,703 lines) covering:

| Area | Key Tests |
|------|-----------|
| Merge queue watcher state machine | `test_merge_queue.py` (226 lines) |
| Clone isolation × CI resolution | `test_clone_ci_contract.py`, `test_remote_resolver.py` |
| PostToolUse hook | `test_pretty_output.py` (1,516 lines, 40+ cases) |
| Branch protection + headless guards | `test_branch_protection_guard.py`, `test_headless_orchestration_guard.py` |
| Sub-recipe composition | 5 test files (schema, loading, validation, sprint mode × 2) |
| Telemetry formatter | `test_telemetry_formatter.py` (281 lines) |
| PR pipeline gates | `test_analyze_prs_gates.py`, `test_review_pr_fidelity.py` |
| Diff annotator | `test_diff_annotator.py` (242 lines) |
| Skill compliance | Output token format, genericization, loop-boundary guards |
| Release workflows | Structural contracts for `version-bump.yml`, `release.yml` |
| Issue content fidelity | Body-assembling skills must call `fetch_github_issue` per-issue |
| CI watcher scope | `test_ci_params.py`: workflow_id query param composition |

---

## Consolidated PRs

#293, #295, #314, #315, #316, #317, #318, #319, #323, #332, #336, #337,
#338, #339, #341, #343, #351, #358, #359, #360, #361, #362, #363, #366,
#368, #370, #375, #377, #378, #379, #380, #388, #389, #390, #391, #392,
#393, #395, #396, #397, #399, #405, #406

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>