Rectify: collapse-issues Content Fidelity — Issue Body Data Lineage#388

Merged
Trecek merged 4 commits into integration from
collapse-issues-skill-produces-shallow-hyperlink-summaries-i/372
Mar 15, 2026

@Trecek Trecek commented Mar 15, 2026

Summary

collapse-issues assembled multi-issue combined bodies by fetching all bodies in bulk via gh issue list --json body — which truncates content at ~256 characters — and then instructing the LLM to insert <full body of issue N, verbatim> into the output. The angle-bracket syntax signaled a fill-in-the-blank template slot, causing the LLM to substitute a one-sentence summary or hyperlink rather than the actual body text. No contract test caught either defect.

The fix establishes a content-fidelity lineage contract: collapse-issues now calls fetch_github_issue per-issue (REST endpoint, full body) and uses explicit verbatim-paste language with a COPY MODE instruction. Three new contract tests enforce this for collapse-issues specifically, and a new cross-skill sweep (test_issue_content_fidelity.py) makes the defect structurally impossible in any future skill that assembles ## From #N body sections.
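
As a rough illustration of the cross-skill sweep described above, the check could look something like the sketch below. The function name, the regular expressions, and the exact wording matched are assumptions for illustration only; the actual assertions in test_issue_content_fidelity.py are not shown in this PR description.

```python
import re

# Hypothetical marker and patterns; the real test file may use different ones.
BODY_SECTION_MARKER = "## From #"
ANGLE_PLACEHOLDER = re.compile(r"<[^<>\n]*\bbody\b[^<>\n]*>", re.IGNORECASE)
NEVER_SUMMARIZE = re.compile(r"NEVER[^\n]*summarize", re.IGNORECASE)


def fidelity_violations(skill_text: str) -> list[str]:
    """Return the contract violations found in one skill's SKILL.md text."""
    if BODY_SECTION_MARKER not in skill_text:
        return []  # the skill does not assemble issue bodies, nothing to enforce
    violations = []
    if "fetch_github_issue" not in skill_text:
        violations.append("missing per-issue fetch_github_issue call")
    if ANGLE_PLACEHOLDER.search(skill_text):
        violations.append("angle-bracket body placeholder present")
    if not NEVER_SUMMARIZE.search(skill_text):
        violations.append("missing NEVER summarize constraint")
    return violations
```

A sweep would call this once per bundled SKILL.md and fail on any non-empty result, so a new skill that adds `## From #N` sections is covered without writing a dedicated test.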

Requirements

FETCH — Source Issue Content Retrieval

  • REQ-FETCH-001: The skill must fetch the full body of each source issue via gh issue view N --json body before composing the collapsed issue.
  • REQ-FETCH-002: The skill must not summarize, truncate, or paraphrase source issue content during collapse.

INLINE — Content Inlining

  • REQ-INLINE-001: The collapsed issue body must contain the complete body text from every source issue, inlined under clearly labeled sections.
  • REQ-INLINE-002: Structured sections from source issues (requirements, acceptance criteria, design proposals, etc.) must be preserved verbatim in the collapsed output.
  • REQ-INLINE-003: Each inlined section must identify its source issue number for provenance (e.g., "From #29").

SELF — Self-Containment

  • REQ-SELF-001: A reader of the collapsed issue must be able to understand all source content without clicking through to original issues.
  • REQ-SELF-002: Cross-reference links to originals must serve as provenance markers, not as substitutes for inlined content.

STALE — Staleness Handling

  • REQ-STALE-001: When a source issue is known to be outdated, the collapsed issue must include a staleness note adjacent to that source's inlined content.
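
Taken together, the INLINE, SELF, and STALE requirements describe a simple assembly rule. The sketch below is one way to express it in code, assuming the bodies have already been fetched in full. `SourceIssue`, `assemble_combined_body`, and the staleness-note wording are hypothetical names chosen for illustration; in the skill itself this assembly is done by the LLM under the COPY MODE instruction, not by Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceIssue:
    number: int
    body: str             # full body as fetched, never truncated or summarized
    stale_note: str = ""  # non-empty only when the issue is known to be outdated


def assemble_combined_body(issues: list[SourceIssue]) -> str:
    """Inline every source body verbatim under a provenance heading."""
    sections = []
    for issue in issues:
        parts = [f"## From #{issue.number}"]  # provenance (REQ-INLINE-003)
        if issue.stale_note:
            # Staleness note sits next to the inlined content (REQ-STALE-001).
            parts.append(f"> Staleness note: {issue.stale_note}")
        parts.append(issue.body)  # verbatim copy, no paraphrase (REQ-FETCH-002)
        sections.append("\n\n".join(parts))
    return "\n\n".join(sections)
```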

Architecture Impact

Data Lineage Diagram

%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
flowchart LR
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    subgraph GitHub ["GitHub API"]
        BULK["● gh issue list<br/>━━━━━━━━━━<br/>number, title, labels<br/>body removed from query"]
        REST["fetch_github_issue<br/>━━━━━━━━━━<br/>GET /issues/{N} (REST)<br/>Full untruncated body"]
    end

    subgraph Skill ["● collapse-issues SKILL.md"]
        STEP3["● Step 3: Metadata Fetch<br/>━━━━━━━━━━<br/>number, title, labels only<br/>no body — grouping input"]
        STEP4["Step 4: LLM Grouping<br/>━━━━━━━━━━<br/>title + labels scoring<br/>form candidate groups"]
        STEP5["★ Step 5: Per-Issue Fetch<br/>━━━━━━━━━━<br/>fetch_github_issue per issue<br/>include_comments=true"]
        FETCHED["★ fetched_content[N]<br/>━━━━━━━━━━<br/>content.body field<br/>full untruncated text"]
        STEP6B["● Step 6b: Body Assembly<br/>━━━━━━━━━━<br/>SWITCH TO COPY MODE<br/>verbatim paste only"]
        NEVER["● NEVER Block<br/>━━━━━━━━━━<br/>no summarize/paraphrase<br/>no angle-bracket syntax<br/>no bulk body field"]
    end

    subgraph Output ["Combined Issue"]
        COMBINED["Combined Issue Body<br/>━━━━━━━━━━<br/>## From #N sections<br/>full verbatim content"]
    end

    subgraph Tests ["★ Contract Enforcement"]
        T1["● test_collapse_issues_contracts.py<br/>━━━━━━━━━━<br/>+ uses_per_issue_fetch<br/>+ never_summarize<br/>+ no_angle_bracket_placeholder"]
        T2["★ test_issue_content_fidelity.py<br/>━━━━━━━━━━<br/>cross-skill sweep:<br/>## From # → fetch_github_issue<br/>no angle-bracket syntax<br/>NEVER summarize block"]
        T3["● test_github_ops.py<br/>━━━━━━━━━━<br/>stale pytest.skip guards<br/>removed — tests active"]
    end

    BULK -->|"number,title,labels"| STEP3
    STEP3 -->|"issue list (no body)"| STEP4
    STEP4 -->|"qualifying groups"| STEP5
    STEP5 -->|"per-issue REST call"| REST
    REST -->|"full body"| FETCHED
    FETCHED -->|"content field"| STEP6B
    NEVER -.->|"constrains"| STEP6B
    STEP6B -->|"verbatim sections"| COMBINED

    T1 -.->|"enforces"| STEP5
    T1 -.->|"enforces"| NEVER
    T2 -.->|"cross-skill guard"| STEP5
    T3 -.->|"activates skipped tests"| T1

    class BULK handler;
    class REST stateNode;
    class STEP3 handler;
    class STEP4 phase;
    class STEP5 newComponent;
    class FETCHED newComponent;
    class STEP6B newComponent;
    class NEVER newComponent;
    class COMBINED output;
    class T1 detector;
    class T2 newComponent;
    class T3 detector;

Color Legend:

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Input | Data origins and entry points |
| Orange | Handler | Existing fetch/processing steps |
| Purple | Phase | LLM control and grouping analysis |
| Green | New/Modified | ★ New or ● modified components |
| Teal | Data | REST source (authoritative data) |
| Dark Teal | Output | Combined issue result |
| Red | Contract Tests | Enforcement guards |

Closes #372

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-372-20260314-205033-708941/temp/rectify/rectify_collapse-issues-content-fidelity_2026-03-14_205033.md

Token Usage Summary

Token Summary

No token data captured for this run.

🤖 Generated with Claude Code via AutoSkillit

@Trecek Trecek left a comment

AutoSkillit PR Review — Verdict: changes_requested

assert "collapse-issues" in text, "triage-issues must document optional --collapse integration"


def test_collapse_issues_uses_per_issue_fetch():

[warning] cohesion: test_collapse_issues_uses_per_issue_fetch, test_collapse_issues_never_summarize, and test_collapse_issues_no_angle_bracket_body_placeholder duplicate contract coverage already provided cross-skill by test_issue_content_fidelity.py. The same contracts are now enforced in two places for collapse-issues. Consider whether this skill-specific duplication is intentional (defense-in-depth) or should be removed.

@Trecek Trecek left a comment

AutoSkillit review produced 2 findings (1 actionable, 1 requiring a human decision). See inline comments.

Actionable (changes_requested):

  • L154: Local re import inside test body — move to module level for consistency

Needs decision:

  • L119: Skill-specific tests (test_collapse_issues_uses_per_issue_fetch, test_collapse_issues_never_summarize, test_collapse_issues_no_angle_bracket_body_placeholder) duplicate cross-skill coverage in test_issue_content_fidelity.py — intentional defense-in-depth or removable redundancy?

@Trecek Trecek enabled auto-merge March 15, 2026 06:16
Trecek and others added 4 commits March 15, 2026 00:56
- Replace bulk `gh issue list --json body` (truncates at ~256 chars) with
  per-issue `fetch_github_issue` call (REST endpoint, full body) — new Step 5
- Renumber Steps 5-8 → 6-9 after inserting per-issue fetch step
- Replace angle-bracket `<full body of issue N, verbatim>` placeholder with
  explicit verbatim-paste instructions and SWITCH TO COPY MODE directive
- Add four NEVER constraints: no summarize/paraphrase/abbreviate, no hyperlink
  substitution, no angle-bracket body syntax, no body from bulk list endpoint
- Add three contract tests to test_collapse_issues_contracts.py:
  test_collapse_issues_uses_per_issue_fetch,
  test_collapse_issues_never_summarize,
  test_collapse_issues_no_angle_bracket_body_placeholder
- Remove stale pytest.skip guards from test_github_ops.py (skill now exists)
- Add test_issue_content_fidelity.py: cross-skill sweep ensuring any skill
  with ## From # body-assembly sections uses fetch_github_issue and forbids
  angle-bracket copy syntax (immune to future regressions)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add 'do not output any prose' guards to Step 5 and Step 7 loop
constructs that contain fetch_github_issue tool calls, satisfying
the test_no_text_then_tool_in_any_step compliance contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ontracts

Local imports of `re` inside each test body were inconsistent with the
module-level import pattern used elsewhere in the contracts test suite.
Consolidate to a single module-level import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Trecek Trecek force-pushed the collapse-issues-skill-produces-shallow-hyperlink-summaries-i/372 branch from 2d94f0b to 33d9f96 Compare March 15, 2026 07:57
@Trecek Trecek added this pull request to the merge queue Mar 15, 2026
Merged via the queue into integration with commit ffbc9e4 Mar 15, 2026
2 checks passed
@Trecek Trecek deleted the collapse-issues-skill-produces-shallow-hyperlink-summaries-i/372 branch March 15, 2026 08:00
Trecek added a commit that referenced this pull request Mar 15, 2026
…, Headless Isolation (#404)

## Summary

Integration rollup of **43 PRs** (#293–#406) consolidating **62
commits** across **291 files** (+27,909 / −6,040 lines). This release
advances AutoSkillit from v0.2.0 to v0.3.1 with GitHub merge queue
integration, sub-recipe composition, a PostToolUse output reformatter,
headless session isolation guards, and comprehensive pipeline
observability — plus 24 new bundled skills, 3 new MCP tools, and 47 new
test files.

---

## Major Features

### GitHub Merge Queue Integration (#370, #362, #390)
- New `wait_for_merge_queue` MCP tool — polls a PR through GitHub's
merge queue until merged, ejected, or timed out (default 600s). Uses
REST + GraphQL APIs with stuck-queue detection and auto-merge
re-enrollment
- New `DefaultMergeQueueWatcher` L1 service (`execution/merge_queue.py`)
— never raises; all outcomes are structured results
- `parse_merge_queue_response()` pure function for GraphQL queue entry
parsing
- New `auto_merge` ingredient in `implementation.yaml` and
`remediation.yaml` — enrolls PRs in the merge queue after CI passes
- Full queue-mode path added to `merge-prs.yaml`: detect queue → enqueue
→ wait → handle ejections → re-enter
- `analyze-prs` skill gains Step 0.5 (merge queue detection) and Step
1.5 (CI/review eligibility filtering)
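
The "never raises; all outcomes are structured results" behavior described above can be sketched as a polling loop. This is a minimal illustration, not the code in `execution/merge_queue.py`: `QueueOutcome`, `wait_for_queue`, the state strings, and the 15-second poll interval are all assumptions, and the real REST/GraphQL calls, stuck-queue detection, and auto-merge re-enrollment are replaced by an injected `fetch_state` callable.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QueueOutcome:
    status: str       # "merged", "ejected", "timed_out", or "error"
    detail: str = ""


def wait_for_queue(
    fetch_state: Callable[[], str],
    timeout_s: float = 600.0,        # default taken from the description above
    poll_interval_s: float = 15.0,   # assumed value
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> QueueOutcome:
    """Poll until the PR is merged, ejected, or the timeout passes. Never raises."""
    deadline = clock() + timeout_s
    while True:
        try:
            state = fetch_state()
        except Exception as exc:  # convert failures into a structured result
            return QueueOutcome("error", str(exc))
        if state in ("merged", "ejected"):
            return QueueOutcome(state)
        if clock() >= deadline:
            return QueueOutcome("timed_out", f"last state: {state}")
        sleep(poll_interval_s)
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting, which is one plausible way to exercise a watcher state machine in unit tests.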

### Sub-Recipe Composition (#380)
- Recipe steps can now reference sub-recipes via `sub_recipe` + `gate`
fields — lazy-loaded and merged at validation time
- Composition engine in `recipe/_api.py`: `_merge_sub_recipe()` inlines
sub-recipe steps with safe name-prefixing and route remapping (`done` →
parent's `on_success`, `escalate` → parent's `on_failure`)
- `_build_active_recipe()` evaluates gate ingredients against
overrides/defaults; dual validation runs on both active and combined
recipes
- First sub-recipe: `sprint-prefix.yaml` — triage → plan → confirm →
dispatch workflow, gated by `sprint_mode` ingredient (hidden, default
false)
- Both `implementation.yaml` and `remediation.yaml` gain `sprint_entry`
placeholder step
- New semantic rules: `unknown-sub-recipe` (ERROR),
`circular-sub-recipe` (ERROR) with DFS cycle detection
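
For readers unfamiliar with DFS cycle detection, the idea behind a rule like `circular-sub-recipe` can be sketched as follows. The function name and the input shape (a mapping from recipe name to the sub-recipes it references) are assumptions for illustration; the real rule works on parsed recipe objects.

```python
from typing import Optional


def find_sub_recipe_cycle(refs: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one reference cycle as a list of recipe names, or None if acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, fully explored
    color = {name: WHITE for name in refs}
    path: list[str] = []

    def visit(name: str) -> Optional[list[str]]:
        color[name] = GRAY
        path.append(name)
        for child in refs.get(name, []):
            state = color.get(child, WHITE)
            if state == GRAY:
                # child is already on the current path, so we found a cycle
                return path[path.index(child):] + [child]
            if state == WHITE:
                cycle = visit(child)
                if cycle:
                    return cycle
        path.pop()
        color[name] = BLACK
        return None

    for name in list(refs):
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None
```

Returning the offending path, rather than a bare boolean, lets a validator report which references form the loop.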

### PostToolUse Output Reformatter (#293, #405)
- `pretty_output.py` — new 671-line PostToolUse hook that rewrites raw
MCP JSON responses to Markdown-KV before Claude consumes them (30–77%
token overhead reduction)
- Dedicated formatters for 11 high-traffic tools (`run_skill`,
`run_cmd`, `test_check`, `merge_worktree`, `get_token_summary`, etc.)
plus a generic KV formatter for remaining tools
- Pipeline vs. interactive mode detection via hook config file
- Unwraps Claude Code's `{"result": "<json-string>"}` envelope before
dispatching
- 1,516-line test file with 40+ behavioral tests
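
The envelope-unwrapping and key-value rewriting steps can be illustrated with a small sketch. It assumes a simple `**key:** value` line format; the actual Markdown-KV layout produced by `pretty_output.py`, and its per-tool formatters, are not specified here, and both function names are hypothetical.

```python
import json


def unwrap_envelope(raw: str) -> object:
    """Unwrap {"result": "<json-string>"} when present, else return the parsed payload."""
    payload = json.loads(raw)
    is_envelope = (
        isinstance(payload, dict)
        and set(payload) == {"result"}
        and isinstance(payload["result"], str)
    )
    if not is_envelope:
        return payload
    try:
        return json.loads(payload["result"])
    except json.JSONDecodeError:
        return payload["result"]  # plain string result, not nested JSON


def to_markdown_kv(payload: object) -> str:
    """Render a flat dict as one '**key:** value' line per field (assumed format)."""
    if not isinstance(payload, dict):
        return str(payload)
    lines = []
    for key, value in payload.items():
        if isinstance(value, (dict, list)):
            value = json.dumps(value, separators=(",", ":"))  # keep nested data compact
        lines.append(f"**{key}:** {value}")
    return "\n".join(lines)
```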

### Headless Session Isolation (#359, #393, #397, #405, #406)
- **Env isolation**: `build_sanitized_env()` strips
`AUTOSKILLIT_PRIVATE_ENV_VARS` from subprocess environments, preventing
`AUTOSKILLIT_HEADLESS=1` from leaking into test runners
- **CWD path contamination defense**: `_inject_cwd_anchor()` anchors all
relative paths to session CWD; `_validate_output_paths()` checks
structured output tokens against CWD prefix; `_scan_jsonl_write_paths()`
post-session scanner catches actual Write/Edit/Bash tool calls outside
CWD
- **Headless orchestration guard**: new PreToolUse hook blocks
`run_skill`/`run_cmd`/`run_python` when `AUTOSKILLIT_HEADLESS=1`,
enforcing Tier 1/Tier 2 nesting invariant
- **`_require_not_headless()` server-side guard**: blocks 10
orchestration-only tools from headless sessions at the handler layer
- **Unified error response contract**: `headless_error_result()`
produces consistent 9-field responses;
`_build_headless_error_response()` canonical builder for all failure
paths in `tools_integrations.py`
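
Two of the defenses above, env isolation and CWD prefix checking, reduce to small pure functions. The sketch below shows the general shape only. The set contents, `sanitized_env`, and `is_within_cwd` are hypothetical stand-ins; the source confirms only that `AUTOSKILLIT_HEADLESS` must not leak and that output paths are checked against the session CWD.

```python
import os
from collections.abc import Mapping
from pathlib import Path
from typing import Optional

# Assumed contents; the real AUTOSKILLIT_PRIVATE_ENV_VARS may hold more names.
PRIVATE_ENV_VARS = frozenset({"AUTOSKILLIT_HEADLESS"})


def sanitized_env(base: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Copy the environment without private variables, for use in subprocesses."""
    source = os.environ if base is None else base
    return {k: v for k, v in source.items() if k not in PRIVATE_ENV_VARS}


def is_within_cwd(path: str, cwd: str) -> bool:
    """True when path, resolved against cwd, stays inside cwd."""
    root = Path(cwd).resolve()
    candidate = Path(path)
    if not candidate.is_absolute():
        candidate = root / candidate
    return candidate.resolve().is_relative_to(root)
```

Resolving before comparing means that `..` segments cannot be used to slip past a plain string-prefix check.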

### Cook UX Overhaul (#375, #363)
- `open_kitchen` now accepts optional `name` + `overrides` — opens
kitchen AND loads recipe in a single call
- Pre-launch terminal preview with ANSI-colored flow diagram and
ingredients table via new `cli/_ansi.py` module
- `--dangerously-skip-permissions` warning banner with interactive
confirmation prompt
- Randomized session greetings from themed pools
- Orchestrator prompt rewritten: recipe YAML no longer injected via
`--append-system-prompt`; session calls `open_kitchen('{recipe_name}')`
as first action
- Conversational ingredient collection replaces mechanical per-field
prompting

---

## New MCP Tools

| Tool | Gate | Description |
|------|------|-------------|
| `wait_for_merge_queue` | Kitchen | Polls PR through GitHub merge queue (REST + GraphQL) |
| `set_commit_status` | Kitchen | Posts GitHub Commit Status to a SHA for review-first gating |
| `get_quota_events` | Ungated | Surfaces quota guard decisions from `quota_events.jsonl` |

---

## Pipeline Observability (#318, #341)

- **`TelemetryFormatter`** (`pipeline/telemetry_fmt.py`) — single source
of truth for all telemetry rendering; replaces dual-formatter
anti-pattern. Four rendering modes, including Markdown table, terminal
table, and compact KV (for PostToolUse hook)
- `get_token_summary` and `get_timing_summary` gain `format` parameter
(`"json"` | `"table"`)
- `wall_clock_seconds` merged into token summary output — see duration
alongside token counts in one call
- **Telemetry clear marker**: `write_telemetry_clear_marker()` /
`read_telemetry_clear_marker()` prevent token accounting drift on MCP
server restart after `clear=True`
- **Quota event logging**: `quota_check.py` hook now writes structured
JSONL events (`cache_miss`, `parse_error`, `blocked`, `approved`) to
`quota_events.jsonl`

---

## CI Watcher & Remote Resolution Fixes (#395, #406)

- **`CIRunScope` value object** — carries `workflow` + `head_sha` scope;
replaces bare `head_sha` parameter across all CI watcher signatures
- **Workflow filter**: `wait_for_ci` and `get_ci_status` accept
`workflow` parameter (falls back to project-level `config.ci.workflow`),
preventing unrelated workflows (version bumps, labelers) from satisfying
CI checks
- **`FAILED_CONCLUSIONS` expanded**: `failure` → `{failure, timed_out,
startup_failure, cancelled}`
- **Canonical remote resolver** (`execution/remote_resolver.py`):
`resolve_remote_repo()` with `REMOTE_PRECEDENCE = (upstream, origin)` —
correctly resolves `owner/repo` after `clone_repo` sets `origin` to
`file://` isolation URL
- **Clone isolation fix**: `clone_repo` now always clones from remote
URL (never local path); sets `origin=file:///<clone>` for isolation and
`upstream=<real_url>` for push/CI operations
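
The remote-precedence idea can be shown in a few lines: prefer `upstream`, fall back to `origin`, and skip any remote that does not parse as a GitHub URL (such as the `file://` isolation URL). This is a sketch under assumptions; `resolve_owner_repo`, the regular expression, and the dict-of-remotes input are illustrative and are not the signature of `resolve_remote_repo()`.

```python
import re
from typing import Optional

REMOTE_PRECEDENCE = ("upstream", "origin")

# Matches HTTPS and SSH GitHub URLs; an assumed pattern for illustration.
_GITHUB_URL = re.compile(
    r"github\.com[:/](?P<owner>[^/\s]+)/(?P<repo>[^/\s]+?)(?:\.git)?/?$"
)


def resolve_owner_repo(remotes: dict[str, str]) -> Optional[str]:
    """Return 'owner/repo' from the first remote in precedence order that parses."""
    for name in REMOTE_PRECEDENCE:
        match = _GITHUB_URL.search(remotes.get(name, ""))
        if match:
            return f"{match['owner']}/{match['repo']}"
    return None
```

With `origin` pointing at a local clone and `upstream` at the real repository, the function returns the upstream `owner/repo`, which is the case the clone isolation fix depends on.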

---

## PR Pipeline Gates (#317, #343)

- **`pipeline/pr_gates.py`**: `is_ci_passing()`, `is_review_passing()`,
`partition_prs()` — partitions PRs into
eligible/CI-blocked/review-blocked with human-readable reasons
- **`pipeline/fidelity.py`**: `extract_linked_issues()`
(Closes/Fixes/Resolves patterns), `is_valid_fidelity_finding()` schema
validation
- **`check_pr_mergeable`** now returns `mergeable_status` field
alongside boolean
- **`release_issue`** gains `target_branch` + `staged_label` parameters
for staged issue lifecycle on non-default branches (#392)
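
A minimal sketch of Closes/Fixes/Resolves extraction follows. It assumes the standard GitHub closing keywords in their present, plural, and past forms and only same-repository `#N` references; the actual `extract_linked_issues()` in `pipeline/fidelity.py` may accept more or fewer forms, and `linked_issues` is a hypothetical name.

```python
import re

# close/closes/closed, fix/fixes/fixed, resolve/resolves/resolved, then "#<number>"
_LINK_PATTERN = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)


def linked_issues(pr_body: str) -> list[int]:
    """Return linked issue numbers in first-seen order, without duplicates."""
    seen: list[int] = []
    for number in _LINK_PATTERN.findall(pr_body):
        n = int(number)
        if n not in seen:
            seen.append(n)
    return seen
```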

---

## Recipe System Changes

### Structural
- `RecipeIngredient.hidden` field — excluded from ingredients table
(used for internal flags like `sprint_mode`)
- `Recipe.experimental` flag parsed from YAML
- `_TERMINAL_TARGETS` moved to `schema.py` as single source of truth
- `format_ingredients_table()` with sorted display order (required →
auto-detect → flags → optional → constants)
- Diagram rendering engine (~670 lines) removed from `diagrams.py` —
rendering now handled by `/render-recipe` skill; format version bumped
to v7

### Recipe YAML Changes
- **Deleted**: `audit-and-fix.yaml`, `batch-implementation.yaml`,
`bugfix-loop.yaml`
- **Renamed**: `pr-merge-pipeline.yaml` → `merge-prs.yaml`
- **`implementation.yaml`**: merge queue steps,
`auto_merge`/`sprint_mode` ingredients, `base_branch` default → `""`
(auto-detect), CI workflow filter, `extract_pr_number` step
- **`remediation.yaml`**: `topic` → `task` rename, merge queue steps,
`dry_walkthrough` retries:3 with forward-only routing, `verify` → `test`
rename
- **`merge-prs.yaml`**: full queue-mode path, `open-integration-pr` step
(replaces `create-review-pr`), post-PR mergeability polling, review
cycle with `resolve-review` retries

### New Semantic Rules
- `missing-output-patterns` (WARNING) — flags `run_skill` steps without
`expected_output_patterns`
- `unknown-sub-recipe` (ERROR) — validates sub-recipe references exist
- `circular-sub-recipe` (ERROR) — DFS cycle detection
- `unknown-skill-command` (ERROR) — validates skill names against
bundled set
- `telemetry-before-open-pr` (WARNING) — ensures telemetry step precedes
`open-pr`

---

## New Skills (24)

### Architecture Lens Family (13)
`arch-lens-c4-container`, `arch-lens-concurrency`,
`arch-lens-data-lineage`, `arch-lens-deployment`,
`arch-lens-development`, `arch-lens-error-resilience`,
`arch-lens-module-dependency`, `arch-lens-operational`,
`arch-lens-process-flow`, `arch-lens-repository-access`,
`arch-lens-scenarios`, `arch-lens-security`, `arch-lens-state-lifecycle`

### Audit Family (5)
`audit-arch`, `audit-bugs`, `audit-cohesion`, `audit-defense-standards`,
`audit-tests`

### Planning & Diagramming (3)
`elaborate-phase`, `make-arch-diag`, `make-req`

### Bug/Guard Lifecycle (2)
`design-guards`, `verify-diag`

### Pipeline (1)
`open-integration-pr` — creates integration PRs with per-PR details,
arch-lens diagrams, carried-forward `Closes #N` references, and
auto-closes collapsed PRs

### Sprint Planning (1 — gated by sub-recipe)
`sprint-planner` — selects a focused, conflict-free sprint from a triage
manifest

---

## Skill Modifications (Highlights)

- **`analyze-prs`**: merge queue detection, CI/review eligibility
filtering, queue-mode ordering
- **`dry-walkthrough`**: Step 4.5 Historical Regression Check (git
history mining + GitHub issue cross-reference)
- **`review-pr`**: deterministic diff annotation via
`diff_annotator.py`, echo-primary-obligation step, post-completion
confirmation, degraded-mode narration
- **`collapse-issues`**: content fidelity enforcement — per-issue
`fetch_github_issue` calls, copy-mode body assembly (#388)
- **`prepare-issue`**: multi-keyword dedup search, numbered candidate
selection, extend-existing-issue flow
- **`resolve-review`**: GraphQL thread auto-resolution after addressing
findings (#379)
- **`resolve-merge-conflicts`**: conflict resolution decision report
with per-file log (#389)
- **Cross-skill**: output tokens migrated to `key = value` format;
code-index paths made generic with fallback notes; arch-lens references
fully qualified; anti-prose guards at loop boundaries

---

## CLI & Hooks

### New CLI Commands
- `autoskillit install` — plugin installation + cache refresh
- `autoskillit upgrade` — `.autoskillit/scripts/` →
`.autoskillit/recipes/` migration

### CLI Changes
- `doctor`: plugin-aware MCP check, PostToolUse hook scanning, `--fix`
flag removed
- `init`: GitHub repo prompt, `.secrets.yaml` template, plugin-aware
registration
- `chefs-hat`: pre-launch banner, `--dangerously-skip-permissions`
confirmation
- `recipes render`: repurposed from generator to viewer (delegates to
`/render-recipe`)
- `serve`: server import deferred to after `configure_logging()` to
prevent stdout corruption

### New Hooks
- `branch_protection_guard.py` (PreToolUse) — denies
`merge_worktree`/`push_to_remote` targeting protected branches
- `headless_orchestration_guard.py` (PreToolUse) — blocks orchestration
tools in headless sessions
- `pretty_output.py` (PostToolUse) — MCP JSON → Markdown-KV reformatter

### Hook Infrastructure
- `HookDef.event_type` field — registry now handles both PreToolUse and
PostToolUse
- `generate_hooks_json()` groups entries by event type
- `_evict_stale_autoskillit_hooks` and `sync_hooks_to_settings` made
event-type-agnostic

---

## Core & Config

### New Core Modules
- `core/branch_guard.py` — `is_protected_branch()` pure function
- `core/github_url.py` — `parse_github_repo()` +
`normalize_owner_repo()` canonical parsers

### Core Type Expansions
- `AUTOSKILLIT_PRIVATE_ENV_VARS` frozenset
- `WORKER_TOOLS` / `HEADLESS_BLOCKED_UNGATED_TOOLS` split from
`UNGATED_TOOLS`
- `TOOL_CATEGORIES` — categorized listing for `open_kitchen` response
- `CIRunScope` — immutable scope for CI watcher calls
- `MergeQueueWatcher` protocol
- `SkillResult.cli_subtype` + `write_path_warnings` fields
- `SubprocessRunner.env` parameter

### Config
- `safety.protected_branches`: `[main, integration, stable]`
- `github.staged_label`: `"staged"`
- `ci.workflow`: workflow filename filter (e.g., `"tests.yml"`)
- `branching.default_base_branch`: `"integration"` → `"main"`
- `ModelConfig.default`: `str | None` → `str = "sonnet"`
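
As an illustration of how a pure branch check might consume the `safety.protected_branches` setting listed above, consider the sketch below. The function name, its signature, and the `refs/heads/` normalization are assumptions; only the default branch list comes from the config entry.

```python
from collections.abc import Iterable

# Defaults taken from the `safety.protected_branches` config entry.
DEFAULT_PROTECTED_BRANCHES = ("main", "integration", "stable")


def is_protected(
    branch: str, protected: Iterable[str] = DEFAULT_PROTECTED_BRANCHES
) -> bool:
    """Pure check with no git calls and no I/O."""
    name = branch.strip().removeprefix("refs/heads/")  # assumed normalization
    return name in set(protected)
```

Keeping the check pure makes it easy to reuse from both a hook and a server-side handler.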

---

## Infrastructure & Release

### Version
- `0.2.0` → `0.3.1` across `pyproject.toml`, `plugin.json`, `uv.lock`
- FastMCP dependency: `>=3.0.2` → `>=3.1.1,<4.0` (#399)

### CI/CD Workflows
- **`version-bump.yml`** (new) — auto patch-bumps `main` on integration
PR merge, force-syncs integration branch one patch ahead
- **`release.yml`** (new) — minor version bump + GitHub Release on merge
to `stable`
- **`codeql.yml`** (new) — CodeQL analysis for `stable` PRs (Python +
Actions)
- **`tests.yml`** — `merge_group:` trigger added; multi-OS now only for
`stable`

### PyPI Readiness
- `pyproject.toml`: `readme`, `license`, `authors`, `keywords`,
`classifiers`, `project.urls`, `hatch.build.targets.sdist` inclusion
list

### readOnlyHint Parallel Execution Fix
- All MCP tools annotated `readOnlyHint=True` — enables Claude Code
parallel tool execution (~7x speedup). One deliberate exception:
`wait_for_merge_queue` uses `readOnlyHint=False` (actually mutates queue
state)

### Tool Response Exception Boundary
- `track_response_size` decorator catches unhandled exceptions and
serializes them as `{"success": false, "subtype": "tool_exception"}` —
prevents FastMCP opaque error wrapping
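
The exception boundary can be pictured as a decorator that turns any unhandled exception into the structured payload quoted above. The sketch is synchronous for brevity and uses a hypothetical name, `exception_boundary`; the `error` field is an assumed addition, and the real `track_response_size` decorator also tracks response size and presumably wraps async handlers.

```python
import functools
import json
from typing import Any, Callable


def exception_boundary(tool: Callable[..., str]) -> Callable[..., str]:
    """Wrap a tool handler so unhandled exceptions become a JSON error result."""

    @functools.wraps(tool)
    def wrapper(*args: Any, **kwargs: Any) -> str:
        try:
            return tool(*args, **kwargs)
        except Exception as exc:
            return json.dumps(
                {
                    "success": False,
                    "subtype": "tool_exception",
                    "error": f"{type(exc).__name__}: {exc}",  # assumed extra field
                }
            )

    return wrapper
```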

### SkillResult Subtype Normalization (#358)
- `_normalize_subtype()` gate eliminates dual-source contradiction
between CLI subtype and session outcome
- Class 2 upward: `SUCCEEDED + error_subtype → "success"` (drain-race
artifact)
- Class 1 downward: `non-SUCCEEDED + "success" → "empty_result"` /
`"missing_completion_marker"` / `"adjudicated_failure"`

---

## Test Coverage

**47 new test files** (+12,703 lines) covering:

| Area | Key Tests |
|------|-----------|
| Merge queue watcher state machine | `test_merge_queue.py` (226 lines) |
| Clone isolation × CI resolution | `test_clone_ci_contract.py`, `test_remote_resolver.py` |
| PostToolUse hook | `test_pretty_output.py` (1,516 lines, 40+ cases) |
| Branch protection + headless guards | `test_branch_protection_guard.py`, `test_headless_orchestration_guard.py` |
| Sub-recipe composition | 5 test files (schema, loading, validation, sprint mode × 2) |
| Telemetry formatter | `test_telemetry_formatter.py` (281 lines) |
| PR pipeline gates | `test_analyze_prs_gates.py`, `test_review_pr_fidelity.py` |
| Diff annotator | `test_diff_annotator.py` (242 lines) |
| Skill compliance | Output token format, genericization, loop-boundary guards |
| Release workflows | Structural contracts for `version-bump.yml`, `release.yml` |
| Issue content fidelity | Body-assembling skills must call `fetch_github_issue` per-issue |
| CI watcher scope | `test_ci_params.py` (workflow_id query param composition) |

---

## Consolidated PRs

#293, #295, #314, #315, #316, #317, #318, #319, #323, #332, #336, #337,
#338, #339, #341, #343, #351, #358, #359, #360, #361, #362, #363, #366,
#368, #370, #375, #377, #378, #379, #380, #388, #389, #390, #391, #392,
#393, #395, #396, #397, #399, #405, #406

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>