Implementation Plan: Pipeline Observability — Quota Guard Logging and Per-Step Elapsed Time #318

Merged
Trecek merged 5 commits into integration from
combined-pipeline-observability-quota-guard-logging-and-per/302
Mar 10, 2026

@Trecek Trecek commented Mar 10, 2026

Summary

This PR implements two related pipeline observability improvements from issue #302 (combining #218 and #65):

  1. Quota guard event logging (feat: Add quota guard observability to diagnostic logging system #218): Instruments hooks/quota_check.py to write a structured event to quota_events.jsonl at every decision point (approved, blocked, cache_miss, parse_error), giving operators a diagnostic trail for quota-guard activity.

  2. Per-step elapsed time in token summary (feat: report per-step elapsed time in token summary #65): Surfaces the elapsed_seconds field already stored in TokenEntry through two rendering paths: the _fmt_get_token_summary formatter in pretty_output.py (what operators see inline during pipeline runs) and _format_token_summary in tools_status.py (what write_telemetry_files writes to token_summary.md).
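The fail-open event write described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `write_quota_event` is a hypothetical stand-in for the hook's private helper, and the exact field handling is an assumption based on the event schema quoted later in the review (`ts`, `event`, `threshold`, plus optional fields).

```python
from __future__ import annotations

import json
from datetime import datetime, timezone
from pathlib import Path


def write_quota_event(log_dir: Path | None, event: str, threshold: float, **fields) -> None:
    """Append one JSON line to quota_events.jsonl; never raise (fail-open)."""
    if log_dir is None:
        # No resolvable log directory: skip logging, do not block the hook.
        return
    try:
        log_dir.mkdir(parents=True, exist_ok=True)
        record = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "event": event,  # approved | blocked | cache_miss | parse_error
            "threshold": threshold,
        }
        # Optional fields (utilization, resets_at, ...) appear only when provided.
        record.update({k: v for k, v in fields.items() if v is not None})
        with (log_dir / "quota_events.jsonl").open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
    except Exception:
        # Observability must never change the approve/block decision.
        pass
```

Swallowing every exception is deliberate in this sketch: a full disk or an unwritable directory should cost a log line, not a pipeline step.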

Architecture Impact

Data Lineage Diagram

%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
flowchart LR
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    %% ── QUOTA GUARD LINEAGE ── %%
    subgraph QuotaInputs ["Quota Guard Inputs"]
        QCACHE["quota_cache.json<br/>━━━━━━━━━━<br/>utilization: float<br/>resets_at: str|null<br/>fetched_at: ISO ts"]
        HCFG["hook_config.json<br/>━━━━━━━━━━<br/>threshold: float<br/>cache_max_age: int<br/>cache_path: str"]
    end

    subgraph QuotaHook ["● quota_check.py (PreToolUse)"]
        QDECIDE["decision logic<br/>━━━━━━━━━━<br/>compare utilization<br/>vs threshold"]
        QRESOLVE["★ _resolve_quota_log_dir()<br/>━━━━━━━━━━<br/>XDG / macOS / default<br/>AUTOSKILLIT_LOG_DIR env"]
        QWRITE["★ _write_quota_event()<br/>━━━━━━━━━━<br/>event: approved|blocked<br/>|cache_miss|parse_error"]
    end

    subgraph QuotaArtifacts ["★ New Diagnostic Artifacts"]
        QLOG[("quota_events.jsonl<br/>━━━━━━━━━━<br/>append-only JSONL<br/>at log root")]
    end

    %% ── TOKEN ELAPSED LINEAGE ── %%
    subgraph TokenSource ["Token Source (unchanged)"]
        HEADLESS["headless.py<br/>━━━━━━━━━━<br/>token_log.record(<br/>  step_name,<br/>  elapsed_seconds)"]
        TOKENTRY["TokenEntry<br/>━━━━━━━━━━<br/>elapsed_seconds: float<br/>input_tokens, output_tokens<br/>invocation_count"]
    end

    subgraph TokenSurfaces ["Token Summary Surfaces"]
        GETTOK["get_token_summary<br/>━━━━━━━━━━<br/>JSON: steps[].elapsed_seconds<br/>(already present in output)"]
        PRETTYOUT["● _fmt_get_token_summary<br/>━━━━━━━━━━<br/>pretty_output.py<br/>Markdown-KV render"]
        FMTTOKEN["● _format_token_summary<br/>━━━━━━━━━━<br/>tools_status.py<br/>markdown render"]
    end

    subgraph TokenArtifacts ["Updated Artifacts"]
        PRETTYRENDER["● pretty_output render<br/>━━━━━━━━━━<br/>★ + t:{elapsed:.1f}s<br/>per step line"]
        TOKENMD["● token_summary.md<br/>━━━━━━━━━━<br/>★ + elapsed_seconds:<br/>per step section"]
    end

    %% QUOTA FLOWS %%
    QCACHE -->|"read: utilization,<br/>resets_at"| QDECIDE
    HCFG -->|"read: threshold,<br/>cache_max_age"| QDECIDE
    QDECIDE -->|"decision outcome"| QWRITE
    QRESOLVE -->|"log dir path"| QWRITE
    QWRITE -.->|"append JSON line"| QLOG

    %% TOKEN FLOWS %%
    HEADLESS -->|"record(step, elapsed)"| TOKENTRY
    TOKENTRY -->|"to_dict() all 7 fields"| GETTOK
    GETTOK -->|"steps[].elapsed_seconds"| PRETTYOUT
    GETTOK -->|"steps[].elapsed_seconds"| FMTTOKEN
    PRETTYOUT -.->|"★ renders elapsed"| PRETTYRENDER
    FMTTOKEN -.->|"★ renders elapsed"| TOKENMD

    %% CLASS ASSIGNMENTS %%
    class QCACHE,HCFG cli;
    class TOKENTRY,GETTOK stateNode;
    class QDECIDE,QRESOLVE,QWRITE handler;
    class HEADLESS phase;
    class PRETTYOUT,FMTTOKEN newComponent;
    class QLOG,TOKENMD,PRETTYRENDER output;
    class TOKENTRY stateNode;

Color Legend:

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Input | Data source files (cache, config) |
| Teal | State | In-memory token log and JSON output |
| Orange | Handler | Quota decision + event write logic |
| Purple | Phase | headless.py session executor |
| Green | Modified | Existing formatters updated to emit elapsed |
| Dark Teal | Artifacts | Write-only outputs (JSONL log, markdown files) |

Operational Diagram

%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;

    subgraph Config ["CONFIGURATION"]
        direction LR
        ENVCACHE["AUTOSKILLIT_QUOTA_CACHE<br/>━━━━━━━━━━<br/>quota cache path override"]
        ENVLOG["AUTOSKILLIT_LOG_DIR<br/>━━━━━━━━━━<br/>log root override<br/>(XDG_DATA_HOME fallback)"]
        HOOKCFG["hook_config.json<br/>━━━━━━━━━━<br/>threshold: 90.0<br/>cache_max_age: 300s<br/>cache_path"]
    end

    subgraph Hooks ["HOOKS (Claude Code Lifecycle)"]
        direction TB
        QHOOK["● quota_check.py<br/>━━━━━━━━━━<br/>PreToolUse on run_skill<br/>approve / block decision"]
        PHOOK["● pretty_output.py<br/>━━━━━━━━━━<br/>PostToolUse on all MCP tools<br/>reformats JSON → Markdown-KV"]
    end

    subgraph MCP ["MCP TOOLS (Status & Telemetry)"]
        direction TB
        GETTOK["get_token_summary<br/>━━━━━━━━━━<br/>ungated, inline<br/>JSON: steps[].elapsed_seconds"]
        WRITETELE["write_telemetry_files<br/>━━━━━━━━━━<br/>kitchen-gated<br/>output_dir: str"]
        TOKTOOL["● tools_status.py<br/>━━━━━━━━━━<br/>_format_token_summary<br/>adds elapsed_seconds: line"]
    end

    subgraph Observability ["OBSERVABILITY OUTPUTS (Write-Only)"]
        direction TB
        QLOG["★ quota_events.jsonl<br/>━━━━━━━━━━<br/>append-only JSONL at log root<br/>event: approved|blocked|<br/>cache_miss|parse_error"]
        PRETTYRENDER["● inline token render<br/>━━━━━━━━━━<br/>step x{n} [in:X out:X cached:X t:N.Ns]<br/>Claude's tool response view"]
        TOKENMD["● token_summary.md<br/>━━━━━━━━━━<br/>## step_name<br/>- elapsed_seconds: N<br/>atomic write"]
    end

    subgraph Querying ["OPERATOR QUERY PATTERNS"]
        direction TB
        JQQUERY["jq 'select(.event==\"blocked\")'<br/>━━━━━━━━━━<br/>quota_events.jsonl<br/>filter blocked events"]
        JQCOUNT["jq -r '.event' | sort | uniq -c<br/>━━━━━━━━━━<br/>event type distribution"]
    end

    %% CONFIG → HOOKS %%
    ENVCACHE -->|"cache path"| QHOOK
    ENVLOG -->|"log root"| QHOOK
    HOOKCFG -->|"threshold, cache_max_age"| QHOOK

    %% HOOKS → OUTPUTS %%
    QHOOK -.->|"★ _write_quota_event() appends"| QLOG
    PHOOK -->|"_fmt_get_token_summary route"| PRETTYRENDER

    %% MCP → HOOK INTERACTION %%
    GETTOK -->|"JSON response → PostToolUse"| PHOOK
    GETTOK -->|"steps[].elapsed_seconds"| TOKTOOL
    WRITETELE --> TOKTOOL
    TOKTOOL -.->|"atomic write"| TOKENMD

    %% OPERATOR QUERIES %%
    QLOG -.->|"operator queries"| JQQUERY
    QLOG -.->|"operator queries"| JQCOUNT

    %% CLASS ASSIGNMENTS %%
    class ENVCACHE,ENVLOG,HOOKCFG phase;
    class QHOOK,PHOOK handler;
    class GETTOK,WRITETELE,TOKTOOL stateNode;
    class QLOG,TOKENMD,PRETTYRENDER output;
    class JQQUERY,JQCOUNT cli;

Color Legend:

| Color | Category | Description |
|-------|----------|-------------|
| Purple | Config | Configuration sources (env vars, hook config file) |
| Orange | Hooks | Claude Code lifecycle hooks (PreToolUse, PostToolUse) |
| Teal | MCP Tools | Status and telemetry MCP tool handlers |
| Dark Teal | Outputs | Write-only observability artifacts |
| Dark Blue | Querying | Operator query patterns and diagnostic commands |
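
The two jq query patterns in the operational diagram (filter blocked events, count events by type) can be approximated in Python when jq is not available. The helper names below are hypothetical and the sketch assumes only that each line of quota_events.jsonl is a JSON object with an `event` key.

```python
import json
from collections import Counter
from pathlib import Path


def iter_quota_events(path: Path):
    """Yield each well-formed JSON object in the log, skipping bad lines."""
    if not path.exists():
        return
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate a truncated or corrupt line
            if isinstance(record, dict):
                yield record


def count_quota_events(path: Path) -> Counter:
    """Rough equivalent of: jq -r '.event' | sort | uniq -c"""
    return Counter(r.get("event", "unknown") for r in iter_quota_events(path))


def blocked_events(path: Path) -> list:
    """Rough equivalent of: jq 'select(.event=="blocked")'"""
    return [r for r in iter_quota_events(path) if r.get("event") == "blocked"]
```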

Closes #302

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-302-20260310-073833-386833/temp/make-plan/pipeline_observability_plan_2026-03-10_120000.md

Token Usage Summary

Token Summary

implement

  • input_tokens: 20233
  • output_tokens: 499703
  • cache_creation_input_tokens: 1532389
  • cache_read_input_tokens: 99527456
  • invocation_count: 16

audit_impl

  • input_tokens: 7600
  • output_tokens: 134323
  • cache_creation_input_tokens: 520444
  • cache_read_input_tokens: 3863375
  • invocation_count: 13

open_pr

  • input_tokens: 925
  • output_tokens: 73212
  • cache_creation_input_tokens: 247356
  • cache_read_input_tokens: 4293763
  • invocation_count: 4

retry_worktree

  • input_tokens: 45
  • output_tokens: 22321
  • cache_creation_input_tokens: 111687
  • cache_read_input_tokens: 3405410
  • invocation_count: 1

make_plan

  • input_tokens: 44
  • output_tokens: 69854
  • cache_creation_input_tokens: 121021
  • cache_read_input_tokens: 2735054
  • invocation_count: 1

review_pr

  • input_tokens: 8568
  • output_tokens: 166539
  • cache_creation_input_tokens: 420350
  • cache_read_input_tokens: 5838460
  • invocation_count: 7

dry_walkthrough

  • input_tokens: 57
  • output_tokens: 30444
  • cache_creation_input_tokens: 114452
  • cache_read_input_tokens: 1679519
  • invocation_count: 3

resolve_review

  • input_tokens: 404
  • output_tokens: 211704
  • cache_creation_input_tokens: 502896
  • cache_read_input_tokens: 23247211
  • invocation_count: 7

fix

  • input_tokens: 291
  • output_tokens: 113181
  • cache_creation_input_tokens: 415961
  • cache_read_input_tokens: 19115239
  • invocation_count: 6

investigate

  • input_tokens: 2053
  • output_tokens: 12607
  • cache_creation_input_tokens: 52059
  • cache_read_input_tokens: 382017
  • invocation_count: 1

rectify

  • input_tokens: 4616
  • output_tokens: 21997
  • cache_creation_input_tokens: 75946
  • cache_read_input_tokens: 854207
  • invocation_count: 1

open_pr_step

  • input_tokens: 52
  • output_tokens: 27985
  • cache_creation_input_tokens: 102646
  • cache_read_input_tokens: 1871801
  • invocation_count: 2

analyze_prs

  • input_tokens: 13
  • output_tokens: 18525
  • cache_creation_input_tokens: 55984
  • cache_read_input_tokens: 358201
  • invocation_count: 1

merge_pr

  • input_tokens: 54
  • output_tokens: 11011
  • cache_creation_input_tokens: 124523
  • cache_read_input_tokens: 1156759
  • invocation_count: 4

create_review_pr

  • input_tokens: 35
  • output_tokens: 17881
  • cache_creation_input_tokens: 52674
  • cache_read_input_tokens: 1060679
  • invocation_count: 1

plan

  • input_tokens: 6248
  • output_tokens: 238067
  • cache_creation_input_tokens: 765044
  • cache_read_input_tokens: 9628970
  • invocation_count: 9

verify

  • input_tokens: 2284
  • output_tokens: 119880
  • cache_creation_input_tokens: 539579
  • cache_read_input_tokens: 6350120
  • invocation_count: 10

Timing Summary

Timing Summary

implement

  • total_seconds: 11896.63371942683
  • invocation_count: 16

audit_impl

  • total_seconds: 3964.286584958052
  • invocation_count: 13

open_pr

  • total_seconds: 1599.6864469241118
  • invocation_count: 4

retry_worktree

  • total_seconds: 483.85407130105887
  • invocation_count: 1

make_plan

  • total_seconds: 1572.1653282630723
  • invocation_count: 1

review_pr

  • total_seconds: 2817.96798053605
  • invocation_count: 7

dry_walkthrough

  • total_seconds: 539.9472401479725
  • invocation_count: 3

resolve_review

  • total_seconds: 4706.892171586864
  • invocation_count: 7

fix

  • total_seconds: 3109.516113938147
  • invocation_count: 6

investigate

  • total_seconds: 586.5382829050068
  • invocation_count: 1

rectify

  • total_seconds: 725.7269057529047
  • invocation_count: 1

open_pr_step

  • total_seconds: 579.5783970400225
  • invocation_count: 2

analyze_prs

  • total_seconds: 279.4915256820386
  • invocation_count: 1

merge_pr

  • total_seconds: 285.0633245370118
  • invocation_count: 4

create_review_pr

  • total_seconds: 441.7614419440506
  • invocation_count: 1

plan

  • total_seconds: 6489.170513134953
  • invocation_count: 9

verify

  • total_seconds: 2771.1426656231615
  • invocation_count: 10

clone

  • total_seconds: 5.5492194169996765
  • invocation_count: 6

capture_base_sha

  • total_seconds: 0.017372207999414968
  • invocation_count: 6

create_branch

  • total_seconds: 4.241269170999658
  • invocation_count: 6

push_merge_target

  • total_seconds: 5.88246012000036
  • invocation_count: 6

test

  • total_seconds: 497.636292823996
  • invocation_count: 9

merge

  • total_seconds: 877.7062499809981
  • invocation_count: 7

push

  • total_seconds: 5.489298081000015
  • invocation_count: 5

🤖 Generated with Claude Code via AutoSkillit

…n summary

Implements issue #302 (combines #218 and #65):

1. Quota guard event logging (#218): Instruments quota_check.py to write
   a structured event to quota_events.jsonl at every decision point
   (approved, blocked, cache_miss, parse_error). Adds _resolve_quota_log_dir()
   with AUTOSKILLIT_LOG_DIR env override and platform defaults, and
   _write_quota_event() that is always fail-open (never blocks run_skill).

2. Per-step elapsed time (#65): Surfaces elapsed_seconds (already stored in
   TokenEntry) through two rendering paths: _fmt_get_token_summary in
   pretty_output.py now includes t:{elapsed:.1f}s per step, and
   _format_token_summary in tools_status.py now includes elapsed_seconds
   in the markdown token_summary.md output.

Adds 8 new tests (T1-T8) covering all event types and elapsed rendering paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
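
For illustration, the compact per-step line with the new `t:{elapsed:.1f}s` suffix might be rendered along these lines. This is a sketch under assumptions: the layout follows the `step x{n} [in:X out:X cached:X t:N.Ns]` pattern shown in the operational diagram, while the `step_name` key and the mapping of `cached` to `cache_read_input_tokens` are guesses, not confirmed by the PR.

```python
def format_step_line(step: dict) -> str:
    """Render one step as 'name x{n} [in:.. out:.. cached:.. t:N.Ns]'."""
    # .get() with defaults keeps older entries (without elapsed_seconds) renderable.
    elapsed = float(step.get("elapsed_seconds", 0.0))
    return (
        f"{step.get('step_name', '?')} x{step.get('invocation_count', 1)} "
        f"[in:{step.get('input_tokens', 0)} "
        f"out:{step.get('output_tokens', 0)} "
        f"cached:{step.get('cache_read_input_tokens', 0)} "
        f"t:{elapsed:.1f}s]"
    )
```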

@Trecek Trecek left a comment


AutoSkillit PR Review — Verdict: changes_requested

lines.append(f"- elapsed_seconds: {step['elapsed_seconds']}\n\n")
return "".join(lines)



[critical] bugs: step['elapsed_seconds'] uses direct dict access with no default. If any TokenEntry dict predates this field, this raises KeyError and crashes _format_token_summary entirely. Use step.get('elapsed_seconds', 0.0) consistent with how pretty_output.py handles the same field.
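
A minimal sketch of the suggested fix, assuming the formatter emits one `## step_name` section per step as shown in the diagram. `format_token_summary` and the `step_name` key are illustrative names rather than the project's actual code; only the `.get('elapsed_seconds', 0.0)` fallback comes from the review comment.

```python
TOKEN_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "invocation_count",
)


def format_token_summary(steps: list) -> str:
    """Render steps as markdown, tolerating entries that lack elapsed_seconds."""
    lines = []
    for step in steps:
        lines.append(f"## {step.get('step_name', 'unknown')}\n\n")
        for field in TOKEN_FIELDS:
            lines.append(f"- {field}: {step.get(field, 0)}\n")
        # Historical entries may predate this field, so fall back to 0.0
        # instead of raising KeyError.
        lines.append(f"- elapsed_seconds: {step.get('elapsed_seconds', 0.0)}\n\n")
    return "".join(lines)
```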


Silently no-ops on any error — hook observability must never block run_skill.
Event schema: {ts, event, threshold, utilization?, sleep_seconds?, resets_at?, cache_path?}
"""

[warning] defense: _write_quota_event parameter annotation uses string literal 'Path | None' for log_dir instead of the actual union type. Path is already imported and _resolve_quota_log_dir uses the unquoted form. Remove the quotes for consistency.

@@ -163,7 +163,8 @@ def _format_token_summary(steps: list) -> str:
lines.append(f"- output_tokens: {step['output_tokens']}\n")
lines.append(f"- cache_creation_input_tokens: {step['cache_creation_input_tokens']}\n")

[warning] defense: _format_token_summary directly accesses step['elapsed_seconds'] with no .get() fallback — newly added field most likely to be absent in historical data. Use step.get('elapsed_seconds', 0.0).

rendered = data["hookSpecificOutput"]["updatedMCPToolOutput"]
# Must include step name and elapsed time
assert "implement" in rendered
assert "123.4s" in rendered or "t:123.4s" in rendered

[warning] tests: Disjunctive assertion '"123.4s" in rendered or "t:123.4s" in rendered' passes even if the format is wrong. The format spec is known (t:{elapsed:.1f}s), so pin to '"t:123.4s" in rendered' only, matching the updated compact-line assertions at lines 349-351.
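
To make the weakness concrete: because `"t:123.4s"` contains `"123.4s"`, the disjunction reduces to the looser substring check and accepts output in the wrong format. The sample strings below are invented for illustration.

```python
def loose_check(rendered: str) -> bool:
    # Equivalent to '"123.4s" in rendered': the second clause adds nothing.
    return "123.4s" in rendered or "t:123.4s" in rendered


def strict_check(rendered: str) -> bool:
    # Pins the known format spec t:{elapsed:.1f}s.
    return "t:123.4s" in rendered


# Hypothetical renders: one with a wrong label, one with the expected format.
wrong_format = "implement x1 [in:10 out:20 elapsed=123.4s]"
right_format = "implement x1 [in:10 out:20 cached:0 t:123.4s]"

print(loose_check(wrong_format), strict_check(wrong_format))
print(loose_check(right_format), strict_check(right_format))
```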

assert events[0]["event"] == "parse_error"


# T6

[warning] tests: T6 (test_quota_event_no_crash_when_log_dir_unresolvable) only asserts empty stdout but does not verify the hook exits 0. A silent crash would also produce empty stdout. If _run_hook captures exit code, assert it is 0.

HOOK_CONFIG_FILENAME = ".autoskillit_hook_config.json"
HOOK_DIR_COMPONENTS = (".autoskillit", "temp")

_AUTOSKILLIT_LOG_DIR_ENV = "AUTOSKILLIT_LOG_DIR"

[warning] cohesion: AUTOSKILLIT_LOG_DIR env var resolution and platform path logic duplicate execution/session_log.py:resolve_log_dir(). Consider whether the hook should share this logic or keep it standalone (hooks cannot import from the package).
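
For context, a standalone resolver of the kind under discussion might look like the sketch below. The precedence (explicit `AUTOSKILLIT_LOG_DIR` override, then a platform default) follows the diagrams; the macOS path is an assumption, and the Linux default mirrors the `~/.local/share/autoskillit/logs` location mentioned elsewhere in this PR. The actual logic in `resolve_log_dir()` may differ.

```python
from __future__ import annotations

import os
import sys
from collections.abc import Mapping
from pathlib import Path


def resolve_quota_log_dir(
    env: Mapping[str, str] | None = None,
    platform: str | None = None,
) -> Path | None:
    """Return the log root, or None if it cannot be determined."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    try:
        override = env.get("AUTOSKILLIT_LOG_DIR")
        if override:
            return Path(override).expanduser()
        if platform == "darwin":
            # Assumed macOS location; not confirmed by the PR.
            return Path.home() / "Library" / "Application Support" / "autoskillit" / "logs"
        xdg = env.get("XDG_DATA_HOME")
        base = Path(xdg) if xdg else Path.home() / ".local" / "share"
        return base / "autoskillit" / "logs"
    except Exception:
        # e.g. Path.home() can fail when no home directory is resolvable.
        return None
```

Taking `env` and `platform` as parameters keeps the function testable without patching globals, which matters if the hook must stay import-free of the main package.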

return None


def _resolve_quota_log_dir() -> Path | None:

[warning] cohesion: _write_quota_event does not carry the 'quota' infix used throughout the module (e.g., _read_quota_cache, _resolve_quota_log_dir). Rename to _write_quota_log_event for naming symmetry.

text = json.loads(out)["hookSpecificOutput"]["updatedMCPToolOutput"]
assert "tool exception" in text.lower()
assert "TimeoutError: process hung" in text


[warning] cohesion: Loose 'or' assertion pattern deviates from the precise assertion style used at lines 349-351. Mirror exact format: '"t:123.4s" in rendered'.


@Trecek Trecek left a comment


AutoSkillit review found 7 blocking issues. See inline comments.

Verdict: changes_requested

Critical (1): KeyError on step['elapsed_seconds'] in _format_token_summary — use .get() with a default.
Warnings (6): string-literal type annotation, missing .get() fallback in tools_status, disjunctive test assertions (2), T6 missing exit-code check, naming inconsistency (_write_quota_event).

Trecek and others added 4 commits March 10, 2026 14:43
…_summary

step['elapsed_seconds'] raises KeyError if any TokenEntry predates this field.
Consistent with how pretty_output.py handles the same field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ent for naming symmetry

- Remove string literal from log_dir: 'Path | None' -> Path | None (Path already imported)
- Rename _write_quota_event -> _write_quota_log_event for consistency with
  _read_quota_cache / _resolve_quota_log_dir naming pattern in the module

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…etty_output

Replace disjunctive 'in rendered or ...' pattern with precise assertion.
Format spec is known (t:{elapsed:.1f}s) so pin to "t:123.4s" only,
matching the assertion style at lines 349-351.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_run_hook now returns (stdout, exit_code) tuple so tests can verify the hook
exits cleanly. T6 adds assert exit_code == 0 to distinguish silent crashes
(which also produce empty stdout) from genuine approval.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
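
A test helper with this shape could be sketched as follows. `run_hook` here is a hypothetical simplification that runs an inline script rather than the real hook file; the point is only that returning the exit code alongside stdout lets a test tell a clean no-output approval apart from a silent crash.

```python
import json
import subprocess
import sys


def run_hook(script: str, payload: dict) -> tuple:
    """Run a hook script with a JSON payload on stdin; return (stdout, exit_code)."""
    proc = subprocess.run(
        [sys.executable, "-c", script],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.stdout, proc.returncode


# A hook that approves silently: empty stdout AND exit code 0.
stdout, exit_code = run_hook("import sys; sys.stdin.read()", {"tool_name": "run_skill"})
assert stdout == ""
assert exit_code == 0
```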
@Trecek Trecek merged commit e08bb31 into integration Mar 10, 2026
2 checks passed
@Trecek Trecek deleted the combined-pipeline-observability-quota-guard-logging-and-per/302 branch March 10, 2026 23:35
Trecek added a commit that referenced this pull request Mar 11, 2026
…nto integration (#323)

## Integration Summary

Collapsed 7 PRs into `pr-batch/pr-merge-20260310-163009` targeting
`integration`.

## Merged PRs

| # | Title | Complexity | Additions | Deletions | Overlaps |
|---|-------|-----------|-----------|-----------|---------|
| #319 | Add Step 4.5 Historical Regression Check to dry-walkthrough skill | simple | +200 | -0 | — |
| #318 | Implementation Plan: Pipeline Observability — Quota Guard Logging and Per-Step Elapsed Time | simple | +267 | -23 | — |
| #316 | Release CI Automation — Version Bump, Branch Sync, and Release Infrastructure | simple | +471 | -0 | — |
| #314 | docs: complete release documentation sprint | simple | +1344 | -104 | — |
| #315 | Recipe remediation — source_dir resolution and cook command display | simple | +317 | -71 | #320 |
| #317 | Implementation Plan: PR Review Pipeline Gates — Fidelity Checks, CI Gating, and Review-First Enforcement | simple | +1057 | -7 | #320 |
| #320 | feat: add recipe composition with run_recipe tool and dev-sprint consumer | needs_check | +724 | -19 | #315, #317 |

## Audit

**Verdict:** GO

## Architecture Impact

### Development Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    subgraph Structure ["PROJECT STRUCTURE"]
        direction LR
        SRC["src/autoskillit/<br/>━━━━━━━━━━<br/>105 source files<br/>10 sub-packages"]
        TESTS["tests/<br/>━━━━━━━━━━<br/>170 test files<br/>Mirrors src layout"]
    end

    subgraph Build ["BUILD TOOLING"]
        direction LR
        PYPROJECT["● pyproject.toml<br/>━━━━━━━━━━<br/>hatchling backend<br/>uv package manager"]
        TASKFILE["Taskfile.yml<br/>━━━━━━━━━━<br/>task test-all<br/>task test-check<br/>task install-worktree"]
        UVLOCK["uv.lock<br/>━━━━━━━━━━<br/>Locked manifest"]
    end

    subgraph PreCommit ["PRE-COMMIT HOOKS"]
        direction TB
        RUFF_FMT["ruff-format<br/>━━━━━━━━━━<br/>Auto-format<br/>(writes source)"]
        RUFF_LINT["ruff check<br/>━━━━━━━━━━<br/>Lint --fix<br/>target: py311"]
        MYPY["mypy<br/>━━━━━━━━━━<br/>src/ type check<br/>--ignore-missing"]
        UVCHECK["uv lock --check<br/>━━━━━━━━━━<br/>Lockfile guard"]
        NOGEN["no-generated-configs<br/>━━━━━━━━━━<br/>Blocks hooks.json<br/>recipes/diagrams/"]
        GITLEAKS["gitleaks v8.30.0<br/>━━━━━━━━━━<br/>Secret scanning"]
    end

    subgraph Testing ["TEST FRAMEWORK"]
        direction LR
        PYTEST["pytest -n 4<br/>━━━━━━━━━━<br/>asyncio_mode=auto<br/>timeout=60s"]
        FIXTURES["conftest.py<br/>━━━━━━━━━━<br/>StatefulMockTester<br/>MockSubprocessRunner<br/>tool_ctx fixture"]
        IMPORTLINT["import-linter<br/>━━━━━━━━━━<br/>L0→L1→L2→L3<br/>Architecture gates"]
    end

    subgraph NewTests ["★ NEW TEST FILES"]
        direction LR
        TGATE["★ test_analyze_prs_gates.py<br/>━━━━━━━━━━<br/>PR gate analysis"]
        TFIDELITY["★ test_review_pr_fidelity.py<br/>━━━━━━━━━━<br/>Fidelity checks"]
        TRELEASE["★ test_release_workflows.py<br/>━━━━━━━━━━<br/>CI workflow tests"]
        TDRY["★ test_dry_walkthrough_contracts.py<br/>━━━━━━━━━━<br/>10 contract assertions"]
    end

    subgraph CI ["CI/CD WORKFLOWS"]
        direction TB
        TESTS_CI["tests.yml<br/>━━━━━━━━━━<br/>ubuntu + macos matrix<br/>preflight: uv lock --check<br/>uv sync + task test-all"]
        VBUMP["★ version-bump.yml<br/>━━━━━━━━━━<br/>integration→main merged<br/>MAJOR.MINOR.PATCH+1<br/>sync main→integration"]
        RELEASE["★ release.yml<br/>━━━━━━━━━━<br/>stable branch merge<br/>MAJOR.MINOR+1.0<br/>git tag + GitHub Release"]
    end

    subgraph EntryPoints ["ENTRY POINTS"]
        direction LR
        CLI_EP["autoskillit CLI<br/>━━━━━━━━━━<br/>autoskillit.cli:main<br/>● app.py modified"]
        INSTALL["★ install.sh<br/>━━━━━━━━━━<br/>End-user installer<br/>uv tool install<br/>autoskillit install"]
    end

    %% FLOW %%
    SRC --> PYPROJECT
    TESTS --> PYPROJECT
    PYPROJECT --> TASKFILE
    PYPROJECT --> UVLOCK

    TASKFILE --> RUFF_FMT
    RUFF_FMT --> RUFF_LINT
    RUFF_LINT --> MYPY
    MYPY --> UVCHECK
    UVCHECK --> NOGEN
    NOGEN --> GITLEAKS

    TASKFILE --> PYTEST
    PYTEST --> FIXTURES
    PYTEST --> IMPORTLINT
    PYTEST --> TGATE
    PYTEST --> TFIDELITY
    PYTEST --> TRELEASE
    PYTEST --> TDRY

    PYPROJECT --> TESTS_CI
    TESTS_CI --> VBUMP
    VBUMP --> RELEASE

    PYPROJECT --> CLI_EP
    INSTALL --> CLI_EP

    %% CLASS ASSIGNMENTS %%
    class SRC,TESTS cli;
    class PYPROJECT,TASKFILE,UVLOCK phase;
    class RUFF_FMT,RUFF_LINT,MYPY,UVCHECK,NOGEN,GITLEAKS detector;
    class PYTEST,FIXTURES,IMPORTLINT handler;
    class TGATE,TFIDELITY,TRELEASE,TDRY,VBUMP,RELEASE,INSTALL newComponent;
    class TESTS_CI stateNode;
    class CLI_EP output;
```

### Process Flow Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;

    %% TERMINALS %%
    START([START])
    SUCCESS([SUCCESS])
    VALFAIL([VALIDATION ERROR])

    subgraph Val ["Load-Time Validation"]
        direction LR
        ValidateRecipe["validate_recipe<br/>━━━━━━━━━━<br/>recipe/_api.py"]
        RulesRecipe["★ rules_recipe.py<br/>━━━━━━━━━━<br/>unknown-sub-recipe rule"]
        AvailCtx["★ available_recipes<br/>━━━━━━━━━━<br/>ValidationContext field"]
        NameKnown{"sub-recipe<br/>name known?"}
    end

    subgraph Exec ["Runtime Execution"]
        direction TB
        ToolHandler["★ run_recipe<br/>━━━━━━━━━━<br/>server/tools_recipe.py"]
        FindRecipe["★ _find_recipe()<br/>━━━━━━━━━━<br/>server/helpers.py"]
        FoundDec{"recipe<br/>file found?"}
        RunSub["★ _run_subrecipe_session()<br/>━━━━━━━━━━<br/>server/helpers.py"]
        BuildPrompt["★ build_subrecipe_prompt()<br/>━━━━━━━━━━<br/>cli/_prompts.py"]
        BuildCmd["★ build_subrecipe_cmd()<br/>━━━━━━━━━━<br/>execution/commands.py"]
        SubSess["★ run_subrecipe_session()<br/>━━━━━━━━━━<br/>execution/headless.py"]
    end

    HeadlessProc["Headless Claude<br/>━━━━━━━━━━<br/>AUTOSKILLIT_KITCHEN_OPEN=1<br/>executes sub-recipe YAML"]
    ResultDec{"success: True?"}

    START -->|"recipe YAML loaded"| ValidateRecipe
    ValidateRecipe --> RulesRecipe
    RulesRecipe --> AvailCtx
    AvailCtx --> NameKnown
    NameKnown -->|"unknown name"| VALFAIL
    NameKnown -->|"valid"| ToolHandler
    ToolHandler --> FindRecipe
    FindRecipe --> FoundDec
    FoundDec -->|"not found"| SUCCESS
    FoundDec -->|"found"| RunSub
    RunSub --> BuildPrompt
    RunSub --> BuildCmd
    BuildPrompt -->|"prompt str"| SubSess
    BuildCmd -->|"CLI cmd + env"| SubSess
    SubSess --> HeadlessProc
    HeadlessProc --> ResultDec
    ResultDec -->|"yes"| SUCCESS
    ResultDec -->|"no"| SUCCESS

    %% CLASS ASSIGNMENTS %%
    class START,SUCCESS,VALFAIL terminal;
    class ValidateRecipe handler;
    class RulesRecipe detector;
    class AvailCtx,NameKnown,FoundDec stateNode;
    class ToolHandler,FindRecipe,RunSub,BuildPrompt,BuildCmd,SubSess newComponent;
    class HeadlessProc integration;
    class ResultDec stateNode;
```

### Deployment Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    subgraph DevMachine ["DEVELOPER MACHINE"]
        direction TB

        subgraph Bootstrap ["BOOTSTRAP (★ NEW)"]
            INSTALL_SH["★ install.sh<br/>━━━━━━━━━━<br/>curl from GitHub stable<br/>Python 3.11+ · uv · claude<br/>uv tool install"]
        end

        subgraph UVTool ["INSTALLED PACKAGE"]
            PKG["autoskillit package<br/>━━━━━━━━━━<br/>~/.local/share/uv/tools/<br/>autoskillit/lib/python3.x/<br/>site-packages/autoskillit/"]
            PLUGIN_CACHE["Plugin cache<br/>━━━━━━━━━━<br/>~/.claude/plugins/cache/<br/>autoskillit-local/"]
        end

        subgraph ClaudeCode ["CLAUDE CODE PROCESS"]
            CC["Claude Code IDE<br/>━━━━━━━━━━<br/>Reads hooks from<br/>~/.claude/settings.json<br/>.claude/settings.json"]
            MCP["● MCP Server (stdio)<br/>━━━━━━━━━━<br/>FastMCP · stdin/stdout<br/>12 ungated + 26 kitchen tools<br/>★ run_recipe · ★ get_ci_status"]
        end

        subgraph HeadlessSessions ["HEADLESS SUBPROCESS SESSIONS (pty)"]
            SKILL_SESS["Skill session<br/>━━━━━━━━━━<br/>claude --print prompt<br/>AUTOSKILLIT_HEADLESS=1<br/>KITCHEN_OPEN via open_kitchen"]
            SUBRECIPE["★ Sub-recipe session<br/>━━━━━━━━━━<br/>claude --print sous-chef-prompt<br/>AUTOSKILLIT_KITCHEN_OPEN=1<br/>NO HEADLESS flag<br/>build_subrecipe_cmd"]
        end

        subgraph LocalStorage ["LOCAL STORAGE"]
            SESSION_LOGS[("Session logs<br/>━━━━━━━━━━<br/>~/.local/share/autoskillit/<br/>logs/sessions/*.jsonl<br/>proc_trace · anomalies")]
            CRASH_TMP[("Crash traces<br/>━━━━━━━━━━<br/>/dev/shm/<br/>autoskillit_trace_pid.jsonl<br/>Linux tmpfs")]
            PROJ_STORE[("Project storage<br/>━━━━━━━━━━<br/>project/.autoskillit/<br/>config.yaml · .secrets.yaml<br/>temp/ · recipes/")]
            CLAUDE_LOGS[("Claude Code logs<br/>━━━━━━━━━━<br/>~/.claude/projects/<br/>encoded-cwd/<br/>session-id.jsonl")]
        end
    end

    subgraph GitHub ["GITHUB INFRASTRUCTURE"]
        direction TB
        GH_ACTIONS["GitHub Actions<br/>━━━━━━━━━━<br/>tests.yml: ubuntu + macos-15<br/>uv sync · task test-all"]
        VBUMP_WF["★ version-bump.yml<br/>━━━━━━━━━━<br/>integration→main merge<br/>patch bump + uv lock<br/>sync main→integration"]
        RELEASE_WF["★ release.yml<br/>━━━━━━━━━━<br/>stable branch merge<br/>minor bump + uv lock<br/>git tag + GitHub Release"]
        GH_REPO["GitHub Repository<br/>━━━━━━━━━━<br/>main · integration · stable<br/>Issues · PRs · Releases"]
        GH_API["GitHub API<br/>━━━━━━━━━━<br/>api.github.com<br/>Actions runs · CI jobs<br/>Issues · PR reviews"]
    end

    subgraph Anthropic ["EXTERNAL: ANTHROPIC"]
        ANT_API["Anthropic API<br/>━━━━━━━━━━<br/>api.anthropic.com<br/>/api/oauth/usage<br/>5-hour quota check"]
    end

    %% BOOTSTRAP FLOW %%
    INSTALL_SH -->|"uv tool install git@stable"| PKG
    PKG -->|"autoskillit install<br/>claude plugin install"| PLUGIN_CACHE

    %% CLAUDE CODE %%
    PLUGIN_CACHE -->|"plugin load"| CC
    CC -->|"spawns stdio"| MCP

    %% MCP → SESSIONS %%
    MCP -->|"run_skill: spawns pty"| SKILL_SESS
    MCP -->|"★ run_recipe: spawns pty<br/>KITCHEN_OPEN=1"| SUBRECIPE

    %% STORAGE WRITES %%
    SKILL_SESS -->|"writes diagnostics"| SESSION_LOGS
    SKILL_SESS -->|"crash: writes"| CRASH_TMP
    SUBRECIPE -->|"writes diagnostics"| SESSION_LOGS
    MCP -->|"reads/writes"| PROJ_STORE
    CC -->|"writes JSONL"| CLAUDE_LOGS

    %% CI/RELEASE %%
    GH_REPO -->|"push/PR event"| GH_ACTIONS
    GH_REPO -->|"integration→main merge"| VBUMP_WF
    GH_REPO -->|"★ PR→stable merge"| RELEASE_WF
    RELEASE_WF -->|"creates"| GH_REPO
    VBUMP_WF -->|"pushes commits"| GH_REPO

    %% EXTERNAL API CALLS %%
    MCP -->|"httpx HTTPS<br/>CI polling"| GH_API
    GH_ACTIONS -->|"gh CLI"| GH_API
    MCP -->|"httpx HTTPS<br/>quota check"| ANT_API
    SKILL_SESS -->|"Anthropic API<br/>inference"| ANT_API

    %% CLASS ASSIGNMENTS %%
    class INSTALL_SH,VBUMP_WF,RELEASE_WF,SUBRECIPE newComponent;
    class CC,SKILL_SESS cli;
    class MCP,GH_ACTIONS handler;
    class PKG,PLUGIN_CACHE phase;
    class SESSION_LOGS,CRASH_TMP,PROJ_STORE,CLAUDE_LOGS stateNode;
    class GH_REPO,GH_API,ANT_API integration;
```

Closes #307
Closes #302
Closes #298
Closes #297
Closes #300
Closes #303

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek added a commit that referenced this pull request Mar 11, 2026
… Per-Step Elapsed Time (#318)

## Summary

This PR implements two related pipeline observability improvements from
issue #302 (combining #218 and #65):

1. **Quota guard event logging (#218)**: Instruments
`hooks/quota_check.py` to write a structured event to
`quota_events.jsonl` at every decision point (approved, blocked,
cache_miss, parse_error), giving operators a diagnostic trail for
quota-guard activity.

2. **Per-step elapsed time in token summary (#65)**: Surfaces the
`elapsed_seconds` field already stored in `TokenEntry` through two
rendering paths: the `_fmt_get_token_summary` formatter in
`pretty_output.py` (what operators see inline during pipeline runs) and
`_format_token_summary` in `tools_status.py` (what
`write_telemetry_files` writes to `token_summary.md`).
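
The event write in item 1 can be sketched as follows. This is a minimal illustration, not the code in `hooks/quota_check.py`: the helper name `write_quota_event`, the `ts` field, and the extra keyword fields are assumptions, and only the file name and the four event names come from the summary above.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# The four decision points named in the summary.
EVENTS = {"approved", "blocked", "cache_miss", "parse_error"}


def write_quota_event(log_dir: Path, event: str, **fields: object) -> None:
    """Append one JSON line to quota_events.jsonl (hypothetical helper)."""
    if event not in EVENTS:
        raise ValueError(f"unknown quota event: {event}")
    # Assumption: a UTC ISO timestamp plus free-form fields per record.
    record = {"ts": datetime.now(timezone.utc).isoformat(), "event": event, **fields}
    try:
        log_dir.mkdir(parents=True, exist_ok=True)
        with (log_dir / "quota_events.jsonl").open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
    except OSError:
        # Assumption: a logging failure should never change the hook's decision.
        pass


demo_dir = Path(tempfile.mkdtemp())
write_quota_event(demo_dir, "approved", utilization=42.0, threshold=90.0)
write_quota_event(demo_dir, "blocked", utilization=95.5, threshold=90.0)
lines = (demo_dir / "quota_events.jsonl").read_text(encoding="utf-8").splitlines()
```

Opening the file in append mode for each event keeps the log append-only and lets separate hook invocations share one file without coordinating.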

## Architecture Impact

### Data Lineage Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
flowchart LR
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    %% ── QUOTA GUARD LINEAGE ── %%
    subgraph QuotaInputs ["Quota Guard Inputs"]
        QCACHE["quota_cache.json<br/>━━━━━━━━━━<br/>utilization: float<br/>resets_at: str|null<br/>fetched_at: ISO ts"]
        HCFG["hook_config.json<br/>━━━━━━━━━━<br/>threshold: float<br/>cache_max_age: int<br/>cache_path: str"]
    end

    subgraph QuotaHook ["● quota_check.py (PreToolUse)"]
        QDECIDE["decision logic<br/>━━━━━━━━━━<br/>compare utilization<br/>vs threshold"]
        QRESOLVE["★ _resolve_quota_log_dir()<br/>━━━━━━━━━━<br/>XDG / macOS / default<br/>AUTOSKILLIT_LOG_DIR env"]
        QWRITE["★ _write_quota_event()<br/>━━━━━━━━━━<br/>event: approved|blocked<br/>|cache_miss|parse_error"]
    end

    subgraph QuotaArtifacts ["★ New Diagnostic Artifacts"]
        QLOG[("quota_events.jsonl<br/>━━━━━━━━━━<br/>append-only JSONL<br/>at log root")]
    end

    %% ── TOKEN ELAPSED LINEAGE ── %%
    subgraph TokenSource ["Token Source (unchanged)"]
        HEADLESS["headless.py<br/>━━━━━━━━━━<br/>token_log.record(<br/>  step_name,<br/>  elapsed_seconds)"]
        TOKENTRY["TokenEntry<br/>━━━━━━━━━━<br/>elapsed_seconds: float<br/>input_tokens, output_tokens<br/>invocation_count"]
    end

    subgraph TokenSurfaces ["Token Summary Surfaces"]
        GETTOK["get_token_summary<br/>━━━━━━━━━━<br/>JSON: steps[].elapsed_seconds<br/>(already present in output)"]
        PRETTYOUT["● _fmt_get_token_summary<br/>━━━━━━━━━━<br/>pretty_output.py<br/>Markdown-KV render"]
        FMTTOKEN["● _format_token_summary<br/>━━━━━━━━━━<br/>tools_status.py<br/>markdown render"]
    end

    subgraph TokenArtifacts ["Updated Artifacts"]
        PRETTYRENDER["● pretty_output render<br/>━━━━━━━━━━<br/>★ + t:{elapsed:.1f}s<br/>per step line"]
        TOKENMD["● token_summary.md<br/>━━━━━━━━━━<br/>★ + elapsed_seconds:<br/>per step section"]
    end

    %% QUOTA FLOWS %%
    QCACHE -->|"read: utilization,<br/>resets_at"| QDECIDE
    HCFG -->|"read: threshold,<br/>cache_max_age"| QDECIDE
    QDECIDE -->|"decision outcome"| QWRITE
    QRESOLVE -->|"log dir path"| QWRITE
    QWRITE -.->|"append JSON line"| QLOG

    %% TOKEN FLOWS %%
    HEADLESS -->|"record(step, elapsed)"| TOKENTRY
    TOKENTRY -->|"to_dict() all 7 fields"| GETTOK
    GETTOK -->|"steps[].elapsed_seconds"| PRETTYOUT
    GETTOK -->|"steps[].elapsed_seconds"| FMTTOKEN
    PRETTYOUT -.->|"★ renders elapsed"| PRETTYRENDER
    FMTTOKEN -.->|"★ renders elapsed"| TOKENMD

    %% CLASS ASSIGNMENTS %%
    class QCACHE,HCFG cli;
    class TOKENTRY,GETTOK stateNode;
    class QDECIDE,QRESOLVE,QWRITE handler;
    class HEADLESS phase;
    class PRETTYOUT,FMTTOKEN newComponent;
    class QLOG,TOKENMD,PRETTYRENDER output;
```

**Color Legend:**
| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Input | Data source files (cache, config) |
| Teal | State | In-memory token log and JSON output |
| Orange | Handler | Quota decision + event write logic |
| Purple | Phase | headless.py session executor |
| Green | Modified | Existing formatters updated to emit elapsed |
| Dark Teal | Artifacts | Write-only outputs (JSONL log, markdown files) |
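
The two rendering paths at the end of the token lineage can be sketched as below. The function names `fmt_step_inline` and `fmt_step_markdown` are hypothetical stand-ins for `_fmt_get_token_summary` and `_format_token_summary`, and the exact line layout is an assumption based on the `t:{elapsed:.1f}s` and `elapsed_seconds:` labels in the diagram.

```python
def fmt_step_inline(step: dict) -> str:
    """One-line render, with elapsed time rounded to one decimal place."""
    return (
        f"{step['step_name']} x{step['invocation_count']} "
        f"[in:{step['input_tokens']} out:{step['output_tokens']} "
        f"t:{step['elapsed_seconds']:.1f}s]"
    )


def fmt_step_markdown(step: dict) -> str:
    """Per-step Markdown section, with elapsed_seconds as one more list item."""
    lines = [f"## {step['step_name']}", ""]
    for key in ("input_tokens", "output_tokens", "invocation_count", "elapsed_seconds"):
        lines.append(f"- {key}: {step[key]}")
    return "\n".join(lines)


# Values taken from the "implement" step reported in this PR.
step = {
    "step_name": "implement",
    "input_tokens": 20233,
    "output_tokens": 499703,
    "invocation_count": 16,
    "elapsed_seconds": 11896.63371942683,
}
inline = fmt_step_inline(step)
section = fmt_step_markdown(step)
```

Both functions read the same `elapsed_seconds` value, so the inline view and the written file cannot disagree about the underlying number, only about its precision.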

### Operational Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;

    subgraph Config ["CONFIGURATION"]
        direction LR
        ENVCACHE["AUTOSKILLIT_QUOTA_CACHE<br/>━━━━━━━━━━<br/>quota cache path override"]
        ENVLOG["AUTOSKILLIT_LOG_DIR<br/>━━━━━━━━━━<br/>log root override<br/>(XDG_DATA_HOME fallback)"]
        HOOKCFG["hook_config.json<br/>━━━━━━━━━━<br/>threshold: 90.0<br/>cache_max_age: 300s<br/>cache_path"]
    end

    subgraph Hooks ["HOOKS (Claude Code Lifecycle)"]
        direction TB
        QHOOK["● quota_check.py<br/>━━━━━━━━━━<br/>PreToolUse on run_skill<br/>approve / block decision"]
        PHOOK["● pretty_output.py<br/>━━━━━━━━━━<br/>PostToolUse on all MCP tools<br/>reformats JSON → Markdown-KV"]
    end

    subgraph MCP ["MCP TOOLS (Status & Telemetry)"]
        direction TB
        GETTOK["get_token_summary<br/>━━━━━━━━━━<br/>ungated, inline<br/>JSON: steps[].elapsed_seconds"]
        WRITETELE["write_telemetry_files<br/>━━━━━━━━━━<br/>kitchen-gated<br/>output_dir: str"]
        TOKTOOL["● tools_status.py<br/>━━━━━━━━━━<br/>_format_token_summary<br/>adds elapsed_seconds: line"]
    end

    subgraph Observability ["OBSERVABILITY OUTPUTS (Write-Only)"]
        direction TB
        QLOG["★ quota_events.jsonl<br/>━━━━━━━━━━<br/>append-only JSONL at log root<br/>event: approved|blocked|<br/>cache_miss|parse_error"]
        PRETTYRENDER["● inline token render<br/>━━━━━━━━━━<br/>step x{n} [in:X out:X cached:X t:N.Ns]<br/>Claude's tool response view"]
        TOKENMD["● token_summary.md<br/>━━━━━━━━━━<br/>## step_name<br/>- elapsed_seconds: N<br/>atomic write"]
    end

    subgraph Querying ["OPERATOR QUERY PATTERNS"]
        direction TB
        JQQUERY["jq 'select(.event==#quot;blocked#quot;)'<br/>━━━━━━━━━━<br/>quota_events.jsonl<br/>filter blocked events"]
        JQCOUNT["jq -r '.event' | sort | uniq -c<br/>━━━━━━━━━━<br/>event type distribution"]
    end

    %% CONFIG → HOOKS %%
    ENVCACHE -->|"cache path"| QHOOK
    ENVLOG -->|"log root"| QHOOK
    HOOKCFG -->|"threshold, cache_max_age"| QHOOK

    %% HOOKS → OUTPUTS %%
    QHOOK -.->|"★ _write_quota_event() appends"| QLOG
    PHOOK -->|"_fmt_get_token_summary route"| PRETTYRENDER

    %% MCP → HOOK INTERACTION %%
    GETTOK -->|"JSON response → PostToolUse"| PHOOK
    GETTOK -->|"steps[].elapsed_seconds"| TOKTOOL
    WRITETELE --> TOKTOOL
    TOKTOOL -.->|"atomic write"| TOKENMD

    %% OPERATOR QUERIES %%
    QLOG -.->|"operator queries"| JQQUERY
    QLOG -.->|"operator queries"| JQCOUNT

    %% CLASS ASSIGNMENTS %%
    class ENVCACHE,ENVLOG,HOOKCFG phase;
    class QHOOK,PHOOK handler;
    class GETTOK,WRITETELE,TOKTOOL stateNode;
    class QLOG,TOKENMD,PRETTYRENDER output;
    class JQQUERY,JQCOUNT cli;
```

**Color Legend:**
| Color | Category | Description |
|-------|----------|-------------|
| Purple | Config | Configuration sources (env vars, hook config file) |
| Orange | Hooks | Claude Code lifecycle hooks (PreToolUse, PostToolUse) |
| Teal | MCP Tools | Status and telemetry MCP tool handlers |
| Dark Teal | Outputs | Write-only observability artifacts |
| Dark Blue | Querying | Operator query patterns and diagnostic commands |
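
For operators who prefer a script to `jq`, the two query patterns in the diagram translate roughly as follows. This is a sketch under the assumption that each line of `quota_events.jsonl` is a JSON object with an `event` key; `load_events` and the `utilization` field in the sample data are illustrative names only.

```python
import json
from collections import Counter


def load_events(text: str) -> list:
    """Parse JSONL text, skipping blank, malformed, or non-object lines."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a torn final line from an interrupted write
        if isinstance(record, dict):
            events.append(record)
    return events


sample = "\n".join(
    json.dumps(e)
    for e in [
        {"event": "approved", "utilization": 41.0},
        {"event": "blocked", "utilization": 93.5},
        {"event": "cache_miss"},
        {"event": "approved", "utilization": 55.2},
    ]
) + '\n{"event": "blo'  # simulated partial line

events = load_events(sample)
# Equivalent of: jq 'select(.event=="blocked")'
blocked = [e for e in events if e.get("event") == "blocked"]
# Equivalent of: jq -r '.event' | sort | uniq -c
distribution = Counter(e.get("event") for e in events)
```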

Closes #302

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-302-20260310-073833-386833/temp/make-plan/pipeline_observability_plan_2026-03-10_120000.md`

## Token Usage Summary

# Token Summary

## implement

- input_tokens: 20233
- output_tokens: 499703
- cache_creation_input_tokens: 1532389
- cache_read_input_tokens: 99527456
- invocation_count: 16

## audit_impl

- input_tokens: 7600
- output_tokens: 134323
- cache_creation_input_tokens: 520444
- cache_read_input_tokens: 3863375
- invocation_count: 13

## open_pr

- input_tokens: 925
- output_tokens: 73212
- cache_creation_input_tokens: 247356
- cache_read_input_tokens: 4293763
- invocation_count: 4

## retry_worktree

- input_tokens: 45
- output_tokens: 22321
- cache_creation_input_tokens: 111687
- cache_read_input_tokens: 3405410
- invocation_count: 1

## make_plan

- input_tokens: 44
- output_tokens: 69854
- cache_creation_input_tokens: 121021
- cache_read_input_tokens: 2735054
- invocation_count: 1

## review_pr

- input_tokens: 8568
- output_tokens: 166539
- cache_creation_input_tokens: 420350
- cache_read_input_tokens: 5838460
- invocation_count: 7

## dry_walkthrough

- input_tokens: 57
- output_tokens: 30444
- cache_creation_input_tokens: 114452
- cache_read_input_tokens: 1679519
- invocation_count: 3

## resolve_review

- input_tokens: 404
- output_tokens: 211704
- cache_creation_input_tokens: 502896
- cache_read_input_tokens: 23247211
- invocation_count: 7

## fix

- input_tokens: 291
- output_tokens: 113181
- cache_creation_input_tokens: 415961
- cache_read_input_tokens: 19115239
- invocation_count: 6

## investigate

- input_tokens: 2053
- output_tokens: 12607
- cache_creation_input_tokens: 52059
- cache_read_input_tokens: 382017
- invocation_count: 1

## rectify

- input_tokens: 4616
- output_tokens: 21997
- cache_creation_input_tokens: 75946
- cache_read_input_tokens: 854207
- invocation_count: 1

## open_pr_step

- input_tokens: 52
- output_tokens: 27985
- cache_creation_input_tokens: 102646
- cache_read_input_tokens: 1871801
- invocation_count: 2

## analyze_prs

- input_tokens: 13
- output_tokens: 18525
- cache_creation_input_tokens: 55984
- cache_read_input_tokens: 358201
- invocation_count: 1

## merge_pr

- input_tokens: 54
- output_tokens: 11011
- cache_creation_input_tokens: 124523
- cache_read_input_tokens: 1156759
- invocation_count: 4

## create_review_pr

- input_tokens: 35
- output_tokens: 17881
- cache_creation_input_tokens: 52674
- cache_read_input_tokens: 1060679
- invocation_count: 1

## plan

- input_tokens: 6248
- output_tokens: 238067
- cache_creation_input_tokens: 765044
- cache_read_input_tokens: 9628970
- invocation_count: 9

## verify

- input_tokens: 2284
- output_tokens: 119880
- cache_creation_input_tokens: 539579
- cache_read_input_tokens: 6350120
- invocation_count: 10

## Timing Summary

# Timing Summary

## implement

- total_seconds: 11896.63371942683
- invocation_count: 16

## audit_impl

- total_seconds: 3964.286584958052
- invocation_count: 13

## open_pr

- total_seconds: 1599.6864469241118
- invocation_count: 4

## retry_worktree

- total_seconds: 483.85407130105887
- invocation_count: 1

## make_plan

- total_seconds: 1572.1653282630723
- invocation_count: 1

## review_pr

- total_seconds: 2817.96798053605
- invocation_count: 7

## dry_walkthrough

- total_seconds: 539.9472401479725
- invocation_count: 3

## resolve_review

- total_seconds: 4706.892171586864
- invocation_count: 7

## fix

- total_seconds: 3109.516113938147
- invocation_count: 6

## investigate

- total_seconds: 586.5382829050068
- invocation_count: 1

## rectify

- total_seconds: 725.7269057529047
- invocation_count: 1

## open_pr_step

- total_seconds: 579.5783970400225
- invocation_count: 2

## analyze_prs

- total_seconds: 279.4915256820386
- invocation_count: 1

## merge_pr

- total_seconds: 285.0633245370118
- invocation_count: 4

## create_review_pr

- total_seconds: 441.7614419440506
- invocation_count: 1

## plan

- total_seconds: 6489.170513134953
- invocation_count: 9

## verify

- total_seconds: 2771.1426656231615
- invocation_count: 10

## clone

- total_seconds: 5.5492194169996765
- invocation_count: 6

## capture_base_sha

- total_seconds: 0.017372207999414968
- invocation_count: 6

## create_branch

- total_seconds: 4.241269170999658
- invocation_count: 6

## push_merge_target

- total_seconds: 5.88246012000036
- invocation_count: 6

## test

- total_seconds: 497.636292823996
- invocation_count: 9

## merge

- total_seconds: 877.7062499809981
- invocation_count: 7

## push

- total_seconds: 5.489298081000015
- invocation_count: 5

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek added a commit that referenced this pull request Mar 11, 2026
…nto integration (#323)

## Integration Summary

Collapsed 7 PRs into `pr-batch/pr-merge-20260310-163009` targeting
`integration`.

## Merged PRs

| # | Title | Complexity | Additions | Deletions | Overlaps |
|---|-------|-----------|-----------|-----------|---------|
| #319 | Add Step 4.5 Historical Regression Check to dry-walkthrough skill | simple | +200 | -0 | — |
| #318 | Implementation Plan: Pipeline Observability — Quota Guard Logging and Per-Step Elapsed Time | simple | +267 | -23 | — |
| #316 | Release CI Automation — Version Bump, Branch Sync, and Release Infrastructure | simple | +471 | -0 | — |
| #314 | docs: complete release documentation sprint | simple | +1344 | -104 | — |
| #315 | Recipe remediation — source_dir resolution and cook command display | simple | +317 | -71 | #320 |
| #317 | Implementation Plan: PR Review Pipeline Gates — Fidelity Checks, CI Gating, and Review-First Enforcement | simple | +1057 | -7 | #320 |
| #320 | feat: add recipe composition with run_recipe tool and dev-sprint consumer | needs_check | +724 | -19 | #315, #317 |

## Audit

**Verdict:** GO

## Architecture Impact

### Development Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    subgraph Structure ["PROJECT STRUCTURE"]
        direction LR
        SRC["src/autoskillit/<br/>━━━━━━━━━━<br/>105 source files<br/>10 sub-packages"]
        TESTS["tests/<br/>━━━━━━━━━━<br/>170 test files<br/>Mirrors src layout"]
    end

    subgraph Build ["BUILD TOOLING"]
        direction LR
        PYPROJECT["● pyproject.toml<br/>━━━━━━━━━━<br/>hatchling backend<br/>uv package manager"]
        TASKFILE["Taskfile.yml<br/>━━━━━━━━━━<br/>task test-all<br/>task test-check<br/>task install-worktree"]
        UVLOCK["uv.lock<br/>━━━━━━━━━━<br/>Locked manifest"]
    end

    subgraph PreCommit ["PRE-COMMIT HOOKS"]
        direction TB
        RUFF_FMT["ruff-format<br/>━━━━━━━━━━<br/>Auto-format<br/>(writes source)"]
        RUFF_LINT["ruff check<br/>━━━━━━━━━━<br/>Lint --fix<br/>target: py311"]
        MYPY["mypy<br/>━━━━━━━━━━<br/>src/ type check<br/>--ignore-missing"]
        UVCHECK["uv lock --check<br/>━━━━━━━━━━<br/>Lockfile guard"]
        NOGEN["no-generated-configs<br/>━━━━━━━━━━<br/>Blocks hooks.json<br/>recipes/diagrams/"]
        GITLEAKS["gitleaks v8.30.0<br/>━━━━━━━━━━<br/>Secret scanning"]
    end

    subgraph Testing ["TEST FRAMEWORK"]
        direction LR
        PYTEST["pytest -n 4<br/>━━━━━━━━━━<br/>asyncio_mode=auto<br/>timeout=60s"]
        FIXTURES["conftest.py<br/>━━━━━━━━━━<br/>StatefulMockTester<br/>MockSubprocessRunner<br/>tool_ctx fixture"]
        IMPORTLINT["import-linter<br/>━━━━━━━━━━<br/>L0→L1→L2→L3<br/>Architecture gates"]
    end

    subgraph NewTests ["★ NEW TEST FILES"]
        direction LR
        TGATE["★ test_analyze_prs_gates.py<br/>━━━━━━━━━━<br/>PR gate analysis"]
        TFIDELITY["★ test_review_pr_fidelity.py<br/>━━━━━━━━━━<br/>Fidelity checks"]
        TRELEASE["★ test_release_workflows.py<br/>━━━━━━━━━━<br/>CI workflow tests"]
        TDRY["★ test_dry_walkthrough_contracts.py<br/>━━━━━━━━━━<br/>10 contract assertions"]
    end

    subgraph CI ["CI/CD WORKFLOWS"]
        direction TB
        TESTS_CI["tests.yml<br/>━━━━━━━━━━<br/>ubuntu + macos matrix<br/>preflight: uv lock --check<br/>uv sync + task test-all"]
        VBUMP["★ version-bump.yml<br/>━━━━━━━━━━<br/>integration→main merged<br/>MAJOR.MINOR.PATCH+1<br/>sync main→integration"]
        RELEASE["★ release.yml<br/>━━━━━━━━━━<br/>stable branch merge<br/>MAJOR.MINOR+1.0<br/>git tag + GitHub Release"]
    end

    subgraph EntryPoints ["ENTRY POINTS"]
        direction LR
        CLI_EP["autoskillit CLI<br/>━━━━━━━━━━<br/>autoskillit.cli:main<br/>● app.py modified"]
        INSTALL["★ install.sh<br/>━━━━━━━━━━<br/>End-user installer<br/>uv tool install<br/>autoskillit install"]
    end

    %% FLOW %%
    SRC --> PYPROJECT
    TESTS --> PYPROJECT
    PYPROJECT --> TASKFILE
    PYPROJECT --> UVLOCK

    TASKFILE --> RUFF_FMT
    RUFF_FMT --> RUFF_LINT
    RUFF_LINT --> MYPY
    MYPY --> UVCHECK
    UVCHECK --> NOGEN
    NOGEN --> GITLEAKS

    TASKFILE --> PYTEST
    PYTEST --> FIXTURES
    PYTEST --> IMPORTLINT
    PYTEST --> TGATE
    PYTEST --> TFIDELITY
    PYTEST --> TRELEASE
    PYTEST --> TDRY

    PYPROJECT --> TESTS_CI
    TESTS_CI --> VBUMP
    VBUMP --> RELEASE

    PYPROJECT --> CLI_EP
    INSTALL --> CLI_EP

    %% CLASS ASSIGNMENTS %%
    class SRC,TESTS cli;
    class PYPROJECT,TASKFILE,UVLOCK phase;
    class RUFF_FMT,RUFF_LINT,MYPY,UVCHECK,NOGEN,GITLEAKS detector;
    class PYTEST,FIXTURES,IMPORTLINT handler;
    class TGATE,TFIDELITY,TRELEASE,TDRY,VBUMP,RELEASE,INSTALL newComponent;
    class TESTS_CI stateNode;
    class CLI_EP output;
```

### Process Flow Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;

    %% TERMINALS %%
    START([START])
    SUCCESS([SUCCESS])
    VALFAIL([VALIDATION ERROR])

    subgraph Val ["Load-Time Validation"]
        direction LR
        ValidateRecipe["validate_recipe<br/>━━━━━━━━━━<br/>recipe/_api.py"]
        RulesRecipe["★ rules_recipe.py<br/>━━━━━━━━━━<br/>unknown-sub-recipe rule"]
        AvailCtx["★ available_recipes<br/>━━━━━━━━━━<br/>ValidationContext field"]
        NameKnown{"sub-recipe<br/>name known?"}
    end

    subgraph Exec ["Runtime Execution"]
        direction TB
        ToolHandler["★ run_recipe<br/>━━━━━━━━━━<br/>server/tools_recipe.py"]
        FindRecipe["★ _find_recipe()<br/>━━━━━━━━━━<br/>server/helpers.py"]
        FoundDec{"recipe<br/>file found?"}
        RunSub["★ _run_subrecipe_session()<br/>━━━━━━━━━━<br/>server/helpers.py"]
        BuildPrompt["★ build_subrecipe_prompt()<br/>━━━━━━━━━━<br/>cli/_prompts.py"]
        BuildCmd["★ build_subrecipe_cmd()<br/>━━━━━━━━━━<br/>execution/commands.py"]
        SubSess["★ run_subrecipe_session()<br/>━━━━━━━━━━<br/>execution/headless.py"]
    end

    HeadlessProc["Headless Claude<br/>━━━━━━━━━━<br/>AUTOSKILLIT_KITCHEN_OPEN=1<br/>executes sub-recipe YAML"]
    ResultDec{"success: True?"}

    START -->|"recipe YAML loaded"| ValidateRecipe
    ValidateRecipe --> RulesRecipe
    RulesRecipe --> AvailCtx
    AvailCtx --> NameKnown
    NameKnown -->|"unknown name"| VALFAIL
    NameKnown -->|"valid"| ToolHandler
    ToolHandler --> FindRecipe
    FindRecipe --> FoundDec
    FoundDec -->|"not found"| SUCCESS
    FoundDec -->|"found"| RunSub
    RunSub --> BuildPrompt
    RunSub --> BuildCmd
    BuildPrompt -->|"prompt str"| SubSess
    BuildCmd -->|"CLI cmd + env"| SubSess
    SubSess --> HeadlessProc
    HeadlessProc --> ResultDec
    ResultDec -->|"yes"| SUCCESS
    ResultDec -->|"no"| SUCCESS

    %% CLASS ASSIGNMENTS %%
    class START,SUCCESS,VALFAIL terminal;
    class ValidateRecipe handler;
    class RulesRecipe detector;
    class AvailCtx,NameKnown,FoundDec stateNode;
    class ToolHandler,FindRecipe,RunSub,BuildPrompt,BuildCmd,SubSess newComponent;
    class HeadlessProc integration;
    class ResultDec stateNode;
```
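
The load-time branch of this flow (the `unknown-sub-recipe` rule consulting `available_recipes`) might look like the sketch below. Only the rule name and the `available_recipes` field come from the diagram. The `ValidationContext` shape, the `sub_recipe` step key, and the function name `check_unknown_sub_recipes` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ValidationContext:
    """Hypothetical stand-in carrying the set of recipe names known at load time."""
    available_recipes: frozenset = field(default_factory=frozenset)


def check_unknown_sub_recipes(steps: dict, ctx: ValidationContext) -> list:
    """Return one error string per step that names a recipe not in the context."""
    errors = []
    for name, step in steps.items():
        ref = step.get("sub_recipe")  # assumed key; steps without it are skipped
        if ref is not None and ref not in ctx.available_recipes:
            errors.append(f"unknown-sub-recipe: step '{name}' references '{ref}'")
    return errors


ctx = ValidationContext(available_recipes=frozenset({"implementation", "remediation"}))
steps = {
    "build": {"sub_recipe": "implementation"},
    "cleanup": {"sub_recipe": "tidy-up"},
    "test": {"tool": "test_check"},
}
errors = check_unknown_sub_recipes(steps, ctx)
```

Rejecting unknown names at load time means the runtime `_find_recipe()` lookup should only miss when a recipe file disappears between validation and execution.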

### Deployment Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    subgraph DevMachine ["DEVELOPER MACHINE"]
        direction TB

        subgraph Bootstrap ["BOOTSTRAP (★ NEW)"]
            INSTALL_SH["★ install.sh<br/>━━━━━━━━━━<br/>curl from GitHub stable<br/>Python 3.11+ · uv · claude<br/>uv tool install"]
        end

        subgraph UVTool ["INSTALLED PACKAGE"]
            PKG["autoskillit package<br/>━━━━━━━━━━<br/>~/.local/share/uv/tools/<br/>autoskillit/lib/python3.x/<br/>site-packages/autoskillit/"]
            PLUGIN_CACHE["Plugin cache<br/>━━━━━━━━━━<br/>~/.claude/plugins/cache/<br/>autoskillit-local/"]
        end

        subgraph ClaudeCode ["CLAUDE CODE PROCESS"]
            CC["Claude Code IDE<br/>━━━━━━━━━━<br/>Reads hooks from<br/>~/.claude/settings.json<br/>.claude/settings.json"]
            MCP["● MCP Server (stdio)<br/>━━━━━━━━━━<br/>FastMCP · stdin/stdout<br/>12 ungated + 26 kitchen tools<br/>★ run_recipe · ★ get_ci_status"]
        end

        subgraph HeadlessSessions ["HEADLESS SUBPROCESS SESSIONS (pty)"]
            SKILL_SESS["Skill session<br/>━━━━━━━━━━<br/>claude --print prompt<br/>AUTOSKILLIT_HEADLESS=1<br/>KITCHEN_OPEN via open_kitchen"]
            SUBRECIPE["★ Sub-recipe session<br/>━━━━━━━━━━<br/>claude --print sous-chef-prompt<br/>AUTOSKILLIT_KITCHEN_OPEN=1<br/>NO HEADLESS flag<br/>build_subrecipe_cmd"]
        end

        subgraph LocalStorage ["LOCAL STORAGE"]
            SESSION_LOGS[("Session logs<br/>━━━━━━━━━━<br/>~/.local/share/autoskillit/<br/>logs/sessions/*.jsonl<br/>proc_trace · anomalies")]
            CRASH_TMP[("Crash traces<br/>━━━━━━━━━━<br/>/dev/shm/<br/>autoskillit_trace_pid.jsonl<br/>Linux tmpfs")]
            PROJ_STORE[("Project storage<br/>━━━━━━━━━━<br/>project/.autoskillit/<br/>config.yaml · .secrets.yaml<br/>temp/ · recipes/")]
            CLAUDE_LOGS[("Claude Code logs<br/>━━━━━━━━━━<br/>~/.claude/projects/<br/>encoded-cwd/<br/>session-id.jsonl")]
        end
    end

    subgraph GitHub ["GITHUB INFRASTRUCTURE"]
        direction TB
        GH_ACTIONS["GitHub Actions<br/>━━━━━━━━━━<br/>tests.yml: ubuntu + macos-15<br/>uv sync · task test-all"]
        VBUMP_WF["★ version-bump.yml<br/>━━━━━━━━━━<br/>integration→main merge<br/>patch bump + uv lock<br/>sync main→integration"]
        RELEASE_WF["★ release.yml<br/>━━━━━━━━━━<br/>stable branch merge<br/>minor bump + uv lock<br/>git tag + GitHub Release"]
        GH_REPO["GitHub Repository<br/>━━━━━━━━━━<br/>main · integration · stable<br/>Issues · PRs · Releases"]
        GH_API["GitHub API<br/>━━━━━━━━━━<br/>api.github.com<br/>Actions runs · CI jobs<br/>Issues · PR reviews"]
    end

    subgraph Anthropic ["EXTERNAL: ANTHROPIC"]
        ANT_API["Anthropic API<br/>━━━━━━━━━━<br/>api.anthropic.com<br/>/api/oauth/usage<br/>5-hour quota check"]
    end

    %% BOOTSTRAP FLOW %%
    INSTALL_SH -->|"uv tool install git@stable"| PKG
    PKG -->|"autoskillit install<br/>claude plugin install"| PLUGIN_CACHE

    %% CLAUDE CODE %%
    PLUGIN_CACHE -->|"plugin load"| CC
    CC -->|"spawns stdio"| MCP

    %% MCP → SESSIONS %%
    MCP -->|"run_skill: spawns pty"| SKILL_SESS
    MCP -->|"★ run_recipe: spawns pty<br/>KITCHEN_OPEN=1"| SUBRECIPE

    %% STORAGE WRITES %%
    SKILL_SESS -->|"writes diagnostics"| SESSION_LOGS
    SKILL_SESS -->|"crash: writes"| CRASH_TMP
    SUBRECIPE -->|"writes diagnostics"| SESSION_LOGS
    MCP -->|"reads/writes"| PROJ_STORE
    CC -->|"writes JSONL"| CLAUDE_LOGS

    %% CI/RELEASE %%
    GH_REPO -->|"push/PR event"| GH_ACTIONS
    GH_REPO -->|"integration→main merge"| VBUMP_WF
    GH_REPO -->|"★ PR→stable merge"| RELEASE_WF
    RELEASE_WF -->|"creates"| GH_REPO
    VBUMP_WF -->|"pushes commits"| GH_REPO

    %% EXTERNAL API CALLS %%
    MCP -->|"httpx HTTPS<br/>CI polling"| GH_API
    GH_ACTIONS -->|"gh CLI"| GH_API
    MCP -->|"httpx HTTPS<br/>quota check"| ANT_API
    SKILL_SESS -->|"Anthropic API<br/>inference"| ANT_API

    %% CLASS ASSIGNMENTS %%
    class INSTALL_SH,VBUMP_WF,RELEASE_WF,SUBRECIPE newComponent;
    class CC,SKILL_SESS cli;
    class MCP,GH_ACTIONS handler;
    class PKG,PLUGIN_CACHE phase;
    class SESSION_LOGS,CRASH_TMP,PROJ_STORE,CLAUDE_LOGS stateNode;
    class GH_REPO,GH_API,ANT_API integration;
```

Closes #307
Closes #302
Closes #298
Closes #297
Closes #300
Closes #303

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek added a commit that referenced this pull request Mar 15, 2026
…, Headless Isolation (#404)

## Summary

Integration rollup of **43 PRs** (#293–#406) consolidating **62
commits** across **291 files** (+27,909 / −6,040 lines). This release
advances AutoSkillit from v0.2.0 to v0.3.1 with GitHub merge queue
integration, sub-recipe composition, a PostToolUse output reformatter,
headless session isolation guards, and comprehensive pipeline
observability — plus 24 new bundled skills, 3 new MCP tools, and 47 new
test files.

---

## Major Features

### GitHub Merge Queue Integration (#370, #362, #390)
- New `wait_for_merge_queue` MCP tool — polls a PR through GitHub's
merge queue until merged, ejected, or timed out (default 600s). Uses
REST + GraphQL APIs with stuck-queue detection and auto-merge
re-enrollment
- New `DefaultMergeQueueWatcher` L1 service (`execution/merge_queue.py`)
— never raises; all outcomes are structured results
- `parse_merge_queue_response()` pure function for GraphQL queue entry
parsing
- New `auto_merge` ingredient in `implementation.yaml` and
`remediation.yaml` — enrolls PRs in the merge queue after CI passes
- Full queue-mode path added to `merge-prs.yaml`: detect queue → enqueue
→ wait → handle ejections → re-enter
- `analyze-prs` skill gains Step 0.5 (merge queue detection) and Step
1.5 (CI/review eligibility filtering)
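
The "never raises" polling contract described above can be sketched as a loop that converts every outcome, including exceptions, into a structured result. This is an illustration only: `QueueResult`, `wait_for_merge`, the state strings, and the injected `fetch_state` callable are hypothetical, and the real watcher's REST and GraphQL calls, stuck-queue detection, and re-enrollment are not modeled.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QueueResult:
    success: bool
    state: str  # "merged", "ejected", "timeout", or "error"
    detail: str = ""


def wait_for_merge(
    fetch_state: Callable[[], str],
    timeout_s: float = 600.0,  # default taken from the tool description
    poll_s: float = 15.0,      # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> QueueResult:
    """Poll until merged, ejected, or timed out; never raise to the caller."""
    deadline = clock() + timeout_s
    while True:
        try:
            state = fetch_state()
        except Exception as exc:  # any failure becomes a structured result
            return QueueResult(False, "error", str(exc))
        if state == "merged":
            return QueueResult(True, "merged")
        if state == "ejected":
            return QueueResult(False, "ejected")
        if clock() >= deadline:
            return QueueResult(False, "timeout")
        sleep(poll_s)


states = iter(["queued", "queued", "merged"])
result = wait_for_merge(lambda: next(states), sleep=lambda _s: None)
timed_out = wait_for_merge(lambda: "queued", timeout_s=0.0, sleep=lambda _s: None)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting, which is one way such a watcher could be exercised in unit tests.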

### Sub-Recipe Composition (#380)
- Recipe steps can now reference sub-recipes via `sub_recipe` + `gate`
fields — lazy-loaded and merged at validation time
- Composition engine in `recipe/_api.py`: `_merge_sub_recipe()` inlines
sub-recipe steps with safe name-prefixing and route remapping (`done` →
parent's `on_success`, `escalate` → parent's `on_failure`)
- `_build_active_recipe()` evaluates gate ingredients against
overrides/defaults; dual validation runs on both active and combined
recipes
- First sub-recipe: `sprint-prefix.yaml` — triage → plan → confirm →
dispatch workflow, gated by `sprint_mode` ingredient (hidden, default
false)
- Both `implementation.yaml` and `remediation.yaml` gain `sprint_entry`
placeholder step
- New semantic rules: `unknown-sub-recipe` (ERROR),
`circular-sub-recipe` (ERROR) with DFS cycle detection
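
A DFS cycle check of the kind the `circular-sub-recipe` rule relies on can be sketched as follows. It assumes the recipe references have already been collected into a plain mapping from recipe name to referenced sub-recipe names; the function name `find_sub_recipe_cycle`, that input shape, and the recipe names `a` and `b` are illustrative, not the project's API or bundled recipes.

```python
from typing import Optional

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored


def find_sub_recipe_cycle(graph: dict) -> Optional[list]:
    """Return one cycle as a list of names (first == last), or None if acyclic."""
    color = {name: WHITE for name in graph}
    path = []

    def visit(node: str) -> Optional[list]:
        color[node] = GRAY
        path.append(node)
        for child in graph.get(node, []):
            state = color.get(child, WHITE)
            if state == GRAY:
                # Back edge: child is already on the current path.
                return path[path.index(child):] + [child]
            if state == WHITE:
                cycle = visit(child)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for name in graph:
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


acyclic = find_sub_recipe_cycle({"implementation": ["sprint-prefix"], "sprint-prefix": []})
cyclic = find_sub_recipe_cycle({"a": ["b"], "b": ["sprint-prefix"], "sprint-prefix": ["a"]})
```

Returning the offending path rather than a boolean lets the validation error name every recipe involved in the loop.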

### PostToolUse Output Reformatter (#293, #405)
- `pretty_output.py` — new 671-line PostToolUse hook that rewrites raw
MCP JSON responses to Markdown-KV before Claude consumes them (30–77%
token overhead reduction)
- Dedicated formatters for 11 high-traffic tools (`run_skill`,
`run_cmd`, `test_check`, `merge_worktree`, `get_token_summary`, etc.)
plus a generic KV formatter for remaining tools
- Pipeline vs. interactive mode detection via hook config file
- Unwraps Claude Code's `{"result": "<json-string>"}` envelope before
dispatching
- 1,516-line test file with 40+ behavioral tests
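
The envelope unwrapping and generic key-value fallback described above can be sketched like this. The `{"result": "<json-string>"}` envelope comes from the list above; the function names and the exact `key: value` line format are assumptions, since the real Markdown-KV layout in `pretty_output.py` is not shown here.

```python
import json


def unwrap_result(raw: str) -> object:
    """Parse a tool response, unwrapping a {"result": "<json-string>"} envelope."""
    data = json.loads(raw)
    if isinstance(data, dict) and set(data) == {"result"} and isinstance(data["result"], str):
        try:
            return json.loads(data["result"])
        except json.JSONDecodeError:
            return data["result"]  # plain-text result: pass it through unchanged
    return data


def to_markdown_kv(data: object) -> str:
    """Render a flat dict as key: value lines; nested values stay compact JSON."""
    if not isinstance(data, dict):
        return str(data)
    lines = []
    for key, value in data.items():
        if isinstance(value, (dict, list)):
            value = json.dumps(value, separators=(",", ":"))
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


inner = {"success": True, "exit_code": 0, "steps": ["plan", "verify"]}
raw = json.dumps({"result": json.dumps(inner)})
rendered = to_markdown_kv(unwrap_result(raw))
```

Most of the saving in a render like this comes from dropping the doubled JSON quoting and escaping that the envelope introduces.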

### Headless Session Isolation (#359, #393, #397, #405, #406)
- **Env isolation**: `build_sanitized_env()` strips
`AUTOSKILLIT_PRIVATE_ENV_VARS` from subprocess environments, preventing
`AUTOSKILLIT_HEADLESS=1` from leaking into test runners
- **CWD path contamination defense**: `_inject_cwd_anchor()` anchors all
relative paths to session CWD; `_validate_output_paths()` checks
structured output tokens against CWD prefix; `_scan_jsonl_write_paths()`
post-session scanner catches actual Write/Edit/Bash tool calls outside
CWD
- **Headless orchestration guard**: new PreToolUse hook blocks
`run_skill`/`run_cmd`/`run_python` when `AUTOSKILLIT_HEADLESS=1`,
enforcing Tier 1/Tier 2 nesting invariant
- **`_require_not_headless()` server-side guard**: blocks 10
orchestration-only tools from headless sessions at the handler layer
- **Unified error response contract**: `headless_error_result()`
produces consistent 9-field responses;
`_build_headless_error_response()` canonical builder for all failure
paths in `tools_integrations.py`
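
Two of the defenses above, env stripping and the CWD prefix check, reduce to small pure functions. The sketch below is illustrative only: `sanitized_env` and `is_inside_cwd` are hypothetical names, and the frozenset contents are assumed (the description names `AUTOSKILLIT_HEADLESS` but does not list every private variable).

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Assumed contents; only AUTOSKILLIT_HEADLESS is named in the description.
PRIVATE_ENV_VARS = frozenset({"AUTOSKILLIT_HEADLESS"})


def sanitized_env(base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy an environment mapping without the private variables."""
    source = os.environ if base is None else base
    return {k: v for k, v in source.items() if k not in PRIVATE_ENV_VARS}


def is_inside_cwd(candidate: str, cwd: str) -> bool:
    """True if `candidate`, resolved relative to `cwd`, stays under `cwd`."""
    root = Path(cwd).resolve()
    # Joining with an absolute candidate discards `root`, so absolute paths
    # outside the session directory are rejected as well.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

Resolving both paths before comparing them means `..` segments cannot be used to escape the session directory while still matching a naive string prefix.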

### Cook UX Overhaul (#375, #363)
- `open_kitchen` now accepts optional `name` + `overrides` — opens
kitchen AND loads recipe in a single call
- Pre-launch terminal preview with ANSI-colored flow diagram and
ingredients table via new `cli/_ansi.py` module
- `--dangerously-skip-permissions` warning banner with interactive
confirmation prompt
- Randomized session greetings from themed pools
- Orchestrator prompt rewritten: recipe YAML no longer injected via
`--append-system-prompt`; session calls `open_kitchen('{recipe_name}')`
as first action
- Conversational ingredient collection replaces mechanical per-field
prompting

---

## New MCP Tools

| Tool | Gate | Description |
|------|------|-------------|
| `wait_for_merge_queue` | Kitchen | Polls PR through GitHub merge queue (REST + GraphQL) |
| `set_commit_status` | Kitchen | Posts GitHub Commit Status to a SHA for review-first gating |
| `get_quota_events` | Ungated | Surfaces quota guard decisions from `quota_events.jsonl` |

---

## Pipeline Observability (#318, #341)

- **`TelemetryFormatter`** (`pipeline/telemetry_fmt.py`): single source
of truth for all telemetry rendering; replaces dual-formatter
anti-pattern. Rendering modes include Markdown table, terminal table,
and compact KV (for PostToolUse hook)
- `get_token_summary` and `get_timing_summary` gain `format` parameter
(`"json"` | `"table"`)
- `wall_clock_seconds` merged into token summary output — see duration
alongside token counts in one call
- **Telemetry clear marker**: `write_telemetry_clear_marker()` /
`read_telemetry_clear_marker()` prevent token accounting drift on MCP
server restart after `clear=True`
- **Quota event logging**: `quota_check.py` hook now writes structured
JSONL events (`cache_miss`, `parse_error`, `blocked`, `approved`) to
`quota_events.jsonl`
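
As a rough illustration of the quota event log, the sketch below appends one JSON object per line and reads the file back. The four event names come from the description above; the function names, the `ts` field, and the free-form detail fields are assumptions, not the hook's actual schema.

```python
import json
import time
from pathlib import Path

# Event names as listed in the description.
VALID_EVENTS = {"approved", "blocked", "cache_miss", "parse_error"}


def log_quota_event(log_path, event: str, **details) -> dict:
    """Append one event as a single JSON line and return the record."""
    if event not in VALID_EVENTS:
        raise ValueError(f"unknown quota event: {event!r}")
    record = {"ts": time.time(), "event": event, **details}  # `ts` is assumed
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_quota_events(log_path) -> list:
    """Load all events, skipping blank lines."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Append-only JSONL keeps each write independent, so a reader such as `get_quota_events` can parse the file line by line without loading a single large document.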

---

## CI Watcher & Remote Resolution Fixes (#395, #406)

- **`CIRunScope` value object** — carries `workflow` + `head_sha` scope;
replaces bare `head_sha` parameter across all CI watcher signatures
- **Workflow filter**: `wait_for_ci` and `get_ci_status` accept
`workflow` parameter (falls back to project-level `config.ci.workflow`),
preventing unrelated workflows (version bumps, labelers) from satisfying
CI checks
- **`FAILED_CONCLUSIONS` expanded**: `failure` → `{failure, timed_out,
startup_failure, cancelled}`
- **Canonical remote resolver** (`execution/remote_resolver.py`):
`resolve_remote_repo()` with `REMOTE_PRECEDENCE = (upstream, origin)` —
correctly resolves `owner/repo` after `clone_repo` sets `origin` to
`file://` isolation URL
- **Clone isolation fix**: `clone_repo` now always clones from remote
URL (never local path); sets `origin=file:///<clone>` for isolation and
`upstream=<real_url>` for push/CI operations
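
The precedence rule matters because, after `clone_repo`, `origin` is a `file://` URL that carries no `owner/repo` information. A minimal sketch of the lookup, assuming a hypothetical `remotes` mapping of remote name to URL and a simplified GitHub URL pattern (the real `resolve_remote_repo()` signature is not shown here):

```python
import re
from typing import Dict, Optional

REMOTE_PRECEDENCE = ("upstream", "origin")

# Simplified pattern covering https and ssh GitHub URLs (an assumption).
_GITHUB_RE = re.compile(
    r"github\.com[:/](?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)


def resolve_owner_repo(remotes: Dict[str, str]) -> Optional[str]:
    """Return 'owner/repo' from the first remote, in precedence order, that parses."""
    for name in REMOTE_PRECEDENCE:
        url = remotes.get(name)
        if not url:
            continue
        match = _GITHUB_RE.search(url)
        if match:
            return f"{match['owner']}/{match['repo']}"
    return None
```

With `upstream` checked first, an isolated clone still resolves to the real repository, while an ordinary checkout with only `origin` behaves as before.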

---

## PR Pipeline Gates (#317, #343)

- **`pipeline/pr_gates.py`**: `is_ci_passing()`, `is_review_passing()`,
`partition_prs()` — partitions PRs into
eligible/CI-blocked/review-blocked with human-readable reasons
- **`pipeline/fidelity.py`**: `extract_linked_issues()`
(Closes/Fixes/Resolves patterns), `is_valid_fidelity_finding()` schema
validation
- **`check_pr_mergeable`** now returns `mergeable_status` field
alongside boolean
- **`release_issue`** gains `target_branch` + `staged_label` parameters
for staged issue lifecycle on non-default branches (#392)
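
Linked-issue extraction of the Closes/Fixes/Resolves kind is typically a regular-expression scan over the PR body. The sketch below is an assumption about how such a scan might look; `linked_issue_numbers` is a hypothetical name and the pattern may be broader or narrower than the one in `pipeline/fidelity.py`.

```python
import re
from typing import List

# Matches close/closes/closed, fix/fixes/fixed, resolve/resolves/resolved
# followed by an issue reference such as "#123".
_LINK_RE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def linked_issue_numbers(body: str) -> List[int]:
    """Return linked issue numbers in first-seen order, without duplicates."""
    seen: List[int] = []
    for match in _LINK_RE.finditer(body):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen
```

For example, a body containing `Closes #302` and `fixes #65` would yield `[302, 65]`, while a bare mention such as `See #12` is ignored.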

---

## Recipe System Changes

### Structural
- `RecipeIngredient.hidden` field — excluded from ingredients table
(used for internal flags like `sprint_mode`)
- `Recipe.experimental` flag parsed from YAML
- `_TERMINAL_TARGETS` moved to `schema.py` as single source of truth
- `format_ingredients_table()` with sorted display order (required →
auto-detect → flags → optional → constants)
- Diagram rendering engine (~670 lines) removed from `diagrams.py` —
rendering now handled by `/render-recipe` skill; format version bumped
to v7

### Recipe YAML Changes
- **Deleted**: `audit-and-fix.yaml`, `batch-implementation.yaml`,
`bugfix-loop.yaml`
- **Renamed**: `pr-merge-pipeline.yaml` → `merge-prs.yaml`
- **`implementation.yaml`**: merge queue steps,
`auto_merge`/`sprint_mode` ingredients, `base_branch` default → `""`
(auto-detect), CI workflow filter, `extract_pr_number` step
- **`remediation.yaml`**: `topic` → `task` rename, merge queue steps,
`dry_walkthrough` retries:3 with forward-only routing, `verify` → `test`
rename
- **`merge-prs.yaml`**: full queue-mode path, `open-integration-pr` step
(replaces `create-review-pr`), post-PR mergeability polling, review
cycle with `resolve-review` retries

### New Semantic Rules
- `missing-output-patterns` (WARNING) — flags `run_skill` steps without
`expected_output_patterns`
- `unknown-sub-recipe` (ERROR) — validates sub-recipe references exist
- `circular-sub-recipe` (ERROR) — DFS cycle detection
- `unknown-skill-command` (ERROR) — validates skill names against
bundled set
- `telemetry-before-open-pr` (WARNING) — ensures telemetry step precedes
`open-pr`

---

## New Skills (24)

### Architecture Lens Family (13)
`arch-lens-c4-container`, `arch-lens-concurrency`,
`arch-lens-data-lineage`, `arch-lens-deployment`,
`arch-lens-development`, `arch-lens-error-resilience`,
`arch-lens-module-dependency`, `arch-lens-operational`,
`arch-lens-process-flow`, `arch-lens-repository-access`,
`arch-lens-scenarios`, `arch-lens-security`, `arch-lens-state-lifecycle`

### Audit Family (5)
`audit-arch`, `audit-bugs`, `audit-cohesion`, `audit-defense-standards`,
`audit-tests`

### Planning & Diagramming (3)
`elaborate-phase`, `make-arch-diag`, `make-req`

### Bug/Guard Lifecycle (2)
`design-guards`, `verify-diag`

### Pipeline (1)
`open-integration-pr` — creates integration PRs with per-PR details,
arch-lens diagrams, carried-forward `Closes #N` references, and
auto-closes collapsed PRs

### Sprint Planning (1 — gated by sub-recipe)
`sprint-planner` — selects a focused, conflict-free sprint from a triage
manifest

---

## Skill Modifications (Highlights)

- **`analyze-prs`**: merge queue detection, CI/review eligibility
filtering, queue-mode ordering
- **`dry-walkthrough`**: Step 4.5 Historical Regression Check (git
history mining + GitHub issue cross-reference)
- **`review-pr`**: deterministic diff annotation via
`diff_annotator.py`, echo-primary-obligation step, post-completion
confirmation, degraded-mode narration
- **`collapse-issues`**: content fidelity enforcement — per-issue
`fetch_github_issue` calls, copy-mode body assembly (#388)
- **`prepare-issue`**: multi-keyword dedup search, numbered candidate
selection, extend-existing-issue flow
- **`resolve-review`**: GraphQL thread auto-resolution after addressing
findings (#379)
- **`resolve-merge-conflicts`**: conflict resolution decision report
with per-file log (#389)
- **Cross-skill**: output tokens migrated to `key = value` format;
code-index paths made generic with fallback notes; arch-lens references
fully qualified; anti-prose guards at loop boundaries

---

## CLI & Hooks

### New CLI Commands
- `autoskillit install` — plugin installation + cache refresh
- `autoskillit upgrade` — `.autoskillit/scripts/` →
`.autoskillit/recipes/` migration

### CLI Changes
- `doctor`: plugin-aware MCP check, PostToolUse hook scanning, `--fix`
flag removed
- `init`: GitHub repo prompt, `.secrets.yaml` template, plugin-aware
registration
- `chefs-hat`: pre-launch banner, `--dangerously-skip-permissions`
confirmation
- `recipes render`: repurposed from generator to viewer (delegates to
`/render-recipe`)
- `serve`: server import deferred to after `configure_logging()` to
prevent stdout corruption

### New Hooks
- `branch_protection_guard.py` (PreToolUse) — denies
`merge_worktree`/`push_to_remote` targeting protected branches
- `headless_orchestration_guard.py` (PreToolUse) — blocks orchestration
tools in headless sessions
- `pretty_output.py` (PostToolUse) — MCP JSON → Markdown-KV reformatter

### Hook Infrastructure
- `HookDef.event_type` field — registry now handles both PreToolUse and
PostToolUse
- `generate_hooks_json()` groups entries by event type
- `_evict_stale_autoskillit_hooks` and `sync_hooks_to_settings` made
event-type-agnostic

---

## Core & Config

### New Core Modules
- `core/branch_guard.py` — `is_protected_branch()` pure function
- `core/github_url.py` — `parse_github_repo()` +
`normalize_owner_repo()` canonical parsers

### Core Type Expansions
- `AUTOSKILLIT_PRIVATE_ENV_VARS` frozenset
- `WORKER_TOOLS` / `HEADLESS_BLOCKED_UNGATED_TOOLS` split from
`UNGATED_TOOLS`
- `TOOL_CATEGORIES` — categorized listing for `open_kitchen` response
- `CIRunScope` — immutable scope for CI watcher calls
- `MergeQueueWatcher` protocol
- `SkillResult.cli_subtype` + `write_path_warnings` fields
- `SubprocessRunner.env` parameter

### Config
- `safety.protected_branches`: `[main, integration, stable]`
- `github.staged_label`: `"staged"`
- `ci.workflow`: workflow filename filter (e.g., `"tests.yml"`)
- `branching.default_base_branch`: `"integration"` → `"main"`
- `ModelConfig.default`: `str | None` → `str = "sonnet"`

---

## Infrastructure & Release

### Version
- `0.2.0` → `0.3.1` across `pyproject.toml`, `plugin.json`, `uv.lock`
- FastMCP dependency: `>=3.0.2` → `>=3.1.1,<4.0` (#399)

### CI/CD Workflows
- **`version-bump.yml`** (new) — auto patch-bumps `main` on integration
PR merge, force-syncs integration branch one patch ahead
- **`release.yml`** (new) — minor version bump + GitHub Release on merge
to `stable`
- **`codeql.yml`** (new) — CodeQL analysis for `stable` PRs (Python +
Actions)
- **`tests.yml`** — `merge_group:` trigger added; multi-OS now only for
`stable`

### PyPI Readiness
- `pyproject.toml`: `readme`, `license`, `authors`, `keywords`,
`classifiers`, `project.urls`, `hatch.build.targets.sdist` inclusion
list

### readOnlyHint Parallel Execution Fix
- MCP tools are annotated `readOnlyHint=True`, which enables Claude Code
parallel tool execution (~7x speedup). One deliberate exception:
`wait_for_merge_queue` uses `readOnlyHint=False` because it actually
mutates queue state

### Tool Response Exception Boundary
- `track_response_size` decorator catches unhandled exceptions and
serializes them as `{"success": false, "subtype": "tool_exception"}` —
prevents FastMCP opaque error wrapping
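
The exception boundary can be pictured as a decorator that turns any unhandled exception into a serialized failure payload. This is a synchronous sketch for brevity; the name `exception_boundary` and the extra `error` field are assumptions, and the real `track_response_size` decorator also tracks response size, which is omitted here.

```python
import functools
import json


def exception_boundary(func):
    """Wrap a tool handler so unhandled exceptions become a JSON failure string."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return json.dumps(
                {
                    "success": False,
                    "subtype": "tool_exception",
                    # Assumed field: a short description of the failure.
                    "error": f"{type(exc).__name__}: {exc}",
                }
            )

    return wrapper
```

Because the handler always returns a well-formed payload, the caller sees the same response shape for failures as for successes instead of a framework-level error wrapper.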

### SkillResult Subtype Normalization (#358)
- `_normalize_subtype()` gate eliminates dual-source contradiction
between CLI subtype and session outcome
- Class 2 upward: `SUCCEEDED + error_subtype → "success"` (drain-race
artifact)
- Class 1 downward: `non-SUCCEEDED + "success" → "empty_result"` /
`"missing_completion_marker"` / `"adjudicated_failure"`
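
The two normalization classes can be summarized in a few lines. The sketch below is a simplification under stated assumptions: the outcome is modeled as a plain string, `"FAILED"` stands in for any non-`SUCCEEDED` outcome, and the choice among the three downward subtypes is passed in by the caller because the selection logic is not described here.

```python
DOWNWARD_SUBTYPES = {
    "empty_result",
    "missing_completion_marker",
    "adjudicated_failure",
}


def normalize_subtype(
    outcome: str,
    cli_subtype: str,
    downgrade_to: str = "adjudicated_failure",
) -> str:
    """Reconcile the CLI-reported subtype with the adjudicated session outcome."""
    if downgrade_to not in DOWNWARD_SUBTYPES:
        raise ValueError(f"unsupported downgrade subtype: {downgrade_to!r}")
    if outcome == "SUCCEEDED":
        # Class 2 (upward): the session outcome wins over an error subtype.
        return "success"
    if cli_subtype == "success":
        # Class 1 (downward): a failed session cannot report "success".
        return downgrade_to
    return cli_subtype  # already consistent with the outcome
```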

---

## Test Coverage

**47 new test files** (+12,703 lines) covering:

| Area | Key Tests |
|------|-----------|
| Merge queue watcher state machine | `test_merge_queue.py` (226 lines) |
| Clone isolation × CI resolution | `test_clone_ci_contract.py`, `test_remote_resolver.py` |
| PostToolUse hook | `test_pretty_output.py` (1,516 lines, 40+ cases) |
| Branch protection + headless guards | `test_branch_protection_guard.py`, `test_headless_orchestration_guard.py` |
| Sub-recipe composition | 5 test files (schema, loading, validation, sprint mode × 2) |
| Telemetry formatter | `test_telemetry_formatter.py` (281 lines) |
| PR pipeline gates | `test_analyze_prs_gates.py`, `test_review_pr_fidelity.py` |
| Diff annotator | `test_diff_annotator.py` (242 lines) |
| Skill compliance | Output token format, genericization, loop-boundary guards |
| Release workflows | Structural contracts for `version-bump.yml`, `release.yml` |
| Issue content fidelity | Body-assembling skills must call `fetch_github_issue` per-issue |
| CI watcher scope | `test_ci_params.py`: `workflow_id` query param composition |

---

## Consolidated PRs

#293, #295, #314, #315, #316, #317, #318, #319, #323, #332, #336, #337,
#338, #339, #341, #343, #351, #358, #359, #360, #361, #362, #363, #366,
#368, #370, #375, #377, #378, #379, #380, #388, #389, #390, #391, #392,
#393, #395, #396, #397, #399, #405, #406

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>