Pipeline observability: quota events, wall-clock timing, drift fix#341
…onds, and telemetry drift fix
- Add `get_quota_events` ungated MCP tool to surface quota_check.py hook decisions (approved/blocked/cache_miss) from quota_events.jsonl
- Merge `timing_log.total_seconds` into `get_token_summary` response as `wall_clock_seconds` per step; falls back to `elapsed_seconds` when no timing entry exists; also added to `_format_token_summary` markdown output
- Write `.telemetry_cleared_at` marker on `clear=True` in all three status tools (get_token_summary, get_timing_summary, get_pipeline_report)
- `_state._initialize` reads the marker on startup and uses `max(now-24h, marker)` as the effective `since` lower bound, preventing double-counting of cleared sessions on server restart
- Add `write_telemetry_clear_marker` / `read_telemetry_clear_marker` to `execution/session_log.py` and re-export from `execution/__init__.py`
- Update CLAUDE.md tool count 38 → 39, add get_quota_events to tool list
- Add `get_quota_events` to UNGATED_TOOLS frozenset in core/types.py
- Tests: clear marker roundtrip, _initialize drift prevention, get_quota_events, wall_clock_seconds fallback, clear=True marker writes for all three tools
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
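For readers following the drift fix, the `max(now-24h, marker)` rule can be sketched roughly as below. This is a minimal illustration, not the code in `execution/session_log.py`: the helper names `write_clear_marker`, `read_clear_marker`, and `effective_since` are hypothetical, and the sketch assumes the marker file holds a single timezone-aware ISO-8601 timestamp.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional

MARKER_NAME = ".telemetry_cleared_at"  # file name taken from the commit message


def write_clear_marker(log_dir: Path, now: datetime) -> None:
    """Record when telemetry was last cleared (hypothetical stand-in)."""
    log_dir.mkdir(parents=True, exist_ok=True)
    (log_dir / MARKER_NAME).write_text(now.isoformat())


def read_clear_marker(log_dir: Path) -> Optional[datetime]:
    """Return the stored timestamp, or None if the file is missing or unparseable."""
    try:
        return datetime.fromisoformat((log_dir / MARKER_NAME).read_text().strip())
    except (OSError, ValueError):
        return None


def effective_since(log_dir: Path, now: datetime) -> datetime:
    """Lower bound for replaying sessions on startup: max(now - 24h, marker)."""
    window_start = now - timedelta(hours=24)
    marker = read_clear_marker(log_dir)
    return window_start if marker is None else max(window_start, marker)
```

A marker older than 24 hours has no effect, because the 24-hour window is already the tighter bound; a marker inside the window moves the bound forward so cleared sessions are not replayed.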
…lementation
- Route write_telemetry_clear_marker/resolve_log_dir through server/helpers.py re-exports so tools_status.py does not import from autoskillit.execution (REQ-IMP-003, test_server_tools_import_only_allowed_packages, test_no_cross_package_submodule_imports)
- Extract for-loop/dict-comprehension from get_token_summary into _merge_wall_clock_seconds() module-level helper (REQ-CNST-008)
- Replace except Exception: pass with logger.debug(..., exc_info=True) in tools_status.py and _state.py (ARCH-003)
- Fix except Exception: continue in _read_quota_events to use specific json.JSONDecodeError (ARCH-003)
- Add get_quota_events to expected frozenset in test_ungated_tools_contains_expected_names
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Centralizes the repeated resolve_log_dir(_get_ctx().config.linux_tracing.log_dir) expression into a module-private helper. Replaces the three identical inline call sites in get_pipeline_report, get_token_summary, and get_timing_summary. Adds TestGetLogRoot unit tests for the new helper.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek
left a comment
AutoSkillit PR Review — Verdict: changes_requested
assert result["events"][0]["event"] == "blocked"  # most recent first

@pytest.mark.anyio
async def test_limits_to_n_events(self, tool_ctx, tmp_path, monkeypatch):
[warning] tests: test_limits_to_n_events does not assert total_count equals 10 (the full log size). Only checks len(result["events"]) == 3 but omits verifying total_count reflects the full dataset.
]
(log_dir / "quota_events.jsonl").write_text("\n".join(lines) + "\n")
monkeypatch.setattr(tool_ctx.config.linux_tracing, "log_dir", str(log_dir))
result = json.loads(await get_quota_events(n=3))
[warning] tests: test_limits_to_n_events does not verify ordering of returned events (most-recent-first). The 10-event pagination test never checks which 3 events are returned.
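The two `test_limits_to_n_events` findings above (missing `total_count` check, missing ordering check) could be covered along these lines. The reader below is a hypothetical stand-in for `get_quota_events`, written only so the assertions have something to run against; the `events` and `total_count` keys come from the tests quoted in this review, while the `seq` field and the function name are assumptions for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def read_quota_events(log_path: Path, n: int) -> Dict[str, Any]:
    """Return the n most recent events (newest first) plus the full log size."""
    events: List[Dict[str, Any]] = []
    if log_path.exists():
        for line in log_path.read_text().splitlines():
            if not line.strip():
                continue
            try:
                events.append(json.loads(line))
            except json.JSONDecodeError:
                # Skip only malformed JSON lines; other errors should surface.
                continue
    newest_first = events[::-1]  # the log is append-only, so the last line is newest
    return {"events": newest_first[:n], "total_count": len(events)}


def check_limit_and_order(log_path: Path) -> None:
    """Assertions a stricter test_limits_to_n_events might make."""
    lines = [json.dumps({"event": "approved", "seq": i}) for i in range(10)]
    log_path.write_text("\n".join(lines) + "\n")
    result = read_quota_events(log_path, n=3)
    assert len(result["events"]) == 3
    assert result["total_count"] == 10  # full log size, not the page size
    assert [e["seq"] for e in result["events"]] == [9, 8, 7]  # newest first
```

Asserting on a per-event sequence field pins down which three events come back, which a bare length check cannot do.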
result = json.loads(await get_token_summary())
step = next(s for s in result["steps"] if s["step_name"] == "step-b")
# No timing_log entry → falls back to elapsed_seconds
assert step["wall_clock_seconds"] == pytest.approx(5.0)
[warning] tests: test_wall_clock_falls_back_to_elapsed_when_no_timing never verifies timing_log has no step-b entry. Parallel test pollution could silently bypass the fallback path.
tests/server/test_server_init.py
Outdated
(log_dir / ".telemetry_cleared_at").write_text(three_hours_ago.isoformat())

monkeypatch.setattr(tool_ctx.config.linux_tracing, "log_dir", str(log_dir))
monkeypatch.setattr(_state, "_ctx", None)
[warning] tests: test_initialize_uses_clear_marker_as_since_bound: monkeypatching _state._ctx=None conflicts with tool_ctx fixture which already patches it. Interleaved None-then-reinit may leave _ctx in unexpected state under xdist teardown.
tests/server/test_server_init.py
Outdated
)

monkeypatch.setattr(tool_ctx.config.linux_tracing, "log_dir", str(log_dir))
monkeypatch.setattr(_state, "_ctx", None)
[warning] tests: Same xdist/fixture-teardown concern as L661. Also missing boundary condition: marker timestamp == session timestamp (boundary for <= vs < in since_dt logic).
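To make the missing boundary case concrete: when the marker timestamp equals a session timestamp, the outcome depends entirely on whether the `since` comparison is inclusive. The review does not say which behavior is intended, so the sketch below only shows how a test could pin the choice down; `is_replayed` and its `inclusive` flag are hypothetical names, not part of `_state._initialize`.

```python
from datetime import datetime, timedelta, timezone


def is_replayed(session_ts: datetime, since: datetime, *, inclusive: bool) -> bool:
    """Decide whether a session at session_ts survives the `since` lower bound."""
    return session_ts >= since if inclusive else session_ts > since


marker = datetime(2026, 3, 11, 9, 0, tzinfo=timezone.utc)
same_instant = marker                         # session written at the marker time
just_after = marker + timedelta(microseconds=1)

# With an exclusive bound the session sharing the marker timestamp is skipped;
# with an inclusive bound it is replayed (and may be double-counted).
boundary_exclusive = is_replayed(same_instant, marker, inclusive=False)
boundary_inclusive = is_replayed(same_instant, marker, inclusive=True)
```

Whichever comparison the implementation uses, a test with `session_ts == marker` is the only one that distinguishes the two.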
src/autoskillit/server/helpers.py
Outdated
from autoskillit.core import RESERVED_LOG_RECORD_KEYS, TerminationReason, get_logger
from autoskillit.execution import (
resolve_log_dir, # noqa: F401 - used by tools_integrations.py
read_telemetry_clear_marker, # noqa: F401 - used by tools_status.py
[warning] slop: noqa comment claims read_telemetry_clear_marker is used by tools_status.py but it is not called anywhere in tools_status.py. The re-export and comment are misleading dead weight.
@mcp.tool(tags={"automation"})
@track_response_size("kitchen_status")
async def kitchen_status() -> str:
[warning] defense: _merge_wall_clock_seconds parameter timing_log is typed as Any, bypassing static type checking. Should be typed as DefaultTimingLog or its protocol to enable mypy to catch misuse at call sites.
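One way to address the `Any` finding is a structural protocol, so mypy can check call sites without the helper importing a concrete class. The sketch below is an assumption-heavy illustration: `TimingSource`, `total_seconds_by_step`, and `InMemoryTiming` are invented names, and the real timing log API in this PR may look different. The fallback to `elapsed_seconds` follows the behavior described in the PR.

```python
from typing import Any, Dict, List, Protocol


class TimingSource(Protocol):
    """Hypothetical protocol: anything that can report wall-clock seconds per step."""

    def total_seconds_by_step(self) -> Dict[str, float]: ...


def merge_wall_clock_seconds(
    steps: List[Dict[str, Any]], timing_log: TimingSource
) -> List[Dict[str, Any]]:
    """Attach wall_clock_seconds to each step, falling back to elapsed_seconds."""
    totals = timing_log.total_seconds_by_step()
    return [
        {**step, "wall_clock_seconds": totals.get(step["step_name"], step["elapsed_seconds"])}
        for step in steps
    ]


class InMemoryTiming:
    """Test double; satisfies TimingSource structurally, with no inheritance."""

    def __init__(self, totals: Dict[str, float]) -> None:
        self._totals = dict(totals)

    def total_seconds_by_step(self) -> Dict[str, float]:
        return dict(self._totals)
```

With this shape, a fallback test can first assert that the timing source has no entry for the step under test, which also speaks to the `test_wall_clock_falls_back_to_elapsed_when_no_timing` finding.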
@@ -151,9 +190,66 @@ async def get_timing_summary(clear: bool = False) -> str:
total = _get_ctx().timing_log.compute_total()
if clear:
[warning] fidelity: write_telemetry_clear_marker is called when get_timing_summary(clear=True) fires, advancing the shared fence even when token_log and audit are NOT cleared. On next restart, _state._initialize skips sessions for all three log types — may under-count token and audit data that was never cleared. Issue #302 does not describe this shared-fence side effect.
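The shared-fence concern can be illustrated with a toy replay model. Nothing here mirrors the actual `_state._initialize` code: sessions are reduced to `(timestamp, tokens)` pairs with integer timestamps, and `replay_tokens` is a hypothetical stand-in for the restart-time replay.

```python
from typing import List, Optional, Tuple

Session = Tuple[int, int]  # (timestamp, tokens); integers keep the example small


def replay_tokens(sessions: List[Session], since: Optional[int]) -> int:
    """Sum tokens for sessions newer than the fence (None means no fence)."""
    return sum(tokens for ts, tokens in sessions if since is None or ts > since)


# Two token-bearing sessions recorded at t=1 and t=2; the token log is never cleared.
sessions: List[Session] = [(1, 100), (2, 250)]

# Without a fence, a restart replays everything.
tokens_without_fence = replay_tokens(sessions, since=None)

# A timing-only clear at t=3 writes the shared marker. After a restart the same
# fence is applied to token replay, so both sessions are skipped.
tokens_after_timing_clear = replay_tokens(sessions, since=3)
```

In this model the token total drops from 350 to 0 across a restart even though only the timing log was cleared, which is the under-count the finding describes.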
Trecek
left a comment
AutoSkillit Review Findings
Verdict: changes_requested
8 actionable findings (all warning severity). Implementation is correct and addresses all three requirements from issue #302 (quota events tool, wall-clock seconds in token summary, drift prevention fence). Inline comments posted above.
src/autoskillit/server/helpers.py
- L14 [warning/slop]: noqa comment claims `read_telemetry_clear_marker` is 'used by tools_status.py' but it is not called anywhere in tools_status.py; unused re-export with misleading comment.
src/autoskillit/server/tools_status.py
- L37 [warning/defense]: `_merge_wall_clock_seconds` parameter `timing_log` typed as `Any`, which bypasses static type checking; use `DefaultTimingLog` or its protocol.
- L191 [warning/fidelity]: Shared-fence side effect: `get_timing_summary(clear=True)` advances the fence for all three log types (token_log, timing_log, audit), not just timing_log. On next restart, `_state._initialize` may skip token/audit sessions that were never explicitly cleared. Not described in issue #302.
tests/server/test_tools_status.py
- L627 [warning/tests]: `test_limits_to_n_events` omits `assert result["total_count"] == 10`; total_count goes untested for the n-limiting case.
- L636 [warning/tests]: `test_limits_to_n_events` never asserts which 3 events are returned (oldest or newest); ordering unverified in the paginated case.
- L695 [warning/tests]: `test_wall_clock_falls_back_to_elapsed_when_no_timing` never asserts timing_log has no 'step-b' entry; parallel pollution could silently bypass the fallback.
tests/server/test_server_init.py
- L661 [warning/tests]: `test_initialize_uses_clear_marker_as_since_bound` uses bare `_state._ctx = None` assignment conflicting with fixture monkeypatch; xdist teardown ordering may leave _ctx in unexpected state.
- L705 [warning/tests]: Same xdist concern as L661; missing boundary condition test (marker ts == session ts).
…misleading comment from helpers.py
…imingStore protocol instead of Any
…clear=True in docstring
…allback assertions
…annotations
Both sides of the conflict were complementary: the PR added _get_log_root() and _merge_wall_clock_seconds() helpers plus get_quota_events tool, while integration added readOnlyHint=True annotations to all @mcp.tool decorators. Resolution keeps all new code and applies readOnlyHint to every decorator including the new get_quota_events tool.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…333, #342 into integration (#351)

## Integration Summary

Collapsed 9 PRs into `pr-batch/pr-merge-20260311-133920` targeting `integration`.

## Merged PRs

| # | Title | Complexity | Additions | Deletions | Overlaps |
|---|-------|-----------|-----------|-----------|---------|
| #337 | Implementation Plan: Dry Walkthrough - Test Command Genericization (Issue #307) | simple | +29 | -2 | none |
| #339 | Implementation Plan: Release CI - Force-Push Integration Back-Sync | simple | +88 | -45 | none |
| #336 | Enhance prepare-issue with Duplicate Detection and Broader Triggers | needs_check | +161 | -8 | none |
| #332 | Rectify: Display Output Bugs #329 - Terminal Targets Consolidation - PART A ONLY | needs_check | +783 | -13 | none |
| #338 | Implementation Plan: Pre-release Readiness - Stability Fixes | needs_check | +238 | -36 | none |
| #343 | Implementation Plan: PR Pipeline Gates - Mergeability Gate and Review Cycle | needs_check | +384 | -5 | #338 |
| #341 | Pipeline observability: quota events, wall-clock timing, drift fix | needs_check | +480 | -5 | #332, #338 |
| #333 | Remove run_recipe - Eliminate Sub-Orchestrator Pattern | needs_check | +538 | -655 | #332, #338, #341 |
| #342 | feat: genericize codebase and bundle external dependencies for public release | needs_check | +5286 | -1062 | #332, #333, #338, #341, #343 |

## Audit

**Verdict:** GO

## Architecture Impact

### Development Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    subgraph SourceTree ["PROJECT STRUCTURE (● = modified)"]
        direction TB
        SRC["● src/autoskillit/<br/>━━━━━━━━━━<br/>105 .py source files<br/>cli · config · core<br/>execution · hooks · pipeline<br/>recipe · server · workspace"]
        SKILLS["● + ★ src/autoskillit/skills/<br/>━━━━━━━━━━<br/>52 bundled skills<br/>★ 13 arch-lens-* SKILL.md added<br/>★ 3 audit-* SKILL.md added<br/>● 14 existing skills updated"]
        RECIPES["● src/autoskillit/recipes/<br/>━━━━━━━━━━<br/>8 bundled YAML recipes<br/>All recipes updated"]
        TESTS["● + ★ tests/<br/>━━━━━━━━━━<br/>173 .py test files<br/>★ 6 new test files added"]
    end

    subgraph Build ["BUILD TOOLING"]
        direction TB
        PYPROJECT["● pyproject.toml<br/>━━━━━━━━━━<br/>hatchling build backend<br/>uv package manager<br/>10 runtime deps"]
        TASKFILE["Taskfile.yml<br/>━━━━━━━━━━<br/>test-all · test-check<br/>test-smoke · install-worktree"]
    end

    subgraph Quality ["CODE QUALITY GATES"]
        direction TB
        RFMT["ruff-format<br/>━━━━━━━━━━<br/>Auto-fix formatting"]
        RLINT["ruff<br/>━━━━━━━━━━<br/>Lint + auto-fix"]
        MYPY["mypy src/<br/>━━━━━━━━━━<br/>--ignore-missing-imports"]
        UVLOCK["uv lock --check<br/>━━━━━━━━━━<br/>Lock file integrity"]
        SECRETS["gitleaks<br/>━━━━━━━━━━<br/>Secret scanning"]
        GUARD["★ headless_orchestration_guard.py<br/>━━━━━━━━━━<br/>★ PreToolUse hook<br/>Blocks run_skill/run_cmd/run_python<br/>from headless sessions"]
    end

    subgraph Testing ["TEST FRAMEWORK"]
        direction TB
        PYTEST["pytest + asyncio_mode=auto<br/>━━━━━━━━━━<br/>xdist -n 4 parallel<br/>timeout=60s signal method"]
        NEWTEST["★ New Test Files<br/>━━━━━━━━━━<br/>★ test_headless_orchestration_guard<br/>★ test_audit_and_fix_degradation<br/>★ test_rules_inputs<br/>★ test_skill_genericization<br/>★ test_pyproject_metadata<br/>★ test_release_sanity"]
    end

    subgraph CI ["CI/CD WORKFLOWS"]
        direction LR
        TESTS_WF["tests.yml<br/>━━━━━━━━━━<br/>PR test gate"]
        RELEASE_WF["release.yml<br/>━━━━━━━━━━<br/>Release automation"]
        BUMP_WF["● version-bump.yml<br/>━━━━━━━━━━<br/>● Force-push back-sync<br/>integration → main"]
    end

    subgraph EntryPoints ["ENTRY POINTS"]
        EP["autoskillit CLI<br/>━━━━━━━━━━<br/>serve · init · skills<br/>recipes · doctor · workspace"]
    end

    SRC --> PYPROJECT
    SKILLS --> PYPROJECT
    TESTS --> PYTEST
    PYPROJECT --> TASKFILE
    PYPROJECT --> RFMT
    RFMT --> RLINT
    RLINT --> MYPY
    MYPY --> UVLOCK
    UVLOCK --> SECRETS
    SECRETS --> GUARD
    GUARD --> PYTEST
    PYTEST --> NEWTEST
    NEWTEST --> BUMP_WF
    TESTS_WF --> PYTEST
    PYPROJECT --> EP

    class SRC,TESTS stateNode;
    class SKILLS,RECIPES newComponent;
    class PYPROJECT,TASKFILE phase;
    class RFMT,RLINT,MYPY,UVLOCK,SECRETS detector;
    class GUARD newComponent;
    class PYTEST handler;
    class NEWTEST newComponent;
    class TESTS_WF,RELEASE_WF phase;
    class BUMP_WF newComponent;
    class EP output;
```

**Color Legend:**

| Color | Category | Description |
|-------|----------|-------------|
| Dark Teal | Structure | Source directories and test suite |
| Green (★) | New/Modified | New files and components added in this PR |
| Purple | Build | Build configuration and task automation |
| Red | Quality Gates | Pre-commit hooks, linters, type checker |
| Orange | Test Runner | pytest execution engine |
| Dark Teal | Entry Points | CLI commands |

### Module Dependency Diagram

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
graph TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;

    subgraph L0 ["L0 - CORE (zero autoskillit imports)"]
        direction LR
        TYPES["● core/types.py<br/>━━━━━━━━━━<br/>GATED_TOOLS · UNGATED_TOOLS<br/>RecipeSource (★ promoted here)<br/>ClaudeFlags · StrEnums<br/>fan-in: ~75 files"]
        COREIO["core/io.py · logging.py · paths.py<br/>━━━━━━━━━━<br/>Atomic write · Logger · pkg_root()"]
    end

    subgraph L1P ["L1 - PIPELINE (imports L0 only)"]
        direction TB
        GATE["● pipeline/gate.py<br/>━━━━━━━━━━<br/>DefaultGateState<br/>gate_error_result()<br/>★ headless_error_result()<br/>re-exports GATED/UNGATED_TOOLS"]
        PIPEINIT["● pipeline/__init__.py<br/>━━━━━━━━━━<br/>Re-exports public surface<br/>ToolContext · AuditLog<br/>TokenLog · DefaultGateState"]
    end

    subgraph L1E ["L1 - EXECUTION (imports L0 only)"]
        direction TB
        HEADLESS["● execution/headless.py<br/>━━━━━━━━━━<br/>Headless Claude sessions<br/>Imports core types via TYPE_CHECKING<br/>for ToolContext (no runtime cycle)"]
        COMMANDS["● execution/commands.py<br/>━━━━━━━━━━<br/>ClaudeHeadlessCmd builder"]
        SESSION_LOG["● execution/session_log.py<br/>━━━━━━━━━━<br/>Session diagnostics writer"]
    end

    subgraph L2 ["L2 - RECIPE (imports L0+L1)"]
        direction TB
        SCHEMA["● recipe/schema.py<br/>━━━━━━━━━━<br/>Recipe · RecipeStep · DataFlowWarning<br/>RecipeSource (now from L0)"]
        RULES["● recipe/rules_inputs.py<br/>━━━━━━━━━━<br/>★ Ingredient validation rules<br/>reads GATED_TOOLS from L0 via<br/>pipeline re-export"]
        ANALYSIS["● recipe/_analysis.py<br/>━━━━━━━━━━<br/>Step graph builder"]
        VALIDATOR["● recipe/validator.py<br/>━━━━━━━━━━<br/>validate_recipe()"]
    end

    subgraph L3S ["L3 - SERVER (imports all layers)"]
        direction TB
        HELPERS["● server/helpers.py<br/>━━━━━━━━━━<br/>_require_enabled() - reads gate<br/>★ _require_not_headless()<br/>Shared by all tool handlers"]
        TOOLS_EX["● server/tools_execution.py<br/>━━━━━━━━━━<br/>run_cmd · run_python · run_skill<br/>✗ run_recipe REMOVED<br/>Uses _require_not_headless()"]
        TOOLS_GIT["● server/tools_git.py<br/>━━━━━━━━━━<br/>merge_worktree · classify_fix<br/>● check_pr_mergeable (new gate)"]
        TOOLS_K["● server/tools_kitchen.py<br/>━━━━━━━━━━<br/>open_kitchen · close_kitchen"]
        FACTORY["● server/_factory.py<br/>━━━━━━━━━━<br/>Composition root<br/>Wires ToolContext"]
    end

    subgraph L3H ["L3 - HOOKS (stdlib only for guard)"]
        direction LR
        HOOK_GUARD["★ hooks/headless_orchestration_guard.py<br/>━━━━━━━━━━<br/>★ PreToolUse hook (stdlib only)<br/>Blocks run_skill/run_cmd/run_python<br/>from AUTOSKILLIT_HEADLESS=1 sessions<br/>NO autoskillit imports"]
        PRETTY["● hooks/pretty_output.py<br/>━━━━━━━━━━<br/>PostToolUse response formatter"]
    end

    subgraph L3C ["L3 - CLI (imports all layers)"]
        direction LR
        CLI_APP["● cli/app.py<br/>━━━━━━━━━━<br/>serve · init · skills · recipes<br/>doctor · workspace"]
        CLI_PROMPTS["● cli/_prompts.py<br/>━━━━━━━━━━<br/>Orchestrator prompt builder"]
    end

    TYPES -->|"fan-in ~75"| GATE
    TYPES -->|"fan-in ~75"| HEADLESS
    TYPES -->|"fan-in ~75"| SCHEMA
    COREIO --> PIPEINIT
    GATE --> PIPEINIT
    PIPEINIT -->|"gate_error_result<br/>headless_error_result"| HELPERS
    HEADLESS --> HELPERS
    COMMANDS --> HEADLESS
    SESSION_LOG --> HELPERS
    SCHEMA -->|"RecipeSource from L0"| RULES
    RULES --> VALIDATOR
    ANALYSIS --> VALIDATOR
    HELPERS -->|"_require_not_headless"| TOOLS_EX
    HELPERS --> TOOLS_GIT
    HELPERS --> TOOLS_K
    VALIDATOR --> FACTORY
    PIPEINIT --> FACTORY
    FACTORY --> CLI_APP
    FACTORY --> CLI_PROMPTS
    HOOK_GUARD -.->|"ENV: AUTOSKILLIT_HEADLESS<br/>zero autoskillit imports"| TOOLS_EX

    class TYPES,COREIO stateNode;
    class GATE,PIPEINIT phase;
    class HEADLESS,COMMANDS,SESSION_LOG handler;
    class SCHEMA,RULES,ANALYSIS,VALIDATOR phase;
    class HELPERS,TOOLS_EX,TOOLS_GIT,TOOLS_K handler;
    class FACTORY cli;
    class CLI_APP,CLI_PROMPTS cli;
    class HOOK_GUARD newComponent;
    class PRETTY handler;
```

**Color Legend:**

| Color | Category | Description |
|-------|----------|-------------|
| Teal | L0 Core | High fan-in foundation types (zero reverse deps) |
| Purple | L1/L2 Control | Pipeline gate, recipe schema and rules |
| Orange | L1/L3 Processors | Execution handlers, server tool handlers |
| Dark Blue | L3 CLI | Composition root and CLI entry points |
| Green (★) | New Components | headless_orchestration_guard: standalone hook |
| Dashed | ENV Signal | OS-level check; no code import relationship |

Closes #307
Closes #327
Closes #308
Closes #329
Closes #304
Closes #328
Closes #302
Closes #330
Closes #311

🤖 Generated with [Claude Code](https://claude.com/claude-code) via AutoSkillit

---

## Merge Conflict Resolution

The batch branch was rebased onto `integration` to resolve 17 file conflicts. All conflicts arose because PRs #337–#341 were squash-merged into both `integration` (directly) and the batch branch (via the pipeline), while PRs #333 and #342 required conflict resolution work that only exists on the batch branch.

**Resolution principle:** Batch branch version wins for all files touched by #333/#342 conflict resolution and remediation, since that state was fully tested (3752 passed). Integration-only additions (e.g. `TestGetQuotaEvents`) were preserved where they don't overlap.
### Per-file decisions

| File | Decision | Rationale |
|------|----------|-----------|
| `CLAUDE.md` | **Batch wins** | Batch has corrected tool inventory (run_recipe removed, get_quota_events added, 25 kitchen tools) |
| `core/types.py` | **Batch wins** | Batch splits monolithic UNGATED_TOOLS into WORKER_TOOLS + HEADLESS_BLOCKED_UNGATED_TOOLS; removes run_recipe from GATED_TOOLS |
| `execution/__init__.py` | **Batch wins** | Batch removes dead exports (build_subrecipe_cmd, run_subrecipe_session) |
| `execution/headless.py` | **Batch wins** | Batch deletes run_subrecipe_session function (530+ lines); keeps run_headless_core with token_log error handling |
| `hooks/pretty_output.py` | **Batch wins** | Batch removes run_recipe from _UNFORMATTED_TOOLS, adds get_quota_events |
| `recipes/pr-merge-pipeline.yaml` | **Batch wins** | Batch has base_branch required:true, updated kitchen rules (main instead of integration) |
| `server/_state.py` | **Batch wins** | Batch adds .telemetry_cleared_at marker reading in _initialize |
| `server/helpers.py` | **Batch wins** | Batch removes _run_subrecipe and run_subrecipe_session import; adds _require_not_headless |
| `server/tools_git.py` | **Batch wins** | Batch has updated classify_fix with git fetch and check_pr_mergeable gate |
| `server/tools_kitchen.py` | **Batch wins** | Batch adds headless gates to open_kitchen/close_kitchen; adds TOOL_CATEGORIES listing |
| `server/tools_status.py` | **Merge both** | Batch headless gates + wall_clock_seconds merged with integration's TestGetQuotaEvents (deduplicated) |
| `tests/conftest.py` | **Batch wins** | Batch replaces AUTOSKILLIT_KITCHEN_OPEN with AUTOSKILLIT_HEADLESS in fixture |
| `tests/execution/test_headless.py` | **Batch wins** | Batch removes run_subrecipe_session tests (deleted code); updates docstring |
| `tests/recipe/test_bundled_recipes.py` | **Merge both** | Batch base_branch=main assertions + integration WF7 graph test both kept |
| `tests/server/test_tools_kitchen.py` | **Batch wins** | Batch adds headless gate denial tests for open/close kitchen |
| `tests/server/test_tools_status.py` | **Merge both** | Batch headless gate tests merged with integration quota events tests |

### Post-rebase fixes

- Removed duplicate `TestGetQuotaEvents` class (existed in both batch commit and auto-merged integration code)
- Fixed stale `_build_tool_listing` → `_build_tool_category_listing` attribute reference
- Added `if diagram: print(diagram)` to `cli/app.py` cook function (test expected terminal output)

### Verification

- **3752 passed**, 23 skipped, 0 failures
- 7 architecture contracts kept, 0 broken
- Pre-commit hooks all pass

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
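The headless orchestration guard mentioned in the diagrams and per-file decisions above (a stdlib-only PreToolUse hook that blocks `run_skill`/`run_cmd`/`run_python` when `AUTOSKILLIT_HEADLESS=1`) reduces to a small decision function. The sketch below is an illustration only, not the contents of `headless_orchestration_guard.py`: `should_block` is a hypothetical name, and the handling of `mcp__<server>__<tool>` prefixed names is an assumption about how tool names reach the hook.

```python
from typing import Mapping

# Tool names taken from the PR description; the set in the real hook may differ.
BLOCKED_TOOLS = frozenset({"run_skill", "run_cmd", "run_python"})


def should_block(tool_name: str, env: Mapping[str, str]) -> bool:
    """Return True when an orchestration tool is called from a headless session."""
    if env.get("AUTOSKILLIT_HEADLESS") != "1":
        return False  # interactive session: nothing to enforce
    # Assumption: names may arrive as "mcp__<server>__<tool>"; keep the last part.
    short_name = tool_name.rsplit("__", 1)[-1]
    return short_name in BLOCKED_TOOLS
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the decision testable without mutating process state; a real hook would call it with `os.environ`.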
…, Headless Isolation (#404)

## Summary

Integration rollup of **43 PRs** (#293–#406) consolidating **62 commits** across **291 files** (+27,909 / −6,040 lines). This release advances AutoSkillit from v0.2.0 to v0.3.1 with GitHub merge queue integration, sub-recipe composition, a PostToolUse output reformatter, headless session isolation guards, and comprehensive pipeline observability, plus 24 new bundled skills, 3 new MCP tools, and 47 new test files.

---

## Major Features

### GitHub Merge Queue Integration (#370, #362, #390)

- New `wait_for_merge_queue` MCP tool: polls a PR through GitHub's merge queue until merged, ejected, or timed out (default 600s). Uses REST + GraphQL APIs with stuck-queue detection and auto-merge re-enrollment
- New `DefaultMergeQueueWatcher` L1 service (`execution/merge_queue.py`): never raises; all outcomes are structured results
- `parse_merge_queue_response()` pure function for GraphQL queue entry parsing
- New `auto_merge` ingredient in `implementation.yaml` and `remediation.yaml`: enrolls PRs in the merge queue after CI passes
- Full queue-mode path added to `merge-prs.yaml`: detect queue → enqueue → wait → handle ejections → re-enter
- `analyze-prs` skill gains Step 0.5 (merge queue detection) and Step 1.5 (CI/review eligibility filtering)

### Sub-Recipe Composition (#380)

- Recipe steps can now reference sub-recipes via `sub_recipe` + `gate` fields, lazy-loaded and merged at validation time
- Composition engine in `recipe/_api.py`: `_merge_sub_recipe()` inlines sub-recipe steps with safe name-prefixing and route remapping (`done` → parent's `on_success`, `escalate` → parent's `on_failure`)
- `_build_active_recipe()` evaluates gate ingredients against overrides/defaults; dual validation runs on both active and combined recipes
- First sub-recipe: `sprint-prefix.yaml`, a triage → plan → confirm → dispatch workflow, gated by `sprint_mode` ingredient (hidden, default false)
- Both `implementation.yaml` and `remediation.yaml` gain `sprint_entry` placeholder step
- New semantic rules: `unknown-sub-recipe` (ERROR), `circular-sub-recipe` (ERROR) with DFS cycle detection

### PostToolUse Output Reformatter (#293, #405)

- `pretty_output.py`: new 671-line PostToolUse hook that rewrites raw MCP JSON responses to Markdown-KV before Claude consumes them (30–77% token overhead reduction)
- Dedicated formatters for 11 high-traffic tools (`run_skill`, `run_cmd`, `test_check`, `merge_worktree`, `get_token_summary`, etc.) plus a generic KV formatter for remaining tools
- Pipeline vs. interactive mode detection via hook config file
- Unwraps Claude Code's `{"result": "<json-string>"}` envelope before dispatching
- 1,516-line test file with 40+ behavioral tests

### Headless Session Isolation (#359, #393, #397, #405, #406)

- **Env isolation**: `build_sanitized_env()` strips `AUTOSKILLIT_PRIVATE_ENV_VARS` from subprocess environments, preventing `AUTOSKILLIT_HEADLESS=1` from leaking into test runners
- **CWD path contamination defense**: `_inject_cwd_anchor()` anchors all relative paths to session CWD; `_validate_output_paths()` checks structured output tokens against CWD prefix; `_scan_jsonl_write_paths()` post-session scanner catches actual Write/Edit/Bash tool calls outside CWD
- **Headless orchestration guard**: new PreToolUse hook blocks `run_skill`/`run_cmd`/`run_python` when `AUTOSKILLIT_HEADLESS=1`, enforcing Tier 1/Tier 2 nesting invariant
- **`_require_not_headless()` server-side guard**: blocks 10 orchestration-only tools from headless sessions at the handler layer
- **Unified error response contract**: `headless_error_result()` produces consistent 9-field responses; `_build_headless_error_response()` canonical builder for all failure paths in `tools_integrations.py`

### Cook UX Overhaul (#375, #363)

- `open_kitchen` now accepts optional `name` + `overrides`: opens kitchen AND loads recipe in a single call
- Pre-launch terminal preview with ANSI-colored flow diagram and ingredients table via new `cli/_ansi.py` module
- `--dangerously-skip-permissions` warning banner with interactive confirmation prompt
- Randomized session greetings from themed pools
- Orchestrator prompt rewritten: recipe YAML no longer injected via `--append-system-prompt`; session calls `open_kitchen('{recipe_name}')` as first action
- Conversational ingredient collection replaces mechanical per-field prompting

---

## New MCP Tools

| Tool | Gate | Description |
|------|------|-------------|
| `wait_for_merge_queue` | Kitchen | Polls PR through GitHub merge queue (REST + GraphQL) |
| `set_commit_status` | Kitchen | Posts GitHub Commit Status to a SHA for review-first gating |
| `get_quota_events` | Ungated | Surfaces quota guard decisions from `quota_events.jsonl` |

---

## Pipeline Observability (#318, #341)

- **`TelemetryFormatter`** (`pipeline/telemetry_fmt.py`): single source of truth for all telemetry rendering; replaces dual-formatter anti-pattern. Four rendering modes: Markdown table, terminal table, compact KV (for PostToolUse hook)
- `get_token_summary` and `get_timing_summary` gain `format` parameter (`"json"` | `"table"`)
- `wall_clock_seconds` merged into token summary output: see duration alongside token counts in one call
- **Telemetry clear marker**: `write_telemetry_clear_marker()` / `read_telemetry_clear_marker()` prevent token accounting drift on MCP server restart after `clear=True`
- **Quota event logging**: `quota_check.py` hook now writes structured JSONL events (`cache_miss`, `parse_error`, `blocked`, `approved`) to `quota_events.jsonl`

---

## CI Watcher & Remote Resolution Fixes (#395, #406)

- **`CIRunScope` value object**: carries `workflow` + `head_sha` scope; replaces bare `head_sha` parameter across all CI watcher signatures
- **Workflow filter**: `wait_for_ci` and `get_ci_status` accept `workflow` parameter (falls back to project-level `config.ci.workflow`), preventing unrelated workflows (version bumps, labelers) from satisfying CI checks
- **`FAILED_CONCLUSIONS` expanded**: `failure` → `{failure, timed_out, startup_failure, cancelled}`
- **Canonical remote resolver** (`execution/remote_resolver.py`): `resolve_remote_repo()` with `REMOTE_PRECEDENCE = (upstream, origin)` correctly resolves `owner/repo` after `clone_repo` sets `origin` to `file://` isolation URL
- **Clone isolation fix**: `clone_repo` now always clones from remote URL (never local path); sets `origin=file:///<clone>` for isolation and `upstream=<real_url>` for push/CI operations

---

## PR Pipeline Gates (#317, #343)

- **`pipeline/pr_gates.py`**: `is_ci_passing()`, `is_review_passing()`, `partition_prs()`; partitions PRs into eligible/CI-blocked/review-blocked with human-readable reasons
- **`pipeline/fidelity.py`**: `extract_linked_issues()` (Closes/Fixes/Resolves patterns), `is_valid_fidelity_finding()` schema validation
- **`check_pr_mergeable`** now returns `mergeable_status` field alongside boolean
- **`release_issue`** gains `target_branch` + `staged_label` parameters for staged issue lifecycle on non-default branches (#392)

---

## Recipe System Changes

### Structural

- `RecipeIngredient.hidden` field: excluded from ingredients table (used for internal flags like `sprint_mode`)
- `Recipe.experimental` flag parsed from YAML
- `_TERMINAL_TARGETS` moved to `schema.py` as single source of truth
- `format_ingredients_table()` with sorted display order (required → auto-detect → flags → optional → constants)
- Diagram rendering engine (~670 lines) removed from `diagrams.py`; rendering now handled by `/render-recipe` skill; format version bumped to v7

### Recipe YAML Changes

- **Deleted**: `audit-and-fix.yaml`, `batch-implementation.yaml`, `bugfix-loop.yaml`
- **Renamed**: `pr-merge-pipeline.yaml` → `merge-prs.yaml`
- **`implementation.yaml`**: merge queue steps, `auto_merge`/`sprint_mode` ingredients, `base_branch` default → `""` (auto-detect), CI workflow filter, `extract_pr_number` step
- **`remediation.yaml`**: `topic` → `task` rename, merge queue steps, `dry_walkthrough` retries:3 with forward-only routing, `verify` → `test` rename
- **`merge-prs.yaml`**: full queue-mode path, `open-integration-pr` step (replaces `create-review-pr`), post-PR mergeability polling, review cycle with `resolve-review` retries

### New Semantic Rules

- `missing-output-patterns` (WARNING): flags `run_skill` steps without `expected_output_patterns`
- `unknown-sub-recipe` (ERROR): validates sub-recipe references exist
- `circular-sub-recipe` (ERROR): DFS cycle detection
- `unknown-skill-command` (ERROR): validates skill names against bundled set
- `telemetry-before-open-pr` (WARNING): ensures telemetry step precedes `open-pr`

---

## New Skills (24)

### Architecture Lens Family (13)

`arch-lens-c4-container`, `arch-lens-concurrency`, `arch-lens-data-lineage`, `arch-lens-deployment`, `arch-lens-development`, `arch-lens-error-resilience`, `arch-lens-module-dependency`, `arch-lens-operational`, `arch-lens-process-flow`, `arch-lens-repository-access`, `arch-lens-scenarios`, `arch-lens-security`, `arch-lens-state-lifecycle`

### Audit Family (5)

`audit-arch`, `audit-bugs`, `audit-cohesion`, `audit-defense-standards`, `audit-tests`

### Planning & Diagramming (3)

`elaborate-phase`, `make-arch-diag`, `make-req`

### Bug/Guard Lifecycle (2)

`design-guards`, `verify-diag`

### Pipeline (1)

`open-integration-pr`: creates integration PRs with per-PR details, arch-lens diagrams, carried-forward `Closes #N` references, and auto-closes collapsed PRs

### Sprint Planning (1, gated by sub-recipe)

`sprint-planner`: selects a focused, conflict-free sprint from a triage manifest

---

## Skill Modifications (Highlights)

- **`analyze-prs`**: merge queue detection, CI/review eligibility filtering, queue-mode ordering
- **`dry-walkthrough`**: Step 4.5 Historical Regression Check (git history mining + GitHub issue cross-reference)
- **`review-pr`**: deterministic diff annotation via `diff_annotator.py`, echo-primary-obligation step, post-completion confirmation, degraded-mode narration
- **`collapse-issues`**: content fidelity enforcement with per-issue `fetch_github_issue` calls, copy-mode body assembly (#388)
- **`prepare-issue`**: multi-keyword dedup search, numbered candidate selection, extend-existing-issue flow
- **`resolve-review`**: GraphQL thread auto-resolution after addressing findings (#379)
- **`resolve-merge-conflicts`**: conflict resolution decision report with per-file log (#389)
- **Cross-skill**: output tokens migrated to `key = value` format; code-index paths made generic with fallback notes; arch-lens references fully qualified; anti-prose guards at loop boundaries

---

## CLI & Hooks

### New CLI Commands

- `autoskillit install`: plugin installation + cache refresh
- `autoskillit upgrade`: `.autoskillit/scripts/` → `.autoskillit/recipes/` migration

### CLI Changes

- `doctor`: plugin-aware MCP check, PostToolUse hook scanning, `--fix` flag removed
- `init`: GitHub repo prompt, `.secrets.yaml` template, plugin-aware registration
- `chefs-hat`: pre-launch banner, `--dangerously-skip-permissions` confirmation
- `recipes render`: repurposed from generator to viewer (delegates to `/render-recipe`)
- `serve`: server import deferred to after `configure_logging()` to prevent stdout corruption

### New Hooks

- `branch_protection_guard.py` (PreToolUse): denies `merge_worktree`/`push_to_remote` targeting protected branches
- `headless_orchestration_guard.py` (PreToolUse): blocks orchestration tools in headless sessions
- `pretty_output.py` (PostToolUse): MCP JSON → Markdown-KV reformatter

### Hook Infrastructure

- `HookDef.event_type` field: registry now handles both PreToolUse and PostToolUse
- `generate_hooks_json()` groups entries by event type
- `_evict_stale_autoskillit_hooks` and `sync_hooks_to_settings` made event-type-agnostic

---

## Core & Config

### New Core Modules

- `core/branch_guard.py`: `is_protected_branch()` pure function
- `core/github_url.py`: `parse_github_repo()` + `normalize_owner_repo()` canonical parsers

### Core Type Expansions

- `AUTOSKILLIT_PRIVATE_ENV_VARS` frozenset
- `WORKER_TOOLS` / `HEADLESS_BLOCKED_UNGATED_TOOLS` split from `UNGATED_TOOLS`
- `TOOL_CATEGORIES`: categorized listing for `open_kitchen` response
- `CIRunScope`: immutable scope for CI watcher calls
- `MergeQueueWatcher` protocol
- `SkillResult.cli_subtype` + `write_path_warnings` fields
- `SubprocessRunner.env` parameter

### Config

- `safety.protected_branches`: `[main, integration, stable]`
- `github.staged_label`: `"staged"`
- `ci.workflow`: workflow filename filter (e.g., `"tests.yml"`)
- `branching.default_base_branch`: `"integration"` → `"main"`
- `ModelConfig.default`: `str | None` → `str = "sonnet"`

---

## Infrastructure & Release

### Version

- `0.2.0` → `0.3.1` across `pyproject.toml`, `plugin.json`, `uv.lock`
- FastMCP dependency: `>=3.0.2` → `>=3.1.1,<4.0` (#399)

### CI/CD Workflows

- **`version-bump.yml`** (new): auto patch-bumps `main` on integration PR merge, force-syncs integration branch one patch ahead
- **`release.yml`** (new): minor version bump + GitHub Release on merge to `stable`
- **`codeql.yml`** (new): CodeQL analysis for `stable` PRs (Python + Actions)
- **`tests.yml`**: `merge_group:` trigger added; multi-OS now only for `stable`

### PyPI Readiness

- `pyproject.toml`: `readme`, `license`, `authors`, `keywords`, `classifiers`, `project.urls`, `hatch.build.targets.sdist` inclusion list

### readOnlyHint Parallel Execution Fix

- All MCP tools annotated `readOnlyHint=True`, which enables Claude Code parallel tool execution (~7x speedup).
One deliberate exception: `wait_for_merge_queue` uses `readOnlyHint=False` (actually mutates queue state) ### Tool Response Exception Boundary - `track_response_size` decorator catches unhandled exceptions and serializes them as `{"success": false, "subtype": "tool_exception"}` — prevents FastMCP opaque error wrapping ### SkillResult Subtype Normalization (#358) - `_normalize_subtype()` gate eliminates dual-source contradiction between CLI subtype and session outcome - Class 2 upward: `SUCCEEDED + error_subtype → "success"` (drain-race artifact) - Class 1 downward: `non-SUCCEEDED + "success" → "empty_result"` / `"missing_completion_marker"` / `"adjudicated_failure"` --- ## Test Coverage **47 new test files** (+12,703 lines) covering: | Area | Key Tests | |------|-----------| | Merge queue watcher state machine | `test_merge_queue.py` (226 lines) | | Clone isolation × CI resolution | `test_clone_ci_contract.py`, `test_remote_resolver.py` | | PostToolUse hook | `test_pretty_output.py` (1,516 lines, 40+ cases) | | Branch protection + headless guards | `test_branch_protection_guard.py`, `test_headless_orchestration_guard.py` | | Sub-recipe composition | 5 test files (schema, loading, validation, sprint mode × 2) | | Telemetry formatter | `test_telemetry_formatter.py` (281 lines) | | PR pipeline gates | `test_analyze_prs_gates.py`, `test_review_pr_fidelity.py` | | Diff annotator | `test_diff_annotator.py` (242 lines) | | Skill compliance | Output token format, genericization, loop-boundary guards | | Release workflows | Structural contracts for `version-bump.yml`, `release.yml` | | Issue content fidelity | Body-assembling skills must call `fetch_github_issue` per-issue | | CI watcher scope | `test_ci_params.py` — workflow_id query param composition | --- ## Consolidated PRs #293, #295, #314, #315, #316, #317, #318, #319, #323, #332, #336, #337, #338, #339, #341, #343, #351, #358, #359, #360, #361, #362, #363, #366, #368, #370, #375, #377, #378, #379, #380, #388, #389, 
#390, #391, #392, #393, #395, #396, #397, #399, #405, #406 --- 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Adds three pipeline observability capabilities: a new `get_quota_events` MCP tool surfacing quota guard decisions from `quota_events.jsonl`, `wall_clock_seconds` merged into `get_token_summary` output for per-step wall-clock visibility, and a `.telemetry_cleared_at` replay fence preventing token accounting drift when the MCP server restarts after a `clear=True` call. Includes a follow-up refactor extracting `_get_log_root()` in `tools_status.py` to eliminate three identical inline log-root expressions.

Individual Plan Details
Group 1: Pipeline Observability — Quota Guard Logging and Per-Step Elapsed Time

Three related pipeline observability improvements, tracked as GitHub issue #302 (collapsing #218, #65, and the #304/#148 token accounting item):

- Quota guard MCP tool (feat: Add quota guard observability to diagnostic logging system #218): The `quota_check.py` hook already writes `quota_events.jsonl` with approved/blocked/cache-miss events. Add a new ungated `get_quota_events` tool to `tools_status.py` to surface those decisions through the MCP API.
- Wall-clock time in token summary (feat: report per-step elapsed time in token summary #65): Merge `total_seconds` from the timing log into each step's `get_token_summary` output as `wall_clock_seconds`, so operators see wall-clock duration alongside token counts in one call. Updates `_format_token_summary` and `write_telemetry_files`.
- Token accounting drift fix (Combined: Pre-release readiness — stability fixes #304/Stability and correctness fixes for public release #148): Persist a `.telemetry_cleared_at` timestamp when any log is cleared. `_state._initialize` reads this on startup and uses `max(now - 24h, marker_ts)` as the effective replay lower bound, excluding already-cleared sessions.

Group 2: Remediation — Extract `_get_log_root()` helper in `tools_status.py`

The audit identified that `tools_status.py` had three identical inline expressions — `resolve_log_dir(_get_ctx().config.linux_tracing.log_dir)` — repeated in `get_pipeline_report`, `get_token_summary`, and `get_timing_summary`. This remediation adds `_get_log_root()` to centralize that computation and replaces all three inline call sites. No behavioral change.

Architecture Impact
Operational Diagram
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 65, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    subgraph UngatedTools ["UNGATED MCP TOOLS (tools_status.py)"]
        GTS["● get_token_summary<br/>━━━━━━━━━━<br/>clear=False<br/>+ ● wall_clock_seconds<br/>from timing_log"]
        GTIM["● get_timing_summary<br/>━━━━━━━━━━<br/>clear=False<br/>total_seconds per step"]
        GPR["● get_pipeline_report<br/>━━━━━━━━━━<br/>clear=False<br/>audit failures"]
        GQE["★ get_quota_events<br/>━━━━━━━━━━<br/>n=50<br/>quota guard decisions"]
    end

    subgraph LogRoot ["★ _get_log_root() helper (tools_status.py)"]
        LR["★ _get_log_root()<br/>━━━━━━━━━━<br/>resolve_log_dir(ctx.config<br/>.linux_tracing.log_dir)"]
    end

    subgraph InMemory ["IN-MEMORY PIPELINE LOGS"]
        TK["token_log<br/>━━━━━━━━━━<br/>step_name → tokens<br/>elapsed_seconds"]
        TI["timing_log<br/>━━━━━━━━━━<br/>step_name → total_seconds<br/>(monotonic clock)"]
        AU["audit_log<br/>━━━━━━━━━━<br/>list FailureRecord"]
    end

    subgraph DiskLogs ["PERSISTENT LOG FILES (~/.local/share/autoskillit/logs/)"]
        QE["quota_events.jsonl<br/>━━━━━━━━━━<br/>approved / blocked<br/>cache_miss / parse_error"]
        CM["★ .telemetry_cleared_at<br/>━━━━━━━━━━<br/>UTC ISO timestamp fence<br/>written on clear=True"]
    end

    subgraph Startup ["SERVER STARTUP (_state._initialize)"]
        INIT["● _state._initialize<br/>━━━━━━━━━━<br/>since = max(now−24h, marker)<br/>load_from_log_dir × 3"]
    end

    subgraph Hook ["HOOK (quota_check.py)"]
        QH["quota_check.py<br/>━━━━━━━━━━<br/>PreToolUse: approve/block<br/>_write_quota_log_event"]
    end

    GTS -->|"clear=True"| LR
    GTIM -->|"clear=True"| LR
    GPR -->|"clear=True"| LR
    LR -->|"write_telemetry_clear_marker"| CM
    GQE -->|"_read_quota_events(n)"| QE
    QH -->|"append event"| QE
    GTS -->|"get_report()"| TK
    GTS -->|"● merge wall_clock_seconds"| TI
    GTIM -->|"get_report()"| TI
    GPR -->|"get_report()"| AU
    CM -->|"read marker → since bound"| INIT
    INIT -->|"load_from_log_dir since=effective"| TK
    INIT -->|"load_from_log_dir since=effective"| TI
    INIT -->|"load_from_log_dir since=effective"| AU

    class GTS,GTIM,GPR cli;
    class GQE newComponent;
    class LR newComponent;
    class TK,TI,AU stateNode;
    class QE stateNode;
    class CM newComponent;
    class INIT phase;
    class QH detector;

Color Legend: Dark Blue = MCP query tools | Green = New components (`get_quota_events`, `_get_log_root`, `.telemetry_cleared_at`) | Teal = State/logs | Purple = Server startup | Dark Red = PreToolUse hook

State Lifecycle Diagram
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 65, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;

    subgraph InMem ["IN-MEMORY (clearable — MUTABLE)"]
        TK["● token_log<br/>━━━━━━━━━━<br/>MUTABLE<br/>step_name → tokens + elapsed_seconds<br/>cleared on clear=True"]
        TI["timing_log<br/>━━━━━━━━━━<br/>MUTABLE<br/>step_name → total_seconds<br/>cleared on clear=True"]
        AU["audit_log<br/>━━━━━━━━━━<br/>MUTABLE<br/>list FailureRecord<br/>cleared on clear=True"]
    end

    subgraph Derived ["DERIVED (computed per query)"]
        WC["★ wall_clock_seconds<br/>━━━━━━━━━━<br/>DERIVED<br/>timing_log.get_report()<br/>merged into token summary response<br/>never persisted"]
    end

    subgraph ClearFence ["★ CLEAR FENCE (write-then-read across restarts)"]
        CM["★ .telemetry_cleared_at<br/>━━━━━━━━━━<br/>WRITE-FENCE<br/>UTC ISO timestamp<br/>written atomically by write_telemetry_clear_marker<br/>read exactly once by _initialize"]
    end

    subgraph DiskReplay ["DISK REPLAY (bounded by clear fence)"]
        SJ["sessions.jsonl + session/<br/>━━━━━━━━━━<br/>REPLAY-SOURCE<br/>historical token + timing + audit data<br/>replayed with since= lower bound"]
        QE["quota_events.jsonl<br/>━━━━━━━━━━<br/>APPEND-ONLY<br/>quota hook writes, never rewrites<br/>read by ★ get_quota_events"]
    end

    subgraph ClearGate ["CLEAR GATE (state mutation trigger)"]
        ClearTrue["clear=True in<br/>● get_token_summary /<br/>● get_timing_summary /<br/>● get_pipeline_report<br/>━━━━━━━━━━<br/>1. Clear in-memory log<br/>2. ★ Write .telemetry_cleared_at"]
    end

    subgraph StartupGate ["★ STARTUP REPLAY GATE (_state._initialize)"]
        INIT["● _state._initialize<br/>━━━━━━━━━━<br/>1. Read .telemetry_cleared_at<br/>2. since = max(now−24h, marker)<br/>3. load_from_log_dir × 3<br/>Guards: no double-counting"]
    end

    ClearTrue -->|"1. in_memory.clear()"| TK
    ClearTrue -->|"1. in_memory.clear()"| TI
    ClearTrue -->|"1. in_memory.clear()"| AU
    ClearTrue -->|"2. write_telemetry_clear_marker()"| CM
    TI -->|"get_report() per query"| WC
    WC -->|"★ merged into response"| TK
    CM -->|"read → since bound"| INIT
    SJ -->|"load_from_log_dir since=effective"| INIT
    INIT -->|"populate (bounded)"| TK
    INIT -->|"populate (bounded)"| TI
    INIT -->|"populate (bounded)"| AU

    class TK,TI,AU handler;
    class WC newComponent;
    class CM newComponent;
    class ClearTrue detector;
    class INIT phase;
    class SJ,QE stateNode;

Color Legend: Orange = MUTABLE in-memory logs | Green = New (`wall_clock_seconds`, `.telemetry_cleared_at` fence) | Dark Red = `clear=True` trigger | Purple = startup replay gate | Teal = persistent disk state
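The clear fence in the State Lifecycle Diagram can be illustrated with a small sketch. The function names loosely mirror `write_telemetry_clear_marker` / `read_telemetry_clear_marker` from this PR, but the signatures, the bodies, and the temp-file-plus-rename strategy are assumptions made for illustration and are not taken from `execution/session_log.py`.

```python
from __future__ import annotations

import os
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Marker filename as named in the PR; everything else here is hypothetical.
CLEAR_MARKER_FILENAME = ".telemetry_cleared_at"


def write_clear_marker(log_root: Path, now: datetime | None = None) -> datetime:
    """Record the clear time as a UTC ISO timestamp (assumed format)."""
    ts = now or datetime.now(timezone.utc)
    log_root.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename it over the
    # marker so a reader never observes a half-written timestamp.
    fd, tmp_path = tempfile.mkstemp(dir=log_root, prefix=".marker-")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(ts.isoformat())
    os.replace(tmp_path, log_root / CLEAR_MARKER_FILENAME)
    return ts


def read_clear_marker(log_root: Path) -> datetime | None:
    """Return the stored timestamp, or None if it is missing or unparseable."""
    try:
        text = (log_root / CLEAR_MARKER_FILENAME).read_text(encoding="utf-8")
        ts = datetime.fromisoformat(text.strip())
    except (OSError, ValueError):
        return None
    # Treat a naive timestamp as UTC so comparisons below stay well-defined.
    return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)


def effective_since(log_root: Path, now: datetime | None = None) -> datetime:
    """Replay lower bound: max(now - 24h, marker), or now - 24h without a marker."""
    now = now or datetime.now(timezone.utc)
    window_start = now - timedelta(hours=24)
    marker = read_clear_marker(log_root)
    return window_start if marker is None else max(window_start, marker)
```

With this shape, a marker written one hour ago narrows the replay window to that hour, while a marker older than 24 hours has no effect because the 24-hour window already excludes those sessions.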
Module Dependency Diagram
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'curve': 'basis'}}}%%
graph TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;

    subgraph L3 ["L3 — SERVER (tools_status.py, _state.py, helpers.py)"]
        direction LR
        TS["● tools_status.py<br/>━━━━━━━━━━<br/>★ _get_log_root()<br/>★ get_quota_events<br/>● 3 clear=True paths"]
        ST["● _state.py<br/>━━━━━━━━━━<br/>● _initialize<br/>reads clear marker"]
        HLP["● helpers.py<br/>━━━━━━━━━━<br/>re-exports<br/>write/read_telemetry_clear_marker<br/>resolve_log_dir"]
    end

    subgraph L1 ["L1 — EXECUTION (execution/__init__.py, session_log.py)"]
        direction LR
        EINIT["● execution/__init__.py<br/>━━━━━━━━━━<br/>★ exports write/read_telemetry_clear_marker<br/>public API surface"]
        SL["● session_log.py<br/>━━━━━━━━━━<br/>★ write_telemetry_clear_marker()<br/>★ read_telemetry_clear_marker()<br/>_CLEAR_MARKER_FILENAME"]
    end

    subgraph L0 ["L0 — CORE (core/types.py)"]
        TY["● core/types.py<br/>━━━━━━━━━━<br/>● UNGATED_TOOLS frozenset<br/>+ get_quota_events"]
    end

    TS -->|"import resolve_log_dir<br/>write/read_telemetry_clear_marker<br/>(via helpers shim)"| HLP
    ST -->|"★ import read_telemetry_clear_marker<br/>(direct from execution)"| EINIT
    HLP -->|"re-export from execution"| EINIT
    EINIT -->|"defined in"| SL
    TS -.->|"UNGATED_TOOLS check<br/>(via pipeline.gate)"| TY

    class TS,ST,HLP cli;
    class EINIT,SL handler;
    class TY stateNode;

Color Legend: Dark Blue = L3 server layer | Orange = L1 execution layer | Teal = L0 core types | Dashed = indirect (via `pipeline.gate`) | All imports flow downward (no violations)
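As a rough illustration of the `wall_clock_seconds` merge and its `elapsed_seconds` fallback, the sketch below assumes both reports are lists of per-step dicts keyed by `step_name`. The commit notes name a `_merge_wall_clock_seconds()` helper; the function here is a hypothetical stand-in with an assumed signature and record shape, not the code in `tools_status.py`.

```python
from __future__ import annotations

from typing import Any


def merge_wall_clock_seconds(
    token_steps: list[dict[str, Any]],
    timing_steps: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Attach wall_clock_seconds to each token entry without mutating the inputs.

    Assumed record shapes (hypothetical):
      token entry:  {"step_name": str, "elapsed_seconds": float, ...}
      timing entry: {"step_name": str, "total_seconds": float}
    """
    # Index the timing report by step name for a single lookup per step.
    totals = {t["step_name"]: t["total_seconds"] for t in timing_steps}

    merged = []
    for step in token_steps:
        entry = dict(step)  # shallow copy so the caller's report is untouched
        # Prefer the timing log; fall back to the token log's own elapsed time
        # when no timing entry exists for this step.
        entry["wall_clock_seconds"] = totals.get(
            step["step_name"], step.get("elapsed_seconds", 0.0)
        )
        merged.append(entry)
    return merged
```

Because the value is computed on each query and attached to a copy, it matches the "derived, never persisted" role that `wall_clock_seconds` has in the State Lifecycle Diagram.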
Closes #302
Implementation Plans
Plan files:
- `temp/make-plan/302_pipeline_observability_plan_2026-03-10_204500.md`
- `temp/make-plan/302_remediation_get_log_root_plan_2026-03-10_210500.md`

🤖 Generated with Claude Code via AutoSkillit