Sub-Agent Observability

Between 2026-05-20 and 2026-05-22, seven users on anthropics/claude-code filed independent bug reports of the same pattern: a sub-agent was dispatched, the parent received a "completion" claim, and the underlying execution was absent or partial. The cluster collapses into four distinct sub-patterns that share one architectural shape: the sub-agent surface has no operator-visible primitives for verifying what the dispatch actually did.

This page is the free reference for the cluster. It links to the four MIT-licensed defense hooks in this repo, the free per-chapter preview Gists, the interactive self-audit tool, and the related GitHub issues.

Quick install (4 hooks, 1 minute)

for h in dispatch-receipt dispatch-allowlist-preflight dispatch-liveness-watchdog scope-expansion-receipt; do
  npx cc-safe-setup --install-example $h
done

Or use --shield for the maximum safety preset (includes all four plus the rest of the safety hooks).

The four sub-patterns

1. Dispatch fabrication

The sub-agent generates a completion narrative without invoking any tools. The session log shows zero tool calls, yet the narrative reads as if the work was done.


Canonical case: #61167 (OpenClaw, 5 verification agents returning "success" with 0 sessions per agent)
Companion case: #61107 (Opus 4.7 generating structurally correct code with dead-branch validation)
Defense hook: PR #283 `dispatch-receipt`
Closeout companion: PR #299
Preview chapter: Chapter 2, Dispatch Fabrication (20K chars, MIT)

How the hook works: PreToolUse-Agent issues a receipt at dispatch time; PostToolUse-Stop refuses completion narratives lacking the receipt.
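The receipt idea above can be sketched in a few lines. This is a minimal illustration, not the `dispatch-receipt` hook's actual code: the `ReceiptLedger` class, its method names, and the in-memory storage are all hypothetical, and the single-use rule is an assumption added for the example.

```python
import secrets
from typing import Dict, Optional


class ReceiptLedger:
    """Hypothetical in-memory ledger of dispatch receipts (illustration only)."""

    def __init__(self) -> None:
        # receipt id -> description of the dispatch it was issued for
        self._open: Dict[str, str] = {}

    def issue(self, description: str) -> str:
        """Issue a receipt at dispatch time (the PreToolUse step in the text)."""
        receipt_id = secrets.token_hex(8)
        self._open[receipt_id] = description
        return receipt_id

    def accept_completion(self, receipt_id: Optional[str]) -> bool:
        """Accept a completion narrative only if it carries a known, unused receipt."""
        if receipt_id is None or receipt_id not in self._open:
            return False
        # Assumption for this sketch: each receipt can close exactly one dispatch.
        del self._open[receipt_id]
        return True
```

A completion narrative that arrives with no receipt, or with an id the ledger never issued, is refused; a valid receipt is accepted once and then retired.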

2. Silent stall

The sub-agent blocks on a hidden condition (MCP permission gate, OAuth prompt, missing tool in allowlist) and the blocked state fails to propagate. The parent waits indefinitely.


Canonical case: #60987 (spawn-time pty absence; child process dies, parent reports "Spawned successfully")
Companion case: #61315 (MCP permission gate, blocked-state not propagated to parent UI)
Companion case 2: #61547 (entry-tool-dispatch boundary; spawn succeeds, first tool never fires)
Defense hook: PR #286 `dispatch-allowlist-preflight`
Preview chapter: Chapter 3, Silent Stall (19K chars, MIT)

How the hook works: PreToolUse-Agent checks the sub-agent's declared tool allowlist against known-blocked conditions; refuses dispatch when required tools would block.
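A preflight check of this kind might look like the sketch below. It is an illustration under assumptions rather than the `dispatch-allowlist-preflight` hook itself: the `preflight` function, its parameters, the reason strings, and the tool names in the usage note are hypothetical.

```python
from typing import Iterable, List, Set


def preflight(required_tools: Iterable[str],
              allowlist: Set[str],
              known_blocked: Set[str]) -> List[str]:
    """Return reasons to refuse a dispatch; an empty list means it may proceed."""
    problems: List[str] = []
    for tool in required_tools:
        if tool not in allowlist:
            # The sub-agent would stall waiting for a tool it was never granted.
            problems.append(f"{tool}: missing from allowlist")
        elif tool in known_blocked:
            # The tool is granted but known to block (e.g. behind a permission gate).
            problems.append(f"{tool}: known-blocked condition")
    return problems
```

For example, `preflight(["Read", "mcp__x"], {"Read"}, set())` returns one reason for `mcp__x`, so the dispatch would be refused before the sub-agent has a chance to stall silently.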

3. Absence of observation and control

The dispatch may be making real progress, may be blocked, or may be dead, and the operator has no affordance to ask which. The Agent tool's contract is "wait however long it takes."


Canonical case: #61405 (12-hour silent hang; force-kill loses parent session's in-flight state)
Defense hook: PR #298 `dispatch-liveness-watchdog`
Preview chapter: Chapter 4, Supervision Absence (28K chars, MIT)

How the hook works: an operator-side wall-clock watchdog checks each open dispatch against its timeout window at the UserPromptSubmit boundary and flags a visible hang on the next prompt.

This sub-pattern is the hardest to fully solve from the operator side — the watchdog surfaces symptoms but cannot retrieve in-flight state. Complete resolution requires harness-layer primitives (per-dispatch timeout, progress polling, abort).
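The wall-clock part of the watchdog can be sketched as follows. This is a simplified model, not the `dispatch-liveness-watchdog` hook's code: the `LivenessWatchdog` class, the single fixed timeout, and the explicit `now` parameter (used here to make the example deterministic) are assumptions.

```python
import time
from typing import Dict, List, Optional


class LivenessWatchdog:
    """Hypothetical wall-clock tracker for dispatches that have not returned."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._started: Dict[str, float] = {}  # dispatch id -> start timestamp

    def dispatch_started(self, dispatch_id: str, now: Optional[float] = None) -> None:
        self._started[dispatch_id] = time.time() if now is None else now

    def dispatch_finished(self, dispatch_id: str) -> None:
        self._started.pop(dispatch_id, None)

    def overdue(self, now: Optional[float] = None) -> List[str]:
        """Ids of dispatches still open past the timeout; checked on each prompt."""
        current = time.time() if now is None else now
        return sorted(d for d, started in self._started.items()
                      if current - started > self.timeout_seconds)
```

Calling `overdue()` at each prompt boundary only reports that a dispatch has exceeded its window. As noted above, it cannot recover the in-flight state or abort the dispatch.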

4. Scope expansion

The parent treats sub-agent output as authorization rather than evidence. User says "delete X"; sub-agents enumerate adjacent items; parent executes against the union.


Canonical case: #61102 (Awis13: "delete caches and simulators" → ~120GB removed, including `node_modules`, Docker Desktop, Ollama, Android SDK)
Defense hook: PR #282 `scope-expansion-receipt` (merged)
Preview chapter: Chapter 5, Scope Expansion (35K chars, MIT)

How the hook works: separates user-originated authorization from sub-agent-originated enumeration; refuses destructive actions on enumerated targets that lack explicit operator authorization. Implements Keesan12's principle from #61102 comment 4511076636: "subagent output is evidence, not authorization."
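The split between authorization and enumeration can be illustrated with a small sketch. It is not the `scope-expansion-receipt` hook's implementation: the `partition_targets` function and the exact-string matching of targets are assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple


def partition_targets(user_authorized: Set[str],
                      enumerated: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split sub-agent-enumerated targets into (allowed, refused).

    Only targets the user named explicitly are allowed; everything else the
    sub-agent surfaced is treated as evidence and refused for destructive use.
    """
    allowed: List[str] = []
    refused: List[str] = []
    for target in enumerated:
        if target in user_authorized:
            allowed.append(target)
        else:
            refused.append(target)
    return allowed, refused
```

With the #61102 request as input, `partition_targets({"caches", "simulators"}, ["caches", "node_modules", "simulators", "Docker Desktop"])` allows only `caches` and `simulators`. The adjacent items would need explicit operator authorization before any deletion.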

Self-diagnose (5 minutes, 12 checkbox questions)

The Sub-Agent Failure Self-Audit is a single-HTML interactive tool (no signup, no telemetry, MIT). Twelve symptom checkboxes (three per sub-pattern); returns per-pattern HIT / AT-RISK / SAFE classification, the matching hook from cc-safe-setup, the free preview chapter, and the related GitHub issues.
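The per-pattern classification could work along the lines of the sketch below. The thresholds are an assumption chosen for illustration (no boxes checked is SAFE, one is AT-RISK, two or three is HIT); the tool's real cut-offs are not stated on this page, and the `classify` function is hypothetical.

```python
def classify(checked: int) -> str:
    """Map the number of checked symptoms (0 to 3) for one sub-pattern to a label.

    Thresholds are assumed for this sketch, not taken from the self-audit tool.
    """
    if not 0 <= checked <= 3:
        raise ValueError("each sub-pattern has exactly three symptom checkboxes")
    if checked == 0:
        return "SAFE"
    if checked == 1:
        return "AT-RISK"
    return "HIT"
```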

Estimate the monthly cost (5 sliders)

The Sub-Agent Failure Cost Calculator is a single-HTML estimator (no signup, MIT). Five sliders (dispatches/day, average cost/dispatch, failure rate, recovery hours per incident, engineer hourly cost) compute monthly USD waste from silent sub-agent failures, with a per-pattern breakdown and the matching cc-safe-setup hook for each pattern. It is useful for operators who want to justify installing the four hooks, to their team or to themselves, with a number rather than a worry.
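One plausible way to combine the five slider inputs is sketched below. The formula is an assumption, not the calculator's published method: it charges each failed dispatch its own cost plus the engineer time spent recovering, and it assumes a 30-day month. The `monthly_waste_usd` function and its parameter names are hypothetical.

```python
def monthly_waste_usd(dispatches_per_day: float,
                      cost_per_dispatch_usd: float,
                      failure_rate: float,
                      recovery_hours_per_incident: float,
                      engineer_hourly_usd: float,
                      days_per_month: int = 30) -> float:
    """Estimate monthly USD lost to silent sub-agent failures (assumed formula)."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be a fraction between 0 and 1")
    failed_dispatches = dispatches_per_day * days_per_month * failure_rate
    # Each failure wastes the dispatch itself plus the engineer time to recover.
    cost_per_failure = (cost_per_dispatch_usd
                        + recovery_hours_per_incident * engineer_hourly_usd)
    return failed_dispatches * cost_per_failure
```

Under these assumptions, 20 dispatches per day at $0.50 each with a 5% failure rate, half an hour of recovery per incident, and an $80 hourly cost gives 30 failures per month at $40.50 each, or about $1,215.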

Free long-form reference (English meta-analysis, 2,270 words)

The four sub-patterns long-form Gist walks through the cluster as a single architectural shape, the operator-vs-harness boundary mapping, and the receipt-persistence layer that the four hooks share.

Free chapter 1 in Japanese (cluster timeline)

Chapter 1: Timeline and articulation of the 7-case cluster (written in Japanese, 8,893 chars). The 72-hour window's seven independent reports, in chronological order, with the sub-pattern classification table at the end.

Adjacent: nested sub-agent dispatch gap (different cluster)

A separate four-month, five-issue cluster on the nested sub-agent dispatch surface (a sub-agent trying to spawn its own sub-agent via Task or Agent finds the tool absent at runtime): #19077 / #46424 / #59523 / #60763 / #61993. Different primitive, not covered by the four hooks above. The Gist articulates the short-lived (Task/Agent) vs persistent (TeamCreate/SendMessage) primitive split that the official docs collapse into one statement.

Going deeper: Sub-Agent Observability Handbook

The four free preview chapters above are designed to be enough for most operators. If you want the full architectural analysis (the receipt-persistence layer that the four hooks share, the structural shape across all four sub-patterns, the operator-vs-harness boundary mapping, and the cluster catalog from May 2026), the paid Sub-Agent Observability Handbook ($19, ~180K chars, 73-page PDF) ships on 2026-05-27.

Independent operator, not affiliated with Anthropic. Hook code and tests are MIT regardless of book purchase.

Related guides on this wiki

Home — repo overview and free tools
Token Optimization Guide — separate axis (token waste vs sub-agent observability)
CLAUDE.md Best Practices — write CLAUDE.md that saves tokens

cc-safe-setup is an independent operator-side defense toolkit for Claude Code. MIT-licensed hooks and tests. Not affiliated with Anthropic. Issues and PRs welcome on the main repo.
