Skip to content

Subagent tree as first-class primitive (#8)#64

Merged
willwashburn merged 6 commits intomainfrom
feat/subagent-tree-8
Apr 24, 2026
Merged

Subagent tree as first-class primitive (#8)#64
willwashburn merged 6 commits intomainfrom
feat/subagent-tree-8

Conversation

@willwashburn
Copy link
Copy Markdown
Member

@willwashburn willwashburn commented Apr 24, 2026

Closes #8.

Summary

  • Reader reconstructs the subagent parent→child→grandchild tree from Claude JSONL alone by walking parentUuid chains, distinguishing spawn boundaries (child user line is the Agent prompt) from continuations (child is the tool_result for the Agent call).
  • TurnRecord.subagent now carries agentId, parentAgentId, parentToolUseId, subagentType, and description. isSidechain is unchanged so existing consumers keep working.
  • burn summary --subagent-tree <session-id> renders the nested tree with cumulative cost rolled up from leaves.
  • burn summary --by-subagent-type aggregates invocations across sessions with count, total, median, p95, and mean cost.

Notes

  • parentAgentId for a first-level subagent is the session id, so the tree has a single main root per session and no null sentinels.
  • resolveInvocation memoizes per-uuid results so nested trees don't re-walk the outer chain for every inner turn.
  • Sidechain turns whose tree fields couldn't be resolved (e.g. partial data before the spawn line was read) still emit with isSidechain: true and attach under an (unresolved) node in the rendered tree so their cost isn't dropped.
  • Hook-based enrichment (Hook-based Claude Code ingest via spawner-injected --settings #7) would give cleaner boundaries for parallel Agent spawns, but isn't required — this PR works from historical JSONL alone.
  • The deeper session-relationship substrate (Execution graph for passive readers: session relationships and tool-result event chronology #42) stays separate; this PR focuses on the subagent-tree UX.

Test plan

  • pnpm test:ts — 265 tests pass
  • New nested-subagent.jsonl fixture exercises 2 levels of nesting
  • Reader test verifies agentId, parentAgentId, parentToolUseId, subagentType, description populated across nested levels
  • Analyze tests verify cumulative roll-up invariant and per-type median/p95
  • Manual smoke test of burn summary --subagent-tree and --by-subagent-type

🤖 Generated with Claude Code


Open in Devin Review

Reconstruct the parent→child→grandchild subagent tree from Claude JSONL
instead of only flagging sidechain turns. Subagent now carries agentId,
parentAgentId, parentToolUseId, subagentType, and description so callers
can attribute cost to a specific invocation within the tree.

New CLI:
  burn summary --subagent-tree <session-id>  # nested tree with rolled-up cost
  burn summary --by-subagent-type            # per-type invocations/median/p95

Backwards compatible: Subagent.isSidechain unchanged; existing consumers
keep working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
devin-ai-integration[bot]

This comment was marked as resolved.

Copy link
Copy Markdown

@barryollama barryollama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🦙

Solid implementation of subagent trees. The parentUuid chain walking with spawn boundary detection (distinguishing child user lines from tool_result continuations) is clean and well-tested.

Highlights:

  • Memoized invocation resolution prevents redundant tree walks for nested subagents
  • Unresolved sidechain turns are handled gracefully via the synthetic '(unresolved)' node
  • Cumulative cost rollup invariant is explicitly tested
  • Backward compatibility maintained via unchanged field
  • CLI tree rendering with ASCII art is a nice touch

The 265 passing tests + new comprehensive test coverage for nested trees and per-type stats gives me confidence. Approving for merge.

Per-package entries for reader (Subagent fields + parentUuid resolution),
analyze (buildSubagentTree, aggregateSubagentTypeStats), cli (summary
--subagent-tree, --by-subagent-type), plus a cross-package summary in
the root changelog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@barryollama
Copy link
Copy Markdown

Review Summary

Overall this is a well-crafted PR that adds significant value to the burn CLI. The subagent tree reconstruction from Claude JSONL data is elegant and the CLI additions are useful.

Status: LGTM with minor suggestions (not blocking)

Key Strengths

✅ Clean parent→child→grandchild tree reconstruction via parentUuid chain walking
✅ Proper distinction between spawn boundaries and tool_result continuations
✅ Good memoization via invocationCache to keep nested trees linear
✅ Comprehensive test coverage with nested subagent fixture
✅ Backwards compatible - isSidechain preserved for existing consumers
✅ CLI UX is intuitive with --subagent-tree and --by-subagent-type flags

Minor Suggestions

1. ResolveInvocation cycle detection (packages/reader/src/claude.ts)
The resolveInvocation() function has a depth guard for recursion, but the while (node) loop walking parentUuid chains could theoretically loop forever on circular references in malformed JSONL. Consider adding a visited-set or iteration limit (currently guarded only by depth param in recursive calls).

2. Defensive access (packages/analyze/src/subagent-tree.ts)
Line ~274: totalsByType.get(type)! uses non-null assertion. Add defensive check:

const totals = totalsByType.get(type);
if (!totals) continue;

3. Test gaps (non-blocking)

  • Sibling subagents: No test for multiple subagents spawned by same parent
  • Empty array: buildSubagentTree([]) should return empty map - not tested
  • Incremental parsing: parseClaudeSessionIncremental updated but test only covers full parse

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes the “subagent tree” a first-class concept by enriching TurnRecord.subagent with stable invocation and parent-link fields derived from Claude JSONL parentUuid chains, and adds CLI/reporting surfaces to render and aggregate subagent costs.

Changes:

  • Extend TurnRecord.subagent with agentId, parentAgentId, parentToolUseId, subagentType, and description, and reconstruct these fields during Claude JSONL parsing.
  • Add burn summary --subagent-tree (tree rendering with cumulative rollups) and --by-subagent-type (cross-session distribution stats).
  • Introduce analyze-layer utilities (buildSubagentTree, aggregateSubagentTypeStats) plus fixtures/tests for nested subagent scenarios.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/fixtures/claude/nested-subagent.jsonl New fixture exercising 2-level nested Agent spawns.
packages/reader/src/types.ts Expands Subagent type with tree/link metadata fields.
packages/reader/src/claude.ts Reconstructs invocation roots/parents from parentUuid chains during parsing.
packages/reader/src/claude.test.ts Adds reader test asserting nested subagent metadata is populated.
packages/cli/src/commands/summary.ts Adds summary modes for subagent-tree rendering and by-type aggregation.
packages/cli/src/cli.ts Updates help text to document new summary flags.
packages/analyze/src/subagent-tree.ts Implements tree building, rollups, and per-type invocation statistics.
packages/analyze/src/subagent-tree.test.ts Tests roll-up invariants, unresolved bucketing, and stats computations.
packages/analyze/src/index.ts Exposes new subagent-tree functions/types from @relayburn/analyze.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +669 to 691
let node = nodes.get(startUuid);
while (node) {
const parent = node.parentUuid ? nodes.get(node.parentUuid) : undefined;
if (!parent) break;
if (
node.kind === 'user' &&
parent.kind === 'assistant' &&
parent.agentToolUse &&
!(node.toolResultIds && node.toolResultIds.has(parent.agentToolUse.id))
) {
const out: InvocationInfo = { rootUuid: node.uuid };
if (parent.agentToolUse.id) out.parentToolUseId = parent.agentToolUse.id;
if (parent.agentToolUse.subagentType !== undefined) out.subagentType = parent.agentToolUse.subagentType;
if (parent.agentToolUse.description !== undefined) out.description = parent.agentToolUse.description;
if (parent.isSidechain) {
const parentInvocation = resolveInvocation(parent.uuid, nodes, cache, depth + 1);
if (parentInvocation) out.parentAgentId = parentInvocation.rootUuid;
}
if (cache) cache.set(startUuid, out);
return out;
}
node = parent;
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveInvocation uses a while (node) loop that advances via node = parent. The only guard is the recursion depth, but if the input has a parentUuid cycle within the same recursion frame (e.g., A→B→A), this loop will never terminate and can hang parsing. Add a visited-set or a max-steps counter inside the loop to guarantee termination on malformed/corrupt JSONL.

Copilot uses AI. Check for mistakes.
Comment on lines 613 to +618
function resolveSubagent(
w: WorkingRecord,
_toolCalls: ToolCall[],
_subagentIndex: Map<string, { type?: string; taskDescription?: string }>,
nodesByUuid?: Map<string, LineNode>,
invocationCache?: Map<string, InvocationInfo | null>,
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveSubagent no longer uses _toolCalls/_subagentIndex, but parseClaudeSession* still builds and populates subagentByToolUseId via captureSubagentFromToolResult. That index is now dead work and makes it harder to see what data actually drives subagent attribution. Either remove the unused map/plumbing, or re-use it to populate legacy subagent.type/taskDescription if those fields are still intended to be supported.

Copilot uses AI. Check for mistakes.
Comment thread packages/reader/src/types.ts Outdated
Comment on lines +48 to +49
// agentId of the parent subagent, or undefined when the parent is the main
// thread. Together with agentId this forms a parent→child tree.
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment for parentAgentId says it's "undefined when the parent is the main thread", but the reader now sets parentAgentId to the session id for first-level subagents (per PR notes and resolveSubagent). Update this doc comment to match the new contract so downstream consumers don't treat undefined as the only "main thread" sentinel.

Suggested change
// agentId of the parent subagent, or undefined when the parent is the main
// thread. Together with agentId this forms a parent→child tree.
// agentId of the parent subagent, or the sessionId when the parent is the
// main thread (for first-level subagents). Together with agentId this forms
// a parent→child tree.

Copilot uses AI. Check for mistakes.
assert.equal(sub2_1.subagent!.parentToolUseId, 'toolu_inner');
assert.equal(sub2_1.subagent!.subagentType, 'code-reviewer');
});

Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test validates full-file parsing, but the production ingest path uses parseClaudeSessionIncremental. Given the new parentUuid-walk logic, add an incremental test (two passes with startOffset advanced) that asserts agentId/parentAgentId/parentToolUseId are still populated for sidechain turns after the first chunk, to prevent regressions.

Suggested change
it('preserves nested sidechain parent metadata when parsed incrementally across chunks', async () => {
const fixturePath = path.join(FIXTURES, 'nested-subagent.jsonl');
const fixtureText = await readFile(fixturePath, 'utf8');
const lines = fixtureText.trimEnd().split('\n');
assert.ok(lines.length >= 4, 'expected nested-subagent fixture to have multiple records');
const firstChunk = `${lines.slice(0, 3).join('\n')}\n`;
const secondChunk = `${lines.slice(3).join('\n')}\n`;
const startOffset = Buffer.byteLength(firstChunk, 'utf8');
const dir = await mkdtemp(path.join(tmpdir(), 'claude-incremental-sidechain-'));
const sessionPath = path.join(dir, 'nested-subagent.jsonl');
try {
await writeFile(sessionPath, firstChunk, 'utf8');
const firstPass = await parseClaudeSessionIncremental(sessionPath, 0);
assert.ok(firstPass.turns.length > 0);
await writeFile(sessionPath, firstChunk + secondChunk, 'utf8');
const secondPass = await parseClaudeSessionIncremental(sessionPath, startOffset);
const byId = new Map(secondPass.turns.map((t) => [t.messageId, t] as const));
const sub1_2 = byId.get('msg_sub1_2')!;
const sub2_1 = byId.get('msg_sub2_1')!;
assert.ok(sub1_2.subagent);
assert.equal(sub1_2.subagent!.agentId, 'u-sub1-user');
assert.equal(sub1_2.subagent!.parentAgentId, '55555555-5555-5555-5555-555555555555');
assert.equal(sub1_2.subagent!.parentToolUseId, 'toolu_outer');
assert.ok(sub2_1.subagent);
assert.equal(sub2_1.subagent!.agentId, 'u-sub2-user');
assert.equal(sub2_1.subagent!.parentAgentId, 'u-sub1-user');
assert.equal(sub2_1.subagent!.parentToolUseId, 'toolu_inner');
} finally {
await rm(dir, { recursive: true, force: true });
}
});

Copilot uses AI. Check for mistakes.
Comment thread packages/analyze/src/subagent-tree.ts Outdated
Comment on lines +163 to +172
// back to the session root when missing).
const parentByNode = new Map<string, string>();
for (const t of turns) {
if (!t.subagent?.agentId) continue;
if (parentByNode.has(t.subagent.agentId)) continue;
parentByNode.set(t.subagent.agentId, t.subagent.parentAgentId ?? sessionId);
}
for (const [id, parentId] of parentByNode) {
const node = byId.get(id);
if (!node) continue;
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When attaching nodes, parentId can equal id (or form longer cycles) if input data is inconsistent. That would create a self-cycle and cause infinite recursion in foldCumulative/sortTree. Add a defensive check to skip/redirect self-parenting (and ideally detect cycles) so tree building can't hang on malformed data.

Suggested change
// back to the session root when missing).
const parentByNode = new Map<string, string>();
for (const t of turns) {
if (!t.subagent?.agentId) continue;
if (parentByNode.has(t.subagent.agentId)) continue;
parentByNode.set(t.subagent.agentId, t.subagent.parentAgentId ?? sessionId);
}
for (const [id, parentId] of parentByNode) {
const node = byId.get(id);
if (!node) continue;
// back to the session root when missing). Defensively redirect malformed
// self-parenting/cyclic relationships to the root so traversal remains a tree.
const parentByNode = new Map<string, string>();
for (const t of turns) {
if (!t.subagent?.agentId) continue;
if (parentByNode.has(t.subagent.agentId)) continue;
parentByNode.set(t.subagent.agentId, t.subagent.parentAgentId ?? sessionId);
}
const resolvedParentByNode = new Map<string, string>();
const resolveParentId = (id: string): string => {
const cached = resolvedParentByNode.get(id);
if (cached) return cached;
const seen = new Set<string>([id]);
let currentParentId = parentByNode.get(id) ?? sessionId;
while (currentParentId !== sessionId) {
if (currentParentId === id || seen.has(currentParentId)) {
currentParentId = sessionId;
break;
}
seen.add(currentParentId);
const nextParentId = parentByNode.get(currentParentId);
if (!nextParentId) break;
currentParentId = nextParentId;
}
resolvedParentByNode.set(id, currentParentId);
return currentParentId;
};
for (const [id] of parentByNode) {
const node = byId.get(id);
if (!node) continue;
const parentId = resolveParentId(id);

Copilot uses AI. Check for mistakes.
Comment on lines 805 to 813
const working = new Map<string, WorkingRecord>();
const order: string[] = [];
const subagentByToolUseId = new Map<
string,
{ type?: string; taskDescription?: string }
>();
const nodesByUuid = new Map<string, LineNode>();
const invocationCache = new Map<string, InvocationInfo | null>();
const messageIdFirstOffset = new Map<string, number>();
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseClaudeSessionIncremental initializes a fresh nodesByUuid/invocationCache for each call, but subagent resolution depends on walking parentUuid chains that often point to lines before startOffset. This means subsequent incremental ingests will frequently emit sidechain turns without agentId/parentToolUseId/parentAgentId, breaking the new subagent-tree attribution for the primary ingest path. Persist enough parentUuid/index state across incremental calls (e.g., return a resume state containing a uuid→LineNode map for seen lines, store it in the cursor, and pass it back in; or re-scan required ancestors) so new turns can still resolve their invocation roots.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 7 additional findings in Devin Review.

Open in Devin Review

Comment on lines +631 to +638
if (info.parentAgentId !== undefined) {
sub.parentAgentId = info.parentAgentId;
} else {
// First-level subagent: parent is the main thread. Use the session id as a
// stable anchor so callers can build parent→child trees without a null
// sentinel.
sub.parentAgentId = w.sessionId;
}
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 resolveSubagent conflates "main-thread parent" with "failed sidechain parent resolution", mis-parenting nested subagents

When resolveInvocation finds the invocation boundary for a nested subagent but the recursive call to resolve the parent sidechain's own invocation fails (e.g., the parent's spawning assistant line is outside the incremental-parse window), out.parentAgentId stays undefined in InvocationInfo. Back in resolveSubagent (packages/reader/src/claude.ts:631-638), the else branch unconditionally sets sub.parentAgentId = w.sessionId — treating the subagent as if its parent is the main thread. This is correct for first-level subagents but wrong for nested subagents whose parent is a sidechain that simply couldn't be resolved. The downstream buildSubagentTree (packages/analyze/src/subagent-tree.ts:168) then attaches the nested subagent directly under the root, producing an incorrect tree structure where an inner subagent appears at the first level instead of nested under its actual parent.

The Subagent.parentAgentId doc comment at packages/reader/src/types.ts:48-49 also doesn't match the implementation — it says the field is "undefined when the parent is the main thread", but the code always sets it to the session ID.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

willwashburn and others added 2 commits April 24, 2026 13:57
# Conflicts:
#	packages/analyze/CHANGELOG.md
#	packages/reader/CHANGELOG.md
- Guard resolveInvocation's parentUuid walk with a visited-set to cover
  malformed A→B→A cycles (depth guard only covered recursion).
- Pre-scan bytes [0, startOffset) in parseClaudeSessionIncremental to
  register parent nodes, so resuming ingest past the spawn line still
  resolves agentId / parentAgentId / parentToolUseId for sidechain
  turns. Add a regression test.
- Redirect self-parent / cyclic subagent attachments to the session
  root in buildSessionTree so foldCumulative / sortTree can't recurse
  infinitely on malformed data.
- Drop the dead subagentByToolUseId plumbing and
  captureSubagentFromToolResult function that resolveSubagent no longer
  reads; simplify resolveSubagent's signature.
- Fix Subagent.parentAgentId doc to match the shipped contract (session
  id for first-level subagents, not undefined).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@willwashburn
Copy link
Copy Markdown
Member Author

Merged main (0.8.0 release) and resolved CHANGELOG conflicts in reader and analyze — the 0.8.0 sections slot below the new subagent-tree [Unreleased] entries.

Addressed all six Copilot inline comments in 7e6ce46:

# Finding Fix
claude.ts:691 — while (node) could loop on parentUuid cycles ✅ Added visited-set to the walk alongside the existing recursion depth guard.
claude.ts:618 — dead subagentByToolUseId plumbing ✅ Removed the map, the captureSubagentFromToolResult function, and the unused _toolCalls / _subagentIndex parameters from resolveSubagent.
types.ts:49 — parentAgentId doc said "undefined for main thread" but code sets sessionId ✅ Updated comment to match the shipped contract.
claude.test.ts:108 — only full-parse tested, incremental path wasn't ✅ New regression test (nested-subagent.jsonl split across two incremental passes) that asserts agentId / parentAgentId / parentToolUseId survive resume.
subagent-tree.ts:172 — self-parent / cyclic parentAgentId would infinite-loop foldCumulative / sortTree ✅ Added resolveParentOrRoot that detects self-cycles and redirects malformed data straight to the session root.
claude.ts:813 — parseClaudeSessionIncremental lost parentUuid state across calls, so resumed sidechain turns came up empty ✅ When startOffset > 0, pre-scan bytes [0, startOffset) in a separate pass that only registers LineNodes (no turn emission, no content, no classification). This is the fix the regression test above actually exercises.

All 266 tests pass (one new).

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

⚠️ 1 issue in files not directly in the diff

⚠️ waste.ts reads dead Subagent.type field instead of newly-populated Subagent.subagentType (packages/analyze/src/waste.ts:241)

In packages/analyze/src/waste.ts:241, turn.subagent?.type reads the old type field on the Subagent interface, which is never populated by any parser. This PR adds and populates a new subagentType field (packages/reader/src/types.ts:53) for the same purpose but doesn't update waste.ts to use it. The result is that subagentType in the ToolAttribution always falls through to the tc.target fallback (packages/analyze/src/waste.ts:257). In practice the values coincide because pickTarget for Agent/Task also reads subagent_type from the tool input (packages/reader/src/claude.ts:619), so aggregateBySubagent still groups correctly — but the field name discrepancy leaves Subagent.type as unreachable dead code and the attribution path fragile against any future change to pickTarget.

View 9 additional findings in Devin Review.

Open in Devin Review

Update Subagent shape and usages: remove the optional `type` and `taskDescription` properties from the Subagent interface in packages/reader/src/types.ts, and update code in packages/analyze/src/waste.ts to read the renamed `subagentType` property (turn.subagent?.subagentType). This aligns the reader types with the analyze package usage and prevents accessing the removed field.
devin-ai-integration[bot]

This comment was marked as resolved.

Remove unused local `subagentType` and change attribution to use `tc.target` for Agent/Task tool calls. Add a clarifying comment explaining that `turn.subagent` refers to the parent invocation, while the spawning call's input (resolved to `tc.target`) identifies the spawned subagent. This fixes an unused-variable/lint issue and ensures the correct subagent type is recorded for spawn tool calls.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Subagent tree as first-class primitive (replace flat isSidechain flag)

3 participants