Subagent tree as first-class primitive (#8)#64
Conversation
Reconstruct the parent→child→grandchild subagent tree from Claude JSONL instead of only flagging sidechain turns. Subagent now carries agentId, parentAgentId, parentToolUseId, subagentType, and description so callers can attribute cost to a specific invocation within the tree. New CLI: burn summary --subagent-tree <session-id> # nested tree with rolled-up cost burn summary --by-subagent-type # per-type invocations/median/p95 Backwards compatible: Subagent.isSidechain unchanged; existing consumers keep working. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
barryollama
left a comment
There was a problem hiding this comment.
LGTM 🦙
Solid implementation of subagent trees. The parentUuid chain walking with spawn boundary detection (distinguishing child user lines from tool_result continuations) is clean and well-tested.
Highlights:
- Memoized invocation resolution prevents redundant tree walks for nested subagents
- Unresolved sidechain turns are handled gracefully via the synthetic '(unresolved)' node
- Cumulative cost rollup invariant is explicitly tested
- Backward compatibility maintained via unchanged field
- CLI tree rendering with ASCII art is a nice touch
The 265 passing tests + new comprehensive test coverage for nested trees and per-type stats gives me confidence. Approving for merge.
Per-package entries for reader (Subagent fields + parentUuid resolution), analyze (buildSubagentTree, aggregateSubagentTypeStats), cli (summary --subagent-tree, --by-subagent-type), plus a cross-package summary in the root changelog. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review SummaryOverall this is a well-crafted PR that adds significant value to the burn CLI. The subagent tree reconstruction from Claude JSONL data is elegant and the CLI additions are useful. Status: LGTM with minor suggestions (not blocking) Key Strengths✅ Clean parent→child→grandchild tree reconstruction via Minor Suggestions1. ResolveInvocation cycle detection ( 2. Defensive access ( const totals = totalsByType.get(type);
if (!totals) continue;3. Test gaps (non-blocking)
|
There was a problem hiding this comment.
Pull request overview
This PR makes the “subagent tree” a first-class concept by enriching TurnRecord.subagent with stable invocation and parent-link fields derived from Claude JSONL parentUuid chains, and adds CLI/reporting surfaces to render and aggregate subagent costs.
Changes:
- Extend
TurnRecord.subagentwithagentId,parentAgentId,parentToolUseId,subagentType, anddescription, and reconstruct these fields during Claude JSONL parsing. - Add
burn summary --subagent-tree(tree rendering with cumulative rollups) and--by-subagent-type(cross-session distribution stats). - Introduce analyze-layer utilities (
buildSubagentTree,aggregateSubagentTypeStats) plus fixtures/tests for nested subagent scenarios.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/fixtures/claude/nested-subagent.jsonl | New fixture exercising 2-level nested Agent spawns. |
| packages/reader/src/types.ts | Expands Subagent type with tree/link metadata fields. |
| packages/reader/src/claude.ts | Reconstructs invocation roots/parents from parentUuid chains during parsing. |
| packages/reader/src/claude.test.ts | Adds reader test asserting nested subagent metadata is populated. |
| packages/cli/src/commands/summary.ts | Adds summary modes for subagent-tree rendering and by-type aggregation. |
| packages/cli/src/cli.ts | Updates help text to document new summary flags. |
| packages/analyze/src/subagent-tree.ts | Implements tree building, rollups, and per-type invocation statistics. |
| packages/analyze/src/subagent-tree.test.ts | Tests roll-up invariants, unresolved bucketing, and stats computations. |
| packages/analyze/src/index.ts | Exposes new subagent-tree functions/types from @relayburn/analyze. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let node = nodes.get(startUuid); | ||
| while (node) { | ||
| const parent = node.parentUuid ? nodes.get(node.parentUuid) : undefined; | ||
| if (!parent) break; | ||
| if ( | ||
| node.kind === 'user' && | ||
| parent.kind === 'assistant' && | ||
| parent.agentToolUse && | ||
| !(node.toolResultIds && node.toolResultIds.has(parent.agentToolUse.id)) | ||
| ) { | ||
| const out: InvocationInfo = { rootUuid: node.uuid }; | ||
| if (parent.agentToolUse.id) out.parentToolUseId = parent.agentToolUse.id; | ||
| if (parent.agentToolUse.subagentType !== undefined) out.subagentType = parent.agentToolUse.subagentType; | ||
| if (parent.agentToolUse.description !== undefined) out.description = parent.agentToolUse.description; | ||
| if (parent.isSidechain) { | ||
| const parentInvocation = resolveInvocation(parent.uuid, nodes, cache, depth + 1); | ||
| if (parentInvocation) out.parentAgentId = parentInvocation.rootUuid; | ||
| } | ||
| if (cache) cache.set(startUuid, out); | ||
| return out; | ||
| } | ||
| node = parent; | ||
| } |
There was a problem hiding this comment.
resolveInvocation uses a while (node) loop that advances via node = parent. The only guard is the recursion depth, but if the input has a parentUuid cycle within the same recursion frame (e.g., A→B→A), this loop will never terminate and can hang parsing. Add a visited-set or a max-steps counter inside the loop to guarantee termination on malformed/corrupt JSONL.
| function resolveSubagent( | ||
| w: WorkingRecord, | ||
| _toolCalls: ToolCall[], | ||
| _subagentIndex: Map<string, { type?: string; taskDescription?: string }>, | ||
| nodesByUuid?: Map<string, LineNode>, | ||
| invocationCache?: Map<string, InvocationInfo | null>, |
There was a problem hiding this comment.
resolveSubagent no longer uses _toolCalls/_subagentIndex, but parseClaudeSession* still builds and populates subagentByToolUseId via captureSubagentFromToolResult. That index is now dead work and makes it harder to see what data actually drives subagent attribution. Either remove the unused map/plumbing, or re-use it to populate legacy subagent.type/taskDescription if those fields are still intended to be supported.
| // agentId of the parent subagent, or undefined when the parent is the main | ||
| // thread. Together with agentId this forms a parent→child tree. |
There was a problem hiding this comment.
The comment for parentAgentId says it's "undefined when the parent is the main thread", but the reader now sets parentAgentId to the session id for first-level subagents (per PR notes and resolveSubagent). Update this doc comment to match the new contract so downstream consumers don't treat undefined as the only "main thread" sentinel.
| // agentId of the parent subagent, or undefined when the parent is the main | |
| // thread. Together with agentId this forms a parent→child tree. | |
| // agentId of the parent subagent, or the sessionId when the parent is the | |
| // main thread (for first-level subagents). Together with agentId this forms | |
| // a parent→child tree. |
| assert.equal(sub2_1.subagent!.parentToolUseId, 'toolu_inner'); | ||
| assert.equal(sub2_1.subagent!.subagentType, 'code-reviewer'); | ||
| }); | ||
|
|
There was a problem hiding this comment.
This test validates full-file parsing, but the production ingest path uses parseClaudeSessionIncremental. Given the new parentUuid-walk logic, add an incremental test (two passes with startOffset advanced) that asserts agentId/parentAgentId/parentToolUseId are still populated for sidechain turns after the first chunk, to prevent regressions.
| it('preserves nested sidechain parent metadata when parsed incrementally across chunks', async () => { | |
| const fixturePath = path.join(FIXTURES, 'nested-subagent.jsonl'); | |
| const fixtureText = await readFile(fixturePath, 'utf8'); | |
| const lines = fixtureText.trimEnd().split('\n'); | |
| assert.ok(lines.length >= 4, 'expected nested-subagent fixture to have multiple records'); | |
| const firstChunk = `${lines.slice(0, 3).join('\n')}\n`; | |
| const secondChunk = `${lines.slice(3).join('\n')}\n`; | |
| const startOffset = Buffer.byteLength(firstChunk, 'utf8'); | |
| const dir = await mkdtemp(path.join(tmpdir(), 'claude-incremental-sidechain-')); | |
| const sessionPath = path.join(dir, 'nested-subagent.jsonl'); | |
| try { | |
| await writeFile(sessionPath, firstChunk, 'utf8'); | |
| const firstPass = await parseClaudeSessionIncremental(sessionPath, 0); | |
| assert.ok(firstPass.turns.length > 0); | |
| await writeFile(sessionPath, firstChunk + secondChunk, 'utf8'); | |
| const secondPass = await parseClaudeSessionIncremental(sessionPath, startOffset); | |
| const byId = new Map(secondPass.turns.map((t) => [t.messageId, t] as const)); | |
| const sub1_2 = byId.get('msg_sub1_2')!; | |
| const sub2_1 = byId.get('msg_sub2_1')!; | |
| assert.ok(sub1_2.subagent); | |
| assert.equal(sub1_2.subagent!.agentId, 'u-sub1-user'); | |
| assert.equal(sub1_2.subagent!.parentAgentId, '55555555-5555-5555-5555-555555555555'); | |
| assert.equal(sub1_2.subagent!.parentToolUseId, 'toolu_outer'); | |
| assert.ok(sub2_1.subagent); | |
| assert.equal(sub2_1.subagent!.agentId, 'u-sub2-user'); | |
| assert.equal(sub2_1.subagent!.parentAgentId, 'u-sub1-user'); | |
| assert.equal(sub2_1.subagent!.parentToolUseId, 'toolu_inner'); | |
| } finally { | |
| await rm(dir, { recursive: true, force: true }); | |
| } | |
| }); |
| // back to the session root when missing). | ||
| const parentByNode = new Map<string, string>(); | ||
| for (const t of turns) { | ||
| if (!t.subagent?.agentId) continue; | ||
| if (parentByNode.has(t.subagent.agentId)) continue; | ||
| parentByNode.set(t.subagent.agentId, t.subagent.parentAgentId ?? sessionId); | ||
| } | ||
| for (const [id, parentId] of parentByNode) { | ||
| const node = byId.get(id); | ||
| if (!node) continue; |
There was a problem hiding this comment.
When attaching nodes, parentId can equal id (or form longer cycles) if input data is inconsistent. That would create a self-cycle and cause infinite recursion in foldCumulative/sortTree. Add a defensive check to skip/redirect self-parenting (and ideally detect cycles) so tree building can't hang on malformed data.
| // back to the session root when missing). | |
| const parentByNode = new Map<string, string>(); | |
| for (const t of turns) { | |
| if (!t.subagent?.agentId) continue; | |
| if (parentByNode.has(t.subagent.agentId)) continue; | |
| parentByNode.set(t.subagent.agentId, t.subagent.parentAgentId ?? sessionId); | |
| } | |
| for (const [id, parentId] of parentByNode) { | |
| const node = byId.get(id); | |
| if (!node) continue; | |
| // back to the session root when missing). Defensively redirect malformed | |
| // self-parenting/cyclic relationships to the root so traversal remains a tree. | |
| const parentByNode = new Map<string, string>(); | |
| for (const t of turns) { | |
| if (!t.subagent?.agentId) continue; | |
| if (parentByNode.has(t.subagent.agentId)) continue; | |
| parentByNode.set(t.subagent.agentId, t.subagent.parentAgentId ?? sessionId); | |
| } | |
| const resolvedParentByNode = new Map<string, string>(); | |
| const resolveParentId = (id: string): string => { | |
| const cached = resolvedParentByNode.get(id); | |
| if (cached) return cached; | |
| const seen = new Set<string>([id]); | |
| let currentParentId = parentByNode.get(id) ?? sessionId; | |
| while (currentParentId !== sessionId) { | |
| if (currentParentId === id || seen.has(currentParentId)) { | |
| currentParentId = sessionId; | |
| break; | |
| } | |
| seen.add(currentParentId); | |
| const nextParentId = parentByNode.get(currentParentId); | |
| if (!nextParentId) break; | |
| currentParentId = nextParentId; | |
| } | |
| resolvedParentByNode.set(id, currentParentId); | |
| return currentParentId; | |
| }; | |
| for (const [id] of parentByNode) { | |
| const node = byId.get(id); | |
| if (!node) continue; | |
| const parentId = resolveParentId(id); |
| const working = new Map<string, WorkingRecord>(); | ||
| const order: string[] = []; | ||
| const subagentByToolUseId = new Map< | ||
| string, | ||
| { type?: string; taskDescription?: string } | ||
| >(); | ||
| const nodesByUuid = new Map<string, LineNode>(); | ||
| const invocationCache = new Map<string, InvocationInfo | null>(); | ||
| const messageIdFirstOffset = new Map<string, number>(); |
There was a problem hiding this comment.
parseClaudeSessionIncremental initializes a fresh nodesByUuid/invocationCache for each call, but subagent resolution depends on walking parentUuid chains that often point to lines before startOffset. This means subsequent incremental ingests will frequently emit sidechain turns without agentId/parentToolUseId/parentAgentId, breaking the new subagent-tree attribution for the primary ingest path. Persist enough parentUuid/index state across incremental calls (e.g., return a resume state containing a uuid→LineNode map for seen lines, store it in the cursor, and pass it back in; or re-scan required ancestors) so new turns can still resolve their invocation roots.
| if (info.parentAgentId !== undefined) { | ||
| sub.parentAgentId = info.parentAgentId; | ||
| } else { | ||
| // First-level subagent: parent is the main thread. Use the session id as a | ||
| // stable anchor so callers can build parent→child trees without a null | ||
| // sentinel. | ||
| sub.parentAgentId = w.sessionId; | ||
| } |
There was a problem hiding this comment.
🟡 resolveSubagent conflates "main-thread parent" with "failed sidechain parent resolution", mis-parenting nested subagents
When resolveInvocation finds the invocation boundary for a nested subagent but the recursive call to resolve the parent sidechain's own invocation fails (e.g., the parent's spawning assistant line is outside the incremental-parse window), out.parentAgentId stays undefined in InvocationInfo. Back in resolveSubagent (packages/reader/src/claude.ts:631-638), the else branch unconditionally sets sub.parentAgentId = w.sessionId — treating the subagent as if its parent is the main thread. This is correct for first-level subagents but wrong for nested subagents whose parent is a sidechain that simply couldn't be resolved. The downstream buildSubagentTree (packages/analyze/src/subagent-tree.ts:168) then attaches the nested subagent directly under the root, producing an incorrect tree structure where an inner subagent appears at the first level instead of nested under its actual parent.
The Subagent.parentAgentId doc comment at packages/reader/src/types.ts:48-49 also doesn't match the implementation — it says the field is "undefined when the parent is the main thread", but the code always sets it to the session ID.
Was this helpful? React with 👍 or 👎 to provide feedback.
# Conflicts: # packages/analyze/CHANGELOG.md # packages/reader/CHANGELOG.md
- Guard resolveInvocation's parentUuid walk with a visited-set to cover malformed A→B→A cycles (depth guard only covered recursion). - Pre-scan bytes [0, startOffset) in parseClaudeSessionIncremental to register parent nodes, so resuming ingest past the spawn line still resolves agentId / parentAgentId / parentToolUseId for sidechain turns. Add a regression test. - Redirect self-parent / cyclic subagent attachments to the session root in buildSessionTree so foldCumulative / sortTree can't recurse infinitely on malformed data. - Drop the dead subagentByToolUseId plumbing and captureSubagentFromToolResult function that resolveSubagent no longer reads; simplify resolveSubagent's signature. - Fix Subagent.parentAgentId doc to match the shipped contract (session id for first-level subagents, not undefined). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Merged Addressed all six Copilot inline comments in 7e6ce46:
All 266 tests pass (one new). |
There was a problem hiding this comment.
Devin Review found 1 new potential issue.
⚠️ 1 issue in files not directly in the diff
⚠️ waste.ts reads dead Subagent.type field instead of newly-populated Subagent.subagentType (packages/analyze/src/waste.ts:241)
In packages/analyze/src/waste.ts:241, turn.subagent?.type reads the old type field on the Subagent interface, which is never populated by any parser. This PR adds and populates a new subagentType field (packages/reader/src/types.ts:53) for the same purpose but doesn't update waste.ts to use it. The result is that subagentType in the ToolAttribution always falls through to the tc.target fallback (packages/analyze/src/waste.ts:257). In practice the values coincide because pickTarget for Agent/Task also reads subagent_type from the tool input (packages/reader/src/claude.ts:619), so aggregateBySubagent still groups correctly — but the field name discrepancy leaves Subagent.type as unreachable dead code and the attribution path fragile against any future change to pickTarget.
View 9 additional findings in Devin Review.
Update Subagent shape and usages: remove the optional `type` and `taskDescription` properties from the Subagent interface in packages/reader/src/types.ts, and update code in packages/analyze/src/waste.ts to read the renamed `subagentType` property (turn.subagent?.subagentType). This aligns the reader types with the analyze package usage and prevents accessing the removed field.
Remove unused local `subagentType` and change attribution to use `tc.target` for Agent/Task tool calls. Add a clarifying comment explaining that `turn.subagent` refers to the parent invocation, while the spawning call's input (resolved to `tc.target`) identifies the spawned subagent. This fixes an unused-variable/lint issue and ensures the correct subagent type is recorded for spawn tool calls.
Closes #8.
Summary
parentUuidchains, distinguishing spawn boundaries (child user line is the Agent prompt) from continuations (child is the tool_result for the Agent call).TurnRecord.subagentnow carriesagentId,parentAgentId,parentToolUseId,subagentType, anddescription.isSidechainis unchanged so existing consumers keep working.burn summary --subagent-tree <session-id>renders the nested tree with cumulative cost rolled up from leaves.burn summary --by-subagent-typeaggregates invocations across sessions with count, total, median, p95, and mean cost.Notes
parentAgentIdfor a first-level subagent is the session id, so the tree has a single main root per session and no null sentinels.resolveInvocationmemoizes per-uuid results so nested trees don't re-walk the outer chain for every inner turn.isSidechain: trueand attach under an(unresolved)node in the rendered tree so their cost isn't dropped.Test plan
pnpm test:ts— 265 tests passnested-subagent.jsonlfixture exercises 2 levels of nestingagentId,parentAgentId,parentToolUseId,subagentType,descriptionpopulated across nested levelsburn summary --subagent-treeand--by-subagent-type🤖 Generated with Claude Code