
feat: session DAG with fork detection and subagent linking #43

Merged
wesm merged 5 commits into wesm:main from clkao:session-dag
Feb 26, 2026
Conversation

@clkao
Contributor

@clkao clkao commented Feb 25, 2026

Closes #17

Summary

  • Detect conversation forks in Claude Code sessions by building a uuid/parentUuid DAG, splitting large-gap branches into separate session records
  • Link Task tool calls to their spawned subagent sessions via queue-operation parsing (JSON and XML formats)
  • Add inline subagent conversation expansion in the frontend

End-User Impact

  • Fork detection: Conversation branches (e.g., from Claude Code's "retry from here") are detected and split into separate session entries instead of being jumbled together. Small retries (≤3 user turns) fold into the main session; larger forks appear as standalone sessions.
  • Subagent session linking: Task subagent conversations are linked to the specific tool call that launched them. Task tool blocks show an expandable inline view of the subagent's conversation.
  • Cleaner session list: Subagent and fork sessions are hidden from the top-level list — accessible only through their parent session context.

Design

DAG from uuid/parentUuid: Every Claude Code JSONL entry carries uuid and parentUuid fields forming a tree. The parser builds a parent→children adjacency map in a single pass, then walks from root following first-child links. At fork points (nodes with multiple children), a heuristic counts user turns remaining on the first branch: >3 turns = real fork (split into separate ParseResult), ≤3 turns = retry (follow latest child, discard older branch).
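As a rough illustration of the traversal described above, here is a minimal sketch. The `entry` type, the `IsUser` flag, and the function names `buildChildren`, `countUserTurns`, and `walk` are hypothetical stand-ins, not the parser's actual code; only the uuid/parentUuid linkage and the 3-turn threshold come from the description. For simplicity it indexes UUIDs in a separate loop rather than a single pass.

```go
package main

// entry is a hypothetical, trimmed-down view of one JSONL record.
type entry struct {
	UUID, ParentUUID string
	IsUser           bool
}

// maxRetryTurns is the threshold from the PR description:
// more than 3 user turns on the older branch means a real fork.
const maxRetryTurns = 3

// buildChildren returns a parent-index -> child-indices map (children kept
// in file order) plus the indices of entries that have no local parent.
func buildChildren(entries []entry) (children map[int][]int, roots []int) {
	index := make(map[string]int, len(entries))
	for i, e := range entries {
		index[e.UUID] = i
	}
	children = make(map[int][]int)
	for i, e := range entries {
		if p, ok := index[e.ParentUUID]; ok && e.ParentUUID != "" {
			children[p] = append(children[p], i)
		} else {
			roots = append(roots, i)
		}
	}
	return children, roots
}

// countUserTurns counts user entries along the first-child chain from start.
func countUserTurns(entries []entry, children map[int][]int, start int) int {
	n := 0
	for cur := start; ; {
		if entries[cur].IsUser {
			n++
		}
		kids := children[cur]
		if len(kids) == 0 {
			return n
		}
		cur = kids[0]
	}
}

// walk follows the conversation from start. It returns the main path and
// every branch that was split off as a separate fork.
func walk(entries []entry, children map[int][]int, start int) (path []int, forks [][]int) {
	cur := start
	for {
		path = append(path, cur)
		kids := children[cur]
		if len(kids) == 0 {
			return path, forks
		}
		next := kids[0]
		if len(kids) > 1 {
			if countUserTurns(entries, children, kids[0]) > maxRetryTurns {
				// Real fork: stay on the first branch, split the others off.
				for _, k := range kids[1:] {
					fp, nested := walk(entries, children, k)
					forks = append(forks, fp)
					forks = append(forks, nested...)
				}
			} else {
				// Retry: discard the older branch, follow the latest child.
				next = kids[len(kids)-1]
			}
		}
		cur = next
	}
}
```

Calling `walk(entries, children, roots[0])` on a session whose first branch has only one user turn yields a single path through the latest child and no forks; with more than three user turns on the first branch, the later sibling comes back as a separate fork path.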

Subagent linking from queue-operation: Claude Code writes queue-operation entries with operation: "enqueue" when spawning subagents. The content field maps tool_use_id to task_id. Two formats exist in the wild: JSON ({"task_id":"...","tool_use_id":"..."}) and XML (<task-id>...</task-id><tool-use-id>...</tool-use-id>). The parser tries JSON first via gjson.Get, then falls back to a regex for the XML tags.

Data model — relationship_type: Sessions gain a relationship_type column ("", "continuation", "subagent", "fork"). Fork sessions get ID {parent}-{first-uuid} with parent_session_id pointing to main session. Tool calls gain subagent_session_id linking to the agent session.

API — child sessions endpoint: GET /api/v1/sessions/{id}/children returns fork/continuation/subagent sessions for a parent.

Test plan

  • Verify fork detection unit tests pass (go test ./internal/parser/ -run TestForkDetection)
  • Verify subagent linking tests pass (go test ./internal/parser/ -run TestSubagent)
  • Verify integration tests pass (go test ./internal/sync/ -run TestSync)
  • Manual: load a session with Task subagents and verify inline expansion works
  • Manual: verify subagent/fork sessions don't appear in top-level session list

🤖 Generated with Claude Code

@roborev-ci

roborev-ci bot commented Feb 25, 2026

roborev: Combined Review (8a5a471)

The PR introduces valuable DAG and subagent parsing capabilities, but contains several significant logic and security flaws in graph traversal that need to be addressed before merging.

High Severity

  • File: internal/parser/claude.go (parseDAG)
    Issue: Only the first root is traversed. Any additional roots or disconnected UUID components are silently dropped, which can result in lost messages and sessions compared to the prior linear behavior.
    Remediation: Process all roots (and any unvisited entries), or fall back to linear parsing when graph connectivity is ambiguous (e.g., len(roots) != 1 or unvisited nodes remain). Add tests for multi-root and missing-parent/disconnected DAG inputs.

Medium Severity

  • File: internal/parser/claude.go (lines ~229, ~321 in walkBranch and countUserTurns)
    Issue: Denial of Service (DoS) via infinite loop in DAG parsing. Both walkBranch and countUserTurns traverse DAG links without cycle detection. A malformed or maliciously crafted .jsonl file containing a cycle (e.g., node A sets parentUuid to node B, and node B sets parentUuid to node A) will trap the parser in an infinite loop, causing 100% CPU usage and eventually an Out-Of-Memory (OOM) crash.
    Remediation: Implement cycle detection in both graph-traversal functions by tracking visited node indices/UUIDs (e.g., visited := make(map[int]bool)) and breaking the loop if a revisit occurs.

  • File: internal/parser/claude.go (parseDAG)
    Issue: Nested forks are flattened to the original root session as their parent, rather than the immediate fork branch they diverged from. This loses the true DAG structure and can mislead /sessions/{id}/children consumers.
    Remediation: Carry branch-parent context while recursing so each fork session gets its actual parent fork/session ID. Add an assertion for nested fork parents in fork_test.go.

  • File: internal/parser/claude.go (parseDAG)
    Issue: Continuation sessions silently fail fork detection. They start with a message whose parentUuid references a message from an external session file, meaning e.parentUuid == "" evaluates to false. As a result, the roots slice remains empty, causing the parser to immediately fall back to parseLinear.
    Remediation: Collect all local UUIDs in a set first, then define roots as entries whose parentUuid is either empty or not found in the local set (e.g., if e.parentUuid == "" || !localUUIDs[e.parentUuid] { roots = append(roots, i) }).
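The root-detection remediation in the last item can be sketched as follows. This is an illustration of the reviewer's suggestion rather than the code in the PR; the `entry` type and `findRoots` are hypothetical names.

```go
package main

// entry is a hypothetical, trimmed-down view of one JSONL record.
type entry struct{ UUID, ParentUUID string }

// findRoots returns the indices of entries whose parent is either absent or
// lives outside this file (as with the first message of a continuation).
func findRoots(entries []entry) []int {
	local := make(map[string]bool, len(entries))
	for _, e := range entries {
		if e.UUID != "" {
			local[e.UUID] = true
		}
	}
	var roots []int
	for i, e := range entries {
		if e.ParentUUID == "" || !local[e.ParentUUID] {
			roots = append(roots, i)
		}
	}
	return roots
}
```

With this definition, a continuation file whose first entry points at a UUID from another session file still yields one root, so DAG parsing can proceed instead of dropping straight to the linear path.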


Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

@clkao
Contributor Author

clkao commented Feb 25, 2026

expandable subagent sessions in Task tool call.

[Two screenshots: Screenshot 2026-02-25 at 13 58 26, Screenshot 2026-02-25 at 13 06 19]

@roborev-ci

roborev-ci bot commented Feb 25, 2026

roborev: Combined Review (7a61083)

Summary Verdict: The PR introduces valuable DAG parsing and subagent session tracking, but requires immediate fixes to address severe graph traversal vulnerabilities (infinite loops/DoS), silent data loss, and non-atomic database writes.

High

1. Infinite Loop / Memory Exhaustion via DAG Cycles
File: internal/parser/claude.go:236, 289 (also affects countUserTurns approx. lines 318-338)
Description: parseDAG, walkBranch, and countUserTurns do not enforce UUID uniqueness or track visited nodes. If a malformed or malicious JSONL file contains cyclic parentUuid references or duplicate uuids, the for current >= 0 loops will run indefinitely. This leads to CPU exhaustion, infinite slice appending, and eventual Out of Memory (OOM) crashes, effectively breaking server availability during sync/upload.
Suggested Remediation: Implement cycle detection using a visited set (e.g., map[int]bool) to track nodes seen during graph traversal and break the loop if a cycle is detected. Alternatively, enforce a strict forward-only constraint (e.g., if kids[0] <= current { break }), as valid logs should only append children after their parents.

2. Disconnected DAG Components Cause Silent Message Loss
File: internal/parser/claude.go (parseDAG fallback logic)
Description: The new linear fallback logic only verifies that there is exactly one root and that every parentUuid exists in the uuidSet. It does not guarantee that all entries are actually reachable from the root. A disconnected graph component with internally valid parent references (such as an isolated cycle) will pass these checks but will never be traversed by walkBranch. As a result, those messages are silently dropped instead of triggering the linear fallback.
Suggested Remediation: After building the children adjacency map, run a DFS/BFS traversal starting from roots[0]. If the total visitedCount does not equal len(entries), fall back to parseLinear.
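One possible shape for the two High remediations above (a visited set plus a reachability count) is sketched below. It illustrates the reviewer's suggestion and is not code from this PR; `entry` and `fullyReachable` are hypothetical names, and rejecting duplicate UUIDs outright is an assumption made here for simplicity.

```go
package main

// entry is a hypothetical, trimmed-down view of one JSONL record.
type entry struct{ UUID, ParentUUID string }

// fullyReachable reports whether the entries form a single component in
// which every entry is reachable from exactly one root. A caller would
// fall back to linear parsing when this returns false.
func fullyReachable(entries []entry) bool {
	index := make(map[string]int, len(entries))
	for i, e := range entries {
		if _, dup := index[e.UUID]; dup {
			return false // duplicate uuid makes the graph ambiguous
		}
		index[e.UUID] = i
	}
	children := make(map[int][]int)
	var roots []int
	for i, e := range entries {
		p, ok := index[e.ParentUUID]
		if e.ParentUUID == "" || !ok {
			roots = append(roots, i)
			continue
		}
		children[p] = append(children[p], i)
	}
	if len(roots) != 1 {
		return false
	}
	// Iterative DFS; the visited slice also guards against cycles.
	visited := make([]bool, len(entries))
	stack := []int{roots[0]}
	count := 0
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if visited[cur] {
			continue
		}
		visited[cur] = true
		count++
		stack = append(stack, children[cur]...)
	}
	return count == len(entries)
}
```

Because the traversal uses an explicit stack, it also avoids recursion depth growing with the nesting of the graph.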

Medium

1. Stack Exhaustion via Uncapped Recursive Fork Processing

File: internal/parser/claude.go:312
Description: Fork processing utilizes recursion (forkPath := walkBranch(kid)) without any depth or complexity limits. A deeply nested fork structure within a session file can trigger excessive recursion, potentially causing stack exhaustion and crashing the process.
Suggested Remediation: Replace the recursive calls with an explicit stack/queue for traversal, or enforce a hard maximum limit on recursion depth/branch count, failing closed (or falling back to linear parsing) if the limit is exceeded.

2. Partial State Persistence on Upload Failure

File: internal/server/upload.go (handleUploadSession)
Description: Because fork parsing allows a single upload to produce multiple separate sessions, they are currently saved to the database one-by-one. If a later saveSessionToDB call fails, earlier sessions from the same upload remain permanently written while the API returns a 500 error. This leaves partial, inconsistent state for a single upload operation.
Suggested Remediation: Wrap all session saves for a single upload operation within a single database transaction, or implement compensating cleanup logic to delete already-written session IDs if a subsequent save fails.
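The compensating-cleanup variant of this remediation might look like the sketch below. Everything here is hypothetical: the `store` interface, `saveAll`, and the in-memory `memStore` stand in for the real database layer, which the review does not show. A real implementation would more likely use a single database transaction.

```go
package main

import "errors"

// session is a hypothetical, minimal session record.
type session struct{ ID string }

// store is a hypothetical persistence interface standing in for the DB layer.
type store interface {
	Save(s session) error
	Delete(id string) error
}

// saveAll writes every session from one upload, or none of them: if a save
// fails, sessions already written by this call are deleted again.
func saveAll(st store, sessions []session) error {
	var written []string
	for _, s := range sessions {
		if err := st.Save(s); err != nil {
			for _, id := range written {
				_ = st.Delete(id) // best-effort rollback
			}
			return err
		}
		written = append(written, s.ID)
	}
	return nil
}

// memStore is an in-memory store used only to exercise saveAll.
type memStore struct {
	rows   map[string]session
	failOn string // ID whose Save call should fail
}

func (m *memStore) Save(s session) error {
	if s.ID == m.failOn {
		return errors.New("simulated write failure")
	}
	m.rows[s.ID] = s
	return nil
}

func (m *memStore) Delete(id string) error {
	delete(m.rows, id)
	return nil
}
```

With `failOn` set to the ID of a later fork session, `saveAll` returns the error and leaves the store empty, so the upload does not leave a partial set of sessions behind.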


Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

clkao and others added 3 commits February 25, 2026 20:07
Build a uuid/parentUuid DAG from Claude Code JSONL entries to detect
conversation forks and link Task tool calls to their spawned subagent
sessions.

Fork detection splits large-gap branches (>3 user turns) into separate
session records with relationship_type="fork", while small-gap retries
are treated as the latest branch of the main session.

Subagent linking parses queue-operation enqueue entries (JSON and XML
formats) to map tool_use IDs to subagent session IDs, then annotates
Task tool calls with the linked session.

Adds relationship_type and subagent_session_id columns, child sessions
API endpoint, and inline subagent conversation expansion in the frontend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…timestamps

Restore ExcludeProject filter on SessionFilter and server handler --
the frontend still sends exclude_project for the hide-unknown toggle,
which is a separate concern from relationship_type filtering.

Fix nested fork parent linkage: pass each fork branch's computed
session ID as ownerID during recursive DAG traversal so nested forks
point to their immediate fork parent, not the root session.

Widen session startedAt/endedAt to include timestamps from
non-message events (queue-operation, etc.) by tracking global min/max
across all valid JSONL lines in the first pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add handler-level test asserting exclude_project query param filters
session results in the HTTP response.

Add StartedAt widening test with a leading non-message event, and a
DAG fixture test verifying main session bounds are widened while fork
session bounds are not.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci

roborev-ci bot commented Feb 26, 2026

roborev: Combined Review (2a7017c)

Review completed with High and Medium severity findings related to parser denial-of-service vulnerabilities and inconsistent session data normalization.

High Severity Findings

1. Infinite-loop DoS and silent data loss in DAG traversal

  • File: internal/parser/claude.go (Lines 242, 250-269, 295-314, 303, 402)

  • Description: The Claude session parser constructs a Directed Acyclic Graph (DAG) using uuid and parentUuid references but fails to enforce cycle detection or uniqueness. A maliciously crafted or malformed JSONL file reusing the same uuid can bridge disconnected nodes and create a cycle. Since walkBranch and countUserTurns lack cycle detection, encountering a cycle will cause infinite loops, leading to rapid memory exhaustion (OOM crashes) or 100% CPU pegging. Because the Sync Engine automatically processes .jsonl files, this is a reliable DoS vector. Furthermore, because traversal only validates reachability from a single root, disconnected valid components are silently dropped instead of triggering a fallback.

  • Remediation: Implement cycle detection in the traversal functions by tracking visited entry indices (e.g., using a boolean slice or map). If a cycle is detected or if the total visited count is less than the total number of entries, abort the DAG traversal and fall back to parseLinear to prevent DoS and data loss.

Medium Severity Findings

2. Missing relationship_type normalization in upload path

  • File: internal/server/upload.go (Lines 108, 1
  • Description: handleUploadSession writes parser output directly to the database, skipping the normalization step that is correctly applied in the sync flow (processClaude). For linear Claude parses, RelationshipType is typically empty even when ParentSessionID is set. As a result, manually uploaded subagent or continuation sessions are misclassified and will appear unexpectedly in main session listings.
  • Remediation: Before calling saveSessionToDB, apply the same normalization logic used in the sync flow (e.g., if parent != "" && rel == "" { rel = "subagent" /* or continuation */ }). Add upload-handler test coverage to ensure relationship_type normalization consistently matches the sync behavior.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

Suppress false positives from roborev about DAG cycle detection,
recursion depth limits, unreachable node checks, and cross-session
write atomicity. These are adversarial-input concerns that don't
apply to a local tool parsing trusted agent output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci

roborev-ci bot commented Feb 26, 2026

roborev: Combined Review (097f7cf)

Summary: The code changes are generally clean and secure, but one medium severity issue was identified regarding inconsistent session semantics on the upload path.

Medium

Relationship type is not inferred on upload path, causing inconsistent session semantics

  • File: [upload.go](/home/roborev/.roborev/clones/wesm/agentsview/internal/server/upload.go:177)
  • Description: The sync engine classifies parsed Claude sessions with a parent into subagent/continuation when relationship_type is empty, but the upload handler saves parsed results directly without that normalization.
  • Impact: Uploaded subagent/continuation sessions can be stored with empty relationship_type, meaning filtering and grouping behavior will differ between sync and upload flows.
  • Suggested Fix: Extract the relationship-inference logic from sync (processClaude) into a shared helper and apply it in the upload handler before saveSessionToDB.
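A shared helper of the kind suggested above could be sketched as follows. The review does not say how the sync flow distinguishes subagent from continuation sessions, so the `agent-` ID prefix used here is purely an assumption for illustration, as are the `session` type and the lowercase `inferRelationshipTypes` name.

```go
package main

import "strings"

// session is a hypothetical, minimal view of a parsed session.
type session struct {
	ID               string
	ParentSessionID  string
	RelationshipType string
}

// inferRelationshipTypes fills in an empty RelationshipType for sessions
// that have a parent. Sessions with no parent, or with a type already set
// (for example "fork"), are left untouched.
func inferRelationshipTypes(sessions []session) {
	for i := range sessions {
		s := &sessions[i]
		if s.ParentSessionID == "" || s.RelationshipType != "" {
			continue
		}
		// ASSUMPTION: subagent session IDs start with "agent-".
		if strings.HasPrefix(s.ID, "agent-") {
			s.RelationshipType = "subagent"
		} else {
			s.RelationshipType = "continuation"
		}
	}
}
```

Calling the same helper from both the sync engine and the upload handler, just before the sessions are saved, keeps relationship_type consistent across the two paths.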

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

Extract InferRelationshipTypes into parser package and call it from
both the sync engine and the upload handler. Previously uploaded
sessions with a parent could be stored with empty relationship_type,
causing them to appear in default session listings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci

roborev-ci bot commented Feb 26, 2026

roborev: Combined Review (07e44ff)

Summary: The PR introduces medium-severity issues related to UI data truncation and inconsistent database filtering semantics that need to be addressed.

Medium

Subagent inline view silently truncates long conversations at 1000 messages

  • File: /home/roborev/.roborev/clones/wesm/agentsview/frontend/src/lib/components/content/SubagentInline.svelte (in toggleExpand, getMessages(sessionId, { limit: 1000 }))
  • Description: The component does a single fetch with a hard cap and no pagination/“truncated” indicator, so large subagent sessions are partially rendered without warning.
  • Suggested Fix: Page through /messages until exhaustion (or add “Load more”), and show explicit truncation state if capped.

Visibility/filtering semantics are inconsistent for new subagent/fork sessions

  • Files: /home/roborev/.roborev/clones/wesm/agentsview/internal/db/sessions.go (new relationship_type NOT IN ('subagent', 'fork') in list/projects paths), /home/roborev/.roborev/clones/wesm/agentsview/internal/db/stats.go (still counts all sessions/projects/machines)
  • Description: Session list/project list now exclude subagent/fork sessions, but aggregate stats still include them, which can produce conflicting counts across UI/API surfaces.
  • Suggested Fix: Apply the same relationship-type predicate in stats/analytics queries, or explicitly expose separate “all vs primary” counters and label them clearly.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

@wesm
Owner

wesm commented Feb 26, 2026

"Both are valid observations but neither warrants blocking the merge.

Subagent inline 1000-message cap — This is a pragmatic limit, not a bug. Subagent sessions with
1000+ messages are extremely rare (subagents are typically short-lived tool calls). Adding
pagination to an inline expandable view adds real complexity for a scenario that essentially
doesn't occur in practice. If it ever becomes a problem, it's a feature request, not a defect.

Stats include subagent/fork sessions — This is intentional. The stats endpoint counts total
activity (all sessions, messages, cost). The session list filters to "primary" sessions for
browsing. These serve different purposes — the stats reflect actual resource usage while the list
shows navigable sessions. Forcing them to match would undercount activity. If anything, the right
fix later would be adding a breakdown (primary vs subagent vs fork) rather than excluding from
totals.

Neither is a regression introduced by this PR — the 1000-message limit predates these changes, and
the stats queries were never scoped by relationship type. Ship it."

@wesm wesm merged commit 0fedc67 into wesm:main Feb 26, 2026
6 checks passed


Development

Successfully merging this pull request may close these issues.

Session DAG: fork detection and subagent discovery

2 participants