reader: assign turn membership by parentUuid chain walk, not timestamp window #433

@willwashburn

Context

Claude Code's ~/.claude/projects/<slug>/<sessionId>.jsonl records carry a parentUuid field linking each row to its causal predecessor. Burn's reader currently groups rows into turns using time-window heuristics and row ordering (see crates/relayburn-sdk/src/reader/classifier.rs). That works most of the time but breaks under:

  • Mid-stream interruptions (user cancels, resumes later — wall-clock gap between assistant rows in the same logical turn).
  • Out-of-order JSONL flushes — Claude Code's writer is async; rows can land slightly out of timestamp order under load.
  • Compaction artifacts — synthetic rows inserted at compaction time with timestamps that don't sit cleanly inside any turn window.
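To make the interruption case concrete, here is a toy gap-based grouper. It is a stand-in for illustration only, not Burn's classifier; `TimedRow`, `split_by_gap`, and the gap threshold are hypothetical names and values.

```rust
/// Hypothetical minimal row: a label plus a timestamp in milliseconds.
pub struct TimedRow {
    pub label: &'static str,
    pub ts_ms: u64,
}

/// Toy heuristic: start a new group whenever the gap to the previous row
/// exceeds `max_gap_ms`. Illustration only, not Burn's actual logic.
pub fn split_by_gap(rows: &[TimedRow], max_gap_ms: u64) -> Vec<Vec<&'static str>> {
    let mut groups: Vec<Vec<&'static str>> = Vec::new();
    let mut prev_ts: Option<u64> = None;
    for row in rows {
        let starts_new = match prev_ts {
            None => true,
            // saturating_sub keeps out-of-order timestamps from underflowing.
            Some(prev) => row.ts_ms.saturating_sub(prev) > max_gap_ms,
        };
        if starts_new {
            groups.push(Vec::new());
        }
        groups
            .last_mut()
            .expect("a group exists once the first row has been seen")
            .push(row.label);
        prev_ts = Some(row.ts_ms);
    }
    groups
}
```

With a prompt at 0 ms, an assistant row at 1,000 ms, and a resumed assistant row at 600,000 ms, a 60,000 ms threshold splits one logical turn into two groups. No fixed threshold avoids this without also merging unrelated turns that happen to sit close together.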

Prior art: agent-profiler sidesteps all of this by walking the UUID chain. Source: lib/claude-code/traces.js, findTurnRoot + sliceTurns. The rule:

Group records by walking backward up parentUuid to the nearest user-prompt ancestor — explicitly not timestamp-based. Handles re-ordered/torn JSONL.

Proposal

Replace the current turn-grouping heuristic with a UUID-chain walk:

  1. Build a HashMap<Uuid, &Record> of all rows in the file.
  2. For each row, walk parentUuid upward until you hit a user-prompt row (or None).
  3. That user-prompt UUID is the row's turn key.
  4. Fall back to timestamp grouping only for rows missing both uuid and parentUuid (legacy/malformed).
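The four steps above could look roughly like the sketch below. It makes several assumptions: `Record` here is a simplified, hypothetical shape (Burn's real type will differ), UUIDs are kept as strings to stay std-only, and user-prompt rows are assumed to be identifiable by a flag. Rows whose chain never reaches a user prompt are returned separately so the caller can apply the timestamp fallback.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified row shape; Burn's real `Record` will differ.
#[derive(Debug)]
pub struct Record {
    pub uuid: Option<String>,
    pub parent_uuid: Option<String>,
    /// Assumption: the reader can already tell user-prompt rows apart.
    pub is_user_prompt: bool,
}

/// Rows keyed by the UUID of their nearest user-prompt ancestor, plus the
/// rows the chain walk could not resolve (candidates for the fallback path).
pub struct Grouped<'a> {
    pub turns: HashMap<&'a str, Vec<&'a Record>>,
    pub unresolved: Vec<&'a Record>,
}

pub fn group_by_parent_chain(rows: &[Record]) -> Grouped<'_> {
    // Step 1: index every row that carries a uuid.
    let by_uuid: HashMap<&str, &Record> = rows
        .iter()
        .filter_map(|r| r.uuid.as_deref().map(|u| (u, r)))
        .collect();

    let mut turns: HashMap<&str, Vec<&Record>> = HashMap::new();
    let mut unresolved = Vec::new();

    // Steps 2-4: resolve each row to its turn key, or set it aside.
    for row in rows {
        match find_turn_root(row, &by_uuid, rows.len()) {
            Some(root) => turns.entry(root).or_default().push(row),
            None => unresolved.push(row),
        }
    }
    Grouped { turns, unresolved }
}

/// Walks `parent_uuid` upward until a user-prompt row is found. A
/// user-prompt row is its own root. The hop cap is a crude cycle guard:
/// a valid chain can never be longer than the number of rows.
fn find_turn_root<'a>(
    row: &'a Record,
    by_uuid: &HashMap<&'a str, &'a Record>,
    max_hops: usize,
) -> Option<&'a str> {
    let mut current = row;
    for _ in 0..=max_hops {
        if current.is_user_prompt {
            return current.uuid.as_deref();
        }
        let parent = current.parent_uuid.as_deref()?;
        current = by_uuid.get(parent).copied()?;
    }
    None
}
```

Because membership depends only on the parent links, the order of rows in the file does not affect which turn a row lands in. A row whose parent is missing from the file ends up in `unresolved`; whether that is the right treatment for torn files is a design choice this sketch does not settle.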

Implementation sketch

  • crates/relayburn-sdk/src/reader/claude.rs (or wherever Claude turn grouping lives): introduce fn group_by_parent_chain(rows: &[Record]) -> HashMap<Uuid, Vec<&Record>>.
  • Keep the existing time-window logic behind a feature flag or as the explicit fallback path for non-Claude harnesses.
  • Add a fixture under crates/relayburn-sdk/tests/fixtures/ with deliberately out-of-order rows and an interruption-resume pattern; assert turn membership matches the UUID chain, not the timestamp window.

Open questions

  1. Cycle detection. Trust Claude's writer? Or guard against accidental cycles with a visited set? Cheap; do it.
  2. Multiple roots per file. Long-running sessions with manual /resume may produce multiple disjoint user-prompt roots; that's fine, they become separate turns.
  3. Codex equivalent. Codex rollouts don't have the same parentUuid field. Keep this Claude-specific for now; document the asymmetry.
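  4. Performance. O(N) to build the parent map, plus O(depth) per row for the walk, so O(N · depth) in the worst case; caching each row's resolved root keeps the total at O(N). For a 50k-row session this should be negligible next to parsing the JSONL itself. No concern.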
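For questions 1 and 4, one possible shape is a memoized walk with a three-state marker per row, which detects cycles and visits each row a bounded number of times. The sketch assumes UUIDs have already been mapped to indices into the row slice; `Node` and `resolve_roots` are hypothetical names, not existing Burn APIs.

```rust
/// Per-row resolution state. `InProgress` marks rows on the current path,
/// so meeting one again means the parent links form a cycle.
#[derive(Clone, Copy)]
enum State {
    Unvisited,
    InProgress,
    Done(Option<usize>),
}

/// Hypothetical compact row: parent as an index, plus the user-prompt flag.
pub struct Node {
    pub parent: Option<usize>,
    pub is_user_prompt: bool,
}

/// Returns, for each row, the index of its nearest user-prompt ancestor,
/// or `None` if the chain ends without one or runs into a cycle.
pub fn resolve_roots(nodes: &[Node]) -> Vec<Option<usize>> {
    let mut state = vec![State::Unvisited; nodes.len()];
    let mut path: Vec<usize> = Vec::new();

    for start in 0..nodes.len() {
        path.clear();
        let mut cur = Some(start);
        let root = loop {
            // Ran off the top of the chain, or the parent index is dangling.
            let Some(i) = cur.filter(|&i| i < nodes.len()) else {
                break None;
            };
            match state[i] {
                State::Done(r) => break r,       // reuse a cached answer
                State::InProgress => break None, // cycle detected
                State::Unvisited => {}
            }
            if nodes[i].is_user_prompt {
                state[i] = State::Done(Some(i));
                break Some(i);
            }
            state[i] = State::InProgress;
            path.push(i);
            cur = nodes[i].parent;
        };
        // Cache the answer for every row walked on this pass.
        for &i in &path {
            state[i] = State::Done(root);
        }
    }

    state
        .into_iter()
        .map(|s| match s {
            State::Done(r) => r,
            _ => None,
        })
        .collect()
}
```

Each row moves from `Unvisited` to `Done` exactly once, so the total work is linear in the number of rows regardless of chain depth. Rows caught in a cycle resolve to `None` and can be routed to the same fallback as rows with no chain at all.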

Acceptance

  • Fixture with interleaved/out-of-order rows produces correct turn groupings via chain walk.
  • Fixture with a user interruption + resume groups all resumed rows into the same turn as the original prompt.
  • Existing classifier tests still pass.
  • No regression in burn summary / burn hotspots golden tests on the cli-golden fixture.
  • Codex reader explicitly documented as falling back to the existing strategy.

References

  • agent-profiler: lib/claude-code/traces.js (findTurnRoot, sliceTurns).
  • burn: crates/relayburn-sdk/src/reader/classifier.rs, crates/relayburn-sdk/src/reader/types.rs.
  • Related: span-tree foundation (consumes the corrected grouping).

Labels: enhancement (New feature or request)
