Skip to content

Fix burn hotspots silent hang on multi-session ledgers#214

Merged
willwashburn merged 2 commits intomainfrom
fix/hotspots-bulk-user-turns
Apr 30, 2026
Merged

Fix burn hotspots silent hang on multi-session ledgers#214
willwashburn merged 2 commits intomainfrom
fix/hotspots-bulk-user-turns

Conversation

@willwashburn
Copy link
Copy Markdown
Member

Summary

  • burn hotspots --since=7d (and equivalents) was hanging silently for minutes after ✔ read N turns. Root cause: runHotspotsAttribution issued queryUserTurns({sessionId}) once per eligible session, and the file adapter implements that as streamLines(ledger.jsonl) + in-memory filter — i.e., one full ledger scan per session. On a 7-day window with 169 sessions and a 190MB ledger, that's hundreds of full-file scans. There was also no withProgress around the loop, so the spinner disappeared.
  • Fixed by replacing the per-session calls with a single queryUserTurns() pass plus an in-memory bucket-by-sessionId (new bulkUserTurnsBySession helper). Same shape pattern that loadToolResultEventsForTurns already uses, with the same comment explaining why.
  • Added a withProgress task around the content-sidecar loop so it reports (i/N) progress instead of going silent.
  • The --patterns user-turn loader had the same per-session pathology and now shares the same bulk helper.
  • The HotspotsAttributionDeps test contract changed from per-session callbacks to bulk Set<string> → Map<string, T[]> loaders — the natural shape for the refactor. Updated EMPTY_DEPS and added a coverage test.

End-to-end on the reporter's live ledger: 6,614 turns / 169 sessions / 190MB ledger now completes in ~8.6s instead of hanging.

Test plan

  • pnpm run test:ts — all 896 tests pass
  • burn hotspots --since=7d against a real 190MB / 169-session ledger completes in ~8.6s with no silent gap after read N turns
  • New test asserts loadUserTurnsBySession and loadContentBySession are each invoked exactly once with the eligible session-id set (regression guard for the per-session loop)

🤖 Generated with Claude Code

The attribution path issued queryUserTurns({sessionId}) once per
eligible session, and each call streams the entire ledger.jsonl from
disk. On a 7-day / 190MB / 169-session ledger this stalled for minutes
after "read N turns" with no spinner.

Replace with a single ledger pass + in-memory bucket via a new
bulkUserTurnsBySession helper, and add a withProgress task so the
content-sidecar loop also reports progress. The HotspotsAttributionDeps
contract changes from per-session callbacks to bulk Set->Map loaders,
which is the shape the refactor naturally wants.

The --patterns user-turn loader had the same per-session pathology and
now shares the same bulk helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d31e9c02f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/cli/src/commands/hotspots.ts Outdated
): Promise<Map<string, UserTurnRecord[]>> {
const out = new Map<string, UserTurnRecord[]>();
if (sessionIds.size === 0) return out;
const all = await queryUserTurns();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid materializing all user turns before session filtering

bulkUserTurnsBySession now calls queryUserTurns() with no query filter and only applies the sessionIds filter afterward, which means both burn hotspots attribution and --patterns first load every user-turn record in the ledger into memory. Because queryUserTurns() eagerly collects to an array, a large historical ledger (especially when the command is scoped to a small recent window) can see major memory growth or heap OOMs that were not possible with the previous per-session filtered reads.

Useful? React with 👍 / 👎.

Reviewer feedback on #214: bulkUserTurnsBySession was calling
queryUserTurns() with no filter and applying the sessionIds set only
after collection, so a small recent window (e.g., --since=7d) on a
year-old ledger would buffer every historical user-turn record into
memory before bucketing — turning the previous bounded per-session
peak into an unbounded one.

Forward q.since and q.source into queryUserTurns(q) so the streaming
filter narrows during the pass. q.until is deliberately not forwarded:
the user turn that carries tool_results for the last in-window
assistant turn can lag a few seconds past a hard until cutoff. q.project
is moot (userTurnPasses ignores it; sessionIds upstream are already
project-scoped) and q.sessionId would defeat the bulk call.

Tightens the no-deps fallback in runHotspotsAttribution to per-session
form so peak memory stays bounded even for callers that don't supply
q-bound deps.

Wires q through the --patterns userTurnsForPatternDetectors helper too.

Adds an integration test that seeds a temp ledger with old + new
user turns across multiple sessions and asserts the since cutoff is
honored end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@willwashburn
Copy link
Copy Markdown
Member Author

Good catch — valid concern, fixed in 2dc000f.

The risk: bulkUserTurnsBySession was calling queryUserTurns() with no filter and applying the sessionIds set only after collection, which on a small recent window against a long historical ledger would buffer every historical user-turn record into memory before bucketing. That turns the previous bounded per-session peak into an unbounded one.

The fix: forward q.since and q.source into queryUserTurns(q) so the streaming filter narrows during the pass.

What's not forwarded and why:

  • q.until — the user turn that carries tool_results for the last in-window assistant turn can lag a few seconds past a hard until cutoff (userTurn.ts >= turn.ts), so filtering by until could lose attribution at the right edge.
  • q.projectuserTurnPasses ignores it by design (per the comment in file-adapter.ts), and the sessionIds set upstream is already project-scoped via queryAll(q).
  • q.sessionId — would defeat the bulk call.

Also tightened the no-deps fallback in runHotspotsAttribution to the original per-session form, so any future caller that skips deps doesn't trip the same memory hazard.

Added hotspots-bulk-load.test.ts with an integration test that seeds a temp ledger with old + new user turns across multiple sessions and asserts since is honored end-to-end.

@willwashburn willwashburn merged commit 927bf0b into main Apr 30, 2026
1 check passed
@willwashburn willwashburn deleted the fix/hotspots-bulk-user-turns branch April 30, 2026 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant