Add opencode reader + CLI wrapper; track reasoning tokens#9

Merged
willwashburn merged 4 commits into main from v0.1-opencode
Apr 21, 2026

Conversation

@willwashburn
Member

Summary

  • New @relayburn/reader parser for opencode's tree-structured storage (~/.local/share/opencode/storage/{session,message,part}/). Emits one TurnRecord per assistant message; sidechain from session-level parentID; per-turn (not cumulative) usage.
  • burn opencode spawn wrapper (snapshot-diff — opencode has no pre-supplied session-id flag, unlike claude --session-id) + ingestOpencodeSessions() shared-walker integration.
  • Usage.reasoning: number added to the shared shape (claude parser backfills 0, opencode fills from tokens.reasoning); cost.ts folds reasoning into output billing. Stored separately so we can split later if providers diverge on how they bill it.
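A minimal sketch of the usage mapping described above. The `Usage.reasoning` field and the claude-backfills-0 / opencode-fills-from-`tokens.reasoning` behavior are from this PR; the exact field names of the opencode token payload (`cache.read` / `cache.write`) are assumptions modeled on the fixtures, not the real `packages/reader/src/types.ts` definitions.

```typescript
// Shared usage shape, extended with reasoning (hedged sketch, not the real types.ts).
interface Usage {
  input: number;
  output: number;
  reasoning: number;
  cacheRead: number;
  cacheCreate: number;
}

// Assumed shape of an opencode assistant message's tokens payload.
interface OpencodeTokens {
  input: number;
  output: number;
  reasoning: number;
  cache?: { read: number; write: number };
}

function opencodeUsage(t: OpencodeTokens): Usage {
  return {
    input: t.input,
    output: t.output,
    reasoning: t.reasoning, // filled from tokens.reasoning
    cacheRead: t.cache?.read ?? 0,
    cacheCreate: t.cache?.write ?? 0,
  };
}

function claudeUsage(input: number, output: number): Usage {
  // claude parser backfills 0 until reasoning tokens are surfaced there
  return { input, output, reasoning: 0, cacheRead: 0, cacheCreate: 0 };
}
```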

Known pricing gap

google/gemini-3-pro-high is absent from the vendored models.dev snapshot, so it shows $0.00 in summaries. Flagged rather than hand-patched.

Merge note

Branched off `main`, not `v0.1-codex`. When `v0.1-codex` lands, its `codex.ts::toUsage` needs a one-line `reasoning: 0` added to satisfy the new `Usage` shape.

Test plan

  • `npx tsc --build` clean
  • `node --test 'packages/*/dist/**/*.test.js'` — 30/30 pass (5 new opencode tests + all existing)
  • Smoke test: `RELAYBURN_HOME=$(mktemp -d) burn summary --since 365d` ingested 39,783 opencode sessions / 60,105 turns, priced across `anthropic/*`, `opencode/*`, and `google/*` model strings
  • Ingestion idempotent (second run: 0 new sessions)

🤖 Generated with Claude Code

Add support for ingesting Opencode sessions and account for per-turn reasoning tokens in usage/cost calculations. Introduces a new reader (packages/reader/src/opencode.ts) with tests and fixtures, a CLI wrapper command (commands/opencode.ts), and ingest logic (ingestOpencodeSessions + ingestAll) to discover and append Opencode sessions. Propagate the new Usage.reasoning field across types, readers, CLI summary/by-tool aggregation, and cost calculation (reasoning now counts toward output cost). Update related tests and exports to wire the feature into existing tooling.
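The `ingestAll` composition described above can be sketched as a fold over per-reader ingestors. This is a hedged sketch: the `IngestResult` shape and the synchronous signature are assumptions, not the actual `packages/cli/src/ingest.ts` API.

```typescript
// Hypothetical result shape returned by each reader's ingestor.
interface IngestResult {
  sessions: number;
  turns: number;
}

// Run every registered ingestor (claude, opencode, ...) and fold the counts.
function ingestAll(ingestors: Array<() => IngestResult>): IngestResult {
  let total: IngestResult = { sessions: 0, turns: 0 };
  for (const ingest of ingestors) {
    const r = ingest();
    total = {
      sessions: total.sessions + r.sessions,
      turns: total.turns + r.turns,
    };
  }
  return total;
}
```

With this shape, `summary` and `by-tool` can call one entrypoint instead of wiring each reader separately, which matches the commit's description of switching both commands to `ingestAll()`.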

Copilot AI left a comment


Pull request overview

Adds first-class support for ingesting and parsing opencode’s on-disk storage into Relayburn’s shared TurnRecord format, and extends usage accounting to track “reasoning” tokens for pricing.

Changes:

  • Introduce parseOpencodeSession() (plus fixtures/tests) to read opencode storage/{session,message,part} trees and emit per-assistant-turn records.
  • Extend the shared Usage shape with reasoning tokens and propagate through readers/tests/CLI aggregation.
  • Add CLI ingestion for opencode sessions and a burn opencode spawn wrapper to ingest sessions created during a run.

Reviewed changes

Copilot reviewed 41 out of 41 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `tests/fixtures/opencode/with-tool/storage/session/global/ses_tool.json` | New opencode session fixture (tool-call turn). |
| `tests/fixtures/opencode/with-tool/storage/part/msg_tool_asst/prt_tool_5.json` | New opencode part fixture (step-finish + tokens). |
| `tests/fixtures/opencode/with-tool/storage/part/msg_tool_asst/prt_tool_4_bash.json` | New opencode part fixture (bash tool call). |
| `tests/fixtures/opencode/with-tool/storage/part/msg_tool_asst/prt_tool_3_edit.json` | New opencode part fixture (edit tool call). |
| `tests/fixtures/opencode/with-tool/storage/part/msg_tool_asst/prt_tool_2_read.json` | New opencode part fixture (read tool call). |
| `tests/fixtures/opencode/with-tool/storage/part/msg_tool_asst/prt_tool_1.json` | New opencode part fixture (step-start). |
| `tests/fixtures/opencode/with-tool/storage/message/ses_tool/msg_tool_user.json` | New opencode message fixture (user). |
| `tests/fixtures/opencode/with-tool/storage/message/ses_tool/msg_tool_asst.json` | New opencode message fixture (assistant + tokens). |
| `tests/fixtures/opencode/simple/storage/session/global/ses_simple.json` | New opencode session fixture (simple turn). |
| `tests/fixtures/opencode/simple/storage/part/msg_simple_asst/prt_simple_3.json` | New opencode part fixture (step-finish + tokens). |
| `tests/fixtures/opencode/simple/storage/part/msg_simple_asst/prt_simple_2.json` | New opencode part fixture (text). |
| `tests/fixtures/opencode/simple/storage/part/msg_simple_asst/prt_simple_1.json` | New opencode part fixture (step-start). |
| `tests/fixtures/opencode/simple/storage/message/ses_simple/msg_simple_user.json` | New opencode message fixture (user). |
| `tests/fixtures/opencode/simple/storage/message/ses_simple/msg_simple_asst.json` | New opencode message fixture (assistant + tokens). |
| `tests/fixtures/opencode/multi-turn/storage/session/global/ses_multi.json` | New opencode session fixture (multi-turn). |
| `tests/fixtures/opencode/multi-turn/storage/session/global/ses_child.json` | New opencode session fixture (child sidechain). |
| `tests/fixtures/opencode/multi-turn/storage/part/msg_multi_a2/prt_a2_2.json` | New opencode part fixture (step-finish + reasoning tokens). |
| `tests/fixtures/opencode/multi-turn/storage/part/msg_multi_a2/prt_a2_1.json` | New opencode part fixture (bash tool call). |
| `tests/fixtures/opencode/multi-turn/storage/part/msg_multi_a1/prt_a1_1.json` | New opencode part fixture (step-finish). |
| `tests/fixtures/opencode/multi-turn/storage/part/msg_child_asst/prt_child_1.json` | New opencode part fixture (child step-finish). |
| `tests/fixtures/opencode/multi-turn/storage/message/ses_multi/msg_multi_u2.json` | New opencode message fixture (user). |
| `tests/fixtures/opencode/multi-turn/storage/message/ses_multi/msg_multi_u1.json` | New opencode message fixture (user). |
| `tests/fixtures/opencode/multi-turn/storage/message/ses_multi/msg_multi_a2.json` | New opencode message fixture (assistant + reasoning tokens). |
| `tests/fixtures/opencode/multi-turn/storage/message/ses_multi/msg_multi_a1.json` | New opencode message fixture (assistant). |
| `tests/fixtures/opencode/multi-turn/storage/message/ses_child/msg_child_user.json` | New opencode message fixture (child user). |
| `tests/fixtures/opencode/multi-turn/storage/message/ses_child/msg_child_asst.json` | New opencode message fixture (child assistant). |
| `packages/reader/src/types.ts` | Add Usage.reasoning to the shared type surface. |
| `packages/reader/src/opencode.ts` | New opencode storage parser producing TurnRecord[]. |
| `packages/reader/src/opencode.test.ts` | Tests for opencode parsing, tools/filesTouched, sidechain, usage. |
| `packages/reader/src/index.ts` | Export opencode parser + options from reader package. |
| `packages/reader/src/claude.ts` | Backfill reasoning: 0 for Claude usage mapping. |
| `packages/reader/src/claude.test.ts` | Update Claude tests for the new Usage field. |
| `packages/ledger/src/ledger.test.ts` | Update ledger tests for the new Usage field. |
| `packages/cli/src/ingest.ts` | Add ingestOpencodeSessions() and ingestAll() integration. |
| `packages/cli/src/index.ts` | Export new ingestion functions + opencode wrapper entrypoint. |
| `packages/cli/src/commands/summary.ts` | Switch summary ingestion to ingestAll() and aggregate reasoning. |
| `packages/cli/src/commands/opencode.ts` | New burn opencode wrapper that ingests newly created sessions. |
| `packages/cli/src/commands/by-tool.ts` | Switch by-tool ingestion to ingestAll(). |
| `packages/cli/src/cli.ts` | Add burn opencode command wiring + help text. |
| `packages/analyze/src/cost.ts` | Include usage.reasoning in billed output token cost calculation. |
| `packages/analyze/src/cost.test.ts` | Update cost tests for the new Usage field. |


Comment threads:

  • packages/cli/src/commands/opencode.ts (outdated)
  • packages/reader/src/opencode.ts
  • packages/reader/src/opencode.ts (outdated)
  • packages/reader/src/opencode.ts (outdated)
  • packages/cli/src/commands/summary.ts
  • packages/analyze/src/cost.ts
@willwashburn
Member Author

Review question before merge — potential undercounting on multi-step assistant messages.

While surveying adjacent projects I hit opencode-tokenscope (ramtinJ95/opencode-tokenscope, an OpenCode-plugin token analyzer). Their README explicitly claims:

"Per-Call Step Telemetry: Reads stored step-finish records so multi-step assistant turns and tool loops count every API call, not just the final step saved on the assistant message"

Their telemetry.ts:89-106 implements this by iterating step-finish parts when present and only falling back to message-level tokens when absent.

This PR's parseOpencodeSession at packages/reader/src/opencode.ts:74 reads from m.tokens (message-level). It correctly surfaces step-finish parts for stopReason via lastStepFinishReason, but doesn't use step-finish usage.

If tokenscope is right that OpenCode's message-level tokens is only the final step's usage (not an aggregate across the tool-loop's API calls), this reader will systematically undercount multi-step messages — each tool_use loop collapses to one turn with only the last step's tokens billed.

Two signals that suggest this might be worth verifying:

  1. The smoke-test numbers: 60,105 turns / 39,783 sessions ≈ 1.5 turns/session is plausible but low for multi-tool-loop work; the expected distribution would look different under either interpretation.
  2. Tokenscope's explicit wording "not just the final step saved on the assistant message" reads as lived experience of the bug, not speculation.

Suggested fix, keeping one TurnRecord per assistant message (matching Claude reader):

```ts
// Sum usage across every step-finish part that carries tokens;
// fall back to message-level tokens only when none do.
function toUsage(m: AssistantMessage, parts: Part[]): Usage {
  const withTokens = parts.filter(
    (p): p is StepFinishPart => p.type === 'step-finish' && p.tokens !== undefined
  );
  if (withTokens.length > 0) {
    return withTokens.reduce(
      (acc, p) => sumUsage(acc, toUsageFromTokens(p.tokens)),
      emptyUsage()
    );
  }
  return toUsageFromTokens(m.tokens);
}
```

Not a blocker — the reader is correct for single-step messages, which is probably most of them. But worth confirming against one of the multi-step fixtures (tests/fixtures/opencode/multi-turn/...): does a message with two step-finish parts have m.tokens equal to the sum of both, or equal to the last one?

Easy verification: grep a fixture where a multi-step message exists and compare sum(parts[*].tokens) against message.tokens. Happy to file a follow-up issue if this turns out to be real.

Reference: /Users/will/Projects/opencode-tokenscope/plugin/tokenscope-lib/telemetry.ts:85-109.
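The verification proposed above (sum the step-finish parts, compare against the message-level total) can be sketched as a pure function over parsed records. The `Tokens`/`Part` shapes here are assumptions modeled on the fixtures, not opencode's actual schema:

```typescript
// Assumed shapes for an opencode part and its token payload.
interface Tokens {
  input: number;
  output: number;
  reasoning: number;
}
interface Part {
  type: string;
  tokens?: Tokens;
}

// Sum tokens across every step-finish part that carries a tokens payload.
function sumStepFinishTokens(parts: Part[]): Tokens {
  return parts
    .filter((p) => p.type === 'step-finish' && p.tokens !== undefined)
    .reduce(
      (acc, p) => ({
        input: acc.input + p.tokens!.input,
        output: acc.output + p.tokens!.output,
        reasoning: acc.reasoning + p.tokens!.reasoning,
      }),
      { input: 0, output: 0, reasoning: 0 }
    );
}

// True when the step-finish sum equals the message-level tokens, i.e.
// message-level tokens are the aggregate and the reader is not undercounting.
function matchesMessageTokens(parts: Part[], messageTokens: Tokens): boolean {
  const s = sumStepFinishTokens(parts);
  return (
    s.input === messageTokens.input &&
    s.output === messageTokens.output &&
    s.reasoning === messageTokens.reasoning
  );
}
```

Running `matchesMessageTokens` over a fixture's parts and its assistant message's `tokens` answers the question directly: a match means `m.tokens` aggregates across steps; a mismatch means only the final step was saved.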

Add support for reasoning token counts throughout the Codex session parser. This introduces a reasoning field on CumulativeUsage and includes it in the cumulative initialization, parsing (from total.reasoning_output_tokens), and per-turn calculation in finalizeTurn. Tests updated to assert reasoning usage for affected turns.
@willwashburn
Member Author

Softening my earlier step-finish concern after checking tokscale's OpenCode parser.

Tokscale (2,057 stars, the most mature multi-collector tool in the ecosystem) also reads message-level tokens, not step-finish parts (crates/tokscale-core/src/sessions/opencode.rs:116-231). If message-level tokens were systematically undercounting multi-step sessions, someone in tokscale's user base would almost certainly have noticed and filed an issue.

Most likely explanation: OpenCode's message-level tokens field IS the aggregate across all step-finish parts in that message, and tokenscope's wording ("not just the final step saved on the assistant message") referred either to a historical OpenCode bug since fixed, or to a different aggregation layer we're not reading.

Revised position: This PR's message-level approach is consistent with the most mature implementation in the ecosystem, not an outlier. The verification is still cheap — compare sum(step_finish.tokens) against message.tokens on one of the multi-turn fixtures — but I'd treat it as a due-diligence check, not a blocker. If they match, ship.

If they don't match (unlikely but worth knowing), the fix is still what I proposed: sum step-finish parts when present, fall back to message-level when absent. Low-risk addition either way.

…split reasoning cost

- Extract walkOpencodeSessions to walk.ts; ingest.ts and commands/opencode.ts
  share the same helper so the filename filter/recursion can't drift.
- Tighten opencode assistant filter: skip messages missing sessionID or
  a numeric time.created so a malformed record can't throw at sort time
  or emit an invalid TurnRecord.
- Fall back to session.directory when an assistant message has no path.cwd.
- Fix a stale comment about sort-by-filename (we sort by part.id).
- Split CostBreakdown into input / output / reasoning / cacheRead /
  cacheCreate so "output" is pure output-token cost and reasoning is
  visible on its own; summary table now shows a reasoning column.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
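The CostBreakdown split described in the commit above can be sketched as follows. This is a hedged sketch: the `Usage`/`Rates` shapes, the per-1M-token pricing math, and reasoning-billed-at-the-output-rate are assumptions consistent with the PR description, not the actual `packages/analyze/src/cost.ts` implementation.

```typescript
interface Usage {
  input: number;
  output: number;
  reasoning: number;
  cacheRead: number;
  cacheCreate: number;
}

// Hypothetical per-1M-token rates; reasoning has no rate of its own and
// is billed at the output rate, per the cost.ts change in this PR.
interface Rates {
  input: number;
  output: number;
  cacheRead: number;
  cacheCreate: number;
}

interface CostBreakdown {
  input: number;
  output: number;      // pure output-token cost
  reasoning: number;   // reasoning cost, now visible on its own
  cacheRead: number;
  cacheCreate: number;
}

function breakdown(u: Usage, r: Rates): CostBreakdown {
  const per = (tokens: number, rate: number) => (tokens / 1_000_000) * rate;
  return {
    input: per(u.input, r.input),
    output: per(u.output, r.output),
    reasoning: per(u.reasoning, r.output), // billed at the output rate
    cacheRead: per(u.cacheRead, r.cacheRead),
    cacheCreate: per(u.cacheCreate, r.cacheCreate),
  };
}
```

Keeping reasoning as its own field means the summary table's "output" column stays pure output-token cost, while totals still include reasoning.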
@willwashburn
Member Author

Checked this empirically against my local opencode storage (1,000+ sessions, 54,933 assistant messages):

  • 15,866 assistant messages had exactly one step-finish part.
  • 0 had two or more step-finish parts.
  • On a sample of 1,372 messages with a step-finish, m.tokens equaled step.tokens in every single case (0 disagreements).

So the tokenscope concern — "multi-step assistant turns collapse into one message with only the final step's tokens" — doesn't apply to opencode's current storage layout. Each step creates its own assistant message, which this reader already emits as its own TurnRecord. m.tokens is the complete accounting for that step/message.

The tokenscope wording is probably defending against a state they observed (or feared) on an older schema; current opencode is a 1:1 mapping and reading m.tokens is correct.

Leaving the reader as-is. If opencode's storage layout ever changes to bundle multiple steps into one message, the existing lastStepFinishReason scan is a natural place to swap in a sum(step-finish.tokens) fallback.

@willwashburn willwashburn merged commit c138fb2 into main Apr 21, 2026