
feat: per-generation token telemetry#336

Merged
anandgupta42 merged 1 commit into main from worktree-tokens-telemetry on Mar 20, 2026

Conversation

@suryaiyer95 (Contributor) commented Mar 20, 2026

Summary

  • Fire generation telemetry events on every LLM step in processor.ts — capturing input, output, reasoning, cache_read, cache_write, cost, and duration_ms per step to Azure App Insights. The generation event type existed in the type union but was never emitted before this PR — zero per-generation data was reaching telemetry.
  • Fix output token accumulation across multi-step messages: previously assistantMessage.tokens was overwritten each step, losing all prior output tokens. Now correctly accumulates output and reasoning while keeping last-step input/cache values.
  • Fix context window used in acp/agent.ts to include cache.write tokens. Due to applyCaching(), the user's question is tagged cache_control:ephemeral and reported as cache_creation_input_tokens — making it invisible to the prior used = input + cache.read formula. This is also an upstream opencode bug.
  • An 11-test suite in test/altimate/token-telemetry.test.ts covers Anthropic prompt-caching semantics, the generation event payload, multi-step accumulation, and the context-window fix.
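The context-window fix in the third bullet amounts to a one-line formula change. A minimal sketch, assuming an illustrative TokenUsage shape (the real types live in the opencode codebase):

```typescript
// Illustrative per-step usage shape; field names are assumptions, not the real API.
type TokenUsage = {
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

// Before the fix: tokens tagged cache_control:ephemeral are reported by Anthropic
// as cache_creation_input_tokens, so they vanish from the "used" sum.
function usedContextOld(t: TokenUsage): number {
  return t.input + t.cache.read
}

// After the fix: cache writes still occupy the context window, so count them.
function usedContextFixed(t: TokenUsage): number {
  return t.input + t.cache.read + t.cache.write
}

// Step 1 from the verification table below: input=2, cache_write=34,600.
const step1: TokenUsage = { input: 2, output: 6, reasoning: 0, cache: { read: 0, write: 34600 } }
console.log(usedContextOld(step1)) // 2 (the cached prompt is invisible)
console.log(usedContextFixed(step1)) // 34602
```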

Verified end-to-end

Ran 2 live sessions with the built binary and queried App Insights — generation events confirmed landing with correct 5-bucket token breakdown:

| step | finish_reason | input | output | cache_read | cache_write |
| --- | --- | --- | --- | --- | --- |
| 1 | stop | 2 | 6 | 0 | 34,600 |
| 1 | tool-calls | 2 | 84 | 0 | 34,601 |
| 2 | stop | 1 | 471 | 34,601 | 273 |

Test plan

  • bun test packages/opencode/test/altimate/token-telemetry.test.ts passes
  • Run a session and verify generation events appear in App Insights with non-zero tokens_cache_write
  • Multi-turn session: confirm tokens_cache_read on step 2 matches tokens_cache_write from step 1
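The second and third test-plan items exercise the multi-step accumulation fix described above. A hedged sketch of that accumulation rule, with the Tokens shape assumed for illustration:

```typescript
// Illustrative sketch of the accumulation fix: output/reasoning tokens sum
// across steps while input and cache buckets keep the last step's values.
type Tokens = {
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

function accumulate(prev: Tokens | undefined, step: Tokens): Tokens {
  return {
    ...step, // last step wins for input and cache buckets
    output: (prev?.output ?? 0) + step.output,
    reasoning: (prev?.reasoning ?? 0) + step.reasoning,
  }
}

// Steps 1 (tool-calls) and 2 from the verified session above.
const s1: Tokens = { input: 2, output: 84, reasoning: 0, cache: { read: 0, write: 34601 } }
const s2: Tokens = { input: 1, output: 471, reasoning: 0, cache: { read: 34601, write: 273 } }
const total = accumulate(accumulate(undefined, s1), s2)
console.log(total.output) // 555, no longer overwritten by the last step
console.log(total.cache.read) // 34601, step 2's read matches step 1's write
```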

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Corrected token usage accounting to include cache write tokens in usage calculations.
  • New Features

    • Added per-step generation telemetry tracking with detailed token breakdown, including cache metrics and step duration.
  • Tests

    • Added comprehensive test coverage for token accounting, prompt caching behavior, tiered pricing, and generation event telemetry.

Copilot AI review requested due to automatic review settings March 20, 2026 20:01

@claude claude bot left a comment


Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review.

@coderabbitai

coderabbitai bot commented Mar 20, 2026

📝 Walkthrough

Walkthrough

The pull request updates token accounting and telemetry tracking in the AI agent system. Changes include modifying cache token calculations to include write tokens, adding per-step generation telemetry with token breakdown tracking, and introducing comprehensive test coverage for token accounting validation and telemetry event emission.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Token Accounting Updates**<br>packages/opencode/src/acp/agent.ts, packages/opencode/src/session/processor.ts | Modified the `used` token calculation in agent context accounting to include cache write tokens. Enhanced the processor with per-step telemetry tracking: captures step start time, computes duration, and emits generation events with token breakdown (input, output, reasoning, cache_read, cache_write) and cost metrics. |
| **Token Telemetry Tests**<br>packages/opencode/test/altimate/token-telemetry.test.ts | New comprehensive test suite validating token accounting behavior with mocked Anthropic and non-Anthropic models, prompt-caching scenarios, cache read/write pricing, tiered pricing thresholds, NaN/Infinity guards, telemetry event payload structure, multi-step message token accumulation, and context-window usage calculations including cache write tokens. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Cache writes now counted with care,
Token flow tracked everywhere,
Step by step, telemetry bright,
Generation events shining light,
Tests ensure math is right! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description covers all required sections: the summary explains the changes and rationale, the test plan details verification steps, and checklist items are addressed with testing confirmations. |
| Title check | ✅ Passed | The PR title "feat: per-generation token telemetry" accurately reflects the main feature addition of emitting generation telemetry events on every LLM step. |




- Emit `generation` telemetry event on every LLM step-finish with model_id,
  provider_id, agent, finish_reason, cost, duration_ms, and token breakdown
- Token fields are flat to comply with Azure App Insights custom measurements
  schema: `tokens_input`, `tokens_output`, and optionally `tokens_reasoning`,
  `tokens_cache_read`, `tokens_cache_write`
- Optional token fields are only included when the provider actually returns
  them — reasoning only for reasoning models, cache fields only when active
- Remove unused `TokensPayload` type and special-case serializer handler
- Step duration tracked from `start-step` to `finish-step` events
- Update telemetry.md with accurate generation event field description
- Update existing tests for flat token field shape

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
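A flat payload satisfying the schema constraint described in the commit message might look like the following; all field values here are illustrative, not taken from a real session:

```typescript
// Azure App Insights custom measurements must be flat numeric fields,
// hence tokens_* keys rather than a nested tokens object. Values below
// are made up for illustration.
const event = {
  type: "generation",
  timestamp: Date.now(),
  model_id: "example-model",
  provider_id: "example-provider",
  finish_reason: "stop",
  cost: 0.0123,
  duration_ms: 1840,
  tokens_input: 2,
  tokens_output: 6,
  // Optional fields appear only when the provider returns them; this
  // hypothetical step wrote to the cache but did no reasoning or cache reads:
  tokens_cache_write: 34600,
}
console.log(Object.keys(event).filter((k) => k.startsWith("tokens_")))
```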
@suryaiyer95 suryaiyer95 force-pushed the worktree-tokens-telemetry branch from 5dd52cf to 34d6047 Compare March 20, 2026 20:07
Copilot AI left a comment

Pull request overview

This PR adds per-generation telemetry emission to the session processor and fixes token accounting issues related to multi-step generations and prompt caching (including context window usage when cache writes are involved).

Changes:

  • Emit generation telemetry events on every finish-step with per-step token buckets, cost, and duration.
  • Fix assistant message token tracking to accumulate output/reasoning across multi-step generations.
  • Update ACP context window used calculation to include cache.write tokens; add a new Bun test suite covering caching semantics and telemetry payload expectations.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| packages/opencode/test/altimate/token-telemetry.test.ts | Adds a new test suite validating prompt-caching token semantics, generation telemetry payload shape, and context-window calculation expectations. |
| packages/opencode/src/session/processor.ts | Accumulates output/reasoning tokens across steps and emits generation telemetry events per step with token buckets and duration. |
| packages/opencode/src/acp/agent.ts | Fixes context window usage (`used`) to include cache.write tokens. |
| packages/drivers/src/sqlserver.ts | Removes a now-unneeded TypeScript suppression comment on the dynamic mssql import. |
Comments suppressed due to low confidence (1)

packages/opencode/src/session/processor.ts:297

  • The new behavior here (emitting a generation telemetry event per finish-step, and accumulating assistantMessage.tokens.output/reasoning across steps) isn't exercised by tests that execute this finish-step handler. The added tests call Telemetry.track directly, which validates the event shape but not that processor.ts actually emits it or that the accumulation logic works end-to-end. Consider extending the existing packages/opencode/test/session/processor.test.ts (which already mirrors processor telemetry paths) to cover this new generation event emission and token accumulation semantics.
```ts
                    type: "step-start",
                  })
                  break

                case "finish-step":
                  const usage = Session.getUsage({
                    model: input.model,
                    usage: value.usage,
                    metadata: value.providerMetadata,
                  })
                  input.assistantMessage.finish = value.finishReason
                  input.assistantMessage.cost += usage.cost
                  input.assistantMessage.tokens = usage.tokens
                  // altimate_change start — emit per-generation telemetry with token breakdown
                  // Optional fields are only included when the provider actually returns them.
                  Telemetry.track({
                    type: "generation",
                    timestamp: Date.now(),
                    session_id: input.sessionID,
                    message_id: input.assistantMessage.id,
                    model_id: input.model.id,
                    provider_id: input.model.providerID,
                    agent: input.assistantMessage.agent,
                    finish_reason: value.finishReason ?? "unknown",
                    cost: usage.cost,
                    duration_ms: Date.now() - stepStartTime,
                    tokens_input: usage.tokens.input,
                    tokens_output: usage.tokens.output,
                    ...(value.usage.reasoningTokens !== undefined && { tokens_reasoning: usage.tokens.reasoning }),
                    ...(value.usage.cachedInputTokens !== undefined && { tokens_cache_read: usage.tokens.cache.read }),
                    ...(usage.tokens.cache.write > 0 && { tokens_cache_write: usage.tokens.cache.write }),
                  })
                  // altimate_change end
                  await Session.updatePart({
                    id: PartID.ascending(),
                    reason: value.finishReason,
                    snapshot: await Snapshot.track(),
                    messageID: input.assistantMessage.id,
                    sessionID: input.assistantMessage.sessionID,
                    type: "step-finish",
                    tokens: usage.tokens,
                    cost: usage.cost,
                  })
                  await Session.updateMessage(input.assistantMessage)
                  if (snapshot) {
                    const patch = await Snapshot.patch(snapshot)
                    if (patch.files.length) {
                      await Session.updatePart({
```
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/opencode/src/session/processor.ts`:
- Around line 258-264: The accumulation of token counts is incorrect because
input.assistantMessage.tokens is spread from usage.tokens so tokens.total stays
the per-step value; update the assignment in the block that modifies
input.assistantMessage.tokens (where usage.tokens is accessed) to compute total
from the accumulated fields (e.g., total = input.assistantMessage.tokens.output
+ input.assistantMessage.tokens.reasoning + any other token categories you
maintain) instead of copying usage.tokens.total, so after each step
assistantMessage.tokens.total reflects the new cumulative output and reasoning
sums.
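The recomputation this comment asks for could be sketched as follows, with the Tokens shape assumed for illustration (the real type and merge site live in processor.ts):

```typescript
// Illustrative Tokens shape; names are assumptions for this sketch.
type Tokens = {
  input: number
  output: number
  reasoning: number
  total: number
  cache: { read: number; write: number }
}

// Accumulate output/reasoning across steps, keep last-step input/cache,
// and recompute total from the accumulated buckets rather than copying
// the per-step usage.tokens.total.
function mergeStepTokens(prev: Tokens, step: Tokens): Tokens {
  const output = prev.output + step.output
  const reasoning = prev.reasoning + step.reasoning
  return {
    ...step,
    output,
    reasoning,
    total: step.input + output + reasoning + step.cache.read + step.cache.write,
  }
}
```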

In `@packages/opencode/test/altimate/token-telemetry.test.ts`:
- Around line 216-388: Tests currently assert behavior by calling
Telemetry.track(), Session.getUsage(), and inlining arithmetic instead of
exercising the real production flows; update the tests to drive the actual code
paths: create a SessionProcessor via SessionProcessor.create and call
SessionProcessor.process (stub LLM.stream to emit the desired steps) so that
Telemetry.track is invoked by the real processor rather than directly, assert
that assistantMessage.tokens is mutated across multi-step messages (verifying
accumulation from the processor/ACP update path), and for context-window
behavior call the ACP usage-update function in acp/agent (or run the processor
path that applies that update) to verify cache.write is included in the computed
used/context window rather than reimplementing the math inline; keep
Telemetry.track spying to capture emitted events from the processor and use the
real Session.getUsage outputs as produced by the processor to validate cost and
token totals.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0094cc30-a95a-4183-b492-dc2ebb7e95bd

📥 Commits

Reviewing files that changed from the base of the PR and between df24e73 and 5dd52cf.

📒 Files selected for processing (3)
  • packages/opencode/src/acp/agent.ts
  • packages/opencode/src/session/processor.ts
  • packages/opencode/test/altimate/token-telemetry.test.ts

@suryaiyer95 suryaiyer95 changed the title feat: per-generation token telemetry + fix token tracking bugs feat: per-generation token telemetry Mar 20, 2026
@anandgupta42 anandgupta42 merged commit bd56988 into main Mar 20, 2026
13 checks passed

3 participants