Update STT metrics to include token usage and emit gpt-realtime transcription STT token counts by bml1g12 · Pull Request #5029 · livekit/agents

bml1g12 · 2026-03-06T16:47:39Z

Summary

The OpenAI Realtime API's conversation.item.input_audio_transcription.completed event carries a usage field with ASR duration (whisper-1 / gpt-4o-transcribe), billed separately from the realtime model. LiveKit currently ignores this field, so users cannot track transcription metrics via on_metrics_collected.

Per OpenAI's Realtime costs documentation, input transcription is billed at the ASR model's rate (e.g. $0.006 / 1M tokens for whisper-1), separately from the realtime model's audio tokens. I have confirmed with OpenAI support that when using gpt-realtime, the Whisper ASR model is billed per token, not per minute, so cost tracking needs at least the audio token counts. Unfortunately, at the time of writing OpenAI only emits the duration (UsageTranscriptTextUsageDuration), even though their blog suggests it should emit token counts (UsageTranscriptTextUsageTokens). I have made OpenAI support aware of this contradiction between the blog and the events actually emitted.

This PR also handles UsageTranscriptTextUsageTokens events in case they are produced in future. If OpenAI later starts emitting the audio token counts (the ones that matter most for cost estimation), LiveKit users will be able to track their Whisper costs on a per-session basis without further changes. As it stands today, this PR only enables LiveKit users to track the duration of Whisper ASR performed.
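To make the two usage variants concrete, here is a minimal sketch of how a handler might normalize them into one shape. The `TokenUsage` and `DurationUsage` classes are hypothetical stand-ins for UsageTranscriptTextUsageTokens and UsageTranscriptTextUsageDuration; their field names and the `type` discriminator are assumptions for illustration, and `NormalizedUsage` and `normalize_usage` are not part of this PR or of any SDK.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Hypothetical stand-ins for the two usage variants; field names are assumptions.
@dataclass
class TokenUsage:
    input_tokens: int
    output_tokens: int
    total_tokens: int
    audio_tokens: Optional[int] = None
    type: str = "tokens"


@dataclass
class DurationUsage:
    seconds: float
    type: str = "duration"


@dataclass
class NormalizedUsage:
    # Token fields stay None when the provider only reports a duration.
    audio_duration: float = 0.0
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    total_tokens: Optional[int] = None
    input_audio_tokens: Optional[int] = None


def normalize_usage(
    usage: Union[TokenUsage, DurationUsage, None],
) -> Optional[NormalizedUsage]:
    """Map either usage variant onto one structure; return None if there is no usage."""
    if usage is None:
        return None
    if usage.type == "duration":
        return NormalizedUsage(audio_duration=usage.seconds)
    if usage.type == "tokens":
        return NormalizedUsage(
            input_tokens=usage.input_tokens,
            output_tokens=usage.output_tokens,
            total_tokens=usage.total_tokens,
            input_audio_tokens=usage.audio_tokens,
        )
    # Unknown variant: ignore it rather than guess at its meaning.
    return None
```

With this shape, a duration-only event leaves every token field as None, so downstream code can tell "not reported" apart from "zero tokens".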

The Metadata.model_name field identifies which transcription model produced the metrics (e.g. whisper-1, gpt-4o-transcribe).

Note that I have not emitted these metrics as OTEL traces, as it seems we currently do not emit STT traces in general. Also, for LangFuse to track the cost of these, I think I would need to use platform-specific attributes (`"langfuse.observation.type": "generation"`), as the OTEL specification does not have a standard attribute for STT token counting. I would be happy to add this as a further improvement if there is interest from the LiveKit team, but otherwise will just implement it in our own client code.

Changes

STTMetrics (metrics/base.py): Add optional input_tokens, output_tokens, total_tokens, and input_audio_tokens fields. All default to None so existing STT plugins are unaffected.
OpenAI realtime plugin (realtime_model.py): Extract usage from conversation.item.input_audio_transcription.completed events and emit STTMetrics via the existing metrics_collected event. Handles both the token-based (UsageTranscriptTextUsageTokens) and duration-based (UsageTranscriptTextUsageDuration) usage variants from the OpenAI SDK.
log_metrics (metrics/utils.py): Log token fields for STT metrics when present.
UsageCollector (metrics/usage_collector.py): Aggregate stt_input_tokens, stt_output_tokens, and stt_input_audio_tokens in UsageSummary.
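As an illustration of the first and third items above, the following sketch shows optional token fields that default to None and a log formatter that prints them only when present. `STTMetricsSketch` and `format_stt_log` are hypothetical names; the real STTMetrics class has other fields and the actual log_metrics output format may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class STTMetricsSketch:
    """Simplified, hypothetical stand-in for STTMetrics (not the real class)."""

    audio_duration: float
    model_name: Optional[str] = None
    # New optional fields: None means the provider did not report them,
    # so plugins that never set them behave exactly as before.
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    total_tokens: Optional[int] = None
    input_audio_tokens: Optional[int] = None


def format_stt_log(metrics: STTMetricsSketch) -> str:
    """Build a log line, including token fields only when they are set."""
    parts = [f"audio_duration={metrics.audio_duration:.2f}s"]
    if metrics.model_name:
        parts.append(f"model={metrics.model_name}")
    for name in ("input_tokens", "output_tokens", "total_tokens", "input_audio_tokens"):
        value = getattr(metrics, name)
        if value is not None:
            parts.append(f"{name}={value}")
    return "STT metrics: " + ", ".join(parts)
```

For a duration-only event, `format_stt_log(STTMetricsSketch(audio_duration=1.5, model_name="whisper-1"))` produces a line with no token fields at all.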

Design decisions

STTMetrics rather than RealtimeModelMetrics: The transcription runs on a separate model (whisper/gpt-4o-transcribe) with its own billing rate, so it belongs in STTMetrics with the model identified via Metadata.
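One consequence of this choice is that transcription usage can be summed per transcription model, separately from the realtime model's tokens, and priced at that model's own rate. A minimal sketch of such an aggregation follows; `SttTotals` and `PerModelSttCollector` are hypothetical names and this is not the UsageCollector implementation from this PR.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SttTotals:
    audio_duration: float = 0.0
    input_tokens: int = 0
    output_tokens: int = 0
    input_audio_tokens: int = 0


class PerModelSttCollector:
    """Hypothetical aggregator keyed by transcription model name."""

    def __init__(self) -> None:
        self.totals: Dict[str, SttTotals] = defaultdict(SttTotals)

    def collect(
        self,
        model_name: str,
        audio_duration: float,
        input_tokens: Optional[int] = None,
        output_tokens: Optional[int] = None,
        input_audio_tokens: Optional[int] = None,
    ) -> None:
        totals = self.totals[model_name]
        totals.audio_duration += audio_duration
        # Token counts that were not reported (None) contribute 0 to the sums.
        totals.input_tokens += input_tokens or 0
        totals.output_tokens += output_tokens or 0
        totals.input_audio_tokens += input_audio_tokens or 0
```

Keying by model name keeps whisper-1 and gpt-4o-transcribe usage apart, which matters if the two are billed at different rates.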

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

davidzhao · 2026-03-14T16:36:42Z

thanks for the PR. do you mind rebasing it on top of chenghaomou/v1.5.0? we've revamped metrics collection as part of the next release, including an updated usage structure.

bml1g12 · 2026-03-16T07:55:10Z

> thanks for the PR. do you mind rebasing it on top of chenghaomou/v1.5.0? we've revamped metrics collection as part of the next release, including an updated usage structure.

Ah I see, I will do this but it might be a little while from now as I have a backlog of other tasks

bml1g12 · 2026-03-18T11:18:23Z

> thanks for the PR. do you mind rebasing it on top of chenghaomou/v1.5.0? we've revamped metrics collection as part of the next release, including an updated usage structure.

That branch seems to no longer exist nor does it match "main"? Was it merged?

davidzhao · 2026-03-19T04:17:55Z

> > thanks for the PR. do you mind rebasing it on top of chenghaomou/v1.5.0? we've revamped metrics collection as part of the next release, including an updated usage structure.
>
> That branch seems to no longer exist nor does it match "main"? Was it merged?

v1.4.0 was renamed to v1.5.0 :) it's chenghaomou/v1.5.0. though if you wait another day, it'll be merged back to main.

bml1g12 · 2026-03-22T14:49:39Z

> > > thanks for the PR. do you mind rebasing it on top of chenghaomou/v1.5.0? we've revamped metrics collection as part of the next release, including an updated usage structure.
> >
> > That branch seems to no longer exist nor does it match "main"? Was it merged?
>
> v1.4.0 was renamed to v1.5.0 :) it's chenghaomou/v1.5.0. though if you wait another day, it'll be merged back to main.

OK, I'll wait and pull in main soon

davidzhao · 2026-03-24T06:16:16Z

> OK, I'll wait and pull in main soon

it's ready for rebase now

bml1g12 · 2026-03-24T10:10:19Z

@davidzhao OK, I've rebased - ready for re-review

bml1g12 · 2026-03-31T10:51:33Z

@davidzhao OK to merge?

bml1g12 added 6 commits March 6, 2026 16:47

Update STT metrics to include token usage and enhance logging for transcription events

1ecfaa8

refactor(realtime_model): linting

272ec55

refactor(metrics): streamline metadata and extra token logging using update method

86f04dd

Merge branch 'main' into add_gpt_realtime_transcription_metrics

2ca8da8

docs: clarify STT design choices

a4b4535

Merge branch 'add_gpt_realtime_transcription_metrics' of github.com:bml1g12/agents into add_gpt_realtime_transcription_metrics

ab2400c

bml1g12 marked this pull request as ready for review March 13, 2026 16:37

devin-ai-integration Bot reviewed Mar 13, 2026

bml1g12 added 3 commits March 24, 2026 09:49

Merge branch 'main' into add_gpt_realtime_transcription_metrics

8bb97ba

Merge branch 'main' into add_gpt_realtime_transcription_metrics

6acbabd

# Conflicts: # livekit-agents/livekit/agents/metrics/base.py # livekit-agents/livekit/agents/metrics/utils.py

chore(utils): fix merge conflicts

4223b4d

bml1g12 changed the title ~~Update STT metrics to include token usage and enhance logging for tra…~~ Update STT metrics to include token usage and enhance logging Mar 24, 2026

bml1g12 changed the title ~~Update STT metrics to include token usage and enhance logging~~ Update STT metrics to include token usage and emit gpt-realtime transcription STT token counts Mar 24, 2026

Merge branch 'main' into add_gpt_realtime_transcription_metrics

c4127ab

fix(usage_collector): if no input tokens set usage to 0

efdc47c

davidzhao self-assigned this Apr 14, 2026

bml1g12 wants to merge 11 commits into livekit:main from bml1g12:add_gpt_realtime_transcription_metrics

2 participants
