feat(metrics): port playbackLatency metric from python (#5524) #1323

Merged
toubatbrian merged 3 commits into main from claude/jolly-lovelace-OcIg4
Apr 28, 2026
Conversation

@toubatbrian
Contributor

Summary

Ports livekit/agents#5524 ("feat(metrics): add playback_start_latency metric") to agents-js.

Adds a new playbackLatency field on MetricsReport (and therefore on assistant ChatMessage.metrics) measuring the delay between the first audio frame being forwarded to the AudioOutput and the output reporting that playback actually started.

This metric is typically near-zero in the default pipeline (the room output self-reports playbackStarted when it pushes the first frame to the track and so does not account for network delivery to the client), and becomes significant when a remote avatar worker is in the chain — in that scenario the avatar reports playback via the lk.playback_started RPC and the latency reflects the avatar/network leg.

This is an automated port created by the Claude Code routine triggered by @toubatbrian. cc @toubatbrian @livekit/agent-devs as reviewers.

Changes

agents/src/llm/chat_context.ts

  • Adds playbackLatency?: number (seconds) to MetricsReport. Documented with a tsdoc block matching the Python field's docstring.
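
As a rough illustration of the field described above, here is a minimal sketch of how it might sit next to the existing timing fields. The interface below is a trimmed-down, hypothetical stand-in (named `MetricsReportSketch` to make that clear); the real `MetricsReport` has more fields, and the tsdoc wording is an assumption rather than the text merged in this PR.

```typescript
// Hypothetical, trimmed-down stand-in for MetricsReport (not the real definition).
interface MetricsReportSketch {
  /** Time the agent started speaking, in seconds. */
  startedSpeakingAt?: number;
  /** Time the agent stopped speaking, in seconds. */
  stoppedSpeakingAt?: number;
  /** End-to-end latency, in seconds. */
  e2eLatency?: number;
  /**
   * Delay, in seconds, between forwarding the first audio frame to the
   * audio output and the output reporting that playback started.
   */
  playbackLatency?: number;
}

// The field is optional, so existing reports without it remain valid.
const withLatency: MetricsReportSketch = { playbackLatency: 0.042 };
const withoutLatency: MetricsReportSketch = {};
```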

agents/src/voice/generation.ts

  • Adds startedForwardingAt?: number (ms — Date.now()) to the _AudioOut interface.
  • In forwardAudio, sets out.startedForwardingAt = Date.now() the first time a frame is appended to out.audio. Mirrors the Python _audio_forwarding_task change (out.started_forwarding_at = time.time() at the same point in the loop).
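
The "stamp once, on the first appended frame" behaviour can be sketched roughly as follows. `AudioFrameLike`, `AudioOutLike`, and `appendFrame` are hypothetical names for illustration only; the real `forwardAudio` does considerably more (resampling, pushing to the output, and so on). The injectable `now` parameter is also an assumption, added so the sketch can be exercised with a fake clock.

```typescript
// Hypothetical stand-ins for the real frame and _AudioOut types.
interface AudioFrameLike {
  samples: Int16Array;
}

interface AudioOutLike {
  audio: AudioFrameLike[];
  /** ms since epoch (Date.now()), set when the first frame is appended. */
  startedForwardingAt?: number;
}

// Appends a frame and records the forwarding start time only once.
function appendFrame(
  out: AudioOutLike,
  frame: AudioFrameLike,
  now: () => number = Date.now,
): void {
  if (out.startedForwardingAt === undefined) {
    out.startedForwardingAt = now();
  }
  out.audio.push(frame);
}
```

Guarding on `undefined` rather than on `out.audio.length === 0` keeps the stamp stable even if the frame buffer is drained or reset elsewhere.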

agents/src/voice/agent_activity.ts

Two replyKind paths populate assistantMetrics; both are updated:

  1. ttsTask (the say() path): onFirstFrame now also captures replyStartedForwardingAt from audioOut.startedForwardingAt (falling back to replyStartedSpeakingAt in the text-only / no-audio case). replyAssistantMetrics.playbackLatency = (replyStartedSpeakingAt - replyStartedForwardingAt) / 1000 is populated alongside startedSpeakingAt / stoppedSpeakingAt.

  2. _pipelineReplyTaskImpl (LLM → TTS pipeline) — same treatment via agentStartedForwardingAt and a playbackLatency entry on assistantMetrics.
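
Both paths reduce to the same arithmetic, which can be sketched as a small helper. `computePlaybackLatency` is a hypothetical name used only for illustration; in the PR the calculation is written inline in each path.

```typescript
// Hypothetical helper: both inputs are in milliseconds (Date.now()),
// the result is in seconds to match the other MetricsReport fields.
function computePlaybackLatency(
  startedSpeakingAtMs: number,
  startedForwardingAtMs?: number,
): number {
  // Text-only / no-audio case: fall back to the speaking timestamp,
  // which yields a latency of zero.
  const forwardingAtMs = startedForwardingAtMs ?? startedSpeakingAtMs;
  return (startedSpeakingAtMs - forwardingAtMs) / 1000;
}

const withAudio = computePlaybackLatency(1_250, 1_000); // 0.25 s
const textOnly = computePlaybackLatency(1_250); // 0 s
```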

Each touched location has an inline // Ref: python livekit-agents/livekit/agents/voice/<file>.py - <range> lines comment as required by CLAUDE.md.

Implementation nuances (where exact code-level parity is not possible)

  1. Time units. Python tracks all timestamps as seconds-floats (time.time()); JS in this repo uses milliseconds (Date.now()) by convention (per CLAUDE.md). _AudioOut.startedForwardingAt is therefore stored in ms, and we divide the (startedSpeakingAt - startedForwardingAt) difference by 1000 when writing it into MetricsReport.playbackLatency (which, per the existing startedSpeakingAt / stoppedSpeakingAt / e2eLatency fields, is in seconds). End result: playbackLatency is in seconds, matching Python's playback_latency.

  2. _on_first_frame receives audio_out. Python upgrades the realtime path's callback to take the audio_out via functools.partial(_on_first_frame, audio_out=audio_out). JS already creates onFirstFrame as a closure inside the relevant scope, so the equivalent change is just a parameter (audioOut: _AudioOut | null) — no partial-equivalent needed. The text-only branch passes null.

  3. Realtime generation path is intentionally not touched. In Python, _realtime_generation_task_impl already populates assistant_metrics["started_speaking_at"] etc. and the diff just adds playback_latency to that block. In JS, the realtime generation task (_realtime_generation_task_impl / realtimeReplyTaskImpl) does not currently track started_speaking_at/stopped_speaking_at on the assistant ChatMessage at all (the ChatMessages are created without a metrics field — see agent_activity.ts:2754 and :2781). Adding playbackLatency there in isolation would be misleading without the surrounding metrics. This pre-existing gap is unrelated to PR #5524 and is left as a separate follow-up.

  4. Example port skipped. The Python diff updates examples/avatar_agents/audio_wave/agent_worker.py to log the metric. agents-js doesn't carry an equivalent avatar example in this repo, and the change in Python is illustrative (@session.on("conversation_item_added") already exists in JS, so users can observe the new field with the same pattern).
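
For readers who want to observe the new field, a handler along these lines could be registered on the conversation-item event mentioned in item 4. This is only a sketch: `ItemLike` and `describePlaybackLatency` are hypothetical, the item shape is a simplified assumption, and the exact event registration call in agents-js is not shown here because it is not quoted in this PR.

```typescript
// Hypothetical, minimal shape of a conversation item carrying metrics.
interface ItemLike {
  role: string;
  metrics?: { playbackLatency?: number };
}

// Returns a log line for assistant messages that carry the metric,
// or undefined when there is nothing to report.
function describePlaybackLatency(item: ItemLike): string | undefined {
  if (item.role !== 'assistant') return undefined;
  const latencySeconds = item.metrics?.playbackLatency;
  if (latencySeconds === undefined) return undefined;
  // The metric is stored in seconds; print it in milliseconds for readability.
  return `playback latency: ${(latencySeconds * 1000).toFixed(0)} ms`;
}
```

Returning a string instead of logging directly keeps the sketch easy to test and lets the caller decide where the line goes.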

Test plan

  • pnpm build:agents — clean
  • pnpm lint --filter=@livekit/agents — only pre-existing warnings, no new ones
  • pnpm test agents/src/llm/chat_context — 39/39 pass
  • pnpm test agents/src/voice — 136/136 pass
  • Changeset added (patch bump on @livekit/agents)
  • Manual: run restaurant_agent.ts / realtime_agent.ts against the Agent Playground and verify metrics.playbackLatency appears on assistant messages — recommend a reviewer with playground access do this; my port runs in an isolated environment without LiveKit credentials.

Reviewers

cc @toubatbrian @livekit/agent-devs


Generated by Claude Code

Ports livekit/agents#5524 to TypeScript. Adds a new `playbackLatency` field
on `MetricsReport` measuring the delay (in seconds) between forwarding the
first audio frame and the `AudioOutput` reporting that playback started.

- `_AudioOut` tracks `startedForwardingAt` (ms) inside `forwardAudio`
- pipeline-reply and tts-say paths in `AgentActivity` capture
  `audioOut.startedForwardingAt` in their `onFirstFrame` callback and
  populate `playbackLatency = (startedSpeakingAt - startedForwardingAt) / 1000`
  on the assistant `ChatMessage` metrics.

Near-zero for the default room output; meaningful when a remote avatar
worker is in the chain and reports playback via `lk.playback_started` RPC.
@changeset-bot

changeset-bot Bot commented Apr 27, 2026

🦋 Changeset detected

Latest commit: 1a154ea

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 26 packages
Name Type
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugins-test Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 potential issue.

⚠️ 1 issue in files not directly in the diff

⚠️ playbackLatency metric silently dropped during telemetry trace serialization (agents/src/telemetry/traces.ts:302-311)

The PR adds playbackLatency to the MetricsReport interface (agents/src/llm/chat_context.ts:102) and computes it in agent_activity.ts, but does not update the telemetry serialization in agents/src/telemetry/traces.ts. The ProtoMetricsReport interface (traces.ts:302-311) is missing a playbackLatency field, and the chatItemToProto serialization logic (traces.ts:384-409) never copies it. Every other field in MetricsReport is faithfully mapped in ProtoMetricsReport and serialized — playbackLatency is the only one silently dropped, so the metric will never appear in OpenTelemetry traces.
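
The kind of fix this finding calls for can be sketched as below. The interfaces and `metricsToProto` are hypothetical simplifications; the real `ProtoMetricsReport` and `chatItemToProto` in traces.ts have more fields and may use different names or encodings.

```typescript
// Hypothetical, reduced versions of the in-memory and serialized shapes.
interface MetricsReportLike {
  e2eLatency?: number;
  playbackLatency?: number;
}

interface ProtoMetricsReportLike {
  e2eLatency?: number;
  playbackLatency?: number;
}

// Copies each defined metric explicitly; a field that is not listed here
// is dropped from the serialized output, which is the failure mode
// described in the finding.
function metricsToProto(metrics: MetricsReportLike): ProtoMetricsReportLike {
  const proto: ProtoMetricsReportLike = {};
  if (metrics.e2eLatency !== undefined) proto.e2eLatency = metrics.e2eLatency;
  if (metrics.playbackLatency !== undefined) {
    proto.playbackLatency = metrics.playbackLatency;
  }
  return proto;
}
```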


Contributor

@lukasIO lukasIO left a comment


lgtm

Comment thread agents/src/voice/agent_activity.ts Outdated
Comment thread agents/src/llm/chat_context.ts Outdated
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 2 new potential issues.


Comment thread agents/src/voice/agent_activity.ts
Comment thread agents/src/voice/agent_activity.ts
Without this, the metric is silently dropped when ChatItems are
serialized into OTel traces.

Addresses Devin Review finding on PR #1323.
@toubatbrian
Contributor Author

Verified locally and on observability dashboard.

@toubatbrian toubatbrian merged commit 4abc309 into main Apr 28, 2026
8 of 9 checks passed
@toubatbrian toubatbrian deleted the claude/jolly-lovelace-OcIg4 branch April 28, 2026 08:50
4 participants