docs(voice): scaffold resume_false_interruption port (tracks livekit/agents#5535) #1310

Closed
toubatbrian wants to merge 0 commits into devin/1777264529-resume-false-interruption from claude/jolly-lovelace-LMzib

Conversation

@toubatbrian
Contributor

Summary

Automated Claude Code port of Python PR livekit/agents#5535 "fix(voice): pause output when user starts speaking during thinking".

cc @toubatbrian @livekit/agent-devs

Heads-up: this is a minimal scaffolding port, not a feature-equivalent port. The full fix cannot be applied yet because the underlying resume_false_interruption pause/resume pipeline is not yet ported to agents-js (see "Implementation nuances" below).

What the Python PR does (context for reviewers)

Fixes livekit/agents#5509. When a user starts a new turn while the agent is still thinking (LLM generating, TTS audio not yet flowing), the stale reply could still reach speaking and play over the new user turn.

Key Python changes in agent_activity.py:

  1. New _PausedSpeechInfo dataclass — carries handle + agent_state + timeout. The captured agent_state is restored on resume (instead of always speaking), so a pause that began during thinking resumes to thinking.
  2. New on_start_of_speech pause path — when agent_state != "speaking" and pause is enabled, pause the output with timeout=0. A brief false-positive VAD then resumes immediately on VAD EOS; _interrupt_by_audio_activity upgrades the timeout to the real false_interruption_timeout once VAD confirms active speech.
  3. New helpers: _update_paused_speech(handle, timeout) and _pause_enabled().
  4. _paused_speech: SpeechHandle | None → _PausedSpeechInfo | None. All call sites updated (_start_false_interruption_timer, _cancel_speech_pause, on_end_of_speech, on_interim_transcript, on_final_transcript).
  5. Test harness updates: FakeAudioOutput(can_pause=True) supports pause; FakeUserSpeech with an empty transcript fires VAD SOS/EOS only (no STT events) to simulate sub-min_duration noise. Two new regression tests: test_interrupt_before_speaking_with_pausable_audio and test_false_interruption_before_speaking_resumes.
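
For reviewers who have not read the Python change, items 1 and 2 can be modelled roughly as below. This is a simplified TypeScript sketch, not the Python implementation: the `PauseModel` class, the `handleId` stand-in for a speech handle, the local `AgentState` union, and the switch to `listening` while paused are all assumptions made for illustration.

```typescript
// Hypothetical, simplified model of the pause path. Not the livekit-agents code.
type AgentState = 'initializing' | 'idle' | 'listening' | 'thinking' | 'speaking';

interface PausedSpeechInfo {
  handleId: string; // stand-in for a SpeechHandle
  agentState: AgentState; // state captured at pause time, restored on resume
  timeout: number; // ms; 0 means "resume as soon as VAD reports end of speech"
}

class PauseModel {
  agentState: AgentState = 'thinking';
  pausedSpeech: PausedSpeechInfo | null = null;
  private readonly falseInterruptionTimeout: number;

  constructor(falseInterruptionTimeout: number) {
    this.falseInterruptionTimeout = falseInterruptionTimeout;
  }

  // User starts speaking before the agent reached "speaking": pause with timeout 0.
  onStartOfSpeech(handleId: string): void {
    if (this.agentState === 'speaking' || this.pausedSpeech !== null) return;
    this.pausedSpeech = { handleId, agentState: this.agentState, timeout: 0 };
    this.agentState = 'listening'; // assumption: the agent listens while paused
  }

  // VAD confirmed real speech: upgrade the timeout to the configured value.
  interruptByAudioActivity(): void {
    if (this.pausedSpeech !== null) {
      this.pausedSpeech.timeout = this.falseInterruptionTimeout;
    }
  }

  // Returns the delay (ms) before resume() should run, or null if nothing is paused.
  onEndOfSpeech(): number | null {
    return this.pausedSpeech === null ? null : this.pausedSpeech.timeout;
  }

  // Restore the captured state instead of always moving to "speaking".
  resume(): void {
    if (this.pausedSpeech === null) return;
    this.agentState = this.pausedSpeech.agentState;
    this.pausedSpeech = null;
  }
}
```

In this model a brief VAD blip yields a delay of 0 from `onEndOfSpeech()`, so the agent returns to `thinking` immediately, while confirmed speech yields the full configured timeout.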

What this JS PR ports

Because the base feature isn't in TS yet, this PR is structural only — no behavioral change:

  • PausedSpeechInfo interface added in agent_activity.ts with full doc and a Python ref comment pointing to the Python dataclass.
  • _pausedSpeech: PausedSpeechInfo | null = null field added on AgentActivity so future wiring is a field-reuse rather than an API introduction.
  • AgentState type import from ./events.js (matches the Python AgentState import added in the fix).
  • Python Ref markers added at the four TS TODO sites that map to the Python fix, each pointing to the exact Python line ranges:
    • onStartOfSpeech → Python lines 1665–1683 (the new pause path, this PR's core change)
    • interruptByAudioActivity → Python lines 1615–1645 (pauseEnabled / updatePausedSpeech / AgentFalseInterruptionEvent / state transition)
    • onEndOfSpeech → Python lines 1707–1708 (timer start using pausedSpeech.timeout)
    • onFinalTranscript → Python lines 1800–1811 (resume-timer + cancel-speech-pause task)
  • Changeset (patch) describing the scaffolding.
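
The shape of the scaffolding is roughly the following. This is a self-contained sketch rather than the actual diff: `AgentActivitySketch`, the local `AgentState` union, and the `{ id: string }` handle type are stand-ins, and the Ref line assumes the hyphenated range form of the CLAUDE.md template.

```typescript
// Local stand-in for the AgentState import from ./events.js (assumed values).
type AgentState = 'initializing' | 'idle' | 'listening' | 'thinking' | 'speaking';

/** State captured when speech output is paused; mirrors the Python _PausedSpeechInfo dataclass. */
interface PausedSpeechInfo {
  handle: { id: string }; // stand-in for SpeechHandle
  agentState: AgentState; // restored on resume
  timeout: number; // false-interruption timeout in ms
}

class AgentActivitySketch {
  // Declared only; nothing reads or writes it yet, so behavior is unchanged.
  _pausedSpeech: PausedSpeechInfo | null = null;

  onStartOfSpeech(): void {
    // TODO(port-resume-false-interruption): pause output when agentState !== 'speaking'.
    // Ref: python livekit-agents/livekit/agents/voice/agent_activity.py - 1665-1683 lines
  }
}
```

Because the field is never assigned, calling `onStartOfSpeech()` leaves `_pausedSpeech` as `null`, which is what "structural only" means here.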

Implementation nuances — why no full-feature port

The JS repo already defines the option types (turn_config/interruption.ts):

falseInterruptionTimeout: number;      // defaultValue 2000
resumeFalseInterruption: boolean;      // defaultValue true

but greps show they are never read in agents/src/voice/** — the implementation is missing:

  • No _pausedSpeech, _falseInterruptionTimer, or _cancelSpeechPauseTask state on AgentActivity.
  • No _startFalseInterruptionTimer, _cancelSpeechPause, _pauseEnabled, or _updatePausedSpeech methods.
  • interruptByAudioActivity hard-interrupts via this._currentSpeech.interrupt() instead of pausing.
  • No AgentFalseInterruptionEvent emitter (not defined in events.ts).
  • FakeAudioOutput (JS testing mock) doesn't currently support canPause=true.
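
To make the gap concrete, a gate that actually reads the two options might look like the sketch below. The conditions are an assumption about what a `pauseEnabled()` port would check (the Python `_pause_enabled()` body is not reproduced here), and `InterruptionOptions` and `AudioOutputLike` are hypothetical minimal types, not the real agents-js interfaces.

```typescript
// Hypothetical minimal shapes; the real option and audio output types are larger.
interface InterruptionOptions {
  falseInterruptionTimeout: number; // ms, default 2000
  resumeFalseInterruption: boolean; // default true
}

interface AudioOutputLike {
  canPause: boolean;
}

// Assumed gate: pausing only makes sense when resume is requested,
// a positive timeout is configured, and the output can actually pause.
function pauseEnabled(options: InterruptionOptions, audioOutput: AudioOutputLike | null): boolean {
  return (
    options.resumeFalseInterruption &&
    options.falseInterruptionTimeout > 0 &&
    audioOutput !== null &&
    audioOutput.canPause
  );
}
```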

Context: livekit/agents-js#843 ("OTEL logging integration & System-wise Traces") explicitly lists resume_agent_activity / pause_agent_activity as "Pause/resume not in TS" under Python-specific features.

Porting all of the above alongside PR #5535's fix in a single automated pass would materially expand the change surface (multi-hundred-line addition across agent_activity.ts, events.ts, agent_session.ts, fake I/O, tests) and was judged out-of-scope for an automated routine. Instead this PR:

  1. Adds the minimal structural anchors (interface + field + refs) so a subsequent human-driven port lands as a set of focused edits rather than introducing new symbols.
  2. Makes the Python↔TS mapping discoverable by grep for anyone picking up the follow-up work — search for TODO(port-resume-false-interruption) or livekit/agents#5535 to find every site.

Follow-up (suggested scope for the full port)

A future PR should:

  • Define AgentFalseInterruptionEvent in events.ts and wire it through AgentSession.
  • Add pause/resume plumbing on AgentActivity: timer handles, pauseEnabled(), updatePausedSpeech(), startFalseInterruptionTimer(), cancelSpeechPause().
  • Swap the hard-interrupt in interruptByAudioActivity for the pause path when pauseEnabled() is true.
  • Implement the new onStartOfSpeech pause path (the fix introduced by #5535).
  • Extend FakeAudioOutput with canPause/pause bookkeeping mirroring tests/fake_io.py.
  • Port the two regression tests (test_interrupt_before_speaking_with_pausable_audio, test_false_interruption_before_speaking_resumes).
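
As a starting point for the FakeAudioOutput item, the pause bookkeeping could be as small as the sketch below. Everything here is hypothetical: the class name, the counters, and the choice to throw when `canPause` is false are assumptions, and tests/fake_io.py should be treated as the reference for the real behavior.

```typescript
// Hypothetical test double; names and error behavior are assumptions.
class FakeAudioOutputSketch {
  readonly canPause: boolean;
  paused = false;
  pauseCount = 0;
  resumeCount = 0;

  constructor(options: { canPause?: boolean } = {}) {
    this.canPause = options.canPause ?? false;
  }

  pause(): void {
    if (!this.canPause) {
      throw new Error('this audio output does not support pause');
    }
    if (this.paused) return; // idempotent: repeated pauses are counted once
    this.paused = true;
    this.pauseCount += 1;
  }

  resume(): void {
    if (!this.paused) return; // nothing to resume
    this.paused = false;
    this.resumeCount += 1;
  }
}
```

The counters let a regression test assert that a false interruption produced exactly one pause and one resume, rather than only checking the final state.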

Verification

  • pnpm --filter @livekit/agents build — passes.
  • pnpm --filter @livekit/agents lint — no new errors (pre-existing warnings only).
  • pnpm format:check — clean.
  • pnpm --filter @livekit/agents exec vitest run src/voice/agent_activity.test.ts — 8/8 passed.
  • pnpm api:check — pre-existing api-extractor failure on main (export * as ___ not supported), unrelated to this change.

Provenance


Generated by Claude Code

@changeset-bot

changeset-bot Bot commented Apr 24, 2026

⚠️ No Changeset found

Latest commit: e77c9d0

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.


@toubatbrian toubatbrian changed the base branch from main to devin/1777264529-resume-false-interruption April 28, 2026 05:19
@toubatbrian
Contributor Author

cc @claude: rebased on top of the resume-false-interruption branch. Adjust your change accordingly.

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 new potential issue.

View 3 additional findings in Devin Review.


Comment thread agents/src/voice/agent_activity.ts Outdated

// default to null as None, which maps to the default provider tool choice value
private toolChoice: ToolChoice | null = null;
// Ref: python livekit-agents/livekit/agents/voice/agent_activity.py - 158 line
Contributor


🟡 Ref comment format deviates from CLAUDE.md template (line vs lines)

The // Ref: comment on line 205 uses - 158 line (singular) instead of - 158 lines (plural). CLAUDE.md specifies the template as // Ref: python <relative-file-path> - <line-range> lines and both examples in the doc use lines (plural). All other six Ref comments in this PR correctly use lines.

Suggested change
- // Ref: python livekit-agents/livekit/agents/voice/agent_activity.py - 158 line
+ // Ref: python livekit-agents/livekit/agents/voice/agent_activity.py - 158 lines
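
A mechanical check would catch this class of deviation before review. The sketch below is one possible validator, assuming the template quoted above is the whole rule and that line ranges are written with a plain hyphen; `REF_COMMENT` and `isValidRefComment` are hypothetical names, not existing tooling in the repo.

```typescript
// Assumed template: // Ref: python <relative-file-path> - <line-range> lines
// A line range is taken to be either "158" or "1665-1683" (hyphen form is an assumption).
const REF_COMMENT = /^\/\/ Ref: python (\S+) - (\d+(?:-\d+)?) lines$/;

function isValidRefComment(line: string): boolean {
  // Trim so that indentation inside a class body does not affect the check.
  return REF_COMMENT.test(line.trim());
}
```

Run over every line that starts with `// Ref:`, this rejects the singular `line` form flagged here and accepts the corrected one.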

@toubatbrian toubatbrian force-pushed the claude/jolly-lovelace-LMzib branch from fc76cb3 to e77c9d0 Compare April 28, 2026 05:23

Development

Successfully merging this pull request may close these issues.

Agent transitions thinking → speaking after user has already transitioned to speaking (stale reply plays over new user turn)

2 participants