Preemptive generation does not check in-flight function tool execution #1365

@CampbellMBXJ

Describe the bug

When preemptiveGeneration: true is enabled and the user produces an end-of-utterance (EOU) while a function tool is still executing, the framework starts a new generateReply (via the preemptive path) before the tool result has been added to the chat context. That generation runs to completion on stale chat context, the LLM commonly hallucinates the tool's outcome, and its TTS plays in full alongside the legitimate post-tool reply. The user hears two similar messages back-to-back about the same outcome, or the tool is called a second time.

Where it happens (agents/src/voice/agent_activity.ts on main):

// onPreemptiveGeneration, line 1291
if (
  !preemptiveOpts.enabled ||
  this.schedulingPaused ||
  (this._currentSpeech !== undefined && !this._currentSpeech.interrupted) ||
  !(this.llm instanceof LLM)
) {
  return;
}

The early-return guard checks _currentSpeech, but _currentSpeech is cleared in the main loop between LLM-stream-end and tool execution:

// mainTask, ~line 1396
this._currentSpeech = speechHandle;
speechHandle._authorizeGeneration();
await speechHandle.waitIfNotInterrupted([speechHandle._waitForGeneration()]);
this._currentSpeech = undefined;   // ← cleared before pipelineReplyTask runs the tool

So while a function tool is executing in a separate task, there is no _currentSpeech, and a user EOU triggers a fresh preemptive generation that has no awareness of the in-flight tool.
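One possible direction, sketched below as a simplified, framework-free model rather than a patch: track in-flight tool executions and include that count in the early-return guard. The class PreemptiveGate, the inFlightTools counter, and the runTool wrapper are hypothetical names that do not exist in agents-js; only the shape of the guard condition is taken from the snippet above.

```typescript
// Hypothetical, simplified model of the guard; not the agents-js implementation.
interface SpeechLike {
  interrupted: boolean;
}

class PreemptiveGate {
  currentSpeech: SpeechLike | undefined = undefined;
  schedulingPaused = false;
  // Assumed addition: number of function tools whose results are not yet in chat context.
  private inFlightTools = 0;

  // Wrap a tool body so the counter covers its whole execution window,
  // including the case where the tool throws.
  async runTool<T>(execute: () => Promise<T>): Promise<T> {
    this.inFlightTools += 1;
    try {
      return await execute();
    } finally {
      this.inFlightTools -= 1;
    }
  }

  // Same conditions as the existing early-return guard, plus the tool check.
  shouldStartPreemptive(enabled: boolean): boolean {
    if (
      !enabled ||
      this.schedulingPaused ||
      (this.currentSpeech !== undefined && !this.currentSpeech.interrupted) ||
      this.inFlightTools > 0
    ) {
      return false;
    }
    return true;
  }
}
```

With a check like this, a user EOU that lands inside the tool's execution window would be skipped by the preemptive path even though currentSpeech is already undefined. Whether the skipped turn should be dropped or deferred until the tool result is in chat context is a separate design question that this sketch does not address.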

The Python framework has a partial guard via _new_turns_blocked set in agent_session.py:1259 during update_agent, which catches the handoff sub-case. That field doesn't exist in agents-js. But even Python is unprotected for plain (non-handoff) tools, where _new_turns_blocked is never set — so this is a general gap, not just a JS port omission.

Relevant log output

Reconstructed from a single production call (room redacted, customer PII stripped). The user's "Yes. Thank" reached final EOU 0.84 s after the bookAppointment tool started its API call:

  04:15:49.748Z  user EOU → speech_0f238ed2-954 created
  04:15:50.970Z  LLM finished (74 tokens, includes a bookAppointment tool call)
  04:15:51.450Z  TTS finished (39 chars stall message); _currentSpeech is now undefined
  04:15:52.514Z  bookAppointment tool body starts API call (~3 s)
  04:15:53.158Z  user starts speaking ("Yes. Thank")
  04:15:53.358Z  user EOU final → "Speech created" (source=generate_reply, userInitiated=true) speech_3aa64aa6-79c is born, chatCtx has NO tool result in it
  04:15:55.515Z  bookAppointment returns (handoff result)
  04:15:56.157Z  LLM for speech_3aa64aa6-79c completes (62 tokens, 228-char hallucinated confirmation)
  04:15:57.247Z  TTS plays in full (cancelled: false, ~9.7 s of audio): message #1
  04:16:05.108Z  Post-handoff agent's say() fires (323 chars):  message #2
  04:16:06.486Z  user interrupts ("All right."):  speech_84d5f0b8-13a cancelled
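The intervals in the timeline above can be checked with a few lines of arithmetic. The helper below is only an illustration for reading the log (toMs is a hypothetical name, and it assumes all timestamps fall on the same day); the timestamps themselves are copied from the log.

```typescript
// Convert a same-day "HH:MM:SS.mmmZ" log timestamp to milliseconds since midnight.
function toMs(stamp: string): number {
  const m = /^(\d{2}):(\d{2}):(\d{2})\.(\d{3})Z$/.exec(stamp);
  if (m === null) throw new Error(`unexpected timestamp: ${stamp}`);
  const [h, min, s, ms] = m.slice(1).map(Number);
  return ((h * 60 + min) * 60 + s) * 1000 + ms;
}

const toolStart = toMs("04:15:52.514Z"); // bookAppointment starts its API call
const userEou = toMs("04:15:53.358Z"); // final EOU, stale speech handle created
const toolReturn = toMs("04:15:55.515Z"); // bookAppointment returns

// How long after the tool started did the EOU arrive?
const eouAfterToolStartMs = userEou - toolStart; // 844
// How long did the tool body take?
const toolDurationMs = toolReturn - toolStart; // 3001
// How long did the stale generation run before the tool result existed?
const staleHeadStartMs = toolReturn - userEou; // 2157

console.log({ eouAfterToolStartMs, toolDurationMs, staleHeadStartMs });
```

The EOU lands about 0.84 s into a roughly 3 s tool call, so the preemptive generation has more than 2 s of head start on the tool result.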

Describe your environment

  • @livekit/agents: 1.2.8
  • turnHandling.preemptiveGeneration.enabled: true
  • Node.js: 22.x
  • OS: linux x64

Minimal reproducible example

import { llm, voice } from '@livekit/agents';
import { z } from 'zod';

const slowTool = llm.tool({
  description: 'Simulate a slow API call',
  parameters: z.object({}),
  execute: async () => {
    await new Promise((r) => setTimeout(r, 3000)); // 3-second tool body
    return { ok: true, message: 'real tool result that the LLM cannot guess' };
  },
});

const session = new voice.AgentSession({
  // ...stt, llm, tts, vad...
  voiceOptions: { preemptiveGeneration: true },
});

Repro:

  1. Start the agent. Get it to call slowTool.
  2. Have the user speak any short utterance (e.g. "okay") between the time the tool body starts and when it returns — i.e. while the agent is "waiting on the tool."
  3. Observe two replies in sequence: one generated from stale chat context, then the legitimate post-tool reply.
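The ordering in the steps above can also be modeled without the framework. The following is a minimal sketch in which every name (ChatItem, replayRepro, the message strings other than the tool result) is hypothetical; it only illustrates why a generation started at the user's EOU cannot see the tool result.

```typescript
// Framework-free model of the repro ordering; all names are hypothetical.
type ChatItem = { role: "user" | "assistant" | "tool"; content: string };

function replayRepro(): { staleSnapshot: ChatItem[]; finalContext: ChatItem[] } {
  const chatCtx: ChatItem[] = [];

  // Step 1: the user asks for something and the agent calls the slow tool.
  chatCtx.push({ role: "user", content: "please run the slow tool" });
  chatCtx.push({ role: "assistant", content: "one moment (calls slowTool)" });

  // Step 2: the user says "okay" while the tool body is still running.
  chatCtx.push({ role: "user", content: "okay" });
  // A preemptive generation started at this point copies the context as it is now.
  const staleSnapshot = chatCtx.slice();

  // The tool returns only afterwards, so its result is missing from that copy.
  chatCtx.push({ role: "tool", content: "real tool result that the LLM cannot guess" });

  return { staleSnapshot, finalContext: chatCtx };
}

const { staleSnapshot, finalContext } = replayRepro();
const hasToolResult = (ctx: ChatItem[]): boolean => ctx.some((item) => item.role === "tool");
console.log(hasToolResult(staleSnapshot), hasToolResult(finalContext));
```

Step 3 then corresponds to two replies being generated: one from staleSnapshot and one from finalContext.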

Additional information

No response

Labels: bug (Something isn't working)