Describe the bug
When preemptiveGeneration: true is enabled and a user produces an end-of-utterance (EOU) during a function tool's execution window, the framework starts a new generateReply (via the preemptive path) before the tool result has been added to the chat context. The new generation runs to completion against stale chat context, the LLM commonly hallucinates the tool's outcome, and its TTS plays in full alongside the legitimate post-tool reply. The user hears two similar messages back-to-back about the same outcome, or the tool ends up being called a second time.
Where it happens (agents/src/voice/agent_activity.ts on main):
// onPreemptiveGeneration, line 1291
if (
  !preemptiveOpts.enabled ||
  this.schedulingPaused ||
  (this._currentSpeech !== undefined && !this._currentSpeech.interrupted) ||
  !(this.llm instanceof LLM)
) {
  return;
}
The early-return guard checks _currentSpeech, but _currentSpeech is cleared in the main loop between LLM-stream-end and tool execution:
// mainTask, ~line 1396
this._currentSpeech = speechHandle;
speechHandle._authorizeGeneration();
await speechHandle.waitIfNotInterrupted([speechHandle._waitForGeneration()]);
this._currentSpeech = undefined; // ← cleared before pipelineReplyTask runs the tool
So while a function tool is executing in a separate task, there is no _currentSpeech, and a user EOU triggers a fresh preemptive generation that has no awareness of the in-flight tool.
The Python framework has a partial guard via _new_turns_blocked, set in agent_session.py:1259 during update_agent, which covers the handoff sub-case. That field doesn't exist in agents-js. But even Python is unprotected for plain (non-handoff) tools, where _new_turns_blocked is never set, so this is a general gap, not just a JS port omission. A sketch of one possible guard extension follows.
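One possible shape of a fix, offered for discussion (the _inflightToolCount field and the executeToolCalls name below are hypothetical, not existing agents-js API): have the tool-execution path keep a count of in-flight tools, and make onPreemptiveGeneration treat a pending tool result the same way it treats live speech:

// Hypothetical sketch, not framework code. In the task that runs the tool,
// bracket the tool body with a counter so the activity knows a result is pending:
this._inflightToolCount++;
try {
  await executeToolCalls(/* ... */); // stand-in for the real tool-execution call
} finally {
  this._inflightToolCount--;
}

// In onPreemptiveGeneration, extend the existing early return:
if (
  !preemptiveOpts.enabled ||
  this.schedulingPaused ||
  this._inflightToolCount > 0 || // new: don't preempt while a tool result is pending
  (this._currentSpeech !== undefined && !this._currentSpeech.interrupted) ||
  !(this.llm instanceof LLM)
) {
  return;
}

A simple boolean would also work for a single tool call; a counter covers parallel tool execution.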
Relevant log output
Reconstructed from a single production call (room redacted, customer PII stripped). The user said "Yes. Thank" 0.84 s after the bookAppointment tool started its API call:
04:15:49.748Z user EOU → speech_0f238ed2-954 created
04:15:50.970Z LLM finished (74 tokens, includes a bookAppointment tool call)
04:15:51.450Z TTS finished (39-char stall message); _currentSpeech is now undefined
04:15:52.514Z bookAppointment tool body starts API call (~3 s)
04:15:53.158Z user starts speaking ("Yes. Thank")
04:15:53.358Z user EOU final → "Speech created" (source=generate_reply, userInitiated=true); speech_3aa64aa6-79c is created, and chatCtx has NO tool result in it
04:15:55.515Z bookAppointment returns (handoff result)
04:15:56.157Z LLM for speech_3aa64aa6-79c completes (62 tokens, 228-char hallucinated confirmation)
04:15:57.247Z TTS plays in full (cancelled: false, ~9.7 s of audio): message #1
04:16:05.108Z Post-handoff agent's say() fires (323 chars): message #2
04:16:06.486Z user interrupts ("All right."): speech_84d5f0b8-13a cancelled
Describe your environment
- @livekit/agents: 1.2.8
- turnHandling.preemptiveGeneration.enabled: true
- Node.js: 22.x
- OS: linux x64
Minimal reproducible example
import { llm, voice } from '@livekit/agents';
import { z } from 'zod';

const slowTool = llm.tool({
  description: 'Simulate a slow API call',
  parameters: z.object({}),
  execute: async () => {
    await new Promise((r) => setTimeout(r, 3000)); // 3-second tool body
    return { ok: true, message: 'real tool result that the LLM cannot guess' };
  },
});

const session = new voice.AgentSession({
  // ...stt, llm, tts, vad...
  voiceOptions: { preemptiveGeneration: true },
});
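For completeness, the elided wiring would look roughly like this (a sketch assuming the standard voice.Agent and session.start APIs; provider setup omitted):

const agent = new voice.Agent({
  instructions: 'When the user asks you to book, call slowTool and report its result.',
  tools: { slowTool },
});

// inside the job entrypoint, where ctx.room is available:
await session.start({ agent, room: ctx.room });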
Repro:
- Start the agent and get it to call slowTool.
- Have the user speak any short utterance (e.g. "okay") between the time the tool body starts and when it returns, i.e. while the agent is "waiting on the tool."
- Observe two replies in sequence.
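Until the guard accounts for in-flight tools, the only config-level mitigation appears to be disabling the preemptive path entirely (giving up its latency win), since the first check in the guard above is !preemptiveOpts.enabled:

const session = new voice.AgentSession({
  // ...stt, llm, tts, vad...
  voiceOptions: { preemptiveGeneration: false }, // workaround: skip onPreemptiveGeneration entirely
});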
Additional information
No response