fix(voice): add PreemptiveGenerationOptions for fine-grained control #5428
Conversation
Add configurable limits to preemptive generation to reduce wasted LLM requests during long user utterances:
- max_speech_duration (default 10s): skip preemptive generation when the user has been speaking longer than this threshold
- max_retries (default 3): cap speculative LLM requests per user turn; the counter resets when the turn completes

The preemptive_generation parameter now accepts bool | PreemptiveGenerationOptions, keeping backward compatibility (True/False still works).
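The described parameter shape can be sketched as a small TypedDict. The field names and defaults below come from this PR; the resolve_preemptive_options helper is a hypothetical illustration of how a bool | PreemptiveGenerationOptions value could be normalized, not the actual livekit-agents implementation:

```python
from typing import TypedDict, Union

class PreemptiveGenerationOptions(TypedDict, total=False):
    max_speech_duration: float  # seconds of speech before skipping preemptive generation
    max_retries: int            # speculative LLM requests allowed per user turn

DEFAULTS: PreemptiveGenerationOptions = {"max_speech_duration": 10.0, "max_retries": 3}

def resolve_preemptive_options(
    value: Union[bool, PreemptiveGenerationOptions],
) -> Union[PreemptiveGenerationOptions, None]:
    # Hypothetical helper (not part of the PR): normalizes the
    # bool | PreemptiveGenerationOptions parameter so True/False keep working.
    if value is False:
        return None               # preemptive generation disabled
    if value is True:
        return dict(DEFAULTS)     # enabled with default limits
    return {**DEFAULTS, **value}  # a partial dict overrides the defaults
```
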
self._cancel_preemptive_generation()

if (
    info.started_speaking_at is not None
    and time.time() - info.started_speaking_at > preemptive_opts["max_speech_duration"]
):
    return

if self._preemptive_generation_count >= preemptive_opts["max_retries"]:
    return

self._preemptive_generation_count += 1
💡 Existing preemptive generation is cancelled before the max_retries check, discarding valid work
In on_preemptive_generation, _cancel_preemptive_generation() is called unconditionally on line 1783 before the max_retries check on line 1791. When _preemptive_generation_count >= max_retries, the method returns early without starting a new generation, but the previous (most recent) preemptive generation has already been cancelled and set to None. This means the last successful preemptive generation is destroyed without replacement. Later, in _user_turn_completed_task at line 1995, self._preemptive_generation is None, so the preemptive result can never be used and a fresh (non-preemptive) LLM call is always made instead. This defeats the purpose of the max_retries limit, which should keep the last generation alive when retries are exhausted.
The fix is to move _cancel_preemptive_generation() after the early-return checks (or at least after the max_retries check), so the existing generation is only cancelled when it will actually be replaced by a new one.
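The suggested reordering can be sketched as follows. All names here are illustrative stand-ins (plain dicts instead of the session object), not the actual livekit-agents code; the point is that the early-return checks run before any cancellation:

```python
def on_preemptive_generation_reordered(state, info, opts, now):
    """Sketch of the reviewer's suggested ordering: run the early-return
    checks first, so the existing generation is cancelled only when a new
    one will actually replace it."""
    started = info.get("started_speaking_at")
    if started is not None and now - started > opts["max_speech_duration"]:
        return "kept"  # speech too long: keep the last preemptive generation
    if state["count"] >= opts["max_retries"]:
        return "kept"  # retries exhausted: keep the last preemptive generation
    state["cancelled"] = True  # stands in for self._cancel_preemptive_generation()
    state["count"] += 1
    return "restarted"
```

With retries exhausted, the previous generation survives untouched; otherwise it is cancelled and replaced.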
This is expected: on_preemptive_generation is called when the user transcript changes, so the previous preemptive generation is invalid and we should cancel it as soon as possible.
When False (the default), only the LLM runs preemptively and TTS starts after the turn is confirmed. When True, both LLM and TTS run immediately on interim transcripts for the lowest latency, at the cost of TTS compute on discarded attempts.
Move the preemptive generation config into turn_handling, where it conceptually belongs alongside endpointing and interruption. Add an "enabled" field matching the InterruptionOptions pattern, and deprecate the top-level preemptive_generation parameter on AgentSession.
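Based on the fields named in this PR (enabled, preemptive_tts, max_speech_duration, max_retries), the nested configuration could look like the sketch below. TurnHandlingOptions here is an illustrative container, not the library's actual definition:

```python
from typing import TypedDict

class PreemptiveGenerationOptions(TypedDict, total=False):
    enabled: bool              # matches the InterruptionOptions pattern
    preemptive_tts: bool       # run TTS on interim transcripts too
    max_speech_duration: float
    max_retries: int

class TurnHandlingOptions(TypedDict, total=False):
    # illustrative container: the PR moves preemptive generation config
    # into turn_handling alongside endpointing and interruption
    preemptive_generation: PreemptiveGenerationOptions

turn_handling: TurnHandlingOptions = {
    "preemptive_generation": {
        "enabled": True,
        "preemptive_tts": False,
        "max_speech_duration": 10.0,
        "max_retries": 3,
    }
}
```
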
@property
def preemptive_generation(self) -> PreemptiveGenerationOptions:
    return self.turn_handling["preemptive_generation"]
We don't need to expose it if we expose turn_handling
Suggested change (remove the property):

@property
def preemptive_generation(self) -> PreemptiveGenerationOptions:
    return self.turn_handling["preemptive_generation"]
Devin Review found 2 new potential issues.
2 issues in files not directly in the diff
Serialization format break: preemptive_generation changed from bool to dict in report output (livekit-agents/livekit/agents/voice/report.py:66)
self.options.preemptive_generation now returns a PreemptiveGenerationOptions dict (e.g., {"enabled": True, "preemptive_tts": False, "max_speech_duration": 10.0, "max_retries": 3}) instead of a bool. The to_dict() method at livekit-agents/livekit/agents/voice/report.py:66 embeds this value directly into the report dictionary. Any downstream consumer (e.g., LiveKit Cloud) that previously expected a boolean for the "preemptive_generation" key will now receive a nested dict, breaking deserialization or interpretation of the session report.
Serialization format break: preemptive_generation changed from bool to dict in remote session serialization (livekit-agents/livekit/agents/voice/remote_session.py:331)
str(opts.preemptive_generation) at livekit-agents/livekit/agents/voice/remote_session.py:331 now produces a Python dict string representation like "{'enabled': True, 'preemptive_tts': False, 'max_speech_duration': 10.0, 'max_retries': 3}" instead of "True" or "False". This serialized value is sent to the remote SessionHost via the transport (used at remote_session.py:662). The receiving end likely expects a boolean-like string, so this change breaks the remote session protocol.
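The format change the review describes can be reproduced with plain Python; the option values below mirror the defaults listed in this PR:

```python
# Before the PR: preemptive_generation was a bool, so str() gave "True"/"False".
before = str(False)

# After the PR: it is a TypedDict, so str(dict(...)) gives a Python dict repr,
# which a consumer expecting a boolean-like string will not parse.
opts = {
    "enabled": True,
    "preemptive_tts": False,
    "max_speech_duration": 10.0,
    "max_retries": 3,
}
after = str(dict(opts))
```
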
View 10 additional findings in Devin Review.
    "max_tool_steps": str(opts.max_tool_steps),
    "user_away_timeout": str(opts.user_away_timeout),
-   "preemptive_generation": str(opts.preemptive_generation),
+   "preemptive_generation": str(dict(opts.preemptive_generation)),
💡 Existing test mock uses bool for preemptive_generation, but _serialize_options now calls dict() on it
The change from str(opts.preemptive_generation) to str(dict(opts.preemptive_generation)) will crash with TypeError: 'bool' object is not iterable when called with the mock in tests/test_session_host.py:447, which sets options.preemptive_generation = False. The test_get_session_state test exercises this path via _handle_request(get_session_state) → _serialize_options(self._session.options) at livekit-agents/livekit/agents/voice/remote_session.py:662. The mock needs to be updated to use a PreemptiveGenerationOptions dict (e.g., MagicMock(__iter__=lambda s: iter([])) or a real dict like {"enabled": False}).
Prompt for agents
The test file tests/test_session_host.py at line 447 sets options.preemptive_generation = False (a plain bool). After the change on remote_session.py:331 from str(opts.preemptive_generation) to str(dict(opts.preemptive_generation)), calling dict(False) raises TypeError. The fix should update the mock in tests/test_session_host.py _make_mock_session() to use a dict-like object for preemptive_generation, for example: options.preemptive_generation = {"enabled": False, "preemptive_tts": False, "max_speech_duration": 10.0, "max_retries": 3} (matching PreemptiveGenerationOptions structure), or use MagicMock(__iter__=lambda s: iter([])) like the endpointing and interruption mocks do.
Summary
PreemptiveGenerationOptions: a TypedDict with configurable options to reduce wasted compute during preemptive generation:
- preemptive_tts (default False): when False, only the LLM runs preemptively and TTS starts after the turn is confirmed
- max_speech_duration (default 10s): skip preemptive generation when the user has been speaking too long
- max_retries (default 3): cap preemptive LLM requests per user turn; resets on turn completion

The preemptive_generation parameter now accepts bool | PreemptiveGenerationOptions, fully backward compatible.