Bug Description
Text transcripts are getting cut off mid-sentence. The LLM generates complete responses but users only see partial text on the frontend.
For example, a response ending with "...what do you consider to be the distinctive characteristics?" is truncated to "...what do you consider to be the distinctive".
Production logs show the LLM generated 475 tokens but the frontend didn't receive all of them.
Expected Behavior
All generated text should be delivered to the frontend without truncation. If the LLM generates 475 tokens, the frontend should receive all 475 tokens worth of text, including the complete final sentence.
Reproduction Steps
# 1. Start an agent session with text transcription enabled
# 2. Have the agent generate a response (e.g., initial greeting)
# 3. Check the agent logs - you'll see the full token count
# 4. Check what the user sees - text is truncated
# Example from our logs:
# Realtime ttft=1.21s duration=6.18s tokens=0in/475out tokens/s=76.9
# ^ Agent generated 475 tokens but user didn't get the last few words
Important: This issue is more evident with increased network latency. If your LiveKit server or deployment has higher latency (e.g., cross-region, Azure deployments), the race window widens and truncation happens more consistently.
Operating System
Linux (Docker)
Models Used
- LLM: Azure OpenAI (via Azure AI Foundry)
- TTS: Azure TTS
- Using both Voice Live and traditional pipeline
Package Versions
livekit-agents==1.3.12
livekit==0.17.4
Python==3.13
Proposed Solution
The issue is in _ParticipantStreamTranscriptionOutput.flush() in _output.py. Currently flush() is a synchronous method that creates a background task and returns immediately:
def flush(self) -> None:
    # ...
    self._flush_atask = asyncio.create_task(self._flush_task(curr_writer))
    # Returns immediately without waiting!
Fix: Make flush() async and await the task:
async def flush(self) -> None:
    # ...
    await self._flush_task(curr_writer)  # Wait for the flush to complete
This would require updating all callers to await flush().
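The race can be reproduced outside the SDK with a toy model. FakeWriter, buggy_flush, and fixed_flush below are illustrative stand-ins, not the real LiveKit API; aclose() simulates the network round trip to the server:

```python
import asyncio

class FakeWriter:
    """Stand-in for the text-stream writer; aclose() simulates network I/O."""

    def __init__(self, delivered: list[str]) -> None:
        self._delivered = delivered
        self._buffer = ""

    def write(self, text: str) -> None:
        self._buffer += text

    async def aclose(self) -> None:
        await asyncio.sleep(0.05)  # simulated ~50ms network round trip
        self._delivered.append(self._buffer)

async def buggy_flush(writer: FakeWriter) -> None:
    # Current behavior: fire-and-forget, returns before aclose() completes.
    asyncio.create_task(writer.aclose())

async def fixed_flush(writer: FakeWriter) -> None:
    # Proposed fix: await the close, so delivery is guaranteed on return.
    await writer.aclose()

async def demo(flush) -> list[str]:
    delivered: list[str] = []
    writer = FakeWriter(delivered)
    writer.write("...what do you consider to be the distinctive characteristics?")
    await flush(writer)
    return list(delivered)  # what the "frontend" has the moment flush() returns

print(asyncio.run(demo(buggy_flush)))  # [] -- text lost if playout ends here
print(asyncio.run(demo(fixed_flush)))  # full sentence delivered
```

With the fire-and-forget version, nothing has been delivered at the moment flush() returns, which is exactly the window in which playout can end and the final words are dropped.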
Additional Context
Why This Happens
The _flush_task() performs await writer.aclose() which is network I/O to the LiveKit server. Our testing shows this takes about 50-60ms in low-latency environments. But with higher latency (e.g., cross-region deployments, cloud-to-cloud communication), this can take 100ms or more.
The race window is the time between flush() returning and _flush_atask completing.
Higher latency → Longer race window → More likely to hit the race condition
This explains why some deployments see consistent truncation while others might not notice it.
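The latency dependence can be made concrete with a toy measurement; the sleep stands in for the writer.aclose() round trip, and the delays are illustrative rather than measured from LiveKit:

```python
import asyncio
import time

async def race_window(network_latency: float) -> float:
    """How long the background flush task keeps running after flush() returns."""
    done = asyncio.Event()

    async def fake_aclose() -> None:
        await asyncio.sleep(network_latency)  # simulated network I/O
        done.set()

    start = time.monotonic()
    asyncio.create_task(fake_aclose())  # "flush()" returns here; task still pending
    await done.wait()                   # the window closes when aclose() finishes
    return time.monotonic() - start

for latency in (0.05, 0.10):
    window = asyncio.run(race_window(latency))
    print(f"simulated latency {latency:.2f}s -> race window ~{window:.2f}s")
```

The window is bounded below by the network latency, so a cross-region deployment roughly doubles or triples the time in which truncation can occur.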
Workaround
For now we're waiting for _flush_atask manually after wait_for_playout():
await speech_handle.wait_for_playout()
if hasattr(session.output.transcription, '_flush_atask'):
    if session.output.transcription._flush_atask:
        await session.output.transcription._flush_atask
This fixes the issue but requires accessing private SDK attributes.
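The same workaround can be wrapped in a small helper that no-ops if the private attribute disappears in a future SDK release and bounds the wait with a timeout. The attribute name _flush_atask is taken from livekit-agents 1.3.12; the helper itself is hypothetical:

```python
import asyncio

async def await_transcription_flush(transcription_output, timeout: float = 2.0) -> None:
    """Best-effort wait for the SDK's private _flush_atask; no-op if absent."""
    task = getattr(transcription_output, "_flush_atask", None)
    if task is None:
        return
    try:
        # shield() so a timeout here doesn't cancel the SDK's own task
        await asyncio.wait_for(asyncio.shield(task), timeout)
    except asyncio.TimeoutError:
        pass  # don't block playout forever if the server is slow or unreachable
```

Called as await await_transcription_flush(session.output.transcription) right after speech_handle.wait_for_playout(), this keeps the brittleness of the private-attribute access in one place until the upstream fix lands.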
Related Issues
This was reported before in #4817 but was closed with the comment "flush is called after all text captured so the race condition shouldn't happen." While flush IS called after text capture, the problem is that flush() doesn't wait for the actual flush work (writer.aclose()) to complete before returning.