Bug title
TTS playback skips most of the model's response in Open WebUI when using google_gemini.py v1.6.4
Describe the bug
I'm using the google_gemini.py pipeline (version v1.6.4) inside Open WebUI. Whenever a response is generated while auto read aloud is enabled, or while I'm in call mode, only the very last sentence of the response is spoken aloud - sometimes just a single word or emoji. The rest of the response is skipped entirely.
This happens specifically in two scenarios:
- When auto-play TTS is enabled (automatic voice playback after each response)
- When using call mode (live voice chat)
It does not happen when manually pressing the "Listen to message" button after the full response has already appeared. In that case, the entire message is spoken correctly.
This strongly suggests a problem in how streaming output is handled in TTS-related workflows - especially when playback is triggered before the entire model response is finalized.
Additionally, I suspect this may also be influenced by the brief visual rendering of the model's internal Thought section. This section appears briefly, then quickly collapses again, which might confuse the TTS system or disrupt how the message is composed internally. It could be a combination of premature streaming and dynamic UI state changes.
There are no visible errors, but Open WebUI often continues to "work" in the background indefinitely, requiring a manual refresh of the page.
Steps to reproduce
- Install the google_gemini.py pipeline (v1.6.4) from the Open WebUI function community
- Use a Gemini model such as gemini-2.5-flash
- Enable TTS auto-play or use call mode
- Enter any text prompt
- Observe: only the last sentence is spoken aloud
- Optional: manually press the TTS playback button afterward - in this case the entire message is read correctly
Environment
- OpenWebUI version: v0.6.32
- Pipeline: google_gemini.py v1.6.4
- Model: gemini-2.5-flash
- TTS usage: using auto-play or call mode
- Browser: Google Chrome (latest)
- System: Ubuntu 22.04.5 LTS
- Setup: Docker
Additional context
We believe the problem is linked to how early streaming chunks are handled. If TTS playback is triggered before all chunks are received, only the final chunk appears to be passed to the TTS engine and spoken aloud.
It might also be relevant that OpenWebUI briefly renders a collapsible Thought section before the final message, which could interfere with the rendering or parsing of the full message used for TTS.
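To make the suspected failure mode concrete, here is a minimal, purely illustrative Python sketch (not the actual Open WebUI or google_gemini.py code; all function names are hypothetical). It contrasts a buggy pattern, where each streamed chunk replaces the pending TTS text so only the last chunk survives, with the expected pattern of buffering the full response before speaking:

```python
def stream_chunks():
    """Simulate a model response arriving as streamed chunks."""
    yield "Hello there. "
    yield "Here is a long explanation of the topic. "
    yield "Bye!"

def collect_for_tts_buggy(chunks):
    """Suspected buggy pattern: each incoming chunk overwrites the
    pending TTS text, so only the final chunk is ever spoken."""
    pending = ""
    for chunk in chunks:
        pending = chunk  # replaces instead of appending
    return pending

def collect_for_tts_buffered(chunks):
    """Expected pattern: accumulate all chunks and hand the complete
    message to TTS only after the stream has finished."""
    buffer = []
    for chunk in chunks:
        buffer.append(chunk)
    return "".join(buffer)

print(collect_for_tts_buggy(stream_chunks()))     # only the last chunk
print(collect_for_tts_buffered(stream_chunks()))  # the full response
```

This matches the observed symptom: the manual "Listen to message" button works because by then the full, finalized message text is available, while auto-play and call mode act on the stream before it completes.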
We'd love to see better TTS handling or a different Thought section mechanism.
Happy to help test a fix or contribute further!
Thanks again for your incredible work and the beautifully structured Gemini pipeline - it's very appreciated!