
Google TTS fails with “input.text or input.ssml longer than 5000 bytes” even for short utterances (LiveKit Agents) #4762

@charan-akula

Bug Description

When using LiveKit Agents with the Google TTS plugin, speech synthesis fails with a
400 INVALID_ARGUMENT error stating that input.text or input.ssml exceeds 5000 bytes.

This error occurs even when the assistant output is a very short sentence (well under 5000 bytes), and it breaks the session.

The issue seems unrelated to actual text length and may be caused by SSML handling,
internal buffering, retry concatenation, or chunking behavior in the Google TTS plugin.
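
Because the limit in the error message is expressed in bytes rather than characters, one quick check is to measure the UTF-8 size of the text that is about to be synthesized. The sketch below is only a diagnostic aid: `payload_bytes` and `describe` are hypothetical helper names, the `<speak>` wrapper is an assumed stand-in for whatever SSML the plugin actually sends, and it assumes the limit is counted over the UTF-8 encoding of the input.

```python
GOOGLE_TTS_BYTE_LIMIT = 5000  # limit quoted in the 400 INVALID_ARGUMENT message


def payload_bytes(text: str, as_ssml: bool = False) -> int:
    """Return the UTF-8 size of text, optionally wrapped in a <speak> element.

    The <speak> wrapper is an assumption for illustration; the real plugin
    may build its SSML differently.
    """
    payload = f"<speak>{text}</speak>" if as_ssml else text
    return len(payload.encode("utf-8"))


def describe(text: str, as_ssml: bool = False) -> str:
    """Summarize character count, byte count, and whether the limit is exceeded."""
    size = payload_bytes(text, as_ssml)
    status = "over" if size > GOOGLE_TTS_BYTE_LIMIT else "within"
    return f"{len(text)} chars, {size} bytes ({status} limit)"
```

Telugu code points take three bytes each in UTF-8, so the byte count is roughly three times the character count. Logging `describe(text, as_ssml=True)` just before synthesis would show whether the request is genuinely small or whether something upstream has accumulated text.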

Expected Behavior

Short assistant responses (a few words or sentences) should synthesize successfully
using Google TTS without hitting the 5000-byte limit.

The TTS plugin should:

  • Properly chunk or stream text
  • Avoid concatenating previous failed attempts
  • Respect Google TTS size limits internally
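
As an illustration of the chunking behavior described above, here is a minimal sketch of a byte-bounded splitter. It is not the plugin's implementation: `split_by_bytes` and `_split_long_word` are hypothetical names, the 5000-byte default is taken from the error message, and the sketch assumes plain text (it does not try to keep SSML tags intact).

```python
def _split_long_word(word: str, max_bytes: int) -> list[str]:
    """Split one word into pieces of at most max_bytes UTF-8 bytes each."""
    pieces: list[str] = []
    current, size = "", 0
    for ch in word:
        n = len(ch.encode("utf-8"))
        if current and size + n > max_bytes:
            pieces.append(current)
            current, size = "", 0
        current += ch
        size += n
    if current:
        pieces.append(current)
    return pieces


def split_by_bytes(text: str, max_bytes: int = 5000) -> list[str]:
    """Greedily pack whitespace-separated words into chunks of <= max_bytes.

    Runs of whitespace are collapsed to single spaces as a side effect.
    """
    if max_bytes < 4:
        # A single UTF-8 character can occupy up to 4 bytes.
        raise ValueError("max_bytes must be at least 4")
    chunks: list[str] = []
    current = ""
    for word in text.split():
        for piece in _split_long_word(word, max_bytes):
            candidate = f"{current} {piece}" if current else piece
            if len(candidate.encode("utf-8")) <= max_bytes:
                current = candidate
            else:
                # current is non-empty here, because a lone piece always fits.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be sent as its own synthesis request. Splitting at sentence boundaries would likely sound more natural; word boundaries are used here only to keep the sketch short.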

Reproduction Steps

# The TTS configuration I am using:
# (requires: import os; from google.cloud import texttospeech; from livekit.plugins import google)
        tts=google.TTS(
            voice_name="te-IN-Chirp3-HD-Erinome",  # Chirp 3 HD voice, not a Standard voice
            language="te-IN",
            gender="female",
            credentials_file=os.getenv("GOOGLE_CREDENTIALS_PATH"),
            use_streaming=False,  # non-streaming synthesis
            enable_ssml=True,  # input is sent as SSML
            audio_encoding=texttospeech.AudioEncoding.LINEAR16,  # Must be PCM
            sample_rate=24000,
            speaking_rate=1.0,
            volume_gain_db=3.0,
        ),

Operating System

Windows 11

Models Used

gemini cloud tts

Package Versions

livekit-agents[google]~=1.3

Session/Room/Call IDs

No response

Proposed Solution

Additional Context

No response

Screenshots and Recordings

No response

Labels

bug (Something isn't working)
