Description
Bug Description
Issue Description
We're experiencing echo on mobile devices in our LiveKit voice agent setup when using the phone's built-in microphone. The agent hears its own speech played back, and because that audio is treated as user speech, the agent gets interrupted and performance degrades. I see the same behavior in your playground on mobile (iPhone).
Setup Details
Client-side (Mobile Browser)
- React frontend using `livekit-client`
- Audio track created with `getUserMedia` with these constraints:

```js
{
  channelCount: 1,
  echoCancellation: true,
  noiseSuppression: true,
  autoGainControl: true,
  voiceIsolation: true, // iOS only
  // Google WebRTC constraints
  googDAEchoCancellation: true,
  googEchoCancellation: true,
  googEchoCancellation2: true,
  googNoiseSuppression: true,
  googNoiseSuppression2: true,
  googAutoGainControl: true,
  googAutoGainControl2: true,
  googHighpassFilter: true,
  googTypingNoiseDetection: true
}
```

- Audio track wrapped in `LocalAudioTrack` with `userProvidedTrack=true`
- Published to room with `source: Track.Source.Microphone`
- Agent audio playback volume set to 0.75 on mobile
- Playback element has `playsinline` and `webkit-playsinline` attributes
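One way to confirm which of the constraints above the browser actually honors is to compare the requested values with what the browser reports back. The following is a minimal sketch, not part of our app: the helper name `diffConstraints` is hypothetical, while `navigator.mediaDevices.getSupportedConstraints()` and `MediaStreamTrack.getSettings()` are standard browser APIs. As far as I understand, keys the browser does not recognize (which may include the `goog*` entries) are silently ignored rather than rejected, so they would show up under `unsupported`.

```typescript
// Hypothetical diagnostic helper: compares requested audio constraints with
// what the browser says it supports and what it actually applied to the track.
type ConstraintValue = boolean | number | string;

interface ConstraintReport {
  unsupported: string[]; // keys the browser does not list as supported
  notApplied: string[];  // supported keys whose applied value differs
}

function diffConstraints(
  requested: Record<string, ConstraintValue>,
  supported: Record<string, boolean | undefined>,
  settings: Record<string, unknown>,
): ConstraintReport {
  const report: ConstraintReport = { unsupported: [], notApplied: [] };
  for (const key of Object.keys(requested)) {
    if (!supported[key]) {
      report.unsupported.push(key);
    } else if (settings[key] !== requested[key]) {
      report.notApplied.push(key);
    }
  }
  return report;
}

// Browser usage (shown as comments because it needs a real device):
// const stream = await navigator.mediaDevices.getUserMedia({ audio: requested });
// const [track] = stream.getAudioTracks();
// console.log(diffConstraints(
//   requested,
//   navigator.mediaDevices.getSupportedConstraints() as Record<string, boolean | undefined>,
//   track.getSettings() as Record<string, unknown>,
// ));
```

If `echoCancellation` appears under `notApplied` on a given device, the echo would be reaching the agent unprocessed regardless of the server-side noise cancellation settings.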
Server-side (LiveKit Agents - Python)
- Python agent using `livekit-agents` SDK
- For mobile devices, we force BVC noise cancellation (we've tried with standard and others and there is the same looping issue each time...):

```python
from livekit.plugins import noise_cancellation
# Mobile detection
is_mobile = getattr(agent_config, 'is_mobile', False)
if is_mobile:
    room_input_options.noise_cancellation = noise_cancellation.BVC()
```

- Room connection:

```python
await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
```

- Session started with:

```python
room_input_options = RoomInputOptions(
    text_enabled=False,
    audio_enabled=True,
    close_on_disconnect=False
)
await session.start(
    room=ctx.room,
    agent=agent,
    room_input_options=room_input_options,
    room_output_options=room_output_options,
)
```

Environment
- LiveKit Server: Cloud (LiveKit Cloud with BVC enabled)
- Client SDK: `livekit-client` (latest)
- Agent SDK: `livekit-agents` (Python, latest)
- Noise Cancellation Plugin: `livekit-plugins-noise-cancellation`
- Mobile Browsers: iOS Safari, Chrome Android
Observed Behavior
Sometimes it's mostly fine, but if I put my finger on top of the mic, or if I suddenly move the iPhone near my t-shirt or against a surface that creates reverberation, the agent's speech is detected as user speech. This can create a loop in which the agent keeps interrupting itself.
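To quantify how much of the agent's playback reaches the microphone in these situations, the mic level could be logged on the client while the agent is speaking. This is only a sketch under assumptions: the helper name `rmsDbfs` is hypothetical, and it assumes the samples come from something like `AnalyserNode.getFloatTimeDomainData()`, where values are in the range -1 to 1.

```typescript
// Hypothetical helper: root-mean-square level of a block of samples,
// expressed in dB relative to full scale (0 dBFS = constant amplitude 1.0).
function rmsDbfs(samples: Float32Array): number {
  if (samples.length === 0) return -Infinity;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  const rms = Math.sqrt(sumSquares / samples.length);
  // Silence has no finite dB value, so report -Infinity.
  return rms > 0 ? 20 * Math.log10(rms) : -Infinity;
}

// Browser usage (shown as comments because it needs a live mic stream):
// const ctx = new AudioContext();
// const analyser = ctx.createAnalyser();
// ctx.createMediaStreamSource(micStream).connect(analyser);
// const buf = new Float32Array(analyser.fftSize);
// analyser.getFloatTimeDomainData(buf);
// console.log("mic level (dBFS):", rmsDbfs(buf));
```

Comparing the logged level while the agent speaks with and without a finger over the mic would show whether the residual echo gets louder in the cases that trigger the loop.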
Questions
Is there a specific way to configure agent playout settings to prevent the agent from hearing itself? Should we be using specific RoomOutputOptions settings?
Are there known issues or best practices for using BVC on mobile browsers (iOS Safari, Chrome Android)?
I have tried the ElevenLabs widget that appears on a website, and whatever I try it never does this. I also tried Grok on the web and it works fine. So either I am doing something wrong, or they have some secret sauce that prevents it.
Expected Behavior
On mobile, even when using the iPhone's built-in microphone, there should be no loop where the agent's speech is detected as user speech. It should work like the ElevenLabs widget.
Reproduction Steps
1. Use the LiveKit client SDK on the client side.
2. Open the app on a mobile device.
3. Put a finger on the mic or move the phone slightly. Sometimes the loop starts at the beginning of the session without doing anything.
...
Operating System
iOS 18.5, iPhone 14 Pro Max
Models Used
Deepgram Nova-3, GPT-4.1, Google Chirp 3 HD (I also tried other models and have the same issue)
Package Versions
# LiveKit Core
livekit==1.0.12
livekit-agents==1.2.14
livekit-api==1.0.5
livekit-blingfire==1.0.0
livekit-plugins-anthropic==1.2.14
livekit-plugins-assemblyai==1.2.14
livekit-plugins-cartesia==1.2.14
livekit-plugins-deepgram==1.2.14
livekit-plugins-elevenlabs==1.2.14
livekit-plugins-gladia==1.2.14
livekit-plugins-google==1.2.14
livekit-plugins-groq==1.2.14
livekit-plugins-noise-cancellation==0.2.5
livekit-plugins-openai==1.2.14
livekit-plugins-silero==1.2.14
livekit-plugins-speechmatics==1.2.14
livekit-plugins-turn-detector==1.2.14
livekit-protocol==1.0.4
# AI Provider SDKs
openai>=1.0.0
anthropic>=0.20.0
google-generativeai>=0.3.0
elevenlabs>=0.2.0
# Web and networking
fastapi>=0.100.0
uvicorn>=0.23.0
websockets>=11.0.0
httpx>=0.27.0
aiohttp>=3.8.0
urllib3==1.26.18 # Pin to v1.x for LibreSSL compatibility
# Utilities
python-dotenv>=1.0.0
pydantic>=2.0.0
asyncio>=3.4.0
psutil>=5.9.0
redis==6.4.0
axiom-py>=0.3.0
# ML Framework
# Torch is installed separately in Dockerfile (CPU version for cloud)
# For local: pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cpu
Session/Room/Call IDs
Every mobile conversation is affected; the issue is not isolated to a specific session. Here is one example:
Room session ID: RM_VvZGESXbhLAy
Room name: agent_1fc1c7c3-dcad-4fc8-8477-10a74fb18302_session_session_1761782751543_t9652st
Proposed Solution
Additional Context
No response
Screenshots and Recordings
No response