Low-latency AI voice companion with faster_whisper and ElevenLabs streaming

pip install voice-snap

from voice_snap import VoiceCompanion

companion = VoiceCompanion(
    whisper_model="base.en",
    elevenlabs_api_key="your_key_here",
    voice_id="21m00Tcm4TlvDq8ikWAM",
)
companion.start()

Uses faster_whisper for low-latency transcription and the ElevenLabs streaming API for fast TTS responses. Runs VAD (voice activity detection) to reduce spurious transcriptions.
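To illustrate what a VAD gate does before audio reaches the transcriber, here is a minimal energy-based sketch. It is not voice_snap's detector, which this README does not describe. The names `frame_rms` and `is_speech`, the 16-bit mono PCM input format, and the 0.02 threshold are all assumptions made for illustration, and the threshold scale here is unrelated to any library setting. Production detectors are usually model-based rather than a plain energy threshold.

```python
import math
import struct


def frame_rms(pcm: bytes) -> float:
    """Return the RMS level of 16-bit little-endian mono PCM, scaled to 0..1."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return math.sqrt(mean_square) / 32768.0


def is_speech(pcm: bytes, threshold: float = 0.02) -> bool:
    """Hypothetical gate: treat a frame as speech when its RMS exceeds threshold."""
    return frame_rms(pcm) > threshold


# Two synthetic frames of 1024 samples each.
# At 16 kHz, 1024 samples cover 1024 / 16000 = 0.064 s of audio.
silence = struct.pack("<1024h", *([0] * 1024))
tone = struct.pack(
    "<1024h",
    *(int(8000 * math.sin(2 * math.pi * 440 * n / 16000)) for n in range(1024)),
)

print(is_speech(silence), is_speech(tone))
```

Frames that fail the gate would simply be dropped, so silence and low-level background noise never reach the transcription model.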
The default model is base.en; other faster_whisper models such as small.en and medium.en also work. Larger models are more accurate but slower.
Set the ELEVENLABS_API_KEY environment variable or pass the key directly as elevenlabs_api_key. The voice ID can be found in the ElevenLabs dashboard.
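If you want the lookup to be explicit in your own code, a small helper along these lines could work. `resolve_api_key` is a hypothetical name, not part of voice_snap, and how the library itself reads the variable is not documented here.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else fall back to the env var."""
    key = explicit or os.environ.get("ELEVENLABS_API_KEY")
    if not key:
        raise RuntimeError(
            "No ElevenLabs API key: set ELEVENLABS_API_KEY or pass one explicitly."
        )
    return key
```

The returned value can then be passed as `elevenlabs_api_key` when constructing `VoiceCompanion`, which keeps the key out of source control.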
Typical round-trip latency is 500-800 ms, depending on model size and network conditions.
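To see where the round-trip time goes on your own machine, you could time each stage separately. The sketch below is an assumption-laden illustration: `StageTimer` is a hypothetical helper, the three stage names are a guess at how the pipeline splits, and the `time.sleep` calls stand in for real work.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


class StageTimer:
    """Hypothetical helper that records wall-clock time per pipeline stage."""

    def __init__(self) -> None:
        self.stages_ms: Dict[str, float] = {}

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Accumulate so a stage entered several times is summed.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.stages_ms[name] = self.stages_ms.get(name, 0.0) + elapsed_ms

    def total_ms(self) -> float:
        return sum(self.stages_ms.values())


timer = StageTimer()
with timer.stage("transcribe"):
    time.sleep(0.01)  # stand-in for speech-to-text
with timer.stage("respond"):
    time.sleep(0.01)  # stand-in for the response handler
with timer.stage("tts_first_chunk"):
    time.sleep(0.01)  # stand-in for time to the first streamed audio chunk

for name, ms in timer.stages_ms.items():
    print(f"{name}: {ms:.1f} ms")
print(f"total: {timer.total_ms():.1f} ms")
```

Measuring time to the first audio chunk, rather than to the end of synthesis, matches what a listener perceives when TTS output is streamed.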

import asyncio

from voice_snap import VoiceCompanion, StreamConfig


async def custom_handler(text: str) -> str:
    # Your LLM call goes here
    return f"You said: {text}"


companion = VoiceCompanion(
    whisper_model="small.en",
    elevenlabs_api_key="your_key",
    voice_id="21m00Tcm4TlvDq8ikWAM",
    stream_config=StreamConfig(
        sample_rate=16000,
        chunk_size=1024,
        vad_threshold=0.5,
    ),
    response_handler=custom_handler,
)
asyncio.run(companion.run_async())

License: MIT