voice-snap

Low-latency AI voice companion with faster_whisper and ElevenLabs streaming

pip install voice-snap

from voice_snap import VoiceCompanion

companion = VoiceCompanion(
    whisper_model="base.en",
    elevenlabs_api_key="your_key_here",
    voice_id="21m00Tcm4TlvDq8ikWAM"
)

companion.start()

notes

Uses faster_whisper for low-latency transcription and the ElevenLabs streaming API for quick TTS responses. Runs VAD (voice activity detection) to reduce spurious transcriptions.
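How voice-snap implements VAD is not documented here. As a rough illustration of the general idea only, a minimal energy gate could look like the sketch below. The names `rms`, `is_speech`, and `energy_threshold` are hypothetical and not part of the voice_snap API, and the energy threshold is not the same quantity as `vad_threshold`.

```python
import math
from typing import Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_speech(frame: Sequence[float], energy_threshold: float = 0.02) -> bool:
    """Hypothetical gate: treat a frame as speech when its energy is high enough.

    Frames below the threshold would be skipped instead of being sent to
    the transcriber, which is what cuts down on spurious transcriptions.
    """
    return rms(frame) >= energy_threshold


# One 1024-sample frame of silence and one of a 440 Hz tone at 16 kHz.
silence = [0.0] * 1024
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1024)]
```

Real VADs are usually model based rather than energy based, so treat this as a way to see where the gate sits in the pipeline, not as a replacement.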

Default model is base.en but you can use small.en, medium.en, etc. Larger models are more accurate but slower.

Set the ELEVENLABS_API_KEY env var or pass the key directly as elevenlabs_api_key. The voice ID can be found in the ElevenLabs dashboard.
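If you prefer to resolve the key yourself before constructing the companion, a small helper along these lines may be enough. `resolve_api_key` is a hypothetical name, and the precedence shown (explicit argument first, then the environment) is an assumption rather than documented voice-snap behavior.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else fall back to ELEVENLABS_API_KEY."""
    key = explicit_key or os.environ.get("ELEVENLABS_API_KEY")
    if not key:
        raise RuntimeError(
            "No ElevenLabs API key: pass one directly or set ELEVENLABS_API_KEY"
        )
    return key
```

The result can then be passed as `elevenlabs_api_key=resolve_api_key()`, which keeps the key out of source files.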

Typical roundtrip latency: 500-800ms depending on model size and network.
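To check latency on your own setup, one option is to time your handler (or any other stage you can call directly) with a small wrapper. This is a generic sketch using only the standard library; `timed` is a hypothetical helper, not something voice_snap provides, and it covers only the call you wrap, not the full audio roundtrip.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Call fn() and return its result plus the elapsed time in milliseconds."""
    start = time.perf_counter()
    result = fn()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


# Example: time a stand-in for a response handler.
reply, ms = timed(lambda: "You said: hello")
print(f"handler took {ms:.2f} ms")
```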

import asyncio
from voice_snap import VoiceCompanion, StreamConfig

async def custom_handler(text: str) -> str:
    # your LLM call here
    return f"You said: {text}"

companion = VoiceCompanion(
    whisper_model="small.en",
    elevenlabs_api_key="your_key",
    voice_id="21m00Tcm4TlvDq8ikWAM",
    stream_config=StreamConfig(
        sample_rate=16000,
        chunk_size=1024,
        vad_threshold=0.5
    ),
    response_handler=custom_handler
)

asyncio.run(companion.run_async())

MIT
