speech-to-speech

Here are 67 public repositories matching this topic...

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

speech-to-text speech-to-speech large-language-models multimodal-large-language-models speech-language-model speech-interaction

Updated May 19, 2025
Python

IAHispano / Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

text-to-speech ai voice speech pytorch tts rvc voice-conversion vc voice-cloning speech-to-speech vits voice-clone applio

Updated Jun 26, 2025
Python

Realtime AI speech with the OpenAI Realtime API and Gemini Live API on Arduino ESP32, using secure WebSockets and Deno edge functions, with more than 15 minutes of uninterrupted conversation globally, for AI toys, AI companions, AI devices, and more.

arduino ai hardware websocket esp32 realtime gemini openai gemini-api deno realtime-api speech-to-speech supabase

Updated Jun 12, 2025
TypeScript

aws-samples / swift-chat

A lightning-fast, cross-platform AI chat application built with React Native.

Updated Jun 24, 2025
TypeScript

opendilab / CleanS2S

A high-quality, streaming speech-to-speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!

python machine-learning streaming ai speech-synthesis speech-recognition speech-to-speech gpt-4o

Updated Jun 16, 2025
Python

VITA-MLLM / Freeze-Omni

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

speech speech-synthesis speech-recognition speech-to-speech large-language-models multimodal-large-language-models

Updated May 27, 2025
Python

SamirPaulb / real-time-voice-translator

A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.

Updated Jan 22, 2024
Tcl

amanvirparhar / weebo

A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2, and Kokoro-82M.

llama whisper kokoro speech-to-speech

Updated Jan 20, 2025
Python

MooreThreads / MooER

MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not limited to end-to-end speech interaction, end-to-end speech translation and speech recognition.

speech-recognition speech-to-text speech-translation speech-to-speech large-language-models chatgpt gpt-4o speech-interaction

Updated Jan 8, 2025
Python

dqqcasia / awesome-speech-translation

natural-language-processing machine-translation speech speech-synthesis speech-recognition speech-processing text-translation disfluency-detection speech-translation multimodal-machine-learning multimodal-machine-translation punctuation-restoration speech-to-speech simultaneous-translation cascaded-speech-translation non-autoregressive-translation speech-to-subtitles

Updated Nov 10, 2021

Lex-au / Vocalis

Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. Features low-latency audio streaming, dynamic visual feedback, and works with local LLM/TTS services via OpenAI-compatible endpoints.

artificial-intelligence visionprocessing conversational-ai speech-to-speech

Updated Apr 14, 2025
TypeScript

jesuscopado / samantha-os1-openai-realtime

Samantha OS1 is a conversational AI assistant powered by the Realtime API from OpenAI.

agent openai realtime-api speech-to-speech ai-agent

Updated Dec 27, 2024
Python

asiff00 / On-Device-Speech-to-Speech-Conversational-AI

This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming architecture for fluid conversations with immediate responses and natural interruption handling.

tts vad audio-processing asr voice-assistant conversational-ai speech-to-speech ollama kokoro-tts