A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Updated Jun 11, 2024 · Python
🧠 Leon is your open-source personal assistant.
[AAAI 2024] Code for CTX-vec2wav in UniCATS
Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
MARS5 speech model (TTS) from CAMB.AI
End-to-End Speech Processing Toolkit
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Source code for the paper "Online speech synthesis using a chronically implanted brain-computer interface in an individual with ALS" by Angrick et al.
so-vits-svc fork with realtime support, improved interface and more features.
Graduated Interval Recall program
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Foundational model for human-like, expressive TTS
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
Lingvo
Converts text to speech in realtime
Dataset and detection code for the paper "AI-Synthesized Voice Detection Using Neural Vocoder Artifacts", accepted at the CVPR Workshop on Media Forensics 2023.
Implementation of Transfer Learning from Speaker Verification to Multi-speaker Text-To-Speech Synthesis (SV2TTS) for the Persian language.
Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.