Stars
vgmstream - A library for playback of various streamed audio formats used in video games.
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
Open-source Windows and Office activator featuring HWID, Ohook, TSforge, KMS38, and Online KMS activation methods, along with advanced troubleshooting.
🔊 Text-Prompted Generative Audio Model
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Instant voice cloning by MIT and MyShell. Audio foundation model.
An application for converting projects among singing voice synthesizer software.
Asynchronous HTTP client/server framework for asyncio and Python
This is the GitHub page for publicly available emotional speech data.
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Deep Neural Pitch Extractor for Voice Conversion and TTS Training
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
A unified dataset of multilingual emotional human utterances
Parse strings using a specification based on the Python format() syntax.
Simple text-to-phones converter for multiple languages
This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repository it is possible to automatically recreate the dataset. …
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis corpus. In KazakhTTS2, the overall size has increased from 93 hours to 271 hours, the number of speakers h…
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Tacotron2-based engine for the SOVA-TTS project