Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
-
Updated
Oct 31, 2024 - Python
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS
Reparameterized Discrete Diffusion Models for Text Generation
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
[AAAI 2024] GLOP: Learning Global Partition and Local Construction for Solving Large-scale Routing Problems in Real-time
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
A length-controllable and non-autoregressive image captioning model.
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
Add a description, image, and links to the non-autoregressive topic page so that developers can more easily learn about it.
To associate your repository with the non-autoregressive topic, visit your repo's landing page and select "manage topics."