PyTorch implementation of the paper "M3-TTS: Multi-modal DiT Alignment & Mel-latent for Zero-shot High-fidelity Speech Synthesis"
Updated Dec 18, 2025 - Python