My name is Yuancheng Wang (王远程). I'm a first-year Ph.D. student at the Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), supervised by Professor Zhizheng Wu. Before that, I received my B.S. degree from CUHK-Shenzhen. I also collaborate with Xu Tan (谭旭) from Microsoft Research Asia.
My research interests include text-to-speech synthesis, text-to-audio generation, and unified audio representation and generation. I am one of the main contributors and leaders of the open-source Amphion toolkit.
I have developed NaturalSpeech 3, an advanced text-to-speech model with factorized speech representation and modelling.
- 2024.05: 🎉 Our paper Factorized Diffusion Models are Natural and Zero-shot Speech Synthesizers, aka NaturalSpeech 3, got accepted by ICML 2024!
- 2024.03: 🎉 We are delighted to release NaturalSpeech 3, an advanced version of the NaturalSpeech series with speech factorization. We have also released FACodec checkpoints and a demo in the HuggingFace Amphion Space.
- 2023.11: 🔥 We released Amphion v0.1 (⭐️ 3.6k+), which is an open-source toolkit for audio, music, and speech generation.
- 2023.09: 🎉 My first paper on audio generation and editing, AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models, got accepted by NeurIPS 2023!