Amphion Alpha Release #2

RMSnow · 2023-11-28T08:32:51Z

We release the alpha version of Amphion 🎉. The key features are as follows:

TTS: Text to Speech
- Support FastSpeech2 and VITS
SVC: Singing Voice Conversion
- Support multiple content-based features including WeNet, Whisper, and ContentVec
- Provide the official implementation of the paper "Leveraging Content-based Features from Multiple Acoustic Models for Singing Voice Conversion" (NeurIPS 2023 Workshop on Machine Learning for Audio)
TTA: Text to Audio
- Support a latent diffusion model-based TTA. It is also the official implementation of the text-to-audio generation part of our NeurIPS 2023 paper "AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models".
Vocoder
- Support several GAN-based vocoders including MelGAN, HiFi-GAN, NSF-HiFiGAN, BigVGAN, and APNet.
- Provide the official implementation of the Multi-Scale Constant-Q Transform Discriminator.
- Release two vocoder checkpoints (see Amphion/pretrained/README.md): Amphion Speech HiFi-GAN and Amphion Singing BigVGAN.
Evaluation
- Support 16 objective metrics (see Amphion/egs/metrics/README.md) covering F0 modeling, energy modeling, intelligibility, spectrogram distortion, and speaker similarity.
Datasets
- Support 15 academic datasets (see Amphion/egs/datasets/README.md).

Thanks to all the contributors, including:

Xueyao Zhang* @RMSnow, Liumeng Xue* @lmxue, Yuancheng Wang* @HeCheng0625, and Yicheng Gu* @VocodexElysium (The Chinese University of Hong Kong, Shenzhen)
Xi Chen @ChenX17 (The Chinese University of Hong Kong, Shenzhen)
Zihao Fang @Adorable-Qin (The Chinese University of Hong Kong, Shenzhen)
Haopeng Chen @arsity (The Chinese University of Hong Kong, Shenzhen)
Lexiao Zou @Lokshaw-Chau (The Chinese University of Hong Kong, Shenzhen & Harbin Institute of Technology, Shenzhen)
Chaoren Wang @yuantuo666 (The Chinese University of Hong Kong, Shenzhen)
Jun Han (The Chinese University of Hong Kong, Shenzhen)
Kai Chen (Shanghai AI Lab & OpenMMLab)
Haizhou Li (The Chinese University of Hong Kong, Shenzhen)
Zhizheng Wu @zhizhengwu (The Chinese University of Hong Kong, Shenzhen & Shanghai AI Lab)

*: Equal contribution.

Also, thanks to the Shenzhen Research Institute of Big Data (SRIBD) for partially supporting the computing resources and the scholarships for Xueyao, Liumeng, and Yuancheng.

SVC key features and recipe template

batched infer, contentvec 12 layers, fs based hop

amphion alpha release

48a8f2b

RMSnow requested review from lmxue, zhizhengwu, HeCheng0625 and VocodexElysium November 28, 2023 08:33

RMSnow added the "enhancement" and "good first issue" labels Nov 28, 2023

fix some typos

62b1c83

zhizhengwu merged commit 9682d0c into open-mmlab:main Nov 28, 2023

lmxue pushed a commit to lmxue/Amphion that referenced this pull request Dec 1, 2023

Merge pull request open-mmlab#2 from RMSnow/dev-svc

de4cd5d

SVC key features and recipe template

lmxue pushed a commit to lmxue/Amphion that referenced this pull request Dec 1, 2023

Merge pull request open-mmlab#2 from Adorable-Qin/new-dev-svc

d8da330

batched infer, contentvec 12 layers, fs based hop

open-mmlab deleted a comment from PedramHaeri Dec 1, 2023
