V1.11

@leng-yue leng-yue released this 10 Feb 04:22
ffb512c

We are happy to announce that a pre-trained model is now available. With it, about 30 minutes of audio data and 15 minutes of fine-tuning (on a 3090) are enough to imitate the voice timbre you want.

We recommend fine-tuning with the attached config. It changes the learning-rate scheduler and the number of steps between checkpoint saves.

Model Info

  • Dataset Size: ~300 hours, ~600 singers (M4Singer, OpenSinger, OpenCpop, and in-house data)
  • Vocoder: NSF HifiGAN 44.1 kHz (OpenVPI)
  • Feature Extractor: Chinese Hubert Soft with gate size 25
  • MD5: 9d88f1bbca34053919ee1ea8bd780a9b
  • Steps: 260k on a 4 x RTX A6000 server
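
The MD5 listed above can be used to check that a downloaded checkpoint is intact. The following is a minimal sketch, not part of the release tooling: the function names `file_md5` and `verify_checkpoint` are illustrative, and the release does not say which file the checksum covers, so the path you pass in is an assumption on your side.

```python
import hashlib

# MD5 copied from the Model Info list above.
EXPECTED_MD5 = "9d88f1bbca34053919ee1ea8bd780a9b"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected digest (case-insensitive)."""
    return file_md5(path) == expected.strip().lower()
```

For example, `verify_checkpoint("model.ckpt")` (a hypothetical file name) returns `True` only when the downloaded file matches the published checksum. A mismatch usually means an incomplete or corrupted download.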

This model is released under the CC-BY-NC-SA 4.0 license. Please read the license carefully before downloading.