What does each module or model do? Questions about V2 testing #248

Open
zhjygit opened this issue May 27, 2024 · 0 comments

Comments

zhjygit commented May 27, 2024

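Going by the paper, the system has two main parts: feature extraction based on a base speech model, and voice timbre cloning.
1) Which model in the project is the base speech model?
2) Which part of the paper does guillaumekln/faster-whisper-medium correspond to?
3) melotts--myshell-ai-MeloTTS-xxx downloads models into the .cache\huggingface\hub directory. What do these models do, and which part of the paper do they correspond to?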

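So far, in V2, I have not found any way to control rhythm, pauses, and the like; the only control parameter seems to be speed.
Cloning Taiwanese Mandarin is almost impossible. I can't tell whether this is a problem with the base speech model or something else (for example, the audio I provided may be of poor quality). Does the model need to be trained on Taiwanese Mandarin? If so, how is it trained, and could you provide a method? I would also like to contribute to this project and help make it richer.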
