Pretrained model is here. #3
Comments
How many steps did you train on the Chinese dataset? Thanks in advance.
3 GPUs, batch_size 16, about 200K steps. You should focus on the loss: baker_base INFO [200000, 0.00017334926433399393]
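Thanks @dtx525942103 for providing the pretrained model. I ran it and the pauses are still very noticeable. Is that because of what you mentioned, "a boundary is already forced in after each phoneme, and VITS forces in a boundary as well"? To fix the pauses, do I need to set add_blank to false and retrain? Looking forward to your reply, thanks.
@josh-zhu It is already False. Does your audio differ much from the provided samples?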
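For context, here is a minimal sketch of what add_blank does in the reference VITS code, assuming this repository follows the same convention: a blank id is interspersed between all symbol ids, so every phoneme boundary gets an extra slot. The `intersperse` function mirrors the helper in upstream VITS `commons.py`; `to_model_input` and the ids are hypothetical and only for illustration.

```python
def intersperse(seq, item):
    """Return seq with `item` placed before, between, and after its elements."""
    result = [item] * (len(seq) * 2 + 1)
    result[1::2] = seq
    return result


def to_model_input(symbol_ids, add_blank):
    # Hypothetical wrapper: with add_blank=True a blank id (0) surrounds every
    # symbol id; with add_blank=False the ids are passed through unchanged.
    return intersperse(symbol_ids, 0) if add_blank else list(symbol_ids)


ids = [5, 12, 7]  # made-up symbol ids
print(to_model_input(ids, add_blank=True))   # [0, 5, 0, 12, 0, 7, 0]
print(to_model_input(ids, add_blank=False))  # [5, 12, 7]
```

If the text front end already inserts its own boundary symbols after phonemes, enabling add_blank would add a second boundary slot on top of them, which is the double insertion described above.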
It doesn't seem to have changed; the durations are identical.
Paste your text and audio here, and I'll check whether they match what I get.
新进来的宝贝,赶紧去抢福利了哈 ("To the dears who just joined, hurry and grab the deals")
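1_baker_sample.wav matches the original vits_string.txt. Thanks for the quick reply.
Your ZIP is empty.
Sorry, I've updated it.
This sounds very good; this is the normal result.
Oh, then there's no problem. I compared it with the fastspeech2 results, and fastspeech2 sounds more natural in intonation. Because fastspeech2 + hifi is not e2e, I can also fine-tune the mel duration of each phoneme, which makes the overall result more pleasant. Could you share a way to contact you offline? I work on a lot of speech-related things and would like to ask for advice and exchange ideas when convenient.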
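The per-phoneme duration tuning mentioned above can be sketched roughly as follows. This is a simplified illustration, not FastSpeech2's actual code: a real length regulator repeats encoder hidden states rather than phoneme labels, and the function name, the `scales` parameter, and the frame counts here are all assumptions made for the example.

```python
def length_regulate(phonemes, durations, scales=None):
    """Repeat each phoneme for its (optionally rescaled) number of mel frames.

    `scales` maps a phoneme index to a multiplier; unlisted indices keep 1.0.
    """
    scales = scales or {}
    frames = []
    for index, (phoneme, duration) in enumerate(zip(phonemes, durations)):
        # Keep at least one frame so no phoneme disappears entirely.
        scaled = max(1, round(duration * scales.get(index, 1.0)))
        frames.extend([phoneme] * scaled)
    return frames


phonemes = ["n", "i3", "h", "ao3"]
durations = [4, 10, 3, 12]  # hypothetical predicted frame counts

baseline = length_regulate(phonemes, durations)
stretched = length_regulate(phonemes, durations, scales={3: 1.5})
print(len(baseline))   # 29
print(len(stretched))  # 35
```

In a two-stage pipeline the durations are an explicit intermediate value that can be edited before the vocoder runs, which is what makes this kind of manual adjustment straightforward there.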
Error:
@taotaoyuhust Did you modify the modeling units in text, symbols = _pause + _initials + [i + j for i in _finals for j in _tones]?
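To make the question concrete, here is a small sketch of how that symbols expression builds the id table. The four inventories below are hypothetical, heavily shortened stand-ins (the repository's real lists are not shown in this thread), and `text_to_ids` is an illustrative helper, not the project's API.

```python
# Hypothetical, shortened inventories; the real lists are longer.
_pause = ["sil", "sp"]
_initials = ["b", "p", "m"]
_finals = ["a", "i"]
_tones = ["1", "2", "3", "4", "5"]

# Same expression as quoted in the thread: each final is paired with each tone.
symbols = _pause + _initials + [i + j for i in _finals for j in _tones]

symbol_to_id = {s: idx for idx, s in enumerate(symbols)}


def text_to_ids(units):
    """Map modeling units to ids, failing loudly on units outside the table."""
    unknown = [u for u in units if u not in symbol_to_id]
    if unknown:
        raise KeyError(f"units not in symbols: {unknown}")
    return [symbol_to_id[u] for u in units]


print(len(symbols))                    # 15
print(text_to_ids(["b", "a1", "sp"]))  # [2, 5, 1]
```

Because ids are positions in this list, editing any of the inventories shifts the ids and changes `len(symbols)`, so the text embedding of a checkpoint trained with the original list would no longer match. That is one plausible cause of an error when loading the pretrained model.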
Link: https://pan.baidu.com/s/1CEgyC1R3FxXEI-5AL_Sj8g
Extraction code: cym1