[TTS] [Hackathon] Add JETS #3109

Merged: 7 commits, Apr 19, 2023

@ljhzxc (Contributor) commented Mar 28, 2023

PR types

New features

PR changes

Added the JETS model.

Describe

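Added the JETS model and a csmsc example, including documentation, dynamic-graph training and inference, dynamic-to-static graph conversion, and inference.
A problem came up during reproduction:
Paddle's ctcloss and torch's ctcloss have different interfaces.
torch's ctcloss:
paddle's ctcloss:
Although the documentation for paddle.nn.functional.ctc_loss says the input is log_probs, that does not seem to be the case. The source code calls warpctc, and warpctc takes logits as input.

In summary: torch takes log_probs as input, but paddle takes logits.
The model first needs to apply the β-Bernoulli step to log_probs, but because paddle's ctc takes logits as input, this step cannot be implemented, so the result cannot be aligned with the original model. As a consequence the automatic alignment module could not be implemented, and the final model is trained with MFA alignment results instead.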
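
The effect described above can be illustrated with a small, self-contained sketch. It does not call Paddle, torch, or warpctc; it only assumes, as this PR reports, that the CTC backend applies a softmax to whatever it receives. The `log_prior` values are hypothetical stand-ins for the prior term the model adds to log_probs. Under that assumption, plain log-probabilities pass through unchanged, while log-probabilities with a prior added are renormalized and no longer carry the intended values.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one frame of class scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


logits = [2.0, -1.0, 0.5]
log_probs = log_softmax(logits)

# Already-normalized log-probabilities are a fixed point of log_softmax,
# so an extra internal softmax leaves them unchanged.
again = log_softmax(log_probs)

# Hypothetical prior (assumption, not taken from the PR): adding it in
# log space produces values that no longer sum to 1 after exponentiation.
log_prior = [math.log(0.7), math.log(0.2), math.log(0.1)]
biased = [lp + pr for lp, pr in zip(log_probs, log_prior)]

# An internal softmax renormalizes the biased values.
renormalized = log_softmax(biased)

idempotent = all(abs(a - b) < 1e-12 for a, b in zip(log_probs, again))
prior_preserved = all(abs(a - b) < 1e-12 for a, b in zip(biased, renormalized))
print("log_softmax is idempotent:", idempotent)
print("prior-adjusted values survive an internal softmax:", prior_preserved)
```

The renormalization shifts each frame by a constant that depends on the network output, which is one way to read the mismatch with the original model reported here.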

fix #2773


CLAassistant commented Mar 28, 2023

CLA assistant check
All committers have signed the CLA.


paddle-bot bot commented Mar 28, 2023

Thanks for your contribution!


mergify bot commented Apr 10, 2023

This pull request is now in conflict :(

mergify bot added the conflicts label Apr 10, 2023
ljhzxc changed the title from "[TTS] Add JETS" to "[TTS] [Hackathon] Add JETS" Apr 10, 2023
mergify bot removed the conflicts label Apr 10, 2023
Review threads:
examples/csmsc/jets/local/inference.sh (resolved)
paddlespeech/t2s/exps/jets/preprocess.py (outdated, resolved)
examples/csmsc/jets/README.md (outdated, resolved)
paddlespeech/t2s/exps/jets/preprocess.py (resolved)
paddlespeech/t2s/exps/syn_utils.py (outdated, resolved)

yt605155624 (Collaborator) previously approved these changes Apr 18, 2023

LGTM


mergify bot commented Apr 18, 2023

This pull request is now in conflict :(

mergify bot added the conflicts label Apr 18, 2023

@zh794390558 (Collaborator) commented:

The input to ctc_loss can be the output of log_softmax; this can be verified.
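
One way to reason about this suggestion is with a toy reference implementation. The sketch below is a plain-Python CTC forward algorithm, not Paddle's or warpctc's code; `ctc_nll_with_internal_softmax` is a hypothetical wrapper that models a backend which normalizes its input itself, as this thread describes. Under that assumption, feeding raw logits and feeding log_softmax outputs give the same loss.

```python
import math

NEG_INF = float("-inf")


def logsumexp(values):
    """Stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_softmax(row):
    lse = logsumexp(row)
    return [v - lse for v in row]


def ctc_nll(log_probs, labels, blank=0):
    """CTC negative log-likelihood from per-frame log-probabilities [T][C]."""
    # Extended label sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    size = len(ext)

    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new = [NEG_INF] * size
        for s in range(size):
            cands = [alpha[s]]
            if s >= 1:
                cands.append(alpha[s - 1])
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])
            new[s] = logsumexp(cands) + log_probs[t][ext[s]]
        alpha = new

    tail = [alpha[size - 1]]
    if size > 1:
        tail.append(alpha[size - 2])
    return -logsumexp(tail)


def ctc_nll_with_internal_softmax(activations, labels, blank=0):
    """Hypothetical backend that normalizes its input before the CTC recursion."""
    return ctc_nll([log_softmax(row) for row in activations], labels, blank)


# Illustrative values only: 4 frames, 3 classes (class 0 is blank).
logits = [[0.3, 1.2, -0.5], [1.0, 0.1, 0.4], [-0.2, 0.8, 1.5], [0.6, -1.0, 0.9]]
labels = [1, 2]

from_logits = ctc_nll_with_internal_softmax(logits, labels)
from_log_probs = ctc_nll_with_internal_softmax(
    [log_softmax(row) for row in logits], labels)
print(abs(from_logits - from_log_probs) < 1e-9)
```

The equality holds because log_softmax output is already normalized, so a second normalization is a no-op. It does not hold for inputs that have been modified after log_softmax, for example by adding a prior.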

@yt605155624 yt605155624 merged commit dc56c3a into PaddlePaddle:develop Apr 19, 2023
1 check passed
ljhzxc (Contributor, Author) commented Apr 19, 2023

Because the underlying warpctc applies softmax internally, only logits can be used as input; the input described in the documentation here appears to be wrong.

Status: Done
Linked issue: [TTS] JETS -> E2E FastSpeech2 + HiFiGAN