About question of code and synthesis #6
Comments
Hi @qw1260497397, thanks for your attention. Integrating PortaSpeech into DiffGAN-TTS sounds interesting, but I don't have a clear picture of it yet. Could you please elaborate? So you added GAN training to PortaSpeech? What about the diffusion part?
|
***@***.***, most of the values returned by diffuse_trace are zeros or ones. Like this.
------------------ Original message ------------------
From: "keonlee9420/DiffGAN-TTS" ***@***.***>;
Sent: Tuesday, May 10, 2022, 11:19 PM
***@***.***>;
Cc: "Rui ***@***.******@***.***>;
Subject: Re: [keonlee9420/DiffGAN-TTS] About question of code and synthesis (Issue #6)
Besides, here is my answer to your questions:
~ means 'not' for booleans, so if you put it in front of a mask, the mask is toggled. For example, if the mask value is [True, True, False], then the value of '~mask' is [False, False, True].
So the meaning of ~ in diffuse_trace is the same. The error you report may come from a mismatch in how the masking values are used between PortaSpeech and DiffGAN-TTS; you have to sync them so that both use the same masking scheme.
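The toggling described above can be sketched with NumPy boolean arrays, which behave like boolean tensors for this purpose. This is only an illustration: `padding_mask_from_lengths` is a hypothetical helper written for this example (it is not taken from either repository), and the choice that True marks padded frames is an assumption about one possible convention.

```python
import numpy as np

def padding_mask_from_lengths(lengths, max_len=None):
    """Hypothetical helper: True marks PADDED frames (one possible convention)."""
    lengths = np.asarray(lengths)
    if max_len is None:
        max_len = int(lengths.max())
    positions = np.arange(max_len)[None, :]      # shape (1, max_len)
    return positions >= lengths[:, None]         # shape (batch, max_len)

lengths = [3, 1]
pad_mask = padding_mask_from_lengths(lengths)    # True = padding
valid_mask = ~pad_mask                           # True = real frame

mel = np.ones((2, 3, 4))                         # (batch, frames, mel bins), dummy data
masked_mel = mel * valid_mask[:, :, None]        # zero out the padded frames

print(pad_mask)
print(valid_mask)
```

If one code base returns a mask where True means padding and the other returns a mask where True means a valid frame, the same `~mask` expression keeps the opposite set of frames, which is the kind of mismatch the reply refers to.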
|
***@***.***, I see that mel_predictor in the aux model is the return value of diffuse_trace. After training for 1000 steps, most of the trace values lie in the range (-1.5, 1.5). The synthesized spectrogram looks like …
|
***@***.***, I replaced FastSpeech2 with PortaSpeech in DiffGAN-TTS. I find that after the decoder, most values of the FastSpeech2 output tensor lie in (-1, 1), while the values of the PortaSpeech output tensor lie in (-11, 11). I think the problem with the synthesized voice comes from the input (coarse_mel) of diffuse_trace. I cannot work out how to deal with it, my dear friend. Thank you very much!
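One generic way to check and align value ranges like the ones reported above is min-max rescaling. The sketch below is not the method used by either repository: `rescale_to_unit_range` is a hypothetical helper, the data is random dummy data, and the bounds -11 and 11 are simply the figures quoted in the comment. In practice the bounds would more likely come from statistics of the training data.

```python
import numpy as np

def rescale_to_unit_range(x, x_min, x_max):
    """Linearly map values from [x_min, x_max] to [-1, 1] (min-max normalization)."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

rng = np.random.default_rng(0)
# Dummy stand-in for a decoder output with shape (batch, frames, mel bins).
coarse_mel = rng.uniform(-11.0, 11.0, size=(2, 5, 4))
print("before:", coarse_mel.min(), coarse_mel.max())

# Assumed bounds, taken from the range reported in the comment.
scaled = rescale_to_unit_range(coarse_mel, x_min=-11.0, x_max=11.0)
print("after: ", scaled.min(), scaled.max())
```

Printing the minimum and maximum before and after each stage is a cheap way to see where two models start to disagree about the scale of the mel-spectrogram.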
|
***@***.***, my friend. I now get voice from the shallow module in my model. Best wishes to you, my friend!
|
Closed due to inactivity.
Hi @keonlee9420, I just replaced FastSpeech2 with PortaSpeech in the Acoustic Generator, adjusting the loss and the CWT, including energy and pitch. The diffusion part is the same as in DiffGAN-TTS. I have understood the answer you gave. Thank you very much.
In PortaSpeech, if I delete the ~ in get mask from length, the loss becomes NaN in the model. I see that I need to adjust the mask settings. I think the biggest problem is the difference in the return value of get mask from length between DiffGAN-TTS and PortaSpeech.
By the way, I would like to know the meaning of diffuse_trace and diffuse_fn. I'm trying to deal with these problems now.
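One possible way an inverted mask can produce NaN values is through a masked softmax, as used in attention layers. This is an assumption for illustration, not a confirmed diagnosis of the loss reported above. `masked_softmax` is a hypothetical helper, and the sketch assumes a convention where True marks padding and masked positions are filled with negative infinity.

```python
import numpy as np

def masked_softmax(scores, mask):
    """Softmax over the last axis; positions where mask is True are excluded."""
    filled = np.where(mask, -np.inf, scores)
    with np.errstate(invalid="ignore"):          # silence the inf - inf warning
        shifted = filled - filled.max(axis=-1, keepdims=True)
        weights = np.exp(shifted)
        return weights / weights.sum(axis=-1, keepdims=True)

scores = np.zeros((1, 3))                        # one row of attention scores
pad_mask = np.array([[False, False, False]])     # full-length sequence: no padding

ok = masked_softmax(scores, pad_mask)            # uniform weights over 3 positions
bad = masked_softmax(scores, ~pad_mask)          # every position masked out

print(ok)
print(bad)
```

With the inverted mask, a sequence without padding has every position excluded, so the softmax has nothing left to normalize over and returns NaN, which then propagates into any loss computed from it.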
|
***@***.***, I have run into some problems.
1. This is the mel from the aux model trained for 5000 steps. The audio is nothing but an electrical buzzing noise.
2. In TensorBoard, the sampled spectrogram is the same as the ground truth (GT).
3. The voice produced by the shallow model sounds like water or electrical noise.
|
Hi @keonlee9420, thank you for your suggestions these days. I have successfully integrated the PortaSpeech model on the basis of this model. I have some questions to ask you. Thank you!
If I delete the ~ in diffuse_trace, the synthesized mel is wrong and the voice sounds like water. If I keep the ~ in diffuse_trace, the mel is also wrong and the voice sounds like electrical noise.
Thank you very much!