
questions for vocoder #10

Open
LqNoob opened this issue Dec 15, 2021 · 1 comment

Comments


LqNoob commented Dec 15, 2021

Hi, @haoheliu. Thank you for your awesome work.

  1. After reading the code for the vocoder part, I found that there is only a pre-trained model and no training steps. Why is there no implementation of this part? Under what conditions was the pre-trained model obtained, and how does it perform?
  2. The vocoder in the original TFGAN paper does not include the subband discriminator (and there is no implementation of this part either). Since I did not see a relevant explanation in the paper, what benefit or impact does the subband discriminator have on the model?

If I can get an answer, it will help me a lot.
Thank you.

@haoheliu
Owner

Hi @LqNoob, I'm not sure whether you still need the answer. Many apologies for the late reply. These are good questions.

  1. The implementation of TFGAN is confidential, as it is part of ByteDance's codebase, so I cannot open-source it. If you are interested, you can refer to this repo, which has a similar implementation to ours. To achieve speaker independence, you need at least 1000+ speakers in the training dataset.
  2. We use a subband discriminator to enhance the discriminative power of the GAN. We believe this helps TFGAN achieve a better vocoding result.
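To make the idea concrete, here is a minimal, hypothetical sketch of the subband principle: the waveform is split into frequency subbands, and each band is scored separately so the discriminator can judge fine-grained spectral detail per band. This is not the TFGAN implementation (which is closed-source); the FFT-mask band split and the per-band energy statistic below are illustrative stand-ins for the real filterbank and per-band discriminator networks.

```python
import numpy as np

def split_subbands(x, n_bands=4):
    """Split a waveform into equal-width frequency subbands via FFT masking.
    (Illustrative; a real system would use a PQMF or learned filterbank.)"""
    X = np.fft.rfft(x)
    edges = np.linspace(0, len(X), n_bands + 1, dtype=int)
    bands = []
    for i in range(n_bands):
        mask = np.zeros_like(X)            # complex spectrum, all bins zeroed
        mask[edges[i]:edges[i + 1]] = X[edges[i]:edges[i + 1]]
        bands.append(np.fft.irfft(mask, n=len(x)))
    return bands                           # n_bands waveforms, each len(x)

def subband_scores(x, n_bands=4):
    """Toy per-band 'discriminator': one energy statistic per subband.
    A real GAN discriminator would be a small conv net per band instead."""
    return [float(np.mean(b ** 2)) for b in split_subbands(x, n_bands)]
```

Because the masks partition the spectrum, the subbands sum back to the original waveform, and a tone shows up as energy in exactly one band — which is why per-band scoring gives the discriminator sharper frequency resolution than a single full-band critic.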

Thanks
