After reading the vocoder code, I found that there is only a pre-trained model and no training code. Why is this part not implemented? Also, under what conditions was the pre-trained model trained, and how well does it perform?
The vocoder in the original TFGAN paper does not include a subband discriminator (and that part is not implemented here either). Since I could not find an explanation in the paper, what benefit or effect does the subband discriminator have on the model?
If I can get an answer, it will help me a lot.
Thank you.
Hi @LqNoob, I'm not sure whether you still need an answer. Apologies for the late reply. These are good questions.
The TFGAN implementation is part of ByteDance's confidential codebase, so I cannot open-source it. If you are interested, you can refer to this repo, which has an implementation similar to ours. To make the vocoder speaker-independent, the training dataset needs to include at least 1000 speakers.
We use a subband discriminator to enhance the discriminative power of the GAN. We believe this helps TFGAN achieve a better vocoding result.
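For readers unfamiliar with the term, here is a minimal sketch of what "subband" input could look like. It is not the TFGAN implementation, which is not public. The helper names `dft`, `idft`, and `split_subbands`, the equal-width band layout, and the DFT-masking approach are all assumptions chosen for illustration; real vocoders commonly use a filter bank such as PQMF instead. The idea is that the waveform is split into frequency bands, and a discriminator could then judge each band separately, so that errors in a low-energy band are not drowned out by the rest of the signal.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_subbands(signal, num_bands):
    """Split a real signal into num_bands frequency bands by masking DFT bins.

    Hypothetical helper: each returned band has the same length as the input,
    and the bands sum back to the original signal.
    """
    n = len(signal)
    spectrum = dft(signal)
    num_freqs = n // 2 + 1  # non-negative frequency bins, 0 .. n // 2
    bands = []
    for b in range(num_bands):
        lo = b * num_freqs // num_bands
        hi = (b + 1) * num_freqs // num_bands  # exclusive upper edge
        masked = [0j] * n
        for k in range(n):
            freq = min(k, n - k)  # fold negative-frequency bins onto positive ones
            if lo <= freq < hi:
                masked[k] = spectrum[k]
        bands.append(idft(masked))
    return bands


if __name__ == "__main__":
    n = 64
    # Toy "waveform": one low-frequency and one high-frequency sinusoid.
    wave = [
        math.sin(2 * math.pi * 2 * t / n) + 0.5 * math.sin(2 * math.pi * 20 * t / n)
        for t in range(n)
    ]
    subbands = split_subbands(wave, num_bands=4)
    for i, band in enumerate(subbands):
        energy = sum(v * v for v in band)
        print(f"band {i}: energy = {energy:.3f}")
```

In a GAN setup, each element of `subbands` (for both real and generated audio) would be passed to a discriminator alongside, or instead of, the full-band waveform. How TFGAN actually combines them is not stated in this thread.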
Hi, @haoheliu. Thank you for your awesome work.