Some questions #18

Closed
ZDisket opened this issue May 31, 2020 · 10 comments

@ZDisket
Collaborator

ZDisket commented May 31, 2020

  1. Are the mel outputs generated compatible with kan-bayashi's ParallelWaveGAN?
  2. There's a FastSpeech synthesis example, but none for Tacotron2. How do I generate speech with the Tacotron2 pretrained model and MelGAN-STFT?
@dathudeptrai
Collaborator

dathudeptrai commented May 31, 2020

Hi,

Are the mel outputs generated compatible with kan-bayashi's ParallelWaveGAN?

We use a different train/valid split, but the preprocessing steps are generally the same and the mean/variance statistics of our training sets are very close. So I believe it's compatible: you can take mels generated by Tacotron or FastSpeech from this repo and feed them to the pretrained models from ParallelWaveGAN. Even if it turns out not to be compatible, you can still combine them by de-normalizing my mel-spectrogram with my stats and then re-normalizing it with kan-bayashi's ParallelWaveGAN stats :)).
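A minimal sketch of the de-norm/re-norm step described above. It assumes both sets of stats are per-mel-bin mean and scale vectors and that normalization is `(x - mean) / scale`; the function name and the synthetic stats are hypothetical stand-ins, so check how each repo actually stores and applies its statistics before relying on this.

```python
import numpy as np


def renormalize_mel(mel, src_mean, src_scale, dst_mean, dst_scale):
    """Map a mel normalized with source stats onto the target stats.

    mel has shape (frames, n_mels); each stats vector has shape (n_mels,).
    Assumes normalization of the form (x - mean) / scale on both sides.
    """
    # Undo the source normalization: x = x_norm * scale + mean.
    mel_denorm = mel * src_scale + src_mean
    # Apply the target normalization: x_norm = (x - mean) / scale.
    return (mel_denorm - dst_mean) / dst_scale


# Synthetic stand-ins for the two repositories' statistics (hypothetical values).
rng = np.random.default_rng(0)
n_mels = 80
raw_mel = rng.normal(loc=-4.0, scale=2.0, size=(120, n_mels))
src_mean, src_scale = raw_mel.mean(axis=0), raw_mel.std(axis=0)
dst_mean, dst_scale = src_mean + 0.05, src_scale * 1.02

# A mel normalized the "source" way, then converted to the "target" convention.
mel_src_norm = (raw_mel - src_mean) / src_scale
mel_dst_norm = renormalize_mel(mel_src_norm, src_mean, src_scale, dst_mean, dst_scale)
```

If the two sets of stats are as close as suggested, the converted mel differs only slightly from the input, which is why skipping this step may still work in practice.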

There's a FastSpeech synthesis example, but none for Tacotron2. How do I generate speech with the Tacotron2 pretrained model and MelGAN-STFT?

To see how to run inference, look at decode_tacotron2.py or decode_melgan.py in the examples directory. MelGAN-STFT is MelGAN trained with a multi-resolution STFT loss, so its inference is the same as the original MelGAN's. I will implement an AutoModel class to run inference for all combinations in the near future :)). At least, I will provide a Google Colab soon.
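For readers unfamiliar with the multi-resolution STFT loss mentioned above, here is a rough NumPy sketch of the idea: a spectral-convergence term plus a log-magnitude term, averaged over several STFT resolutions. It is an illustration only, not this repo's training code; the function names and the three (FFT size, hop size, window length) settings are assumptions chosen for the example.

```python
import numpy as np


def stft_magnitude(signal, fft_size, hop_size, win_length):
    """Magnitude spectrogram with a Hann window, shape (frames, fft_size // 2 + 1)."""
    window = np.hanning(win_length)
    n_frames = 1 + (len(signal) - win_length) // hop_size
    frames = np.stack(
        [signal[i * hop_size : i * hop_size + win_length] * window for i in range(n_frames)]
    )
    # rfft zero-pads each frame from win_length up to fft_size.
    return np.abs(np.fft.rfft(frames, n=fft_size, axis=-1))


def stft_loss(real, fake, fft_size, hop_size, win_length, eps=1e-7):
    """Spectral convergence plus mean absolute log-magnitude difference."""
    mag_real = np.maximum(stft_magnitude(real, fft_size, hop_size, win_length), eps)
    mag_fake = np.maximum(stft_magnitude(fake, fft_size, hop_size, win_length), eps)
    # Frobenius norm of the magnitude error, relative to the target magnitude.
    spectral_convergence = np.linalg.norm(mag_real - mag_fake) / np.linalg.norm(mag_real)
    log_magnitude = np.mean(np.abs(np.log(mag_real) - np.log(mag_fake)))
    return spectral_convergence + log_magnitude


def multi_resolution_stft_loss(real, fake, resolutions):
    """Average the single-resolution loss over all (fft, hop, win) settings."""
    return float(np.mean([stft_loss(real, fake, *res) for res in resolutions]))


# Illustrative resolutions (assumed values): (fft_size, hop_size, win_length).
resolutions = [(1024, 120, 600), (2048, 240, 1200), (512, 50, 240)]

t = np.arange(16000) / 16000.0
real_wave = np.sin(2 * np.pi * 220.0 * t)
fake_wave = np.sin(2 * np.pi * 233.0 * t)

loss_same = multi_resolution_stft_loss(real_wave, real_wave, resolutions)
loss_diff = multi_resolution_stft_loss(real_wave, fake_wave, resolutions)
```

Because this loss only affects training, a generator trained with it is used at inference time exactly like any other MelGAN generator.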

@dathudeptrai dathudeptrai self-assigned this May 31, 2020
@dathudeptrai dathudeptrai added the question ❓ Further information is requested label May 31, 2020
@dathudeptrai dathudeptrai added this to In progress in MelGan May 31, 2020
@ZDisket ZDisket closed this as completed May 31, 2020
@dathudeptrai
Collaborator

@ZDisket
Collaborator Author

ZDisket commented Jun 4, 2020

@dathudeptrai
Nice. It seems pretty similar to my notebook.

@dathudeptrai
Collaborator

@ZDisket great :)). I just uploaded the 120K pretrained Tacotron model. I'm training Multi-Band MelGAN now; it will be 3x faster and improve quality compared with MelGAN-STFT :D.

@ZDisket
Collaborator Author

ZDisket commented Jun 4, 2020

@dathudeptrai Very nice, I have a lot of hope for this repo's Multi-Band MelGAN since I can't get kan-bayashi's to work. It'll be optimal for a user-friendly Windows GUI front end. I'll also retrain my Tacotron2 on the new one.

@dathudeptrai
Collaborator

Why didn't MB-MelGAN from kan-bayashi's repo work?

@ZDisket
Collaborator Author

ZDisket commented Jun 4, 2020

@dathudeptrai All my predictions had heavy metallic noise, and on another training run, once it reached the discriminator training start step, they all became pure noise.

@dathudeptrai
Collaborator

Okay, let's see how MB-MelGAN in this repo helps you. It should finish training on Saturday, I think :D.

@ZDisket
Collaborator Author

ZDisket commented Jun 5, 2020

@dathudeptrai
I can't see Tacotron2-120k in the pretrained models section here. Where is it?

@dathudeptrai
Collaborator

https://drive.google.com/drive/u/1/folders/1kaPXRdLg9gZrll9KtvH3-feOBMM8sn3_
@ZDisket

@dathudeptrai dathudeptrai moved this from In progress to Done in MelGan Jun 5, 2020