Fine tuning with custom (multilingual) data #82

Open
ukemamaster opened this issue Oct 19, 2023 · 1 comment


@ukemamaster

Hi @OlaWod, I appreciate your work.

I am trying to fine-tune the FreeVC model on my custom multilingual data, using an already trained speaker encoder model and without SR (spectrogram-resize) augmentation. After about 300k steps (with batch size 32), it gives fair conversion outputs. However, I have some questions:

  1. Unseen-to-seen and unseen-to-unseen conversions seem to have poor quality. Will adding more data to the training set improve these cases?
  2. Is it necessary to train the WavLM and HiFi-GAN models on the custom dataset, or are the pre-trained models fine to use as they are?
  3. Is it possible to train the FreeVC model with mel-spectrograms fed directly to the Bottleneck Extractor instead of SSL features (i.e., skipping the HiFi-GAN and WavLM models)? Have you tried it? Is it worth a try?
  4. Does the 24 kHz training recipe perform better than the 16 kHz one?
  5. Does SR augmentation have a big effect on performance?
  6. Can the conversion run in real time? I mean, can we convert the source audio frame by frame rather than as a whole?
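
To make question 6 concrete, here is a rough sketch of what I mean by frame-by-frame conversion. Everything in it is an assumption for illustration: `convert_chunk` is a hypothetical placeholder (an identity function here), not a FreeVC function, and the chunk and context sizes are arbitrary. It also assumes the converter returns as many samples as it receives. In practice the audio would be a NumPy array or a tensor rather than a plain list.

```python
from typing import Callable, List, Sequence


def convert_chunk(chunk: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for one model call on a short audio segment.

    This placeholder returns the input unchanged; a real converter would
    return converted audio of the same length (an assumption of this sketch).
    """
    return list(chunk)


def stream_convert(
    wav: Sequence[float],
    chunk_size: int = 16000,   # samples converted per step (arbitrary choice)
    context: int = 3200,       # past samples given to the model as history
    convert: Callable[[Sequence[float]], List[float]] = convert_chunk,
) -> List[float]:
    """Convert `wav` chunk by chunk instead of as one whole utterance."""
    out: List[float] = []
    for start in range(0, len(wav), chunk_size):
        # Prepend some already-seen audio so the model has left context.
        left = max(0, start - context)
        segment = wav[left:start + chunk_size]
        converted = convert(segment)
        # Keep only the part that corresponds to the new chunk.
        out.extend(converted[start - left:])
    return out


if __name__ == "__main__":
    audio = [float(i) for i in range(40000)]
    result = stream_convert(audio)
    print(len(result) == len(audio))
```

With the identity placeholder the output equals the input, which only checks the chunk bookkeeping. Whether a real model sounds acceptable at chunk boundaries with this little context is exactly what I am asking about.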

Any other tips that could improve the conversion quality are appreciated.

Thanks

@Xmiler

Xmiler commented Oct 21, 2023

Hi @ukemamaster,

I am new here and will follow this topic with interest. Could you please share some audio samples generated by your model, so we can hear the quality you have achieved?

Thanks in advance.

