Will this work for singing voice conversion (svc)? #28

Closed
billnye2 opened this issue Oct 21, 2023 · 2 comments

Comments

@billnye2

Great repo! Ran some tests with it and it sounds good for speech, but the limited testing I did for singing didn't sound too great. Is this expected / is there a way to adapt it to work well with singing? Perhaps switch it to use NSF-HiFiGAN as so-vits-svc does?

P.S. I especially like the zero-shot any-to-any nature of this model, not sure if there are other projects out there now for zero shot svc.

@RF5
Collaborator

RF5 commented Oct 21, 2023

Hi @billnye2, thanks for your comments :). Some thoughts:

  • Yep, we also found that kNN-VC does not do well with singing, especially for more expressive / melodic songs.
  • This is largely expected, since the two trained parts of kNN-VC (the WavLM encoder and the HiFiGAN vocoder) are both trained only on English LibriSpeech, which is fairly monotone. Both of these hurt quality when presented with singing inputs. The kNN part is quite agnostic to singing vs non-singing vs non-human sounds, so the main limitation is likely on the feature encoding and vocoding side.
  • To fix the WavLM side, I think one would need to retrain WavLM with some singing data added, so that the features it produces can better represent singing audio.
  • To fix the HiFiGAN vocoding side, using NSF-HiFiGAN would likely improve things. I imagine this would require training an NSF-HiFiGAN model to vocode WavLM features (instead of spectrograms), so that it can be used directly with kNN-VC.
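
To illustrate why the kNN part is agnostic to what the audio contains, here is a minimal sketch of a kNN matching step in NumPy. It is not the repo's actual code: the function name `knn_match`, the use of cosine similarity, and the default `k=4` are assumptions for illustration, and the arrays stand in for frame-level WavLM features. Nothing in it is learned, so it treats speech, singing, and other sounds identically and can only be as good as the features it is given.

```python
import numpy as np

def knn_match(query, matching_set, k=4):
    """Replace each query frame with the mean of its k nearest reference frames.

    query:        (n_query, dim) features of the source utterance
    matching_set: (n_ref, dim) features of the target speaker
    k:            number of neighbours to average (assumed default, must be <= n_ref)
    """
    eps = 1e-8  # guards against division by zero for all-zero frames
    q = query / (np.linalg.norm(query, axis=1, keepdims=True) + eps)
    m = matching_set / (np.linalg.norm(matching_set, axis=1, keepdims=True) + eps)

    # Cosine similarity between every query frame and every reference frame.
    sim = q @ m.T  # (n_query, n_ref)

    # Indices of the k most similar reference frames per query frame (unordered).
    idx = np.argpartition(-sim, k - 1, axis=1)[:, :k]  # (n_query, k)

    # Average the selected (unnormalised) reference frames.
    return matching_set[idx].mean(axis=1)  # (n_query, dim)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source_feats = rng.normal(size=(5, 16))   # placeholder source features
    target_feats = rng.normal(size=(50, 16))  # placeholder target-speaker features
    converted = knn_match(source_feats, target_feats)
    print(converted.shape)
```

The converted features would then go to the vocoder, which is where a model trained only on speech can struggle with singing.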

Hopefully, in the not too distant future, we will be able to generalize the performance of kNN-VC and other models. Thanks again for your interest in our work!

@billnye2
Author

Great support, thank you!
