Keeping prosodic features of reference Speaker #41
Hi Pedro, I recently read an Interspeech 2020 paper that aims to transfer the source style. I think it might be helpful for you. Regards,

Thanks @KunZhou9646. Yeah, it is definitely useful. I was trying to find a way to keep using this algorithm, and I noticed that if the fine-tuning is done with a small number of audio clips from the target speaker, it does keep some prosodic aspects of the reference, so I was wondering whether something else could be done in this code. But I'll definitely take a look at the paper you sent. Thanks,

@jucasansao Hi, did you try the algorithms and get a good transfer of the prosodic features of the reference speakers? I'm now trying to do this work the same way.

Hi, I did emotion style transfer using a pre-training & adaptation strategy with this repo. I published an Interspeech 2021 paper based on my results. You can find it here: https://arxiv.org/abs/2103.16809
Hi @jxzhanggg,
I am trying to achieve voice conversion with this algorithm while transferring prosody. That is, I want to convert a reference audio (Speaker A) to the voice of a user (Speaker B), but keep the original phone durations and pitch contour (with a different mean f0) of the reference speaker (Speaker A).
Right now I have managed to pre-train and fine-tune the model, and the voice conversion works well: the output is very similar to the target. But all the prosodic features of the reference were lost.
Do you have any idea what I may need to tweak to achieve this, even at a slight cost in audio quality? Did you ever attempt this, or do you have an idea of which parameters need to be changed?
Thanks in advance!
Pedro Sousa
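Not part of this repo's code, but for anyone landing here: the "keep the contour, change the mean f0" idea Pedro describes can be sketched independently of the model. A minimal numpy sketch (the function name `shift_f0_contour` and the target-mean parameter are my own, hypothetical naming) that moves a source pitch contour to a target speaker's register while preserving its shape:

```python
import numpy as np

def shift_f0_contour(src_f0, tgt_mean_f0):
    """Shift a source pitch contour to a target speaker's mean f0
    while preserving the contour shape (relative log-f0 movements).
    Unvoiced frames are marked with 0 and left untouched."""
    src_f0 = np.asarray(src_f0, dtype=float)
    voiced = src_f0 > 0
    out = np.zeros_like(src_f0)
    if not voiced.any():
        return out
    log_f0 = np.log(src_f0[voiced])
    # Move the mean log-f0 to the target speaker's register,
    # keeping the deviations around the mean (the contour) intact.
    shifted = log_f0 - log_f0.mean() + np.log(tgt_mean_f0)
    out[voiced] = np.exp(shifted)
    return out

# Example: a source contour around ~200 Hz mapped toward a ~120 Hz speaker
src = [210.0, 200.0, 0.0, 190.0, 205.0]
converted = shift_f0_contour(src, 120.0)
```

This only covers the pitch side; keeping the phone durations would additionally require bypassing or constraining whatever duration modeling the conversion model applies.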