
Transfer learning #223

Closed
diego-s opened this issue Jun 19, 2019 · 9 comments

Comments

@diego-s

diego-s commented Jun 19, 2019

Hi, thanks for the great implementation. Do you have any guidelines for performing transfer learning to smaller datasets? I have been starting from the pre-trained model and fitting it to a new, smaller (~500 samples) dataset, but the attention alignments become very odd and the model no longer aligns well. Are there any tricks to make this work better (lower learning rates, etc.) or to diagnose alignment problems? Thank you.
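(For anyone with the same question: the usual warm-start recipe is to load only the published weights into a fresh model, drop the learning rate well below the from-scratch value, and optionally freeze the text-side modules so only the decoder and attention adapt while alignment re-forms. Below is a minimal PyTorch sketch of that recipe; the checkpoint key, module-name prefixes, and hyperparameters are placeholders, not this repo's exact API.)

```python
import torch
from torch import nn, optim

def warm_start(model: nn.Module, checkpoint_path: str) -> nn.Module:
    # Load pre-trained weights only; deliberately skip any saved optimizer state.
    ckpt = torch.load(checkpoint_path, map_location="cpu")
    state = ckpt.get("state_dict", ckpt)        # handle wrapped or raw state dicts
    model.load_state_dict(state, strict=False)  # tolerate renamed/missing keys
    return model

def freeze_prefixes(model: nn.Module, prefixes=("embedding", "encoder")):
    # Freeze the text-side modules so only the decoder/attention adapt at first.
    for name, param in model.named_parameters():
        if name.startswith(prefixes):
            param.requires_grad = False

def finetune_optimizer(model: nn.Module, lr: float = 1e-4):
    # Use a learning rate well below the from-scratch value (often ~1e-3).
    trainable = (p for p in model.parameters() if p.requires_grad)
    return optim.Adam(trainable, lr=lr, weight_decay=1e-6)
```

A common second stage is to unfreeze the frozen modules once attention looks stable on a held-out sample.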

@Yablon

Yablon commented Jul 16, 2019

Maybe you could try some adaptation methods, like the ones Merlin uses?

@pravn

pravn commented Jul 16, 2019 via email

@Energyanalyst

Did you have any luck getting the transfer to work? I'm looking to do something similar but am unsure how to start: do I simply train the large model and then swap in the new voice samples?

@pravn

pravn commented Jul 30, 2019 via email

@Yeongtae

Yeongtae commented Jul 30, 2019

@pravn
I have read your paper, and it's great. Also, I'm sorry that I didn't reply to your email; that was my mistake, I had simply forgotten to respond.

@diego-s
Author

diego-s commented Jul 31, 2019

Thanks a lot for the replies. After some unsuccessful results with less drastic approaches, I tried something that might be unnecessarily complicated. I altered the architecture and the optimization as follows: (1) I added a second attention layer (with the 'v' parameter initialized to zero) and a second linear projection initialized to zero; their outputs are added to those of the original attention layer and linear projection. (2) I added two losses: one computed without the new layers on the old dataset, and one computed with the new layers on the new dataset. The two losses are summed over a mixed batch of old and new voice data. My hope was that this would regularize the model and find a solution that works on both datasets with minimal changes to the original weights (freezing them was not working for me). I expected it to fail completely since I had never tried anything like this, but it seems to capture the new voices and accents quite well, although I'm by no means an expert in TTS, so others might judge it less favorably. At the moment my code is quite messy and I'm quite busy, but if there is interest, after November I could block some time to clean it up and share it in a branch.
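Roughly, the zero-initialized parallel branch looks like the sketch below for the linear projection part (the attention 'v' term is handled analogously). This is a simplified illustration of the idea, not the exact code I ran, and the names are made up for the example:

```python
import torch
from torch import nn

class ZeroInitParallelLinear(nn.Module):
    """Original projection plus a second, zero-initialized projection.

    Because the new branch starts at zero, the combined module initially
    reproduces the pre-trained output exactly; the new-speaker loss then
    grows the branch away from zero only as much as it needs to.
    """
    def __init__(self, original: nn.Linear):
        super().__init__()
        self.original = original
        self.adapter = nn.Linear(original.in_features, original.out_features)
        nn.init.zeros_(self.adapter.weight)
        nn.init.zeros_(self.adapter.bias)

    def forward(self, x, use_adapter: bool = True):
        out = self.original(x)
        if use_adapter:
            out = out + self.adapter(x)
        return out

# Mixed-batch objective, schematically: the old data is scored without the
# new branch, the new data with it, and the two losses are summed.
#   loss = criterion(model(old_batch, use_adapter=False), old_target) \
#        + criterion(model(new_batch, use_adapter=True), new_target)
```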

@rafaelvalle
Contributor

@diego-s can you share the loss curves for your warm-started model?

@rafaelvalle
Contributor

Closing due to inactivity.

@tiomaldy

Does anyone know of code for transfer learning?
