> I'm not sure if I should replace all the layers I can with LoRA layers. Could you guide me on how to do it?
It's a good idea to start by replacing all the layers, so the setup is close to full fine-tuning, which I assume works. Once that's working, you can empirically determine which layers can be kept as they are. That's roughly how we decided which layers to adapt.
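To make the "replace all layers" starting point concrete, here is a minimal NumPy sketch of a LoRA-wrapped dense layer. This is not VITS-specific and not the API of any particular LoRA library; the class name, shapes, and hyperparameter defaults are all illustrative. The key property is that `B` is zero-initialized, so every adapted layer starts out numerically identical to the frozen model:

```python
import numpy as np

class LoRALinear:
    """Frozen dense weight W plus a trainable low-rank update B @ A.

    Effective weight: W + (alpha / r) * B @ A, with A of shape (r, in_dim)
    and B of shape (out_dim, r). During fine-tuning only A and B would be
    trained; W stays frozen. Illustrative sketch, not a library API.
    """

    def __init__(self, weight, r=8, alpha=16, rng=None):
        rng = rng or np.random.default_rng(0)
        self.weight = weight                                # frozen, (out_dim, in_dim)
        out_dim, in_dim = weight.shape
        self.A = rng.normal(scale=0.01, size=(r, in_dim))   # trainable, small random init
        self.B = np.zeros((out_dim, r))                     # trainable, zero init
        self.scale = alpha / r

    def __call__(self, x):
        # x: (batch, in_dim) -> (batch, out_dim)
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T                  # low-rank path
        return base + self.scale * update
```

Because `B` starts at zero, swapping every eligible layer for a wrapper like this leaves the model's outputs unchanged at step 0, and you can later disable adapters layer by layer to find which ones actually matter.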
Hi,
I've been trying to apply LoRA to the VITS model (hence the pull request for the conv1d). It turns out that using LoRA only for the text encoder transformer isn't enough, and I'm not sure whether I should replace all the layers I can with LoRA layers. Could you guide me on how to do it?
The repo is here
https://github.com/nivibilla/efficient-vits-finetuning
Thanks!
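Since the conv1d case comes up here, one common way to extend LoRA beyond dense layers is to flatten the conv kernel into a matrix and factor the update there. The helper below is a hedged sketch of that idea in NumPy; the function name and defaults are hypothetical and it is not taken from the linked repo:

```python
import numpy as np

def lora_conv1d_delta(out_ch, in_ch, k, r=4, alpha=8, rng=None):
    """Low-rank update for a Conv1d kernel of shape (out_ch, in_ch, k).

    The kernel is viewed as an (out_ch, in_ch * k) matrix, the update is
    factored as B @ A, then reshaped back to kernel shape. Hypothetical
    helper for illustration, not any library's actual API.
    """
    rng = rng or np.random.default_rng(0)
    A = rng.normal(scale=0.01, size=(r, in_ch * k))  # trainable factor
    B = np.zeros((out_ch, r))                        # trainable factor, zero init
    delta = (alpha / r) * (B @ A)                    # (out_ch, in_ch * k)
    return delta.reshape(out_ch, in_ch, k)
```

The returned delta can be added to the frozen kernel at forward time; as with the dense case, the zero-initialized `B` means the adapted conv starts out equal to the original.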