Hi everyone. I've been testing out mergekit recently, and I'm amazed by it. While investigating the performance discrepancy between LoRA and full finetuning, I came across this paper, which proposes DoRA. Would it be possible for mergekit to implement DoRA extraction alongside LoRA extraction?
@Tengal-Teemo DoRA is beneficial during training because it more closely resembles the updates made during full finetuning (see Figure 2 of the paper); in other words, it is more likely than LoRA to converge at the same rate as a full FT. However, I don't think it would be useful in a post-training context like LoRA extraction, since all the weight updates and convergence have already happened. Closing this for now, as DoRA extraction is not a planned feature.
Happy to keep the discussion going however if anyone disagrees with the above points.
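To make the point above concrete, here is a small NumPy sketch (the matrices and variable names are illustrative, not mergekit's actual extraction code). LoRA extraction approximates the weight delta between a finetuned and base model with a truncated SVD. A DoRA-style decomposition applied post hoc just splits the already-converged weights into a magnitude vector and a unit-norm direction matrix, which reproduces them exactly and adds no new information:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical base and finetuned weight matrices for one layer.
W_base = rng.standard_normal((16, 16))
W_ft = W_base + 0.05 * rng.standard_normal((16, 16))

# LoRA extraction: rank-r SVD approximation of the weight delta.
delta = W_ft - W_base
U, S, Vt = np.linalg.svd(delta, full_matrices=False)
r = 4
B = U[:, :r] * S[:r]   # (out_dim, r)
A = Vt[:r, :]          # (r, in_dim)
W_lora = W_base + B @ A  # approximate reconstruction of W_ft

# DoRA-style reparameterization of the *final* weights:
# W = m * (V / ||V||_c), with m the per-column magnitudes.
m = np.linalg.norm(W_ft, axis=0, keepdims=True)
V_dir = W_ft / m
W_dora = m * V_dir

# Post hoc, the decomposition reproduces W_ft exactly -- the
# magnitude/direction split only helps while training is still updating them.
print(np.allclose(W_dora, W_ft))  # True
```

During training, DoRA learns the magnitude and direction components separately, which is where its benefit over LoRA comes from; applied after convergence, as above, the split is a no-op reparameterization.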