Do you think it’s just as good (or better) to fine-tune Mixtral directly, or to take a Mistral 7B fine-tuned for vision, build a MoE from it with a tool like mergoo (please take a look at mergoo, because SSHH12 + mergoo could be a life changer), and then fine-tune the resulting MoE?
Hm, my guess would be that merging after training the modality projector wouldn't work (at least not out of the box with this library, just because of all the custom torch modules that get strapped onto the model). However, it should definitely be doable to take an existing merge and add the modality to it by adding that HF architecture, as I mentioned.
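For reference, here is a rough sketch of what the mergoo side of that workflow could look like, i.e. composing a Mistral-based MoE out of expert checkpoints before trying to attach the modality projector. The config keys follow mergoo's README, but the expert names and model IDs are purely illustrative assumptions, so check them against the current mergoo docs before relying on this.

```python
# Rough sketch only: composing a Mistral-based MoE with mergoo.
# Config keys follow mergoo's README; the expert model IDs below are
# hypothetical placeholders, not real checkpoints.
import torch
from mergoo.compose_experts import ComposeExperts

config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "experts": [
        # Swap in your own fine-tunes, e.g. a Mistral-7B tuned for the
        # vision-instruction data you plan to use.
        {"expert_name": "base_expert", "model_id": "mistralai/Mistral-7B-v0.1"},
        {"expert_name": "vision_expert", "model_id": "your-org/mistral-7b-vision-ft"},
    ],
    # Layers that receive per-expert copies plus a learned router.
    "router_layers": ["gate_proj", "up_proj", "down_proj"],
}

merger = ComposeExperts(config, torch_dtype=torch.float16)
merger.compose()
merger.save_checkpoint("data/mistral_moe_checkpoint")
```

From there the merged checkpoint would still need this library's modality projector and custom modules strapped onto it (by registering the corresponding HF architecture), which is the part that probably doesn't survive a naive merge out of the box.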
Hi, I just want to know if somebody has successfully trained a Mixtral-style model like the 8x7B? When I try, the output is random (unreadable).
Thanks!