We now support pretraining, supervised fine-tuning (SFT), and post-training (SAPO).
For post-training RL, we support both explicit and non-explicit thinking.
All datasets have also been migrated to an organization on Hugging Face.
We have added several new models, including the Base, Patch, and Conv. ELF variants. We also include a LLaVA-based model that uses a more complex MLP connector instead of a simple linear projection.
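To illustrate the difference between the two connector styles mentioned above, here is a minimal, dependency-free sketch. It is not this repository's implementation: the function names, the layer sizes, and the choice of a two-layer MLP with a GELU activation are all assumptions made for illustration, and the actual connector may use a different depth, activation, or framework.

```python
import math
import random

def linear(x, weight, bias):
    """Apply y = W x + b to one feature vector (weight is out_dim x in_dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def gelu(v):
    """Exact GELU using the Gaussian error function."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def init_layer(in_dim, out_dim, rng):
    """Small random weights and zero biases; purely illustrative."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias

def linear_connector(features, layer):
    """Simple connector: one linear projection per encoder token."""
    w, b = layer
    return [linear(tok, w, b) for tok in features]

def mlp_connector(features, layer1, layer2):
    """MLP connector (assumed shape): linear -> GELU -> linear per encoder token."""
    w1, b1 = layer1
    w2, b2 = layer2
    out = []
    for tok in features:
        hidden = [gelu(h) for h in linear(tok, w1, b1)]
        out.append(linear(hidden, w2, b2))
    return out

rng = random.Random(0)
ENC_DIM, LLM_DIM, NUM_TOKENS = 4, 6, 3  # hypothetical sizes, not from the repo
features = [[rng.uniform(-1, 1) for _ in range(ENC_DIM)] for _ in range(NUM_TOKENS)]

proj = init_layer(ENC_DIM, LLM_DIM, rng)   # linear connector weights
fc1 = init_layer(ENC_DIM, LLM_DIM, rng)    # MLP connector, first layer
fc2 = init_layer(LLM_DIM, LLM_DIM, rng)    # MLP connector, second layer

linear_out = linear_connector(features, proj)
mlp_out = mlp_connector(features, fc1, fc2)

# Both connectors map NUM_TOKENS encoder vectors into the LLM embedding width.
print(len(linear_out), len(linear_out[0]))
print(len(mlp_out), len(mlp_out[0]))
```

Both connectors produce one vector of width `LLM_DIM` per encoder token, so either can feed the language model's embedding space. The MLP version adds a nonlinearity between two projections, which lets it learn a mapping that a single linear layer cannot express.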