You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
BTM training, a new algorithm to train an ELMFOREST, which contains many EXPERT LMs that can be added and removed, ensembled, or parameter averaged at any time
ETA: 2023Q2
source https://twitter.com/jay_wooow/status/1556589324879085568?s=21&t=0zZfExIRj-ddccFHQ07yvA
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
abs: https://arxiv.org/abs/2208.03306
BTM training, a new algorithm to train an ELMFOREST, which contains many EXPERT LMs that can be added and removed, ensembled, or parameter averaged at any time
code https://github.com/hadasah/btm (model released!)
Stages
The text was updated successfully, but these errors were encountered: