v0.15.0 #744
ValerianRey announced in Announcements
🪺 SDMGrad, DWA, FAMO
This release introduces a new weighting, SDMGradWeighting, and two new scalarizers, DWA and FAMO. Thanks a lot to @KhusPatel4450 and @ppraneth for the contributions!
We're trying to grow the community and build even more features! To participate, you can join the Discord community!
Changelog
Added
SDMGradWeighting from Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms (NeurIPS 2023). It is a stateful Weighting that solves for task weights via a simplex-projected inner loop on a cross-batch matrix A = J_1 @ J_2.T (computed from two independent mini-batches using autojac.jac), with a direction-oriented regularizer pulling the descent direction toward a preference direction.
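As a rough illustration of the SDMGradWeighting entry above, the following sketch runs a simplex-projected gradient loop on a small task-by-task matrix A. It is not TorchJD's implementation: the function names (`project_to_simplex`, `inner_loop_weights`), the parameters (`lam`, `step_size`, `n_steps`) and the plain-list representation are all hypothetical, and the update rule follows one reading of the paper, in which the gradient of the regularized objective with respect to the weights is A @ (w + lam * preference).

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based method)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        # The last index satisfying this condition determines the threshold.
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def inner_loop_weights(A, preference, lam=0.3, step_size=0.1, n_steps=50):
    """Projected gradient descent on the task weights (hypothetical helper).

    A is an m x m matrix (list of rows), standing in for J_1 @ J_2.T.
    preference is a weight vector describing the preferred direction.
    """
    m = len(A)
    w = [1.0 / m] * m  # start from uniform weights
    for _ in range(n_steps):
        # The regularizer shifts the weights toward the preference vector.
        shifted = [w_i + lam * p_i for w_i, p_i in zip(w, preference)]
        grad = [sum(A[i][j] * shifted[j] for j in range(m)) for i in range(m)]
        stepped = [w_i - step_size * g_i for w_i, g_i in zip(w, grad)]
        w = project_to_simplex(stepped)
    return w


# Toy example: two tasks with orthogonal gradients of squared norms 1 and 4.
A = [[1.0, 0.0], [0.0, 4.0]]
weights = inner_loop_weights(A, preference=[0.5, 0.5], lam=0.0)
print(weights)
```

With `lam=0.0` this reduces to finding the minimum-norm convex combination of the task gradients, so the task with the larger gradient norm receives the smaller weight. A positive `lam` pulls the solution toward the preference vector.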
DWA (Dynamic Weight Average) from End-to-End Multi-Task Learning with Attention (CVPR 2019), a stateful Scalarizer that weights each value by the relative rate at which its loss decreased over the two previous epochs. It has no learnable parameters; call its step() method once per epoch to roll the loss history.
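To make the DWA weighting rule above concrete, here is a minimal sketch of the formula from the paper: each task's weight is a softmax over the ratio of its losses from the two previous epochs, scaled so that the weights sum to the number of tasks. The function name `dwa_weights` and the `temperature` parameter (set to 2.0 as in the paper) are assumptions for illustration; TorchJD's DWA class may expose different names and defaults.

```python
import math


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Return one weight per task from the losses of the two previous epochs."""
    # A ratio close to (or above) 1 means the loss barely decreased.
    ratios = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]
    scaled = [r / temperature for r in ratios]
    # Softmax, shifted by the maximum for numerical stability.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    n_tasks = len(ratios)
    # Multiply by the number of tasks so that the weights sum to n_tasks.
    return [n_tasks * e / total for e in exps]


# Task 0 halved its loss between the two epochs; task 1 barely improved.
weights = dwa_weights(prev_losses=[0.5, 0.9], prev_prev_losses=[1.0, 1.0])

# Scalarize the current losses with these weights.
current_losses = [0.45, 0.88]
scalarized = sum(w * l for w, l in zip(weights, current_losses))
print(weights, scalarized)
```

The slower-improving task gets the larger weight, which is why the loss history has to be rolled forward once per epoch.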
FAMO (Fast Adaptive Multitask Optimization) from FAMO: Fast Adaptive Multitask Optimization (NeurIPS 2023), a stateful Scalarizer that decreases all task losses at an approximately equal rate using only the loss values. It learns the task weights internally; after the model step, call its update() method with the losses recomputed on the same batch to adjust them.
This discussion was created from the release v0.15.0.
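As a rough illustration of the FAMO entry in the changelog above, the sketch below shows the two pieces that work from loss values alone: the scalarized loss built from softmax task weights, and the update of the weight logits from the losses before and after the model step. It follows one reading of the paper, not TorchJD's code. The names `famo_scalarize` and `famo_update_logits`, the `min_losses` lower bounds, and the learning rate are hypothetical, and the plain gradient step here stands in for the Adam optimizer with weight decay used in the reference implementation.

```python
import math


def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def famo_scalarize(losses, logits, min_losses, eps=1e-8):
    """Weighted sum of log-losses, normalized by c = sum_i z_i / (loss_i - min_i)."""
    z = softmax(logits)
    gaps = [l - b + eps for l, b in zip(losses, min_losses)]
    # In an autograd setting, c would be treated as a constant (detached).
    c = sum(z_i / g_i for z_i, g_i in zip(z, gaps))
    return sum(z_i * math.log(g_i) for z_i, g_i in zip(z, gaps)) / c


def famo_update_logits(logits, losses_before, losses_after, min_losses,
                       lr=0.025, eps=1e-8):
    """One plain gradient step on the logits (assumption: no Adam, no decay)."""
    z = softmax(logits)
    # Per-task improvement in log-loss achieved by the model step.
    delta = [
        math.log(before - low + eps) - math.log(after - low + eps)
        for before, after, low in zip(losses_before, losses_after, min_losses)
    ]
    # Vector-Jacobian product of softmax with delta.
    weighted_mean = sum(z_i * d_i for z_i, d_i in zip(z, delta))
    grad = [z_i * (d_i - weighted_mean) for z_i, d_i in zip(z, delta)]
    return [x - lr * g for x, g in zip(logits, grad)]


logits = [0.0, 0.0]
floor = [0.0, 0.0]
before = [1.0, 1.0]   # losses on the batch before the model step
after = [0.5, 0.9]    # losses recomputed on the same batch after the step
logits = famo_update_logits(logits, before, after, floor)
print(softmax(logits))
```

The task that improved less ends up with the larger weight, which is what pushes the losses toward decreasing at a similar rate. This is also why the losses have to be recomputed on the same batch: the update compares like with like.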