v0.15.0

@ValerianRey released this 15 Jun 21:31 · 2 commits to main since this release · 7a39365

🪺 SDMGrad, DWA, FAMO

This release introduces a new weighting, SDMGradWeighting, and two new scalarizers, DWA and FAMO. Thanks a lot to @KhusPatel4450 and @ppraneth for the contributions!

We're trying to grow the community and build even more features! To participate, you can join the Discord community!

Changelog

Added

  • Added SDMGradWeighting from "Direction-oriented Multi-objective Learning: Simple and Provable
    Stochastic Algorithms" (NeurIPS 2023). It is a stateful Weighting that solves for task weights
    via a simplex-projected inner loop on a cross-batch matrix A = J_1 @ J_2.T (computed from two
    independent mini-batches using autojac.jac), with a direction-oriented regularizer pulling the
    descent direction toward a preference direction.
  • Added DWA (Dynamic Weight Average) from "End-to-End Multi-Task Learning with Attention"
    (CVPR 2019), a stateful Scalarizer that weights each value by the relative rate at which its
    loss decreased over the two previous epochs. It has no learnable parameters; call its step()
    method once per epoch to roll the loss history.
  • Added FAMO (Fast Adaptive Multitask Optimization) from "FAMO: Fast Adaptive Multitask
    Optimization" (NeurIPS 2023), a stateful Scalarizer that decreases all task losses at an
    approximately equal rate using only the loss values. It learns the task weights internally;
    after the model step, call its update() method with the losses recomputed on the same batch to
    adjust them.
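
To make the DWA entry above more concrete, here is a minimal pure-Python sketch of the weighting rule from the paper: each task's weight is a softmax (with temperature T) over the ratio of its loss in the two previous epochs, scaled so the weights sum to the number of tasks. `ToyDWA`, its attributes, and the choice to average batch losses per epoch are assumptions made for illustration; this is not the library's DWA class and may not match its signatures.

```python
import math


class ToyDWA:
    """Hypothetical stand-in for illustration only, NOT the library's DWA class."""

    def __init__(self, n_tasks, temperature=2.0):
        self.n_tasks = n_tasks
        self.temperature = temperature  # T = 2 is the value used in the paper
        self.history = []  # mean losses of the last two finished epochs
        self.current = []  # per-batch losses seen during the ongoing epoch

    def weights(self):
        # Until two full epochs are recorded, fall back to uniform weights.
        if len(self.history) < 2:
            return [1.0] * self.n_tasks
        older, newer = self.history
        # Ratio L_k(t-1) / L_k(t-2): a smaller ratio means a faster decrease.
        ratios = [n / o for n, o in zip(newer, older)]
        exps = [math.exp(r / self.temperature) for r in ratios]
        total = sum(exps)
        # Softmax scaled by the number of tasks, so the weights sum to n_tasks.
        return [self.n_tasks * e / total for e in exps]

    def __call__(self, losses):
        # Record the losses and return their weighted sum (the scalarized loss).
        self.current.append(list(losses))
        return sum(w * l for w, l in zip(self.weights(), losses))

    def step(self):
        # Call once per epoch: average this epoch's losses and roll the history.
        if not self.current:
            return
        n = len(self.current)
        means = [sum(b[k] for b in self.current) / n for k in range(self.n_tasks)]
        self.history = (self.history + [means])[-2:]
        self.current = []
```

In a training loop, such an object would be called on the per-task losses of every batch and `step()` would be called at the end of each epoch. Tasks whose loss decreased more slowly over the two previous epochs receive a larger weight. The sketch assumes strictly positive losses, since it divides by the older epoch's mean loss.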