v0.12.0

@ValerianRey ValerianRey released this 28 May 13:38
25b44f0

🥥 Scalarization, FairGrad


🌟 We're very happy to announce that TorchJD has just been accepted to the PyTorch ecosystem!

We thank all contributors for making this possible.


This release introduces a new package, torchjd.scalarization, with simple baselines that combine losses into a single scalar and serve as a point of comparison for Jacobian-based methods. Thanks a lot to @ppraneth for making this happen!

The long-term plan is to add many non-trivial scalarization methods from the literature, so that TorchJD can be used whenever you have multiple losses, even if you don't need Jacobian-based methods. We'd be very happy to welcome new contributors to help us develop them. See this issue for more information, and feel free to join our Discord server to start discussing with us!

This release also introduces FairGrad, makes even more dependencies optional (torchjd now depends only on torch by default), and removes GeneralizedWeighting. See the changelog below for details.

Changelog

Added

  • Added a new torchjd.scalarization package providing the abstract Scalarizer base class and
    the concrete implementations Constant, Mean, Random, and Sum. These baselines simply
    combine losses into a scalar that can be optimized with a standard backward pass, making them
    useful for comparison with JD-based methods.
  • Added FairGrad and FairGradWeighting from Fair Resource Allocation in Multi-Task
    Learning.
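
As a rough illustration of what the Sum and Mean baselines amount to, the sketch below uses plain PyTorch rather than the torchjd.scalarization API, whose call signature is not shown in these notes. The helper names scalarize_sum and scalarize_mean, the toy model, and the data are all hypothetical.

```python
import torch

# Hypothetical stand-ins for the Sum and Mean baselines (NOT the torchjd API).
def scalarize_sum(losses: torch.Tensor) -> torch.Tensor:
    return losses.sum()

def scalarize_mean(losses: torch.Tensor) -> torch.Tensor:
    return losses.mean()

torch.manual_seed(0)
model = torch.nn.Linear(3, 2)
inputs = torch.randn(8, 3)
targets = torch.randn(8, 2)

def compute_losses() -> torch.Tensor:
    # One mean-squared-error loss per output dimension: shape [2]
    return ((model(inputs) - targets) ** 2).mean(dim=0)

grads = {}
for name, scalarize in [("sum", scalarize_sum), ("mean", scalarize_mean)]:
    model.zero_grad()
    scalar = scalarize(compute_losses())  # 0-dimensional tensor
    scalar.backward()                     # standard backward pass
    grads[name] = model.weight.grad.clone()
```

With two losses, the gradient obtained from the sum is twice the one obtained from the mean, so these two baselines differ only by a scaling of the update.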

Changed

  • BREAKING: Removed numpy, quadprog and qpsolvers from the main dependencies of torchjd,
    which now has torch as its only main dependency. This makes the base version of torchjd
    (installed with pip install torchjd) much lighter, but it means that users of UPGrad and
    DualProj now have to install the new optional dependency group quadprog_projector explicitly
    (with e.g. pip install "torchjd[quadprog_projector]").
  • BREAKING: Entirely removed the concept of generalized Gramians. The Engine.compute_gramian
    method now always returns a square matrix of shape [m, m], where m is the total number of
    elements of the output tensor (treating all dimensions uniformly). Previously, an output of
    shape [m1, m2] would yield a 4D generalized Gramian of shape [m1, m2, m2, m1]; it now
    yields a [m1 * m2, m1 * m2] matrix.
    This also removes GeneralizedWeighting and Flattening.
    To update, replace Flattening(weighting) with a standard Weighting and reshape the resulting
    weight vector yourself:
    # Before
    from torchjd.aggregation import Flattening, UPGradWeighting
    weighting = Flattening(UPGradWeighting())
    gramian = engine.compute_gramian(losses)  # shape: [m1, m2, m2, m1]
    weights = weighting(gramian)              # shape: [m1, m2]
    losses.backward(weights)
    
    # After
    from torchjd.aggregation import UPGradWeighting
    weighting = UPGradWeighting()
    gramian = engine.compute_gramian(losses)           # shape: [m1 * m2, m1 * m2]
    weights = weighting(gramian).reshape(losses.shape) # shape: [m1, m2]
    losses.backward(weights)
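
To see why reshaping the weight vector back to losses.shape is enough, here is a minimal sketch in plain PyTorch that does not use torchjd. It assumes the Gramian is the product of the flattened Jacobian with its transpose. The linear toy problem (matrix A, vector params) and the softmax weighting are hypothetical placeholders, not what Engine or UPGradWeighting compute.

```python
import torch

torch.manual_seed(0)
m1, m2, n = 2, 3, 5

params = torch.randn(n, requires_grad=True)
# For this linear toy problem, A is exactly the Jacobian of the flattened
# losses with respect to params: shape [m1 * m2, n].
A = torch.randn(m1 * m2, n)
losses = (A @ params).reshape(m1, m2)  # shape: [m1, m2]

# Flattened Gramian, assumed here to be J @ J.T: shape [m1 * m2, m1 * m2]
gramian = A @ A.T

# Placeholder weighting (hypothetical): any map from the Gramian to a
# vector of m1 * m2 weights would do for this illustration.
weights_flat = torch.softmax(-gramian.diagonal(), dim=0)  # shape: [m1 * m2]
weights = weights_flat.reshape(losses.shape)              # shape: [m1, m2]

losses.backward(weights)

# Same combination computed directly from the flattened Jacobian
expected = A.T @ weights_flat
```

Because reshape uses row-major order in both directions, weight i * m2 + j of the flat vector lines up with losses[i, j], and params.grad matches the weighted combination of the Jacobian rows.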