v0.12.0
🥥 Scalarization, FairGrad
🌟 We're very happy to announce that TorchJD has just been accepted to the PyTorch ecosystem!
We thank all contributors for making this possible (most recent contributors listed first):
- @ppraneth
- @KhusPatel4450
- @rkhosrowshahi
- @DJKorchinski
- @mattbuot
- @raeudigerRaeffi
- @PierreQuinton
- @ValerianRey
This release introduces a new package, torchjd.scalarization, with simple baselines that combine losses into a single scalar, for comparison against Jacobian-based methods. Thanks a lot to @ppraneth for making this happen!
The long-term plan is to add many non-trivial scalarization methods from the literature, so that TorchJD can be used whenever you have multiple losses, even if you don't need Jacobian-based methods. We'd be very happy to welcome new contributors to help us develop them. See this issue for more information, and feel free to join our Discord server to start discussing with us!
This release also introduces FairGrad, makes even more dependencies optional (torchjd now only depends on torch by default), and removes GeneralizedWeightings. See the changelog below for details.
Changelog
Added
- Added a new torchjd.scalarization package providing the abstract Scalarizer base class and the concrete implementations Constant, Mean, Random, and Sum. These baselines simply combine losses into a scalar that can be optimized with a standard backward pass, making them useful for comparison with JD-based methods.
- Added FairGrad and FairGradWeighting from Fair Resource Allocation in Multi-Task Learning.
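
To make the idea of a scalarization baseline concrete, here is a minimal sketch in plain PyTorch of what "combine losses into a scalar and run a standard backward pass" means for the mean case. It deliberately does not use the torchjd.scalarization API (whose exact call signatures are not shown in these notes); the toy parameters and losses are assumptions chosen only for illustration.

```python
import torch

# Toy setup (hypothetical values): one parameter vector shared by two losses.
params = torch.tensor([1.0, 2.0], requires_grad=True)
loss1 = (params ** 2).sum()          # 1 + 4 = 5
loss2 = ((params - 3.0) ** 2).sum()  # 4 + 1 = 5

# Stack the individual losses into a single 1D tensor.
losses = torch.stack([loss1, loss2])

# Mean scalarization: average the losses, then do one ordinary backward pass.
scalar_loss = losses.mean()
scalar_loss.backward()

# params.grad is now the average of the two per-loss gradients.
print(scalar_loss.item(), params.grad)
```

Since only one scalar is differentiated, no Jacobian is ever formed, which is what makes such baselines cheap points of comparison for Jacobian-based methods.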
Changed
- BREAKING: Removed numpy, quadprog and qpsolvers from the main dependencies of torchjd (which now only has torch as its main dependency). This makes the base version of torchjd (installed with pip install torchjd) much lighter, but it means that users of UPGrad and DualProj now have to install the new optional dependency group quadprog_projector explicitly (with e.g. pip install "torchjd[quadprog_projector]").
- BREAKING: Removed entirely the concept of generalized Gramians. The Engine.compute_gramian method now always returns a square matrix of shape [m, m], where m is the total number of elements of the output tensor (treating all dimensions uniformly). Previously, an output of shape [m1, m2] would return a 4D generalized Gramian of shape [m1, m2, m2, m1]; it now returns a [m1 * m2, m1 * m2] matrix.
  This also removes GeneralizedWeighting and Flattening.
  To update, replace Flattening(weighting) with a standard Weighting and reshape the resulting weight vector yourself:

      # Before
      from torchjd.aggregation import Flattening, UPGradWeighting
      weighting = Flattening(UPGradWeighting())
      gramian = engine.compute_gramian(losses)  # shape: [m1, m2, m2, m1]
      weights = weighting(gramian)  # shape: [m1, m2]
      losses.backward(weights)

      # After
      from torchjd.aggregation import UPGradWeighting
      weighting = UPGradWeighting()
      gramian = engine.compute_gramian(losses)  # shape: [m1 * m2, m1 * m2]
      weights = weighting(gramian).reshape(losses.shape)  # shape: [m1, m2]
      losses.backward(weights)
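
The shape change can be illustrated with a small plain-PyTorch sketch. This is not how Engine.compute_gramian is implemented; it only assumes that the Gramian is the product of the flattened Jacobian with its transpose, and it uses a random stand-in Jacobian and a uniform placeholder weight vector instead of a real Weighting.

```python
import torch

# Hypothetical sizes: an output of shape [m1, m2] and n parameters.
m1, m2, n = 2, 3, 4
torch.manual_seed(0)
jacobian = torch.randn(m1, m2, n)  # stand-in for the Jacobian of the output

# Treat all output dimensions uniformly: one row per output element.
flat_jacobian = jacobian.reshape(m1 * m2, n)

# Square Gramian of shape [m1 * m2, m1 * m2].
gramian = flat_jacobian @ flat_jacobian.T

# A weighting would map this Gramian to m1 * m2 weights; a uniform vector
# is used here as a placeholder.
weights = torch.full((m1 * m2,), 1.0 / (m1 * m2))

# Reshape back to the output's shape, as in the "After" snippet.
weights_2d = weights.reshape(m1, m2)

print(gramian.shape, weights_2d.shape)
```

With row-major flattening, entry [i * m2 + j, k * m2 + l] of the square Gramian is the inner product between the gradients of output elements [i, j] and [k, l], so reshaping the weight vector to [m1, m2] puts each weight back on the element it belongs to.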