v0.10.0 #650
ValerianRey announced in Announcements
💉 Stateful Aggregators, Gradient Vaccine and Grouping
TorchJD has so far been limited to stateless aggregators, but we're now opening up the implementation of stateful methods too.
This release introduces the Stateful mixin for aggregators that retain state across aggregations. As a first example, it adds the GradVac aggregator, from Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models. Note that because of its implementation, NashMTL is also considered stateful.
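To give an idea of what GradVac does, here is a minimal sketch of its core alignment step in plain numpy. This is not TorchJD's implementation: the `gradvac_align` helper and the fixed `phi_target` are illustrative assumptions (in the actual method, the target similarity is an exponential moving average of observed per-task-pair similarities, which is exactly the state a stateful aggregator must retain).

```python
import numpy as np

def gradvac_align(g_i, g_j, phi_target):
    """Rotate g_i toward g_j so that their cosine similarity reaches
    phi_target whenever it currently falls below that target.
    Uses the closed-form coefficient from the Gradient Vaccine paper."""
    phi = g_i @ g_j / (np.linalg.norm(g_i) * np.linalg.norm(g_j))
    if phi >= phi_target:
        return g_i  # already sufficiently aligned; leave untouched
    a2 = (np.linalg.norm(g_i)
          * (phi_target * np.sqrt(1 - phi**2) - phi * np.sqrt(1 - phi_target**2))
          / (np.linalg.norm(g_j) * np.sqrt(1 - phi_target**2)))
    return g_i + a2 * g_j

# Orthogonal task gradients (cosine similarity 0), target similarity 0.5:
g = gradvac_align(np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.5)
cos = g @ np.array([0.0, 1.0]) / np.linalg.norm(g)
# cos is now 0.5, the target similarity.
```

One can verify algebraically that the `a2` coefficient makes the post-update cosine similarity equal to `phi_target` exactly, which is why the example recovers 0.5.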
This release also introduces a new Grouping usage example, explaining how to partition the Jacobian into several groups of parameters and aggregate each group independently.
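The idea behind grouping can be sketched as follows. This is a simplified stand-in, not TorchJD's Grouping API: the `aggregate_by_group` helper and the mean aggregator are hypothetical, shown only to illustrate splitting the Jacobian's columns by parameter group and aggregating each block on its own.

```python
import numpy as np

def aggregate_by_group(jacobian, groups, aggregator):
    """Split the Jacobian's columns (parameters) into groups, aggregate
    each block independently, and concatenate the per-group results
    back into a single update vector."""
    update = np.empty(jacobian.shape[1])
    for cols in groups:
        update[cols] = aggregator(jacobian[:, cols])
    return update

# 2 tasks, 5 parameters, partitioned into two groups.
J = np.arange(10.0).reshape(2, 5)            # rows: tasks, columns: parameters
groups = [[0, 1, 2], [3, 4]]
mean_agg = lambda block: block.mean(axis=0)  # simple stand-in aggregator
u = aggregate_by_group(J, groups, mean_agg)
# u == [2.5, 3.5, 4.5, 5.5, 6.5], the column-wise means of each block.
```

In practice each group would use a proper conflict-averse aggregator rather than a plain mean; the point is only that aggregation is applied per block of columns.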
Lastly, we have updated the contribution guidelines to allow aggregators that depend on the loss values, or on any information other than the Jacobian. We hope this will let us support a few more aggregators soon. Feel free to open issues or pull requests if you have ideas in mind!
Many thanks to @rkhosrowshahi, @PierreQuinton, and @ValerianRey for the contributions!
Changelog
Added
GradVac and GradVacWeighting from Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models.
Fixed
Case where NashMTL fails (which can happen, for example, on the matrix [[0., 0.], [0., 1.]]).
This discussion was created from the release v0.10.0.