
Comparison to torchmetrics #82

Closed

rsokl opened this issue Nov 1, 2022 · 6 comments

Comments

rsokl commented Nov 1, 2022

Hello! torcheval looks great!

I'd be interested to know how torcheval compares to torchmetrics. Are there certain shortcomings in torchmetrics that torcheval hopes to address? Any other insights into what inspired the creation of torcheval might help users understand what makes this project unique 😄

yongen9696 commented Nov 3, 2022

what makes this project unique

I wonder as well; torchmetrics is quite mature and already offers a fairly complete set of metric tools.

ninginthecloud (Contributor) commented Nov 3, 2022

Hi @rsokl and @yongen9696, thanks for the great question!

Kudos to the community

Support for metrics and evaluation has been a long-running request from the PyTorch community. First, we would like to give kudos to scikit-learn metrics, Keras Metrics, Ignite Metrics, and TorchMetrics as existing projects in the ML community that have inspired TorchEval. In particular, we have discussed these design points on multiple occasions with the developers of TorchMetrics.

What makes TorchEval unique?

Philosophy for TorchEval

TorchEval is a library that enables easy and performant model evaluation for PyTorch. The library’s philosophy is to provide minimal interfaces bolstered by a robust toolkit, alongside a rich collection of performant, out-of-the-box implementations. Critically, we prioritize the following axes:

  • No surprises in behavior
  • Fast by default
  • Easily extensible
  • Works naturally in distributed applications

Components in TorchEval

Interface clarity

  1. Class-based metrics in TorchEval offer only update, compute, reset, and merge_state methods, which makes it obvious to callers which states are used for computing results. There is only one way to get results from a class-based metric, so there is no risk of inadvertent usage that slows down performance (see the sketch after this list).
  2. In the base Metric interface, TorchEval does not wrap the update() or compute() methods implemented by callers.
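
To make the first point concrete, here is a minimal sketch of a metric restricted to those four methods. The MeanMetric class and its internals below are illustrative assumptions, not TorchEval's actual base class or any of its shipped implementations:

```python
from typing import List

import torch


class MeanMetric:
    """Illustrative class-based metric exposing only update, compute, reset, and merge_state."""

    def __init__(self) -> None:
        self.reset()

    def update(self, values: torch.Tensor) -> "MeanMetric":
        # Accumulate local state only; no synchronization happens here.
        self.weighted_sum += values.sum()
        self.num_samples += values.numel()
        return self

    def compute(self) -> torch.Tensor:
        # The single way to read a result; uses whatever state has been accumulated so far.
        return self.weighted_sum / self.num_samples

    def reset(self) -> "MeanMetric":
        # Return the metric to its initial state.
        self.weighted_sum = torch.tensor(0.0)
        self.num_samples = torch.tensor(0.0)
        return self

    def merge_state(self, metrics: List["MeanMetric"]) -> "MeanMetric":
        # Fold in states from other metric objects (e.g. metrics gathered from other ranks).
        for other in metrics:
            self.weighted_sum += other.weighted_sum
            self.num_samples += other.num_samples
        return self
```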

Metric synchronization in distributed applications

  1. Metric synchronization is supported through the toolkit, not on the base interface.
  2. Metric synchronization does not change metric states in-place, which means users don’t need to worry about undefined transitions (e.g. calling update() after sync()) or rewinding to previous states.
  3. Metric synchronization must be explicitly opted into by users. This makes it easy for callers to distinguish between results on a particular rank vs global results. There is no default synchronization on step, which has a significant performance overhead in distributed applications.
  4. Explicit: Synchronizing states in TorchEval operates on the whole metric object, not per state. Specifically, TorchEval requires users to implement a merge_state() method that defines how states are combined, which avoids making assumptions about the state objects being synchronized (see the sketch after this list).
  5. Performance: When synchronizing a metric, the TorchEval toolkit runs only one collective per metric. In the near future, we will augment the toolkit with the ability to synchronize a collection of metrics, further reducing communication overhead.
  6. Extensible: The toolkit for metric synchronization today covers the typical SPMD use case, but can be extended to cover peer-to-peer use cases (e.g. via torch.distributed.rpc) without changing the base interface.
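
As an illustration of points 2–4 (opt-in synchronization combined via merge_state), the sketch below contrasts a purely local compute() with toolkit-driven synchronization. It assumes a distributed process group has already been initialized, and it assumes sync_and_compute is the relevant toolkit entry point; treat the exact names as an approximation rather than a guarantee:

```python
import torch

from torcheval.metrics import MulticlassAccuracy
from torcheval.metrics.toolkit import sync_and_compute

# Assumes a torch.distributed process group has already been initialized
# (e.g. launched via torchrun). Data below is dummy data for illustration.
metric = MulticlassAccuracy()

for _ in range(10):
    preds = torch.randn(32, 5)            # dummy logits: batch of 32, 5 classes
    targets = torch.randint(5, (32,))     # dummy labels
    metric.update(preds, targets)         # purely local: no collectives, no hidden sync

local_result = metric.compute()           # accuracy over this rank's data only
global_result = sync_and_compute(metric)  # explicit opt-in: one collective per metric,
                                          # states combined via merge_state; the local
                                          # metric object is left unchanged
```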

Performance

  1. We believe the biggest performance benefits come from the explicit interfaces offered. In addition to the points listed above on synchronization, the out-of-the-box implementations are optimized with:
    1. Vectorization (example; see the toy sketch after this list)
    2. JIT scripting (example)
    3. Binned metrics (example)
  2. Looking forward, TorchEval is exploring integrating custom kernels and/or Triton integrations to further accelerate computation.
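
As a toy illustration of the vectorization point above (not TorchEval's actual code), here are two ways an update() step might compute confusion counts; the second replaces a per-sample Python loop with a single bincount:

```python
import torch


def confusion_counts_loop(preds: torch.Tensor, targets: torch.Tensor, num_classes: int) -> torch.Tensor:
    # Naive per-sample Python loop: correct, but slow for large batches.
    counts = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for p, t in zip(preds.tolist(), targets.tolist()):
        counts[t, p] += 1
    return counts


def confusion_counts_vectorized(preds: torch.Tensor, targets: torch.Tensor, num_classes: int) -> torch.Tensor:
    # Same result from one bincount over flattened (target, pred) index pairs.
    flat = targets * num_classes + preds
    return torch.bincount(flat, minlength=num_classes * num_classes).reshape(num_classes, num_classes)
```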

Beyond Metrics

TorchEval also includes evaluation tools beyond metrics, such as FLOP counting and module summarization utilities.
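
For example, a module summary might be obtained roughly as follows; get_module_summary is assumed to be the relevant tools entry point, so check the current docs for the exact API:

```python
import torch

# Assumed entry point; the exact import path or name may differ between releases.
from torcheval.tools import get_module_summary

model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)

# Summarize the module tree (parameter counts, sizes, etc.) without a training loop.
print(get_module_summary(model))
```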

We are open to your feedback about what else you'd find helpful in this library!

cc: @ananthsub @bobakfb @JKSenthil

ananthsub (Contributor) commented Nov 4, 2022

I think @ninginthecloud's reply summarizes the difference very well, so I'll close out this issue. @rsokl, please let us know if you have further questions about this though!

rsokl (Author) commented Nov 4, 2022

Thank you! This response was very useful.

rsokl (Author) commented Nov 4, 2022

(given the engagement on this thread, you might consider pinning it in your issues section so that other inquiring users can find it easily 😄)

williamFalcon commented Nov 5, 2022

Hi! William here from Lightning. The Lightning team led the development of torchmetrics. There was a period when @ananthsub was a close member of the torchmetrics team; the impression we were under was that he was contributing back to the Lightning TorchMetrics OSS effort, but it seems we have since diverged.

We developed TorchMetrics for the larger community (beyond Lightning), and it has become a de facto standard across the PyTorch community.

We valued API stability when Meta started engaging, to the point where we went back and forth on design decisions that didn’t bring crystal clear value, but that would break people’s code and not benefit the broad PyTorch community.

Meta pushed for changes that our team had championed but decided not to go ahead with, then decided to start their own very similar project, and is now very actively working to have projects adopt their solution. We don’t think that is fair: it fragments the community, and there is nothing we couldn’t fundamentally fix.

This mostly just fragments the ecosystem… The “differences” are so minor that one of our engineers will just address them in the next week…

I’m sure that eval is a good attempt at metrics, and you can be the judge of what you prefer to use, @rsokl. What I can say is that we have a whole company dedicated to making sure our software is the best in the world, and we are committed to providing first-class support and integrating feedback into torchmetrics. We’ve been working on this for years and have deep in-house expertise that you are leveraging through torchmetrics, not to mention a massive contributor ecosystem.

Thanks for the thorough comparison! We will be taking this feedback into consideration as we prepare for our next release.

cheers!
