Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quantity.decompose() is slower than Quantity.to() #1641

Open
abhisrkckl opened this issue Sep 8, 2023 · 6 comments
Open

Quantity.decompose() is slower than Quantity.to() #1641

abhisrkckl opened this issue Sep 8, 2023 · 6 comments

Comments

@abhisrkckl
Copy link
Contributor

abhisrkckl commented Sep 8, 2023

A while ago, I did a profiling run on PINT, and found that the performance issues of PINT arises from a large number of factors. One of the culprits was the frequent use of Quantity.decompose() function and Quantity.to() function with a string argument.

Here is a simple comparison:

%timeit (dmu*DMconst).decompose()

65.2 µs ± 798 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit (dmu*DMconst).to("Hz")

50.5 µs ± 495 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit (dmu*DMconst).to(u.Hz)

46 µs ± 274 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In this case, decompose() is slower by 40%.

I think we should change decompose() to to() wherever possible.

@dlakaplan
Copy link
Contributor

What about .to(...).value vs. .to_value(..)? Is there any difference?

@scottransom
Copy link
Member

Optimizations with regards to units has come up multiple times in the past. This is a very good page to take a look at:
https://docs.astropy.org/en/stable/units/index.html#astropy-units-performance
Note especially how useful it is to define composite units once and also to apply them using the << operator rather than with a multiplication. Those cause multiple orders of magnitude speed improvements.

@abhisrkckl
Copy link
Contributor Author

abhisrkckl commented Sep 8, 2023

%timeit (dmu*DMconst).to_value(u.Hz)

43.6 µs ± 616 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit (dmu*DMconst).to_value("Hz")

47.4 µs ± 924 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit (dmu*DMconst).to(u.Hz).value

50.1 µs ± 2.71 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit (dmu*DMconst).to("Hz").value

52.9 µs ± 971 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

About 20% difference between best and worst.

@dlakaplan
Copy link
Contributor

So maybe the issue is that new code doesn't obey the optimizations that were already decided on. I don't see recommendations in the coding style doc.

@abhisrkckl
Copy link
Contributor Author

So maybe the issue is that new code doesn't obey the optimizations that were already decided on.

I think this is the case. The various Unit conversion conventions are used liberally in the code. This is the case even in the case of core functionality like the TimingModel class.

@abhisrkckl
Copy link
Contributor Author

Note especially how useful it is to define composite units once and also to apply them using the << operator rather than with a multiplication. Those cause multiple orders of magnitude speed improvements.

I tried this. There is significant difference in performance.

This was referenced Sep 23, 2023
@abhisrkckl abhisrkckl linked a pull request Sep 25, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants