-
-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid lroundf using, overall 24% speedup for any images #551
Conversation
@farindk any updates? |
@farindk any thoughts? |
By the way, this is a regression and was introduced in 1.9.0: import pyheif
print('# LIBHEIF_VERSION', pyheif.libheif_version())
data = open('/imgs/flat.norot.heic', 'rb').read()
%time for _ in range(10): i = pyheif.read(data)
data = open('/imgs/flat.heic', 'rb').read()
%time for _ in range(10): i = pyheif.read(data)
# LIBHEIF_VERSION 1.7.0
CPU times: user 3.76 s, sys: 596 ms, total: 4.36 s
Wall time: 1.72 s
CPU times: user 5.45 s, sys: 566 ms, total: 6.02 s
Wall time: 3.46 s
# LIBHEIF_VERSION 1.8.0
CPU times: user 3.91 s, sys: 601 ms, total: 4.51 s
Wall time: 1.8 s
CPU times: user 5.67 s, sys: 601 ms, total: 6.27 s
Wall time: 3.55 s
# LIBHEIF_VERSION 1.9.0
CPU times: user 5.4 s, sys: 579 ms, total: 5.98 s
Wall time: 2.19 s
CPU times: user 6.98 s, sys: 629 ms, total: 7.61 s
Wall time: 3.86 s
# LIBHEIF_VERSION 1.10.0
CPU times: user 5.32 s, sys: 594 ms, total: 5.91 s
Wall time: 2.13 s
CPU times: user 7.01 s, sys: 569 ms, total: 7.58 s
Wall time: 3.87 s
# LIBHEIF_VERSION 1.11.0
CPU times: user 5.26 s, sys: 588 ms, total: 5.85 s
Wall time: 2.16 s
CPU times: user 7.08 s, sys: 593 ms, total: 7.67 s
Wall time: 3.95 s
# LIBHEIF_VERSION 1.12.0
CPU times: user 5.17 s, sys: 583 ms, total: 5.75 s
Wall time: 2.12 s
CPU times: user 6.84 s, sys: 591 ms, total: 7.43 s
Wall time: 3.89 s |
I used Thus, I am fine to replace it with a custom, inlined rounding function when it's guaranteed that all values are non-negative. |
e9004e9
to
499a0a3
Compare
Should be ok IMHO since one is clipping the incorrectly rounded negative results to 0 in this particular case. A comment in the code about this might be welcome though. |
Thanks. Finally merged :-) |
If you look to
perf report
, this only function uses a lot of CPU while it is totally avoidable.master:
native-rounding:
I'm using
taskset 0x1
for more consistent benchmarks results.