On path with a known exact float, extract the double with a fast macro. #21072

rhettinger · 2020-06-23T12:10:04Z

We're already testing for an exact float, so take advantage of that information and extract the double with the fast macro.
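As a rough illustration of the idea, the dispatch can be modelled in Python. This is only a sketch of the control flow, not the C implementation: `floor_model` and `Tagged` are hypothetical names, `type(x) is float` stands in for the exact-float check, and using the value directly stands in for the fast macro.

```python
from fractions import Fraction


def floor_model(x):
    """Hypothetical Python model of the dispatch described above."""
    if type(x) is float:
        # Exact float: nothing can be overridden, so use the value directly.
        value = x
    else:
        # Subclasses and other types may define their own __floor__.
        method = getattr(type(x), "__floor__", None)
        if method is not None:
            return method(x)
        # Otherwise fall back to the general (slower) conversion.
        value = float(x)
    # Round toward negative infinity, then build the int result.
    return int(value // 1)


class Tagged(float):
    """Float subclass with its own __floor__, so it must skip the fast path."""

    def __floor__(self):
        return "custom"


if __name__ == "__main__":
    print(floor_model(3.14))            # exact float: fast path
    print(floor_model(Fraction(7, 2)))  # non-float: uses Fraction.__floor__
    print(floor_model(Tagged(1.5)))     # subclass: uses Tagged.__floor__
```

The point of the sketch is that once the type is known to be exactly `float`, no further checks or conversions are needed before taking the floor.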

Baseline timings:

$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=3.14' 'floor(x)'
10000000 loops, best of 11: 38.5 nsec per loop
$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=0.0' 'floor(x)'
10000000 loops, best of 11: 38.3 nsec per loop
$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=-3.14E32' 'floor(x)'
5000000 loops, best of 11: 69.3 nsec per loop
$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=-323452345.14' 'floor(x)'
5000000 loops, best of 11: 53.4 nsec per loop

Timings with the patch:

$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=3.14' 'floor(x)'
10000000 loops, best of 11: 36.5 nsec per loop
$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=0.0' 'floor(x)'
10000000 loops, best of 11: 36.5 nsec per loop
$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=-3.14E32' 'floor(x)'
5000000 loops, best of 11: 64.4 nsec per loop
$ ./python.exe -m timeit -r 11 -s 'from math import floor' -s 'x=-323452345.14' 'floor(x)'
5000000 loops, best of 11: 47 nsec per loop

While the timings all show improvements, I don't understand why the timings for floor() also depend on the magnitude of the inputs.
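For anyone who wants to repeat the comparison from a script rather than the command line, here is a minimal sketch using the standard `timeit` module. The helper name `time_floor` and the loop counts are assumptions chosen for illustration; absolute numbers will differ by machine and build.

```python
import timeit


def time_floor(values, repeat=5, number=200_000):
    """Return {value: best nanoseconds per call} for math.floor(value)."""
    results = {}
    for value in values:
        timer = timeit.Timer(
            "floor(x)",
            setup="from math import floor",
            globals={"x": value},
        )
        # Take the best of several runs, as `timeit -r` does.
        best = min(timer.repeat(repeat=repeat, number=number))
        results[value] = best / number * 1e9
    return results


if __name__ == "__main__":
    inputs = [3.14, 0.0, -3.14e32, -323452345.14]
    for value, nsec in time_floor(inputs).items():
        print(f"{value!r:>16}: {nsec:6.1f} nsec per loop")
```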

tim-one

I don't know why, but Python 3 changed math.floor() to return an int instead of a float - so the larger the absolute value, the more time it takes to create an ever-larger int object. So it's not surprising that the time depends on the magnitude of the argument. I suppose PyLong_FromDouble() could be micro-optimized to exploit that, eventually, the trailing bits of the potentially giant int must all be 0.

>>> math.floor(3.14e32)
314000000000000005680822245916672
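A small sketch of the effect described above: the int returned by `math.floor()` needs more bits as the magnitude grows, and because a double carries only 53 significant bits, everything below them in a large result is zero. The helper `trailing_zero_bits` is a hypothetical name introduced here for illustration.

```python
import math


def trailing_zero_bits(n):
    """Count the zero bits below the lowest set bit of a nonzero int."""
    return (n & -n).bit_length() - 1


if __name__ == "__main__":
    for x in (3.14, 323452345.14, 3.14e32, 3.14e300):
        n = math.floor(x)
        print(
            f"{x!r:>16}: bit_length={n.bit_length():4d} "
            f"trailing_zero_bits={trailing_zero_bits(n):4d}"
        )
```

For inputs at or above 2**53, the trailing zero count is at least `bit_length - 53`, which is the property a micro-optimized `PyLong_FromDouble()` could exploit.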

miss-islington · 2020-06-23T18:45:28Z

Thanks @rhettinger for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

miss-islington · 2020-06-23T18:45:32Z

Sorry @rhettinger, I had trouble checking out the 3.9 backport branch.
Please backport using cherry_picker on command line.
cherry_picker 930f4518aea7f3f0f914ce93c3fb92831a7e1d2a 3.9

miss-islington · 2020-06-23T23:37:37Z

Thanks @rhettinger for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

…o. (pythonGH-21072) (cherry picked from commit 930f451) Co-authored-by: Raymond Hettinger <rhettinger@users.noreply.github.com>

bedevere-bot · 2020-06-23T23:37:51Z

GH-21102 is a backport of this pull request to the 3.9 branch.

miss-islington · 2020-09-04T22:39:45Z

Thanks @rhettinger for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

bedevere-bot · 2020-09-04T22:39:54Z

GH-22108 is a backport of this pull request to the 3.9 branch.

This matches a similar optimisation done for math.floor in python#21072

Before:
```
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=3.14' 'ceil(x)'
20000000 loops, best of 11: 13.3 nsec per loop
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=0.0' 'ceil(x)'
20000000 loops, best of 11: 13.3 nsec per loop
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=-3.14E32' 'ceil(x)'
10000000 loops, best of 11: 35.3 nsec per loop
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=-323452345.14' 'ceil(x)'
10000000 loops, best of 11: 21.8 nsec per loop
```

After:
```
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=3.14' 'ceil(x)'
20000000 loops, best of 11: 11.8 nsec per loop
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=0.0' 'ceil(x)'
20000000 loops, best of 11: 11.7 nsec per loop
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=-3.14E32' 'ceil(x)'
10000000 loops, best of 11: 32.7 nsec per loop
λ ./python.exe -m timeit -r 11 -s 'from math import ceil' -s 'x=-323452345.14' 'ceil(x)'
10000000 loops, best of 11: 20.1 nsec per loop
```

On path with known exact float, extract the double with the fast macro.

6c23636

rhettinger added the performance (Performance or resource usage), skip issue, skip news, and needs backport to 3.9 (only security fixes) labels Jun 23, 2020

rhettinger requested a review from tim-one June 23, 2020 12:10

the-knights-who-say-ni added the CLA signed label Jun 23, 2020

bedevere-bot added the awaiting core review label Jun 23, 2020

tim-one approved these changes Jun 23, 2020

bedevere-bot added awaiting merge and removed awaiting core review labels Jun 23, 2020

rhettinger merged commit 930f451 into python:master Jun 23, 2020

bedevere-bot removed the awaiting merge label Jun 23, 2020

miss-islington assigned rhettinger Jun 23, 2020

rhettinger added and removed the needs backport to 3.9 (only security fixes) label Jun 23, 2020

bedevere-bot removed the needs backport to 3.9 (only security fixes) label Jun 23, 2020

fasih pushed a commit to fasih/cpython that referenced this pull request Jun 29, 2020

On path with known exact float, extract the double with the fast macr…

321eab5

…o. (pythonGH-21072)

Mariatta added the needs backport to 3.9 (only security fixes) label Sep 4, 2020

bedevere-bot removed the needs backport to 3.9 (only security fixes) label Sep 4, 2020

miss-islington added a commit that referenced this pull request Sep 4, 2020

On path with known exact float, extract the double with the fast macr…

242eac1

…o. (GH-21072) (cherry picked from commit 930f451) Co-authored-by: Raymond Hettinger <rhettinger@users.noreply.github.com>

hauntsaninja mentioned this pull request Sep 2, 2023

gh-102837: few coverage nitpicks for the math module #102523

Merged

hauntsaninja mentioned this pull request Sep 2, 2023

gh-110489: Optimise math.ceil for known exact float #108801

Merged

hauntsaninja mentioned this pull request Oct 6, 2023

Optimise math.ceil for known exact float #110489

Closed

hauntsaninja added a commit that referenced this pull request Oct 6, 2023

gh-110489: Optimise math.ceil for known exact float (#108801)

f013b47

This matches a similar optimisation done for math.floor in #21072

Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024

pythongh-110489: Optimise math.ceil for known exact float (python#108801

f02d2a7

) This matches a similar optimisation done for math.floor in python#21072
