micro-optimization of PyLong_FromSize_t() #81983
Currently PyLong_FromSize_t() uses PyLong_FromLong() for values < PyLong_BASE. This is suboptimal because PyLong_FromLong() has to handle the sign, which a size_t value never has. Removing the PyLong_FromLong() call and handling small ints directly in PyLong_FromSize_t() makes it faster:
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 18.7 ns +- 0.3 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 16.7 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 18.7 ns +- 0.3 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 16.7 ns +- 0.1 ns: 1.12x faster (-10%)
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2**10).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 26.2 ns +- 0.0 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 25.0 ns +- 0.7 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 26.2 ns +- 0.0 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 25.0 ns +- 0.7 ns: 1.05x faster (-5%)
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2**30).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 25.6 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 25.6 ns +- 0.0 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 25.6 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 25.6 ns +- 0.0 ns: 1.00x faster (-0%)
This change makes PyLong_FromSize_t() consistently faster than PyLong_FromSsize_t(). So it might make sense to replace PyLong_FromSsize_t() with PyLong_FromSize_t() in __length_hint__() implementations and other similar cases. For example:
$ python -m perf timeit -s "_len = iter(bytes(2)).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 19.4 ns +- 0.3 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 17.3 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 19.4 ns +- 0.3 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 17.3 ns +- 0.1 ns: 1.12x faster (-11%)
$ python -m perf timeit -s "_len = iter(bytes(2**10)).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 26.3 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 25.3 ns +- 0.2 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 26.3 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 25.3 ns +- 0.2 ns: 1.04x faster (-4%)
$ python -m perf timeit -s "_len = iter(bytes(2**30)).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=10000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 27.6 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 26.0 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 27.6 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 26.0 ns +- 0.1 ns: 1.06x faster (-6%)
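For readers without the perf module installed, the same calls can be timed roughly with the standard library. This is only a sketch: `time_length_hint` and `cases` are hypothetical names introduced here, stdlib `timeit` is far less rigorous than `perf timeit`, and the `bytes(2**30)` case is left out because it allocates 1 GiB. The numbers it prints are not comparable with the results above.

```python
import timeit
from itertools import repeat


def time_length_hint(make_iter, number=200_000, rounds=5):
    """Return the best per-call time, in nanoseconds, of it.__length_hint__()."""
    # Bind the method once, as the -s setup string in the commands above does.
    _len = make_iter().__length_hint__
    best = min(timeit.repeat(_len, number=number, repeat=rounds))
    return best / number * 1e9


# Hypothetical case table mirroring the iterators benchmarked above.
cases = {
    "repeat(None, 2)": lambda: repeat(None, 2),
    "repeat(None, 2**10)": lambda: repeat(None, 2**10),
    "repeat(None, 2**30)": lambda: repeat(None, 2**30),
    "iter(bytes(2))": lambda: iter(bytes(2)),
    "iter(bytes(2**10))": lambda: iter(bytes(2**10)),
}

if __name__ == "__main__":
    for label, make_iter in cases.items():
        print(f"{label}: {time_length_hint(make_iter):.1f} ns per call")
```

Running the script under each interpreter build in turn gives a rough before/after comparison.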
The previous benchmark results were obtained with a non-LTO build. Here are the results for an LTO build:
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 0).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=1000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 14.9 ns +- 0.2 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 13.1 ns +- 0.5 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 14.9 ns +- 0.2 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 13.1 ns +- 0.5 ns: 1.13x faster (-12%)
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2**10).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=1000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 22.1 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 20.9 ns +- 0.4 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 22.1 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 20.9 ns +- 0.4 ns: 1.05x faster (-5%)
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2**30).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=1000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 23.3 ns +- 0.0 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 21.6 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 23.3 ns +- 0.0 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 21.6 ns +- 0.1 ns: 1.08x faster (-8%)
$ python -m perf timeit -s "from itertools import repeat; _len = repeat(None, 2**60).__length_hint__" "_len()" --compare-to=../cpython-master/venv/bin/python --duplicate=1000
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 24.4 ns +- 0.1 ns
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 22.7 ns +- 0.1 ns
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 24.4 ns +- 0.1 ns -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 22.7 ns +- 0.1 ns: 1.08x faster (-7%)
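The benchmark sizes appear to have been picked to fall on different sides of PyLong_BASE, though the thread does not say so explicitly. The following sketch shows how to check how many internal digits each size needs on a given build. It assumes CPython's `sys.int_info`; `digit_count` is a hypothetical helper, not part of any CPython API.

```python
import sys

# Number of value bits stored in each internal digit of a Python int
# (PyLong_BASE is 2**bits_per_digit on this build).
BITS = sys.int_info.bits_per_digit


def digit_count(n):
    """Return how many base-2**BITS digits a positive int n needs."""
    if n <= 0:
        raise ValueError("only positive values are handled in this sketch")
    return (n.bit_length() + BITS - 1) // BITS


if __name__ == "__main__":
    print(f"PyLong_BASE on this build: 2**{BITS}")
    for exponent in (1, 10, 30, 60):
        print(f"2**{exponent}: {digit_count(2**exponent)} digit(s)")
```

Values that fit in a single digit are the ones below PyLong_BASE, which is the range the original code routed through PyLong_FromLong().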