-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
_PyDict_NewPresized() creates too small dict #72917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
_PyDict_NewPresized(6) creates dict which keysize is 8. This internal API is used by BUILD_MAP and BUILD_CONST_KEYMAP. |
This patch includes fix for ESTIMATE_SIZE macro. (see below) >>> def estimate_size(n):
... return n * 3 // 2 # Current ESTIMATE_SIZE
...
>>> def usable(n):
... return n * 2 // 3
...
>>> def keysize(minsize):
... size = 8
... while size < minsize: # Current implementation uses <=
... size *= 2
... return size
...
>>> def check():
... for i in range(1000):
... estimate = estimate_size(i)
... size = keysize(estimate)
... cap = usable(size)
... if cap < i:
... print(i, estimate, size, cap)
...
>>> check()
11 16 16 10
43 64 64 42
171 256 256 170
683 1024 1024 682
>>> #
>>> estimate_size = lambda n: (n * 3 +1) // 2 # Fixed version
>>> check()
>>> |
current: patched: |
FYI if you rename Python binaries, you can use: ./python-patched perf timeit --compare-to=./python-orig |
Nice speedup. Can you please compare Python 3.5? Better if you compile it I'm curious if 925 ns is slower than py3.5 or if 726 ns is faster than |
The condition in the loop in _PyDict_NewPresized() contains the test newsize > 0. This is a check for integer overflow. But it doesn't make much sense. First, the overflow is undefined behavior, and it is too late to detect it when it already is happen. Second, after detecting the negative value just is passed to new_keys_object() which either is crashed in debug build or makes other integer overflow and creates invalid object. I would add a runtime check that minused is less than PY_SSIZE_MAX/3 (or more strong PY_SSIZE_MAX/3*2/sizeof(Pobject *)). This would guarantee that integer overflow is not possible. The test "newsize > 0" could be removed. There is similar code in dictresize(). |
inada-n@test1:~/work/py36$ ~/local/py36/bin/patched -m perf timeit --compare-to ~/local/py36/bin/master -- 'd = {"a":1, "b":2, "c":3, "d":4, "e":5, "f":6}' Median +- std dev: [master] 910 ns +- 49 ns -> [patched] 713 ns +- 26 ns: 1.28x faster inada-n@test1:~/work/py36$ ~/local/py36/bin/patched -m perf timeit --compare-to ~/local/py35/bin/python3.5 -- 'd = {"a":1, "b":2, "c":3, "d":4, "e":5, "f":6}' Median +- std dev: [python3.5] 1.17 us +- 0.06 us -> [patched] 720 ns +- 24 ns: 1.62x faster |
Python 3.5 has similar issue: Median +- std dev: [master] 1.15 us +- 0.09 us -> [patched] 885 ns +- 37 ns: 1.30x faster patch:
diff -r 20f62e4a9c2f Objects/dictobject.c
--- a/Objects/dictobject.c Wed Nov 16 16:32:22 2016 -0800
+++ b/Objects/dictobject.c Fri Nov 18 12:56:38 2016 +0000
@@ -1015,8 +1015,9 @@ PyObject *
{
Py_ssize_t newsize;
PyDictKeysObject *new_keys;
+ minused = (minused * 3 + 1) / 2;
for (newsize = PyDict_MINSIZE_COMBINED;
- newsize <= minused && newsize > 0;
+ newsize < minused && newsize > 0;
newsize <<= 1)
;
new_keys = new_keys_object(newsize); |
Ok, it's an optimization, not a performance regression of Python 3.6. In this case, I suggest to first push the change to Python 3.7, and after 3.6.0 release, backport to 3.6 (if you want to), and maybe even 3.5 if it makes sense (and if you want). I didn't review the patch. |
On Fri, Nov 18, 2016 at 9:31 PM, Serhiy Storchaka
Nice idea. I'll update patch in bpo-28147. In case of _PyDict_NewPresized(minused), it would be called from 3rd #define MAX_INITSIZE (128 * 1024)
if (minused > USABLE_FRACTION(MAX_INITSIZE)) {
newsize = MAX_INITSIZE;
}
else {
newsize = PyDict_MINSIZE;
whilie (newsize < minused)
newsize <<= 1; // Can't we assume *= 2 is optimized?
}; |
Even if the use case is limited (creation of small dictionaries), it's a nice enhancement, good job Naoki! It's nice to see someone looking at this very important Python type! 1.28x faster or 1.62x faster is significant for a micro-optimization! (My personal threshold is at least 10% faster for a micro-optimization.) |
I don't know what is the benefit of this, but this change looks correct. The benefit of preresizing is vanishingly small for very large dicts. |
This optimization affects only 6-element (and perhaps to less extend 11-element) dicts. But in any case this is good enhancement. Naoki, could you please make microbenchmarks for 5- and 7-element dicts with your next patch? |
Can I assume PY_SSIZE_T_MAX is 2**n-1? #define PyDict_MAXSIZE (PY_SSIZE_T/8+1) |
I think so. Add an assertion just to be sure :-) |
This affects when minused = (n*2/3)+1 .. n-1 (n=8, 16, ...) n=8: 6, 7
n=16: 11, 12, ..15
+ /home/inada-n/local/py36/bin/patched -m perf timeit --compare-to /home/inada-n/local/py36/bin/master -- 'd = {"a":1, "b":2, "c":3, "d":4, "e":5}' Median +- std dev: [master] 568 ns +- 18 ns -> [patched] 567 ns +- 23 ns: 1.00x faster Median +- std dev: [master] 1.01 us +- 0.04 us -> [patched] 756 ns +- 20 ns: 1.33x faster Median +- std dev: [master] 1.12 us +- 0.03 us -> [patched] 867 ns +- 31 ns: 1.29x faster Median +- std dev: [master] 1.04 us +- 0.03 us -> [patched] 1.03 us +- 0.04 us: 1.00x faster Median +- std dev: [master] 1.16 us +- 0.04 us -> [patched] 1.16 us +- 0.04 us: 1.00x faster Median +- std dev: [master] 1.67 us +- 0.06 us -> [patched] 1.39 us +- 0.04 us: 1.20x faster |
A variant in msg281118 looks better to me than 28731-PyDict_NewPresized-too-small-2.patch. It doesn't look good to me that newsize is set to arbitrary small value if estimated size is larger than very large PyDict_MAXSIZE (this is 2**28 on 32-bit and 2**60 on 64-bit). |
OK. I've updated the patch. BTW, I used const Py_ssize_t for function local constant. |
I think function scope constant is good. Similar code in dictresize() should have different bound. The patch LGTM. |
Thanks. |
New changeset b708b3190ecb by INADA Naoki in branch 'default': |
New changeset d03562dcbb82 by INADA Naoki in branch '3.6': |
Misc/NEWS
so that it is managed by towncrier #552Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: