PEP 509: Add ma_version to PyDictObject #70246
Attached patch implements the following PEP, currently discussed on python-ideas:
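For context, the core of the change is a new version field on the dict object. A minimal sketch of the struct shape, following the PEP's description (simplified from Include/dictobject.h; the exact integer type varied across patch revisions, and the field was later renamed ma_version_tag, as noted further down in this thread):

/* Sketch: CPython's dict object with the new version field added.
   Simplified; mirrors the layout shown in the PEP, not the patch itself. */
typedef struct {
    PyObject_HEAD
    Py_ssize_t ma_used;         /* number of items in the dictionary */
    PY_UINT64_T ma_version;     /* incremented on every mutation
                                   (__setitem__, __delitem__, clear(), ...);
                                   lookups leave it unchanged */
    PyDictKeysObject *ma_keys;
    PyObject **ma_values;       /* NULL for combined tables */
} PyDictObject;

The point of the field is that a cache can snapshot the version once and later detect any mutation with a single integer comparison, without walking the dict.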
Result of pybench on Dict microbenchmarks:
Hum, as usual, pybench doesn't seem reliable at all: I expect worse performance with the patch since it adds "version++" in dict.__setitem__(). I don't really trust pybench; results seem to have a lot of noise :-/ Maybe I'm not using pybench correctly? I used:

$ make distclean; ./configure && make
$ ./python Tools/pybench/pybench.py -f pybench.default
$ patch -p1 < dict_version.patch
$ ./python Tools/pybench/pybench.py -f pybench.dictversion
$ ./python Tools/pybench/pybench.py -s pybench.dictversion -c pybench.default

Full output:

[pybench comparison table truncated during migration; columns were: Test, minimum run-time, average run-time (this=pybench.dictversion, other=pybench.default)]
Using timeit to microbenchmark dict operations (constructor, __setitem__, __delitem__, clear), I get exactly the same performance, or even better performance (???), with the patch.

./python -m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'
Original: 318 ns

./python -m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'
Original: 19.9 ms
Updated patch; I forgot to include unit tests. I added a test on integer overflow of the version. The patch adds _testcapi.dict_setversion(dict, version) and _testcapi.PY_SIZE_MAX.
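A sketch of what such an overflow test exercises (assuming, going by _testcapi.PY_SIZE_MAX, that this revision stores the version in an unsigned machine-word integer; the real test drives the counter through _testcapi.dict_setversion rather than touching it directly):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical stand-in for the dict version counter: set it to its
       maximum value, "mutate" once more, and observe the wraparound.
       Unsigned overflow is well defined in C: the counter wraps to 0. */
    uint64_t ma_version = UINT64_MAX;
    ma_version++;
    printf("version after overflow: %llu\n", (unsigned long long)ma_version);
    return 0;   /* prints: version after overflow: 0 */
}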
I don't understand this:
Oh, ignore it, it's an outdated comment :-) The dictionary version is initialized to 0; it's no longer "reserved". FYI, I added the comment in my first implementation, which also added a version to dictionary *entries*.
A better way to benchmark this is to run hg.python.org/benchmarks with a patched interpreter and see if that shows any performance impact.
Brett:
Ok. Here is the output. Can you please explain the result to me? :-) Especially the "significant" value.

$ ./python.orig ../benchmarks/perf.py ./python.orig ./python.patched
INFO:root:Automatically selected timer: perf_counter
Report on Linux smithers 4.2.8-300.fc23.x86_64 #1 SMP Tue Dec 15 16:49:06 UTC 2015 x86_64 x86_64

[per-benchmark results truncated during migration; sections included: 2to3, fastpickle, fastunpickle, json_load, nbody]

The following not significant results are hidden, use -v to show them:
So min is the fastest time in a benchmark execution, average is the average across all benchmark executions, and the t-value is some statistics thing. Anything marked insignificant had such a small average difference that it isn't worth reporting. If you want to smooth out the numbers, you should do a rigorous run that uses more loops per benchmark to help smooth out outliers. And if you are doing this on Linux, there is a flag to measure memory usage as well.
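To make that description concrete, here is a rough sketch of the two-sample t statistic that benchmark runners of this kind report (illustrative only; perf.py's exact formula is not shown in this thread, so treat the details as assumptions):

#include <math.h>
#include <stddef.h>

/* Welch's t statistic for two benchmark runs: a large |t| means the
   difference between the mean timings is unlikely to be noise. */
static double
t_statistic(const double *a, size_t na, const double *b, size_t nb)
{
    double mean_a = 0.0, mean_b = 0.0, var_a = 0.0, var_b = 0.0;
    for (size_t i = 0; i < na; i++) mean_a += a[i];
    for (size_t i = 0; i < nb; i++) mean_b += b[i];
    mean_a /= na;
    mean_b /= nb;
    for (size_t i = 0; i < na; i++) var_a += (a[i] - mean_a) * (a[i] - mean_a);
    for (size_t i = 0; i < nb; i++) var_b += (b[i] - mean_b) * (b[i] - mean_b);
    var_a /= na - 1;   /* sample variance */
    var_b /= nb - 1;
    return (mean_a - mean_b) / sqrt(var_a / na + var_b / nb);
}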
The PEP is now at python.org: PEP 509.
Patch version 3, updated for the second version of PEP 509. Main changes:
Patch version 4: I forgot to include my patch to update test_sys. Now the full Python test suite passes.
I'm not a big fan of pybench (it looks unstable and so not reliable), but here are results with dict_version-4.patch. I used more loops and a lower warp factor to get more reliable tests (I hope):

./python Tools/pybench/pybench.py -f pybench.orig -w 2 -C 100 -n 25

[pybench comparison table truncated during migration; columns were: Test, minimum run-time, average run-time (this=pybench.dictver, other=pybench.orig)]
guard_benchmark.patch: a patch adding _testcapi.guard_benchmark(), a microbenchmark on dictionary guards. The benchmark measures the cost of checking whether a dictionary key was modified. Run the benchmark with the attached guard_benchmark.py. Result on my PC:

python3 -m platform:

You have to manually modify _testcapi.c to choose between the 3 implementations. guard_benchmark.patch requires the patch from issue bpo-26098 and the fat module, which implements fat.GuardDict. The fat module can be found at:
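To give a concrete picture of what the guard benchmark measures, here is a minimal sketch of a version-based dictionary guard in the spirit of the PEP (a hypothetical helper, not the code from guard_benchmark.patch or fat.GuardDict; it assumes the guard cached the watched key's value and the dict version when it was armed):

/* Hypothetical guard check: returns 1 if the watched key still maps to
   the cached value, 0 if it changed. The fast path is a single integer
   comparison against the dict's version field. */
static int
guard_check(PyDictObject *mp, PyObject *key,
            PyObject *cached_value, PY_UINT64_T *cached_version)
{
    if (mp->ma_version == *cached_version) {
        /* The dict was not modified since the guard was armed. */
        return 1;
    }
    PyObject *value = PyDict_GetItem((PyObject *)mp, key);
    if (value == cached_value) {
        /* The dict was modified, but not this key: re-arm the guard. */
        *cached_version = mp->ma_version;
        return 1;
    }
    return 0;
}

The benchmark is essentially measuring how cheap that first comparison is, since it is the path taken on every execution when the dict is left untouched.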
Patch version 5: a global counter is now used to set the ma_version field of dictionaries. The global counter is incremented each time a dictionary is created and each time a dictionary is modified. A dictionary version is now unique: two dictionaries cannot have the same version. So if a guard stores a version, the check on the version will fail if a different dictionary is used.
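A minimal sketch of the global-counter scheme described above (names are illustrative; the real patch's naming may differ):

/* One process-wide counter shared by all dictionaries. Taking a fresh
   value on every creation *and* every mutation makes each version
   globally unique, so a stored version can never accidentally match a
   different dict. */
static PY_UINT64_T pydict_global_version = 0;

#define DICT_NEXT_VERSION() (++pydict_global_version)

/* used in dict creation and in every mutating operation, e.g.:
       mp->ma_version = DICT_NEXT_VERSION();                    */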
This is great, thank you, Victor.
I will update PEP 509 later for the global counter.
Patch version 6: removed the now unused function _testcapi.dict_set_version(). I also moved the tests to test_pep509.py to make it more explicit that the implementation of PEP 509 is not part of the Python dictionary type specification; other Python implementations can choose not to implement it.
The implementation is outdated: ma_version was renamed to ma_version_tag in the latest version of PEP 509.
Patch version 7 is updated to the latest PEP (rebased, and ma_version renamed to ma_version_tag).
Patch version 8:
timeit microbenchmarks on dict_version-8.patch, minimum of 10 runs:

$ ./python.orig -m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'
1000000 loops, best of 3: 0.292 usec per loop
$ ./python.version -m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'
1000000 loops, best of 3: 0.293 usec per loop
=> 1 nanosecond (0.3%) slower

$ ./python.orig -m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'
10 loops, best of 3: 21.2 msec per loop
$ ./python.version -m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'
10 loops, best of 3: 21.3 msec per loop
=> 0.1 ms (0.5%) slower
I ran the timeit microbenchmarks again, with CPU isolation, on dict_version-8.patch, minimum of 10 runs:

-m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'
-m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'
pybench results on dict_version-8.patch with CPU isolation: http://haypo-notes.readthedocs.org/microbenchmark.html

As usual, I'm very skeptical of the pybench results, which almost look like noise. I don't understand how my change can make any operation *faster*, whereas some benchmarks are faster with the patch... Dict microbenchmarks:

SimpleDictManipulation: 59ms 59ms -0.4% 59ms 59ms -0.4%

Full output:

$ ./python.version Tools/pybench/pybench.py -f pybench.version
$ ./python.orig Tools/pybench/pybench.py -f pybench.orig
$ ./python.orig Tools/pybench/pybench.py -s pybench.version -c pybench.orig

PYBENCH 2.1

[pybench comparison table truncated during migration; columns were: Test, minimum run-time, average run-time (this=pybench.version, other=pybench.orig)]
On 19.04.2016 12:52, STINNER Victor wrote:
This can easily happen as a result of different memory layout, but is
Only dict creation and the integer keys benchmark results are relevant. Could you perhaps check what's causing these slowdowns?
It's obvious, no? My patch causes the slowdown. On a timeit microbenchmark, I don't see such a slowdown. That's also why python3.6 -m timeit '{}' says 105 ns with and without the patch.

python3.6 -m timeit 'd={}; d[1]=1; d[2]=2; d[3]=3; d[4]=4; d[5]=5;

I have to "cheat": I run timeit enough times until I see the "minimum".
On 19.04.2016 13:11, STINNER Victor wrote:
Well, yes, of course :-) I meant whether there's anything you
Those operations are too fast for timeit. The overhead associated
Results of the CPython benchmark suite on dict_version-8.patch. IMHO regex_v8 can be ignored; this benchmark is unstable (issue bpo-26275).

Original python: ../pep509/python
Patched python: ../pep509/python

$ python3 -u perf.py --rigorous -b all ../default.copy/python ../pep509/python
INFO:root:Automatically selected timer: perf_counter

[per-benchmark results truncated during migration; sections included: 2to3, call_method, call_simple, etree_generate, etree_parse, etree_process, fannkuch, float, json_load, normal_startup, regex_effbot, regex_v8, richards, silent_logging]

The following not significant results are hidden, use -v to show them:
New changeset d43f819caea7 by Victor Stinner in branch 'default':
PEP 509 has been approved by Guido; I just pushed the implementation.
New changeset c3776dd858f0 by Victor Stinner in branch 'default':