Speed up pickling of dicts in cPickle #49920
Comments
The attached patch adds another version of cPickle.c's batch_dict(), Pickle: Benchmark is at This patch passes all the tests added in bpo-5665. I would recommend |
Without taking a very detailed look, the patch looks good. |
By the way, could the same approach be applied to lists and sets as well? |
On Thu, Apr 2, 2009 at 12:20 PM, Antoine Pitrou <report@bugs.python.org> wrote:
Certainly; see http://bugs.python.org/issue5671 for the list version. |
How about splitting the benchmark in parts:
|
Antoine: pickletester.py:test_newobj_generic() appears to test dict |
The patch produces different output for an empty dict: a sequence "MARK |
Well, Python-level dict subclasses are also C-level subclasses (in the |
I ported the patch to py3k. In addition, I added a special-case when the |
Oops, I forgot to add the comment on top of batch_dict_exact in the |
Oops again, I just noticed that the comment for batch_dict_exact refers |
Silly me, I had changed the PyDict_Size call in the outer loop for Py_SIZE
Collin, you can go ahead and commit both patches. Nice work! |
Sigh... silly me again. There is some other junk in my last patch. |
FYI, I just added a pickle_dict microbenchmark to perf.py. Using this
pickle_dict:
I still need to address the comment about pickling empty dicts. |
Amaury, I can't reproduce the issue you're seeing with empty dicts.
dhcp-172-19-19-199:trunk collinwinter$ ./python.exe
Python 2.7a0 (trunk:71100M, Apr 3 2009, 14:40:49)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cPickle, pickletools
>>> data = cPickle.dumps({}, protocol=2)
>>> pickletools.dis(data)
    0: \x80 PROTO      2
    2: }    EMPTY_DICT
    3: .    STOP
highest protocol among opcodes = 2
>>> data
'\x80\x02}.'
>>>
What are you doing to produce the MARK SETITEMS sequence? |
Sorry, I was wrong. I think I noticed that the case size==1 was handled |
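To compare the opcodes emitted for empty, single-item, and multi-item dicts, a small helper along these lines may be useful. It is only a sketch: it assumes Python 3's `pickle` module rather than the Python 2.7 `cPickle` used in the session above, and `opcode_names` is a hypothetical helper name, not part of the patch. The exact opcode stream can differ between versions (for example, memo opcodes may appear), so the helper only lists opcode names.

```python
import pickle
import pickletools


def opcode_names(obj, protocol=2):
    """Return the names of the pickle opcodes emitted for obj, in order."""
    data = pickle.dumps(obj, protocol=protocol)
    # genops() yields (opcode, arg, position) triples for a pickle byte string.
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]


# Compare an empty dict, a one-item dict, and a two-item dict.
for sample in ({}, {"a": 1}, {"a": 1, "b": 2}):
    print(len(sample), opcode_names(sample))
```

In current CPython, the one-item case should show a single SETITEM, while larger dicts are batched between MARK and SETITEMS.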
Can this patch be used or ported to 2.5.x? |
Sorry, it won't even be integrated in 2.6 actually. It's a new feature, |
Fixed the len(d) == 1 size regression. Final performance of the patch
Using Unladen Swallow's perf.py -b pickle,pickle_dict on trunk:
pickle_dict:
Performance for py3k:
pickle_dict:
regrtest.py -uall test_xpickle passes all backwards-compatibility tests.
Committed as r72909 (trunk), r72910 (py3k). |
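For anyone wanting a rough local measurement of dict pickling speed, something like the following could serve as a starting point. This is not the `pickle_dict` benchmark from perf.py; it is a minimal sketch using the standard `timeit` module, and the function name, dict contents, and loop counts are all assumptions chosen for illustration.

```python
import pickle
import timeit


def time_dict_pickling(num_items=1000, repeats=5, loops=200, protocol=2):
    """Return the best observed time in seconds for one dumps() call on a dict."""
    # Hypothetical workload: string keys mapped to small integers.
    sample = {"key%d" % i: i for i in range(num_items)}
    timings = timeit.repeat(
        lambda: pickle.dumps(sample, protocol=protocol),
        repeat=repeats,
        number=loops,
    )
    # Each timing covers `loops` calls; take the fastest run to reduce noise.
    return min(timings) / loops


if __name__ == "__main__":
    print("best time per dumps() call: %.3e s" % time_dict_pickling())
```

Running it against builds with and without a change gives only a coarse comparison; a dedicated benchmark suite controls for far more noise.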
Thanks!
|