Speed up cPickle's pickling generally #49933
This patch simplifies cPickle's complicated internal buffering system. Benchmarked against virgin trunk r71058 using "perf.py -r -b pickle: pickle_dict: pickle_list: A similar patch for unpickling will follow. As bpo-5670 and 5671 are This patch passes all the tests added in bpo-5665. I would recommend
Any updates?
Applied collinwinter's patch and svn up'd to r73872; replaced additional Passed test_xpickle.py as referenced by bpo-5665.
Are you still willing to work on this?
Last august, I worked on integrating Collin's optimization work into And if Collin allows me, I would like to merge the other pickle
Updated the patch against the latest version of cPickle.c (r77393). All tests pass on my Mac.
There were a couple of comments on the Rietveld code review above.
Antoine> There were a couple of comments on the Rietveld code review
Indeed there are. Given that the Unladen Swallow folks were focusing on the Skip
The main thing I'm worried about is the potentially unbounded buffering,
Antoine> The main thing I'm worried about is the potentially unbounded
Got a test case in mind? If so, I'll code it up and compare 2.6 and Unladen S
Trying to pickle a structure that's larger than half the RAM should do
Quick test on a 3GB machine:
Without patch ("top" shows the process reaches 1.2GB RAM max):
$ time ./python -c "import cPickle;l=['a'*1024 for i in xrange(1000000)];cPickle.dump(l, open('/dev/null', 'wb'))"
10.67user 1.47system 0:12.92elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+319042minor)pagefaults 0swaps
With the patch, the same command quickly swaps hopelessly and after 5 minutes of elapsed time I finally manage to kill the process.
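For readers on Python 3, the one-liner in this test can be approximated in a scaled-down form. This is only a sketch under stated assumptions: `peak_during_dump` is a hypothetical helper name, `pickle` stands in for the 2.x `cPickle` module, and `tracemalloc` reports memory allocated through Python's allocators rather than the resident size that "top" shows, so its numbers are not comparable to the 1.2GB figure.

```python
import os
import pickle
import tracemalloc


def peak_during_dump(n_items=10_000, item_size=1024):
    """Return the peak traced allocation (bytes) while pickling to the null device.

    Hypothetical helper for illustration; the defaults are scaled down from
    the 1,000,000 x 1024-byte strings used in the original test.
    """
    # Build distinct objects so the pickler cannot collapse them via its memo.
    data = [b"a" * item_size for _ in range(n_items)]

    # Start tracing after the data exists, so only the dump itself is measured.
    tracemalloc.start()
    with open(os.devnull, "wb") as sink:
        pickle.dump(data, sink)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


if __name__ == "__main__":
    print("peak bytes allocated during dump:", peak_during_dump())
```

Comparing the reported peak with `n_items * item_size` gives a rough idea of how much of the pickle is held in memory before it reaches the file.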
Antoine> With the patch, the same command quickly swaps hopelessly and
Verified with an Unladen Swallow test case. I'll see if I can fix it. S
You can fix it if you are dumping to a file; however, if you are calling dumps() you are kind of stuck when dumping large objects, because there's no place to flush the buffer. I have a fix for Unladen Swallow's cPickle module. I'll run it by them before submitting it here.
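The difference between the two cases can be sketched in Python. This is an illustration only, not the cPickle.c code or the Unladen Swallow fix: `FlushingBuffer`, `feed`, and the `limit` parameter are hypothetical names. With a file-like sink the buffer can be emptied whenever it reaches the limit, so its peak size stays bounded; with no sink (the dumps() case) everything has to be held until the end.

```python
import io


class FlushingBuffer:
    """Hypothetical write buffer that flushes to a sink once it reaches a limit."""

    def __init__(self, sink, limit=64 * 1024):
        self.sink = sink      # file-like object, or None for the dumps() case
        self.limit = limit    # flush threshold in bytes
        self.chunks = []
        self.size = 0         # bytes currently buffered
        self.peak = 0         # largest number of bytes buffered at once

    def write(self, data):
        self.chunks.append(data)
        self.size += len(data)
        self.peak = max(self.peak, self.size)
        # Only a real sink gives us somewhere to flush to.
        if self.sink is not None and self.size >= self.limit:
            self.flush()

    def flush(self):
        if self.sink is not None and self.chunks:
            self.sink.write(b"".join(self.chunks))
            self.chunks.clear()
            self.size = 0

    def getvalue(self):
        # dumps()-style result: whatever is still buffered.
        return b"".join(self.chunks)


def feed(buf, n_chunks, chunk):
    """Write the same chunk n_chunks times, as a stand-in for pickler output."""
    for _ in range(n_chunks):
        buf.write(chunk)


if __name__ == "__main__":
    to_file = FlushingBuffer(io.BytesIO(), limit=4096)
    feed(to_file, 100, b"x" * 1024)
    to_file.flush()

    in_memory = FlushingBuffer(None)
    feed(in_memory, 100, b"x" * 1024)

    print("peak with sink:", to_file.peak)
    print("peak without sink:", in_memory.peak)
```

In the no-sink case the peak grows with the total output size, which is the situation the comment describes for dumps().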
Do we need an intermediate buffer at all when called from dumps()? How about allocating the buffer as a PyStringObject, so that it can be used directly for the result in that case?
Perhaps. Let's take it one step at a time though. If I change your
Oh, BTW, the proposed fix is in Rietveld: http://codereview.appspot.com/189051
Too late to make this change in 2.x. And the patch in bpo-9410 includes the optimization for 3.x.