Unpickling is really slow #48123
Comments
Unpickling e.g. a large list seems to be really slow in Python 3.0. The attached test script gives the following results for pickling and unpickling:
[hagenf@cluster-06 hagenf]$ python pickletst.py
Do the numbers vary if you read the whole file at once and then unpickle from the in-memory data?
Yes, it gets much better, but even so (first reading the file, then timing only the unpickling):
[hagenf@cluster-06 hagenf]$ python pickletst2.py
But I guess this can still be blamed on the BytesIO implementation...
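For reference, the file-versus-memory comparison can be sketched like this (the payload and the timing harness are illustrative, not the original pickletst.py):

```python
import io
import pickle
import time

# Illustrative payload, similar in spirit to the large list from the report.
data = [0] * 10**6
payload = pickle.dumps(data, protocol=2)

def time_load(fileobj):
    """Time a single pickle.load() from the given file-like object."""
    start = time.perf_counter()
    result = pickle.load(fileobj)
    return result, time.perf_counter() - start

# Unpickling from an in-memory buffer avoids per-read file overhead.
result, elapsed = time_load(io.BytesIO(payload))
assert result == data
print(f"load from BytesIO: {elapsed:.3f}s")
```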
Indeed. If I replace the file with a BytesIO buffer, it is much faster. After quick profiling, it seems that PyLong_New would benefit from a freelist.
Gregory had patches for a freelist of long objects in bpo-2013. |
The solution is to add a read buffer to Unpickler (Pickler already has a write buffer). cPickle has a performance hack that makes it use cStringIO and PyFile directly.
I think a read buffer is not possible without being able to unread bytes from the underlying stream.
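The reason unreading matters: the pickle stream may end before the read buffer does, and any extra bytes belong to whoever reads the file next. A minimal illustration:

```python
import io
import pickle

# Two pickles back-to-back in one stream. A buffered Unpickler that
# over-reads past the end of the first pickle would corrupt the second load.
buf = io.BytesIO(pickle.dumps([1, 2, 3]) + pickle.dumps("second"))
assert pickle.load(buf) == [1, 2, 3]
assert pickle.load(buf) == "second"  # only works if no bytes were over-read
```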
Making this a duplicate of bpo-4565 ("Rewrite the IO stack in C"). |
With the io-c branch I see much better unpickling performance than before. Is this expected at this point of io-c development? Otherwise, perhaps…
Hello,
It's much closer here. With 2.7 (trunk) and cPickle:
20-30% slower is expected for binary I/O (which is what I get above).
I uploaded a new pickletst.py which specifies protocol 2 (otherwise the default protocol 0 is used). I get 0.211881160736 for Python 2.6 and 0.158488035202 on the io-c branch. Can you confirm this?
Nice catch! I can confirm your figures with protocol=2 (and protocol=-1).
alexandre.vassalotti wrote:
Yes, unpickler_read() calls Buffered_read() through…
Unladen Swallow has a project to optimize pickle. Currently, it uses 3…
gprof (--enable-profiler) results: each sample counts as 0.01 seconds. It looks like PyArg_ParseTuple() should be optimized ;-)
Create a read buffer (4096 bytes) in the Unpickler class. Using [0]*10**7 or… I removed the last_string attribute because it wasn't used. If there are "tail" bytes, seek backward.
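A rough Python sketch of that buffering scheme (the class name and structure are invented for illustration; the real patch lives in C inside _pickle):

```python
import io

class BufferedUnpicklerReader:
    """Sketch of the patch's scheme: read the underlying file in 4096-byte
    chunks; when unused "tail" bytes remain, seek backward so the file
    position matches what was actually consumed."""
    CHUNK = 4096

    def __init__(self, fileobj):
        self.file = fileobj
        self.buffer = b""
        self.pos = 0

    def read(self, n):
        # Refill the buffer from the file until n bytes are available (or EOF).
        while len(self.buffer) - self.pos < n:
            chunk = self.file.read(self.CHUNK)
            if not chunk:
                break
            self.buffer = self.buffer[self.pos:] + chunk
            self.pos = 0
        data = self.buffer[self.pos:self.pos + n]
        self.pos += len(data)
        return data

    def release(self):
        # Unconsumed tail bytes were read ahead: seek backward over them.
        tail = len(self.buffer) - self.pos
        if tail:
            self.file.seek(-tail, io.SEEK_CUR)

f = io.BytesIO(b"x" * 10000)
reader = BufferedUnpicklerReader(f)
assert reader.read(5) == b"xxxxx"
reader.release()
assert f.tell() == 5   # the file position reflects only the 5 consumed bytes
```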
I don't know why, but python-trunk is *much* slower than py3k (e.g. with…)
My version of pickletest.py:
Without the patch: version | data | dump (ms) | load (ms)
With the patch: version | data | dump (ms) | load (ms) | speed up
(data "0, 10^6" means [0]*10^6)
Note about my patch: the buffer should be truncated after…
Perhaps you tried with the pure Python version (pickle) rather than the C version.
Victor, Unpickler shouldn't raise an error if the given file object doesn't support seeking. Incidentally, I think it would be nice to add to Python 3 some kind of…
By the way, the patch won't work with unseekable files, which is…
Has this slipped under the radar? I believe that, one way or the other, any performance issue should be resolved if at all possible.
New version of my patch:
test_pickle passes without any error. The read buffer is disabled at the first call to unpickle_readline(), because unpickle_readline() has to flush the buffer. It will be very difficult to optimize protocol 0, but I hope that nobody uses it nowadays.
===========
Benchmark with [0]*10**6, Python compiled with pydebug.
Without the patch:
Protocol 0:
Protocol 1:
Protocol 2:
With the patch:
Protocol 0:
Protocol 1:
Protocol 2:
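The protocol 0 difficulty comes from its line-oriented format: values are parsed with readline(), which forces the buffer flush mentioned in the comment above. A quick way to see the difference between the protocols:

```python
import pickle

p0 = pickle.dumps([1, 2], protocol=0)  # text-based, opcodes terminated by newlines
p2 = pickle.dumps([1, 2], protocol=2)  # binary, length-prefixed opcodes
assert b"\n" in p0    # protocol 0 values must be parsed with readline()
assert p0 != p2
assert pickle.loads(p0) == pickle.loads(p2) == [1, 2]
```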
Same benchmark with Python 2.6.5+, so without the patch, but compiled with maximum compiler optimization (whereas pydebug means no optimization):
Protocol 0:
Protocol 1:
Protocol 2:
It would be better to redo all tests with the same compiler options, but the most important point is that Python 3 is *faster* than Python 2 with my patch ;-)
bench_pickle.py: the script used to produce the last benchmarks.
Victor, have you tried using peek() instead of seek()? I mentioned this previously in msg85780. |
In a file encoded with protocol 0, a backward seek is needed on each call to unpickler_readline... and this function is called to read a number, a boolean, etc. (i.e. to read most, or all, data values). I chose to disable the read buffer here because it's slower with it. For protocols 1 and 2, there is only *one* seek, at the end (but a lot of reads: file size / 4096). So I don't think it really matters whether we use peek or seek. seek() is more natural (for me) and more common than peek(), so I prefer to keep seek().
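For comparison, peek() returns upcoming bytes without moving the position, whereas lookahead via read() requires an explicit backward seek; a small sketch with io.BufferedReader:

```python
import io

f = io.BufferedReader(io.BytesIO(b"hello world"))

# peek(): look at buffered bytes without consuming them.
head = f.peek(5)[:5]
assert f.tell() == 0          # position is unchanged
assert f.read(5) == head      # the same bytes come back from read()

# seek(): read ahead, then explicitly move the position back.
rest = f.read(6)
f.seek(-6, io.SEEK_CUR)
assert f.read(6) == rest
```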
Here is a fixed version of Victor's bench (didn't work on 2.x). |
And here is a new performance patch (Victor's patch was outdated because of heavy changes incorporated from Unladen Swallow). Results of bench_pickle.py are as follows:
Protocol 0:
Protocol 1:
Protocol 2:
I get this error with the patch: python: /home/alex/src/python.org/py3k/Modules/_pickle.c:908: _Unpickler_ReadFromFile: Assertion `self->next_read_idx == 0' failed. |
Ah, thank you. I hadn't tested in debug mode and there was a wrong assert from the previous code. |
One problem with the seek() approach is that some file-like objects have expensive seeks. One example is GzipFile, where seek(n) is O(n) (it first rewinds to the start of file, then reads n decompressed bytes). In the end, unpickling from a GzipFile becomes O(n**2). I will try to use peek() instead. |
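The cost is easy to see in miniature: every backward seek on a GzipFile restarts decompression from the beginning of the stream, so peek()-based lookahead, which never moves the position, is the safer primitive. A sketch of the scenario:

```python
import gzip
import io
import pickle

# Write a pickle into an in-memory gzip stream.
raw = io.BytesIO()
with gzip.GzipFile(fileobj=raw, mode="wb") as f:
    f.write(pickle.dumps(list(range(1000)), protocol=2))

raw.seek(0)
with gzip.GzipFile(fileobj=raw, mode="rb") as g:
    # seek(n) on a GzipFile rewinds and re-decompresses n bytes: O(n) per
    # seek, so pickle.load() here must avoid gratuitous backward seeks.
    assert pickle.load(g) == list(range(1000))
```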
Didn't Victor say that only one seek at the end is necessary per unpickling?
If you are unpickling from a multi-megabyte gzip file, even that single seek at the end has to rewind and re-read the whole file. Another issue with seeking only at the end is that it would make…
Here is an updated bench_pickle which also makes the file unpeekable.
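Such a wrapper presumably looks something like this (names are illustrative, not taken from the actual bench script): exposing only read() and readline() hides both peek() and seek() from the Unpickler:

```python
import io
import pickle

class UnpeekableFile:
    """Illustrative wrapper: only read()/readline() are exposed, so the
    Unpickler can rely on neither peek() nor seek()."""
    def __init__(self, fileobj):
        self._file = fileobj

    def read(self, size=-1):
        return self._file.read(size)

    def readline(self):
        return self._file.readline()

wrapped = UnpeekableFile(io.BytesIO(pickle.dumps({"a": 1}, protocol=2)))
assert pickle.load(wrapped) == {"a": 1}
```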
Here is a patch using peek() rather than seek(). There are some inefficiencies around (such as using read() to skip the consumed prefetched bytes), but the benchmark results are still as good as with seek():
Protocol 0:
Protocol 1:
Protocol 2:
Patch committed in r85384. |