Optimize new io library #48811
Comments
The new io library needs some serious profiling and optimization work. More profiling tests have shown a speed problem in write() files opened |
I've done some profiling and the performance of reading line-by-line is

    for line in open("somefile.txt"):
        pass

Ran 35 times slower in Python 3.0 than Python 2.6 when I tested it on a This slowdown is really unacceptable for anyone who uses Python for |
Your issue is most likely caused by bpo-4533. Please download the latest svn |
Here is a patch against the py3k branch that reduces the time for the |
Tried this using projects/python/branches/release30-maint and using the

    for line in open("BIGFILE"):
        pass

Python 2.6: 0.67s
This is running on a MacBook with a warm disk cache. For what it's |
Just as one other followup, if you change the code in the last example

    for line in open("BIG","rb"):
        pass

You get the following results:
Python 2.6: 0.64s |
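The text-mode versus binary-mode comparison in the comments above can be reproduced with a small timing script. This is only a sketch, not the script used in this thread: the helper names `make_test_file` and `time_line_iteration` and the file size are assumptions chosen for illustration, and it targets a current Python 3 (it uses `time.perf_counter`, which 2.6 and 3.0 do not have).

```python
import os
import tempfile
import time


def make_test_file(num_lines=100_000):
    """Create a temporary ASCII text file (a hypothetical stand-in for BIGFILE)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="ascii", newline="\n") as f:
        for i in range(num_lines):
            f.write("line %d of the test file\n" % i)
    return path


def time_line_iteration(path, mode):
    """Iterate over the file line by line and return (line_count, seconds)."""
    start = time.perf_counter()
    count = 0
    with open(path, mode) as f:
        for _line in f:
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    path = make_test_file()
    try:
        # "r" goes through the text layer (decoding, newline handling);
        # "rb" stops at the buffered binary layer.
        for mode in ("r", "rb"):
            count, elapsed = time_line_iteration(path, mode)
            print("mode=%-2s lines=%d time=%.3fs" % (mode, count, elapsed))
    finally:
        os.remove(path)
```

Running the same loop under two interpreters and comparing the printed times gives the kind of numbers quoted in this thread; absolute values depend on the machine and on whether the disk cache is warm.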
David, the reading bug fix/optimization is not (yet?) on |
Just checked it with branches/py3k and the performance is the same. |
What's your OS, David? Please post the output of "uname -r" and ./python |
bash-3.2$ uname -a |
I've updated the patch with proper formatting, some minor cleanups and a |
I don't think this is a public API, so the function should probably be |
I'll come up with some reading benchmarks tomorrow. For now here is a |
Roundup doesn't display .log files as plain text files. |
Christian, by benchmarks I meant a measurement of text reading with and |
I've written a small file IO benchmark, available here: It runs under both 2.6 and 3.x, so that we can compare speeds of |
Without Christian's patch:
[400KB.txt] read one byte/char at a time... 0.2685 MB/s (100% CPU)
[ 20KB.txt] read whole contents at once... 52.42 MB/s (99% CPU)

With the patch:
[400KB.txt] read one byte/char at a time... 0.2761 MB/s (100% CPU)
[ 20KB.txt] read whole contents at once... 66.17 MB/s (99% CPU)

Python 2.6's builtin file object:
[400KB.txt] read one byte/char at a time... 1.347 MB/s (97% CPU)
[ 20KB.txt] read whole contents at once... 1072 MB/s (100% CPU) |
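For readers who want to measure the same two access patterns themselves, here is a rough sketch of how such throughput figures might be computed. It is not the benchmark script referenced above: the function names, the file contents and the MB/s convention (1 MB = 1024 * 1024 units) are assumptions, and it does not report CPU usage.

```python
import os
import tempfile
import time


def make_file(size_bytes):
    """Write size_bytes of ASCII data to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size_bytes)
    return path


def read_one_at_a_time(path, mode):
    """Read one byte (mode "rb") or one character (mode "r") per call."""
    total = 0
    with open(path, mode) as f:
        while f.read(1):
            total += 1
    return total


def read_whole(path, mode):
    """Read the whole contents with a single read() call."""
    with open(path, mode) as f:
        return len(f.read())


def measure(func, path, mode):
    """Return (units_read, throughput in MB/s) for one access pattern."""
    start = time.perf_counter()
    units = func(path, mode)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid dividing by zero
    return units, units / (1024.0 * 1024.0) / elapsed


if __name__ == "__main__":
    cases = [(400 * 1024, read_one_at_a_time), (20 * 1024, read_whole)]
    for size, func in cases:
        path = make_file(size)
        try:
            for mode in ("r", "rb"):
                units, rate = measure(func, path, mode)
                print("[%dKB] %s mode=%s: %.4g MB/s"
                      % (size // 1024, func.__name__, mode, rate))
        finally:
            os.remove(path)
```

The one-unit-at-a-time case is dominated by per-call overhead in the I/O layers, which is why it is the more sensitive indicator of the slowdown discussed here.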
I'm getting caught-up with the IO changes in 3.0 and am a bit confused. |
The previous implementation only returns bytes and does not translate |
I seem to recall one of the design principles of the new IO stack was to In any case, binary reading has acceptable performance in py3k (although |
I don't agree that that was a worthy design goal. Tons of code (incl |
I agree with Raymond. For binary reads, I'll go farther and say that It's fine that text mode now uses Unicode, but if I don't want that, I |
I don't necessarily agree either, but it's probably too late now. In any case, Amaury has started rewriting the IO lib in C (*) and |
But "cranking data" implies you'll do something useful with it, and In any case, you can try to open your file in unbuffered mode: it will bypass the Python buffering layer and will go directly to the
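As an illustration of the unbuffered mode mentioned above: in current Python 3, passing `buffering=0` to `open()` in binary mode returns the raw file object instead of wrapping it in a buffered reader. The sketch below assumes a current Python 3; the helper name `describe_layers` is hypothetical and the temporary file is only there to make the example self-contained.

```python
import io
import os
import tempfile


def describe_layers(path):
    """Return the object types used for buffered and unbuffered binary reads."""
    # Default binary mode: a buffered reader wrapping a raw file object.
    with open(path, "rb") as buffered:
        buffered_type = type(buffered)
        wrapped_raw_type = type(buffered.raw)
    # buffering=0 (binary mode only): the raw file object is returned directly.
    with open(path, "rb", buffering=0) as raw:
        raw_type = type(raw)
        first_chunk = raw.read(16)
    return buffered_type, wrapped_raw_type, raw_type, first_chunk


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * 4096)
    try:
        for item in describe_layers(path):
            print(item)
    finally:
        os.remove(path)
```

Note that `buffering=0` is rejected in text mode, so this route is only available to code that is prepared to work with bytes.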
No. It's a bit more limited, doesn't support autoconversion to/from |
Good luck with that. Most people who get bright ideas such as "gee, As for cranking data, that does not necessarily imply heavy-duty CPU |
David: |
I wish I shared your optimism about this, but I don't. Here's a short The problem of I/O and the associated interface between hardware, the
So, you'll have to forgive me for being skeptical, but I just don't Again, I would love to be proven wrong. |
[...] Although I agree all this is important, I'd challenge the assumption it In any case, it will be difficult to undo the current design decisions |
We can't solve this for 3.0.1, downgrading to critical. |
Marking this as a duplicate of bpo-4565 "Rewrite the IO stack in C". |