readline() + seek() on codecs.EncodedFile breaks next readline() #77542
Comments
It appears that calling readline() on a codecs.EncodedFile stream breaks seeking and causes subsequent attempts to iterate over the lines or call readline() to backtrack and return already consumed lines. A minimal example:
Output:
As you can see, the line being skipped is actually the second line, and when we try reading from the stream again, the iterator starts from the beginning of the file. Even weirder, adding a second call to readline() to skip the second line shows it's going **backwards**:
The expected output shows that we got a header, skipped it, and then read one data line.
I'm sure this is related to the implementation of readline() because if we change this:
to this:
then we get the expected output. If, on the other hand, we comment out the seek() in the finally clause, we also get the expected output (minus the "skipping the header" code). |
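The scenario described above can be sketched roughly as follows. This is an illustrative reconstruction, not the reporter's original snippet: the helper name `read_header_then_rest` and the sample data are assumptions. It peeks at the first line, seeks back to the saved offset, and then reads every line. On interpreters affected by this issue, the lines collected after the seek() come back in an unexpected order.

```python
import codecs
import io


def read_header_then_rest(stream):
    """Peek at the first line, restore the position, then read all lines.

    Hypothetical helper mirroring the pattern described in the report.
    """
    offset = stream.tell()
    try:
        stream.seek(0)
        header = stream.readline()
    finally:
        # Restore the position we started from.
        stream.seek(offset)

    rest = []
    if stream.tell() == 0:
        # We are at the start again, so this should re-read the header.
        rest.append(stream.readline())
    # Iterating a recoded stream yields the remaining lines one by one.
    rest.extend(stream)
    return header, rest


# The file is UTF-16-LE on disk; callers see UTF-8 bytes.
raw = io.BytesIO(u'a,b\r\n"asdf","jkl;"\r\n'.encode("utf-16-le"))
recoded = codecs.EncodedFile(raw, "utf-8", "utf-16-le")

header, rest = read_header_then_rest(recoded)
print("Got header: %r" % header)
for number, line in enumerate(rest, start=1):
    print("Read after seek #%d: %r" % (number, line))
```

If the first line read after the seek is not the header, the stream handed back a line that was already consumed, which is the symptom reported here.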
I cannot replicate this when the stream is:
In: stream_ex = io.BytesIO(u"abc\ndef\nghi\n".encode("utf-8"))
In: run(f)
Out: Got header: b'abc\n'
|
That's because the stream isn't transcoding, since UTF-8 is ASCII-compatible. Try using something that is not ASCII-compatible as the codec, e.g. 'ibm500', and it'll give incorrect results.
|
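To make the point about ASCII compatibility concrete, here is a small sketch that checks whether a codec encodes plain ASCII text to the same bytes as ASCII itself. The helper `is_ascii_compatible` and the sample string are assumptions for illustration. A codec that fails this check forces a real transcoding step when combined with UTF-8.

```python
import codecs


def is_ascii_compatible(codec_name, sample=u"abc\n"):
    """Return True if the codec encodes an ASCII sample to the same bytes as ASCII.

    Hypothetical helper; it only tests the given sample, not the whole codec.
    """
    return sample.encode(codec_name) == sample.encode("ascii")


for name in ("utf-8", "utf-16-le", "ibm500"):
    canonical = codecs.lookup(name).name  # resolve aliases such as 'ibm500'
    print("%-10s (%s): ASCII-compatible for sample = %s"
          % (name, canonical, is_ascii_compatible(name)))
```

Only the codecs that report False here should be expected to exercise the recoding path when paired with UTF-8 data.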
Update: If I run your exact code it still breaks for me:
I'm running Python 2.7.14 and 3.6.5 on OSX 10.13.4. Startup banners:
Python 2.7.14 (default, Feb 7 2018, 14:15:12)
Python 3.6.5 (default, Apr 2 2018, 14:03:12)
|
I've tried this with Python 3.6.0 on OSX 10.13.4 |
For your specific example I also get a weird result. Tried this in Python 2.7.10 and Python 3.6.0. |
I've modified your example a little, and it's clear that readline() moves the cursor.
The first call to readline returns cd instead of ab. |
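One way to observe how far readline() moves the cursor is to compare the size of the first line with the position of the underlying stream after a single call. This is a sketch with illustrative data, and it assumes (based on CPython's `codecs` module) that the stream reader pulls a chunk from the underlying file and buffers whatever follows the first line.

```python
import codecs
import io

data = u"ab\ncd\nef\n".encode("utf-16-le")
raw = io.BytesIO(data)
recoded = codecs.EncodedFile(raw, "utf-8", "utf-16-le")

first = recoded.readline()
first_line_size = len(u"ab\n".encode("utf-16-le"))

print("first line: %r" % first)
print("bytes occupied by the first line: %d" % first_line_size)
# The underlying position reflects how much was read ahead, not just one line.
print("underlying position after one readline(): %d" % raw.tell())
```

If the position is larger than the first line, the extra lines are sitting in the reader's internal buffer, and a seek() that only repositions the underlying stream would leave them there to be returned by later reads.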
Update: Tested this on Python 3.5.4, 3.4.8, and 3.7.0b3 on OSX 10.13.4. They also exhibit the bug. Updating the ticket accordingly. |
Bug still present in 3.7.0, now seeing it in 3.8.0a0 as well. |
Still seeing this in 3.7.3 so I don't think so? |
Thank you for the report, Diego, and thank you for the patch, Ammar! |