New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Negative _io.BufferedReader.tell
#95782
Comments
Reproducible |
lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests.
This is caused by When calling I also bumped into further not necessarily expected behavior, of which I'm not sure if it can be a problem, and which I didn't change:
Demo: import io
import _pyio as pyio
# io.BufferedWriter and _pyio.BufferedWriter tell() returning number of bytes written, seek(0, io.SEEK_CUR) resets it, which is not supposed to change the offset.
f = open("/dev/urandom", "wb")
print(f.tell()) # 0
print(f.write(b"1")) # 1
print(f.write(b"12")) # 2
print(f.tell()) # 3
print(f.seek(0, io.SEEK_CUR)) # 0
print(f.tell()) # 0
f.close()
print("---")
f = pyio.open("/dev/urandom", "wb")
print(f.tell()) # 0
print(f.write(b"1")) # 1
print(f.write(b"12")) # 2
print(f.tell()) # 3
print(f.seek(0, io.SEEK_CUR)) # 0
print(f.tell()) # 0
f.close()
print("---")
# io.BufferedReader/Random seek(0, io.SEEK_CUR) returning the number of bytes read, tell() resets it. Only the C implementation is affected.
f = open("/dev/urandom", "rb")
print(f.tell()) # 0
print(f.read(1)) # b'\xf1'
print(f.read(2)) # b'X\x1a'
print(f.seek(0, io.SEEK_CUR)) # 3
print(f.tell()) # 0 with the fix, -4093 without
print(f.seek(0, io.SEEK_CUR)) # 0 with the fix, -4093 without
print(f.tell()) # 0 with the fix, -4093 without
f.close()
print("---")
f = pyio.open("/dev/urandom", "rb")
print(f.tell()) # 0
print(f.read(1)) # b'\xaa'
print(f.read(2)) # b'[\x96'
print(f.seek(0, io.SEEK_CUR)) # 0
print(f.tell()) # 0
print(f.seek(0, io.SEEK_CUR)) # 0
print(f.tell()) # 0
f.close() |
lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests.
…ets < 0 (GH-99709) lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests.
…n offsets < 0 (pythonGH-99709) lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests. (cherry picked from commit 26800cf) Co-authored-by: 6t8k <58048945+6t8k@users.noreply.github.com>
…n offsets < 0 (pythonGH-99709) lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests. (cherry picked from commit 26800cf) Co-authored-by: 6t8k <58048945+6t8k@users.noreply.github.com>
…rn offsets < 0 (GH-99709) (GH-115600) lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests. (cherry picked from commit 26800cf) Co-authored-by: 6t8k <58048945+6t8k@users.noreply.github.com>
…rn offsets < 0 (GH-99709) (GH-115599) lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests. (cherry picked from commit 26800cf) Co-authored-by: 6t8k <58048945+6t8k@users.noreply.github.com>
…n offsets < 0 (pythonGH-99709) lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests.
…n offsets < 0 (pythonGH-99709) lseek() always returns 0 for character pseudo-devices like `/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it always returns -1, to which CPython reacts by raising appropriate exceptions). They are thus technically seekable despite not having seek semantics. When calling read() on e.g. an instance of `io.BufferedReader` that wraps such a file, `BufferedReader` reads ahead, filling its buffer, creating a discrepancy between the number of bytes read and the internal `tell()` always returning 0, which previously resulted in e.g. `BufferedReader.tell()` or `BufferedReader.seek()` being able to return positions < 0 even though these are supposed to be always >= 0. Invariably keep the return value non-negative by returning max(former_return_value, 0) instead, and add some corresponding tests.
Bug report
I believe
tell
should never be negative, even for/dev/urandom
. And @benjaminp wrote ToDo that it shouldn't be negative.Environment
Fedora linux, python 3.10.6
Linked PRs
io.BufferedReader.tell
etc. being able to return offsets < 0 #99709The text was updated successfully, but these errors were encountered: