New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fileobject.c can switch between fread and fwrite without an intervening flush or seek, invoking undefined behaviour #52200
Comments
The C standard (any version, or POSIX), says in the description of fopen that: Objects/fileobject.c makes calls to fread and fwrite without taking this into account. So calls from Python to read or write methods of a file object opened in any "rw" mode, may invoke undefined behaviour. It isn't reasonable to rely on Python code to avoid this situation, even if were considered acceptable in C. (Arguably this is a bug in the C standard, but it is unlikely to be fixed there or in POSIX, because of differences in philosophy about language safety.) To fix this, fileobject.c should keep track of whether the last I/O operation was an input or output, and perform a call to fflush whenever an input follows an output or vice versa. This should not significantly affect performance in any case where the behaviour was previously defined (in cases where it wasn't, correctness trumps performance). fflush does not affect the file position and should have no other negative effect, because the stdio implementation is free to flush buffered data at any time (and certainly on I/O operations). Despite the undefined behaviour, I don't currently know of a platform where this would lead to an exploitable security bug. I'm marking this issue as security-relevant anyway, because it may prevent analysing whether Python applications behave securely only on the basis of documented behaviour. |
Correction: when input is followed by output, the call needed to avoid undefined behaviour has to be to a "file positioning function" (fseek, fsetpos, or rewind, but not fflush). Since fileobject.c does not use wide I/O operations, it should be sufficient to use _portable_fseek(fp, 0, SEEK_SET). (_portable_fseek may call some function that is not strictly defined to be a "file positioning function", e.g. fseeko() or fseek64(). However, it would be insane for a stdio implementation not to treat those as being file positioning functions as far as the intent of the C or POSIX standards is concerned.) |
"... it should be sufficient to use _portable_fseek(fp, 0, SEEK_SET)." I meant SEEK_CUR here. |
The 2.x file object mostly mirrors abilities of the standard C buffered IO functions (including, for example, special behaviour of text files under Windows). Therefore, Python code should call flush() manually if needed. It should be noted that the 2.x file object has been existing for ages and this issue is, to my knowledge, very rarely brought up. The 3.x IO library should not have this problem since we wrote our own buffering layer, and we have dedicated tests for mixed reads and writes. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: