tempfile mixes str and bytes in an inconsistent manner #84878
tempfile fails on mixed str and bytes when tempfile.tempdir is set to a non-existent bytes path, but succeeds when it is set to an existing bytes path.
$ python3.9
Python 3.9.0a6 (default, Apr 28 2020, 00:00:00)
[GCC 10.0.1 20200430 (Red Hat 10.0.1-0.14)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> tempfile.tempdir = b'/doesntexist'
>>> tempfile.TemporaryFile()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.9/tempfile.py", line 615, in TemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "/usr/lib64/python3.9/tempfile.py", line 248, in _mkstemp_inner
    file = _os.path.join(dir, pre + name + suf)
  File "/usr/lib64/python3.9/posixpath.py", line 90, in join
    genericpath._check_arg_types('join', a, *p)
  File "/usr/lib64/python3.9/genericpath.py", line 155, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components
>>> tempfile.tempdir = b'/tmp'
>>> tempfile.TemporaryFile()
<_io.BufferedRandom name=3>
>>> tempfile.mktemp()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.9/tempfile.py", line 400, in mktemp
    file = _os.path.join(dir, prefix + name + suffix)
  File "/usr/lib64/python3.9/posixpath.py", line 90, in join
    genericpath._check_arg_types('join', a, *p)
  File "/usr/lib64/python3.9/genericpath.py", line 155, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components
It seems to me that the case of |
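For reference, the TypeError in both tracebacks is raised by os.path.join, which refuses to combine str and bytes components. The following is a minimal sketch that reproduces only that step, independent of tempfile. The helper `join_or_error` and the file name `tmpabc123` are made up for illustration; they are not what tempfile uses internally.

```python
import os.path


def join_or_error(directory, name):
    """Return (joined_path, None) on success or (None, exception) on a type mismatch."""
    try:
        return os.path.join(directory, name), None
    except TypeError as exc:
        return None, exc


# bytes directory + str file name: the combination seen in the tracebacks
mixed_path, mixed_error = join_or_error(b"/tmp", "tmpabc123")

# Matching types join without complaint, in either flavour
bytes_path, bytes_error = join_or_error(b"/tmp", b"tmpabc123")
str_path, str_error = join_or_error("/tmp", "tmpabc123")

print(mixed_error)
print(type(bytes_path).__name__, type(str_path).__name__)
```

The join only works when the directory and the generated name share a type, so any code path that builds a str name while tempdir is bytes will fail this way.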
In the meantime, I noticed the following in addition:
[ericl@tuxedo ~]$ python3.9
Python 3.9.0a6 (default, Apr 28 2020, 00:00:00)
[GCC 10.0.1 20200430 (Red Hat 10.0.1-0.14)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> tempfile.tempdir = b'/tmp'
>>> tempfile.gettempdir()
b'/tmp'
>>> tempfile.tempdir = '/tmp'
>>> tempfile.gettempdirb()
b'/tmp'
This explicitly contradicts the interface description, which states that tempfile.gettempdir() returns a string. "Encouraged" by this discovery, I've tried to write a patch for tempfile.py that addresses the issues discovered. It's my first Python patch ever, so bear with me. The default remains str, but if someone _explicitly_ sets tempdir to bytes, it becomes bytes. I've tried all the commands listed previously and it all looks consistent to me. |
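Independently of any patch, calling code can normalize the value it reads. The sketch below assumes only that os.fsdecode() and os.fsencode() accept both str and bytes; `tempdir_as_str` and `tempdir_as_bytes` are hypothetical helper names, not part of tempfile.

```python
import os
import tempfile


def tempdir_as_str():
    """Return the temp directory as str, whatever type tempfile.tempdir holds."""
    return os.fsdecode(tempfile.gettempdir())


def tempdir_as_bytes():
    """Return the temp directory as bytes, whatever type tempfile.tempdir holds."""
    return os.fsencode(tempfile.gettempdir())


saved = tempfile.tempdir  # remember the process-wide setting
try:
    tempfile.tempdir = b"/tmp"
    str_result = tempdir_as_str()

    tempfile.tempdir = "/tmp"
    bytes_result = tempdir_as_bytes()
finally:
    tempfile.tempdir = saved  # always restore the global

print(type(str_result).__name__, type(bytes_result).__name__)
```

Because both helpers convert unconditionally, they should return the same types regardless of whether the interpreter already normalizes the result of gettempdir().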
Sorry, I uploaded an early version of the patch by mistake. The new one is the one I had actually tested (the old one would fail with mixed bytes and str under certain circumstances; I can't remember which any more). |
Could you please turn that into a GitHub PR? |
On 23/05/2020 21:41, Gregory P. Smith wrote:
|
Maybe just document that tempdir should be a string? |
On 24/05/2020 20:30, Serhiy Storchaka wrote:
|
In any case this is a new feature, so it can be added only in 3.10, and we need the documentation patch for 3.9 and older. As a workaround you can use os.fsdecode(): tempfile.tempdir = os.fsdecode(b'/doesntexist') |
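A slightly fuller sketch of that workaround follows. It assumes the bytes path comes from somewhere outside your control; here `bytes_dir` is simply the platform's real temp directory encoded to bytes, as a stand-in.

```python
import os
import tempfile

saved = tempfile.tempdir               # remember the process-wide setting
real_dir = tempfile.gettempdir()       # an existing directory to work with
bytes_dir = os.fsencode(real_dir)      # stand-in for a bytes path from elsewhere

try:
    # Decode before assigning, so tempfile only ever sees a str directory
    tempfile.tempdir = os.fsdecode(bytes_dir)

    candidate = tempfile.mktemp()      # only builds a name, creates nothing
    with tempfile.TemporaryFile() as f:
        f.write(b"data")
        f.seek(0)
        content = f.read()
finally:
    tempfile.tempdir = saved           # always restore the global

print(type(candidate).__name__, content)
```

mktemp() is used here only because it appears in the report; it is documented as unsafe for real use.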
Well, it's your decision, but as a user of the library, it didn't feel like a new feature, just a bug to be fixed; the main issue is the inconsistent handling of bytes vs. str. |
We consider it closer to new feature as it changes existing behavior in a way that people cannot _depend_ on being present in older Python releases as it'd only appear in a bugfix release, so most people could never write code depending on it while claiming to generally support 3.7-3.9. Anyways your PR overall looks good for 3.10. I left some comments. |
Well, the behavior for an existing bytes path was not intended, but some people may depend on it. Before making it an official feature, we should also check other cases of unintended behavior. What if tempfile.tempdir is set to a Path object or to a file descriptor of a directory? Does anything work in these cases, and should we support them? |
I expect the best decision to be to get rid of tempfile.tempdir entirely. That would need to be its own issue, with a deprecation period involved. A process global that alters the behavior of all calls into a module that don't explicitly opt out is a bad API. |
I don't think that it is so bad. The behavior depends on the environment variables TMPDIR, TEMP, and TMP. The tempdir variable is just a cache for them, as sys.path is a cache for PYTHONPATH. We just need to document that it should be a string if not None. Nobody expects bytes paths to be valid in sys.path. On the other hand, there is gettempdir(), so we have two different ways to get the value of tempfile.tempdir. |
I clarified the documentation in the PR and added a regression test. I chose to explicitly document that tempfile.tempdir may only be str or bytes and cannot be a path-like object. We already document that people really should not set it and should instead pass dir= to their APIs. Now the docs make that clearer when it comes to setting it to a bytes object, due to the consequences of the global API return type change. I view this PR as cleaning up a partial misbehavior while preserving the API wart of it ever working in any manner when set to a bytes value. Getting rid of tempfile.tempdir entirely, or preventing it from being a bytes value at all, would be much more of a breaking change, even though desirable. Not a bugfix. Now, with the PR as is, we at least document that people should avoid doing that and make it clear what consistent behavior happens when they do it anyways. With a test. |
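As an illustration of the dir= recommendation, here is a minimal sketch that never touches tempfile.tempdir. It relies on the documented rule that the returned name has the same type as the dir argument; the `scratch` directory is created only for this example.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing is left behind
with tempfile.TemporaryDirectory() as scratch:
    # str directory in, str file name out
    fd, str_name = tempfile.mkstemp(dir=scratch)
    os.close(fd)

    # bytes directory in, bytes file name out
    fd, bytes_name = tempfile.mkstemp(dir=os.fsencode(scratch))
    os.close(fd)

    name_types = (type(str_name), type(bytes_name))

print(name_types)
```

Passing dir= per call keeps the choice of str or bytes local to the caller, instead of changing return types for every other user of the module in the same process.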
Fixed for 3.10. Marking as a release blocker for 3.9 and assigning for a release manager decision on if they accept this change (the PR) as a bugfix in 3.9 or not. (see the PR) It is late enough in 3.8's lifetime I wouldn't touch that one. |
Sorry, I decided not to take this. We're four bugfix releases into 3.9 and there is a non-zero chance somebody relies on the existing behavior somehow. Marking this as changing with 3.10.0 is cleaner from an end user standpoint. |