tempfile.NamedTemporaryFile not particularly useful on Windows #58451
Comments
NamedTemporaryFile is too hard to use portably when you need to open the file by name after writing it. To do that, you need to close the file first (on Windows), which means you have to pass delete=False, which in turn means that you get no help in cleaning up the actual file resource, which as you can see from the code in tempfile.py is devilishly hard to do correctly. The fact that it's different on posix (you can open the file for reading by name without closing it first) makes this problem worse. What we really need for this use-case is a way to say, "delete on __del__ but not on close()." |
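For readers who want to see the workaround this comment describes, here is a minimal sketch. It assumes the caller is happy to do the cleanup by hand; the function name `write_then_reopen` and the `.txt` suffix are illustrative only and are not part of the thread.

```python
import os
import tempfile


def write_then_reopen(data: str) -> str:
    """Write data to a named temp file, close it, then reopen it by name."""
    # delete=False keeps the file on disk after close(), so cleanup is manual.
    f = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
    try:
        f.write(data)
        f.close()  # flush and release the handle before reopening by name
        with open(f.name) as reader:
            return reader.read()
    finally:
        f.close()  # harmless if the file is already closed
        os.unlink(f.name)  # the "no help in cleaning up" part: we do it ourselves
```

The `try`/`finally` is exactly the boilerplate the report objects to: with `delete=False` nothing removes the file unless the caller does.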
This is quite silly indeed, and is due to the use of O_TEMPORARY in the file creation flags. |
What's the proposal here? If delete is True, close() must delete the file. It is not acceptable for close() and __del__() to behave differently. OTOH, if the proposal is merely to change the way the file is opened on Windows so that it can be opened again without closing it first, that sounds fine. |
That would be my proposal. It probably needs getting rid of O_TEMPORARY, |
I disagree that it's unacceptable for close() and __del__() to behave differently. The acceptable difference would be that __del__() closes (if necessary) /and/ deletes the file on disk, while close() merely closes the file. If you can in fact "change the way the file is opened on Windows so that it can be opened again without closing it first," that would be fine with me. It isn't clear to me that Windows supports that option, but I'm not an expert. Another possibility, of course, is something like what's implemented in: |
The whole point of close() methods is to offer deterministic resource management to applications that need it. Pointing out to applications when they're relying on CPython's refcounting for prompt resource cleanup is why many of the standard types now trigger ResourceWarning for any application that relies on the GC to clean up such external resources in __del__. So, no, we're not going to back away from the explicit guarantee in the NamedTemporaryFile docs: "If delete is true (the default), the file is deleted as soon as it is closed." (Especially since doing so would also breach backward compatibility guarantees) However, you're right that the exclusive read lock in the current implementation makes the default behaviour of NamedTemporaryFile significantly less useful on Windows than it is on POSIX systems, so the implementation should be changed to behave more like the POSIX variant. |
If file.close() "offers deterministic resource management," then you have to consider the file's open/closed state to be a resource separate from its existence. A NamedTemporaryFile whose close() deterministically managed the open/closed state but not the existence of the file would be consistent with file. That said, I understand the move toward deprecating (in the informal sense) cleanups that rely on GC. I'm not suggesting breaking backward compatibility, either. I'm suggesting that it might make sense to allow an explicit close-without-delete as an /extension/ of the current interface. Given the move away from GC-cleanups, you'd probably want an explicit unlink() method as well in that case. |
Dave, decoupling the lifecycle of the created file from the object that created it is exactly what delete=False already covers. The complicated dance in NamedTemporaryFile is only to make *del* work a bit more reliably during process shutdown (due to some messy internal problems with what CPython is doing at that point). If you're doing deterministic cleanup (even via atexit), you don't need any of that - you can just use os.unlink(). |
Nick, not to belabor this, but I guess you don't understand the use-case in question very well, or you'd see that delete=False doesn't cover it. The use case is this: I have to write a test for a function that takes a filename as a parameter and opens and reads from the file with that name. The test should conjure up an appropriate file, call the function, check the results, and clean up the file afterwards. It doesn't matter when the file gets cleaned up, as long as it is cleaned up "eventually." Having to explicitly delete the file is exactly the kind of boilerplate one wants to avoid in situations like this. Even if Windows allows a file to be opened for reading (in some circumstances) when it is already open for writing, it isn't hard to imagine that Python might someday have to support an OS that didn't allow it under any circumstances. It is also a bit perverse to have to keep the file open for writing after you're definitively done writing it, just to prevent it from being deleted prematurely. I can understand most of the arguments you make against close-without-delete, except those (like the above) that seem to come from a "you shouldn't want that; it's just wrong" stance. |
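One way to get the "conjure a file, call the function, clean up eventually" shape of this use case today is sketched below. It swaps in `tempfile.TemporaryDirectory` instead of `NamedTemporaryFile`, which is not what this comment asks for; `count_lines` is a hypothetical stand-in for the function under test.

```python
import os
import tempfile


def count_lines(filename: str) -> int:
    """Hypothetical function under test: opens the named file and reads it."""
    with open(filename) as f:
        return sum(1 for _ in f)


def run_check() -> int:
    # The directory and everything in it is removed when the block exits.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "input.txt")
        with open(path, "w") as f:
            f.write("one\ntwo\n")
        # The file is closed at this point, so it can be opened by name
        # on any platform.
        return count_lines(path)
```

This avoids the explicit unlink, at the cost of choosing a file name by hand inside the temporary directory.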
See bpo-14514 for an alternate proposal to solve this. I did search before I opened that issue, but search is currently somewhat broken and I did not find this issue. I'm not marking it as a dup because my proposal is really a new feature. |
I agree we need to add something here to better support the idiom where the "close" and "delete" operations on a NamedTemporaryFile are decoupled without the delete becoming a completely independent call to os.unlink(). I agree with RDM's proposal in bpo-14514 that the replacement should be "delete on __exit__ but not on close". As with generator context managers, I'd also add in the "last ditch" cleanup behaviour in __del__. Converting the issue to a feature request for 3.3 - there's no bug here, just an interaction with Windows that makes the existing behavioural options inconvenient. After all, you can currently get deterministic cleanup (with a __del__ fallback) via:

    @contextmanager
    def named_temp(name):
        f = NamedTemporaryFile(name, delete=False)
        try:
            yield f
        finally:
            try:
                os.unlink(name)
            except OSError:
                pass

You need to be careful to make sure you keep the CM alive (or it will delete the file behind your back), but the idiom RDM described in the other issues handles that for you:

    with named_temp(fname) as f:
        data = "Data\n"
        f.write(data)
        f.close() # Windows compatibility
        with open(fname) as f:
            self.assertEqual(f.read(), data)

As far as the API goes, I'm inclined to make a CM with the above behaviour available as a new class method on NamedTemporaryFile:

    with NamedTemporaryFile.delete_after(fname) as f:
        # As per the workaround
|
Although, for the stdlib version, I wouldn't suppress the OSError (I'd follow what we currently do for TemporaryDirectory) |
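As posted, the `named_temp` workaround passes `name` as the first positional argument of `NamedTemporaryFile`, which is the `mode` parameter, so it would not run unchanged. The following is a sketch of a runnable variant, assuming the file name is generated by `tempfile` rather than supplied by the caller. The `mode` and `suffix` parameters and the `demo` helper are additions for illustration, not part of the original comment.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def named_temp(mode="w+", suffix=""):
    """Yield a named temp file that survives close() and is removed on exit."""
    f = tempfile.NamedTemporaryFile(mode=mode, suffix=suffix, delete=False)
    try:
        yield f
    finally:
        f.close()          # no-op if the caller already closed it
        os.unlink(f.name)  # OSError is not suppressed, per the follow-up comment


def demo():
    data = "Data\n"
    with named_temp() as f:
        f.write(data)
        f.close()  # Windows compatibility
        with open(f.name) as reader:
            content = reader.read()
        name = f.name
    # After the block exits, the file should be gone.
    return content == data, os.path.exists(name)
```

Unlike the original, this does not fall back to `__del__`: if the generator is never resumed, the file stays on disk.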
"delete_after" what? I know it is somewhat implicit in the fact that it is a context manager call, but that is not the only context the method name will be seen in. (eg: 'dir' list of methods, doc index, etc). Even as a context manager my first thought in reading it was "delete after what?", and then I went, "oh, right". How about "delete_on_exit"? |
By the way, I still think it would be nicer just to have the context manager work as expected with delete=True (ie: doesn't delete until the end of the context manager, whether the file is closed or not). I'm OK with being voted down on that, though. |
Indeed, the current behaviour under Windows seems to be kind of a |
I agree. If the primary usage of the class does not work well on Windows, developers will continue to write code using the primary usage because it works on their unix system, and it will continue to cause failures when run on windows. Because Python should run cross-platform, I consider this a bug in the implementation and would prefer it be adapted such that the primary use case works well on all major platforms. If there is a separate class method for different behavior, it should be for the specialized behavior, not for the preferred, portable behavior. I recognize there are backward-compatibility issues here, so maybe it's necessary to deprecate NamedTemporaryFile in favor of a replacement. |
Well, fixing NamedTemporaryFile in either of the ways we've discussed isn't going to fix people writing non-portable code. A unix coder isn't necessarily going to close the file before reading it. However, it would at least significantly increase the odds that the code would be portable, while the current situation *ensures* that the code is not portable. |
Daniel. If you have any interest in this issue, would you mind |
Tim Golden, I have implemented a Windows-friendly solution to the latter case using Nick Coghlan's code. My version does not delete the file until the context manager exits, and *if* the context manager exits due to an exception it leaves the file in place and reports its location, to aid me in debugging. |
Daniel, Nick, shouldn't the context manager yield f within a with block? |
Rather than add a NamedTemporaryFile.delete_after() classmethod, would it not be simpler to just add a close_without_unlink() method to NamedTemporaryFile?

    with NamedTemporaryFile() as f:
        <write to f>
        f.close_without_unlink()
        with open(f.name, 'rb') as f:
            <read from f>
|
Davide, the @contextlib.contextmanager decorator effectively wraps the
|
Looking at the various comments, I think we have 5 votes for deleting on CM exit when used as a CM, and no change in behaviour otherwise (me, Zachary, Ethan, Jason and Steve). Steve also wants O_TEMPORARY to be removed, which doesn't seem controversial among this group of people. Eryk has argued for a delete_on_close flag that would need to be explicitly set to False, retaining the use of O_TEMPORARY in the default case, but there doesn't seem to be a lot of support for that. If I've misrepresented anyone's view, please speak up! I didn't look back at the stuff from 2013 and earlier, I'll admit. I do think this needs care to implement (and document!) correctly. For example, consider the following case:

    ntf = NamedTemporaryFile()
    # Do some stuff (1)
    with ntf:
        # Do some stuff (2)
    # Do some followup stuff

I assume we'd want a close in (1) to delete the file, but a close in (2) to leave the file in place until the CM exit. Evgeny, would you be willing to update your PR (including adding the docs change, and tests to catch as many edge cases as you can think up) to match this behaviour? |
Removing O_TEMPORARY is not an afterthought here. It is the core of this issue. The O_TEMPORARY flag MUST NOT be used if the goal is to make NamedTemporaryFile() "particularly useful on Windows". A file that's opened with DELETE access cannot be reopened in most cases, because most opens do not share delete access, but it also can't be closed to allow it to be reopened because the OS will delete it. I replied twice that I thought using the CM exit instead of O_TEMPORARY is okay for NamedTemporaryFile() -- but only if a separate implementation of TemporaryFile() that uses O_TEMPORARY is added at the same time. I want guaranteed cleanup for TemporaryFile() since it's not intended to be reopened. |
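To make the delete-on-close behaviour concrete, here is a small probe, offered as a sketch rather than as what `tempfile` does internally. It opens a file with `os.O_TEMPORARY` when that flag exists (it is only defined on Windows), closes the descriptor, and reports whether the name is still on disk. The function name `exists_after_close` and the file name `probe.tmp` are illustrative.

```python
import os
import tempfile


def exists_after_close() -> bool:
    """Open a file with O_TEMPORARY where available, close it, and report
    whether the name is still on disk afterwards."""
    path = os.path.join(tempfile.mkdtemp(), "probe.tmp")
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    # O_TEMPORARY only exists on Windows; elsewhere this adds nothing.
    flags |= getattr(os, "O_TEMPORARY", 0)
    fd = os.open(path, flags, 0o600)
    os.write(fd, b"data")
    os.close(fd)  # with O_TEMPORARY, the OS deletes the file at this point
    still_there = os.path.exists(path)
    if still_there:
        os.unlink(path)  # no delete-on-close happened, so clean up by hand
    os.rmdir(os.path.dirname(path))
    return still_there
```

On Windows this should return `False`, which is why closing the file in order to reopen it by name cannot work while the flag is in use. On platforms without the flag it returns `True`.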
At the moment, TemporaryFile directly reuses NamedTemporaryFile on non-POSIX or Cygwin systems (Line 552 in a92d738).
Does your suggestion to keep O_TEMPORARY for TemporaryFile mean that NamedTemporaryFile needs a mechanism to know whether it was called as TemporaryFile, and to behave differently in that case than when it is called directly? |
Paul, thank you for moving this forward. I will give it a try. |
Just implement a separate function for TemporaryFile() instead of aliasing it to NamedTemporaryFile(). See msg390814. |
Eryk, thank you for clarifying. I apologise - I got bogged down somewhere in the middle of the discussion on reimplementing bits of the CRT (your posts are so information-dense that my usual habit of skimming breaks down - that's not a complaint, though!) |
Eryk, forgive my ignorance, but aren't you, in your msg390814, proposing yet another enhancement (separate from bpo-14243, discussed here), in this case for TemporaryFile on Windows systems? I may be mistaken, but I see that you are proposing some trick with first unlinking the file and then employing it as a temporary file using the previously known fd. This "trick" is not present in the current code and does not seem to address bpo-14243. Or am I talking total nonsense? |
It doesn't need to unlink the file to make it anonymous. I included that to highlight that it's possible. The details can be discussed and hashed out in the PR. I don't think implementing TemporaryFile() in Windows is separate from this issue. It goes hand in hand with the decision to stop using O_TEMPORARY in NamedTemporaryFile(). |
Eryk, I agree that implementing TemporaryFile() in Windows goes hand in hand with the decision to stop using O_TEMPORARY in NamedTemporaryFile(). The only thing I want to point out is that your suggestion also includes this "unlinking trick" (sorry, maybe there is a better description for this), which seems to be separate/extra to the usage of O_TEMPORARY. Hence my newbie questions are:
|
+1 to Eryk.
Same as TemporaryFile in Unix.
I don't think so. We didn't unlink because we didn't have separate implementations for TemporaryFile and NamedTemporaryFile.
I am not sure. But it is a "POSIX function" in Windows. I believe MS won't break compatibility.
We already have tests for TemporaryFile.
If anyone against it. But I think Eryk's proposal is the most reasonable. |
…ryFile on Windows and Python 3.12. delete_on_close is available in Python 3.12: - python/cpython#58451 - python/cpython#97015 so we don't need a custom NamedTemporaryFile implementation anymore.
As far as I can tell, this was fixed with the merge of GH-97015. |
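For reference, a minimal sketch of the pattern that the `delete_on_close` parameter (Python 3.12 and later) enables: the file can be closed and reopened by name inside the `with` block, and is removed when the block exits. The `roundtrip` helper is illustrative only.

```python
import os
import sys
import tempfile


def roundtrip(data: bytes):
    """Write, close, reopen by name, then report whether the file is gone."""
    if sys.version_info < (3, 12):
        raise RuntimeError("delete_on_close requires Python 3.12 or later")
    # delete stays True (the default), so the file is still removed
    # automatically, but on context manager exit rather than on close().
    with tempfile.NamedTemporaryFile(delete_on_close=False) as f:
        f.write(data)
        f.close()
        with open(f.name, "rb") as reader:
            read_back = reader.read()
        name = f.name
    return read_back, os.path.exists(name)
```

On 3.12+, `roundtrip(b"Data\n")` should return the bytes that were written along with `False`, since the file no longer exists once the block has exited.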