Refactor usage of temporary files in multipers.io#49
DavidLapous merged 7 commits into DavidLapous:main
The code can be simplified by using Python's "tempfile" library, which has some benefits:
* A system-dependent temp directory is automatically selected, for example '/tmp/' on Linux.
* Temp files are automatically deleted.
* Users can customize the temp directory using the environment variable "TMPDIR".
multipers.io no longer has a global variable input_path
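To illustrate the automatic cleanup mentioned above, here is a minimal sketch. It is not the code from this PR: the helper name `write_and_read_back`, the `multipers_` prefix, and the file name `input.scc` are made up for the example.

```python
import os
import tempfile


def write_and_read_back(payload):
    """Hypothetical helper: round-trip some text through a temporary file."""
    # The directory is created in the system-dependent temp location
    # and removed, with everything in it, when the "with" block exits.
    with tempfile.TemporaryDirectory(prefix="multipers_") as tmp_dir:
        path = os.path.join(tmp_dir, "input.scc")  # illustrative file name
        with open(path, "w") as f:
            f.write(payload)
        with open(path) as f:
            content = f.read()
    # At this point the directory no longer exists.
    return content, os.path.exists(tmp_dir)


content, still_exists = write_and_read_back("hello")
```

With this pattern there is no need to track a global path or to delete files by hand.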
Thanks for taking a look into it! This solution is indeed much more elegant.
So IIUC, the folder names are “almost surely” different, but is this guaranteed? In the worst case, we can just re-add the PID in the loop. The GitHub tests do not include the full scc suite (since mpfree and others are not available there), so I'll check that on my laptop asap.
Hi @DavidLapous, thank you for reviewing this so quickly! Regarding your question:
You're absolutely right that the documentation gives that impression. However, randomized file names aren't the only safeguard in the
I interpret that as: "there’s no better way to safely create a temp directory." Implementing a custom solution risks introducing subtle bugs. Things become especially difficult when it comes to concurrency.
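As a rough check of the concurrency point, the sketch below creates temp directories from several threads at once and verifies that no two calls received the same path. As far as I understand CPython's implementation, `tempfile.mkdtemp` creates the directory itself and retries with a new name if the candidate already exists, so a name collision does not lead to a shared directory. The `multipers_` prefix and the thread counts are arbitrary choices for the example.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_dir(_):
    # mkdtemp both picks a name and creates the directory.
    return tempfile.mkdtemp(prefix="multipers_")


# Create 32 directories from 8 threads at the same time.
with ThreadPoolExecutor(max_workers=8) as pool:
    dirs = list(pool.map(make_dir, range(32)))

all_distinct = len(set(dirs)) == len(dirs)
all_exist = all(os.path.isdir(d) for d in dirs)

# mkdtemp does not clean up after itself, so remove the directories here.
for d in dirs:
    shutil.rmtree(d)
```

This is only an empirical check, not a proof, but it matches the behaviour described in the documentation.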
It's great that you have the unit tests in place. Just as it should be ;)
It also worked on my machine, LGTM! Thank you for the contribution!
Thank you very much for reviewing and merging this so quickly. It's awesome!
Thank you for providing the amazing multipers package. It's a powerful package, doing some amazing math. My colleagues are using it for their research. While doing so, they got into trouble with the creation of temp files. Turns out the temp folder was simply running out of storage space. I think they contacted you as well...
While investigating this issue, I noticed that the code dealing with temporary files can be simplified. The trick would be to use the Python standard library module tempfile.
The tempfile library has the added advantage of allowing users to configure the temp dir. It can be set via the environment variable "TMPDIR". See https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp
And there are system-dependent defaults: on Unix it is /tmp/, like currently in your code.
The "TMPDIR" environment variable would also help my colleague with the running-out-of-space problem, as he could configure a directory with enough storage space.
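For reference, here is a small sketch of how "TMPDIR" influences tempfile. Because `tempfile.gettempdir()` caches its result within a process, the example starts a fresh Python interpreter with the variable set. The directory created with the `big_disk_` prefix is only a stand-in for a location with enough storage space.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for a directory on a disk with enough free space.
custom_tmp = tempfile.mkdtemp(prefix="big_disk_")

# Start a fresh interpreter with TMPDIR pointing at that directory
# and ask it which temp directory tempfile would use.
env = dict(os.environ, TMPDIR=custom_tmp)
result = subprocess.run(
    [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
reported = result.stdout.strip()

# Compare resolved paths to be robust against symlinked temp locations.
uses_custom = os.path.realpath(reported) == os.path.realpath(custom_tmp)

os.rmdir(custom_tmp)
```

In practice, a user would simply export TMPDIR in their shell before launching their script.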
Please note that I removed the parameter 'id', as it is no longer used. Also, if users set 'clear=False', they will find the temporary files in a different location than before.