Skip to content

Refactor usage of temporary files in multipers.io#49

Merged
DavidLapous merged 7 commits intoDavidLapous:mainfrom
maarzt:simple-tempfiles
Oct 27, 2025
Merged

Refactor usage of temporary files in multipers.io#49
DavidLapous merged 7 commits intoDavidLapous:mainfrom
maarzt:simple-tempfiles

Conversation

@maarzt
Copy link
Copy Markdown
Contributor

@maarzt maarzt commented Oct 24, 2025

Thank you for providing the amazing multipers package. It's a powerful package, doing some amazing math. My colleagues are using it for their research. While doing so, they got into trouble with the creation of temp files. Turns out the temp folder was simply running out of storage space. I think they contacted you as well...

While investigation this issue I noticed that the code dealing with temporary files can be simplified. The trick would be to use the python standard library tempfile.

The tempfile library has the added advantage of allowing users to configure temp dir. It can set by the environment variable "TMPDIR". See https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp
And there are system dependent defaults: on unix it is /tmp/ like currently in your code.

The "TMPDIR" environment variable would also help my colleague, with the running out of space problem. As he could configure a directory with enough storage space.

Please note that I removed the parameter 'id'. As it is no longer used. And if users set 'clear=False', they would find the temporary files in a different location.

The code can be simplified by using python's "tempfile"
library. Which has some benefits:
* A system dependent temp directory is automatically selected. For example '/tmp/' on linux.
* Temp files are automatically deleted.
* Users can customize the temp directory using the
  environment variable "TMPDIR".
multipers.io no longer has a global variable input_path
@DavidLapous
Copy link
Copy Markdown
Owner

Thanks for taking a look into it ! This solution is indeed much more elegant.
Quick question : The id was mostly meant to allow parallelism with external software, and as I read it in the doc :

Files names used by this module include a string of random characters which allows those files to be securely created in shared temporary directories

So IIC, the folder names are “almost surely” different, but is this guaranteed ? In the worse case, we can just re-add the PID in the loop.

The github tests do not include the full scc suite (since there is no mpfree & others there) I'll check that asap on my laptop.

@maarzt
Copy link
Copy Markdown
Contributor Author

maarzt commented Oct 27, 2025

Hi @DavidLapous, thank you for reviewing this so quickly!

Regarding your question:

Quick question : The id was mostly meant to allow parallelism with external software, and as I read it in the doc :

Files names used by this module include a string of random characters which allows those files to be securely created in shared temporary directories

You're absolutely right that the documentation gives that impression. However, randomized file names aren't the only safeguard in the tempfile package. I looked into the implementation, and another key mechanism is the handling of FileExistsError — see this line.

tempfile is designed precisely for this kind of use case, and it's widely used. Essentially test daily a billion times. The docs (somewhat cryptically) states:

Creates a temporary directory in the most secure manner possible. There are no race conditions in the directory’s creation.

I interpret that as: "there’s no better way to safely create a temp directory."

Implementing a custom solution risks introducing subtle bugs. Things become difficult especially when it comes to concurrency.

@maarzt
Copy link
Copy Markdown
Contributor Author

maarzt commented Oct 27, 2025

The github tests do not include the full scc suite (since there is no mpfree & others there) I'll check that asap on my laptop.

It's great that you have the unit tests in place. Right as it should be ;)
I run the tests on my computer with mpfree installed, and it worked.

@DavidLapous DavidLapous merged commit f960e95 into DavidLapous:main Oct 27, 2025
17 checks passed
@DavidLapous
Copy link
Copy Markdown
Owner

It also worked on my machine, LGTM ! Thank you for the contribution !

@maarzt
Copy link
Copy Markdown
Contributor Author

maarzt commented Oct 28, 2025

Thank you very for reviewing and merging this so quickly. It's awesome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants