Guidance on avoiding race conditions when multiple jobs try to access and (re)generate the same scratch space #26

Open
fingolfin opened this issue Jan 7, 2022 · 2 comments

Comments

@fingolfin
Member

fingolfin commented Jan 7, 2022

Motivated by oscar-system/Polymake.jl#381 :

A user may run `@everywhere using PACKAGENAME` on a package that uses scratch spaces. If the scratch space is missing or outdated, each of the parallel jobs may try to regenerate it at the same time, creating a mess.

I think it would be good to warn about this situation in the Scratch.jl documentation, and perhaps also provide some guidance on how to deal with that.

@fredrikekre
Member

@fingolfin
Member Author

Perhaps? The documentation of Pidfile.jl is a bit sparse... I assume one ought to use one of its API functions to create a lock file, then create the scratch space, and finally release the lock? An example would be useful. I also wonder how well it works on, e.g., NFS volumes (one still sees those in labs for home directories), and whether it works on Windows.
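
For illustration, the lock / generate / release sequence described above might look roughly like the following. This is only a sketch, not something taken from the Scratch.jl or Pidfile.jl documentation: `ensure_generated`, the `.complete` marker file, and the `.lock.pid` file name are hypothetical names chosen for the example, and it assumes the scratch directory itself is never deleted while a lock is held. It uses the do-block form of `mkpidlock` from Pidfile.jl, which releases the lock when the block exits. It does not answer the NFS or Windows questions.

```julia
using Pidfile  # provides mkpidlock

# Hypothetical helper: run `generate!(dir)` at most once for `dir`,
# even if several processes call this at the same time.
function ensure_generated(generate!::Function, dir::AbstractString)
    marker = joinpath(dir, ".complete")   # hypothetical "generation finished" marker
    isfile(marker) && return dir          # fast path: nothing to do

    # Only one process at a time gets past this point; the others wait here.
    mkpidlock(joinpath(dir, ".lock.pid")) do
        # Re-check under the lock: another process may have finished
        # generating while this one was waiting.
        if !isfile(marker)
            generate!(dir)
            touch(marker)                 # written last, so a crash leaves no marker
        end
    end
    return dir
end

# Possible use from inside a package module that uses Scratch.jl
# (the key "generated" and the file name are made up for this example):
#
#     using Scratch
#     dir = ensure_generated(@get_scratch!("generated")) do d
#         write(joinpath(d, "data.txt"), "expensive result")
#     end
```

The marker file is checked both before and after taking the lock so that the common case (scratch space already populated) does not touch the lock at all, while racing processes still generate the contents only once.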
