Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix spack locking on some NFS systems #32426

Merged
merged 4 commits into from
Sep 6, 2022
Merged

Conversation

sethrj
Copy link
Contributor

@sethrj sethrj commented Aug 29, 2022

It turns out that on some filesystem/os combinations, calling os.makedirs can raise PermissionError (errno.EPERM) if the directory is extant and readable, but not writable. This call happened before trying to activate environments.

Fixes #17407

@spackbot-app spackbot-app bot added core PR affects Spack core functionality locking utilities labels Aug 29, 2022
@sethrj sethrj requested a review from alalazo August 29, 2022 16:23
Copy link
Member

@tgamblin tgamblin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think catching OSError and just warning if the directory already exists is too broad. We're trying to be really specific in llnl.util.lock and I think it's better to fail fast in the corner cases than to let something potentially dangerous slip through.

#17407 is giving you errno 13, which is EACCES. If that is the same error you see with NFSv4, I think we should check for that errno specifically, and allow it if it happens that the directory exists.

This is ok since as you note, we may be trying to get a read-only lock. I think we could avoid this issue altogether in Python 3 with the exist_ok parameter to os.makedirs but sadly we (for the time being) need to support 2.7. It'd be interesting to know if exist_ok gets rid of the permissions error if you are trying this with Python 3.

@tgamblin
Copy link
Member

tgamblin commented Sep 5, 2022

Ok actually I just looked at the cpython source, and it's broad. So why don't we just go with that. Here is what they do in Python 3:

    try:
        mkdir(name, mode)
    except OSError:
        # Cannot rely on checking for EEXIST, since the operating system
        # could give priority to other errors like EACCES or EROFS
        if not exist_ok or not path.isdir(name):
            raise

So apparently we care about EROFS and maybe others too.

lib/spack/llnl/util/lock.py Outdated Show resolved Hide resolved
Co-authored-by: Todd Gamblin <tgamblin@llnl.gov>
@tgamblin tgamblin merged commit c7292aa into spack:develop Sep 6, 2022
@sethrj sethrj deleted the fix-eperm branch September 6, 2022 17:10
@sethrj
Copy link
Contributor Author

sethrj commented Sep 6, 2022

Thanks @tgamblin !

ma595 pushed a commit to ma595/spack that referenced this pull request Sep 13, 2022
Co-authored-by: Todd Gamblin <tgamblin@llnl.gov>
@tpdownes
Copy link

tpdownes commented Oct 7, 2022

@tgamblin the way I read things, this change is not presently slated to make it into v0.18.2. Is it possible to make that happen? We have been hitting it regularly maintaining shared Spack installations for our HPC users.

GoogleCloudPlatform/cluster-toolkit#539

@haampie
Copy link
Member

haampie commented Oct 7, 2022

Me and @alalazo are usually the only people doing backports, but I've largely stopped doing that since #24718 because pretty much nothing applies cleanly.

This is a small fix, so we can probably do it.

haampie pushed a commit that referenced this pull request Oct 7, 2022
Co-authored-by: Todd Gamblin <tgamblin@llnl.gov>
@haampie haampie mentioned this pull request Oct 7, 2022
12 tasks
haampie pushed a commit that referenced this pull request Oct 7, 2022
Co-authored-by: Todd Gamblin <tgamblin@llnl.gov>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core PR affects Spack core functionality locking utilities
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Permission error when activating environment by a different user
4 participants