NFS permission problem on MacOS Sierra #8061

Closed
crashev opened this issue Nov 30, 2016 · 116 comments

Comments

@crashev

crashev commented Nov 30, 2016

Vagrant version

Vagrant 1.9.0

Host operating system

macOS Sierra 10.12.1

Guest operating system

Linux - Debian 8.5 - mokote/debian-8 (version 8.5)

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.synced_folder "~/SparkSoftware/applications", "/srv", type: "nfs", :linux__nfs_options => ["rw,no_subtree_check,no_root_squash"]
  config.vm.box = "spark"
  config.vm.network "private_network", ip: "192.168.33.10"
end

Debug output

vagrant@lb:/srv/octopus$ ls -ld README.md
-rw-r--r-- 1 501 dialout 221 Nov 30 15:59 README.md
vagrant@lb:/srv/octopus$ ls -ld app/config/parameters.yml
-rw-r--r-- 1 501 dialout 2905 Nov 28 14:30 app/config/parameters.yml
vagrant@lb:/srv/octopus$ rm README.md
vagrant@lb:/srv/octopus$ rm app/config/parameters.yml
rm: cannot remove ‘app/config/parameters.yml’: Permission denied
vagrant@lb:/srv/octopus$

And on macOS, the file /etc/exports contains:

# VAGRANT-BEGIN: 501 9f80a7df-67dc-4bee-b0c6-38a21aab06a2
"/Users/user/SparkSoftware/applications" 192.168.33.10 -alldirs -mapall=501:20
# VAGRANT-END: 501 9f80a7df-67dc-4bee-b0c6-38a21aab06a2

Expected behavior

I should be able to remove any file in the NFS-mounted share directory.

Actual behavior

I can only delete some files, seemingly only those in the root directory; anything below it cannot be deleted.

Steps to reproduce

Everything is shown in the debug output above.

References

@lightster

lightster commented Dec 1, 2016

I'm getting very similar behavior with CentOS 6 and 7 guest operating systems. My host operating system is also macOS Sierra. I've tried downgrading/upgrading Vagrant and VirtualBox.

The problem seems to be made worse by restarting my host/physical machine—everything on the shared partitions becomes undeletable as far as I can tell.

I have a few interesting ways to make the files deletable again:

  • If I do an ls -lR from the host machine of the files/directories I'm trying to delete, I can then delete the files via the guest machine.
  • If I edit the file from the guest machine, then I can delete the files from the guest machine! (Yes, that's right, the files are writeable from the guest machine even though they are not deletable)

I'm not sure this is really a Vagrant issue so much as a macOS Sierra nfsd issue, but hopefully the Vagrant community might be able to put information together to come up with a bug report and/or workaround.

@whizkid79

I can confirm @lightster's observations. Same issues and solutions here running Ubuntu on Parallels.

@joshuasmickus

I have this issue too, CentOS 6.8 and Mac OS Sierra.

@kazysgurskas

Confirmed with various Vagrant and Virtualbox versions and several guest OSes. Disabling NFS lookupcache seems to work, but the performance is unbearable. Recursive listing and then deletion seems to work as well. I found it the fastest to do find -delete twice to delete files.
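The "find -delete twice" trick could be wrapped roughly as follows for use inside the guest. The thread does not show the exact command, so treating it as two passes over the same path is an assumption, and the name `delete_twice` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper for use INSIDE the guest: delete a tree with two
# passes of find, since the first pass may hit "Permission denied".
delete_twice() {
    target="$1"
    # First pass: errors are expected on an affected share, so ignore them.
    find "$target" -delete 2>/dev/null || true
    # Second pass: only needed if something survived the first one.
    if [ -e "$target" ]; then
        find "$target" -delete
    fi
}

# Example (directory name is only an illustration):
# delete_twice /srv/octopus/vendor
```

The second pass reports its errors, so a tree that still cannot be removed shows up as a non-zero exit status instead of failing silently.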

@crashev
Author

crashev commented Dec 13, 2016

I wonder when someone from the Vagrant team will look into this. I can't work properly on OSX with Vagrant.

@dominikzogg

@crashev it seems that it's an OSX bug, not a Vagrant one. So far there are no known workarounds with acceptable performance.

@araines

araines commented Dec 13, 2016

@lightster mentioned a workaround - from your host machine (i.e. macOS) if you perform the following operation in the root of the shared folder:

ls -lR > /dev/null

Then that appears to refresh all the NFS links, meaning you are able to operate as normal inside Vagrant again. It isn't ideal, but I've been doing that pretty successfully for a while now.

@crashev
Author

crashev commented Dec 13, 2016

@araines I just checked ls -lR, but it does not seem to work for me: I still get a lot of permission denied errors when trying to delete a folder like vendors/ in my PHP project. I found that's because ls -lR does not go into directories starting with a dot, like .git. In this case it's better to use:

ls -alR > /dev/null
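The listing workaround can be wrapped in a small helper to run on the macOS host. This is only a sketch: the function name `refresh_nfs_share` is made up for illustration, and it merely papers over the stale state rather than fixing nfsd.

```shell
#!/bin/sh
# Hypothetical helper: walk the whole shared tree on the HOST so that
# every entry, including dot-directories such as .git, gets listed.
refresh_nfs_share() {
    share_dir="$1"
    if [ ! -d "$share_dir" ]; then
        echo "refresh_nfs_share: not a directory: $share_dir" >&2
        return 1
    fi
    # -a includes dotfiles, -l stats each entry, -R recurses.
    ls -alR "$share_dir" > /dev/null
}

# Example (path taken from the Vagrantfile in the report):
# refresh_nfs_share "$HOME/SparkSoftware/applications"
```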

@joshuasmickus

I tried quite a few things, and ended up removing vagrant (with instructions from the website) and installing 1.9.1 with virtualbox 5.1 and that worked for me - now my setup works as expected, not sure if that helps anyone.

@kazysgurskas

@joshuasmickus could you please elaborate? As far as I know, vagrant 1.9.1 supports virtualbox up to 5.0, not 5.1

@arendjr

arendjr commented Dec 14, 2016

After spending almost a day trying to debug this thing, it seems I have a "solution". The reason we kept on searching was that I was having this exact issue as described in this bug report, but a colleague of mine, also on Sierra, was having no problems: there had to be something more going on.

So, last thing we tried that seems to have fixed it: On my host machine the directories that were shared were owned by the group "staff" whereas on my colleague's machine the shared directories on the host were owned by group "wheel". So I've changed the group owning the shared directories on my machine to "wheel" and now it works.

I don't claim to fully understand why this matters, because in /etc/exports I see the NFS exports map the group to group ID 20, which is actually "staff" on my machine. But I'm done with this, it works and I'm not touching it any further... I hope someone here can confirm whether it works for them though!

Edit: Bummer, it doesn't appear to be entirely solved. Still getting the problem sometimes, even though somehow the situation does appear to have improved for some amount of time...
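To compare what the export maps to with what actually owns the shared directory, something like the following could be used. As far as I know, `-mapall=501:20` makes the server treat every client request as uid 501 / gid 20; the helper names below are hypothetical, and the sketch only reads values, it changes nothing.

```shell
#!/bin/sh
# Print the uid:gid from the -mapall option of an exports line.
exports_mapall() {
    printf '%s\n' "$1" | sed -n 's/.*-mapall=\([0-9]*:[0-9]*\).*/\1/p'
}

# Print the numeric uid:gid that owns a directory (ls -ldn works with
# both BSD and GNU ls, so this can run on the host or in the guest).
dir_owner_ids() {
    ls -ldn "$1" | awk '{ print $3 ":" $4 }'
}

# Example on the host, using the export line from the report:
# exports_mapall "$(grep -- '-mapall' /etc/exports | head -n 1)"
# dir_owner_ids "$HOME/SparkSoftware/applications"
```

If the two values differ (for instance a directory owned by group 0, "wheel", while the export maps to group 20), that at least documents the mismatch described above, even though the thread never establishes that it is the cause.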

@jfbibeau

@kazgurs - It definitely supports 5.1.x. From the docs page:

The VirtualBox provider is compatible with VirtualBox versions 4.0.x, 4.1.x, 4.2.x, 4.3.x, 5.0.x, and 5.1.x.

@joshuasmickus For what it's worth, I'm still seeing the issue on my setup with VirtualBox 5.1.2, Vagrant 1.9.1.

I think the problem is definitely intermittent, so we have to be careful when seeing the problem go away for a while. Too bad the workaround with the group didn't work out either :/ Maybe doing a chown on the files was having an effect similar to the ls -lR command.

@whizkid79

@arendjr can you please investigate further what's different on your colleague's machine?

Right now I'm running a ls every minute on a cron on my host, which hides the issue completely and I haven't noticed any performance side effects on this. But it is quite a dirty workaround.
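A cron-based variant of the workaround might look like the sketch below. The exact command used here is not given in the thread, so `ls -alR` is borrowed from the earlier comments, and the helper name and the share path are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab line that lists the shared
# folder recursively once a minute on the HOST.
nfs_refresh_cron_line() {
    share_dir="$1"
    printf '* * * * * ls -alR "%s" > /dev/null 2>&1\n' "$share_dir"
}

# To install it (review the output first), something like:
# ( crontab -l 2>/dev/null; nfs_refresh_cron_line "/path/to/share" ) | crontab -
```

Printing the line instead of installing it directly keeps the step reviewable, which seems sensible for what is admittedly a dirty workaround.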

@jfbibeau

jfbibeau commented Dec 14, 2016

Been trying to debug this further into the NFS all morning. I can use rpcdebug on the client side (guest/linux), and I can see that the server seems to return a -13 error code (NFS3ERR_ACCES):

Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/676493), mask=0x81, res=-10
Dec 14 12:14:06 nsp-latest kernel: NFS call  access
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/676493 fh_crc=0x29063737 ct=2 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS reply access: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/676493), mask=0x1, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(/jf-sandbox) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS call  access
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/28952869 fh_crc=0xf6db0f64 ct=1 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS reply access: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/28952869), mask=0x1, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(jf-sandbox/build) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS call  access
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/29802703 fh_crc=0x6720be52 ct=1 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS reply access: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802703), mask=0x1, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(build/classes) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS call  access
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/29802705 fh_crc=0x42ffaad2 ct=1 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS reply access: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802705), mask=0x1, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(classes/main) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS call  access
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/29802706 fh_crc=0x9c5681b8 ct=1 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS reply access: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802706), mask=0x1, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(main/com) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: revalidating (0:39/29802710)
Dec 14 12:14:06 nsp-latest kernel: NFS call  getattr
Dec 14 12:14:06 nsp-latest kernel: NFS reply getattr: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/29802710 fh_crc=0x6ebbe361 ct=1 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS: (0:39/29802710) revalidation complete
Dec 14 12:14:06 nsp-latest kernel: NFS: dentry_delete(main/com, 10808cc)
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/676493), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(/jf-sandbox) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/28952869), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(jf-sandbox/build) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802703), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(build/classes) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802705), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(classes/main) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802706), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: revalidating (0:39/29802710)
Dec 14 12:14:06 nsp-latest kernel: NFS call  getattr
Dec 14 12:14:06 nsp-latest kernel: NFS reply getattr: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/29802710 fh_crc=0x6ebbe361 ct=1 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS: (0:39/29802710) revalidation complete
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(main/com) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS call  access
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/29802710 fh_crc=0x6ebbe361 ct=1 info=0x27e7f)
Dec 14 12:14:06 nsp-latest kernel: NFS reply access: 0
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802710), mask=0x24, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: open dir(main/com)
Dec 14 12:14:06 nsp-latest kernel: NFS: readdir(main/com) starting at cookie 0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_do_filldir() filling ended @ cookie 2; returning = 0
Dec 14 12:14:06 nsp-latest kernel: NFS: readdir(main/com) returns 0
Dec 14 12:14:06 nsp-latest kernel: NFS: readdir(main/com) starting at cookie 2
Dec 14 12:14:06 nsp-latest kernel: NFS: readdir(main/com) returns 0
Dec 14 12:14:06 nsp-latest kernel: NFS: dentry_delete(main/com, 10808cc)
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/676493), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(/jf-sandbox) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/28952869), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(jf-sandbox/build) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802703), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(build/classes) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802705), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(classes/main) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802706), mask=0x81, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_lookup_revalidate(main/com) is valid
Dec 14 12:14:06 nsp-latest kernel: NFS: permission(0:39/29802706), mask=0x3, res=0
Dec 14 12:14:06 nsp-latest kernel: NFS: rmdir(0:39/29802706), com
Dec 14 12:14:06 nsp-latest kernel: NFS call  rmdir com
Dec 14 12:14:06 nsp-latest kernel: NFS: nfs_update_inode(0:39/29802706 fh_crc=0x9c5681b8 ct=1 info=0x7feff)
Dec 14 12:14:06 nsp-latest kernel: NFS reply rmdir: -13
Dec 14 12:14:06 nsp-latest kernel: NFS: dentry_delete(main/com, 10808cc)

Unfortunately, I can't for the life of me figure out how to use the equivalent of rpcdebug on osx. I've tried upping the verbosity of nfsd on osx but it doesn't go anywhere as verbose as on linux using rpcdebug... No idea how I can look into this further on the server side since there's no logging available, but that looks like where the problem would lie...
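For anyone repeating this on the client side, the failing call in a trace like the one above can be isolated with a small filter. This is a sketch: the function name is hypothetical, and it assumes the kernel log uses the same "NFS reply <op>: <status>" format shown in the trace.

```shell
#!/bin/sh
# Hypothetical filter: read kernel log lines on stdin and print only
# the "NFS reply" lines whose status is negative (i.e. an error).
nfs_failed_replies() {
    grep -E 'NFS reply [a-z_]+: -[0-9]+'
}

# Example inside the guest, after enabling client tracing with
#   sudo rpcdebug -m nfs -s all
# (and disabling it again with: sudo rpcdebug -m nfs -c all):
# dmesg | nfs_failed_replies
```

Run against the trace above, this would leave only the `NFS reply rmdir: -13` line, which is the permission error coming back from the server.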

@joshuasmickus

Srana at my work suggested the following:

Right click on the folder which has the permissions error, and then click "Get info". At the bottom, give it "Read & write" access for "Everyone".

This should solve the permissions error

@arendjr

arendjr commented Dec 14, 2016

@joshuasmickus I can confirm I'm having some success by doing that... but I'm holding my breath whether it'll last this time. Fingers crossed!

@whizkid79 I wish I could... My colleague and I are running the exact same virtual machine image now, and we cannot find any further changes on our hosts... Right now I'm sincerely out of ideas of what to try next. I can only hope Joshua's solution will last for now...

@arendjr

arendjr commented Dec 15, 2016

One more update: My colleague has suddenly gotten the problem as well. I'm starting to believe no Sierra users are truly immune here...

And @joshuasmickus' workaround was also only a temporary relief...

@kazysgurskas

@arendjr it was found in the original issue (#6360) that the problem lies within the NFS lookupcache. Changing permissions and listing files are just temporary workarounds, as they refresh the cache. It would be nice if somebody with a better understanding of the NFS mechanism would dig deeper.
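To check from inside the guest whether a share is mounted with the lookup cache disabled, a sketch like the following may help. It assumes a Linux guest where `/proc/mounts` lists the option as `lookupcache=none`; the function name is hypothetical. With Vagrant the option would normally be requested through the synced folder's `mount_options` setting, though that exact configuration is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical check for use inside the guest: does the NFS mount at
# the given mount point carry the lookupcache=none option?
# The mounts file defaults to /proc/mounts; the second argument only
# exists so the function can be tried against a saved copy.
has_lookupcache_none() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mount_point" '
        $2 == mp && $3 ~ /^nfs/ && $4 ~ /(^|,)lookupcache=none(,|$)/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts_file"
}

# Example:
# has_lookupcache_none /srv && echo "lookup cache is disabled for /srv"
```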

@panique

panique commented Dec 15, 2016

Same here. I can reproduce the problem on my own local boxes, but also on totally new ones (loading Vagrantfiles from any project on the web, Vagrantfiles that worked perfectly for years and now fail due to this NFS issue).

@matleh

matleh commented Dec 15, 2016

One workaround is to use unfs3 (unfsd) instead of the nfsd server included with macOS.

brew install unfs3
sudo nfsd stop
sudo unfsd -e /absolute/path/to/some/exports

The exports file has a different syntax than the one for the "normal" nfsd. I use something like:

/Users 192.168.0.0/16(rw,anonuid=501,anongid=20,all_squash)

@rozagh

rozagh commented Dec 16, 2016

@matleh can you also put the vagrantfile command?

@toonvdn

toonvdn commented Dec 20, 2016

Spent a couple of days on this, but I managed to get it working (for now at least):

  • vagrant destroy
  • remove the user data folder (~/.vagrant.d/)
  • remove the vagrant project folder (.vagrant)
  • vagrant up

EDIT: Nope... Temporary fix...

@matleh

matleh commented Dec 20, 2016

@rozagh actually, I don't use Vagrant - this problem also occurs with docker-machine-nfs.
I already moved away from my "solution" again, because it seemed like performance of the unfs3 setup was unbearable.

I am back to stock nfsd and running ls -lR > /dev/null on the host from time to time...

@oleyka

oleyka commented Dec 21, 2016

This thread might be related: https://discussions.apple.com/thread/7760267?start=0&tstart=0

@scottsb

scottsb commented Nov 11, 2017

Reports in #8788 are that that bug is fixed in High Sierra 10.13.2 Beta 2. Can anybody verify if this bug is fixed there as well?

@jfbibeau

I sure hope so. Upgrading to that load this weekend, will verify this one as well.

@jfbibeau

Upgraded to High Sierra 10.13.2 Beta 2, removed my cronjob that would periodically do an ls -laR as a workaround, and will let it soak running a few clean + builds. It usually takes some random amount of time until this bug manifests, so I'm not calling it fixed yet, but will report back if I see it.

@emileber

@jfbibeau Have you had the problem since?

@jfbibeau

@emileber I'm happy to report I haven't seen this in almost 2 weeks of using High Sierra. It's completely fixed!

@uberjay

uberjay commented Nov 21, 2017

I can confirm, too -- I've been using High Sierra (both beta2 and beta3) for a week with zero NFS issues. Woohoo!

@tristanbes

Is this an open beta? I mean, can we upgrade straight from Sierra to High Sierra beta 3 without a developer account?

@chrisfromredfin

I had to sign up for a developer account, but it was free.

@tristanbes

thanks @chrisfromredfin

So just to be clear, with this beta, is it safe to upgrade, or are there still pending issues with High Sierra?

@phoenixgao

phoenixgao commented Nov 27, 2017 via email

@sinisilm

It would seem this issue has come to replace it: #8788

@jfbibeau

@sinisilm That one is also fixed in High Sierra latest beta.

@chrisfromredfin

Correct, since High Sierra Beta 2 I have not experienced #8788 nor this one, #8061.

@emileber

Be careful with High Sierra as there is a major security breach right now: https://www.macrumors.com/2017/11/28/macos-high-sierra-bug-admin-access/

@briancain
Member

Also, I recommend installing the latest High Sierra patch from Apple, which fixes the security vulnerability that @emileber mentioned: https://support.apple.com/en-us/HT208315

@lantz

lantz commented Nov 30, 2017

Just got this from Apple - apparently it is fixed in 10.13.1 GM. (I can't test it atm since I'm deferring 10.13 due to APFS compatibility issues.)

This is a courtesy email regarding Bug ID# 28927426.  

Please verify this issue with the macOS High Sierra 10.13.1 GM and update your bug report at https://bugreport.apple.com/ with your results.

macOS High Sierra 10.13.1 GM (17B48)
https://developer.apple.com/download/
Posted Date: Oct 31st, 2017

If the issue persists, please attach a new sysdiagnose captured in the latest build and attach it to the bug report. Thank you.

@pnoeric

pnoeric commented Mar 24, 2018

Unclear if this is the same bug as what I am seeing, but it seems close. I am on High Sierra 10.13.3, using nfs (with bindfs extension) and Vagrant to develop.

The problem I have is, when I edit a file on the host machine that is in a folder to sync over nfs to the VM, it takes > 7 seconds for the change to go through, and I get an I/O error on the file while I am waiting.

BUT if I change the file, then immediately do an ls on the folder from inside the VM, it updates immediately.

Oddly, if I just do an ls on the file itself from inside the VM, that doesn't work. I have to ls the folder and then all is good.

Any ideas how I can fix this? I see a bunch of workarounds above (like running a cron job to ls the directory every minute or whatever... eek)... but would love to just fix the problem.

@emileber

@pnoeric it looks like a bug, but it's definitely not related to this issue. You should open a new one.

@chrisfromredfin

I have been having somewhat similar issues - but mostly just a seemingly large lag between changing a file from the outside and it being detected on the inside - especially with IDEs using atomic writes. If you open a new issue, please post a reference back here.

@joshlopes

@chrisfromredfin that's another issue well known to mac users - the lag / sync between mounted volumes.

This issue has been fixed on the new MacOS versions, no point in keeping it open. Should be closed.

@chrisroberts
Member

Closing this up as it should be resolved upstream now. Cheers!

@ghost

ghost commented Mar 30, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@hashicorp hashicorp locked and limited conversation to collaborators Mar 30, 2020