Permission denied when creating files w/ O_EXCL over NFS #343

Closed
dreamnid opened this issue Dec 8, 2016 · 54 comments

Comments

@dreamnid

dreamnid commented Dec 8, 2016

I'm getting a permission error on creating files with the O_EXCL flag on NFS mounts.

It works fine if I...

  • Create the file directly on the file server in the mergerfs pool
  • Use a NFS mount that is not sourced by mergerfs (to disprove some google suggestions that NFS itself may not work with O_EXCL. It had worked with flexraid storage pool fine so fairly sure it's not NFS)
  • Squash all the nfs access to be under root

I believe the above suggest that there is something in mergerfs that is causing this issue.

A simple example is when using vim as it tries to create the temporary swap files with the O_EXCL flag, and vim will complain that the swap file cannot be opened.


Ubuntu 16.04
NFSv4
mergerfs 2.17

NFS Server

Hostname: fileserver

/etc/fstab

/mnt/hd0:/mnt/hd1 /mnt/dump  fuse.mergerfs  defaults,allow_other,minfreespace=10G  0       0

/etc/exports

# Broken
/mnt/dump     -rw,sync,no_subtree_check,fsid=1 10.X.X.X/24
# Working - but all files created are under root
/mnt/dump      -rw,sync,no_subtree_check,anonuid=0,anongid=0,all_squash,fsid=1 10.X.X.X/24

ps aux - Shows mergerfs is running under root

root       573  0.6  0.5 138432 21188 ?        S<sl 08:27   5:36 mergerfs /mnt/hd0:/mnt/hd1 /mnt/dump -o rw,allow_other,minfreespace=10G,dev,suid

dpkg -s mergerfs

Package: mergerfs
Status: install ok installed
Priority: optional
Section: utils
Installed-Size: 193
Maintainer: Antonio SJ Musumeci <trapexit@spawn.link>
Architecture: i386
Version: 2.17.0~ubuntu-xenial

NFS Client

/etc/fstab

fileserver:/mnt/dump     /mnt/dump     nfs     defaults,_netdev 0 0

Output of mount

fileserver:/mnt/dump on /mnt/dump type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.X.X.2,local_lock=none,addr=10.X.X.1,_netdev)

Snippet of strace

open(".test.py.swo", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE|O_NOFOLLOW, 0600) = -1 EACCES (Permission denied)

Steps to repro

Assuming the NFS mount is set up correctly with the /etc/exports entry labeled as broken

  1. cd /mnt/dump
  2. vim myfile
    Note error message that pops up concerning error opening the swap file

Alternative example

  1. cd /mnt/dump
    Create a python file named test.py with the following contents:
import os

filename = 'mydummyfile'
flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
#flags = os.O_RDWR | os.O_CREAT
fd = os.open(filename, flags, 0600)

Execute: python test.py
Note the error:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    fd = os.open(filename, flags, 0600)
OSError: [Errno 13] Permission denied: 'myfile'

Note the file mode on myfile is 000
Before running again, delete mydummyfile
Note that removing the os.O_EXCL flag works fine
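
For anyone retrying this on Python 3 (where the `0600` literal above is a syntax error), here is a minimal sketch of the same check. The helper name `try_exclusive_create` is hypothetical and not part of the original report; it only wraps `os.open` and reports the errno name plus the permission bits of whatever ended up on disk.

```python
import errno
import os
import stat


def try_exclusive_create(path, mode=0o600):
    """Attempt open(O_RDWR|O_CREAT|O_EXCL) and report what happened.

    Returns (errno_name, file_mode). errno_name is None on success.
    file_mode is the permission bits of whatever exists at path
    afterwards, or None if nothing is there.
    """
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    err = None
    try:
        fd = os.open(path, flags, mode)
        os.close(fd)
    except OSError as exc:
        # Translate the numeric errno (e.g. 13) into its name (e.g. EACCES).
        err = errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        file_mode = stat.S_IMODE(os.lstat(path).st_mode)
    except FileNotFoundError:
        file_mode = None
    return err, file_mode
```

Calling `try_exclusive_create("mydummyfile")` inside the NFS mount and printing the result should make it easy to compare the broken and working exports. Delete the file between runs, since a leftover file makes O_EXCL fail with EEXIST.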

@trapexit
Owner

trapexit commented Dec 8, 2016

mergerfs just passes the flags along. there is nothing fancy going on. it's probably some root squash conflict with mergerfs running as root.

Try stracing mergerfs and see precisely what fails.

@guysoft

guysoft commented Jan 20, 2018

Hey,
Getting the same problem, Ubuntu 15.10. Is there a way to debug this? How should I strace?

@trapexit
Owner

$ sudo strace -o /tmp/mergerfs.trace -f -e open -p $(pgrep mergerfs)

And while that's running perform the open through mergerfs.

@trapexit trapexit reopened this Jan 20, 2018
@guysoft

guysoft commented Jan 20, 2018

Ok, rebooting all the NFS clients seems to solve it. Might also be case for OP

@guysoft

guysoft commented Aug 22, 2018

Hey,
Upgraded the servers, getting this bug again, now on ubuntu 18.04. I think the reboot was some fluke. Now I also managed to reproduce it.

The trace from the command that @trapexit suggested stays empty during the file creation.

Also now I have a non-production environment so I can test stuff.

mergerfs version: 2.24.2
FUSE library version: 2.9.8-mergerfs
fusermount version: 2.9.7
using FUSE kernel interface version 7.19

@guysoft

guysoft commented Aug 22, 2018

Ok,
I got straces of both mergerfs and the git command.

Strace command:

sudo strace -f  -o /tmp/mergerfs.strace mergerfs -o defaults,allow_other,use_ino,nonempty /srv/storage1:/srv/storage2 /srv/git_new

Here is mergerfs strace:
mergerfs.strace.txt

git command strace:
git.strace.txt

Can we please reopen this?

@trapexit
Owner

If open wasn't showing up, that likely means the OS is denying the request before it gets to mergerfs.

Can you trace with timestamps? It's hard to see what happened when otherwise. Also, it's preferable to create the simplest possible example of this.

@trapexit trapexit reopened this Aug 22, 2018
@guysoft

guysoft commented Aug 22, 2018

Ok, found something that might be causing it.
The permissions on disk of the file that is getting permission denied,
/srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_rF2TXH
(checked directly, without NFS or mergerfs, so it's a file creation thing), are:

-r--r--r-- 1 www-data www-data 0 Sep 24  1970 /srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_rF2TXH

So it's read-only! And mergerfs is trying to open it using O_RDWR|__O_LARGEFILE.
And git on the NFS client side is using: O_RDWR|O_CREAT|O_EXCL, 0444

@trapexit
Owner

I saw that. I'm looking into it. mergerfs does a lot of random stuff under the covers.

@guysoft

guysoft commented Aug 22, 2018

Any way I can help?
Saw this: https://github.com/trapexit/mergerfs/blob/master/src/fs_clonefile.cpp#L40

Also, to clarify: I wrote __O_LARGEFILE and not O_LARGEFILE because I was using this source to test stuff, and gcc yelled at me to use __O_LARGEFILE and #define _GNU_SOURCE (it returns error 13, permission denied).

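#define _GNU_SOURCE  /* must come before the includes to take effect */
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>  /* close() */

int main(void) {
                errno = 0;  /* so a stale value is not printed on success */
                int fd=open("/srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_rF2TXH",O_RDWR|__O_LARGEFILE);
                printf("errno=%d\n",errno);
                if (fd >= 0)
                                close(fd);
                return 0;
}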

@trapexit
Owner

Can you get traces with a narrow example (like in the original post)? With mergerfs in single threaded mode? The mergerfs trace doesn't make sense to me.

@guysoft

guysoft commented Aug 22, 2018 via email

@trapexit
Owner

-s

@trapexit
Owner

Are you accessing mergerfs via SMB or NFS or some other network filesystem?

@guysoft

guysoft commented Aug 22, 2018

Yes, nfs, like the issue suggests.
And it seems like the file is created with the wrong permissions.
It works on nfs, it works on merger fs. It does not work with both.

@trapexit
Owner

Right... long day. I was thinking it was the other way. The problem is that NFS (and other network filesystems) break calls up. Tracing git isn't all that helpful since it's not literally what happens on the other end. Best to run mergerfs in debug mode (-d) and provide me that to see what the kernel is asking of mergerfs.

If you look at the mergerfs trace you can see the strange behavior. The open translates to an open, close, utime, stat?, then another open. It's a bit strange in that the file is being created read only but with rdwr flags. If NFS is doing similar with a typical filesystem it should fail as well. If this is a problem with NFS -> FUSE I'm not sure I can fix this easily.

@trapexit
Owner

Wrong permissions?

git is doing 0444 and RDWR.

openat(AT_FDCWD, ".git/objects/e6/tmp_obj_rF2TXH", O_RDWR|O_CREAT|O_EXCL, 0444) = -1 EACCES (Permission denied)

In mergerfs we see:

openat(AT_FDCWD, "/srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_rF2TXH", O_WRONLY|O_CREAT|O_EXCL, 0100444) = 29
utimensat(AT_FDCWD, "/srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_rF2TXH", [{tv_sec=7548, tv_nsec=0} /* 1970-01-01T02:05:48+0000 */, {tv_sec=23066290, tv_nsec=0} /* 1970-09-24T23:18:10+0000 */], AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, "/srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_rF2TXH", O_RDWR|O_LARGEFILE) = -1 EACCES (Permission denied)

You can see how it's similar but not the same, due to the 0444 (read-only) mode set initially. If it were 0666 I suspect it'd work.

@guysoft

guysoft commented Aug 23, 2018

OK, I made a test. I am not saying it's a workaround, and you'd be crazy to use something like this in production, but it confirms what you are saying in #343 (comment). If I check out the git source for version 2.14.1, use sed to replace all references to 0444 with 0666, and compile it, I get a git binary that manages to run the git add command that failed earlier.

What is strange is that I still got on the mergerfs side:

16988 read(6, "0\0\0\0\16\0\0\0\337#\4\0\0\0\0\0+\246\3\0\0\0\0\0!\0\0\0!\0\0\0"..., 48) = 48
16988 lstat("/srv/storage1/pro/20/website2-2/.git/objects/e6/tmp_obj_Vp2iod", 0x7f2b5bff6120) = -1 ENOENT (No such file or directory)
16988 lstat("/srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_Vp2iod", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
16988 openat(AT_FDCWD, "/srv/storage2/pro/20/website2-2/.git/objects/e6/tmp_obj_Vp2iod", O_RDWR|O_LARGEFILE) = 13

For reference, code 13 is 'permission denied'. Also, it's saying it's using 0644 and not 0666.

Here is mergerfs strace for this scenario (it will not upload):
https://pastebin.com/eBC388kt

git command strace for this scenario:
https://pastebin.com/wSV4Kw1Z

Moving to try your single mode idea.

@guysoft

guysoft commented Aug 23, 2018

Single thread mode reproduction

Here is mergerfs:
mergerfs_single_mode.strace.txt
Here is git:
git_strace_single_git.txt

@trapexit
Owner

The return value of 13 is not an errno. It's the file descriptor. If it were an error it'd return -1 and errno would be set to EACCES (13).
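
To make the distinction concrete, here is a small sketch of how a successful open and a failed one differ. The helper `open_result` is a hypothetical name used only for illustration. Python raises OSError instead of returning -1, so the sketch maps that back to the (return value, errno) pair that strace prints.

```python
import errno
import os


def open_result(path, flags, mode=0o666):
    """Mimic what strace shows for open(2).

    Returns (fd, None) on success, where fd is just a small integer that
    could happen to be 13, and (-1, errno_name) on failure.
    The caller is responsible for closing a returned descriptor.
    """
    try:
        fd = os.open(path, flags, mode)
    except OSError as exc:
        return -1, errno.errorcode[exc.errno]
    return fd, None
```

So a line ending in `= 13` is a success that happened to get descriptor 13, while a permission failure would end in `= -1 EACCES (Permission denied)`.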

@trapexit
Owner

I'm not sure how this works normally except that NFS -> FUSE has a non-standard behavior. I'll need to do more research and ask the fuse community.

Ultimately, O_CREAT|O_EXCL is generally not recommended on NFS. Not for this reason but still a problem.

I'm not sure there's a good way for mergerfs to address this. It's doing what's asked of it and NFS is giving it commands that rightfully error. Can you give me the output of mergerfs when using debug mode -d? You'll need to redirect it to a file as it prints to stderr I believe.

@guysoft

guysoft commented Aug 23, 2018

Ok, As requested in #343 (comment) here is a narrow reproduction code.

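#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>  /* close() */

int main(void) {
        errno = 0;  /* so a stale value is not printed on success */
        int fd=open("tesfile",O_RDWR|O_CREAT|O_EXCL, 0444);
        printf("errno=%d\n",errno);
        if (fd >= 0)
                close(fd);
        return 0;
}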

Running it on disk without mergerfs I get:
errno=0
running it with nfs and no mergerfs I get:
errno=0
Running it on mergerfs on disk I get
errno=0
Running it over nfs to mergerfs I get
errno=13

Strace for running it over nfs to mergerfs where we get permission denied:
mergerfs:
test_strace_mergerfs.txt
From the test.c side:
test_strace_tetst_command.txt

For comparison, here is the successful execution of test.c without nfs in between, that succeeds:

mergerfs side:
merger_fs_no_nfs.txt

test.c side:
test_no_nfs.txt

@trapexit
Owner

I found a reference to this from several years back. Looks like NFS is issuing a mknod, and the file is then opened. Outside of rewriting the mode, which could be narrowed to mknod when it's a regular file and the mode is 0444 (but that's not ideal), I'm not sure how to address this. I'll ping the libfuse mailing list.

@trapexit
Owner

Yeah, it's clearly some nfs -> fuse issue. NFS must be acting differently between a typical FS and FUSE because that set of requests would fail normally.

@trapexit
Owner

Can you strace mergerfs with nfs but without O_EXCL?

@guysoft

guysoft commented Aug 23, 2018

Sure
For this report I am using:

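#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>  /* close() */

int main(void) {
        errno = 0;  /* so a stale value is not printed on success */
        int fd=open("tesfile",O_RDWR|O_CREAT, 0444);
        printf("errno=%d\n",errno);
        if (fd >= 0)
                close(fd);
        return 0;
}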

Still getting permission denied 13:

mergerfs side:
merger_fs_no_excel.txt

test without O_EXCL side:
test_test_no_execl.txt

I get errno=0 on mergerfs directly. And over nfs I get errno=13

Will note that a second run will give errno=13 also, in both cases.

@trapexit
Owner

Thanks. Looks like EXCL isn't part of the problem. It's the 0444 and separation of the command.

@guysoft

guysoft commented Aug 23, 2018

Indeed, also just for a test I ran this code where I kept O_EXCL:

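#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>  /* close() */

int main(void) {
                errno = 0;  /* so a stale value is not printed on success */
                int fd=open("tesfile",O_RDWR|O_CREAT|O_EXCL, 0666);
                printf("errno=%d\n",errno);
                if (fd >= 0)
                                close(fd);
                return 0;
}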

It gives errno=0. I am not sure if it means OP works. For good measure I also tried:
O_RDWR|O_CREAT|O_EXCL|__O_LARGEFILE|O_NOFOLLOW, 0600
As mentioned in the first post here, and also got errno=0.
Didn't mean for this to turn into a hijack, but it does seem like it's a good idea to keep posting here.

@guysoft

guysoft commented Aug 23, 2018

Hey, so I tried switching to unionfs-fuse, and it works there - does it help sharing stack traces from there?

@trapexit
Owner

Sure. Looking at its code, it seems it might work because it's not doing the proper thing with permissions.

@trapexit
Owner

I don't think this will be possible with NFS over FUSE. It's valid under POSIX for your open(2) to succeed with O_RDWR but perms set to 0444, but only if the file is actually created (which you are ensuring with O_EXCL). The NFS protocol accomplishes this task by sending a compound request (mknod + open) which works fine inside the kernel but loses its atomicity when the FUSE kernel module sends the requests separately to the userspace FUSE server. I think this would be tricky to fix. -Michael Theall
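
A rough local illustration of that loss of atomicity, with no NFS or FUSE involved: the sketch below compares a single open(2) that creates a 0444 file read-write with a two-step sequence modeled on the mergerfs strace earlier in the thread (create, close, reopen O_RDWR). The function names are hypothetical, and it only approximates what the kernel sends to a FUSE server.

```python
import errno
import os


def atomic_create(path):
    """One open(2): create a 0444 file and get a read-write descriptor.

    The mode applies to later opens, not to the creating call itself,
    so this succeeds on an ordinary local filesystem.
    Returns None on success or the errno name on failure.
    """
    try:
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o444)
    except OSError as exc:
        return errno.errorcode[exc.errno]
    os.close(fd)
    return None


def split_create(path):
    """Two steps: create the 0444 file, then reopen it read-write.

    The second open is checked against the file's 0444 mode, so an
    unprivileged caller gets EACCES. Root normally bypasses the check.
    Returns None on success or the errno name on failure.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o444)
    os.close(fd)
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        return errno.errorcode[exc.errno]
    os.close(fd)
    return None
```

Run as a regular user in a scratch directory, `atomic_create` should return None while `split_create` returns "EACCES", which matches the pattern seen on the mergerfs side.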

@guysoft

guysoft commented Aug 23, 2018

Hey, so I tried switching to unionfs-fuse, and it works there - does it help sharing stack traces from there?

What is it not doing?

@trapexit
Owner

trapexit commented Aug 23, 2018

https://github.com/rpodgorny/unionfs-fuse/blob/master/src/fuse_ops.c#L328

edit: It sets the mode differently rather than straight up. It adds to the mode #define S_PROT_MASK (S_ISUID| S_ISGID | S_ISVTX | S_IRWXU | S_IRWXG | S_IRWXO)

@guysoft

guysoft commented Aug 23, 2018

edit: It sets the mode differently rather than straight up. It adds to the mode #define S_PROT_MASK (S_ISUID| S_ISGID | S_ISVTX | S_IRWXU | S_IRWXG | S_IRWXO)

So is that a workaround they do? I don't understand what all those flags do.

@trapexit
Owner

Actually I'm wrong. I misread. It's limiting the flags which shouldn't impact it. Can you strace unionfs? It'll be easier to track down.
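
For readers wondering what those flags do, here is a small sketch of the S_PROT_MASK define quoted above using Python's stat constants. It assumes the mask is applied with a bitwise AND, which is what "limiting" suggests; how unionfs-fuse actually uses it is not shown in this thread. The helper name `protection_bits` is hypothetical.

```python
import stat

# Python equivalent of the S_PROT_MASK define: setuid, setgid, sticky,
# plus the user/group/other rwx bits. Together they cover octal 07777.
S_PROT_MASK = (stat.S_ISUID | stat.S_ISGID | stat.S_ISVTX
               | stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO)


def protection_bits(mode):
    """Keep only the permission-related bits, dropping the file type."""
    return mode & S_PROT_MASK
```

Applied to the 0100444 mode seen in the mergerfs strace (a regular file with 0444), this yields 0444: the file-type bits are dropped and the read-only permissions are unchanged, which is why it should not affect this problem.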

@guysoft

guysoft commented Aug 23, 2018

Yes, but tomorrow, because it's late now.

@guysoft

guysoft commented Aug 27, 2018

Hey,
Sorry for the delay, here is the strace of unionfs-fuse.

The command I ran was

strace -o /tmp/untion-fuse.strace unionfs-fuse -s -f -o dirs=/srv/storage1=RW:/srv/storage2=RW,use_ino,nonempty,allow_other,async_read  /srv/git_new

Then I executed the test.c binary.

unionfs strace:
unionfs_fuse_strace.txt

test binary side:
test.strace.txt

Will note a file is created, and it indeed is read-only.
Also the git add command also works.

@trapexit
Owner

trapexit commented Aug 27, 2018

Is unionfs running as root? I'm not seeing any setuid/setgid in the trace. If so then it's cheating. Might want to test with a regular user. This was a problem with mhddfs. It ran as root and didn't properly change credentials, meaning it had access to things it shouldn't and hid problems with users' permissions. When people moved to mergerfs those problems appeared because mergerfs properly sets uid and gid before performing actions.

Opening a 0444 file O_RDWR as a regular user will fail with EACCES but root is allowed.

@guysoft

guysoft commented Aug 27, 2018 via email

@trapexit
Owner

Try unionfs-fuse as non-root. Yes, mergerfs is being run as root but it doesn't stay root when performing these behaviors.

I'll look at the unionfs-fuse code but I suspect it works because it's breaking the rules. It's the only way it could be opening a read-only file in RW mode.

A 'root' mode could be dangerous and lead to odd behaviors. I suppose it could be made to attempt all opens as root but that'd be a pretty invasive and ugly change and breaks entitlements.

@trapexit
Owner

Might want to try default_permissions with unionfs

@guysoft

guysoft commented Aug 27, 2018

unionfs-fuse also works with default_permissions

Command:

strace -o /tmp/untion-fuse.strace unionfs-fuse -s -f -o dirs=/srv/storage1=RW:/srv/storage2=RW,use_ino,nonempty,allow_other,async_read,default_permissions  /srv/git_new

strace:

strace_default_ermissions.strace.txt
(lazy not posting the test.c this time)

@trapexit
Owner

Usually default_permissions preempts certain requests and I would have thought it'd catch this one but apparently not.

Regardless I was correct. unionfs-fuse is running as root and does not do proper switching of credentials. The request shouldn't work. mergerfs is technically behaving correctly. I'll need to think more about how this could be addressed without breaking other things or breaking security.

@guysoft

guysoft commented Aug 27, 2018

Anything I can do beyond this point? Technically NFS is managing my credentials, so unionfs-fuse seems like the only workaround.

@trapexit
Owner

As I mentioned, I need to think about it more. Need to consider if there are straightforward ways to distinguish when it's okay to open as root or not. These things are breaking standards. They shouldn't be working.

@trapexit
Owner

@dreamnid @guysoft

Are these concerns for anything besides git?

@guysoft

guysoft commented Jul 16, 2020

This was two years ago. But if I remember, it's relevant any time you create a file that is read-only and then write to it.
It's because the atomic operation of creating a read-only file and writing to it is split into

  1. creating the file
  2. then writing.

That is not the case otherwise. However, step 2 fails because the file is read-only.
git creates such files for its blobs; they don't change because they are named after their hashes.
I even got a workaround by recompiling git to create read-write files.

tl;dr - no, it's anything that creates read-only files with the permissions I think were mentioned up here

Reproduction code: #343 (comment)
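
For applications that control their own file creation, one possible way around the split is to create the file with a writable mode and drop the write bits only after the data is written. The sketch below is an assumption-laden illustration: it is not what git or mergerfs does, `write_then_make_readonly` is a hypothetical name, and it has not been tested on NFS over mergerfs.

```python
import os


def write_then_make_readonly(path, data, final_mode=0o444):
    """Create path exclusively, write data, then make the file read-only.

    Creating with 0600 means a separate reopen for writing would still
    pass the permission check. The read-only mode is applied last.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    # fdopen takes ownership of fd and closes it when the block exits.
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fchmod(handle.fileno(), final_mode)
```

The end state is the same read-only file, but there is a short window in which the file is writable by its owner, so this is not a drop-in replacement where the 0444 creation mode is deliberate.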

@trapexit
Owner

I know it affects anything performing that behavior. I'm asking whether the concern was solely related to git or also to other apps, that is, whether the workaround should be generic or specific. Given the issue is really with the app assuming POSIX when not using a POSIX setup, and the hack is ugly and can only change the chmod's mode value... I prefer as narrow a hack as possible.

@guysoft

guysoft commented Jul 16, 2020

I am not really using the software anymore, so I am not sure what to answer about that. Do as you see fit in your project; you can list it as a known limitation when using mergerfs over NFS.
It's a behavior that only happens with both.

@PhilipOakley

Just closing the loop on a question on the Git-users and Git devs mailing list:

The Git use of the particular Posix file permissions is deliberate. The dev discussion is at https://lore.kernel.org/git/CAPx1GvfKxu8gwbp0Gn2dBf9th874skKjD-echeAFr7_77o8FYw@mail.gmail.com/T/#mead6be6c92f0ab29cf9fd600781dea7315e87411.

The git-user thread is at https://groups.google.com/d/msgid/git-users/54b2a1de-73e8-4722-8a1d-ccec027ae625o%40googlegroups.com

@trapexit
Owner

Historically O_EXCL was not supported on NFS and wasn't supposed to be used, as is still mentioned in the open manpage. While the problem with FUSE & NFS is not the same, it is still a kernel-level implementation detail and affects any FUSE filesystem (and perhaps others).

That said: Yeah. I can add it. The reason I've not is that such a situation (not using O_CREAT|O_EXCL with NFS) is a generally accepted reality for those looking to write compatible software.

@trapexit
Owner

Just closing the loop on a question on the Git-users and Git devs mailing list:

The Git use of the particular Posix file permissions is deliberate. The dev discussion is at https://lore.kernel.org/git/CAPx1GvfKxu8gwbp0Gn2dBf9th874skKjD-echeAFr7_77o8FYw@mail.gmail.com/T/#mead6be6c92f0ab29cf9fd600781dea7315e87411.

The git-user thread is at https://groups.google.com/d/msgid/git-users/54b2a1de-73e8-4722-8a1d-ccec027ae625o%40googlegroups.com

Thanks for that. And yeah... that's what I expected them to say.

@guysoft

guysoft commented Jul 16, 2020

I had a talk with the git devs at the time. I was considering a PR to git, and there was something they would accept, but I don't remember what it was.

@trapexit
Owner

#786
