[BUG] File system operations fail with permission errors (chown, rm, others?) #473

Open
jSML4ThWwBID69YC opened this issue May 28, 2022 · 11 comments

@jSML4ThWwBID69YC

Have you read through available documentation, open Github issues and Github Q&A Discussions?

Yes

System information

Your moosefs version and its origin (moosefs.com, packaged by distro, built from source, ...).

FreeBSD ports version 3.0.116.

Operating system (distribution) and kernel version.

FreeBSD 13.1 on the client.
A mix of FreeBSD 13.0 and 13.1 on the master and chunkservers.

Hardware / network configuration, and underlying file systems on master, chunk servers, and clients.

1x Master
4x chunk servers.
1x client for testing
The underlying file system on all servers is ZFS. All servers have plenty of free memory and space.

2x LACP 40Gib connections between all servers.
Mount: mfsmount IP:/test -o mfspassword=***** -o mfscachemode=DIRECT /root/test

How much data is tracked by moosefs master (order of magnitude)?

  • All fs objects: 9078833
  • Total space: 57TiB
  • Free space: 50TiB
  • RAM used: 2.9GiB
  • last metadata save duration: 2.4s

Describe the problem you observed.

Basic file system operations fail with permission errors when they should not. Sometimes you receive an error on standard output, sometimes it fails silently. In all cases, there is a corresponding EACCES error in the .oplog.

Can you reproduce it? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

It's easy to reproduce. Create a folder with a bunch of files in it, then try to chown or rm the files. I can't replicate the issue with a single file; it seems to require multiple files at once to trigger.
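A minimal sketch for generating such a test folder, assuming a POSIX sh. The function name make_test_tree and the file naming are made up for illustration; the default of 1000 files matches the directory size mentioned in the steps below.

```shell
# Create DIR containing COUNT empty files (COUNT defaults to 1000).
# Returns non-zero as soon as a directory or file cannot be created.
make_test_tree() {
    dir=$1
    count=${2:-1000}
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        : > "$dir/file_$i" || return 1
        i=$((i + 1))
    done
}

# Example (hypothetical path on the MooseFS mount):
#   make_test_tree /root/test/bulk 1000
#   chown -R 5000:www /root/test/bulk
```

Running it against a local directory first gives a baseline to compare with the behaviour on the MooseFS mount.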

1. Use FreeBSD 13.1 as a client.
2. Install moosefs3-client and mount an export.
3. Create a folder with multiple files on the mount, then try typical file system operations in bulk. I've tested chown and 'rm -rf' over directories with a thousand files in them. The issue triggers most of the time. Here are some examples.

Example 1:
chown -R 5000:www /mfsmount/
chown: fts_read: Permission denied

Example 2:
rm -rf /mfsmount/

The rm command simply stops. The following was in the oplog at the time:
05.27 21:54:12.741066: uid:0 gid:0 pid:10650 cmd:lookup (12860998,..): OK (1.0,12852070,1.0,[drwxr-s---:0042750,4,0,0,1653688450,1653670283,1653670423,2006637])
05.27 21:54:12.741079: uid:0 gid:0 pid:10650 cmd:access (12852070,0x4): OK (forced - kernel bug workaround)
05.27 21:54:12.741207: uid:0 gid:0 pid:10650 cmd:opendir (12852070): OK [handle:00000033]
05.27 21:54:12.741220: uid:0 gid:0 pid:10650 cmd:getattr (12852070) [no handle] (using open dir cache): OK (1.0,[drwxr-s---:0042750,4,0,0,1653688418,1653670283,1653670423,2007599])
05.27 21:54:12.741229: uid:0 gid:0 pid:10650 cmd:access (12852070,0x1): EACCES (Permission denied)
05.27 21:54:12.741240: uid:0 gid:0 pid:10650 cmd:releasedir (12852070) [handle:00000000]: OK
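To pick the denied operations out of a longer capture, something like the following may help. It is only a sketch that assumes the line layout shown above (a `cmd:NAME` field and a result of `EACCES` after the argument list). The function names are hypothetical and not part of MooseFS. In the excerpt above, the one denied call is `access` with mask 0x1, which is the X_OK (search) bit on a directory.

```shell
# Print only the operations that ended in EACCES, reading oplog text on stdin.
oplog_denied() {
    grep -F ': EACCES '
}

# Count denied operations per command name (access, lookup, ...).
oplog_denied_by_cmd() {
    oplog_denied | sed -n 's/.* cmd:\([a-z_]*\) .*/\1/p' | sort | uniq -c
}

# Example, assuming the oplog text was saved to a file first
# (oplog-capture.txt is a placeholder name):
#   oplog_denied_by_cmd < oplog-capture.txt
```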

Both chown and rm trigger the issue. The cp and mv commands seem to work. I've not tested any other commands.

I am not able to trigger this on FreeBSD 13.0 on the same physical network with the same MooseFS setup. It seems to be new in version 13.1.

@chogata
Member

chogata commented May 31, 2022

Thank you for the report, I will try to test it ASAP and see what is happening exactly...

@jSML4ThWwBID69YC
Author

jSML4ThWwBID69YC commented May 31, 2022

Here are replication steps that consistently trigger the issue for me.

On FreeBSD 13.1

mfsmount IP:/test -o mfspassword=*** -o mfsmkdircopysgid=1 -o mfscachemode=DIRECT /test
cd /test
fetch https://www.drupal.org/download-latest/zip
unzip zip
rm -rf drupal-9.3.14

Note that you can't successfully rm the drupal-9.3.14 folder. The command appears to complete, but the files are still there.

Follow the exact same steps on FreeBSD 13.0 and it works as expected.

In both cases, MooseFS is built against fusefs-libs3-3.11.0.
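Because rm can appear to complete while files are left behind, a small wrapper that checks the result makes the two releases easier to compare. This is a sketch only; rm_and_verify is a made-up helper name.

```shell
# Remove a tree, then confirm it is really gone.
# Returns 0 only if the path no longer exists afterwards.
rm_and_verify() {
    target=$1
    rm -rf -- "$target"
    rm_status=$?
    if [ -e "$target" ]; then
        # Count what survived, including the top-level directory itself.
        left=$(find "$target" | wc -l | tr -d ' ')
        echo "rm exit status $rm_status, but $left entries remain under $target" >&2
        return 1
    fi
    return 0
}

# Example, run from /test after unzipping as in the steps above:
#   rm_and_verify drupal-9.3.14
```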

@chogata
Member

chogata commented Jun 9, 2022

Just dropping a note here that we are looking into it. So far we have found that FreeBSD 13.1 performs an operation that 13.0 never did, and this operation behaves incorrectly. It is still to be determined whether this is MooseFS's fault or FreeBSD's.

@jSML4ThWwBID69YC
Author

Thank you for the update. Hopefully it's an easy one to fix.

@jSML4ThWwBID69YC
Author

@chogata

Any luck?

@chogata
Member

chogata commented Jun 24, 2022

Yes, we have found the issue. We'll try to commit the fix soon.

@jSML4ThWwBID69YC
Author

Hi @chogata

Any luck with a patch? I have FreeBSD 13.0 and 13.1 servers to work with if you need a second tester.

@alpharde

I'm also getting hit by this bug. Any ETA for a fix?

@jSML4ThWwBID69YC
Author

@alpharde

alpharde commented Dec 5, 2022

The patch failed to apply against the v3.0.116 branch. Can I simply compile mfsmaster from the master branch and use it instead?

@pkonopelko
Member

Yes, you can build MooseFS from the master branch and use it.
We recommend using the ./freebsd_build.sh script to set the correct paths etc. and build, then running make install.
