Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs.copyFile fails with permission denied when source is read-only #3322

Closed
kakaroto opened this issue Oct 8, 2021 · 8 comments · Fixed by #3354
Closed

fs.copyFile fails with permission denied when source is read-only #3322

kakaroto opened this issue Oct 8, 2021 · 8 comments · Fixed by #3354
Labels

Comments

@kakaroto
Copy link

kakaroto commented Oct 8, 2021

Reposting this as a new issue since I cannot re-open issue #3117 and it appears comments from a closed issue might not be seen, as I have not received a response in 6 weeks : #3117 (comment)_

I see that this fix was released as part of libuv 1.42.0 which was included in node 16.7.0 released last week.
I've tested my previous test case with node 16.7.0 and unfortunately the issue is still there.

Note that I was also able to reproduce a similar problem with a cifs mount (which also affects 12.18.x) and which also still happens in 16.7.0. Hopefully, with cifs (SMB mount), it should make it much easier to test and reproduce the issue on dev environments.

docker run --rm -it -v /cifs-mount/tmp:/mount node:12.19.0-alpine /bin/sh

su - node
echo foo > /tmp/foo
chmod -w /tmp/foo
cat <<EOF > test.js
  const fs = require("fs");
  fs.copyFileSync("/tmp/foo", "/mount/bar");
EOF

rm -f /mount/bar
node test.js
exit
exit

(Note that running the same commands as above but without the chmod -w on the input file does not result in any errors.)

The difference between cifs and cephfs mounts in the behavior is that cephfs doesn't have the issue on node 12.18.4 (libuv 1.38.0), but has the problem on node 12.19.0 (libuv 1.39.0), while cifs has the problem on both versions. The other difference is with cifs, the target file actually gets created (but empty), despite the fact that it throws the EPERM error, while on cephfs, it will not create the target file at all.
Note that I tested/confirmed the cephfs behavior on Linux 4.19.0-13 (Linux 4.19.0-13-cloud-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux) while I tested/confirmed the cifs behavior on Linux 5.0.16-100 (Linux 5.0.16-100.fc28.x86_64 #1 SMP Tue May 14 18:22:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux)

@bnoordhuis
Copy link
Member

I'm more than willing to believe network file systems are still troublemakers. I don't have time to investigate but I can review pull requests.

@bnoordhuis bnoordhuis added the bug label Oct 9, 2021
@kakaroto
Copy link
Author

I'm more than willing to believe network file systems are still troublemakers. I don't have time to investigate but I can review pull requests.

Thanks for answering. I'll have to dig into libuv in that case and see how it can be fixed. I had a quick look and it sounds like it's file_copy_range that's a bit of a nightmare to handle, and seeing as support for it was indeed added between 1.38.0 and 1.39.0, it's likely the culprit here. The question would therefore have to be "how to determine if the error is caused by the syscall misbheaving vs a legitimate error".
If node had a flag to tell it not to use file_copy_range in the fs.copyFile APIs, that would also make it easier, but I don't want to add any changes to the API, as it likely would be rejected.
If you have any ideas, feel free to share then I can poke at it and try things and test before sending a PR.
Thanks

@bnoordhuis
Copy link
Member

It probably needs to be handled the same way libuv deals with buggy cephfs mounts, by detecting and blacklisting at runtime.

You're right that backwards-incompatible API changes won't be accepted into the v1.x branch.

@kakaroto
Copy link
Author

Humm... I'm thinking that EPERM error should basically never be returned. A permission denied error should be returned when trying to open the input file for reading or the output file for writing, but it shouldn't be an error returned by the file_copy_range since we already have valid open file descriptors at that point.
I can't think of a use case where EPERM would be a valid error that could be returned. The documentation for the syscall says however :
EPERM fd_out refers to an immutable file.
I'm assuming it means "the fd_out file descriptor was created by opening a file in read-only mode".

My proposal would be that if we get a EPERM error for the file_copy_range we fallback on using a regular copy API, but without entirely marking the syscall as unusable (like the cephfs_buggy check does), this way it would continue to be used in other situations, but just fall back on a regular copy for those specific files that cause the permission error.

Does that sound right to you? If it looks like I'm on the right track for an implementation, I'll set it up and test then push the PR for review.
I think it's the best solution, but if you think that I should entirely disable the file_copy_range for all future calls (by marking it unusable) if EPERM is returned, or if you think additional checks need to happen rather than just checking for errno == EPERM let me know.
Thanks!

bnoordhuis added a commit to bnoordhuis/libuv that referenced this issue Nov 8, 2021
It's been reported that copy_file_range() on a CIFS mount fails with
an EPERM error, even though the source and target files have the
appropriate permissions.

This is probably a variation on the EPERM error that libuv handles when
it tries to fchmod() a file on a CIFS mount that hasn't been mounted
with the "noperm" option, so let's handle it here too.

This commit applies minor refactoring because the "work around file
system bugs" section got quite unwieldy.

I also snuck in a bug fix where the workaround for buggy CephFS mounts
disabled copy_file_range() permanently.

Fixes: libuv#3322
@bnoordhuis
Copy link
Member

Sorry @kakaroto, I missed your reply. Your approach seems reasonable and we're already handling a similar CIFS EPERM issue elsewhere.

I opened #3354, can you check if that solves the issue for you? Thanks.

bnoordhuis added a commit that referenced this issue Nov 9, 2021
It's been reported that copy_file_range() on a CIFS mount fails with
an EPERM error, even though the source and target files have the
appropriate permissions.

This is probably a variation on the EPERM error that libuv handles when
it tries to fchmod() a file on a CIFS mount that hasn't been mounted
with the "noperm" option, so let's handle it here too.

This commit applies minor refactoring because the "work around file
system bugs" section got quite unwieldy.

I also snuck in a bug fix where the workaround for buggy CephFS mounts
disabled copy_file_range() permanently.

Fixes: #3322
@kakaroto
Copy link
Author

Sorry @kakaroto, I missed your reply. Your approach seems reasonable and we're already handling a similar CIFS EPERM issue elsewhere.

I opened #3354, can you check if that solves the issue for you? Thanks.

I haven't had a chance to test it yet (didn't look into how to build nodejs with a custom libuv), but reviewing the code, it seems to handle CIFS only, but not CephFS, so it wouldn't fix my specific problem, but the patch looks good otherwise and would likely fix the problem I saw on CIFS and if the check for filesystem was done for type cephfs too, then it would likely solve it for me.
Thanks.

@bnoordhuis
Copy link
Member

didn't look into how to build nodejs with a custom libuv

You can normally drop libuv in node's deps/uv, then run ./configure && make -j<numcpus> again.

Can you check if the cephfs bug still exists in newer kernels? I suspect it's fixed in 4.20 but you're running 4.19, right?

JeffroMF pushed a commit to JeffroMF/libuv that referenced this issue May 16, 2022
It's been reported that copy_file_range() on a CIFS mount fails with
an EPERM error, even though the source and target files have the
appropriate permissions.

This is probably a variation on the EPERM error that libuv handles when
it tries to fchmod() a file on a CIFS mount that hasn't been mounted
with the "noperm" option, so let's handle it here too.

This commit applies minor refactoring because the "work around file
system bugs" section got quite unwieldy.

I also snuck in a bug fix where the workaround for buggy CephFS mounts
disabled copy_file_range() permanently.

Fixes: libuv#3322
Varstahl added a commit to Varstahl/libuv that referenced this issue Mar 10, 2023
Trying to copy a read-only file onto a ceph-fuse filesystem fails,
returning an `EACCES` error. This happens when the destination
doesn't exist yet, and a new file is created.

By checking that the error matches, and that the destination file
is empty, we can fix this issue while waiting for a proper Ceph
fix to be upstreamed.

Fixes: libuv#3919
Refs: nodejs/node#37284
Refs: libuv#3117
Refs: libuv#3322
@kakaroto
Copy link
Author

Commenting here for posterity:
After over a year and a half being stuck on 12.18, we finally got around to testing this more thoroughly, since it never got fixed for us, and found that the issue wasn't with the sendfile, but an issue with the ftruncate call instead.
If the source file is read-only, then the source mode gets applied to the destination file when it's opened with O_CREAT, and the permission error is when trying to truncate the file which is where the cephfs bug is, returning EACCESS error since the file mode is read-only, even if the file descriptor is open with write flags.
We've created a new issue for that (#3919) and sent a PR (#3920) with a known-to-work fix.

bnoordhuis pushed a commit that referenced this issue Mar 12, 2023
Trying to copy a read-only file onto a ceph-fuse filesystem fails,
returning an `EACCES` error. This happens when the destination
doesn't exist yet, and a new file is created.

By checking that the error matches, and that the destination file
is empty, we can fix this issue while waiting for a proper Ceph
fix to be upstreamed.

Fixes: #3919
Refs: nodejs/node#37284
Refs: #3117
Refs: #3322
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants