copy() fails on cifs mounts because of incorrect length (cfr_max) specified in streams.c:1584 copy_file_range() #10548
Comments
Your methodology to fix this issue sounds correct. You can probably use
I would love to write a patch, but I am not sure I can do it properly. The main problem here is that PHP passes a length argument to copy_file_range() that is larger than the source file size. So as long as PHP reduces the length to the source size, it should work, and I guess combining that with looping over copy_file_range() until every byte is copied, as proposed in #10440, would be the correct solution here. Can anyone help make this happen?
I can put it on my todo list for this week, unless someone beats me to it
Fantastic!
…gth (cfr_max) specified in streams.c:1584 copy_file_range() On some filesystems, the copy operation fails if we specify a size larger than the file size. We use a stat call to clamp the size to copy to the actual filesize. This stat call shouldn't impact performance notably because stat calls can be cached. In some cases (like for /proc files), the returned size is 0, so we should avoid problems by not using copy_file_range in those cases (copy_file_range wouldn't work anyway on this particular example because the syscall is not supported for /proc).
…gth (cfr_max) specified in streams.c:1584 copy_file_range() On some filesystems, the copy operation fails if we specify a size larger than the file size in certain circumstances and configurations. In those cases EIO will be returned as errno and we will therefore fall back to other methods.
Thanks again! Does this mean that PHP will always try copy_file_range() on cifs with length cfr_max, fail with EIO and then fall back to "classic copying"? I like that it will work, which is the most important improvement for me (so WordPress can update itself in an Azure Web App). But will this also mean that copies will never be made server side, resulting in PHP reading the source file over the network and writing it back over the network? If so, that sounds unfortunate for those who want to copy larger files over cifs, or those who have limited network bandwidth for such operations. I am no expert, but is it that bad to use a stat call, even if it is cached, since it will fall back to classic copying if it fails anyway? I mean, there is also already code for "there may be more data; continue copying using the fallback code below".
Yes. Actually, I did find that it depends on the server. In my testing I found that I could specify a length larger than the file length and it worked properly without EIO.
Depends on the configuration.
Hmm, that could work even in the case of getting a too small size or having races, because we now fall back in case of EIO.
Yes, this seems sensible
… a too large length Some filesystems will cause failures if the max length is greater than the file length. We therefore use a stat call to get the actual file size. Since stat calls are cached this shouldn't have much impact on performance. This addresses a performance concern for CIFS filesystems and is described in phpGH-10548. In case the cached result is less than the actual size, the code will just copy more in the next iteration. In case it is too large, it will at worst fall into the EIO error case and therefore fall back to the mmap or regular copy. In case of races the reasoning is the same as for a wrong cached result. Therefore, this won't cause problems.
Thank you so much for your efforts!
Description
I noticed that Wordpress on Azure Web App with PHP 8.2 stack on Linux fails to install and upgrade anything because copy() fails when trying to copy files from /home/site/wwwroot/wp-content/upgrade/ to /home/site/wwwroot/wp-content/plugins/.
/home/ is a cifs mount.
The following code:
Resulted in this output:
But I expected this output instead: (no output and successful file copy)
I did a strace:
Here we can see that PHP specifies a very large length value for the fifth argument, and that copy_file_range() results in an I/O error. So I guess the large length value works fine on most filesystems, but not on cifs.
PHP source from php-8.2.2/main/streams/streams.c lines 1576,1577,1584:
I took the copy_file_range() example from https://man7.org/linux/man-pages/man2/copy_file_range.2.html that specifies the correct length by getting the file size with fstat(). Compiling it and running it with strace shows successful copy of the same files within the same /home cifs mount:
The solution would be to do the same as the example on https://man7.org/linux/man-pages/man2/copy_file_range.2.html and get the correct file size of the source file and specify it as the fifth argument to copy_file_range().
Additional info (some parts replaced with ... for privacy):
PHP Version
PHP 8.2
Operating System
Debian 11.6