Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy() fails on cifs mounts because of incorrect length (cfr_max) specified in streams.c:1584 copy_file_range() #10548

Closed
knowsshit opened this issue Feb 8, 2023 · 9 comments

Comments

@knowsshit
Copy link

Description

I noticed that Wordpress on Azure Web App with PHP 8.2 stack on Linux fails to install and upgrade anything because copy() fails when trying to copy files from /home/site/wwwroot/wp-content/upgrade/ to /home/site/wwwroot/wp-content/plugins/.

/home/ is a cifs mount.

The following code:

if ( ! copy( '/home/site/wwwroot/sourcefile', '/home/site/wwwroot/destfile' ) )
  echo "copy failed\n";

Resulted in this output:

copy failed

But I expected this output instead: (no output and successful file copy)

I did a strace:

copy_file_range(5, NULL, 6, NULL, 9223372036854775807, 0) = -1 EIO (Input/output error)

Here we can see that PHP specifies a very large length value for the fifth argument, and that copy_file_range() results in an I/O error. So I guess the large length value works fine on most filesystems, but not on cifs.

PHP source from php-8.2.2/main/streams/streams.c lines 1576,1577,1584:

                        /* clamp to INT_MAX to avoid EOVERFLOW */
                        const size_t cfr_max = MIN(maxlen, (size_t)SSIZE_MAX);
                        ssize_t result = copy_file_range(src_fd, NULL, dest_fd, NULL, cfr_max, 0);

I took the copy_file_range() example from https://man7.org/linux/man-pages/man2/copy_file_range.2.html that specifies the correct length by getting the file size with fstat(). Compiling it and running it with strace shows successful copy of the same files within the same /home cifs mount:

copy_file_range(3, NULL, 4, NULL, 11, 0) = 11

The solution would be to do the same as the example on https://man7.org/linux/man-pages/man2/copy_file_range.2.html and get the correct file size of the source file and specify it as the fifth argument to copy_file_range().

Additional info (some parts replaced with ... for privacy):

# php -v
PHP 8.2.0 (cli) (built: Dec 27 2022 20:38:32) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.0, Copyright (c) Zend Technologies
    with Zend OPcache v8.2.0, Copyright (c), by Zend Technologies

# egrep '(Version|Features)' /proc/fs/cifs/DebugData
/proc/fs/cifs/DebugData:CIFS Version 2.10
/proc/fs/cifs/DebugData:Features: DFS,FSCACHE,STATS,DEBUG,ALLOW_INSECURE_LEGACY,WEAK_PW_HASH,CIFS_POSIX,UPCALL(SPNEGO),XATTR,ACL

# mount | grep home
//10.0.0.1/volume-...-default/... on /home type cifs (rw,relatime,vers=default,cache=strict,username=dummyadmin_...,domain=,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.0.1,file_mode=0777,dir_mode=0777,soft,nounix,mapposix,mfsymlinks,noperm,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)

# uname -a
Linux 3f6c4cc7fa2c 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 GNU/Linux
# cat /etc/debian_version 
11.6

# cat /etc/motd 
   _____                               
  /  _  \ __________ _________   ____  
 /  /_\  \\___   /  |  \_  __ \_/ __ \ 
/    |    \/    /|  |  /|  | \/\  ___/ 
\____|__  /_____ \____/ |__|    \___  >
        \/      \/                  \/ 
A P P   S E R V I C E   O N   L I N U X

Documentation: http://aka.ms/webapp-linux
PHP quickstart: https://aka.ms/php-qs
PHP version : 8.2.0
Note: Any data outside '/home' is not persisted

PHP Version

PHP 8.2

Operating System

Debian 11.6

@nielsdos
Copy link
Member

nielsdos commented Feb 8, 2023

Could be fixed by #10440.
EDIT: nevermind I misunderstood the issue, sorry (when I looked at the length of the strace I immediately thought about the corruption case but this one is different, I should've looked more carefully).

@nielsdos
Copy link
Member

nielsdos commented Feb 8, 2023

Your methodology to fix this issue sounds correct. You can probably use php_stream_state instead of fstat to support all kinds of streams of PHP (and if that call fails by returning -1, then you can skip the file_copy_range I guess). If you have a patch feel free to PR it :)

@knowsshit
Copy link
Author

I would love to write a patch, but I am not sure if I am able to do it properly. The main problem here is that PHP uses a length argument with copy_file_range() that is larger than the source file size.

...so as long as PHP can ensure to reduce the length to the source size it would be good, and I guess combining it with looping through copy_file_range() until every byte is copied as proposed in #10440 would be the correct solution here.

Can anyone help out to make this happen?

@nielsdos
Copy link
Member

nielsdos commented Feb 8, 2023

I can put it on my todo list for this week, unless someone beats me to it

@knowsshit
Copy link
Author

I can put it on my todo list for this week, unless someone beats me to it

Fantastic!

nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 9, 2023
…gth (cfr_max) specified in streams.c:1584 copy_file_range()

On some filesystems, the copy operation fails if we specify a size
larger than the file size. We use a stat call to clamp the size to copy
to the actual filesize. This stat call shouldn't impact performance
notably because stat calls can be cached. In some cases (like for /proc
files), the returned size is 0, so we should avoid problems by not using
copy_file_range in those cases (copy_file_range wouldn't work anyway on
this particular example because the syscall is not supported for /proc).
nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 10, 2023
…gth (cfr_max) specified in streams.c:1584 copy_file_range()

On some filesystems, the copy operation fails if we specify a size
larger than the file size. We use a stat call to clamp the size to copy
to the actual filesize. This stat call shouldn't impact performance
notably because stat calls can be cached. In some cases (like for /proc
files), the returned size is 0, so we should avoid problems by not using
copy_file_range in those cases (copy_file_range wouldn't work anyway on
this particular example because the syscall is not supported for /proc).
nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 11, 2023
…gth (cfr_max) specified in streams.c:1584 copy_file_range()

On some filesystems, the copy operation fails if we specify a size
larger than the file size in certain circumstances and configurations.
In those cases EIO will be returned as errno and we will therefore fall
back to other methods.
arnaud-lb added a commit that referenced this issue Feb 11, 2023
* PHP-8.2:
  [ci skip] NEWS
  Fix GH-10548: copy() fails on cifs mounts because of incorrect length (cfr_max) specified in streams.c:1584 copy_file_range() (#10551)
@knowsshit
Copy link
Author

Thanks again!

Does this mean that PHP will always try to copy_file_range() on cifs with length cfr_max, fail with EIO and then fall back to "classic copying"? I like that it will work, which is the most important improvement for me (so Wordpress can update itself in an Azure Web App).

But will this also mean that copies will never be made server side, resulting in PHP reading the source file over the network and writing it back over the network? If so, that sounds unfortunate for those who want to copy larger files over cifs, or those who have limited network bandwidth for such operations. I am no expert, but is it that bad to use a stat call, even it is cached, since it will fall back to classic copying if it fails anyway? I mean there is also already code for "there may be more data; continue copying using the fallback code below" as well.

@nielsdos
Copy link
Member

Does this mean that PHP will always try to copy_file_range() on cifs with length cfr_max, fail with EIO and then fall back to "classic copying"? I like that it will work, which is the most important improvement for me (so Wordpress can update itself in an Azure Web App).

Yes. Actually, I did find that it depends on the server. On my testing I found that I could specify a file length larger than the max file length and it worked properly without EIO.

But will this also mean that copies will never be made server side, resulting in PHP reading the source file over the network and writing it back over the network?

Depends on the configuration.

If so, that sounds unfortunate for those who want to copy larger files over cifs, or those who have limited network bandwidth for such operations. I am no expert, but is it that bad to use a stat call, even it is cached, since it will fall back to classic copying if it fails anyway? I mean there is also already code for "there may be more data; continue copying using the fallback code below" as well.

Hmm that could work even in the case of getting a too small size or having races because we now do fallback in case of EIO.
@arnaud-lb It might make sense to extend the current approach with a stat call. In case of having a size less than the actual file size, the code will fall back. In case the size is too large and CIFS complains then EIO will be hit and the fallback will be taken. In case of races the size is wrong so we again end up in the fallback case. So I think it would work. What do you think?

@arnaud-lb
Copy link
Member

Yes, this seems sensible

nielsdos added a commit to nielsdos/php-src that referenced this issue Feb 18, 2023
… a too large length

Some filesystems will cause failures if the max length is greater than
the file length. We therefore use a stat call to get the actual file size.
Since stat calls are cached this shouldn't have much impact on performance.
This addresses a performance concern for CIFS filesystems and is described in phpGH-10548.
In case the cached result is less than the actual size, the code will just copy more
in the next iteration. In case it is too large, it will at worst fall into the EIO error
case and therefore fall back to the mmap or regular copy. In case of races the reasoning
is the same as for a wrong cached result. Therefore, this won't cause problems.
@knowsshit
Copy link
Author

Thank you so much for your efforts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants