File corruption in _php_stream_copy_to_stream_ex when using copy_file_range #10370
Comments
Thanks for the issue report.
No need for a reproducer anymore, I can reproduce it using an artificial test file and an artificial modification to my system.
The following patch works for me. I agree that we should also look into repeating
diff --git a/main/streams/streams.c b/main/streams/streams.c
index 20029fc73e..28fa7cb651 100644
--- a/main/streams/streams.c
+++ b/main/streams/streams.c
@@ -1634,9 +1634,15 @@ PHPAPI zend_result _php_stream_copy_to_stream_ex(php_stream *src, php_stream *de
char *p;
do {
- size_t chunk_size = (maxlen == 0 || maxlen > PHP_STREAM_MMAP_MAX) ? PHP_STREAM_MMAP_MAX : maxlen;
- size_t mapped;
+ size_t must_read = maxlen - haveread;
+ size_t chunk_size = (maxlen == 0 || must_read > PHP_STREAM_MMAP_MAX) ? PHP_STREAM_MMAP_MAX : must_read;
+
+ /* Prevent mapped from becoming larger than the file */
+ if (chunk_size > must_read) {
+ chunk_size = must_read;
+ }
+ size_t mapped;
p = php_stream_mmap_range(src, php_stream_tell(src), chunk_size, PHP_STREAM_MAP_MODE_SHARED_READONLY, &mapped);
if (p) {
@@ -1650,6 +1656,7 @@ PHPAPI zend_result _php_stream_copy_to_stream_ex(php_stream *src, php_stream *de
didwrite = php_stream_write(dest, p, mapped);
if (didwrite < 0) {
*len = haveread;
+ php_stream_mmap_unmap(src);
return FAILURE;
}
@@ -1666,9 +1673,9 @@ PHPAPI zend_result _php_stream_copy_to_stream_ex(php_stream *src, php_stream *de
if (mapped < chunk_size) {
return SUCCESS;
}
- if (maxlen != 0) {
- maxlen -= mapped;
- if (maxlen == 0) {
+ if (must_read != 0) {
+ must_read -= mapped;
+ if (must_read == 0) {
return SUCCESS;
}
}
Could you please test the patch too if you're able to :) ? You can apply the patch on top of the PHP-8.2 branch using the
Note: the patch also contains a fix for an mmap-resource leak. It also avoids modifying maxlen, so that the fallback after mmap works too.
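To make the intent of the chunk-size change easier to follow, here is a minimal standalone sketch of the calculation, not the code from the patch itself. The names `next_mmap_chunk` and `SKETCH_MMAP_MAX` are hypothetical, and the 512 MiB cap is an assumption standing in for PHP_STREAM_MMAP_MAX. The sketch treats `maxlen == 0` as "copy until end of stream" and handles that case separately.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PHP_STREAM_MMAP_MAX (assumed value). */
#define SKETCH_MMAP_MAX ((size_t)512 * 1024 * 1024)

/*
 * Size of the next chunk to map.
 *   maxlen   - total bytes requested; 0 means "copy until end of stream"
 *   haveread - bytes already copied, e.g. by a short copy_file_range
 */
static size_t next_mmap_chunk(size_t maxlen, size_t haveread)
{
    if (maxlen == 0) {
        return SKETCH_MMAP_MAX; /* no limit: map as much as the cap allows */
    }
    assert(haveread <= maxlen); /* copying more than requested is a caller bug */
    size_t must_read = maxlen - haveread;
    return must_read > SKETCH_MMAP_MAX ? SKETCH_MMAP_MAX : must_read;
}
```

With the numbers from this report, `next_mmap_chunk(5667, 4096)` yields 1571, which is the amount the mmap fallback should have mapped after the short copy.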
Sorry for the late answer; it seems that your reply got lost in my mailbox. First, thanks a lot for taking the time to work on this issue. Here are the steps that I took in order to test it:
And it seemed to work!
Which is the correct checksum (I also manually checked the file and it is valid). It looks like your patch does indeed fix the issue we are having!
No worries, thank you for testing!
…n using copy_file_range
copy_file_range can return early without copying all the data. This is legal behaviour and worked properly, unless the mmap fallback was used. The mmap fallback would read too much data into the destination, corrupting the destination file. Furthermore, if the mmap fallback failed and had to fall back to the regular file copying mechanism, a similar issue would occur because both maxlen and haveread are modified. Furthermore, there was an mmap-resource leak in one of the failure paths of the mmap fallback code. This patch fixes these issues. It also adds regression tests using the new copy_file_range early-return simulation added in the previous commit.
Description
Recently a user notified us of a corrupted file after a composer install on our platform while using PHP 8.2.1.
It seems that under some conditions _php_stream_copy_to_stream_ex can corrupt files.
I am not a PHP expert, but here is the information I have found.
If I run the following code:
I get this output on some of our servers:
But I expected this output instead:
The file output on the first server is clearly corrupted, whereas on the second server it is the expected output.
After some investigation, I think we traced the issue back to the copy_file_range call in streams/streams.c.
On systems where corruption occurs, only part of the requested length is read, whereas on systems where no corruption occurs the entire requested length is read.
Here are the parts of the strace that seem relevant to me.
The strace which results in a corrupted file:
The strace that results in a valid file:
When it resulted in a valid file, it seems that copy_file_range copied the entire requested length (5667 bytes).
But in the corrupted case, it looks like it copied only part of the file (4096 bytes). Then it switched to the mmap method to copy the rest of the file.
I am neither an expert on PHP internals nor on system calls, so the following might be completely wrong. But my understanding is that the mmap call copies too much data:
It seems to ask for 5667 bytes on FD 5. But 4096 bytes were already read from that FD, so shouldn't we read 5667 - 4096 = 1571 bytes?
This seems to be confirmed by the seek that comes after.
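The arithmetic in this report can be checked with a small sketch. The helper names below are hypothetical, and the values used in the note afterwards (5667 bytes requested, 4096 already copied, base offset 73728, observed seek to 83491) are the ones quoted from the strace in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes still to copy after a partial copy. */
static size_t bytes_remaining(size_t requested, size_t already_copied)
{
    assert(already_copied <= requested);
    return requested - already_copied;
}

/* Offset the source should be at once exactly `requested` bytes are consumed. */
static size_t expected_end_offset(size_t base, size_t requested)
{
    return base + requested;
}
```

Here `bytes_remaining(5667, 4096)` gives 1571 and `expected_end_offset(73728, 5667)` gives 79395. The observed seek to 83491 is 4096 bytes past that, which matches the size of the first, short copy.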
The base offset seems to be 73728 (cf. the seek before copy_file_range), so the next seek should be to 73728 + 5667 = 79395.
But according to the strace, it seeks to 83491 (73728 + 5667 + 4096).
A solution would be to subtract the result bytes from the bytes read in the mmap.
Or to do a loop around copy_file_range that would keep calling it until it has read the entire requested length.
The thing we do not understand is why copy_file_range is returning 4096 on some systems, because on almost all systems it copies the entire file even though the environments are very similar (both are on Ubuntu 20.04, using the same kernel branch, with the code run in a Docker container, ...). That's why it's a bit hard to give precise instructions on how to reproduce the issue.
Additional information
On both servers the PHP version is:
PHP Version
PHP 8.2.1
Operating System
Ubuntu 20.04
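The second suggestion in the description, looping until the whole requested length has been copied, can be sketched as follows. This is a minimal illustration under stated assumptions, not PHP's implementation: the copy primitive is abstracted as a hypothetical callback (`copy_step_fn`) so that a short copy can be simulated without depending on kernel behaviour. In real code the callback could wrap copy_file_range(2). `copy_fully`, `fake_source` and `fake_step` are likewise names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical copy primitive: copies up to len bytes and returns the count,
 * 0 at end of input, or -1 on error. */
typedef ssize_t (*copy_step_fn)(void *ctx, size_t len);

/* Keep calling step until len bytes are copied, input ends, or an error occurs.
 * Returns 0 on success, -1 on error; *copied holds the bytes copied so far. */
static int copy_fully(copy_step_fn step, void *ctx, size_t len, size_t *copied)
{
    assert(step != NULL && copied != NULL);
    *copied = 0;
    while (*copied < len) {
        ssize_t n = step(ctx, len - *copied); /* ask only for what is left */
        if (n < 0) {
            return -1;
        }
        if (n == 0) {
            break; /* end of input */
        }
        *copied += (size_t)n;
    }
    return 0;
}

/* Simulated primitive that never copies more than 4096 bytes per call. */
struct fake_source { size_t available; };

static ssize_t fake_step(void *ctx, size_t len)
{
    struct fake_source *src = ctx;
    size_t n = len < 4096 ? len : 4096;
    if (n > src->available) {
        n = src->available;
    }
    src->available -= n;
    return (ssize_t)n;
}
```

The key point is that each iteration requests only `len - *copied` bytes, so a short first copy of 4096 bytes is followed by a request for the remaining 1571 rather than for the full 5667 again.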