Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use stream_copy_to_stream #1014

Merged
merged 3 commits into from
Apr 21, 2023
Merged

Conversation

divinity76
Copy link
Contributor

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)
@ner00
Copy link
Contributor

ner00 commented Apr 9, 2023

This might be a problem under 32bit environments with files bigger than 4GiB (bug #74395).
Also, on Windows, PHP 64-bit prior to v8.0.9 (bug #81145) seems to have an issue with files bigger than 4 GiB regardless.

At best stream_copy_to_stream would need to check if the input file is supported by the server function or if it's going to end with a corrupt output. Something similar to this to prevent issues might work IF the performance gains are drastically beneficial to the classic method:

if (PHP_INT_SIZE === 4 || (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN' && version_compare(phpversion(), '8.0.9', '<'))) {
    while ($buff = fread($in, 4096)) { fwrite($out, $buff); }
} else {
    stream_copy_to_stream($in, $out);
}

Copy link
Contributor

@ner00 ner00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if server can write files bigger than 4GiB, especially considering PHP bugs #74395 and #81145.

@divinity76
Copy link
Contributor Author

divinity76 commented Apr 10, 2023

@ner00 even just testing 32bit php5.5 is very cubersome; i installed a 32bit debian 11 VM, couldn't find a php5.5 pre-built binary, tried to compile php5.5, but building php5.5 from source requires an ancient 2012 version of GNU Bison (2.7), couldn't find a pre-built bison2.7, trying to build bison2.7 from source ends with this error:

  CC       fprintf.o
  CC       fseterr.o
fseterr.c: In function 'fseterr':
fseterr.c:77:3: error: #error "Please port gnulib fseterr.c to your platform! Look at the definitions of ferror and clearerr on your system, then report this to bug-gnulib."
   77 |  #error "Please port gnulib fseterr.c to your platform! Look at the definitions of ferror and clearerr on your system, then report this to bug-gnulib."
      |   ^~~~~
make[3]: *** [Makefile:1915: fseterr.o] Error 1
make[3]: Leaving directory '/temp/bison/bison-2.7.1/lib'
make[2]: *** [Makefile:1672: all] Error 2
make[2]: Leaving directory '/temp/bison/bison-2.7.1/lib'
make[1]: *** [Makefile:1558: all-recursive] Error 1
make[1]: Leaving directory '/temp/bison/bison-2.7.1'
make: *** [Makefile:1488: all] Error 2
root@32bitdebian:/temp/bison/bison-2.7.1# 

googlging that lead me to this patch: https://github.com/rdslw/openwrt/blob/e5d47f32131849a69a9267de51a30d6be1f0d0ac/tools/bison/patches/110-glibc-change-work-around.patch

after applying that patch to bison, i was finally able to get bison2.7 to build with

wget 'https://ftp.gnu.org/gnu/bison/bison-2.7.1.tar.xz';
tar xfv https://ftp.gnu.org/gnu/bison/bison-2.7.1.tar.xz;
cd bison-2.7.1;
wget 'https://github.com/rdslw/openwrt/blob/e5d47f32131849a69a9267de51a30d6be1f0d0ac/tools/bison/patches/110-glibc-change-work-around.patch' -O- | git apply -;
./configure;
make;
make install;

, and finally able to build PHP5.5 with

git clone -b 'PHP-5.5' --depth 1 https://github.com/php/php-src.git && \
cd 'php-src' && \
./buildconf && \
./configure --disable-all --disable-cgi --enable-cli && \
make clean && \
make -j $(nproc)

and voila

root@32bitdebian:/temp/php/php-src# sapi/cli/php -v
PHP 5.5.38 (cli) (built: Apr 10 2023 12:27:35) 
Copyright (c) 1997-2015 The PHP Group
Zend Engine v2.5.0, Copyright (c) 1998-2015 Zend Technologies

and now to test stream_copy_to_stream on a 5GB file:

root@32bitdebian:/temp/php/php-src# truncate -s 5G 5g.bin
root@32bitdebian:/temp/php/php-src# du -h --apparent-size 5g.bin 
5.0G	5g.bin
root@32bitdebian:/temp/php/php-src# du --bytes --apparent-size 5g.bin 
5368709120	5g.bin
root@32bitdebian:/temp/php/php-src# cat test.php
<?php
$h = fopen("5g.bin","rb");
$h2 = fopen("5g.copy.bin","wb");
$ret = stream_copy_to_stream($h,$h2);
var_dump($ret);

root@32bitdebian:/temp/php/php-src# time sapi/cli/php test.php

Warning: fopen(5g.bin): failed to open stream: Value too large for defined data type in /temp/php/php-src/test.php on line 2

Warning: stream_copy_to_stream() expects parameter 1 to be resource, boolean given in /temp/php/php-src/test.php on line 4
bool(false)

real	0m0.007s
user	0m0.007s
sys	0m0.001s

well.... that was certainly an unexpected result, but the problem isn't stream_copy_to_stream, it's fopen! :O (at least on 32bit php5.5.38)

@divinity76
Copy link
Contributor Author

divinity76 commented Apr 10, 2023

trying the same on a file that is exactly 0xFFFFFFFF ( 4294967295 ) bytes, aka the max possible value of a uint32_t (keep in mind, 32bit php's int is int32_t, not uint32_t) yield the same fopen error.

trying on a file that is exactly 0x7FFFFFFF (2147483647 ) bytes, aka 32bit PHP's PHP_INT_MAX yields:

root@32bitdebian:/temp/php/php-src# time sapi/cli/php test.php
int(2147483647)

real	0m36.857s
user	0m0.261s
sys	0m36.576s
root@32bitdebian:/temp/php/php-src# du --bytes --apparent-size 4g.copy.bin 
2147483647	4g.copy.bin

success! didn't hit any stream_copy_to_stream bug, when using the oldest 32bit php supported by tinyfilemanager.

trying 2147483648 bytes (PHP_INT_MAX + 1) yielded the same fopen error.

@divinity76
Copy link
Contributor Author

to keep it simple, how about we just do

if (PHP_VERSION_ID < 80009) {
    // workaround https://bugs.php.net/bug.php?id=81145
    while ($buff = fread($in, 4096)) { fwrite($out, $buff); }
} else {
    stream_copy_to_stream($in, $out);
}

?

@divinity76
Copy link
Contributor Author

divinity76 commented Apr 10, 2023

ouch shit, just found a bug in the original code: https://3v4l.org/PItRM
the while loop should be something like

while (!in_array($buff = fread($in, 4096), array("", false), true)) { fwrite($out, $buff); }

otherwise it will fail on files being any multiple of 4096-bytes+1 and ending with 0 😨

@divinity76
Copy link
Contributor Author

divinity76 commented Apr 10, 2023

PR updated, fixed the while loop early return, and only use stream_copy_to_stream on php >= 8.0.9 because of https://bugs.php.net/bug.php?id=81145

must admit, if you're into micro-optimizations, doing

for(;;){
    $buff = fread($in, 4096);
    if($buff === false || $buff === ""){
        break;
    }
    fwrite($out, $buff);
}

would use fewer cpu cycles, avoiding a relatively expensive function call (in_array).. but meh, bet you'd have to use hyperfine to heven notice a difference.

Copy link
Contributor

@ner00 ner00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Seems to have an average performance increase anywhere between 15% to 30% using stream_copy_to_stream

@prasathmani
Copy link
Owner

@divinity76 and @ner00 thanks for your support, I'll verify and merge sometime soon.

@divinity76
Copy link
Contributor Author

when this is merged, it will also fix #1016 :)

ner00 added a commit to ner00/tinyfilemanager that referenced this pull request Apr 16, 2023
@prasathmani prasathmani merged commit f6a9365 into prasathmani:master Apr 21, 2023
ner00 pushed a commit to ner00/tinyfilemanager that referenced this pull request May 6, 2023
* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref prasathmani#1016 (comment)
ner00 pushed a commit to ner00/tinyfilemanager that referenced this pull request May 6, 2023
* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref prasathmani#1016 (comment)
ner00 pushed a commit to ner00/tinyfilemanager that referenced this pull request May 7, 2023
* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref prasathmani#1016 (comment)
ner00 pushed a commit to ner00/tinyfilemanager that referenced this pull request May 7, 2023
* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref prasathmani#1016 (comment)
ner00 pushed a commit to ner00/tinyfilemanager that referenced this pull request May 7, 2023
* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref prasathmani#1016 (comment)
ner00 pushed a commit to ner00/tinyfilemanager that referenced this pull request May 7, 2023
* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref prasathmani#1016 (comment)
ner00 pushed a commit to ner00/tinyfilemanager that referenced this pull request May 7, 2023
* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref prasathmani#1016 (comment)
prasathmani added a commit that referenced this pull request May 7, 2023
* publish additional docker tags (#975)

* Update Romanian translations (#981)

* Update tinyfilemanager.php

* Prevent logout issue after page was cached (#1004)

Logout may not work otherwise, browser reloads cached page from disk instead of sending GET request ?logout=1 to server.

* tell git to always commit .php in unix-newlines (#1017)

so hopefully we don't get a repeat of #994 (comment)

* Check if posix_getpwuid/posix_getgrgid calls were successful (#1023)

* use stream_copy_to_stream (#1014)

* use stream_copy_to_stream

it's simpler, and should be faster.
For example, stream_copy_to_stream can use sendfile ( https://man7.org/linux/man-pages/man2/sendfile.2.html ) on operating systems supporting it, which is faster and use less RAM than fread()+fwrite() (because it avoids copying data to/from userland, doing the copy entirely in-kernel~)

* fix loop early return, and workaround bug

* use feof

ref #1016 (comment)

* added bengali translation (#1018)

* Fix upload of existing files (#1026)

* Fix typo. (#1028)

* login (Redirecting to Main domain of website instead of tfm.php) fix (#1031)

When logged in it takes to the website's main URL. For example, if I have tfm in www.example.com/tfm/index.php (index.php is tfm) then after logging in it redirects to www.example.com and then have to press back on the browser then it takes to www.example.com/tfm/index.php

* Add configurable path display modes for better privacy and clarity (#1034)

* Resize preview image and implement zoom in/out (#1036)

* Resize preview image and implement zoom in/out

* Remove redundant class name

---------

Co-authored-by: ssams <6338356+ssams@users.noreply.github.com>
Co-authored-by: Sergiu Bivol <sergiu@cip.md>
Co-authored-by: Prasath Mani <prasathmani@users.noreply.github.com>
Co-authored-by: divinity76 <divinity76@gmail.com>
Co-authored-by: Micha Ober <github@ober-mail.de>
Co-authored-by: Joy Biswas <74253956+joybiswas007@users.noreply.github.com>
Co-authored-by: Micha Ober <git@ober-mail.de>
Co-authored-by: Caleb Mazalevskis <maikuolan@gmail.com>
Co-authored-by: xololunatic <97784387+xololunatic@users.noreply.github.com>
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants