Significantly improve file download speed by enabling mmap based… #119
Conversation
Thanks for the PR. This looks like a well-investigated patch.
Maybe someone else can confirm this improves performance on other setups.
PS: In case you don't have fast storage (an SSD), we can simulate it by creating a sparse file on a slow disk. Linux is able to read a sparse file without doing actual I/O: truncate -s 10G 10G.bin
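If it helps, roughly the same sparse file can also be created from PHP (a small sketch; the file name and size are just examples):

```php
// Create a 10 GiB sparse file; on most Linux filesystems this allocates no real
// data blocks, so reading it back requires no actual disk I/O.
$fh = fopen('10G.bin', 'w');
ftruncate($fh, 10 * 1024 * 1024 * 1024);
fclose($fh);
```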
After re-reading the patch a few times, I wonder if this is something that should be optimized at the php-src level. Couldn't PHP internally use a properly sized buffer so mmap works no matter how big the file passed in from userland is, @nikic?
Codecov Report

@@             Coverage Diff              @@
##             master      #119     +/-   ##
============================================
+ Coverage     93.52%     93.61%   +0.08%
- Complexity      251        257       +6
============================================
  Files            15         15
  Lines           819        830      +11
============================================
+ Hits            766        777      +11
  Misses           53         53

Continue to review the full report at Codecov.
@staabm Thanks for the review. Doing the loop at the php-src level is a possible way, but I think the challenge is that there would be too many mmap/munmap cycles if the chunk is too small. Maybe we can open a discussion there.
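For context, the userland workaround this PR is built around boils down to something like the following sketch (the stream names, file path, and failure handling are illustrative, not the exact Sapi.php code):

```php
// Copy in chunks no larger than 4 MiB so every stream_copy_to_stream() call stays
// under php-src's mmap ceiling and can take the mmap fast path.
$chunkSize = 4 * 1024 * 1024;

$source = fopen('large-file.bin', 'rb');   // hypothetical source stream
$output = fopen('php://output', 'wb');

while (!feof($source)) {
    $copied = stream_copy_to_stream($source, $output, $chunkSize);
    if ($copied === false || $copied === 0) {
        break; // stop on failure or when no more data is available
    }
}
```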
Don't get me wrong: I would like to have your patch in sabre-io/http, but I am only suggesting to also double-check the C level, because I guess this is a problem that might be fixed in php-src and would therefore improve performance in a lot more libraries and use cases than ours. No matter how/when/why it gets fixed in php-src, we should land this fix here (as long as it doesn't regress other things).
@staabm Thanks, I got your point. I am also revisiting the PHP core code and looking at whether something can be done there. After investigating, I will open a bug/pull-request/RFC against PHP core to discuss it.
For a partial content response, additionally make the start position for the stream copy a multiple of the page size. Things become more complex here, so we should definitely look into whether we can optimize the PHP C function side.
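As a sketch of what that alignment means (the page size, file path, and variable names are assumptions, not the exact patch):

```php
// Illustrative only; in the real patch these values come from the parsed
// Content-Range header and the response body stream.
$rangeStart = 1000000;                  // hypothetical first byte of the requested range
$pageSize   = 4096;                     // assumed page size; platform dependent
$body       = fopen('large-file.bin', 'rb');
$output     = fopen('php://output', 'wb');

fseek($body, $rangeStart);

// Copy the unaligned head so later chunked copies start on a page boundary.
$headBytes = ($pageSize - ($rangeStart % $pageSize)) % $pageSize;
if ($headBytes > 0) {
    stream_copy_to_stream($body, $output, $headBytes);
}
// Every following 4 MiB chunk now begins at a multiple of the page size,
// which keeps php-src's mmap-based copy path usable.
```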
Is it worth writing a couple of unit tests for this? There are a few more branches in the new code. Are they all being hit?
I will take a look at the unit test part soon.
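A test along these lines could exercise the chunk loop; everything here is a sketch, and copyStreamInChunks() is a hypothetical helper name rather than anything from the actual patch:

```php
use PHPUnit\Framework\TestCase;

class ChunkedCopyTest extends TestCase
{
    public function testCopiesDataLargerThanOneChunk(): void
    {
        // More than one 4 MiB chunk, so the loop has to iterate at least twice.
        $data = str_repeat('x', 5 * 1024 * 1024);

        $source = fopen('php://temp', 'r+');
        fwrite($source, $data);
        rewind($source);

        $destination = fopen('php://temp', 'r+');
        copyStreamInChunks($source, $destination, 4 * 1024 * 1024); // hypothetical helper

        rewind($destination);
        $this->assertSame($data, stream_get_contents($destination));
    }
}
```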
$chunk_size = 4 * 1024 * 1024;
stream_set_chunk_size($output, $chunk_size);
// If this is a partial response, flush the beginning bytes until the first position that is a multiple of the page size.
$contentRange = $response->getHeader('Content-Range');
How common are Range requests here? Is it sufficient for your reported use case to use this hack in the simpler form you submitted initially?
Maybe something along these lines would also be good enough.
Pseudo code:
if (range-request) {
    // old way, which doesn't support mmap
} else {
    // your optimized version, without Range request support
}
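Rendered as PHP, that suggestion might look roughly like this (a sketch; the variable names and the chunked loop are illustrative, not the submitted patch):

```php
// Hypothetical branch: keep the old single-call copy for Range requests and use
// the chunked, mmap-friendly loop only for full-file responses.
if ($response->getHeader('Content-Range') !== null) {
    // old way: one big copy, which cannot use mmap for files over 4 MiB
    stream_copy_to_stream($body, $output);
} else {
    // optimized version: 4 MiB chunks so each call can be mmap-backed
    while (!feof($body)) {
        if (!stream_copy_to_stream($body, $output, 4 * 1024 * 1024)) {
            break;
        }
    }
}
```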
It is not common for downloading a file, but it is for streaming a video. Regarding performance, I found it is also beneficial for downloading a file on an HDD: from 70-80 MB/s to 110-120 MB/s. But I think that might be because a larger chunk size (4 MiB) is used in this PR, while the manual copy implementation in `stream_copy_to_stream` only uses an 8 KiB chunk (https://github.com/php/php-src/blob/00d2b7a3d28d29383450fc2607c7d4fb91d57a4a/main/streams/streams.c#L1469 and https://github.com/php/php-src/blob/00d2b7a3d28d29383450fc2607c7d4fb91d57a4a/main/streams/php_streams_int.h#L49).
I personally prefer to use this "hack" for both range and non-range responses and to add unit tests covering it.
Side note: I reported the initial problem upstream on php-src so we can get the discussion going at that level as well.
@staabm Hi, I haven't been following up on this issue for a while. This PR is definitely a workaround until php-src can eventually fix this. Can this PR be merged? Anything I can help with?
Hey @vfreex, I had totally forgotten this PR. Will merge now after a green build.
It will be in 5.0.3: #129
@staabm Thanks so much!
@vfreex do you have some time to provide detailed feedback on the upstream php-src bug report?
In any case, nginx/Apache would do a significantly better job than PHP anyway; Nextcloud (or anything, really) should use X-Accel-Redirect / X-Sendfile for large files, not PHP.
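For the record, that offloading would be along these lines (a sketch; the header values, paths, and required server configuration are assumptions and depend entirely on the deployment):

```php
// Let the web server stream the file instead of PHP.
// nginx: X-Accel-Redirect must point at an internal location that maps to the file.
$response->setHeader('X-Accel-Redirect', '/internal-files/video.mp4');

// Apache with mod_xsendfile (or lighttpd) would instead use:
// $response->setHeader('X-Sendfile', '/var/data/files/video.mp4');
```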
@ho4ho could you investigate the infinite loop?
Sorry, I missed the message because I have so many GitHub notifications. I will try to follow up within a few days.
Using nginx/Apache to send files is nice to have. I think it is better to handle that option in separate PRs (it requires collaboration on both the sabre and Nextcloud projects).
I'll also take a look and see whether the infinite loop is related to this PR (or #133).
@vfreex yes, there might be an edge case somewhere in this stuff.
@phil-davis I created #137 to break the copy stream loop on failure, although I am not quite sure whether it is the root cause.
…am_copy_to_stream

Background

I run a Nextcloud server on a VM (1 CPU core, 1 GB RAM, 10 Gbps virtio network, PHP 7.2.10) on SSD storage. Directly reading a large file (let's say 10 GiB) on that server gives a read speed of around 540-550 MiB/s. However, the download speed over WebDAV is only about 240-260 MiB/s.

I tried replacing the `stream_copy_to_stream` function call in Sapi.php with `fpassthru` and got a significant performance boost (440-470 MiB/s). It seems to me that the performance bottleneck is `stream_copy_to_stream`.

Root Cause

I started investigating this problem by digging into the source code of the PHP internal functions and finally figured out what was going wrong. `stream_copy_to_stream` tries to mmap the source file; if that fails, it falls back to the manual copy path: https://github.com/php/php-src/blob/00d2b7a3d28d29383450fc2607c7d4fb91d57a4a/main/streams/streams.c#L1467. But PHP refuses to mmap a file larger than 4 MiB, according to https://github.com/php/php-src/blob/623911f993f39ebbe75abe2771fc89faf6b15b9b/main/streams/mmap.c#L34.

Solution (workaround?)

So, I started to modify the code: instead of asking `stream_copy_to_stream` to copy the whole to-be-downloaded file, call `stream_copy_to_stream` in a loop for each 4 MiB chunk of the file. After applying this PR, I get full download speed, just like reading that file locally (even faster than `fpassthru`).

I can't wait to share my patch because I am really happy with the optimization! Please help me review it and tell me if this is the right way.