Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase chunk transfer size for iRODS #6004

Closed
wants to merge 3 commits into from
Closed

Increase chunk transfer size for iRODS #6004

wants to merge 3 commits into from

Conversation

brainstorm
Copy link
Member

While testing the iRODS extension, I noticed that the transfer rates were suboptimal. Tracing with strace revealed a 8kb chunk size when transferring a 2.5GB file filled with zeros:

# lsof /tmp/php86IZkm
COMMAND  PID     USER   FD   TYPE DEVICE   SIZE/OFF NODE NAME
apache2 3009 www-data   17r   REG  253,1 2671771648   59 /tmp/php86IZkm

Actual data packets being sent by process id 3009 above (with a single apache thread), those messages appear every 0.5 seconds with a payload of 8192 bytes each:

sendto(16, "\0\0\0\210<MsgHeader_PI><type>RODS_API"..., 222, MSG_DONTWAIT, NULL, 0) = 222
sendto(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, MSG_DONTWAIT, NULL, 0) = 8192

sendto(16, "\0\0\0\210<MsgHeader_PI><type>RODS_API"..., 222, MSG_DONTWAIT, NULL, 0) = 222
sendto(16, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, MSG_DONTWAIT, NULL, 0) = 8192

Before the patch it took around 24h to upload. After this patch, the transfer takes less than an hour.

With native iRODS iput commands, the transfer takes around 1minute, so there's still room for improvement.

Thanks to @cansmith and @DeepDiver1975.

@ghost
Copy link

ghost commented Nov 22, 2013

Can one of the admins verify this patch?

@brainstorm
Copy link
Member Author

Future ways to improve this:

@DeepDiver1975
Copy link
Member

@brainstorm Thanks a lot for your contribution!
Contributions to the core repo require a signed contributors agreement http://owncloud.org/about/contributor-agreement/

Alternatively you can add a comment here where you state that this contribution is MIT licensed.

@DeepDiver1975
Copy link
Member

@owncloud-bot this is okay to test

@@ -201,7 +201,8 @@ public function readfile($path) {
@ob_end_clean();
$handle = $this->fopen($path, 'rb');
if ($handle) {
$chunkSize = 8192; // 8 kB chunks
//$chunkSize = 8192; // 8 kB chunks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation seems off here - we use tabs -THX

@brainstorm
Copy link
Member Author

Guys can I just merge this and open another issue with #6004 (comment) ?

As it is now it is just embarassingly slow and quite unusable.

@DeepDiver1975
Copy link
Member

Guys can I just merge this and open another issue with #6004 (comment) ?

definitely: NO! We are in release candidate mode and this pull request touches core mechanisms (changing block sizes). The impact is still to be analysed.

👎 sorry

@DeepDiver1975
Copy link
Member

Future ways to improve this:

Spawn more threads than just one when transferring

PHP has no threading concept - nothing we can do here

Write and run some tests to determine average optimal chunk size
Use PHP abstractions to sendfile or splice system calls
Perhaps wrap i-commands instead of using PRODS



Incorporate enhacements from upstream PRODS, like PAM authentication: https://code.renci.org/gf/project/irodsphp/tracker/?action=TrackerItemEdit&tracker_item_id=1682&tracker_id=1682

PAM has been implemented and tested early this year

@brainstorm
Copy link
Member Author

@DeepDiver1975,

Regarding PAM, I was referring to a recent patch that affects iRODS 3.3 and PRODS, from the link in the tasklist above, I can incorporate/sync this one myself later on:

(...)
Please find as attachment a patch that:
- fix the requested ttl in pRODS, and add an entry in prods.ini for PAM;
- workaround for saving the pam-derived password;
- extend the form in the login page, allowing the selection of the auth method
(irods or PAM).

Regarding threading, what about this one?: http://php.net/manual/en/book.pthreads.php

@DeepDiver1975
Copy link
Member

Regarding threading, what about this one?: http://php.net/manual/en/book.pthreads.php

Due to the requirements as listed here I don't see this as a reasonable approach: http://www.php.net/manual/en/pthreads.requirements.php

This would only make sense if it can be implemented as an option and not as a hard requirement.

@DeepDiver1975
Copy link
Member

Regarding PAM, I was referring to a recent patch that affects iRODS 3.3 and PRODS, from the link in the tasklist above, I can incorporate/sync this one myself later on:

I see - THX

if ($fh) {
// override the default 8k php stream chunk
// size for non-file streams.
stream_set_chunk_size($fh, 1024*1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to get the chunk file size from a stream? then we can use that size for readFile and streamCopy and remove the need to hardcode the new values there.

@jancborchardt
Copy link
Member

I’m closing this pull request for now as @brainstorm didn’t comment or update this in 4 months.

@brainstorm you can always reopen or update the pull request. :) Improvement on this matter is very welcome, but as @DeepDiver1975 mentioned we have to properly test it before merging.

@joycenijkamp
Copy link

Hello Brainstorm, I reacted on our blogpost about the owncloud - irods integration, but cannot change the spelling faults I made by accident. Is it possible to correct these?

@michael-conway
Copy link

Hey guys, Roman just pointed me to this. I'm at the DICE group over at UNC and work on iRODS. I told Roman I could help you guys get the iRODS PHP API and streaming performance squared away. That PHP code is a little long in the tooth so I think there's headroom to make streaming i/o better on the PHP side.

Cheers
MC

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants