Restore streamInput() performance over PagedBytesReference. #5589

Closed

hhoffstaette

The initial implementation of bulk-reading a streamInput() over PagedBytesReference was slow: a bulk copy was carried out by reading one byte at a time.
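To illustrate why a per-byte loop is costly on a paged structure, here is a minimal sketch. `PagedBytesSlowCopy`, its tiny `PAGE_SIZE`, and its method names are hypothetical stand-ins, not the Elasticsearch classes; the point is only that every byte pays for a method call plus a division and a modulo.

```java
// Hypothetical stand-in for a paged byte store; not the Elasticsearch class.
final class PagedBytesSlowCopy {
    static final int PAGE_SIZE = 16; // tiny page size, chosen only for the demo

    private final byte[][] pages;

    PagedBytesSlowCopy(byte[] source) {
        int pageCount = (source.length + PAGE_SIZE - 1) / PAGE_SIZE;
        pages = new byte[pageCount][PAGE_SIZE];
        for (int i = 0; i < source.length; i++) {
            pages[i / PAGE_SIZE][i % PAGE_SIZE] = source[i];
        }
    }

    // Single-byte accessor: one division and one modulo per byte.
    byte get(long index) {
        return pages[(int) (index / PAGE_SIZE)][(int) (index % PAGE_SIZE)];
    }

    // Slow path: a "bulk" copy that is really one get() call per byte.
    int copyByteByByte(long offset, byte[] dest, int destOffset, int len) {
        int written = 0;
        while (written < len) {
            dest[destOffset + written] = get(offset + written);
            written++;
        }
        return written;
    }
}
```

The result is correct, but the work grows with the number of bytes rather than the number of pages touched.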

Times in µs for bulk-reading a stream over plain vs. paged, averaged over 1000 runs:

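| Size (MB) | plain (µs) | paged (µs) | Ratio (paged/plain) |
|----------:|-----------:|-----------:|--------------------:|
| 1 | 72 | 2048 | 28.6 |
| 2 | 140 | 4127 | 29.4 |
| 3 | 218 | 6208 | 28.4 |
| 4 | 430 | 8396 | 19.5 |
| 5 | 700 | 10525 | 15.0 |
| 10 | 1739 | 21055 | 12.1 |
| 20 | 3481 | 42063 | 12.1 |
| 50 | 8701 | 105337 | 12.1 |
| 100 | 17409 | 210848 | 12.1 |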

This changeset restores performance:

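| Size (MB) | plain (µs) | paged (µs) | Ratio (paged/plain) |
|----------:|-----------:|-----------:|--------------------:|
| 1 | 65 | 68 | 1.05 |
| 2 | 134 | 141 | 1.04 |
| 3 | 198 | 235 | 1.18 |
| 4 | 456 | 418 | 0.91 |
| 5 | 750 | 761 | 1.01 |
| 10 | 1736 | 1743 | 1.00 |
| 20 | 3514 | 3497 | 0.99 |
| 50 | 8706 | 8700 | 0.99 |
| 100 | 17608 | 17731 | 1.00 |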

The numbers jitter slightly due to the usual HotSpot variance, OS scheduling, etc. For all practical purposes, paged performance is now back on par with plain.
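For readers who want to reproduce this kind of measurement, a rough sketch of an "average µs per bulk read over N runs" harness follows. This is not the benchmark used for the tables above; `BulkReadTiming` and its methods are hypothetical names, and `ByteArrayInputStream` merely stands in for the plain stream. It has no JIT warm-up phase, so expect more jitter than a proper harness would show.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

// Hypothetical timing helper; a sketch, not the benchmark behind the tables.
final class BulkReadTiming {

    // Reads from the stream until dest is full or the stream ends.
    static int readFully(InputStream in, byte[] dest) throws IOException {
        int total = 0;
        while (total < dest.length) {
            int n = in.read(dest, total, dest.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    // Average time in microseconds for one full bulk read, over `runs` runs.
    static double averageMicros(Supplier<InputStream> streams, int size, int runs) throws IOException {
        byte[] dest = new byte[size];
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            InputStream in = streams.get(); // fresh stream per run, created outside the timed region
            long start = System.nanoTime();
            readFully(in, dest);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1000.0 / runs;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 20]; // 1 MB
        double plain = averageMicros(() -> new ByteArrayInputStream(data), data.length, 1000);
        System.out.printf("1 MB plain: %.1f us%n", plain);
    }
}
```

Passing a supplier for the paged stream to the same `averageMicros` call gives the second column, and dividing the two averages gives the ratio.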

while (written < len) {
b[bOffset + written] = bytearray.get(offset + written);
written++;
// how much can we bulk-read until hitting a multiple of PAGE_SIZE?
Contributor

Could you put the // comments at the end of the line they describe? That would be much easier to read, and we have at least 120 chars per line :)

hhoffstaette
Author

Made args final, reformatted a bit, added PR.

long pagefragment = PAGE_SIZE - (bytearrayOffset % PAGE_SIZE); // how much can we read until hitting N*PAGE_SIZE?
int bulksize = (int)Math.min(pagefragment, todo - written); // we cannot copy more than a page fragment
boolean copied = bytearray.get(bytearrayOffset, bulksize, ref); // get the fragment
assert (copied == false); // we should never ever get back a materialized byte[]
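The fragment arithmetic in this hunk can be sketched in a self-contained form. `PagedBulkCopy` and its tiny `PAGE_SIZE` are hypothetical, and the sketch copies with `System.arraycopy` instead of going through `bytearray.get(...)` and a `ref`, so it shows the idea rather than the code in this PR: each iteration copies at most the bytes left before the next page boundary.

```java
// Hypothetical stand-in for a paged byte store; not the Elasticsearch class.
final class PagedBulkCopy {
    static final int PAGE_SIZE = 16; // tiny page size, chosen only for the demo

    private final byte[][] pages;
    private final int length;

    PagedBulkCopy(byte[] source) {
        length = source.length;
        int pageCount = (length + PAGE_SIZE - 1) / PAGE_SIZE;
        pages = new byte[pageCount][PAGE_SIZE];
        for (int pos = 0; pos < length; pos += PAGE_SIZE) {
            System.arraycopy(source, pos, pages[pos / PAGE_SIZE], 0, Math.min(PAGE_SIZE, length - pos));
        }
    }

    // Copies up to len bytes starting at offset, one page fragment per iteration.
    int read(long offset, byte[] dest, int destOffset, int len) {
        int todo = (int) Math.min(len, length - offset); // never read past the end
        int written = 0;
        while (written < todo) {
            long pos = offset + written;
            int inPage = (int) (pos % PAGE_SIZE);
            int pageFragment = PAGE_SIZE - inPage;                 // bytes left before the next page boundary
            int bulkSize = Math.min(pageFragment, todo - written); // cannot copy more than a page fragment
            System.arraycopy(pages[(int) (pos / PAGE_SIZE)], inPage, dest, destOffset + written, bulkSize);
            written += bulkSize;
        }
        return written;
    }
}
```

Because a request never crosses a page boundary, each chunk can be served straight from a single page, which is the property the assert in the hunk checks.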
Contributor

This still confuses me: why do we return that boolean if it is always expected to be false?

Contributor

I think your reply got busted... lemme reread

Author

The assert was just for testing. If you find it less confusing, I can remove the return value and the assert.

Contributor

I am just wondering if we can make sure we never get a class that does that, but I guess the assert is OK.
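To make the question about the boolean concrete, here is a sketch of the contract as this thread implies it: the slice accessor returns true only when it had to materialize a fresh byte[] because the requested range spans pages, and false when the ref can point straight into an existing page. `SliceRef` and `PagedSlices` are hypothetical names and the semantics are an assumption drawn from the assert comment above, not a copy of the real interface.

```java
// Hypothetical slice holder, loosely modelled on a bytes/offset/length triple.
final class SliceRef {
    byte[] bytes;
    int offset;
    int length;
}

// Hypothetical paged store illustrating the assumed "materialized?" return value.
final class PagedSlices {
    static final int PAGE_SIZE = 16; // tiny page size, chosen only for the demo

    private final byte[][] pages;

    PagedSlices(byte[] source) {
        int pageCount = (source.length + PAGE_SIZE - 1) / PAGE_SIZE;
        pages = new byte[pageCount][PAGE_SIZE];
        for (int pos = 0; pos < source.length; pos += PAGE_SIZE) {
            System.arraycopy(source, pos, pages[pos / PAGE_SIZE], 0, Math.min(PAGE_SIZE, source.length - pos));
        }
    }

    // Returns false if ref points into an existing page, true if a new byte[] was materialized.
    // The caller is responsible for keeping index + len within bounds.
    boolean get(long index, int len, SliceRef ref) {
        int inPage = (int) (index % PAGE_SIZE);
        if (inPage + len <= PAGE_SIZE) {
            ref.bytes = pages[(int) (index / PAGE_SIZE)]; // no copy: share the page
            ref.offset = inPage;
            ref.length = len;
            return false;
        }
        byte[] copy = new byte[len]; // range spans pages: stitch it together
        int copied = 0;
        while (copied < len) {
            long pos = index + copied;
            int off = (int) (pos % PAGE_SIZE);
            int chunk = Math.min(PAGE_SIZE - off, len - copied);
            System.arraycopy(pages[(int) (pos / PAGE_SIZE)], off, copy, copied, chunk);
            copied += chunk;
        }
        ref.bytes = copy;
        ref.offset = 0;
        ref.length = len;
        return true;
    }
}
```

Under this assumption, a caller that only ever asks for a page fragment always takes the first branch, so the return value is false by construction and the assert documents that invariant.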

Contributor

s1monw commented Mar 28, 2014

LGTM

hhoffstaette self-assigned this Mar 28, 2014
hhoffstaette pushed a commit that referenced this pull request Mar 28, 2014
hhoffstaette deleted the streamcopy branch March 28, 2014 10:09
hhoffstaette removed their assignment Mar 10, 2015