Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-5909: [Java] Optimize ByteFunctionHelpers equals & compare logic #4852

Closed
wants to merge 4 commits into from

Conversation

tianchen92
Copy link
Contributor

Related to ARROW-5909.
Now it first compare Long values and then if length < 8 then it compares Byte values.
Add the logic to compare Int values when 4 < length < 8.

Buffer length far greater than 8 have similar performance in equals, in case buffer length < 8 performances are as below:

ByteFunctionHelpersBenchmarks#equals
Before: avgt 5 7.718 ± 0.401 ns/op
After: avgt 5 5.005 ± 0.830 ns/op

int leftInt = PlatformDependent.getInt(lPos);
int rightInt = PlatformDependent.getInt(rPos);
if (leftInt != rightInt) {
return unsignedIntCompare(Integer.reverseBytes(leftInt), Integer.reverseBytes(rightInt));
Copy link
Contributor

@praveenbingo praveenbingo Jul 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe i am missing something :) but why are we reversing the bytes here..should not PlatformDependent already return the interpreted value according to endianess..

Copy link
Contributor Author

@tianchen92 tianchen92 Jul 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments. Seems you are right, I just make it similar with Long value logic(https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/util/ByteFunctionHelpers.java#L129). Looks like a legacy issue.
I would like to remove these reverse operations and add UT for this class, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this PR, would you mind taking a look again? thanks!

Copy link
Contributor

@praveenbingo praveenbingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 LGTM. Thanks a lot for your contribution :)

@tianchen92
Copy link
Contributor Author

@emkornfield Could you please take a look if we could get this merged? thanks!


buffer1.setByte(50, 10);

assertEquals(-1, ByteFunctionHelpers.compare(buffer1, 0, SIZE - 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add another test case for the + case.

In an ideal world we would explicitly test the three cases for different lengths >8, >3 and <=3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, fixed.

@codecov-io
Copy link

Codecov Report

Merging #4852 into master will increase coverage by 2.15%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4852      +/-   ##
==========================================
+ Coverage   87.42%   89.58%   +2.15%     
==========================================
  Files         996      661     -335     
  Lines      139830    96464   -43366     
  Branches     1418        0    -1418     
==========================================
- Hits       122246    86415   -35831     
+ Misses      17222    10049    -7173     
+ Partials      362        0     -362
Impacted Files Coverage Δ
cpp/src/arrow/util/thread-pool-test.cc 97.66% <0%> (-0.94%) ⬇️
r/src/recordbatch.cpp
r/R/Table.R
js/src/util/fn.ts
go/arrow/array/bufferbuilder.go
r/src/symbols.cpp
rust/datafusion/src/execution/projection.rs
rust/datafusion/src/execution/filter.rs
rust/arrow/src/csv/writer.rs
rust/datafusion/src/bin/main.rs
... and 328 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5e63905...18db45f. Read the comment docs.

@emkornfield
Copy link
Contributor

+1, LGTM. Thank you for the additional tests.

@emkornfield emkornfield changed the title ARROW-5909: [Java] Optimize ByteFunctionHelpers equals & compare logic ARROW-5909: [Java] Optimize ByteFunctionHelpers equals & compare logic Jul 13, 2019
kszucs pushed a commit that referenced this pull request Jul 22, 2019
Related to [ARROW-5909](https://issues.apache.org/jira/browse/ARROW-5909).
Now it first compare Long values and then if length < 8 then it compares Byte values.
Add the logic to compare Int values when 4 < length < 8.

Buffer length far greater than 8 have similar performance in equals,  in case buffer length < 8 performances are as below:

ByteFunctionHelpersBenchmarks#equals
Before: avgt    5  7.718 ± 0.401  ns/op
After:    avgt    5  5.005 ± 0.830  ns/op

Author: tianchen <niki.lj@alibaba-inc.com>

Closes #4852 from tianchen92/ARROW-5909 and squashes the following commits:

18db45f <tianchen> add more UT
dbd0923 <tianchen> remove reverse in ByteFunctionHelpers
41030c4 <tianchen> fix
e6ab2ed <tianchen> ARROW-5909:  Optimize ByteFunctionHelpers equals & compare logic
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants