-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-4990: [C++] Support Array-Array comparison #4398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-4990: [C++] Support Array-Array comparison #4398
Conversation
5c47b82 to
6cfef25
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it pay to support scalar-scalar comparison (i.e. should this be 1)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can support comparing scalars by promoting Scalar to Array of length 1. Is there a JIRA about Scalar->Array promotion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this different then AssignNullIntersection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it absolutely is not, didn't know why I missed this one. I'll refactor do call AssignNullIntersection or remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think refactoring to call AssignNullIntersection sounds good, since I clearly didn't pick a good name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more that I didn't look what was there than name picking. Even worse, I think I even reviewed your function...
|
Took a quick pass on this, will look at it in more detail tomorrow. |
emkornfield
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like appveyor is broken with something related.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think refactoring to call AssignNullIntersection sounds good, since I clearly didn't pick a good name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not this change, but I think the convention we should be using for now is that we use detail::Invoke to handle chunked arrays in the top level method, and worry about the parallelization when we expose the raw kernels? Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree but I suspect it will be hard to make with the current Invoke method, as the shape can vary, e.g. the signature is (S,T) -> R, where S,T,R can be Array, ChunkedArray, Scalar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so I think the complication is scalar. (I thought invoke handled combinations of Array and ChunkedArray). One way of potentially doing this is converting the Kernel into a Unary one when the function is called with scalar and use the InvokeUnary instead (and handle scalar to scalar here).
b97c787 to
8c3d1de
Compare
|
@emkornfield updated with comments. I filled https://issues.apache.org/jira/browse/ARROW-5489 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems strange why the r-value reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of https://github.com/apache/arrow/pull/4398/files#diff-4cce1db37d9827bc657384ea82da0d63R42 . I'll modify the code to avoid the r-value.
|
1 small nit about r-value reference that I don't understand, otherwise LGTM. |
|
@ursabot build |
|
I've successfully started builds for this PR |
|
@emkornfield I changed the forward to a pointer as per comment and style guide. |
Convenience method for sub-type supporting the length method.
When given 2 arrays, it should compose the validity bitmap by intersecting the input validity bitmaps.
- Extend CompareKernel to support (Array, Array) -> Array signature. - Extend CompareKernel to support (Scalar, Array) -> Array signature (swap the order of inputs).
8fafd6d to
864c679
Compare
|
@ursabot benchmark |
|
I've successfully started builds for this PR |
|
AMD64 Ubuntu 18.04 C++ Benchmark (#14985) builder failed. Revision: 864c679 Archery: |
Codecov Report
@@ Coverage Diff @@
## master #4398 +/- ##
===========================================
- Coverage 88.26% 76.92% -11.35%
===========================================
Files 846 51 -795
Lines 103360 1976 -101384
Branches 1253 0 -1253
===========================================
- Hits 91231 1520 -89711
+ Misses 11882 456 -11426
+ Partials 247 0 -247Continue to review full report at Codecov.
|
|
+1, LGTM. It looks like the Ursabot output could be more useful |
|
I'm fixing this with Kristian. |
Comparison only supported for the left argument to be an array and the right argument a scalar. This extends support for comparing two arrays, but also supporting the case where the left argument is a scalar and the right an array.