-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-14374: [Java] Integration tests for the C data Interface implementation for Java #11543
ARROW-14374: [Java] Integration tests for the C data Interface implementation for Java #11543
Conversation
…entation for Java Lead-authored-by: tomersolomon1 <tomer.solomon1@gmail.com> Co-authored-by: tomersolomon1 <tomer.solomon1@gmail.com> Co-authored-by: roee88 <roee88@gmail.com> Signed-off-by: tomersolomon1 <tomer.solomon1@gmail.com>
|
- ${DOCKER_VOLUME_PREFIX}debian-ccache:/ccache:delegated | ||
command: | ||
[ "/arrow/ci/scripts/cpp_build.sh /arrow /build && | ||
/arrow/ci/scripts/python_build.sh /arrow /build && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to build Arrow C++ and PyArrow from scratch. You should be able to just pip install
a recent binary wheel. Did you try to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incidentally, this is what Go does for a similar purpose: https://github.com/apache/arrow/blob/master/ci/docker/debian-10-go-cgo-python.dockerfile#L23-L36
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pitrou the reason was to do the integration test against the code in the main branch for cases where related fixes are made to the python/C++ implementation. It seems like this is what conda-python-jpype does. If undesired then we can switch to just pip install
(which was @tomersolomon1's original intent, so my bad)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you prefer, though this certainly makes the CI and docker run slower than installing a binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer to keep it this way b/c of the reasons @roee88 mentioned, and also this lowers the risk of introducing new inconsistencies between the implementations.
@amol- This should interest you, especially the bridge here: https://github.com/apache/arrow/pull/11543/files#diff-386b108c1492f76426f3ea17399c681593a8cc86b34daeb1b451d4da600863e8R56 |
Yes, I was reading it, there are some useful helpers. Some of those should probably live in |
def __init__(self): | ||
self.allocator = jpype.JPackage( | ||
"org").apache.arrow.memory.RootAllocator(sys.maxsize) | ||
self.jc = jpype.JPackage("org").apache.arrow.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestion, I feel it might help us when we will come back to this codebase in 6 months to be a bit more explicit about when we are invoking Python things and when we are invoking Java things. Given that both of them appear as Python codebase as jpype is proxying Java calls too it's not as explicit as it could be.
Probably naming self.java_allocator
and self.java_c
would make a bit more obvious that we are dealing with Java methods there and not with Python things.
Signed-off-by: tomersolomon1 <tomer.solomon1@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @roee88 Do you want to add anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Eventually this could reuse code from ARROW-14319 but there is no agreement on how a new jvm module should look like.
ci/scripts/java_cdata_integration.sh
Outdated
|
||
python integration_tests.py | ||
|
||
popd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting (missing new line at end of file)
|
||
if __name__ == '__main__': | ||
setup_jvm() | ||
unittest.main(verbosity=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same formatting issue here
Signed-off-by: tomersolomon1 <tomer.solomon1@gmail.com>
Benchmark runs are scheduled for baseline = 3c1f702 and contender = bb776d8. bb776d8 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Integration tests for the C data Interface implementation for Java.