[Java][Docs] ArrowSubstrait options preventing jni modules from being build #36885

Closed
zinking opened this issue Jul 26, 2023 · 12 comments · Fixed by #36899

@zinking

zinking commented Jul 26, 2023

Describe the enhancement requested

I am following the documentation to build the dataset jar. Everything works fine until the step "To build all JNI libraries (MacOS / Linux) except the JNI C Data Interface library".

At that step I encountered the following error:

-- Found the ArrowDataset static library: /Users/zhnwang/zhenw/arrow/java-dist/lib/x86_64/libarrow_dataset.a
CMake Error at dataset/CMakeLists.txt:19 (find_package):
  By not providing "FindArrowSubstrait.cmake" in CMAKE_MODULE_PATH this
  project has asked CMake to find a package configuration file provided by
  "ArrowSubstrait", but CMake did not find one.

  Could not find a package configuration file provided by "ArrowSubstrait"
  with any of the following names:

    ArrowSubstraitConfig.cmake
    arrowsubstrait-config.cmake

I looked at the commit history and tried the following updated command, adding -DARROW_SUBSTRAIT=OFF:

mvn generate-resources \
    -Pgenerate-libs-jni-macos-linux \
    -DARROW_GANDIVA=ON \
    -DARROW_SUBSTRAIT=OFF \
    -DARROW_JAVA_JNI_ENABLE_GANDIVA=ON \
    -N

I still get the same error.

I suppose the documentation should be updated to reflect the ArrowSubstrait change?

Component(s)

Java

@raulcd raulcd changed the title ArrowSubstrait options preventing jni modules from being build [Java][Doc] ArrowSubstrait options preventing jni modules from being build Jul 26, 2023
@raulcd raulcd added the Type: usage Issue is a user question label Jul 26, 2023
@raulcd
Member

raulcd commented Jul 26, 2023

CC ~ @danepitkin @davisusanibar

@kou kou changed the title [Java][Doc] ArrowSubstrait options preventing jni modules from being build [Java][Docs] ArrowSubstrait options preventing jni modules from being build Jul 26, 2023
@davisusanibar
Contributor

Hi @zinking, thanks for reporting this.

I will update the documentation to include Substrait changes as well.

@danepitkin
Member

You need to build with ARROW_SUBSTRAIT=ON. It's currently required when building the dataset module.
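
The CMake error in the original report means that no `ArrowSubstraitConfig.cmake` was found, which is what one would expect when Arrow C++ was built without `ARROW_SUBSTRAIT=ON`. A minimal sketch for checking an install prefix before running the JNI build follows. The function name `has_arrow_substrait` is hypothetical and not part of Arrow; the two file names come from the error message.

```shell
#!/bin/sh
# Hypothetical helper (not part of Arrow): succeeds when the given install
# prefix contains a package config file that find_package(ArrowSubstrait)
# would accept.
has_arrow_substrait() {
    prefix="$1"
    [ -d "$prefix" ] || return 1
    # Either file name listed in the CMake error message is acceptable.
    found=$(find "$prefix" -type f \( -name 'ArrowSubstraitConfig.cmake' \
        -o -name 'arrowsubstrait-config.cmake' \) 2>/dev/null | head -n 1)
    [ -n "$found" ]
}
```

For example, `has_arrow_substrait java-dist || echo "rebuild Arrow C++ with ARROW_SUBSTRAIT=ON"` would flag the situation described in this issue, assuming `java-dist` is the install prefix shown in the log.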

@danepitkin
Member

Hey @davisusanibar , I went ahead and created a PR to fix it.

kou pushed a commit that referenced this issue Jul 27, 2023
…es (#36899)

### Rationale for this change

The Java JNI dataset module recently included the Substrait module as a dependency. The dependency was added to the CI scripts, but not added to the build profiles and documentation yet.

### What changes are included in this PR?

- Update maven build profiles
- Update Java build documentation

### Are these changes tested?

I tested locally on MacOS and was able to reproduce + fix with this change.

### Are there any user-facing changes?

No
* Closes: #36885

Authored-by: Dane Pitkin <dane@voltrondata.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@kou kou added this to the 14.0.0 milestone Jul 27, 2023
@zinking
Author

zinking commented Jul 27, 2023

@danepitkin I'm actually not sure why dataset isn't built by default in the java folder. I suppose nothing currently stops us from doing so?

@danepitkin
Member

The dataset module relies on C++ shared libraries, which need to be built first. The default java modules (e.g. built from mvn clean install) don't rely on any shared libs.

@zinking
Author

zinking commented Jul 28, 2023

@danepitkin it seems I still have to work out how to build the complete JNI dataset jar myself (mvn clean install usually misses the JNI libraries), rather than using one Maven command.
The complete jar is available in the Maven repository, so shouldn't there be a single command for the development phase as well?

@danepitkin
Member

Are you suggesting to have one maven build profile that builds all of java modules/C++ libs/JNI modules? You could script this pretty easily. I believe maven is meant to be used to build one component at a time which is why there is one maven command for each of java modules/C++ libs/JNI modules. @davisusanibar any thoughts on this?
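
A minimal sketch of such a script, under stated assumptions: the first command is the one from the original report, while the property `arrow.cpp.build.dir` and the profile `arrow-jni` in the second command are assumptions that should be checked against the Java build documentation. The names `run_step` and `build_all_java` are hypothetical. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical wrapper: runs the separate Maven steps in order and stops at
# the first failure. With DRY_RUN=1 (the default) it only prints each step.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@" || { echo "step failed: $*" >&2; return 1; }
    fi
}

build_all_java() {
    dist_dir="$1"   # assumed: absolute path to the java-dist directory
    # Step 1: native libraries, using the profile quoted in this issue.
    run_step mvn generate-resources -Pgenerate-libs-jni-macos-linux \
        -DARROW_GANDIVA=ON -DARROW_JAVA_JNI_ENABLE_GANDIVA=ON -N || return 1
    # Step 2: JNI modules. Property and profile names are assumptions.
    run_step mvn -Darrow.cpp.build.dir="$dist_dir/lib" -Parrow-jni \
        clean install || return 1
}
```

Calling `build_all_java "$PWD/java-dist"` prints the two commands for review; running with `DRY_RUN=0` executes them from the current directory.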

@danepitkin
Member

Actually, I guess this is trivial to do. This CI build script supposedly builds all of Java in one command: https://github.com/apache/arrow/blob/main/ci/scripts/java_full_build.sh#L49

@zinking
Author

zinking commented Jul 29, 2023

> Actually, I guess this is trivial to do. This CI build script supposedly builds all of Java in one command: https://github.com/apache/arrow/blob/main/ci/scripts/java_full_build.sh#L49

Exactly what I was looking for, thanks a lot.

@zinking
Author

zinking commented Jul 31, 2023

@danepitkin one more question: I can now successfully build the arm64 artifact, but the jar contains no x86_64 artifact. I am not sure whether building it is possible in a Mac build environment. If not, how should I cross-compile to get a usable release artifact jar? Thanks.
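
One way to check which architectures a local build produced is to list the per-architecture directories under the install prefix's `lib` directory; the error log in the original report shows this layout as `java-dist/lib/x86_64/`. A minimal sketch, where the function name `list_built_archs` is hypothetical and not part of Arrow:

```shell
#!/bin/sh
# Hypothetical helper: prints one directory name per line for every
# non-empty subdirectory of <lib_dir>, e.g. the per-architecture
# directories under java-dist/lib.
list_built_archs() {
    lib_dir="$1"
    for dir in "$lib_dir"/*/; do
        [ -d "$dir" ] || continue
        # Skip empty directories: they hold no built library.
        if [ -n "$(ls -A "$dir")" ]; then
            basename "$dir"
        fi
    done
}
```

Running `list_built_archs java-dist/lib` after a build shows at a glance whether an `x86_64` directory was populated.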

@kou
Member

kou commented Jul 31, 2023

You can use Docker to build x86_64 artifacts on an arm64 machine. Our CI job also uses Docker:

archery docker run \
-e ARROW_JAVA_BUILD=OFF \
-e ARROW_JAVA_TEST=OFF \
java-jni-manylinux-2014

BTW, for a development use case, you may want to use nightly packages: https://arrow.apache.org/docs/dev/developers/java/building.html#installing-nightly-packages

R-JunmingChen pushed a commit to R-JunmingChen/arrow that referenced this issue Aug 20, 2023
…profiles (apache#36899)

loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…profiles (apache#36899)
