Improve JPMS support: preinstall native libraries in jlink runtime #21

HGuillemet
Collaborator

Here is an attempt to improve support for creating a jlink image by pre-extracting the native libraries into the runtime.
It has the following advantages compared to linking native jars:

  • it's more developer-friendly, since we avoid the mess of adding the native jars to the image with jlink's --add-modules command-line argument, with all its possible variations (see this PR).
  • it's more end-user-friendly, since we don't need to extract the libraries to a hidden, never-cleaned cache in the user's home directory.

This PR defines a new Gradle task javacppBuildNativeModule which performs the following steps:

  1. Obtain the module path from the main source set (as computed by the java plugin), and the main class from the application configuration.
  2. Determine the set of Java class dependencies by walking from the main class, using the Javassist bytecode editor and its ability to read the constant pool table. This is better than reading every class on the module path, since we end up with a much smaller set. It also avoids the case where some classes in the presets need other presets not included in the artifact dependencies (seen with opencv, where some classes need cpython).
  3. Retain from this set only the classes annotated with JavaCPP's @Properties.
  4. Call JavaCPP Loader.load on each of the retained classes, with system properties org.bytedeco.javacpp.cachedir and org.bytedeco.javacpp.cachedir.nosubdir set to populate a temporary lib directory with all, and only, the required native libraries.
  5. Compile a module-info.java containing module org.bytedeco.javacpp.libs {}
  6. Build a jmod containing this module descriptor and the content of the lib directory. This lets us delegate to jlink the work of installing the native libraries in the proper runtime directory, depending on the platform.
  7. Call jlink with the jmod added to the module path and --add-modules org.bytedeco.javacpp.libs.
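
Steps 5 to 7 above could be driven from Java roughly as follows. This is only a sketch, not the code in this PR: it assumes the JDK tools are reached through java.util.spi.ToolProvider, and the class NativeJmodSketch and its methods are hypothetical names. It builds the argument lists for jmod create (--class-path pointing at the directory holding the compiled module-info.class, --libs at the extracted native libraries) and for jlink.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.spi.ToolProvider;

// Hypothetical helper: names and layout are assumptions, not the plugin's actual code.
final class NativeJmodSketch {

    // Arguments for `jmod create`: classesDir must contain the compiled module-info.class,
    // libDir the native libraries extracted in step 4.
    static List<String> jmodArgs(Path classesDir, Path libDir, Path jmodFile) {
        return List.of("create",
                "--class-path", classesDir.toString(),
                "--libs", libDir.toString(),
                jmodFile.toString());
    }

    // Arguments for `jlink`: modulePath must already include the jmod built above.
    static List<String> jlinkArgs(String modulePath, String modules, Path imageDir) {
        return List.of("--module-path", modulePath,
                "--add-modules", modules,
                "--output", imageDir.toString());
    }

    // Runs a JDK tool ("jmod" or "jlink") in-process and returns its exit code.
    static int run(String tool, List<String> args) {
        ToolProvider provider = ToolProvider.findFirst(tool)
                .orElseThrow(() -> new IllegalStateException(tool + " is not available in this runtime"));
        return provider.run(System.out, System.err, args.toArray(new String[0]));
    }
}
```

A caller would first invoke run("jmod", jmodArgs(...)) and then run("jlink", jlinkArgs(...)) with org.bytedeco.javacpp.libs among the modules. Both tools are only present when running on a full JDK.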

Here is an example build script using kotlin DSL:

plugins {
    application
    id("org.bytedeco.gradle-javacpp-platform") version "1.5.8-SNAPSHOT"
    id("org.beryx.jlink") version "2.25.0"
}

extra["javacppPlatform"] = "linux-x86_64"

dependencies {
    implementation("org.bytedeco:opencv-platform:4.5.5-1.5.7")
}

group = "org.bytedeco"
version = "1.0-SNAPSHOT"
description = "gradle-jlink-sample"

application {
    mainModule.set("org.bytedeco.sample")
    mainClass.set("org.bytedeco.sample.Application")
    applicationDefaultJvmArgs = listOf("--add-modules", "ALL-MODULE-PATH")
}

jlink {
    addExtraModulePath("build/native/native.jmod")
    addOptions("--add-modules", "org.bytedeco.javacpp.libs")
}

tasks.named("jlink") {
    dependsOn("javacppBuildNativeModule")
}

Tested and working on a simple application using opencv and javafx, but for some obscure reasons I didn't investigate:

  • multiple versions of the same library are installed (like libjniopenblas1.so, libjniopenblas2.so...).
  • libopenblas_nolapack.so.0 is still copied to ~/.javacpp although it's present in the runtime.

@saudet
Member

saudet commented Mar 21, 2022

Why are you trying to hack this with JMOD? For this to make sense, we need to show it works well without JMOD! I don't think the JDK considers native libraries to be part of the module system, but please someone correct me if I'm wrong @johanvos? @mikehearn? @AlanBateman? For example, if I do the following on my Linux machine, everything seems to work perfectly fine for JNI and jlink without any trace of JMOD, and without JavaCPP extracting anything in ~/.javacpp or anywhere else:

git clone https://github.com/bytedeco/sample-projects
cd sample-projects/opencv-stitching-jlink
mvn clean package
sed -i s/opencv-platform/opencv/g pom.xml
mvn clean package
unzip -j ~/.m2/repository/org/bytedeco/openblas/0.3.19-1.5.7/openblas-0.3.19-1.5.7-linux-x86_64.jar -d ./target/maven-jlink/default/lib
unzip -j ~/.m2/repository/org/bytedeco/opencv/4.5.5-1.5.7/opencv-4.5.5-1.5.7-linux-x86_64.jar -d ./target/maven-jlink/default/lib
ln -s libopenblas.so.0 ./target/maven-jlink/default/lib/libopenblas_nolapack.so.0
./target/maven-jlink/default/bin/stitch panorama_image1.jpg panorama_image2.jpg --output panorama_stitched.jpg

Am I missing something? Why do we need JMOD?

@HGuillemet
Collaborator Author

HGuillemet commented Mar 21, 2022

While I agree with you that some standard must emerge for how native libraries are handled by the artifact distribution system, the Java compiler, the Java runtime and jlink, this PR is meant to propose the best technical solution with the tools we have today, not to prove a point, such as that jmod is hell.

I don't think the JDK considers native libraries to be part of the module system

jmod was mainly introduced for bundling native libraries needed by modules. In a jmod, the libraries are in a special directory of the archive and are treated as such by jlink, which copies them into the proper directory.

It's possible to do without jmod here and manually install the libraries in the image runtime AFTER jlink has run, like you did. That's also what I do in one of my applications. But:

  • the directory where the image is created depends on the jlink plugin used (build/image with the Badass Gradle plugin, target/maven-jlink/default with the Apache Maven plugin, etc.)
  • the directory within the image where the libraries must be installed depends on the platform (lib on Linux, bin on Windows, Contents/runtime/Contents/Home/lib on macOS)
  • it might be interesting to use jpackage directly, which calls jlink itself before building a package. In that case we don't have the opportunity to patch the image after jlink and before the packaging.
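
To make the second point concrete, a manual post-jlink install would need something like the mapping below. It is a minimal sketch: the class ImageLibDir and its method are hypothetical, the OS detection by os.name prefix is an assumption, and the three directories are simply the ones listed above.

```java
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper illustrating the platform-dependent library directory.
final class ImageLibDir {

    // osName is expected to look like the "os.name" system property,
    // e.g. "Linux", "Windows 10" or "Mac OS X".
    static Path resolve(Path imageRoot, String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return imageRoot.resolve("bin");
        }
        if (os.startsWith("mac")) {
            return imageRoot.resolve("Contents/runtime/Contents/Home/lib");
        }
        // Linux and other Unix-like systems.
        return imageRoot.resolve("lib");
    }
}
```

Calling ImageLibDir.resolve(imageRoot, System.getProperty("os.name")) would give the target directory on the build machine. With the jmod approach this logic stays inside jlink instead of the build script.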

For these reasons, delegating the installation of the libraries in the image to jlink, using a temporary, pure-native jmod, seems the easiest to me.

Concerning the JavaCPP-specific question of whether unzipping the native jars can be enough instead of calling Loader.load, I think only you can say whether this will always work and which is best.

@saudet
Member

saudet commented Mar 22, 2022

It still feels like what you want to do is only tangentially related to JavaCPP. Why not create some generic plugin that also works for other libraries that chose not to use JavaCPP?

@HGuillemet
Collaborator Author

This can be implemented in a separate plugin if you prefer, but I doubt it can be used with a framework other than JavaCPP.
Steps 3 and 4 above are JavaCPP-specific.
What do you think of them? Is there a better way to do it?
Any idea about the 2 problems mentioned at the end of the PR description?

@saudet
Member

saudet commented Mar 23, 2022

This can be implemented in a separate plugin if you prefer, but I doubt it can be used with a framework other than JavaCPP.
Steps 3 and 4 above are JavaCPP-specific.
What do you think of them? Is there a better way to do it?

I'm not sure I understand what you're trying to do there because, for example, how are you going to make this work with the JMOD files from, say, JavaFX? jlink is just going to put everything in, unfiltered; we have no control over that. I think this is all very specific to your application, which is fine, but if the goal is to make a tool, this needs to be generalized a bit more, in my opinion. Maybe what you want is an improved version of ProGuard that supports JNI? Android apparently already has something for other kinds of resources when we enable "shrinkResources", so why not come up with something like that, but for native libraries?

Any idea about the 2 problems mentioned at the end of the PR description?

JavaCPP keeps trying to use the cache to rename libraries it finds in the library path, but we can easily work around that by disabling the cache entirely with a system property. Let me work on that...

@HGuillemet
Collaborator Author

I'm not sure I understand what you're trying to do there

Just extracting the library from the native jar, but counting on the loader for that, instead of unzipping the jar, in order to execute the code that needs to be executed at this moment.

This is the point of the LoadEnabled interface IIRC.
But what we would need here exactly is to trigger some onCache code, rather than code to be run when the lib is loaded in memory.

This reproduces more or less the behaviour of the cache mojo.

Also by determining the exact class dependencies and only loading those classes we limit the extraction to what is sufficient, and not the whole native jar.
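
The walk itself is an ordinary transitive closure over class references. A minimal sketch, assuming a hypothetical referencesOf function that returns the class names found in a class's constant pool (in the PR this is backed by Javassist; here it is just a parameter, and ClassWalkSketch is an illustrative name):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration of the dependency walk, independent of Javassist.
final class ClassWalkSketch {

    // Returns every class reachable from mainClass, following the references
    // reported by referencesOf. Each class is visited once, so cycles are safe.
    static Set<String> reachable(String mainClass, Function<String, Set<String>> referencesOf) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(mainClass);
        while (!todo.isEmpty()) {
            String cls = todo.poll();
            if (!seen.add(cls)) {
                continue; // already visited
            }
            for (String ref : referencesOf.apply(cls)) {
                if (!seen.contains(ref)) {
                    todo.add(ref);
                }
            }
        }
        return seen;
    }
}
```

Only the classes in the returned set would then be filtered for JavaCPP annotations and passed to Loader.load, which is what keeps the extracted set small.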

Is there anything else still unclear?

saudet added a commit to bytedeco/javacpp that referenced this pull request Mar 23, 2022
@saudet
Member

saudet commented Mar 24, 2022

I keep telling you, that's not general enough to be interesting. This is not useful for users not using jlink, but instead using something else like Android or GraalVM Native Image. I know you don't personally care about JavaFX, Android, or GraalVM, but there are so many more developers that do care about those vs jlink that it's not even funny. If you want to make this specific to JavaCPP, then make it work for other things than jlink. If you want to limit yourself to jlink, then make it work for other things than JavaCPP. Anyway, I understand you want to limit yourself to jlink and JavaCPP only, but I don't feel it's worth the time I would need to maintain this myself with all the dependencies your changes bring. However, there's nothing in your pull requests that depends on the current code in gradle-javacpp, so you could create another repository under https://github.com/bytedeco/ named "a-plugin-for-jlink-that-does-not-suck" or whatever you like and make releases in the org.bytedeco group. I'm perfectly OK with that.

In any case, I've added in commit bytedeco/javacpp@0e07735 a new system property "org.bytedeco.javacpp.cacheLibraries" that we can set to "false" to prevent JavaCPP from doing things in the cache with libraries.
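
For an application launched from a jlink image, the property could be set programmatically along these lines. This is a sketch under assumptions: the class DisableLibraryCache is a hypothetical name, and it assumes the property has to be set before JavaCPP's Loader is first used.

```java
// Hypothetical helper; only the property name and value come from the commit mentioned above.
final class DisableLibraryCache {

    static final String CACHE_LIBRARIES = "org.bytedeco.javacpp.cacheLibraries";

    // Assumption: call this before any JavaCPP class triggers native loading.
    static void apply() {
        System.setProperty(CACHE_LIBRARIES, "false");
    }
}
```

The same effect can be had without code by passing -Dorg.bytedeco.javacpp.cacheLibraries=false to the JVM, for example through applicationDefaultJvmArgs in the build script.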

HGuillemet closed this Mar 24, 2022
@saudet
Member

saudet commented Mar 24, 2022

Or, how about this, you take over gradle-javacpp and do whatever you want with it. This way I wouldn't need to worry about maintaining it. It's not a big plugin, but when things break, someone needs to fix it, and if that someone is you, that works. There hasn't been anything new added for over a year now, and I'm not planning on adding anything either, so there shouldn't be any problems letting you run things and see how it goes. Do you want to take that responsibility?
