OSGi compliant jars for containers such as Equinox or Felix. #3022

Open
carldea opened this issue Mar 12, 2017 · 34 comments
@carldea

carldea commented Mar 12, 2017

There are many OSGi containers with bundle context class loaders capable of dynamically loading Java services. It would be nice for DL4J and its dependencies to be OSGi friendly.

Some helpful links:
http://felix.apache.org/documentation/subprojects/apache-felix-maven-bundle-plugin-bnd.html
http://blog.sonatype.com/2009/09/maven-tips-and-tricks-creating-an-osgi-project-with-maven

@streetturtle

Hi @carldea, did you manage to OSGIfy DL4J?

@jckautzmann

Hi @carldea, any update on this one?

@agibsonccc
Contributor

Hey folks - we want to do this at some point as part of the Eclipse Foundation. We'd still accept contributions for this.

@saudet saudet added Enhancement New features and other enhancements help wanted labels Apr 15, 2018
@raver119 raver119 added DevOps Issues related to CI/CD and pipelines Java and removed help wanted labels Apr 26, 2018
@laeubi

laeubi commented Feb 11, 2019

I have used ND4J successfully inside an OSGi container; I'll try to contribute a patch to add the OSGi metadata.

@tuf22191

Hi, what is the status of this?

@laeubi

laeubi commented May 1, 2019

My current problem is that it is not easy to run a full ND4J build, so I can't test my contribution right now :-\

@treo
Member

treo commented May 2, 2019

@laeubi feel free to join us on Gitter and ask questions about it. If you've got all the prerequisites installed, it is as simple as mvn clean install -pl '!pydatavec,!jumpy,!pydl4j,!nd4s' -Dmaven.test.skip=true

@laeubi

laeubi commented May 2, 2019

@treo Thanks for the hint; with this command line I was now able to build the project. I still get lots of warnings from Maven, but the build completes successfully.

@timothyjward

I will take a look at this, and feed back on options for packaging things so that they work reliably

@damadux

damadux commented Aug 12, 2019

I have this issue as well when trying to launch an Eclipse application. I suspect this has to do with multiple artifacts having the same package (e.g. org.nd4j.nativeblas being a package in org.nd4j.native-api and org.nd4j.native.linux-x86_64 in my case, causing a NoAvailableBackendException). However, keep in mind that the JUnit test works perfectly. Here is the pom I use to generate all the dependencies, using Reficio's p2-maven-plugin (https://github.com/reficio/p2-maven-plugin):

https://gist.github.com/damadux/a199ab600338ca6e62698dcad01f2ddf

@timothyjward

I suspect this has to do with multiple artifacts having the same package (e.g. org.nd4j.nativeblas being a package in org.nd4j.native-api and org.nd4j.native.linux-x86_64 in my case, causing a NoAvailableBackendException).

There are several different places where this is a problem.

  • There are API packages split between the common, context and api modules, which must therefore be combined
  • There are packages shared between the native-api and native implementation
  • The JavaCPP library is used to load the native code, and because of the way this is done it requires the JavaCPP class loader to be able to see the JNI stubs; this means that JavaCPP and the native code must be in the same module

The "best" end result I could manage was to keep the "frontend" API separated from the "backend" code in an OSGi bundle.

The frontend has a requirement on the existence of a backend capability in the OSGi runtime. The frontend bundle also contains the JavaCPP library, which cannot safely be put in another module.

The backend is then packaged as a fragment bundle which attaches to the frontend. It provides the relevant capability so that the frontend can resolve, and as a fragment it also flattens the class space so that the native code can be loaded properly. Different backend implementations (or the same implementation for different architectures) can be provided and deployed as separate fragments.
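
For illustration, the wiring described above could be expressed with bnd-style bundle headers roughly like the following. The symbolic names and the capability namespace are hypothetical, not the metadata actually shipped; lines starting with # are explanatory comments as in a bnd file.

# Frontend bundle: the ND4J API plus the repackaged JavaCPP classes.
# It requires that *some* backend capability be present in the runtime.
Bundle-SymbolicName: org.nd4j.api
Require-Capability: org.nd4j.backend; filter:="(org.nd4j.backend=*)"

# Backend fragment: attaches to the frontend (sharing its class loader) and
# provides the capability the frontend requires, so the frontend can resolve.
Bundle-SymbolicName: org.nd4j.backend.native-linux-x86_64
Fragment-Host: org.nd4j.api
Provide-Capability: org.nd4j.backend; org.nd4j.backend=native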

@saudet
Contributor

saudet commented Aug 13, 2019

@timothyjward The JavaCPP Presets were refactored recently to avoid this kind of limitation with modules: http://bytedeco.org/news/2019/04/11/beyond-java-and-cpp/
Could you explain in what way this still isn't enough for OSGi?

@timothyjward

Could you explain in what way this still isn't enough for OSGi?

The problem

The entire loading model of JavaCPP is fundamentally based around a piece of code telling the Loader to load all of the native code. This has been thought through, as there's a "context" (i.e. Class) passed in which lets you find the native code as a resource. There's then all sorts of caching going on (which I won't claim to understand completely) but the fundamental problem is this line:

https://github.com/bytedeco/javacpp/blob/c3946d20146763f484a85b73376fc7e6c05f2fe0/src/main/java/org/bytedeco/javacpp/Loader.java#L1483

On this line there is a call to System.loadLibrary() to load the native code. There is no way to pass a ClassLoader when using this method so it always uses the ClassLoader from the class that called System.loadLibrary(). In this case that type is org.bytedeco.javacpp.Loader.

Obviously if JavaCPP is packaged in its own OSGi bundle then this ClassLoader has no visibility of any JNI stubs that exist in other bundles (specifically the bundles that contained the original native binaries).

How it's supposed to work in OSGi

When a bundle in OSGi wants to load native code that it contains it is supposed to call System.load() from a class contained in the same bundle passing in the library name. The OSGi framework sets up the Bundle ClassLoader such that the correct native code is chosen for the platform (if multiple platforms are supported in the same bundle). This works because the call to System.load() picks up the correct ClassLoader. The Native Library is then also able to find other resources in the defining bundle (for example JNI stubs).
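
As a minimal sketch of that pattern, a backend bundle would ship a small trigger class next to its native binaries. The package and library name below are made up for illustration and are not the project's actual code.

package org.example.backend;

public final class BundleNativeLoader {

    private BundleNativeLoader() {
    }

    public static void loadBackend() {
        // Because this class is defined by the backend bundle's class loader, the
        // framework resolves the correct embedded binary for the current platform,
        // and the loaded library can also see the JNI stubs in this bundle.
        System.loadLibrary("jnind4jcpu");
    }
}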

How could JavaCPP be fixed?

The only real option is for JavaCPP to make the call to System.loadLibrary() from a class defined using the correct ClassLoader (e.g. the one passed in as a context). Off the top of my head I can see two ways to achieve this:

  1. When generating code at build time JavaCPP could add a load method to a class in the bundle. This method could be called reflectively and used to load the library in the correct context
  2. When asked to load the native library JavaCPP could dynamically define a class using either the bundle class loader or a temporary child class loader. This class can contain a method which loads the native library.

In either setup what we're trying to do is to ensure that the call to System.load() or System.loadLibrary() picks up the correct ClassLoader from the stack.
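
As a rough sketch of option 1, the reflective call site could look something like the following; the loader class name, its package, and the error handling are hypothetical, not JavaCPP's actual API.

package org.example.osgi;

public final class ContextualLoad {

    public static void loadFor(Class<?> contextClass) throws ReflectiveOperationException {
        // Resolve the generated loader through the context's ClassLoader (the backend
        // bundle's loader in OSGi), so the System.loadLibrary() call inside load()
        // sees the JNI stubs and native binaries of that bundle.
        Class<?> loader = contextClass.getClassLoader()
                .loadClass("org.nd4j.nativeblas.NativeLoader");
        loader.getMethod("load").invoke(null);
    }
}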

I hope this helps to explain the issue.

@timothyjward

Next steps for DL4J

The deeplearning4j repository contains quite a few modules. Unfortunately it seems as though these modules are all tightly coupled by split packages, making it impossible to package and deploy them independently.

  • The org.deeplearning4j.util package is split across modules deeplearning4j-util, deeplearning4j-common, deeplearning4j-core and deeplearning4j-nn.
    • The modules deeplearning4j-cuda and deeplearning4j-nlp-uima also reference this split package in their tests
  • The org.deeplearning4j.datasets.iterator.impl package is split across modules deeplearning4j-datasets, deeplearning4j-utility-iterators and deeplearning4j-nn
  • The org.deeplearning4j.datasets.datavec package is split across modules deeplearning4j-core and deeplearning4j-datavec-iterators
  • The org.deeplearning4j.ui package is split across modules deeplearning4j-core, deeplearning4j-play and deeplearning4j-ui

With ND4J it was possible to combine a limited number of modules (common, context, buffer, api) to make a frontend and attach a backend as a fragment. This isn't an ideal solution, but it works and is relatively clean. The level of coupling in DL4J, however, is much higher. Unless API changes are permitted (relocating or changing the packages for some types to merge split packages) it is more or less impossible to do anything other than create an "uber bundle" containing all of the DL4J modules, including the UI.

I'm happy to go ahead and create this uber bundle, but I thought it worth discussing whether a more invasive approach (that gets a better end result) might be preferred.

@saudet
Contributor

saudet commented Aug 13, 2019

I see, each OSGi bundle has its own class loader, so this sounds a lot closer in concept to uber JARs or webapps in containers like Tomcat than Maven modules anyway, doesn't it? In which case having multiple versions of JavaCPP running in parallel in multiple bundles sounds like the way to go anyway. Would you agree?

@timothyjward

In which case having multiple versions of JavaCPP running in parallel in multiple bundles sounds like the way to go anyway. Would you agree?

It's important to realise that JavaCPP leaks heavily through the ND4J API (specifically, ND4J accepts and returns JavaCPP types such as org.bytedeco.javacpp.Pointer).

If you tried to have multiple "private" versions of JavaCPP in each module then you would get LinkageErrors or VerifyErrors at runtime because the class space is not compatible (i.e. you would be trying to pass the ND4J API an object implementing a different org.bytedeco.javacpp.Pointer).

This API leakage means that someone using or extending ND4J must share the same view (i.e. loaded by the same ClassLoader) of the JavaCPP types as the ND4J API does. This is why the PR repackages JavaCPP and exports it from the API bundle.

To finally answer your question, it absolutely is possible to have multiple versions of JavaCPP in the runtime, but each ND4J API (and the people using it) needs to agree on one version and ignore the others. If there are two versions of ND4J running then these could use separate versions of JavaCPP, and clients could use one of them as long as they share the view of the JavaCPP API.

@saudet
Contributor

saudet commented Aug 14, 2019

Right, that's how I see it. Each OSGi bundle is basically a standalone service so we need to provide some sort of minimal facade, like a REST API in the case of webapps. I'm not sure I understand what the issue is. Could you clarify? As far as I understand, whether we have multiple versions of JavaCPP or multiple versions of ND4J doesn't change the limitations.

@saudet
Contributor

saudet commented Aug 14, 2019

Actually, @alexanderst and @raver119 have already started working on a "servlet": #96. Maybe the OSGi bundle could use that API?

@laeubi

laeubi commented Aug 14, 2019

No, each bundle has its own class loader, but it is not a standalone service. A bundle can import packages/bundles and extend its class space in a declarative manner this way. That has nothing to do with an uber-jar (which is the other way round: put everything into one big blob); bundles share classes with each other.
Webapps work similarly, except that you can't share class space between webapps (you can only add things to the global path).

What's the problem with split packages? If you define package X in bundle A and also in bundle B, then a bundle C (that wants to use classes from X) can only see the classes from A or from B, but not both. You can still use split packages, e.g. with fragments (this has some limitations, since you can't require a fragment) or by using Require-Bundle instead of Import-Package, but this makes things harder to manage, since you have to know all the bundles, and it can still lead to confusing errors.

@saudet
Contributor

saudet commented Aug 14, 2019

Package names are the least serious issue I think. To use it as you describe, we basically need to provide factory methods everywhere, among other things, don't we? It sounds like everything would have to be designed for OSGi to be able to use it like that, and if not, it ends up being the equivalent of a webapp... Put another way, how can we design a library that's perfect for OSGi, but that doesn't depend on OSGi?

@laeubi

laeubi commented Aug 14, 2019

No, we won't need factories at all, but I'll try to generalize and divide the problem into different parts.

First problem 'split packages'

Split packages are always a problem, so for a clean design that can be used in a wide array of settings (applets, webapps, OSGi, plain Java, with or without security enabled, the Java 9 module system) they should IMO be avoided under all circumstances.

So the very basic question is: would it be possible to get rid of split packages entirely (e.g. merge modules that are in fact not independent because they require package-private access, or relocate classes)? All of this is plain Java, with no relation to OSGi, but it will help with integrating into OSGi.

Second problem 'classloading'

Many projects suffer from being developed in a context where there is only one global class loader and everything is always accessible from everywhere. The most common pitfalls are Class.forName, Java serialization and reflection, where developers tend to rely on the "global class loader" case.
The problem only occurs in more restrictive contexts: OSGi, webapps, sandboxed applications (applets). Even though it is often referred to as an "OSGi problem", it is in fact a problem of a design that relies on a specific setup.

While this can be a challenging task, it is IMO always worth fixing this kind of problem, since these things can be quite hard to debug and can lead to subtle bugs/security issues.
Again, not an OSGi thing, but it will help with integrating into OSGi.

Third problem 'native-code'

Loading native code has always been a problem in Java IMO, since the whole framework for this depends on some magic to find/load native code that cannot be controlled or debugged to let the programmer help with loading a library. So there are often various 'try-this-and-that' approaches to satisfy special deployment scenarios (IDE, build, user installation, ...).

I think there is not much we can do here, but we might add some more support to that try-this-and-that approach to enable OSGi.

how can we design a library that's perfect for OSGi, but that doesn't depend on OSGi

If problems 1 + 2 are solved, there is not much left to do: OSGi will only need some additional headers in the MANIFEST.MF file that will be ignored by the rest of the world, and these can even be derived automatically and added as part of your normal Maven build.

For problem 3, we must check what we can or must do. OSGi already includes some methods to embed native code, so in the best case we again only need to add some special headers to the manifest (but these can't be generated automatically).
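
As an illustration of the kind of header meant for problem 3, OSGi's Bundle-NativeCode header lets the framework select an embedded library per platform. The paths and library names below are hypothetical; the line starting with # is an explanatory comment as in a bnd file.

# One clause per platform; the framework picks the matching one at resolve time.
Bundle-NativeCode: lib/linux-x86_64/libjnind4jcpu.so; osname=Linux; processor=x86-64, \
 lib/windows-x86_64/jnind4jcpu.dll; osname=Win32; processor=x86-64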

@timothyjward

Right, that's how I see it. Each OSGi bundle is basically a standalone service so we need to provide some sort of minimal facade, like a REST API in the case of webapps. I'm not sure I understand what the issue is. Could you clarify? As far as I understand, whether we have multiple versions of JavaCPP or multiple versions of ND4J doesn't change the limitations.

As @laeubi has already stated, this isn't really the case. An OSGi bundle is a jar file; it just has extra rules about what packages are visible to (and from) the outside world. You could use REST to talk between bundles, but it would be the same as using REST to talk between jar files (e.g. between nd4j-api and nd4j-native) in plain Java.

Package names are the least serious issue I think. To use it as you describe, we basically need to provide factory methods everywhere, among other things, don't we? It sounds like everything would have to be designed for OSGi to be able to use it like that, and if not, it ends up being the equivalent of a webapp... Put another way, how can we design a library that's perfect for OSGi, but that doesn't depend on OSGi?

This isn't really the case at all - you can see how in PR #8083 I was able to get ND4J working as an OSGi bundle with fragments (a fragment is basically an extension to an OSGi bundle) to provide the backend. These bundles work quite happily outside of OSGi. If we were able to enhance JavaCPP to be better at class loaders then the ND4J API and ND4J backend could be totally separate bundles (i.e. the backend not just a fragment).

The real problem that I encountered throughout the ND4J and DL4J codebase is that the jars being produced are highly coupled and poorly cohesive because the packages are poorly named. Using the same package name in different jar files is just bad; again, as @laeubi points out, you simply can't do this in JPMS (the Java Platform Module System from Java 9 on).

Once the split package issues are solved, the next step is to formally recognise which packages are public API and which packages are private to the module (i.e. not intended to be used by other modules). This is necessary for both OSGi and JPMS, and forms the basis of your public contract with other modules. It also helps to improve the design and maintenance of your modules because you know which changes affect API and which are internal; you can even use a technique called baselining to tell you when you break binary compatibility.

The final part (discovering services/extensions) is probably the easiest to manage. You can either use OSGi services to communicate between modules, or you can enhance your existing search. Here is an example of how I did this without forcing a runtime dependency on OSGi: 04fa6a9#diff-1434875bdbce2621dd649e332005e347R84-R122
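
A hedged sketch of that "enhance your existing search" idea follows: prefer OSGi services when a framework is present, otherwise fall back to the plain ServiceLoader. This is not the code from the linked commit; the Backend interface and class names are stand-ins for whatever extension type is being discovered.

package org.example.osgi;

import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

import org.osgi.framework.Bundle;
import org.osgi.framework.BundleContext;
import org.osgi.framework.FrameworkUtil;
import org.osgi.framework.ServiceReference;

public final class BackendDiscovery {

    /** Stand-in for the extension/service type being discovered. */
    public interface Backend { }

    public static List<Backend> findBackends() {
        List<Backend> found = new ArrayList<>();
        try {
            // Only usable when this class was loaded by an OSGi bundle class loader.
            Bundle bundle = FrameworkUtil.getBundle(BackendDiscovery.class);
            BundleContext ctx = bundle == null ? null : bundle.getBundleContext();
            if (ctx != null) {
                for (ServiceReference<Backend> ref : ctx.getServiceReferences(Backend.class, null)) {
                    found.add(ctx.getService(ref));
                }
            }
        } catch (Exception | NoClassDefFoundError e) {
            // Not running in OSGi, or the OSGi API is absent - fall back below.
        }
        if (found.isEmpty()) {
            // Plain class path: use the standard ServiceLoader mechanism instead.
            ServiceLoader.load(Backend.class).forEach(found::add);
        }
        return found;
    }
}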

@saudet
Contributor

saudet commented Aug 14, 2019

Thanks for the explanations! If I understand correctly, OSGi already has some system for loading JNI libraries? In that case, we don't need to use the one from JavaCPP, and we might not need to fix anything; it already does that in the case of Android, for example. Other than that, we could provide some script to "install" the libraries the way OSGi likes them. Would that be sufficient?

As for fixing split packages, I'll leave it up to @AlexDBlack to decide what to do about that!

@timothyjward

There has been some work going on to improve the OSGi support in JavaCPP; this in turn will allow more elegant solutions for DL4J and ND4J. I'm happy to do some further work in this area, but I do need some direction as to the preferred trade-offs.

  1. Keep all the existing Maven projects and package names, adding "OSGi bundle" projects that aggregate and repackage the other projects as OSGi bundles.
  2. Keep the existing package names but merge the current Maven projects to avoid split packages. The remaining Maven projects will be enhanced to produce OSGi bundles. This will keep API binary compatibility, but some maven coordinates will disappear.
  3. Keep the existing Maven project structure but change the package names to avoid having packages split between modules. This will keep maven coordinates, but break API binary compatibility.

My guess is that option 3 will be unpalatable due to the breaking API changes, but otherwise it is probably the "cleanest", simply enforcing the module boundaries that were originally intended. In my view option 1 is a less desirable approach, as it is more confusing to users ("which project do I need?", "where is the source?") and it is harder to maintain (changes have impacts in more than one module). Option 2 feels like a happy middle ground, but it is quite a major change to the build layout.

What are the project owner's thoughts on the various approaches?

@AlexDBlack

@saudet
Contributor

saudet commented Aug 23, 2019

To get something "now", that's going to be uber JARs like we have to do for anything else in these kinds of situations.

It's not possible to merge all modules to avoid split packages, so that will unfortunately require some refactoring of the API.

@timothyjward

To get something "now", that's going to be uber JARs like we have to do for anything else in these kinds of situations.

That's helpful - I'll continue on with that approach.

@raver119 raver119 removed the Java label Nov 10, 2019
@AlexDBlack AlexDBlack added this to the 1.0.0 Release milestone Nov 20, 2019
@tonit

tonit commented Jan 15, 2020

Hey there, I am in the process of getting some parts of DL4J running in an OSGi container. Since I only stumbled over this ticket now, and since this has a rather long story: is there any existing work that can be shared (in a fork or a branch)?

@tonit

tonit commented Jan 15, 2020

No, we won't need factories at all, but I'll try to generalize and divide the problem into different parts.

First problem 'split packages'

Split packages are always a problem, so for a clean design that can be used in a wide array of settings (applets, webapps, OSGi, plain Java, with or without security enabled, the Java 9 module system) they should IMO be avoided under all circumstances.

So the very basic question is: would it be possible to get rid of split packages entirely (e.g. merge modules that are in fact not independent because they require package-private access, or relocate classes)? All of this is plain Java, with no relation to OSGi, but it will help with integrating into OSGi.

Second problem 'classloading'

Many projects suffer from being developed in a context where there is only one global class loader and everything is always accessible from everywhere. The most common pitfalls are Class.forName, Java serialization and reflection, where developers tend to rely on the "global class loader" case.
The problem only occurs in more restrictive contexts: OSGi, webapps, sandboxed applications (applets). Even though it is often referred to as an "OSGi problem", it is in fact a problem of a design that relies on a specific setup.

While this can be a challenging task, it is IMO always worth fixing this kind of problem, since these things can be quite hard to debug and can lead to subtle bugs/security issues.
Again, not an OSGi thing, but it will help with integrating into OSGi.

Third problem 'native-code'

Loading native code has always been a problem in Java IMO, since the whole framework for this depends on some magic to find/load native code that cannot be controlled or debugged to let the programmer help with loading a library. So there are often various 'try-this-and-that' approaches to satisfy special deployment scenarios (IDE, build, user installation, ...).

I think there is not much we can do here, but we might add some more support to that try-this-and-that approach to enable OSGi.

how can we design a library that's perfect for OSGi, but that doesn't depend on OSGi

If problems 1 + 2 are solved, there is not much left to do: OSGi will only need some additional headers in the MANIFEST.MF file that will be ignored by the rest of the world, and these can even be derived automatically and added as part of your normal Maven build.

For problem 3, we must check what we can or must do. OSGi already includes some methods to embed native code, so in the best case we again only need to add some special headers to the manifest (but these can't be generated automatically).

Very good summary, @laeubi !

@timothyjward

Hey there, I am in the process of getting some parts of DL4J running in an OSGi container. Since I only stumbled over this ticket now, and since this has a rather long story: is there any existing work that can be shared (in a fork or a branch)?

I'm afraid that the work I've done is currently internal to a project, but I am trying to get it contributed back.

The main problem areas are as follows:

  1. Neither DL4J nor ND4J are packaged as OSGi bundles
  2. The DL4J code contains lots of split packages, making it impossible to bundle-ize the existing Maven modules; you have to merge lots of them together.
  3. Native code loading (supplied by JavaCPP) doesn't completely work on a non-flat class path.

I've put a substantial amount of effort into getting something that just about works, packaging up the DL4J "API" and ND4J "API" into bundles, with platform specific fragments for the native/backend code. Note that the fragments must contain all the native code dependencies, and JavaCPP. This is obviously not the ideal way to do this, but it does work.

I've also put effort into improving JavaCPP's OSGi story, with thanks to @saudet for getting the patches whipped into shape. As of the latest release, JavaCPP is an OSGi bundle and does allow for native code to be loaded from other bundles, however there is a native code dependency problem, specifically that the native code needed by JavaCPP needs to be compiled separately and loaded by the JavaCPP bundle class loader. This means that pretty much anything which creates a custom pointer type doesn't work. See the PR at bytedeco/javacpp#344 for a description of the remaining problem. Note that this issue isn't purely an OSGi problem, it also leads to lifecycle problems in JavaCPP sometimes.

@laeubi

laeubi commented Jan 17, 2020

@tonit
I packed a subset of ND4J into an update site and was able to run it, but as we currently have no use case for it, development in that area is paused atm.
If you just need to "get things working", you probably want to start by creating an OSGi jar with all dependencies in a lib folder and adding them to the bundle classpath; I often use the maven-dependency-plugin:copy-dependencies goal for that.
Then put in an Activator and run some sample code inside it. If that works (you might have to configure the loading of native code manually, by extracting the libraries at runtime to a temp directory), you can start removing libs from the classpath for which you have OSGi bundles and replacing them with proper package imports.
That is not a really "nice" solution, but it works if you just want to try out the lib or only need it for a very restricted part of your ecosystem.
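
A minimal "try it out" Activator along those lines might look as follows, assuming the ND4J jars (and a matching native backend) are embedded on the Bundle-ClassPath; the class name is made up and the tiny workload is only there to prove that classes and native code resolve.

package org.example.osgi;

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;

public class Nd4jSmokeTestActivator implements BundleActivator {

    @Override
    public void start(BundleContext context) {
        // If this prints a result, the embedded jars and native code are usable.
        INDArray ones = Nd4j.ones(2, 2);
        System.out.println("ND4J inside OSGi: " + ones.addi(1.0));
    }

    @Override
    public void stop(BundleContext context) {
        // Nothing to clean up for this smoke test.
    }
}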

@saudet
Contributor

saudet commented Mar 30, 2020

I've also put effort into improving JavaCPP's OSGi story, with thanks to @saudet for getting the patches whipped into shape. As of the latest release, JavaCPP is an OSGi bundle and does allow for native code to be loaded from other bundles, however there is a native code dependency problem, specifically that the native code needed by JavaCPP needs to be compiled separately and loaded by the JavaCPP bundle class loader. This means that pretty much anything which creates a custom pointer type doesn't work. See the PR at bytedeco/javacpp#344 for a description of the remaining problem. Note that this issue isn't purely an OSGi problem, it also leads to lifecycle problems in JavaCPP sometimes.

This should now be taken care of with commits bytedeco/javacpp@abe3cf0 and bytedeco/javacpp-presets@d49542e. Let me know if you still see something that needs to be done though. Thanks for your help with this!

@cowwoc

cowwoc commented Apr 23, 2020

@timothyjward I don't know if anyone mentioned this before, but you'll need to rename the nd4j-native module for it to work under JPMS.

Error occurred during initialization of boot layer
java.lang.module.FindException: Unable to derive module descriptor for C:\Users\Gili\.m2\repository\org\nd4j\nd4j-native\1.0.0-beta6\nd4j-native-1.0.0-beta6.jar
Caused by: java.lang.IllegalArgumentException: nd4j.native: Invalid module name: 'native' is not a Java identifier

The JAR filename can contain "native", but the derived module name cannot be nd4j.native: no segment between the dots can be the Java keyword native.

@AlexDBlack
Contributor

FYI full list of split packages here: #8870

@AlexDBlack
Contributor

FYI I have merged this PR that resolves all split package issues: KonduitAI#411
#8870

It will be available on eclipse/deeplearning4j within a day or so of this comment.
Note this is a breaking change for some packages/imports.
