First-class Java API #58973

Open
cowwoc opened this issue May 26, 2021 · 9 comments
Labels
feature A request for a proper, new feature. oncall: java triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@cowwoc

cowwoc commented May 26, 2021

🚀 Feature

Provide a first-class Java API that exposes PyTorch functionality in a Java-esque manner.

Motivation

The Java ecosystem is huge. There are many developers clamoring to integrate PyTorch pipelines into their workflow but they are unable to do so due to the lack of an official Java API.

Pitch

https://github.com/bytedeco/javacpp-presets/tree/master/pytorch contains a quickly maturing Java binding that is easy to maintain. The author would be more than happy to work with you if you could offer some official backing.

Alternatives

Use a generic Python-Java bridge (e.g. JPype) to interact with PyTorch. This isn't a good option because many Java developers do not code in Python, nor are they interested in doing so.

@anjali411 added the feature, oncall: java, and triaged labels on May 26, 2021
@skyline75489
Contributor

Hi @cowwoc. Thanks for the feedback. According to my research, there are existing PyTorch Java bindings as well as Android bindings. Is there anything you want to do that's not available in the existing bindings? I don't know the Java world well enough to understand why javacpp-presets/pytorch is needed. Could you be kind enough to explain the difference between this project and the existing bindings? Thanks again.

@cowwoc
Author

cowwoc commented May 27, 2021

@skyline75489 Unless I missed something, it looks like you are talking exclusively about model inference from Java code. I am talking about exposing 100% of the API to Java. Meaning, users should be able to create, train, and run inference on models without needing to interact with any language other than Java. Further, I tried digging into the documentation for the aforementioned bindings and I only saw 5 classes in the Javadoc. Something seems to be missing.

I'll let @saudet elaborate further on https://github.com/bytedeco/javacpp-presets/tree/master/pytorch since he knows it better than I.

@saudet

saudet commented May 28, 2021

> I'll let @saudet elaborate further on https://github.com/bytedeco/javacpp-presets/tree/master/pytorch since he knows it better than I.

@skyline75489 Right, how would you port to Java this "simple end-to-end example"?
https://pytorch.org/cppdocs/frontend.html#end-to-end-example

@skyline75489
Contributor

Thanks @cowwoc and @saudet for the reply. We at Microsoft are trying to improve PyTorch and help the community. If there's a clear path towards better Java support inside PyTorch itself, we'll consider putting some effort in the specific area.

My current understanding: there's a reasonable amount of JNI bindings inside PyTorch but those are mainly for inferencing. In order to improve the Java support we'll need to extend/refurbish the Java bindings.

Is my understanding correct? Thanks in advance

@saudet

saudet commented May 29, 2021

Yes, that's correct. Roughly speaking, the current Java API of PyTorch works for inference, but not for training. However, I was able to map the C++ API with JavaCPP and it works well. Training MLPs, CNNs, RNNs, transformers, etc. is doable from Java; see bytedeco/javacpp-presets#623 (comment). The idea would be to write a high-level API on top of that. DJL did that for MXNet using JNA, which works well enough since MXNet's native API is C only, but I've offered to switch to JavaCPP since it's more efficient (see apache/mxnet#19797). Meanwhile, we're already using JavaCPP for DL4J and TF Java, since their native C APIs are not fully usable and require some C++. I would propose doing the same for PyTorch.

@gchanan
Contributor

gchanan commented Jun 2, 2021

We don't have any current plans to officially support full API bindings in java. If there is functionality that makes third-party bindings easier to build/maintain we'd like to hear about that, though.

@Thrameos

Thrameos commented Jun 2, 2021

I should mention that JPype has been working on the reverse bridge, where Java calls Python rather than Python calling Java. I have had a working test branch available for a while called epypj. It is currently stalled because a huge number of unit tests is required to validate that all valid Python commands can be accessed from Java. That and a dispute with my employer over contributing to Python development have been pretty demotivating, but if others wish to contribute to a fully functional CPython in Java which can be used for any Python library (including stuff like numpy and matplotlib), it is doable.

It is basically just an FFI which exposes the CPython API in Java and uses JPype so that Java objects properly have a presence in Python space. It does use code generators which use reflection in Python to identify callable structures and modules. These can be "customized" using Java classes to construct Jar files which contain all the stubs required to call Python libraries from Java. Or objects can just be treated as generic Python objects which have a fixed interface which polymorphs to implement Java concepts like Iterable or Container (basically, if certain slots are found, it automatically adds an interface to the class definition of the object being created).
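To make the "if certain slots are found, add an interface" idea concrete, here is a minimal sketch using only the JDK's `java.lang.reflect.Proxy`. It is not taken from JPype or epypj: the names `SlotProxyDemo`, `PyGeneric`, `Sized`, and `wrap`, and the use of a plain `Map` to stand in for a Python object's slots, are all assumptions made for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from JPype or epypj.
final class SlotProxyDemo {

    /** Fixed interface that every wrapped object exposes (assumed name). */
    interface PyGeneric {
        boolean hasSlot(String name);
    }

    /** Added only when a "__len__" slot is present (assumed name). */
    interface Sized {
        int size();
    }

    /** Builds a proxy whose interface list depends on which slots exist. */
    static Object wrap(Map<String, Supplier<?>> slots) {
        List<Class<?>> interfaces = new ArrayList<>();
        interfaces.add(PyGeneric.class);
        if (slots.containsKey("__iter__")) {
            interfaces.add(Iterable.class);
        }
        if (slots.containsKey("__len__")) {
            interfaces.add(Sized.class);
        }

        // Route each Java interface method to the matching slot.
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "hasSlot":
                    return slots.containsKey((String) args[0]);
                case "iterator":
                    return slots.get("__iter__").get();
                case "size":
                    return slots.get("__len__").get();
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return Proxy.newProxyInstance(
                SlotProxyDemo.class.getClassLoader(),
                interfaces.toArray(new Class<?>[0]),
                handler);
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3);
        Map<String, Supplier<?>> slots = new HashMap<>();
        slots.put("__iter__", data::iterator);
        slots.put("__len__", data::size);

        Object listLike = wrap(slots);
        // The proxy picked up Iterable because "__iter__" was present.
        for (Object item : (Iterable<?>) listLike) {
            System.out.println(item);
        }
        System.out.println(((Sized) listLike).size());

        // With no slots, only the fixed PyGeneric interface is implemented.
        Object opaque = wrap(new HashMap<>());
        System.out.println(opaque instanceof Iterable);
    }
}
```

A real bridge would look slots up on a live CPython object rather than in a map, and a code generator could emit typed stubs instead of proxies; the sketch only shows how the interface set can be decided per object at creation time.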

I am not sure if this meets the goal of this project, but you should at least be aware that it exists.

@saudet

saudet commented Jun 4, 2021

> We don't have any current plans to officially support full API bindings in java. If there is functionality that makes third-party bindings easier to build/maintain we'd like to hear about that, though.

Thanks! Generally speaking, anything that makes it easier to bind with tools like pybind11 is welcome. JavaCPP is essentially the equivalent of pybind11 for Java, so anything that makes it easier to bind for Python should also make it easier for Java.

/cc @HGuillemet @jbaron @wmeddie @zaleslaw @Craigacp @karllessard

@zaleslaw

zaleslaw commented Jun 5, 2021 via email


7 participants