RFC4: PROJ JNI Overhaul #1551

Closed

kbevers
Member

@kbevers kbevers commented Jul 8, 2019

This RFC proposes a complete overhaul of the Java Native Interface (JNI) to PROJ.
In short, the current Java interface will be replaced by a new interface that
is based on GeoAPI and includes the new PROJ functionality described in RFC3.

I plan on submitting this for a vote with the PSC next week. Please provide your feedback before then so the RFC describes the process of updating the Java interface as best as possible.

@rouault
Member

rouault commented Jul 8, 2019

What would be the functional scope? That is, would "manual" creation of objects from their components be included (proj_experimental.h API), or only higher-level operations like instantiating a CRS from WKT, creating a CRS-to-CRS coordinate operation and reprojecting points?

I'm trying to get a sense of the size of this new Java binding. The current one is rather small, so it makes sense for it to be hosted here. If the newer one is much bigger, perhaps it would be more appropriate for it to have its own repository? I'm not opposed to having it here either - I don't really care, just asking questions :-). This could also be seen as a consistency question, as pyproj, R PROJ4, etc. have their own repos.

Another question is whether the native code of those Java bindings will use the C++ API or the C API (that question is also linked to the intended functional scope, as the C++ API is wider than the C one). If it's the C++ one, then it would probably be better to host it here, as that API might be (a bit) less stable than the C one.

You mention JNI. Did you consider alternative technologies that are more "modern", like JNA? (Same here, I'm not an expert in that matter, just asking.)

@kbevers
Member Author

kbevers commented Jul 8, 2019

I'm trying to get a sense of the size of this new Java binding. The current one is rather small, so it makes sense for it to be hosted here. If the newer one is much bigger, perhaps it would be more appropriate for it to have its own repository? I'm not opposed to having it here either - I don't really care, just asking questions :-). This could also be seen as a consistency question, as pyproj, R PROJ4, etc. have their own repos.

It could just as easily be put in a separate repository as it can be included here. I did consider both before proceeding with this RFC. My main reason to include it here is that we have been providing a Java interface for 15 years and I fear that it will be too disruptive if we remove it completely. The advantage of keeping the bindings in their own repository is that we can keep the dependencies of a full PROJ build at a minimum. I also fear that a separate repository has a higher risk of being abandoned after a short while. I am interested in hearing the opinion of others on this matter.

As for the rest of the questions I'll defer to @desruisseaux who is the expert :-)

@desruisseaux
Contributor

The proposed scope is JNI bindings in C++ for most ISO 19111 interfaces that are implemented by PROJ 6, including the ones for "manual" creations of CRS from their components. However I expect the bindings to be very close to a one-to-one match, so it would hopefully be easy for developers familiar with the PROJ C++ API. Plain C can be used instead of C++ if the community prefers.

Regarding consistency with pyproj or R, JNI may be particular in that it requires dedicated C/C++ code (I don't know if that is also the case for Python or R). It is easy to have the Java code in a separate repository, but a little more difficult to have the JNI C/C++ code separated: depending on how the project is compiled, it may force Java to load two C++ libraries instead of one. In that configuration I noticed some issues in the past, caused by a mismatch between the PROJ version against which the JNI bindings were compiled and the PROJ version available at runtime (I was using some private API; the compatibility issues disappeared after I restricted myself to public API only). Compiling the JNI together with PROJ in the same library makes it easier to guarantee that the JNI is consistent with the PROJ API. However, if the community prefers a separate repository, this is workable too.

JNI alternatives like JNA or JNR are independent projects developed outside the JDK. As far as I know, JNI is the only gateway between Java and C/C++; all other projects are basically JNI generators compiling native code on-the-fly, then linking that code using JNI. For example, JNR generates a lot of JNI code (we can see that by adding the -verbose:jni option to the java command). On one hand, those generators free us from the need to write C/C++ code, which would make it easier to provide the Java bindings in a separate repository. On the other hand, those generators add a dependency to the project (the "compiler" for generating JNI on-the-fly). By contrast, classical JNI works out-of-the-box on any Java installation with no particular dependency. I also believe that JNI written by humans and compiled with a "real" C/C++ compiler can be more efficient, but I never benchmarked this hypothesis.
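To illustrate the "no particular dependency" point, here is a minimal sketch of what the Java side of a classical JNI binding looks like. Everything named here (the class `NativeBindingSketch`, the library name `proj_jni_sketch`, the method `nativeVersion`) is hypothetical and not part of PROJ or GeoAPI; only `System.loadLibrary` and the `native` keyword come from the JDK itself.

```java
/**
 * Hypothetical sketch of the Java half of a JNI binding.
 * The class, library and method names are illustrative assumptions only.
 */
final class NativeBindingSketch {
    /** Hypothetical library name (libproj_jni_sketch.so, proj_jni_sketch.dll, ...). */
    static final String LIBRARY = "proj_jni_sketch";

    /** Whether the native library could be loaded from java.library.path. */
    private static final boolean AVAILABLE = load();

    private static boolean load() {
        try {
            System.loadLibrary(LIBRARY);   // plain JDK call, no third-party dependency
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;                  // library absent, or loading not permitted
        }
    }

    /** Would be implemented in C/C++; the JVM links it on first invocation. */
    static native String nativeVersion();

    /** Returns the native version, or a fallback text when the library is missing. */
    static String version() {
        return AVAILABLE ? nativeVersion() : "native library not available";
    }

    public static void main(String[] args) {
        System.out.println(version());
    }
}
```

Running this with `java -verbose:jni` shows which native methods the JVM links. The sketch also shows where the difficulty discussed above comes from: the Java code is portable, but the library named in `LIBRARY` must exist as a platform-specific binary on the user's machine.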

@kbevers
Member Author

kbevers commented Jul 10, 2019

including the ones for "manual" creations of CRS from their components.

and by this you mean creating an operation from a proj-string? An equivalent function to proj_create() is going to be necessary in some form or another. Can GeoAPI handle that too or is that a "side-car function"?

However I expect the bindings to be very close to a one-to-one match, so it would hopefully be easy for developers familiar with the PROJ C++ API. Plain C can be used instead of C++ if the community prefers

If the GeoAPI can be mapped to the C API without including C++ API that is preferable.

However, if the community prefers a separate repository, this is workable too.

I think this is mostly a question of the preference of the project. I would like to hear the opinion of the rest of the PROJ dev team, so please speak up people :-) As I see it there are advantages and disadvantages to including the Java bindings in the PROJ repository. On the plus side is that the code will be contained in a project that is well-maintained, ease of implementation (as Martin describes above) and familiarity for users (they are used to having the JNI as part of the PROJ package). On the down side we get more code to maintain (which the current developers are not particularly interested in nor capable of maintaining) and more dependencies are added.

@rouault
Member

rouault commented Jul 10, 2019

and by this you mean creating an operation from a proj-string?

I was referring to, for example, creating a projected CRS from its name, base CRS, deriving conversion and coordinate system with proj_create_projected_crs()

If the GeoAPI can be mapped to the C API without including C++ API that is preferable.

You'll have to evaluate the completeness of the C API regarding what you want to do with it. It can be extended of course. I just pushed it to the extent of what I needed for GDAL and imagined what would be useful for QGIS and similar software. But if you want to do "esoteric" stuff like creating a DerivedGeographicCRS, this isn't mapped to the C API yet.

@kbevers
Member Author

kbevers commented Jul 10, 2019

I was referring to, for example, creating a projected CRS from its name, base CRS, deriving conversion and coordinate system with proj_create_projected_crs()

Okay. This can be regarded as a follow-up question then. It is quite important that a PROJ Java interface can instantiate an operation from a PROJ string. So, would this be possible with GeoAPI or do we need to add additional PROJ specific functions?

I just pushed it to the extent of what I needed for GDAL and imagined what would be useful for QGIS and similar software.

Of course. It is only natural that the C API will evolve over time. My point here is that the C API is the most stable thing we can offer, so basing the Java interface on that is likely to produce code that requires as little attention in the future as possible. This may not be true, just my hypothesis :) If we need to extend the C API a bit that is not a big issue, hopefully others will benefit as well.

@desruisseaux
Contributor

By "manual" creation from components I mean building GeodeticReferenceFrame (for example), CoordinateSystemAxis, etc. objects, then assembling those components into a CoordinateReferenceSystem. This is useful, for example, when a data file does not specify its CRS by an EPSG code or a WKT string, but rather by describing the CRS components. Examples: GeoTIFF or netCDF. Those file formats are parsed by GDAL, but users may want to do a similar task in other situations too. Another use case is to add a temporal axis to the CRS read from a GeoTIFF or netCDF file.

An equivalent function to proj_create() is a separate task but will be provided too. From the GeoAPI point of view, PROJ is simply another namespace for CRS codes in addition to EPSG, IGNF, ESRI, etc.

Regarding the C versus C++ API, I will let the community decide. But in addition to the completeness consideration raised by Even, it could be (I have not verified this) that mapping the C++ API results in more "natural" code, since we would be mapping between two object-oriented languages.

Regarding whether the code would be in the PROJ repository or a separate repository, another advantage of having it in the PROJ repository would be that, if the Java build is enabled, the GeoAPI test suite could be executed at build time, which provides another source of tests (including some GIGS tests) for PROJ.

Having the bindings in the PROJ repository increases the maintenance effort, but on my side I'm more likely to occasionally use these bindings than the previous ones (I need the bindings to be a GeoAPI implementation, which was not the case for the previous bindings). These bindings would replace the GeoAPI-PROJ4 bindings, so I have an interest in contributing to their maintenance for the foreseeable future.

@kbevers
Member Author

kbevers commented Jul 15, 2019

I have put the RFC up for a PSC vote on the mailing list: https://lists.osgeo.org/pipermail/proj/2019-July/008707.html

@hobu
Contributor

hobu commented Jul 15, 2019

Sorry I was on vacation and wasn't tracking this thread very closely. Bindings like this, especially those inside the official repository, go stale once the persons responsible for them move on. The old Java bindings in PROJ went stale. Bindings like this are out of scope for the project.

If we're refreshing the Java bindings in PROJ again, why aren't we also adding Ruby, Python, and Node ones too? Each of those languages has a healthy project that wraps PROJ in that language's idioms, and there have been multiple instances where the binding project has fallen into disrepair and a new maintainer eventually picked up the task. Sticking the binding in the official repository ties it to PROJ's release cycle, hamstrings it by PROJ's available maintenance resources, and shades any other potential binding approaches in the language from gaining traction by being seen as "official".

This RFC does not make the case for why Java is special in relation to the other languages. Please update the language of the RFC to articulate this case in relation to the points I've made.

@desruisseaux
Contributor

One thing that makes Java bindings "special" compared to other languages is that they require a counterpart in C/C++. I do not know other languages well, but if I understood correctly, Python bindings invoke C/C++ methods directly (without the need for us to write special C/C++ code for that), is that right?

The C/C++ code required for Java bindings is easier for users if it is (optionally) in the same binary as PROJ. Having one binary for PROJ and one separate binary for the Java bindings brings some difficulties. They are workable, but slightly more difficult to use.

@kbevers
Member Author

kbevers commented Jul 15, 2019

Thanks, Howard. I was hoping to get your input on this. I have two arguments for including this in the official repository:

  1. We've provided Java bindings for ~15 years; users will reasonably expect this to also be the case in the future.
  2. I fear that stand-alone Java bindings, without the exposure of the official repository, will go stale much quicker than if included in the main project.

Both arguments are easily countered; it is mostly a matter of what we as a project want to provide in the future. Or not provide. I see both pros and cons in keeping the status quo of including Java bindings as part of PROJ. In the end I came to the conclusion that for the current situation I am facing (lots of software in the organisation uses PROJ JNI - we need to deal with that somehow) what's described in this RFC is my preferred solution. This is of course a very selfish view, which is why I put this RFC up in the first place - there's a discussion to be had here. I included the first point in the background section, but I can add more if that is not adequate, as well as adding my second point.

In the end this really boils down to the principle question: Do we want to provide bindings for other languages or not?

@hobu
Contributor

hobu commented Jul 15, 2019

The C/C++ code required for Java bindings is easier for users if it is (optionally) in the same binary as PROJ.

How much C API would be needed to be added for the Java bindings to exist entirely outside the PROJ base repository going forward?

@hobu
Contributor

hobu commented Jul 15, 2019

In the end this really boils down to the principle question: Do we want to provide bindings for other languages or not?

I regret doing this in GDAL for the reasons I described above. GEOS also has a similar problem. The bindings there are seen as "official", but they're in fact just yet another implementation. Maintainers in each language family know best how to adapt the API to that language's idioms (see GDAL's Python bindings for one of the worst examples of this failure), and tying to the main library's release schedule causes bugs to pile up if the release schedule of that library lengthens (PROJ's certainly did, and so did GEOS and GDAL at various times).

@kbevers
Member Author

kbevers commented Jul 15, 2019

I regret doing this in GDAL for the reasons I described above. GEOS also has a similar problem.

Very good points against doing this. A -1 from you is definitely warranted based on your past experiences. I will not get offended by that :-) It will be a bit more work for Martin and me, but nothing that can't be handled (although the C API might need to be expanded).

@desruisseaux
Contributor

If we use the C++ API instead of the C API, there is probably no addition needed for the proposed Java bindings. But the C/C++ code that we need to write is not an extension of the PROJ API. It is code internal to the Java bindings. The Java Virtual Machine expects native code matching a very specific signature. Many of those Java Native Interface (JNI) methods will just forward the call to the corresponding PROJ method. But it is easier to have those JNI methods in PROJ rather than in a separate project, because it is difficult to bundle native code in a Java application, especially if we want to stay platform-independent. If PROJ is already installed on the target platform and if this installation includes JNI methods, then it frees Java developers from the need to include JNI bindings (i.e. native, platform-specific code) in their own Java application.
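As a rough illustration of the "very specific signature" requirement, the sketch below derives the C symbol name that the JVM looks up for a non-overloaded native method, following the name-mangling rules of the JNI specification. The class `JniSymbolNames` and the example names `org.example.proj.Transform` and `create_from_wkt` are hypothetical; they are not taken from the proposed bindings. Overloaded native methods get an additional argument-signature suffix, which this sketch does not handle.

```java
/**
 * Sketch of JNI short-name mangling for non-overloaded native methods.
 * The class and the example names used in main() are hypothetical.
 */
final class JniSymbolNames {
    /** Escapes one name following the JNI name-mangling rules. */
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);                    // ASCII alphanumerics are kept as-is
            } else if (c == '.' || c == '/') {
                sb.append('_');                  // package separator
            } else if (c == '_') {
                sb.append("_1");                 // literal underscore
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else {
                sb.append(String.format("_0%04x", (int) c));  // any other character
            }
        }
        return sb.toString();
    }

    /** Symbol that the native library must export for the given native method. */
    static String symbolFor(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        // prints Java_org_example_proj_Transform_create_1from_1wkt
        System.out.println(symbolFor("org.example.proj.Transform", "create_from_wkt"));
    }
}
```

The native library has to export a function with exactly that name, taking a `JNIEnv*` and a `jclass` or `jobject` as its first two parameters. This is why the glue code is specific to the Java bindings and is not an extension of the PROJ API itself.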

Regarding experience with GDAL, I agree that maintainers in each language family know best how to adapt the API to the language's idioms. But in this proposal, PROJ would not be defining a Java API since we would use GeoAPI. So this proposal is purely about the mechanical aspect of the bindings; it does not put on PROJ the responsibility to define an API.

@desruisseaux
Contributor

It will be a bit more work for Martin and me, but nothing that can't be handled

Actually it may also be more work for Java developers using PROJ, because of the requirement to include a native library in the user's application. Workarounds exist (including use of alternatives like JNR), but they have other inconveniences. So in summary, what makes Java bindings "special" (leaving historical reasons apart):

  • Requirement for a C/C++ counterpart (independently of C API completeness), which makes it easier for Java developers (not only bindings maintainers) to have it optionally bundled with PROJ.
  • API defined outside this project, so PROJ is not encumbered with that responsibility.
  • GeoAPI is an OGC standard, so the proposed bindings are probably less arbitrary than other bindings.

@hobu
Contributor

hobu commented Jul 15, 2019

A -1 from you is definitely warranted based on your past experiences.

I'm not vetoing if you guys can make the case. I did want to hold up the "slow" sign though to consider why we would be adding this code.

If PROJ is already installed on the target platform and if this installation includes JNI methods, then it frees Java developers from the need to include JNI bindings (i.e. native, platform-specific code) in their own Java application.

It is convenient for Python users if the PROJ API + Python API stubs are distributed with PROJ. Same with Ruby or V8. If the primary case for doing this is simply convenience of Java users, I'm not so convinced.

That the ship has already sailed and Java has had this in PROJ for a very long time means I'm not going to stand in the way of this effort, but I would ask what is missing from PROJ's API to allow a Ruby or V8 developer to construct the same thing using public PROJ APIs. If those items are identified through this effort we should consider adding them publicly rather than the Java bindings simply taking advantage of its privileged position.

@schwehr
Member

schwehr commented Jul 15, 2019

I just voted +1 because:

  1. PROJ already has a Java API (so it should either keep moving forward or be booted to move forward externally)
  2. Getting a solid GeoAPI implementation will be a very good thing.

@hobu has some really good points about the Java API being a special case by living inside PROJ.

I think we need to keep at the discussion of where the Java API (and other language APIs) should live and how they are maintained, but I think that discussion should be expanded elsewhere. I would like to see the ecosystem around PROJ grow and improve as much as possible, but I'm not sure what that should look like. I don't think you can take GDAL as an example. So many lessons have been learned since those swig wrappers were created.

As for C vs C++, I lean towards C++ > C++11, but would not be against C. I would be strongly against C++03 :)

I currently work in a space that has rejected the existing Java PROJ interface and for that, this would be very positive progress

@rouault
Member

rouault commented Jul 15, 2019

I would be strongly against C++03 :)

That wouldn't be doable anyway. The PROJ C++ API is C++11.

@kbevers
Member Author

kbevers commented Jul 15, 2019

If those items are identified through this effort we should consider adding them publicly rather than the Java bindings simply taking advantage of its privileged position.

Absolutely. This is why I would prefer the Java bindings be based on the C API, so we can find its shortcomings (if there are any).

I currently work in a space that has rejected the existing Java PROJ interface and for that, this would be very positive progress

@schwehr would your "space" be willing to jump on a PROJ Java GeoAPI implementation? Having a few organisations on board early on will definitely help with maintenance in the long run.

@schwehr
Member

schwehr commented Jul 15, 2019

I currently work in a space that has rejected the existing Java PROJ interface and for that, this would be very positive progress

@schwehr would your "space" be willing to jump on a PROJ Java GeoAPI implementation? Having a few organisations on board early on will definitely help with maintenance in the long run.

@kbevers Possibly. The best case is if GeoTools supports an option to use it.

@desruisseaux
Contributor

The GeoTools case is a bit particular since they do not use GeoAPI as published by OGC, but instead their own fork. They are free to do that, but should have changed the org.opengis package name to something in their own namespace (see SPDX issue #845 for a discussion on OGC licensing aspects). Alternatively they could replace their fork with OGC GeoAPI (I think it would not be technically difficult because there are very few differences compared to their fork), but that would be another discussion. If they do that, then yes, an option for using PROJ in GeoTools should be easy.

@kbevers
Member Author

kbevers commented Jul 16, 2019

In 327352d I have tried to sum up the reasoning behind updating the Java interface instead of removing it. I hope this is satisfactory.

@rouault
Member

rouault commented Jul 16, 2019

Could you add to the RFC text that the native part of the bindings will only use public API (installed headers, possibly extended if needed), that is proj.h or proj_experimental.h for the C API, or #include "proj/{foo}.hpp" files if using the C++ API, and no internal headers like proj_internal.h?
That way we have a contingency plan if the bindings become a burden for the project: they could then be moved off to their own repo relatively easily.

@kbevers
Member Author

kbevers commented Jul 16, 2019

Could you add in the RFC text that the native part of the bindings will only use public API

Yes. Added in 8b6bef4.

@kbevers
Member Author

kbevers commented Jul 17, 2019

@rouault @hobu Are you satisfied with the RFC in its current form? If so, can I ask you to cast your votes on the mailing list?

@hobu
Contributor

hobu commented Jul 17, 2019

I'm -0 for the reasons that @rouault provided on the list, but I'm satisfied with the responses given in the RFC insofar as I'm not vetoing or preventing the work from moving forward.

@schwehr
Member

schwehr commented Jul 17, 2019

Are there any publicly visible users of the existing JNI code?

@desruisseaux
Contributor

There is Apache SIS, which is in a mixed situation. It provides an option for using PROJ in addition to its own built-in referencing system. But that option is built on a copy of the PROJ JNI bindings, modified for better integration with GeoAPI. Having the JNI bindings in Apache SIS rather than in PROJ causes the following problems:

  • No Windows support, because we had no volunteer for compiling on that platform.
  • Crashes in C/C++ code when the PROJ version at runtime differs from the PROJ version at compile time (this was caused by use of private API; restricting to public API resolved the problem).
  • The native file must be copied somewhere on the user's machine at Java application runtime, which is sub-optimal and has security implications (not allowed in some environments).

The advantage of having JNI bundled with PROJ is that Java users can get the bindings from Linux distributions, MacPorts, or other systems managing the PROJ installation, in which case all of the problems cited above vanish. If the Java bindings are provided in a separate project, a question would be: do we have a volunteer for getting at least the native part of those bindings included in Linux/macOS/etc. package managers? If not, then the problems cited above will continue for the foreseeable future.
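For bindings shipped separately from PROJ, one extra defensive measure (in addition to restricting the native code to public API, which is what actually resolved the crashes mentioned above) could be a version check before any native call is made. The sketch below is only an assumption about how such a guard might look: the class `VersionGuard` is hypothetical, and the policy "same major version, runtime minor not older than compile-time minor" is an illustrative choice, not a compatibility guarantee stated by PROJ.

```java
/**
 * Hypothetical guard comparing the PROJ version the bindings were compiled
 * against with the version found at runtime. The policy is an assumption.
 */
final class VersionGuard {
    /** Parses the leading "major.minor" of a version string such as "6.1.0". */
    static int[] majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + version);
        }
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    /** Assumed policy: same major version, and runtime minor >= compile-time minor. */
    static boolean isCompatible(String compiledAgainst, String runtime) {
        int[] c = majorMinor(compiledAgainst);
        int[] r = majorMinor(runtime);
        return c[0] == r[0] && r[1] >= c[1];
    }

    public static void main(String[] args) {
        System.out.println(isCompatible("6.1.0", "6.2.0"));
        System.out.println(isCompatible("6.1.0", "5.2.0"));
    }
}
```

In a real binding, the two version strings would have to come from the native side (one recorded at compile time, one queried from the library loaded at runtime); how to obtain them is left out here.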

@kbevers
Member Author

kbevers commented Jul 18, 2019

As stated on the mailing list, I am withdrawing this RFC.

@kbevers kbevers closed this Jul 18, 2019
@kbevers kbevers deleted the rfc4 branch April 18, 2020 09:53

5 participants