Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: migrate geos-sys to bindgen and prebuilt GEOS bindings #113

Merged
merged 32 commits into from Jul 22, 2022

Conversation

brendan-ward
Copy link
Contributor

This builds out a bindgen based build step roughly similar to georust/gdal, which uses prebuilt bindings for each supported version. This should make it easier to add new versions in the future and version-specific behavior to the main crate.

Adds prebuilt bindings for GEOS:

  • 3.7.3
  • 3.8.3
  • 3.9.3
  • 3.10.3
  • 3.11.0 (in progress, pending final release)

I have been unable to get GEOS 3.6.5 to build statically, and I believe 3.6.x is no longer being maintained (last update was late 2020); it might make sense to drop GEOS 3.6 support from here relatively soon.

This adds a check against a minimum supported version of GEOS, which seems like a good idea to avoid very old versions.

The build step uses either pkg-config (only available for GEOS >= 3.9) or falls back to geos-config to attempt to discover the location and version of system-installed GEOS and dynamically link to it. This seems to work OK for Linux / macOS; I don't see that there was previous support here for Windows builds, so I didn't add any in this PR (could presumably be added in a future PR). You can also set a few env vars to specify the location of GEOS.

The static build of GEOS is roughly as it was before, and should always correspond to a version for which there are bindings available (since it is based on a particular commit in the GEOS submodule). I was able to use that to build version-specific bindngs for most versions.

This removes the hand-crafted GEOS bindings and associated script. The bindgen bindings here use libc, similar to the original bindings, and also specifically exclude functions excluded previously. Overall, the bindings look nearly identical; I didn't spot meaningful differences.

This deprecates version features from geos-sys (they are retained but have no affect); these are obviated by having version-specific bindings. At some point soon (if this gets merged), version features should be deprecated and replaced by version detection similar to georust/gdal

Since this includes a breaking changes to the geos-sys build process (i.e., if autodetection of GEOS fails badly or bindings aren't available for a specific version), we should probably note that someplace, but I haven't added anything yet.

Most of the tests pass except for a few in more recent versions of GEOS that are based on underlying GEOS logic changes that will need to be better handled in those tests (e.g., coordinate order within polygons). I haven't yet investigated the valgrind related errors, but I see that is failing on the master branch too. I haven't yet seen any errors that indicate a fault in the bindings.

This includes some updates to the CI config to run on all the GEOS versions for which pre-built bindings are available. These leverage the GEOS submodule and attempt to use ccache to speed up the build, though that seems to be held up by the test failures.

Lastly, I'm pretty new to Rust, so there is always the chance that I've made dumb or non-idiomatic mistakes here.

(keeping this as draft until I can track down the failures & review memory leaks)

@GuillaumeGomez
Copy link
Member

So a few things:

  • Even though I think moving to bindgen would be a good idea, it should never be done at compile time.
  • We shouldn't detect the version, it's up to the user to decide which one they want.
  • You can't use the libgeos documentation directly because the license is incompatible.
  • You need to override the default bindgen generated types and use libc types instead.
  • Using pkg-config to confirm/link to the version specified by the user is a good idea.

@brendan-ward
Copy link
Contributor Author

Thanks for the fast response!

it should never be done at compile time

This is only done when bindings aren't already available for your version of GEOS via an opt-in feature (cargo build --features bindgen); this is the same model as georust/gdal. When you don't enable the feature and bindings aren't available, this fails the build step.

The impact of this is based on the interaction with checking against your system installed version of GEOS. Let's say that bindings are available for 3.7 - 3.10, and GEOS 3.11 is released and installed as system GEOS (a side goal of this PR). If we let you choose instead of autodetect the version of GEOS, if you choose 3.11, we could error with a message that 3.11 isn't available, but 3.10 is, so you can then choose (via a feature, I guess?) to use 3.10 instead. You just don't get to use any of the 3.11 features until pre-built bindings are available (a maintenance task in this project). So - the outcome is that the build still works (except maybe for hard-deprecated features, but those are very rare in GEOS C API), for a lesser version of GEOS.

Thus 3.11 (or latest version of GEOS) doesn't become an option until pre-built bindings are published within this crate.

Since that doesn't break the user, I guess that is OK; it would be much worse if we enforced that the bindings must match your system GEOS.

But we'd still need a build script and a feature for enabling binding generation, so I guess we'd just advertise that is intended for crate devs instead of end users?

We shouldn't detect the version, it's up to the user to decide which one they want

So this sounds like sticking with version-specific feature flags, at least during build of geos-sys?

I think there is a distinction that is potentially important with providing support for the crate: regardless of what version they select, we need to know what version of GEOS they linked against to troubleshoot things because there are version-specific GEOS bugs, fixes, and changes (i.e., some versions produce differently ordered coords, as we see here, or fix implementations underneath the API). So if I have 3.11 installed, but use version flags to select 3.9, I'd have to include 3.11 in my bug report here (not a problem, just maybe something to capture via an issue template).

The other upshot is that we or the user needs to be sure they don't select a version flag beyond their system GEOS, or bad things happen (hopefully caught at build time rather than runtime).

You can't use the libgeos documentation directly because the license is incompatible.

Good catch! We should be able to exclude these.

You need to override the default bindgen generated types and use libc types instead

I did for the libc types that were specifically marked in the hand-generated bindings but will check those more thoroughly now. These use libc::<type> in the bindings, example.

Are there specific types that I should watch for, that aren't already automatically handled by bindgen here?
(it uses libc::c_char, libc::c_uchar, libc::c_void, libc::c_int as before). I do see that c_double was used previously but is generated as f64 here, which seems consistent with the docs (unless I misunderstand, or there are very unusual target architectures not described there?).

We specifically use usize for size_t in binding generation, as does georust/gdal-sys, georust/proj, etc. (if we don't we get errors about types). My understanding is that those are equivalent. more info

It is possible that the types generated for #define values may differ by architecture; I'll have to investigate.

Is there anywhere here we are testing against multiple target architectures to verify stability of the bindings (either original hand-generated ones or bindgen ones)? I didn't see this in the CI.

Using pkg-config to confirm/link to the version specified by the user is a good idea

This interacts with first point above. If I have 3.11 installed but no bindings available, and I use a feature to pick 3.10, what should this do? Emit a warning that I'm falling back to an older binding? (build warnings don't seem easily accessible to users)

Do you instead envision this as the mechanism to prevent using a newer API than your local GEOS? E.g., I have 3.7 GEOS installed, but I choose the 3.10 feature. Seems like panicing out of the build at this point is a good idea, so pkg-config or geos-config could help us with that.

@GuillaumeGomez
Copy link
Member

This is only done when bindings aren't already available for your version of GEOS via an opt-in feature (cargo build --features bindgen); this is the same model as georust/gdal. When you don't enable the feature and bindings aren't available, this fails the build step.
...

Same answer, it should never be done at compile time. You can ask for a minimum supported version with pkg-config (that's what we do with gtk-rs). Never compile one out of the blue.

So this sounds like sticking with version-specific feature flags, at least during build of geos-sys?

Yup, exactly.

I think there is a distinction that is potentially important with providing support for the crate: regardless of what version they select, we need to know what version of GEOS they linked against to troubleshoot things because there are version-specific GEOS bugs, fixes, and changes (i.e., some versions produce differently ordered coords, as we see here, or fix implementations underneath the API). So if I have 3.11 installed, but use version flags to select 3.9, I'd have to include 3.11 in my bug report here (not a problem, just maybe something to capture via an issue template).

I think users can retrieve this information without trouble by themselves. Eventually adding it to the issue template but I'm not sure if it'll be very useful...

I did for the libc types that were specifically marked in the hand-generated bindings but will check those more thoroughly now. These use libc:: in the bindings, example.
...

You're right, sorry I focused on the f64. For f32 and f64, please use their libc equivalent as well (it's the only remaining ones I think).

I do see that c_double was used previously but is generated as f64 here, which seems consistent with the docs (unless I misunderstand, or there are very unusual target architectures not described there?).

We specifically use usize for size_t in binding generation, as does georust/gdal-sys, georust/proj, etc. (if we don't we get errors about types). My understanding is that those are equivalent.

One thing for sure, never trust C types. If you want to know why, try to answer these two questions:

  • Based on the C reference, is char signed and what is its size?
  • Based on the C reference, what is the size of long double?

So please keep using libc types, even if the conversion seems straightforward. It's always a trap. 😆

It is possible that the types generated for #define values may differ by architecture; I'll have to investigate.

It's one of the issues that I encountered with bindgen but it's rare unless the library is strongly relying on system APIs. But generally it's hidden behind types.

This interacts with first point above. If I have 3.11 installed but no bindings available, and I use a feature to pick 3.10, what should this do? Emit a warning that I'm falling back to an older binding? (build warnings don't seem easily accessible to users)

You try to link to the version asked by the user and if not available, you throw an error. To be noted, if the version asked by the user is smaller than the one currently available, then it's perfectly fine.

My reference for all this is gtk-rs which is working quite well.

@brendan-ward
Copy link
Contributor Author

I believe I've made the requested updates, but am uncertain about the architecture. From what I could tell, gtk-rs uses a different binding creation tool that seems specific to the GTK stack, but I tried to mimic my understanding of the overall approach by splitting binding generation from geos-sys compile steps.

I moved the bindgen binding generation to a separate folder geos-sys-bind, and added instructions on running this manually. It now also strips comments and migrates f64 => libc::c_double, f32 => libc::c_float.

Is this separate build step in a separate directory what you had in mind? It seems like we don't want it to live within the sys folder.

The geos-sys compile step now relies on the user specifying a version feature that is no greater than their installed version of GEOS; it will error if they request a version higher than what they have locally or if their local version is too old (<3.6).

I refactored the CI tests a little bit more; since each version builds on the one prior, I just enabled version feature flags for the specific version of GEOS being built in that run.

Tests are still failing because of:

  • existing valgrind issues?
  • GEOS > 3.8.1 has changes that break some of the tests

@GuillaumeGomez
Copy link
Member

I believe I've made the requested updates, but am uncertain about the architecture. From what I could tell, gtk-rs uses a different binding creation tool that seems specific to the GTK stack, but I tried to mimic my understanding of the overall approach by splitting binding generation from geos-sys compile steps.

Yes, we use a tool which generates the API based on a GNOME specific tool which describes the library. What I meant was mostly that the tool was not included in the build process and relied on libc for the C types.

I moved the bindgen binding generation to a separate folder geos-sys-bind, and added instructions on running this manually. It now also strips comments and migrates f64 => libc::c_double, f32 => libc::c_float.

Is this separate build step in a separate directory what you had in mind? It seems like we don't want it to live within the sys folder.

Yup, perfect! The sys crate generation should be on its own.

The geos-sys compile step now relies on the user specifying a version feature that is no greater than their installed version of GEOS; it will error if they request a version higher than what they have locally or if their local version is too old (<3.6).

Great. :)

I refactored the CI tests a little bit more; since each version builds on the one prior, I just enabled version feature flags for the specific version of GEOS being built in that run.

Tests are still failing because of:
* existing valgrind issues?
* GEOS > 3.8.1 has changes that break some of the tests

For valgrind failure, it's "normal" (not from this PR). As for the tests failing because of GEOS > 3.8.1, they will need to be updated I guess. I'll go through the changes beforehand.

- "3.8.3"
- "3.9.3"
- "3.10.3"
- "3.11.0beta1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea to test not yet stable versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other projects based on GEOS (e.g., Shapely - python bindings), we test against GEOS main branch as part of the CI suite, but we allow those to fail without failing the suite. This has helped us catch incoming changes that break our tests (most often due to GEOS bug fixes or changes in coordinate order).

However, that's a bit separate than how I have it here: 3.11.0beta1 is intended to be a placeholder that gets updated to 3.11.0 when released (should be soon?) or gets removed. Sorry, I should have flagged this part as needing to be revisited prior to merge.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't add support for not yet stable versions. It's much simpler to just add its support once its released rather that updating code if it has breaking change between the currently supported version and the one that got actually stabilized.

@brendan-ward
Copy link
Contributor Author

I've gone ahead and removed 3.11 since it didn't get further toward release while working on this.

But I didn't see where you needed to have X.Y.Z

This is used in the GEOS build step:

git checkout "tags/${{ matrix.geos }}"

What I meant is that if we are using rustdoc (cfg(doc)) and not rustc, we should not link the libraries.

I must be doing something wrong or don't completely understand what you are asking for here, because I'm not able to get #[cfg(doc)] or #![cfg(doc)] to work for conditional compilation. Instead I used the dox feature like I saw in gtk-rs. Is that the right way to handle this?

One of my recent commits is now breaking the clippy check (raising type complexity errors in main crate) - even though those pass locally, so this isn't quite ready for code review yet.

@GuillaumeGomez
Copy link
Member

Using dox is fine yes. So when dox is enabled, there should be no linking performed.

I'm sorry if I'm a bit slow on some things (mostly on the linking part with pkg), it's things I'm not much used to so I'm trying to not miss any potential issue that could arise.

@brendan-ward brendan-ward marked this pull request as ready for review June 27, 2022 23:34
@brendan-ward
Copy link
Contributor Author

I'm seeing clippy errors on nightly (which fails in the CI for this PR) but those appear unrelated to this PR; they now fail for earlier commits that had originally passed successfully - so I think they are related to toolchain updates rather than code changes here.

These occur in the main crate:

error: very complex type used. Consider factoring parts into `type` definitions
  --> src/context_handle.rs:67:21
   |
67 |     notif_callback: Mutex<Box<dyn Fn(&str) + Send + Sync + 'a>>,
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |

@brendan-ward brendan-ward changed the title WIP: migrate geos-sys to bindgen and prebuilt GEOS bindings ENH: migrate geos-sys to bindgen and prebuilt GEOS bindings Jun 27, 2022
@GuillaumeGomez
Copy link
Member

I'm seeing clippy errors on nightly (which fails in the CI for this PR) but those appear unrelated to this PR; they now fail for earlier commits that had originally passed successfully - so I think they are related to toolchain updates rather than code changes here.

You didn't update the main crate so no need to worry. I'll send a PR to fix them so you'll just need to rebase on it.

@GuillaumeGomez
Copy link
Member

Done in #114.

@brendan-ward
Copy link
Contributor Author

Thanks for the fix to the clippy warnings! Clippy test now pass; other tests are GEOS version specific and probably should be fixed in a follow-up PR.

@brendan-ward
Copy link
Contributor Author

Added 3.11 back in now that it is officially released

sys/build.rs Outdated Show resolved Hide resolved
@GuillaumeGomez
Copy link
Member

There still remains a big few issues with this approach:

  • You can't generate documentation for all versions at once (with nice cfg markers in the docs). And as far as I can tell, there is almost no documentation generated at it (but maybe I missed it?).
  • If an item's signature is modified in a newer version, you'll get compilation errors. Not sure how important this is though since it depends on a cfg. Let's say it's very minor for the time being.
  • It requires a crazy big build.rs. Even though I think it's fine as is.
  • The release of the geos-sys seems very complicated to do.

@brendan-ward
Copy link
Contributor Author

You can't generate documentation for all versions at once (with nice cfg markers in the docs). And as far as I can tell, there is almost no documentation generated at it (but maybe I missed it?)

I'm not seeing any documentation currently generated for the bindings in geos-sys: https://docs.rs/crate/geos-sys/2.0.4
(georust/gdal-sys has no docs, georust/proj-sys is similar to here)

The documentation for the main crate should still work properly, and running cargo doc locally produces at least doc stubs for the structs / functions in geos-sys. These do not otherwise have documentation because we specifically excluded the docstrings due to license issues.

What documentation is expected for geos-sys?

I'm also not clear on how we should approach documentation for multiple versions of the bindings, but my sense is that implementers of new rust-friendly bindings in the main create would be reading either the binding source code or GEOS C API rather than any generated docs (esp. since we can't include the GEOS docstrings due to license issues).

If an item's signature is modified in a newer version, you'll get compilation errors

Assuming that you are referring to a binding against a GEOS C API function? For ABI compatibility (which GEOS watches carefully), those signatures should not change over time, and the mapping of those to rust types should be pretty stable since they all use the libc types per your direction. I'm hopeful that the signatures for existing functions remain stable even if we generate bindings again using a newer version of bindgen.

It requires a crazy big build.rs

It's not crazy big at ~200 lines; do you see other places we could trim this? I've tried to simplify it as much as possible.

The release of the geos-sys seems very complicated to do

I've tried to document steps for generating bindings in geos-sys-bind/README.md. I'm biased since I wrote the steps, but it was quite easy to generate new bindings for 3.11 once it became available. Are there parts of that README that we can improve to make the release process easier?

I'm not clear on what other complexities there are to releasing an updated version of the `geos-sys' crate; if you could please elaborate I can see if there are ways we can improve documentation around those. Having good release instructions will make this easier to maintain. :)

@GuillaumeGomez
Copy link
Member

I'm not seeing any documentation currently generated for the bindings in geos-sys: https://docs.rs/crate/geos-sys/2.0.4 (georust/gdal-sys has no docs, georust/proj-sys is similar to here)

The documentation for the main crate should still work properly, and running cargo doc locally produces at least doc stubs for the structs / functions in geos-sys. These do not otherwise have documentation because we specifically excluded the docstrings due to license issues.

What documentation is expected for geos-sys?

Just the full list of all items. No need to have documentation on them. Currently there is no item listed with your PR.

I'm also not clear on how we should approach documentation for multiple versions of the bindings, but my sense is that implementers of new rust-friendly bindings in the main create would be reading either the binding source code or GEOS C API rather than any generated docs (esp. since we can't include the GEOS docstrings due to license issues).

A way to do it would be to make a diff between each version and then add a check in the code generating the bindings to add #[doc(cfg(v3_x_y)] above the item. It requires to have a list in the code generator directly but I "think" it's not too horrible.

Assuming that you are referring to a binding against a GEOS C API function? For ABI compatibility (which GEOS watches carefully), those signatures should not change over time, and the mapping of those to rust types should be pretty stable since they all use the libc types per your direction. I'm hopeful that the signatures for existing functions remain stable even if we generate bindings again using a newer version of bindgen.

Even if it did create compilation errors, since it's behind a cfg, I think it's acceptable since you have to set it "by hand".

It's not crazy big at ~200 lines; do you see other places we could trim this? I've tried to simplify it as much as possible.

I'm not sure. Just a concern, not a blocker. I should have precised that, sorry.

I've tried to document steps for generating bindings in geos-sys-bind/README.md. I'm biased since I wrote the steps, but it was quite easy to generate new bindings for 3.11 once it became available. Are there parts of that README that we can improve to make the release process easier?

The big issue is the std::fs::copy you're currently using. If removed, I think it will not be a problem anymore.

Just to be clear: we're getting on the last part of this. I'm very much of favor of merging it but just want to be absolutely sure all will go well. So sorry if I'm a bit annoying with the nitpicking.

@brendan-ward
Copy link
Contributor Author

Unless I'm missing something, I'm not seeing that the main crate uses version-specific documentation flags, so I'm not clear on why we'd want to add them to geos-sys. I think that diffing binding versions and then tracking which features are added in order to inject #[doc(cfg(v3_x_y)] into the bindings seems like a lot of work when really the best way to describe changes to GEOS over time is to crosslink directly to the GEOS changelog. Someone implementing new functionality in the main crate should be familiar with going directly to the GEOS source code for more information (esp. since we have to omit the docstrings here due to licensing issues).

Instead, I think we can generate docs for geos-sys based on the latest version of GEOS (which is also the bundled version used for static build feature); that way they are the most comprehensive. I'll add a note indicating these are generated from that version and crosslink to the GEOS changelog.

@brendan-ward
Copy link
Contributor Author

Docs should now be generating using the latest version; I think I must have had them disabled via cfg settings previously.

@GuillaumeGomez
Copy link
Member

It seems now ready for me. Please fix CI and clean up your commit history.

@brendan-ward
Copy link
Contributor Author

fix CI

I assume you mean the lint error? Fixed now. If you meant something else, like fixing the failing tests, please clarify. (note: I've been working on fixing those tests for a follow-on PR)

clean up your commit history

Do you mean squash all commits? I'd suggest instead using the "Squash and merge" feature in Github when you merge this; that is the common workflow used in many other projects. I'd rather avoid doing a local squash and force-push as that may disconnect the within-code review comments in this PR (keeping that history here seems useful).

@GuillaumeGomez
Copy link
Member

I assume you mean the lint error? Fixed now. If you meant something else, like fixing the failing tests, please clarify. (note: I've been working on fixing those tests for a follow-on PR)

I meant fmt and lint errors indeed.

As for clean up, I meant up clean up. Unless you're fine to have everything in one commit. In that case I can indeed just do "squash and merge".

@brendan-ward
Copy link
Contributor Author

A single squash & merge commit is just fine! The more detailed history, if ever needed, is in this PR.

Thanks for your patience with me while I worked this out, greatly appreciated. 😄

@GuillaumeGomez
Copy link
Member

It was a big PR so it was a given that there were a lot of things that needed to be looked into. Thanks for going through with this!

@GuillaumeGomez GuillaumeGomez merged commit 7388032 into georust:master Jul 22, 2022
@brendan-ward brendan-ward deleted the try_bindgen branch August 4, 2022 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants