ENH: migrate geos-sys to bindgen and prebuilt GEOS bindings #113
So a few things:
Thanks for the fast response!
This is only done when bindings aren't already available for your version of GEOS via an opt-in feature ( The impact of this is based on the interaction with checking against your system installed version of GEOS. Let's say that bindings are available for 3.7 - 3.10, and GEOS 3.11 is released and installed as system GEOS (a side goal of this PR). If we let you choose instead of autodetect the version of GEOS, if you choose 3.11, we could error with a message that 3.11 isn't available, but 3.10 is, so you can then choose (via a feature, I guess?) to use 3.10 instead. You just don't get to use any of the 3.11 features until pre-built bindings are available (a maintenance task in this project). So - the outcome is that the build still works (except maybe for hard-deprecated features, but those are very rare in GEOS C API), for a lesser version of GEOS. Thus 3.11 (or latest version of GEOS) doesn't become an option until pre-built bindings are published within this crate. Since that doesn't break the user, I guess that is OK; it would be much worse if we enforced that the bindings must match your system GEOS. But we'd still need a build script and a feature for enabling binding generation, so I guess we'd just advertise that is intended for crate devs instead of end users?
So this sounds like sticking with version-specific feature flags, at least during build of geos-sys? I think there is a distinction that is potentially important with providing support for the crate: regardless of what version they select, we need to know what version of GEOS they linked against to troubleshoot things because there are version-specific GEOS bugs, fixes, and changes (i.e., some versions produce differently ordered coords, as we see here, or fix implementations underneath the API). So if I have 3.11 installed, but use version flags to select 3.9, I'd have to include 3.11 in my bug report here (not a problem, just maybe something to capture via an issue template). The other upshot is that we or the user needs to be sure they don't select a version flag beyond their system GEOS, or bad things happen (hopefully caught at build time rather than runtime).
Good catch! We should be able to exclude these.
I did for the libc types that were specifically marked in the hand-generated bindings but will check those more thoroughly now. These use Are there specific types that I should watch for, that aren't already automatically handled by bindgen here? We specifically use It is possible that the types generated for Is there anywhere here we are testing against multiple target architectures to verify stability of the bindings (either original hand-generated ones or bindgen ones)? I didn't see this in the CI.
This interacts with the first point above. If I have 3.11 installed but no bindings available, and I use a feature to pick 3.10, what should this do? Emit a warning that I'm falling back to an older binding? (build warnings don't seem easily accessible to users) Do you instead envision this as the mechanism to prevent using a newer API than your local GEOS? E.g., I have 3.7 GEOS installed, but I choose the 3.10 feature. Seems like panicking out of the build at this point is a good idea, so
Same answer, it should never be done at compile time. You can ask for a minimum supported version with
Yup, exactly.
I think users can retrieve this information without trouble by themselves. We could eventually add it to the issue template, but I'm not sure it would be very useful...
You're right, sorry I focused on the
One thing for sure, never trust C types. If you want to know why, try to answer these two questions:
So please keep using libc types, even if the conversion seems straightforward. It's always a trap. 😆
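To illustrate why the C type aliases matter, here is a minimal sketch that inspects a few of them on the current target. It uses `std::os::raw`, which provides the same aliases as the `libc` crate for these types, so that it compiles without dependencies; `c_type_report` and `MessageHandler` are hypothetical names made up for this example and are not part of geos-sys.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_long, c_void};

/// Hypothetical callback signature, modeled on a C callback that takes
/// `const char *` and `void *`. Spelling it with the C aliases keeps it
/// correct on every target, unlike hard-coding `*const i8`.
#[allow(dead_code)]
type MessageHandler = unsafe extern "C" fn(message: *const c_char, userdata: *mut c_void);

/// Summarize properties of C types that differ between targets.
fn c_type_report() -> String {
    // `c_char` is `i8` on some targets and `u8` on others,
    // so its minimum value tells us whether it is signed here.
    let char_is_signed = c_char::MIN != 0;
    format!(
        "c_char signed: {}, c_int: {} bytes, c_long: {} bytes",
        char_is_signed,
        size_of::<c_int>(),
        size_of::<c_long>()
    )
}
```

Printing `c_type_report()` on different targets should show different answers for the signedness of `c_char` and the size of `c_long`, which is why a fixed-width Rust type in a binding can be right on one platform and wrong on another.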
It's one of the issues that I encountered with bindgen but it's rare unless the library is strongly relying on system APIs. But generally it's hidden behind types.
You try to link to the version asked for by the user and, if it's not available, you throw an error. Note that if the version asked for by the user is lower than the one currently available, that's perfectly fine. My reference for all this is
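The rule described here could be sketched roughly as follows, assuming versions are compared as `(major, minor)` pairs. `check_requested` is a hypothetical helper, not a function from this PR; a build script would call something like it once the installed version is known.

```rust
/// Accept the build when the requested GEOS version does not exceed the
/// installed one; otherwise return an error message for the build to report.
fn check_requested(requested: (u32, u32), installed: (u32, u32)) -> Result<(), String> {
    // Tuples compare lexicographically: major first, then minor.
    if requested <= installed {
        Ok(())
    } else {
        Err(format!(
            "GEOS {}.{} was requested but only {}.{} is installed",
            requested.0, requested.1, installed.0, installed.1
        ))
    }
}
```

With this rule, requesting 3.9 against an installed 3.11 succeeds, while requesting 3.10 against an installed 3.7 fails at build time instead of at link time or runtime.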
I believe I've made the requested updates, but am uncertain about the architecture. From what I could tell, I moved the Is this separate build step in a separate directory what you had in mind? It seems like we don't want it to live within the The I refactored the CI tests a little bit more; since each version builds on the one prior, I just enabled version feature flags for the specific version of GEOS being built in that run. Tests are still failing because of:
Yes, we use a tool which generates the API based on a GNOME specific tool which describes the library. What I meant was mostly that the tool was not included in the build process and relied on
Yup, perfect! The
Great. :)
The valgrind failure is "normal" (not from this PR). As for the tests failing because of GEOS > 3.8.1, they will need to be updated I guess. I'll go through the changes beforehand.
.github/workflows/CI.yml
- "3.8.3"
- "3.9.3"
- "3.10.3"
- "3.11.0beta1"
I don't think it's a good idea to test not yet stable versions.
In other projects based on GEOS (e.g., Shapely - Python bindings), we test against the GEOS main branch as part of the CI suite, but we allow those to fail without failing the suite. This has helped us catch incoming changes that break our tests (most often due to GEOS bug fixes or changes in coordinate order).
However, that's a bit different from how I have it here: 3.11.0beta1 is intended to be a placeholder that gets updated to 3.11.0 when released (should be soon?) or gets removed. Sorry, I should have flagged this part as needing to be revisited prior to merge.
Please don't add support for not-yet-stable versions. It's much simpler to add support once a version is released than to update code if there is a breaking change between the currently supported version and the one that actually gets stabilized.
I've gone ahead and removed 3.11 since it didn't get further toward release while working on this.
This is used in the GEOS build step:
I must be doing something wrong or don't completely understand what you are asking for here, because I'm not able to get One of my recent commits is now breaking the clippy check (raising type complexity errors in main crate) - even though those pass locally, so this isn't quite ready for code review yet.
Using I'm sorry if I'm a bit slow on some things (mostly on the linking part with
I'm seeing clippy errors on nightly (which fails in the CI for this PR) but those appear unrelated to this PR; they now fail for earlier commits that had originally passed successfully - so I think they are related to toolchain updates rather than code changes here. These occur in the main crate:
error: very complex type used. Consider factoring parts into `type` definitions
  --> src/context_handle.rs:67:21
   |
67 |     notif_callback: Mutex<Box<dyn Fn(&str) + Send + Sync + 'a>>,
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
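For reference, the fix that this lint suggests is to name the complex type with a `type` alias. The sketch below shows the general shape only; `NotifCallback` and `Handle` are hypothetical names, and this is not necessarily how the main crate resolved it.

```rust
use std::sync::Mutex;

/// Alias for the callback storage type flagged by clippy's `type_complexity` lint.
type NotifCallback<'a> = Mutex<Box<dyn Fn(&str) + Send + Sync + 'a>>;

/// Hypothetical stand-in for the struct that owns the callback.
struct Handle<'a> {
    notif_callback: NotifCallback<'a>,
}

impl<'a> Handle<'a> {
    /// Store any thread-safe closure that accepts a message.
    fn new<F: Fn(&str) + Send + Sync + 'a>(f: F) -> Self {
        Handle {
            notif_callback: Mutex::new(Box::new(f)),
        }
    }

    /// Invoke the stored callback with `msg`.
    fn notify(&self, msg: &str) {
        let cb = self.notif_callback.lock().unwrap();
        (*cb)(msg);
    }
}
```

The alias changes nothing at runtime; it only gives the field a readable name and keeps the full type in a single place.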
You didn't update the main crate so no need to worry. I'll send a PR to fix them so you'll just need to rebase on it.
Done in #114.
Thanks for the fix to the clippy warnings! The clippy checks now pass; the other failing tests are GEOS version specific and should probably be fixed in a follow-up PR.
Added 3.11 back in now that it is officially released
There still remain a few big issues with this approach:
I'm not seeing any documentation currently generated for the bindings in geos-sys: https://docs.rs/crate/geos-sys/2.0.4 The documentation for the main crate should still work properly, and running What documentation is expected for I'm also not clear on how we should approach documentation for multiple versions of the bindings, but my sense is that implementers of new Rust-friendly bindings in the main crate would be reading either the binding source code or GEOS C API rather than any generated docs (esp. since we can't include the GEOS docstrings due to license issues).
Assuming that you are referring to a binding against a GEOS C API function? For ABI compatibility (which GEOS watches carefully), those signatures should not change over time, and the mapping of those to rust types should be pretty stable since they all use the
It's not crazy big at ~200 lines; do you see other places we could trim this? I've tried to simplify it as much as possible.
I've tried to document steps for generating bindings in I'm not clear on what other complexities there are to releasing an updated version of the `geos-sys` crate; if you could please elaborate I can see if there are ways we can improve documentation around those. Having good release instructions will make this easier to maintain. :)
Just the full list of all items. No need to have documentation on them. Currently there is no item listed with your PR.
A way to do it would be to make a diff between each version and then add a check in the code generating the bindings to add
Even if it did create compilation errors, since it's behind a
I'm not sure. Just a concern, not a blocker. I should have made that clear, sorry.
The big issue is the Just to be clear: we're getting to the last part of this. I'm very much in favor of merging it but just want to be absolutely sure all will go well. So sorry if I'm a bit annoying with the nitpicking.
Unless I'm missing something, I'm not seeing that the main crate uses version-specific documentation flags, so I'm not clear on why we'd want to add them to Instead, I think we can generate docs for
…, expand instructions
Docs should now be generating using the latest version; I think I must have had them disabled via
It now seems ready to me. Please fix CI and clean up your commit history.
I assume you mean the lint error? Fixed now. If you meant something else, like fixing the failing tests, please clarify. (note: I've been working on fixing those tests for a follow-on PR)
Do you mean squash all commits? I'd suggest instead using the "Squash and merge" feature in GitHub when you merge this; that is the common workflow used in many other projects. I'd rather avoid doing a local squash and force-push as that may disconnect the within-code review comments in this PR (keeping that history here seems useful).
I meant As for the clean up, I meant cleaning up the commit history. Unless you're fine with having everything in one commit. In that case I can indeed just do "squash and merge".
A single squash & merge commit is just fine! The more detailed history, if ever needed, is in this PR. Thanks for your patience with me while I worked this out, greatly appreciated. 😄
It was a big PR so it was a given that there were a lot of things that needed to be looked into. Thanks for going through with this!
This builds out a bindgen based build step roughly similar to georust/gdal, which uses prebuilt bindings for each supported version. This should make it easier to add new versions in the future and version-specific behavior to the main crate.
Adds prebuilt bindings for GEOS:
I have been unable to get GEOS 3.6.5 to build statically, and I believe 3.6.x is no longer being maintained (last update was late 2020); it might make sense to drop GEOS 3.6 support from here relatively soon.
This adds a check against a minimum supported version of GEOS, which seems like a good idea to avoid very old versions.
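A minimum-version check like this needs a parsed version to compare against. Below is a minimal sketch of one way to do that, assuming the version arrives as a string such as `3.10.3` or `3.11.0beta1`; `parse_version` and `meets_minimum` are hypothetical helpers and the actual build script may do this differently.

```rust
/// Parse "major.minor[.patch][suffix]" into numeric components.
/// A pre-release suffix on the patch component (e.g. "0beta1") is ignored.
fn parse_version(raw: &str) -> Option<(u32, u32, u32)> {
    let mut parts = raw.trim().splitn(3, '.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    // Keep only the leading digits of the patch component; default to 0 if absent.
    let patch_digits: String = parts
        .next()
        .unwrap_or("0")
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    let patch: u32 = patch_digits.parse().ok()?;
    Some((major, minor, patch))
}

/// True when `found` is at least `minimum` (lexicographic tuple comparison).
fn meets_minimum(found: (u32, u32, u32), minimum: (u32, u32, u32)) -> bool {
    found >= minimum
}
```

The minimum itself is passed in as a parameter here because the exact threshold is a project decision.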
The build step uses either `pkg-config` (only available for GEOS >= 3.9) or falls back to `geos-config` to attempt to discover the location and version of system-installed GEOS and dynamically link to it. This seems to work OK for Linux / macOS; I don't see that there was previous support here for Windows builds, so I didn't add any in this PR (could presumably be added in a future PR). You can also set a few env vars to specify the location of GEOS.
The static build of GEOS is roughly as it was before, and should always correspond to a version for which there are bindings available (since it is based on a particular commit in the GEOS submodule). I was able to use that to build version-specific bindings for most versions.
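The discovery order described above (try `pkg-config`, then fall back to `geos-config`) can be sketched as below. This version shells out to the two tools directly for the sake of a dependency-free example; the real build script may well use the `pkg-config` crate instead, and `query` and `detect_version` are hypothetical names.

```rust
use std::process::Command;

/// Run `program` with `args` and return its trimmed stdout,
/// or `None` if the tool is missing, fails, or prints nothing.
fn query(program: &str, args: &[&str]) -> Option<String> {
    let output = Command::new(program).args(args).output().ok()?;
    if !output.status.success() {
        return None;
    }
    let text = String::from_utf8(output.stdout).ok()?;
    let text = text.trim();
    if text.is_empty() {
        None
    } else {
        Some(text.to_string())
    }
}

/// Ask pkg-config for the GEOS version first, then fall back to geos-config.
fn detect_version() -> Option<String> {
    query("pkg-config", &["--modversion", "geos"])
        .or_else(|| query("geos-config", &["--version"]))
}
```

If neither tool is available, `detect_version` returns `None`, at which point a build script would need the location supplied some other way (for instance through the env vars mentioned above) or would have to fail with a clear message.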
This removes the hand-crafted GEOS bindings and associated script. The bindgen bindings here use `libc`, similar to the original bindings, and also specifically exclude functions excluded previously. Overall, the bindings look nearly identical; I didn't spot meaningful differences.
This deprecates version features from geos-sys (they are retained but have no effect); these are obviated by having version-specific bindings. At some point soon (if this gets merged), version features should be deprecated and replaced by version detection similar to georust/gdal.
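One possible shape for version detection replacing feature flags: the build script prints a `cargo:rustc-cfg=...` line for every supported version up to the detected one, and downstream code gates on those cfgs. The sketch below only builds the lines; the `geos_3_x` cfg names and the `cfg_lines` helper are hypothetical and not taken from this PR or from georust/gdal.

```rust
/// Build one `cargo:rustc-cfg` directive per known version that does not
/// exceed the detected GEOS version. A build script would print each line.
fn cfg_lines(detected: (u32, u32), known: &[(u32, u32)]) -> Vec<String> {
    known
        .iter()
        .filter(|v| **v <= detected)
        .map(|(major, minor)| format!("cargo:rustc-cfg=geos_{}_{}", major, minor))
        .collect()
}
```

Code in the main crate could then use `#[cfg(geos_3_9)]` in place of `#[cfg(feature = "...")]`, so the available API follows whatever GEOS was actually found at build time.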
Since this includes breaking changes to the geos-sys build process (i.e., if autodetection of GEOS fails badly or bindings aren't available for a specific version), we should probably note that someplace, but I haven't added anything yet.
Most of the tests pass except for a few in more recent versions of GEOS that are based on underlying GEOS logic changes that will need to be better handled in those tests (e.g., coordinate order within polygons). I haven't yet investigated the valgrind related errors, but I see that is failing on the master branch too. I haven't yet seen any errors that indicate a fault in the bindings.
This includes some updates to the CI config to run on all the GEOS versions for which pre-built bindings are available. These leverage the GEOS submodule and attempt to use ccache to speed up the build, though that seems to be held up by the test failures.
Lastly, I'm pretty new to Rust, so there is always the chance that I've made dumb or non-idiomatic mistakes here.
(keeping this as draft until I can track down the failures & review memory leaks)