Port the spatial join algorithm from geopandas #52
Comments
@brendan-ward what were your ideas regarding spatial joins? Since you mentioned "This should make it easier to implement spatial joins on top of pygeos." in #87: I think we should still have a spatial join algorithm in pygeos itself (having the outer loop in C as well will still be beneficial). It could reuse the
I think the API we had in the GeoPandas Cython branch (see links above), returning two arrays of indices (indices into the first and second set of geometries, respectively), is probably the most general one.
@jorisvandenbossche agreed on all points. The general thought was just what you said, we need a vectorized way to loop over queries to the tree. Spatial join is basically a bulk query, with a predicate applied after, and possibly some smarts about the internal direction of the join to best leverage prepared geometries. A couple ways we could achieve this (without thinking it through all the way):
Above and beyond how the tree is queried, there is additional work that goes on within geopandas'
Is this solved with the bulk_query method that was implemented in #108?
Duplicate of #135
This is indeed essentially
In the cython branch in GeoPandas, we had a C implementation of a spatial join algorithm (using the STRTree from GEOS). This was added in geopandas/geopandas#475
Since it's written in C directly against the GEOS C API, it would be nice to include it in this package (primarily for use in geopandas, but it might also be useful in general).
One question is how it would fit in the current API of pygeos.
The version in GeoPandas is currently written as a C function that gets 2 C arrays of geometry objects:
which is then wrapped in cython to provide a python interface with a signature of
where left_out and right_out are arrays of indices into left and right, respectively, for all the matches according to the predicate.
I think it is difficult to generalize this to multiple dimensions, the way all other functions in pygeos currently do.
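To make the two-index-array return convention concrete, here is a minimal pure-Python sketch. It is not the GeoPandas C implementation and does not use GEOS: geometries are simplified to axis-aligned boxes, and a brute-force envelope scan stands in for the STRtree bulk query. The names sjoin_indices, Box, intersects and contains are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: geometries are simplified to boxes (minx, miny, maxx, maxy).
Box = Tuple[float, float, float, float]


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(a: Box, b: Box) -> bool:
    """True if box a fully covers box b."""
    return a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2] and a[3] >= b[3]


def sjoin_indices(
    left: Sequence[Box],
    right: Sequence[Box],
    predicate: Callable[[Box, Box], bool] = intersects,
) -> Tuple[List[int], List[int]]:
    """Return (left_out, right_out): parallel index lists, one pair per match."""
    left_out: List[int] = []
    right_out: List[int] = []
    for i, a in enumerate(left):
        # Phase 1: candidate search by envelope overlap.
        # A real implementation would query a spatial index here instead
        # of scanning every geometry on the right-hand side.
        candidates = [j for j, b in enumerate(right) if intersects(a, b)]
        # Phase 2: apply the exact predicate to the candidates only.
        for j in candidates:
            if predicate(a, right[j]):
                left_out.append(i)
                right_out.append(j)
    return left_out, right_out


left = [(0, 0, 2, 2), (5, 5, 6, 6)]
right = [(1, 1, 3, 3), (0.5, 0.5, 1, 1), (10, 10, 11, 11)]

pairs_intersects = sjoin_indices(left, right, intersects)
pairs_contains = sjoin_indices(left, right, contains)
```

Because the result is a flat list of (left index, right index) pairs, its length depends on the number of matches rather than on the shape of either input, which is one way to see why this output does not broadcast like an element-wise function.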