ENH: Exclude same input geometry from output of nearest_all #324
Comments
We may use the same API as scipy's KDTree here:
@brendan-ward that's a good insight about the cost of an equality test. In https://github.com/Toblerity/Shapely/pull/1166/files#diff-6108807fae8f58167beefb19a5c3270a442c26cc1e5b3b62e0ce25ebbe4673c3R254 I'm promising an
This is partly why we return all equidistant results via It looks like your implementation would extend nicely to
Performance might be a reason to choose the pointer method, but the two methods can also give different results. The tree can contain multiple geometries that are not the identical object but are spatially equal: the pointer method excludes only the identical geometry, while the equality method excludes all equal geometries. I don't have a good sense of which of the two behaviours is preferable, though (it is probably use case dependent).
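To make the difference between the two behaviours concrete, here is a minimal pure-Python sketch. It is not the pygeos or Shapely API: `Point`, `distance`, and `nearest_excluding` are hypothetical names, and a brute-force scan stands in for the tree.

```python
import math


class Point:
    """Hypothetical stand-in for a geometry that compares by value."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


def distance(a, b):
    return math.hypot(a.x - b.x, a.y - b.y)


def nearest_excluding(query, geoms, same):
    """Brute-force nearest index, skipping items for which same(query, item) is true."""
    candidates = [
        (distance(query, g), i) for i, g in enumerate(geoms) if not same(query, g)
    ]
    return min(candidates)[1]


# Two distinct objects at the same location, plus one further away.
geoms = [Point(0, 0), Point(0, 0), Point(3, 0)]

# Identity ("pointer") exclusion: only the query object itself is skipped,
# so the spatially equal duplicate at index 1 is returned.
by_identity = nearest_excluding(geoms[0], geoms, lambda a, b: a is b)

# Equality exclusion: every spatially equal geometry is skipped,
# so the first genuinely different geometry at index 2 is returned.
by_equality = nearest_excluding(geoms[0], geoms, lambda a, b: a == b)
```

With duplicated inputs the two rules pick different neighbours, which is the behavioural difference (as opposed to the performance difference) under discussion.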
Agreed. I think some of this comes down to the degree of duplication of geometries present within the source geometries, and whether or not those are valid results for your analysis (fully equivalent vs functionally equivalent). Though as precedent, the shapely method uses equality of geometries. I think it depends, like you say, on what the user is trying to accomplish:
The user could use a But unless they did a spatial deduplication step prior to constructing the tree, they wouldn't be able to easily force nearest to find the nearest geometry that is not spatially equal. So - long story short - the equals method as established by Shapely is probably the best here, and perhaps we just alert the user to performance issues of equality tests for complex geometries that are spatially equal but not the same input geometry (should be pretty rare, right?).
I agree that we should stick to “equals” and not go for identity (pointer equality). I don’t see a pygeos geometry as something that has an identity. Geometries are like integers or strings: immutable, and only their value gives them meaning. Within the context of a tree, we could only do an exclusion by identity if one specified the query geometry as an index into the tree (e.g. tree.query(geom_idx)), but something like that would have very limited usability.
In this issue in Shapely, there is a need to find the nearest items in a tree constructed of the same geometries that are being used to query the tree, but return the non-self nearest neighbor(s).
One way to approach this would be to provide a new function that does not take new geometry inputs, but instead uses the underlying tree geometries in such a way that we can compare addresses of a geometry to the results returned within the nearest callback function, so as to avoid a more expensive test for equality. We would then return an arbitrarily large distance for the self-pair, to force the tree to look for the next nearest neighbor.
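The idea of returning an arbitrarily large distance for the self-pair can be sketched in pure Python as follows. This is only an illustration under stated assumptions: item indices stand in for geometry addresses, a linear scan stands in for the tree traversal, and `make_item_distance` and `nearest_non_self` are hypothetical names rather than existing pygeos functions.

```python
import math


def make_item_distance(coords):
    """Build a distance callback over item indices (hypothetical helper)."""

    def item_distance(query_idx, item_idx):
        # Cheap identity check in place of a geometric equality test:
        # the self-pair reports an infinite distance so it can never win.
        if query_idx == item_idx:
            return math.inf
        (x1, y1), (x2, y2) = coords[query_idx], coords[item_idx]
        return math.hypot(x2 - x1, y2 - y1)

    return item_distance


def nearest_non_self(coords):
    """Return (query_idx, nearest_idx) pairs, never pairing an item with itself."""
    item_distance = make_item_distance(coords)
    pairs = []
    for q in range(len(coords)):
        best = min(range(len(coords)), key=lambda i: item_distance(q, i))
        # If even the best candidate is infinitely far, there is no other item.
        if math.isfinite(item_distance(q, best)):
            pairs.append((q, best))
    return pairs


coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
pairs = nearest_non_self(coords)
```

Because the check compares indices rather than coordinates, a duplicate geometry at another index is still a valid neighbour here, which matches the identity-based behaviour rather than the equality-based one.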
/cc @liunx7594