STRtree with a reverse mapping and query_items/query_geoms methods #1112
"""
if self._n_geoms == 0:
if self._tree is None or not self._rev:
return None
Raising an exception here could be less ambiguous than returning None, which is easy to confuse with an empty sequence (a perfectly normal query result). But that's an API change to save for 2.0.
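A minimal sketch of the ambiguity described above. `ToyTree`, `EmptyTreeError`, and both method names are hypothetical stand-ins working on plain bounding boxes, not shapely's actual `STRtree` API. With a `None` return, a truthiness check cannot tell an empty tree from a query with no hits; an exception keeps the two cases apart.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


class EmptyTreeError(ValueError):
    """Hypothetical error for querying a tree that holds no geometries."""


def _intersects(a: Box, b: Box) -> bool:
    # Axis-aligned boxes overlap when they overlap on both axes.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class ToyTree:
    """Brute-force stand-in for a spatial index; not shapely's STRtree."""

    def __init__(self, boxes: Sequence[Box]):
        self._boxes = list(boxes)

    def query_or_none(self, box: Box) -> Optional[List[int]]:
        # Current style: None signals "empty tree".
        if not self._boxes:
            return None
        return [i for i, b in enumerate(self._boxes) if _intersects(b, box)]

    def query_or_raise(self, box: Box) -> List[int]:
        # Alternative style: an empty tree is an error, [] means "no hits".
        if not self._boxes:
            raise EmptyTreeError("tree contains no geometries")
        return [i for i, b in enumerate(self._boxes) if _intersects(b, box)]


empty = ToyTree([])
populated = ToyTree([(0, 0, 1, 1)])
far_away = (10, 10, 11, 11)

# Both results are falsy, so `if not result:` cannot distinguish them.
none_result = empty.query_or_none(far_away)
miss_result = populated.query_or_none(far_away)
```

Callers of the `None`-returning variant have to remember an explicit `is None` check; callers of the raising variant can treat every return value as a list.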
@jorisvandenbossche @brendan-ward what do you think of this modified version of #1064? Does it have any big problems from the C implementation perspective? It took me a while to remember how important it is to stick to storing C types when it comes to releasing the GIL. Thanks for being patient with me about that. I'd still like to work on the feature requested in #1106. I wonder if a list of integer items to exclude or skip would be the way to go. After all, these are what we're storing in the tree, and they are immediately accessible to the callback without returning to Python land. That means we can exclude or skip items while the GIL is released.
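To make the skip-list idea concrete, here is a rough pure-Python sketch of the filtering logic only. The `query_items` function and its `exclude` parameter are assumptions made up for illustration, not an existing shapely signature, and plain Python cannot show the GIL being released; the point is just that the check needs nothing but the stored integers.

```python
from typing import Iterable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def _intersects(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_items(
    boxes: Sequence[Box],
    items: Sequence[int],
    query: Box,
    exclude: Iterable[int] = (),
) -> List[int]:
    """Return stored integer items whose box intersects `query`.

    Hypothetical helper: in a C-backed tree the loop body would be the
    query callback, which only sees the stored integers.
    """
    skip = set(exclude)
    hits = []
    for box, item in zip(boxes, items):
        if item in skip:
            # Skipped using the integer alone; no Python object lookup needed.
            continue
        if _intersects(box, query):
            hits.append(item)
    return hits


boxes = [(0, 0, 1, 1), (0.5, 0.5, 2, 2), (5, 5, 6, 6)]
items = [10, 20, 30]
all_hits = query_items(boxes, items, (0, 0, 1, 1))
filtered_hits = query_items(boxes, items, (0, 0, 1, 1), exclude=[10])
```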
Thanks @sgillies for further looking at this! Some quick remarks:
Ah, is this the reason you limited it to only integer items?
@jorisvandenbossche we came to the conclusion that since items of a uniform C type are good for performance and since all Python objects can be mapped to ints via The only reason
Yay, builds are passing again. I think we had some cache-related problems a few commits back.
shapely/strtree.py
"""
return self.query_geoms(geom)

def nearest_item(self, geom: BaseGeometry) -> Union[int, None]:
I think the return types of this function and nearest_geom are the only things that we'll change before 2.0. We may raise an error instead of returning None.
OK, just to be clear: the only additional "feature" this PR gives (compared to default indices 0...n mapping to the position in the sequence of geometries) is that you can use custom integer values instead of 0...n (and no longer any custom Python object)? I don't necessarily want to argue for allowing any Python object as id, but note that when storing indices in the tree (as we will do in shapely 2.0, and as #1064 does), there is not really a performance difference between using custom integers and custom Python objects.
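The point about indices can be sketched as follows. This is an illustrative stand-in, not the code in this PR or in #1064: the `IndexedLookup` class and its methods are hypothetical, and a brute-force scan over bounding boxes replaces the real tree. The index structure only ever deals with positions 0...n, and the translation to user-supplied items (integers or any other Python object) is one list lookup per hit after the query.

```python
from typing import Any, List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def _intersects(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class IndexedLookup:
    """Hypothetical index that stores positions and maps them back to items."""

    def __init__(self, boxes: Sequence[Box], items: Optional[Sequence[Any]] = None):
        self._boxes = list(boxes)
        # Default items are the positions 0...n themselves.
        self._items = list(items) if items is not None else list(range(len(self._boxes)))
        if len(self._items) != len(self._boxes):
            raise ValueError("items must have the same length as boxes")

    def _query_indices(self, box: Box) -> List[int]:
        # The part a C-level tree would do: it only produces integer positions.
        return [i for i, b in enumerate(self._boxes) if _intersects(b, box)]

    def query_items(self, box: Box) -> List[Any]:
        # Mapping back costs the same whether items are ints or other objects.
        return [self._items[i] for i in self._query_indices(box)]


boxes = [(0, 0, 1, 1), (0.5, 0.5, 2, 2), (5, 5, 6, 6)]
default_tree = IndexedLookup(boxes)
object_tree = IndexedLookup(boxes, items=["road", {"id": 7}, 42])

default_hits = default_tree.query_items((0, 0, 1, 1))
object_hits = object_tree.query_items((0, 0, 1, 1))
```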
@sgillies, thanks for the updates! For a better comparison, I also updated #1064 to follow the same API as you did in this PR (but didn't yet copy over all other changes, like better docstrings etc.; I just updated the actual implementation). I think both achieve the same, but the version in #1064 is closer to what we would do in the shapely-2.0 branch on top of the C-based STRtree.
@songololo @jorisvandenbossche I've relaxed the integer constraint on the constructor's
def test_query_enumeration_idx(geoms, query_geom, expected):
"""Store enumeration idx"""
with pytest.warns(ShapelyDeprecationWarning):
tree = STRtree((g, i) for i, g in enumerate(geoms))
@sgillies based on your last comments (and the updated top post, which mentions "These sequences are separate, not zipped together", and your comment at #1064 (comment)), I understood that the idea was to only support geometries and items as separate sequences (where the items are optional), and not in the combined (zipped) form.
But isn't the test above checking that constructing the tree from a sequence of tuples actually works?
I opened #1172 to potentially address this
Oh my, you're right that this test shouldn't pass! 🤦
This is very much like #1064. It stores only integer items. Exposes no callbacks. It's backwards compatible with the STRtree in shapely 1.7.0.
The differences from #1064:
Also tests and docstrings are much improved over the code in 1.8a1.