Make geometries hashable (again) #985
* keep tests for immutability
The hash strategy used before, and again here, just uses the object hash. This means that two equal geometries would have different hashes, which may not be expected. Should we consider basing the hash on (e.g.) WKB instead?
Yes, I think equal geometries (but not identical objects) should have an equal hash, ideally. So using WKB seems to be the way to go, although this also gives problems like linearring/linestring not being distinguishable (although we could hash the type id together with the WKB). Now, this WKB-based hashing is already implemented in the base pygeos Geometry: https://github.com/pygeos/pygeos/blob/a12477f93f9cdb1ed80731ac3fc26059ec1096f2/src/pygeom.c#L116-L150
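To make the "hash the type id together with the WKB" idea concrete, here is a minimal sketch in plain Python. The `Geom` class and the `linestring_wkb` helper are hypothetical stand-ins, not shapely or pygeos APIs, and the sketch assumes that a ring serializes to the same WKB bytes as a linestring with the same coordinates, which is the ambiguity mentioned above.

```python
import struct


def linestring_wkb(coords):
    """Encode 2D coordinates as little-endian WKB for a LineString (type code 2)."""
    # Byte order flag (1 = little endian), geometry type, number of points.
    header = struct.pack("<BII", 1, 2, len(coords))
    body = b"".join(struct.pack("<dd", x, y) for x, y in coords)
    return header + body


class Geom:
    """Hypothetical immutable geometry; not shapely's or pygeos's class."""

    def __init__(self, type_name, coords):
        self.type_name = type_name
        # Assumption: rings and linestrings share the same WKB encoding.
        self.wkb = linestring_wkb(coords)

    def __eq__(self, other):
        if not isinstance(other, Geom):
            return NotImplemented
        return (self.type_name, self.wkb) == (other.type_name, other.wkb)

    def __hash__(self):
        # Hash the type together with the WKB so that a ring and a
        # linestring with identical bytes are still told apart.
        return hash((self.type_name, self.wkb))


coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
line_a = Geom("LineString", coords)
line_b = Geom("LineString", coords)
ring = Geom("LinearRing", coords)

print(line_a == line_b, hash(line_a) == hash(line_b))
print(ring.wkb == line_a.wkb, ring == line_a)
```

Here `line_a` and `line_b` are distinct objects that compare equal and hash equal, while `ring` has the same bytes but does not compare equal to either, because the type name takes part in both `__eq__` and `__hash__`.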
@jorisvandenbossche I think it would be a mistake for shapely geometry object hashes to be special and unlike ordinary Python object hashes. Geometry objects might "inhabit" the same part of space but be different things, and we'll get collisions when constructing dicts and sets if we treat spatially-coincident objects as the same object.
It depends to what you compare. Most python objects don't use
In the same line, I think you can expect Otherwise I think that hash based functions (eg
Can you explain a bit more the worry about collisions here?
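As a rough illustration of what is at stake in the exchange above, the sketch below contrasts Python's default identity-based hashing with value-based hashing when objects are put into a set. Both classes are hypothetical and only hold a coordinate tuple; neither reflects how shapely actually implements equality or hashing.

```python
class IdentityGeom:
    """Hypothetical geometry using Python's default identity-based __eq__ and __hash__."""

    def __init__(self, coords):
        self.coords = tuple(coords)


class ValueGeom(IdentityGeom):
    """Hypothetical geometry whose equality and hash are derived from its coordinates."""

    def __eq__(self, other):
        if not isinstance(other, ValueGeom):
            return NotImplemented
        return self.coords == other.coords

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.coords)


parcel = [(0, 0), (1, 0), (1, 1), (0, 0)]

# Two separate objects that occupy the same coordinates.
identity_set = {IdentityGeom(parcel), IdentityGeom(parcel)}
value_set = {ValueGeom(parcel), ValueGeom(parcel)}

print(len(identity_set))  # 2: distinct objects stay distinct
print(len(value_set))     # 1: coincident objects collapse into one entry
```

With value-based hashing, a set or dict deduplicates spatially coincident geometries; with identity-based hashing it keeps them apart. Which of the two is the less surprising behaviour is the question being debated here.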
I think the hashing strategy needs a bit more attention. Should this be discussed as part of shapely/shapely-rfc#1? Or is discussing it here sufficient?
Personally, I would say this is more an API detail discussion we can have in Shapely itself (here on the PR, or in a dedicated issue for it), while the RFC discussed the general principles of mutability/hashability. But also happy to start a discussion on the RFC if others prefer that.
Superseded by #1250
It just occurred to me that, with #960 removing mutable geometries, we can restore hashable geometries.
This PR reverts #320 / e7ab27e (which also had tests that always passed).
Is there anything new to consider?
See also #209 with xrefs to relevant open issues to geopandas