-
Notifications
You must be signed in to change notification settings - Fork 460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compact
behavior on duplicate input
#298
Comments
Also, what about inputs to For example, the code h1 = h3.geo_to_h3(37, -122, 10)
h2 = h3.geo_to_h3(37, -122, 9)
hexes = np.array([h1,h2], dtype='uint64')
np.array(h3.compact(hexes)) returns len(hu): 2
flag: 0
array([622204016658513919, 617700417031372799], dtype=uint64) and the code h = h3.geo_to_h3(37, -122, 10)
hexes = [h] + list(h3.h3_to_children(h))
hexes = np.array(hexes, dtype='uint64')
np.array(h3.compact(hexes)) returns len(hu): 8
flag: 0
array([622204016658513919, 626707616285855743, 626707616285859839,
626707616285863935, 626707616285868031, 626707616285872127,
626707616285876223, 626707616285880319], dtype=uint64) |
Now for If, for example, we try to uncompact the set containing a hex and its immediate children, we get duplicated outputs: h = h3.geo_to_h3(37, -122, 10)
hexes = [h] + list(h3.h3_to_children(h))
hexes = np.array(hexes, dtype='uint64')
out = np.array(h3.uncompact(hexes, 11)) we get that: >>> len(out)
14
>>> len(set(out))
7 Should we dedupe these outputs, try to detect this situation and raise an error, or simply inform the user that there will be weird output in this case? |
Similar issues with duplicate hexagons in the input to h = h3.geo_to_h3(37, -122, 1)
hexes = np.array([h]*10, dtype='uint64')
out = np.array(h3.uncompact(hexes, 2)) has result >>> len(out)
70
>>> len(set(out))
7 |
I'd imagine this is acceptable because it makes
The assumption of I would have expected the function to fail when 8 duplicates are present instead of 9. I think the reason for this is inconsistency in |
Having it be idempotent would be cool, but it doesn't currently work that way; see the second example in #298 (comment). If we want it to be idempotent, then we need a stronger restriction on input than just deduplicated. We would need to require that the parent of any cell is also absent from the set. Either that, or we modify the function to handle duplicated hexes, and hexes that have their parents in the set. |
This is a general issue, applying to e.g. This is something we ought to lock down consistently for v4. I think it's okay not to validate, as long as input requirements are clearly documented, but it's worth discussion and review of all methods. |
I'm seeing inconsistent behavior when I input a collection of duplicated hexagons to the
compact()
function.Using the
cython
branch ofh3-py
(with a few extraprint
debugging statements):When I run
foo(8)
, I getThis is mostly OK, as some of the wrappers will dedupe the output with a
set
container. However, should the Ccompact
function actually raise an error on any duplicate input?When I run
foo(9)
, I getand an
H3ValueError
, due to the-2
return flag.It seems like the C code is detecting the duplicate hexagons in this case, but not the case where
n <= 8
.compact
function just handle and ignore duplicated hexagons? (We can "compact" any duplicates into a single hexagon.)The text was updated successfully, but these errors were encountered: