New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hex and Square Grid generators #467
Conversation
This is a strange request but it would be nice to add a value about proximity or adjacency. For example, every tiling scheme can be colored so no two adjacent tiles have the same color. The minimum is 4 - https://en.m.wikipedia.org/wiki/Four_color_theorem |
I don't see the usefulness of default arguments for the grid functions, would rather drop them. The single-cell version I don't like (what's i/j for the hexagon part?). I'm sure you can call the grid generator to produce single-cell anyway. |
Would be good if the argument will just accept a geometry, and push out only the hexes/cells that ST_Intersects with it, instead of Grid. This would allow to use it both ways: to join to a table with index lookups, or just seq scan a table and output a coverage for each shape, and after that GROUP BY the hexagons. |
A generic geometry might be overkill, but maybe the code can
handle "boxes" in a special way..
|
postgis/lwgeom_hexgrid.c
Outdated
/* Ratio of hexagon edge length to height = 2*cos(pi/6) */ | ||
#define HXR 1.7320508075688774 | ||
|
||
static const double hex_x[] = {0.0, 1.0, 1.5, 1.0, 0.0, -0.5, 0.0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels unsafe - hexagons may not collide on the boundary of jump to a next double exponent value, getting a small gap or overlap between them, if you try to get to it via subtraction from next or addition from previous. It's bulletproof to build a hexagon from repeated points from adjacent hexes though.
A good test is to ST_Union a worldful of hexagons.
postgis/lwgeom_hexgrid.c
Outdated
static LWGEOM * | ||
square(double origin_x, double origin_y, double E, int cell_i, int cell_j, uint32_t srid) | ||
{ | ||
POINT4D pt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POINT4D pt; | |
POINT4D pt = {0,0,0,0}; |
This way no need to set it later.
postgis/postgis.sql.in
Outdated
-- Availability: 1.0.0 | ||
CREATE TYPE geometry_grid AS ( | ||
geom geometry, | ||
i integer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are i and j needed here? what common scenario breaks if it was just a geometry?
Agree with both of these comments. I think these functions will be a nice addition to core. |
There should be an option or example how to make square grid coincide with tile boundaries. |
postgis/lwgeom_hexgrid.c
Outdated
static const double sqr_y[] = {0.0, 0.0, 1.0, 1.0, 0.0}; | ||
|
||
static LWGEOM * | ||
square(double origin_x, double origin_y, double E, int cell_i, int cell_j, uint32_t srid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some issue with the parameter naming:
- origin_x and origin_y: They require looking at the code to know if the origin is used as top-left, bottom-right or whatever.
E
: interesting name for the size, I have no idea where it might come from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E = edge, in my hand-written notes on the math of hexagon generating, so.
the problem isn't so much the word "origin" as the "are my cell coordinates cartesian or image coordinates" which is really at the root of it. right now they are cartesian, so "origin" means "cartesian origin", however... the comment about wanting to generate equivalent Google grids raises the idea of having them be image coordinates instead. yuck.
Apart from the things that are missing (tests and documentation) and without having reviewed the code too much, I'm wondering a similar thing as @Komzpa mentioned: How would I use this to intersect with other geometries? Let's say I have a table with N points and I want to group them by hexagons of a certain size.
On the other hand, if instead I had a function that given a point / geometry and the hexagon grid properties it returns the x/y hexagon (or list) that intersect with it, then I could first group by the simpler x/y values and finally generate the polygons based on x/y. I'm not sure if this approach could be better, but it feels like it. |
postgis/lwgeom_hexgrid.c
Outdated
int cell_height; | ||
GBOX bounds; | ||
double size; | ||
uint32_t srid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, srids can be negative and use a int32_t
.
So, thinking about the expanded functionality requests...
The arbitrary shape thing, I think works OK, with a few extras
We also have some real unpleasant fun with supporting other shape types
I feel this spinning out of control a little bit. Yes, generating hexes and joining them can be inefficient, but (a) it works out of the box and (b) Carto has been doing this for almost a decade and people just seem to suck up the pain? Maybe the generic shape case can be deferred? |
The case of polygon coverage, the notion should be "hex intersects polygon". This lets you have all the hexes, and normalize values by some The optimal performance doesn't have to be there right away, but better to catch better API now. |
If the poly/hex case is a full intersection then the optimization power of being able to quickly return a grid coordinate goes away: I'd be doing a full geometry intersects anyways, which is just what a user would be doing in materializing the grid and then doing an intersects on it. It's possible to do point-in-hexgrid with a little math (looked it up in the pdal hexgrid implementation) but full intersection is full intersection. I'm still kind of intrigued by the "build a hex cover for this shape" idea, but it does imply an extra origin parameter. Optional? Required? Default origin of hex cover to 0,0 (all hex covers will align magically), or to the LL of the input shape? |
What about aggregating points / geometries on a grid (hex or square)? As far as I can tell, the proposed API is extremely awful since you need to generate the whole tile grid (which can be billions of hex/squares depending on their size and extent) to then join the 2 tables. Do you see a better approach to do this aggregation with this API? |
Well, it seems like being able do the arbitrary geometry input is the answer, then you just group by your i, j output coordinates, and ignore the generated geometry. It requires an origin point or an assumed 0,0 origin, but would seem to work... I guess I should start on the arbitrary geometry form of this, seems to be desired, if quite complex. The question of what rules to use for (a) polygons (hex intersects polygon, or hex centre intersects polygon) and (b) lines (hex... ? line)
|
When talking about API, I think it makes sense to return all grid geometries that intersect with
I'd say yes to the optional origin, but can you clarify what's lower_left of an hexagon? Is it the bottom left point of the hex shape (only valid if the hex has parallel lines to the equator, which it appear to do based on the example images)? |
For the purpose of constructing the hex grid the lower left is the left-most point of the bottom edge of the hexagon (hexagons oriented with top/bottom lines horizontal). The centroid trick doesn't work, BTW, because you don't have the hexagons to run centroid on until the calculation is finished... I mean, you can post-process them, if that's what you're talking about? |
I meant that if you want just one grid geometry per polygon you can take the polygon's centroid / oint on surface / "whatever you consider proper" and use that as input instead of the whole polygon. Same idea for lines. |
Another thing: Seeing Uber's H3 I've noticed that the orientation of the hexagons is different depending on the zoom level. Should we consider using a geometry (which includes size, origin and orientation) instead of size + origin coordinates? How complex would that API be? Also, if the function accepts a geometry (which could be either a square or a hex) we would only need a single function for both, right? |
Re poly/hex I mean the opposite: I want N hexes per polygon, but I don't want any one hex to fall in more than one polygon. |
Re altering orientations, I feel like that's not really a needed knob. |
+1 for a way of getting the cell indices which cover a geometry. Actually ++1 for handling points, since that's easy. Lines and polygons are of course harder. Seems like that is effect building a rasterizer. But at least rasterization can be done more efficiently than an external table intersection. I.e. for polygons can compute the set of cells intersected by the border, which then determines the "scanlines" of cells which intersect the interior. |
Please rebase to |
are we going anywhere with this or are you all just going to chat about it forever? |
People requested lots of changes and blah blah blah, so I can't just flat commit it, and also I don't have time right now to address comments, so just cool your jets, it'll happen. |
@pramsey I'm just doing my part of being the human stalebot pointer outer :) |
@pramsey is your hex algo the same as created by Uber's H3 (code)? To my knowledge, that's the most well known implementation, but I have also seen geohex.net (java code) and I might have seen another triangle-based one somewhere I can't find at the moment. |
…nto svn-trunk-hexgrid
Bounds is no longer the domain of the grid, instead the grid always starts at origin 0,0 in whatever SRS the bounds are. Bounds is the rectangle-to-be-filled with shapes, and only those shapes needed to fill the bounds are instantiated, along with their cell coordinates.
No, this isn't H3. It's a simple, configurable planar tiling. H3 is a set tiling of the globe, with fixed keys. Good for our buddies with key/value stores. |
:) Would it make sense to use a globe-focused hexing schema in PostGIS? Or are there benefits of using a planar one? Triangle-grid sounds interesting too btw, but triangles are a bit easier to wrap around the globe. |
As a proof-of-utility, try the existing extension https://github.com/bytesandbrains/h3-pg and see if it's useful. It has casts to postgis geometry, so it should be useful in and of itself. |
Merged to master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments. I guess that you decided to rush and merge this to avoid it stagnating again.
* | ||
********************************************************************** | ||
* | ||
* Copyright 2018 Paul Ramsey <pramsey@cleverelephant.ca> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2020?
static const double hex_y[] = { 0.0, -1*H, -1*H, 0.0, H, H, 0.0}; | ||
|
||
static LWGEOM * | ||
hexagon(double origin_x, double origin_y, double size, int cell_i, int cell_j, uint32_t srid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
srid should be int32_t since it can be negative
/* ********* ********* ********* ********* ********* ********* ********* ********* */ | ||
|
||
static LWGEOM * | ||
square(double origin_x, double origin_y, double size, int cell_i, int cell_j, uint32_t srid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with the srid type
|
||
/* ********* ********* ********* ********* ********* ********* ********* ********* */ | ||
|
||
typedef enum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd better to move the internal function to liblwgeom and leave only the PG implementation here, but I don't have a strong opinion on it.
{ | ||
state->done = true; | ||
} | ||
return !state->done; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value is never used, so I guess it isn't necessary.
switch (state->cell_shape) | ||
{ | ||
case SHAPE_HEXAGON: | ||
lwgeom = hexagon(0.0, 0.0, state->size, state->i, state->j, state->srid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be worth it to support setting the origin also in the ST_*Grid functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I contemplated, but it didn't seem necessary so I leave as a future option if someone requires it.
|
||
/* stuff done on every call of the function */ | ||
funcctx = SRF_PERCALL_SETUP(); | ||
newcontext = funcctx->multi_call_memory_ctx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm missing something but I don't know why this assignment is necessary if newcontext
isn't used after it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it might be copied from LWGEOM_dumppoints
which also does a similar dead store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that's true
_COST_MEDIUM; | ||
|
||
-- Availability: 3.0.0 | ||
CREATE OR REPLACE FUNCTION ST_HexagonGrid(size float8, bounds geometry, OUT geom geometry, OUT i integer, OUT j integer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the OUT parameters in ST_HexagonGrid and ST_SquareGrid needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's my new trick to avoid having to declare "utility relations" just to support tuple-valued outputs. Works like a hot damn: declare output as anonymous record but use OUT parameters to provide name and types for the record members. If we want, we could excise all these old friends like geometry_dump
valid_detail
and such.
Also, NEWs are missing. |
See #538 |
Can be easily extended to triangles too, I'm quite pleased that the SRF machinery is shared between them. Documentation will take some work to explain how to generate nesting hierarchies, but it actually only requires doubling/halving the size parameter, as everything lines up at the origin neatly.
The single-valued functions really just come along for the ride, not sure how useful they will be, but it makes sense to be able to generate a single geometry at a particular cell location.