Documentation for geohex_grid over geo_shape #92999

Merged 4 commits into elastic:main from geohex_grid_docs on Jan 24, 2023

craigtaverner
Contributor

The feature to add support for geohex_grid aggregations over geo_shape fields was added in #91956. This is the associated documentation for that.

Also made a few small geotile_grid docs fixes to bring them up to date (mostly copied from the newer geohex_grid aggregation docs, which were written two to three years later).
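
For readers who have not seen the aggregation, a request of the kind these docs describe might look like the sketch below. It only builds the search body as a Python dict; the field name `geometry` and the aggregation name `hexes` are hypothetical, and the precision is chosen arbitrarily. The 0 to 15 range check reflects the H3 resolution range.

```python
import json


def geohex_grid_request(field: str, precision: int) -> dict:
    """Build a search body with a geohex_grid aggregation.

    `field` may name a geo_point field or, with the feature documented
    in this PR, a geo_shape field.
    """
    if not 0 <= precision <= 15:
        raise ValueError("geohex_grid precision must be between 0 and 15")
    return {
        "size": 0,  # return only aggregation buckets, no search hits
        "aggs": {
            "hexes": {  # hypothetical aggregation name
                "geohex_grid": {"field": field, "precision": precision}
            }
        },
    }


# "geometry" is a hypothetical geo_shape field name.
body = geohex_grid_request("geometry", precision=4)
print(json.dumps(body, indent=2))
```

The resulting JSON could be sent as the body of a `_search` request against an index that maps `geometry` as `geo_shape`.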

@craigtaverner craigtaverner added >docs General docs changes :Analytics/Geo Indexing, search aggregations of geo points and shapes v8.7.0 labels Jan 17, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added Team:Docs Meta label for docs team Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jan 17, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

Contributor

@iverase iverase left a comment

This is looking great. I just added a comment on why shapes are handled in Cartesian coordinates, which I think is more accurate. WDYT?

within the edges defined by great circles. In other words the calculation is done using spherical coordinates.
However, when aggregating over `geo_shape` data, the shapes are considered within a hexagon if they lie
within the edges defined as straight lines on an equirectangular projection. The reason for this is that
visualizing aggregation results in a map application will show surprising results when zoomed out.
Contributor

@iverase iverase Jan 18, 2023

I think the reason is that Elasticsearch (more specifically, Lucene) treats edges using the equirectangular projection at search time, so a mismatch between the query result and the aggregation might produce surprising results.
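
To make the difference between the two edge definitions concrete, here is a minimal sketch comparing the midpoint of the same edge under each interpretation. It is an illustration only, not how Lucene or Elasticsearch implement edge handling; the function names and the sample coordinates are made up for this example, and the great-circle helper assumes the endpoints are not antipodal.

```python
import math


def planar_midpoint(p, q):
    """Midpoint of the straight line between two (lon, lat) points
    on an equirectangular projection."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def great_circle_midpoint(p, q):
    """Midpoint of the great-circle arc between two (lon, lat) points,
    in degrees. Assumes the points are not antipodal."""

    def to_xyz(lon, lat):
        lon, lat = math.radians(lon), math.radians(lat)
        return (
            math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat),
        )

    a, b = to_xyz(*p), to_xyz(*q)
    # The normalized sum of two unit vectors bisects the arc between them.
    m = [a[i] + b[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in m))
    x, y, z = (c / norm for c in m)
    return (math.degrees(math.atan2(y, x)), math.degrees(math.asin(z)))


# Hypothetical edge between two points at latitude 60.
p, q = (-40.0, 60.0), (40.0, 60.0)
print(planar_midpoint(p, q))        # stays on latitude 60
print(great_circle_midpoint(p, q))  # bulges poleward, to about latitude 66.1
```

A point at roughly latitude 63 on the prime meridian would fall on opposite sides of this edge depending on which definition is used, which is the kind of mismatch between search and aggregation results the comment describes.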

Contributor Author

In this case we should do this with points too, right? I know we cannot change points due to backwards compatibility, but it might be nice to have an explanation for doing the two differently.

Of course with points we have less risk due to only the cells having edges, while with shapes we have edges for both the shapes and the H3 cells, increasing the likelihood of something looking weird. But that does not seem like sufficient reason to use spherical for points.

Contributor

@iverase iverase Jan 18, 2023

The main issue is that we accept edges (polygons and lines) that cannot be represented in spherical coordinates (edges > 180 degrees). This alone makes it impossible to resolve geo_shape aggregations using spherical geometry.

Note that we use the equirectangular projection, but maps normally use the Mercator projection, so there is already a mismatch there.
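
One way to see the problem with edges longer than 180 degrees is to compare the planar longitude span of an edge with the central angle of the shortest great-circle arc between its endpoints, which can never exceed 180 degrees. The sketch below is illustrative only; the helper names and the sample edge are assumptions made for this example.

```python
import math


def planar_lon_span(p, q):
    """Longitude extent, in degrees, of the straight edge between two
    (lon, lat) points on an equirectangular projection."""
    return abs(q[0] - p[0])


def central_angle(p, q):
    """Central angle, in degrees, of the shortest great-circle arc
    between two (lon, lat) points (spherical law of cosines)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    cos_angle = (
        math.sin(lat1) * math.sin(lat2)
        + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1)
    )
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Hypothetical edge along the equator that passes through longitude 0.
p, q = (-100.0, 0.0), (100.0, 0.0)
print(planar_lon_span(p, q))  # 200 degrees on the projection
print(central_angle(p, q))    # about 160 degrees: the arc crosses the antimeridian instead
```

For this edge the planar interpretation covers 200 degrees of longitude through the prime meridian, while the great circle between the same endpoints takes the 160 degree route the other way around, so the two interpretations describe different edges.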

Contributor Author

OK. Understood. I used your initial explanation, and decided not to involve the visual artefact discussion at all. Also, the question of edges > 180 degrees could, presumably, be solved with sidedness (and orientation), but I understand from previous discussions we have a backwards compatibility issue there. So I simplified and did not bring that up here either.

Contributor

Thanks! That makes more sense to me.

craigtaverner and others added 2 commits January 24, 2023 14:33
…idoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
When aggregating geohex over geoshape we use equirectangular because
the underlying Lucene index indexes and searches the polygons in that way.
Contributor

@iverase iverase left a comment

LGTM

I think there is a typo, otherwise thanks for the iteration!

However, when aggregating over `geo_shape` data, the shapes are considered within a hexagon if they lie
within the edges defined as straight lines on an equirectangular projection.
The reason is that Elasticsearch and Lucene treat edges using the equirectangular projection at index and search time.
In order to ensure that search results and aggregation results are aligned, we therefor also use equirectangular
Contributor

s/therefor/therefore

Contributor Author

Wow, I thought it was an alternative spelling (US vs British), but according to Grammarly, "therefor" is not an alternative spelling of "therefore". We should use the conjunctive form here.

See https://www.grammarly.com/blog/therefore-vs-therefor/


@craigtaverner craigtaverner merged commit e8b4de9 into elastic:main Jan 24, 2023
@craigtaverner craigtaverner deleted the geohex_grid_docs branch January 24, 2023 15:03