Introduce variable encoding to EdgeTreeReader #49349

Merged
merged 8 commits into from Nov 22, 2019

@talevy talevy commented Nov 20, 2019

This PR modifies the EdgeTree in the geoshape-doc-values initiative (#37206) to encode
edge points with a variable-length encoding. It also adds caching to reduce the number of new Edge
objects created and the number of deserializations needed when an aggregation
queries the same shape multiple times, as geogrid aggregations do.

The modifications include:

  • delta-encode each edge's coordinates relative to the maxX and maxY of the Extent
  • remove Edge object construction and inline the deserialization of the edge contents within each method
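
As a rough illustration of the first item, here is a minimal sketch of delta-encoding an edge against the extent's maxX and maxY and writing each delta as a variable-length integer. The class and method names (`EdgeDeltaSketch`, `encodeEdge`, `decodeEdge`) and the byte layout are hypothetical and are not taken from this PR's code; the sketch only assumes that every coordinate is less than or equal to the corresponding extent maximum, so each delta is non-negative.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not the EdgeTreeReader implementation from this PR.
public class EdgeDeltaSketch {

    // Write a non-negative long in 7-bit groups, low-order group first.
    // The high bit of each byte signals that more bytes follow.
    static void writeVLong(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Minimal cursor over the serialized bytes.
    static final class Reader {
        final byte[] bytes;
        int pos;

        Reader(byte[] bytes) {
            this.bytes = bytes;
        }

        long readVLong() {
            long value = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                value |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            return value;
        }
    }

    // Store each coordinate as its distance below the extent maximum.
    // Assumes x1, x2 <= maxX and y1, y2 <= maxY, so every delta is >= 0.
    static byte[] encodeEdge(int maxX, int maxY, int x1, int y1, int x2, int y2) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVLong(out, (long) maxX - x1);
        writeVLong(out, (long) maxY - y1);
        writeVLong(out, (long) maxX - x2);
        writeVLong(out, (long) maxY - y2);
        return out.toByteArray();
    }

    // Recover {x1, y1, x2, y2} by subtracting each delta from the maximum.
    static int[] decodeEdge(int maxX, int maxY, byte[] bytes) {
        Reader in = new Reader(bytes);
        int x1 = (int) (maxX - in.readVLong());
        int y1 = (int) (maxY - in.readVLong());
        int x2 = (int) (maxX - in.readVLong());
        int y2 = (int) (maxY - in.readVLong());
        return new int[] {x1, y1, x2, y2};
    }
}
```

With maxX = 1000 and maxY = 2000, the edge (990, 1995) to (1000, 1900) has deltas 10, 5, 0 and 100, so it serializes to 4 bytes rather than the 16 bytes that four fixed-width ints would take. Coordinates far from the extent maximum need more bytes per delta, so the saving depends on how tightly the extent bounds the edges.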

After these changes, two aspects of the GeometryTree still feel like TODOs:

  • reduce serialized size of Extent and simplify the checkExtent logic
  • compress Point2D tree

@talevy talevy added the :Analytics/Geo Indexing, search aggregations of geo points and shapes label Nov 20, 2019
@elasticmachine
Pinging @elastic/es-analytics-geo (:Analytics/Geo)

@talevy talevy requested a review from iverase November 22, 2019 03:15
@iverase iverase left a comment

I left a couple of comments but in general lgtm

@talevy talevy merged commit 5d9b86b into elastic:geoshape-doc-values Nov 22, 2019
@talevy talevy deleted the gdv-varencode branch November 22, 2019 22:49
talevy added a commit that referenced this pull request Nov 22, 2019