Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-8620: LatLonShape contains #608

Closed
wants to merge 5 commits into from
Closed

Conversation

iverase
Copy link
Contributor

@iverase iverase commented Mar 13, 2019

LatLonShape's implementation for spatial relationship CONTAINS.

@iverase
Copy link
Contributor Author

iverase commented Mar 13, 2019

This change has three main features:

  1. For supporting CONTAINS, we need to add to the encoding if a triangle edge belongs to the original shape or not. This is done by using three spare bits on the encoding byte array. In order to be able to easily decode the triangle a new class LatLonShape.Triangle has been added and a static method in LatLonShape to read the encoded values.

  2. The Tessellator has been modified to capture the information above. This required more effort that initially expected, in particular I need to add a new method that tells me if a new edge is actually belonging to the polygon. I have added several tests.

  3. A new method has been added to our shapes to compute if the shape is Within the provided triangle. This method returns the Within relationship which can have three possible values:

  • CANDIDATE: Either the shape is inside the provided triangle or it crosses only edges that do not belong to the original shape.
  • NOTWITHIN: The shape crosses an edge that belongs to the original shape
  • DISJOINT: The shape is disjoint with the triangle.

The result for a CONTAINS query will be all the shapes that has been marked as CANDIDATE removing all the shapes that has been marked as NOTWITHIN.

Note that this method only support shapes with a unique component as it will required much more effort to capture that every component is actually within. Therefore when building a CONTAINS query with a multi-shape, it is translated to a boolean query.

cc @nknize @jpountz

@nknize
Copy link
Member

nknize commented Mar 26, 2019

This is coming along nicely and I like the overall approach. A few general thoughts:

  1. Would it make sense to split this PR into a couple separate issues? Maybe create a PR for the Tessellator logic that labels the Polygon edges along with the changes to LatLonShape.encodeTriangle? I think that will help make the changes more incremental and make it easier to review?
  2. I'd also like to make sure we get the componentRelateTriangle and componentRelate logic tight in LUCENE-8736 first, and probably the Polygon quantization issues corrected, before fully merging this PR. IMHO it makes sense to ensure the polygon relation logic is fully sound before adding all of the CONTAINS changes?

@iverase
Copy link
Contributor Author

iverase commented Mar 26, 2019

Yes, I already thought about splitting the PR and your proposal is similar to what I had in mind.

+1

I will prepare a new PR with only the changes on the Tessellator and the edges characterisation.

@iverase
Copy link
Contributor Author

iverase commented Mar 26, 2019

@nknize I created:

#622

The PR contains the updated Tessellator logic and the decoding logic in place.

@iverase
Copy link
Contributor Author

iverase commented Sep 11, 2019

I created #872 so I will close this PR.

@iverase iverase closed this Sep 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants