New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for selecting percolator query candidate matches containing geo_point based queries #26040
Add support for selecting percolator query candidate matches containing geo_point based queries #26040
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you that these queries are worth extracting. Currently you interleave bits of the lat and lon in order to create a long range, but I think it would be worse than eg. only indexing the latitude or the longitude all the time in terms of number of candidates to verify. @nknize maybe you have more background about this?
I think we also need to decide whether we want to keep a single extraction field for ranges like you did here, that works in both the 1D and 2D cases, or whether we should have a dedicated field for the 2D case based on the work that @nknize is doing with bounding-box fields. Presumably the latter would be more efficient at the cost of an additional extraction field.
import org.apache.lucene.search.Query; | ||
|
||
/** | ||
* Workaround to get access to {@link LatLonPointDistanceQuery} properties due to the fact that this class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this hack is not going to work with Java 9.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason why this work is because this class is loaded by parent class loader that also loads the Lucene jars and the module class loader just uses that. If this class would be in the class loader used the percolator module then this hack would fail (integ tests would tell us that). What java9 change will not allow this hack to work?
I will replace this hack once the LatLonPointDistanceQuery and LatLonPointInPolygonQuery classes have been made public.
If we aim at being able to index geo-shape queries eventually, then I think we should go with a dedicated field for the 2D case. |
Makes sense. The bounding box should also perform better than the current approach in this PR. |
e968d41
to
408261c
Compare
…ning `geo_bounding_box`, `geo_distance` and `geo_polygon` queries. The percolator field mapper will try to extract a bounding box from the listed queries and indexed into the binary range field that this field mapper wraps. The top left and bottom right lat long tuples are converted into binary morton codes and that is then indexed as a binary range. This way the bounding box can be handled like other how range queries are handled.
…d that uses Lucene's LatLonBoundingBox.
1c5ce92
to
ccd81a8
Compare
@jpountz I've updated the PR to store the bbox in a separate field. |
Sorry @martijnvg I had missed your ping. Should we try to rebase before going through another review? |
@jpountz Sorry, I forgot about this PR, but I have updated the PR now. |
This PR has diverged too much with the master branch and the implementation is based on not up to date geo stuff. |
Added support for
geo_bounding_box
,geo_distance
andgeo_polygon
queries.The percolator field mapper will try to extract a bounding box from the listed queries and indexed into the binary range field that
this field mapper wraps. The top left and bottom right lat long tuples are converted into binary morton codes and that is then
indexed as a binary range. This way the bounding box can be handled like other how range queries are handled.
This PR add work arounds for the fact that LatLonPointDistanceQuery and LatLonPointPolygonQuery have package private visibility in order to extract bounding boxes from the
geo_distance
andgeo_polygon
queries.