Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for selecting percolator query candidate matches containing geo_point based queries #26040

Closed

Conversation

martijnvg
Copy link
Member

Added support for geo_bounding_box, geo_distance and geo_polygon queries.

The percolator field mapper will try to extract a bounding box from the listed queries and indexed into the binary range field that
this field mapper wraps. The top left and bottom right lat long tuples are converted into binary morton codes and that is then
indexed as a binary range. This way the bounding box can be handled like other how range queries are handled.

This PR add work arounds for the fact that LatLonPointDistanceQuery and LatLonPointPolygonQuery have package private visibility in order to extract bounding boxes from the geo_distance and geo_polygon queries.

@martijnvg martijnvg added :Search/Percolator Reverse search: find queries that match a document >enhancement review v6.1.0 v7.0.0 labels Aug 3, 2017
@martijnvg martijnvg mentioned this pull request Aug 3, 2017
9 tasks
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you that these queries are worth extracting. Currently you interleave bits of the lat and lon in order to create a long range, but I think it would be worse than eg. only indexing the latitude or the longitude all the time in terms of number of candidates to verify. @nknize maybe you have more background about this?

I think we also need to decide whether we want to keep a single extraction field for ranges like you did here, that works in both the 1D and 2D cases, or whether we should have a dedicated field for the 2D case based on the work that @nknize is doing with bounding-box fields. Presumably the latter would be more efficient at the cost of an additional extraction field.

import org.apache.lucene.search.Query;

/**
* Workaround to get access to {@link LatLonPointDistanceQuery} properties due to the fact that this class
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this hack is not going to work with Java 9.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason why this work is because this class is loaded by parent class loader that also loads the Lucene jars and the module class loader just uses that. If this class would be in the class loader used the percolator module then this hack would fail (integ tests would tell us that). What java9 change will not allow this hack to work?

I will replace this hack once the LatLonPointDistanceQuery and LatLonPointInPolygonQuery classes have been made public.

@jpountz
Copy link
Contributor

jpountz commented Aug 10, 2017

If we aim at being able to index geo-shape queries eventually, then I think we should go with a dedicated field for the 2D case.

@martijnvg
Copy link
Member Author

If we aim at being able to index geo-shape queries eventually, then I think we should go with a dedicated field for the 2D case.

Makes sense. The bounding box should also perform better than the current approach in this PR.

@martijnvg martijnvg changed the title Added support for selecting percolator query candidate matches containing geo_point based queries Add support for selecting percolator query candidate matches containing geo_point based queries Sep 5, 2017
@martijnvg martijnvg force-pushed the percolator_geo_queries_support branch 2 times, most recently from e968d41 to 408261c Compare September 6, 2017 05:04
@martijnvg martijnvg added stalled and removed review labels Sep 8, 2017
@martijnvg martijnvg added v6.2.0 and removed v6.1.0 labels Nov 22, 2017
…ning `geo_bounding_box`, `geo_distance` and `geo_polygon` queries.

The percolator field mapper will try to extract a bounding box from the listed queries and indexed into the binary range field that
this field mapper wraps. The top left and bottom right lat long tuples are converted into binary morton codes and that is then
indexed as a binary range. This way the bounding box can be handled like other how range queries are handled.
@martijnvg
Copy link
Member Author

@jpountz I've updated the PR to store the bbox in a separate field.

@jpountz
Copy link
Contributor

jpountz commented Mar 14, 2018

Sorry @martijnvg I had missed your ping. Should we try to rebase before going through another review?

@martijnvg
Copy link
Member Author

@jpountz Sorry, I forgot about this PR, but I have updated the PR now.

@lcawl lcawl added v6.4.1 and removed v6.4.0 labels Aug 23, 2018
@rjernst rjernst removed the review label Oct 10, 2018
@martijnvg
Copy link
Member Author

This PR has diverged too much with the master branch and the implementation is based on not up to date geo stuff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Percolator Reverse search: find queries that match a document v6.4.1 v7.0.0-rc1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants