-
-
Notifications
You must be signed in to change notification settings - Fork 4
HSEARCH-3087 Spatial support for querying and sorting #66
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are my comments. It's really nice, in particular adding support for bounding boxes and polygons!
As for the projection stuff, we either need a projection DSL to be able to define the location or we could only support it for the circle predicate and find a hackyish way to push the information from the query to the projection stuff (not very excited by the latter).
Right. I created https://hibernate.atlassian.net/browse/HSEARCH-3190 so we don't forget.
I'm still a bit torn about the API for circle queries: it's more consistent this way but I wonder if user won't search for distance queries or proximity queries rather than having the entities in a circle.
We can review that later if you want. Add it to the Search 6 todo document, maybe?
|
||
public static void assertNotNull(Object object, String objectDescription) { | ||
if ( object == null ) { | ||
throw log.mustNotBeNull( objectDescription ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method does not exist at this point. You should move the changes to the Log interface from the commit "HSEARCH-3087 Add support for within circle spatial predicates" to this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, fixed.
import org.hibernate.search.v6poc.search.dsl.sort.SortOrder; | ||
|
||
abstract class AbstractScalarLuceneFieldSortContributor implements LuceneFieldSortContributor { | ||
abstract class AbstractSimpleLuceneFieldSortContributor implements LuceneFieldSortContributor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You used the term "standard" to mean "everything but GeoPoint" in StandardFieldPredicateBuilderFactory
.
Maybe you could use the same term in both cases? I don't really care if it's "simple" or "standard", but at least let's stick to one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, done.
import org.hibernate.search.v6poc.search.predicate.spi.SpatialWithinCirclePredicateBuilder; | ||
|
||
|
||
public abstract class AbstractSpatialWithinCirclePredicateBuilder<T> extends AbstractSearchPredicateBuilder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I see the point of these abstract classes (AbstractSpatialWithinCirclePredicateBuilder
, AbstractSpatialWithinBoundingBoxPredicateBuilder
, AbstractSpatialWithinPolygonPredicateBuilder
), since they each only have one subclass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the point of them is for the hopefully upcoming support of storing polygons or other spatial types in the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it is perfectly safe to use exactly the same DSL interfaces and builder interface for those, since there may be parameters that make sense in one case and not another. But ok, let's leave this for now: it's a problem for another day...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow you: if you want to search a polygon in a circle, you will need to define the circle. That's all that is in AbstractSpatialWithinCirclePredicateBuilder
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am mainly talking about additional, optional parameters. I can imagine searching for polygons within a circle being similar, but slightly different from search for points within a circle.
In particular, when searching for points within a circle, you will use a distance query, which involves picking some distance computation algorithm (arc
or plane
in Elasticsearch).
When searching for polygons within a circle, you will not use a distance query, at least not in Elasticsearch; you will use a geo_shape
query. And this query does not allow you to pick a distance computation algorithm (maybe because it doesn't use one).
As a result, if we decide to expose a .distanceAlgorithm(DistanceAlgorithmEnum)
method on the object returned by a call to circle
, our API will be dodgy: in some cases ("point" fields) that method can be called, in some other cases ("shape" fields) it cannot.
That's just an example though: looking for points within a circle is not exactly the same as looking for shapes within a circle, so I would expect other differences that could translate into different APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, if need be and these classes ends up being useless, we will be able to remove them.
I kinda liked the symmetry with the other types too.
/** | ||
* Distance units. | ||
*/ | ||
public enum DistanceUnit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to move this and the other classes in this package to another package... e.g. org.hibernate.search.v6poc.spatial
.
Up until now it was almost acceptable to have them in a backend-related package, since you basically had to be dealing with low-level interfaces to need GeoPoint
and such, but now these classes also appear in the Search DSL, so it's becoming really weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
/** | ||
* Distance units. | ||
*/ | ||
public enum DistanceUnit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you have a look at JSR-108? From what I recall it's probably too generic, but I may be wrong...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look at this one: https://github.com/unitsofmeasurement .
It's quite complex and I don't think it's worth adding a dependency for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
@Override | ||
protected Query buildQuery() { | ||
return LatLonPoint.newBoxQuery( absoluteFieldPath, boundingBox.getBottomRight().getLatitude(), boundingBox.getTopLeft().getLatitude(), | ||
boundingBox.getTopLeft().getLatitude(), boundingBox.getBottomRight().getLongitude() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see three calls to getLatitude
here, is it normal? Also, if the tests didn't catch that, you might want to add some tests.
If it's normal, maybe a comment would help here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test.
|
||
@Override | ||
public void order(SortOrder order) { | ||
// TODO contribute the support of descending order to Lucene |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create a ticket for that if there isn't one already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To future me: https://hibernate.atlassian.net/browse/HSEARCH-3193
public DistanceSortBuilder<LuceneSearchSortCollector> distance(String absoluteFieldPath, GeoPoint location) { | ||
LuceneIndexSchemaFieldNode<?> schemaNode = searchTargetModel.getSchemaNode( absoluteFieldPath ); | ||
if ( !schemaNode.getSortContributor().supportsSortingByDistance() ) { | ||
throw log.distanceOperationsNotSupportedByFieldType( absoluteFieldPath ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like the idea of introducing a spatial-specific getter just for that, but more importantly I don't see the point. For field sorts on geo point fields, we already throw an error late (when calling contribute
on the builder), so why wouldn't we do the same for distance sorts on non-geopoint fields, and throw an error when contributeDistanceSort is called?
Alternatively, if you think early failure has value in this case, we could introduce SearchSortBuilderFactories
like we do for predicates. In this case that would only require two factories:
- a "simple"/"standard" one with a "field sort" factory method that returns a builder and a "distance sort" factory method that throws an error
- a "geopoint" one with a "field sort" factory method that throws an error and a "distance sort" factory method that returns a builder
I think that would be more future-proof regardless of whether we want early errors or not, by the way, but that's just my opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
|
||
private void initData() { | ||
ChangesetIndexWorker<? extends DocumentElement> worker = indexManager.createWorker( sessionContext ); | ||
// Important: do not index the documents in the expected order after sorts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, apparently "first/second/third" is not the expected order after sorts here... Maybe something is wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hu?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message was originally put in a test where we would index in this order: "second", "first", "third"; and then (in the test method) run a search query that would result in this order: "first", "second", "third".
Here the "first", "second", "third" labels do not correspond to anything... Not the indexing order, not the sorted order. So I was just pointing out that maybe the constants should be renamed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK :). Fixed then.
query = simpleQuery( b -> b.byDistance( "geoPoint", 45.757864, 4.834496 ) ); | ||
|
||
DocumentReferencesSearchResultAssert.assertThat( query ) | ||
.hasReferencesHitsExactOrder( indexName, FIRST_ID, THIRD_ID, SECOND_ID, EMPTY_ID ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe comment here, explaining why we don't test descending sorts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
…actStandardLuceneFieldSortContributor
Merged! Thanks a lot :) |
This is the initial support for spatial queries. It contains querying and sorting but does not contain the projection stuff yet.
As for the projection stuff, we either need a projection DSL to be able to define the location or we could only support it for the circle predicate and find a hackyish way to push the information from the query to the projection stuff (not very excited by the latter).
I'm still a bit torn about the API for circle queries: it's more consistent this way but I wonder if user won't search for distance queries or proximity queries rather than having the entities in a circle.