Skip to content
This repository was archived by the owner on Sep 7, 2018. It is now read-only.

Conversation

gsmet
Copy link
Member

@gsmet gsmet commented May 29, 2018

This is the initial support for spatial queries. It contains querying and sorting but does not contain the projection stuff yet.

As for the projection stuff, we either need a projection DSL to be able to define the location or we could only support it for the circle predicate and find a hackyish way to push the information from the query to the projection stuff (not very excited by the latter).

I'm still a bit torn about the API for circle queries: it's more consistent this way but I wonder if user won't search for distance queries or proximity queries rather than having the entities in a circle.

@coveralls
Copy link

coveralls commented May 29, 2018

Pull Request Test Coverage Report for Build 488

  • 312 of 341 (91.5%) changed or added relevant lines in 47 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.3%) to 84.174%

Changes Missing Coverage Covered Lines Changed/Added Lines %
backend-elasticsearch/src/main/java/org/hibernate/search/v6poc/backend/elasticsearch/search/sort/impl/SearchSortFactoryImpl.java 2 3 66.67%
backend-elasticsearch/src/main/java/org/hibernate/search/v6poc/backend/elasticsearch/types/codec/impl/ElasticsearchFieldCodec.java 0 1 0.0%
backend-lucene/src/main/java/org/hibernate/search/v6poc/backend/lucene/search/sort/impl/DistanceSortBuilderImpl.java 10 11 90.91%
backend-lucene/src/main/java/org/hibernate/search/v6poc/backend/lucene/types/codec/impl/GeoPointFieldCodec.java 5 6 83.33%
backend-lucene/src/main/java/org/hibernate/search/v6poc/backend/lucene/types/sort/impl/AbstractStandardLuceneFieldSortContributor.java 2 3 66.67%
backend-lucene/src/main/java/org/hibernate/search/v6poc/backend/lucene/types/sort/impl/GeoPointFieldSortContributor.java 4 5 80.0%
engine/src/main/java/org/hibernate/search/v6poc/search/dsl/sort/impl/DistanceSortContextImpl.java 9 11 81.82%
engine/src/main/java/org/hibernate/search/v6poc/search/dsl/sort/spi/DelegatingSearchSortContainerContextImpl.java 0 2 0.0%
engine/src/main/java/org/hibernate/search/v6poc/spatial/ImmutableGeoBoundingBox.java 12 21 57.14%
engine/src/main/java/org/hibernate/search/v6poc/spatial/ImmutableGeoPolygon.java 26 36 72.22%
Totals Coverage Status
Change from base Build 483: 0.3%
Covered Lines: 8377
Relevant Lines: 9952

💛 - Coveralls

Copy link
Member

@yrodiere yrodiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are my comments. It's really nice, in particular adding support for bounding boxes and polygons!

As for the projection stuff, we either need a projection DSL to be able to define the location or we could only support it for the circle predicate and find a hackyish way to push the information from the query to the projection stuff (not very excited by the latter).

Right. I created https://hibernate.atlassian.net/browse/HSEARCH-3190 so we don't forget.

I'm still a bit torn about the API for circle queries: it's more consistent this way but I wonder if user won't search for distance queries or proximity queries rather than having the entities in a circle.

We can review that later if you want. Add it to the Search 6 todo document, maybe?


public static void assertNotNull(Object object, String objectDescription) {
if ( object == null ) {
throw log.mustNotBeNull( objectDescription );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method does not exist at this point. You should move the changes to the Log interface from the commit "HSEARCH-3087 Add support for within circle spatial predicates" to this commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, fixed.

import org.hibernate.search.v6poc.search.dsl.sort.SortOrder;

abstract class AbstractScalarLuceneFieldSortContributor implements LuceneFieldSortContributor {
abstract class AbstractSimpleLuceneFieldSortContributor implements LuceneFieldSortContributor {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You used the term "standard" to mean "everything but GeoPoint" in StandardFieldPredicateBuilderFactory.
Maybe you could use the same term in both cases? I don't really care if it's "simple" or "standard", but at least let's stick to one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, done.

import org.hibernate.search.v6poc.search.predicate.spi.SpatialWithinCirclePredicateBuilder;


public abstract class AbstractSpatialWithinCirclePredicateBuilder<T> extends AbstractSearchPredicateBuilder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I see the point of these abstract classes (AbstractSpatialWithinCirclePredicateBuilder, AbstractSpatialWithinBoundingBoxPredicateBuilder, AbstractSpatialWithinPolygonPredicateBuilder), since they each only have one subclass?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the point of them is for the hopefully upcoming support of storing polygons or other spatial types in the index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it is perfectly safe to use exactly the same DSL interfaces and builder interface for those, since there may be parameters that make sense in one case and not another. But ok, let's leave this for now: it's a problem for another day...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow you: if you want to search a polygon in a circle, you will need to define the circle. That's all that is in AbstractSpatialWithinCirclePredicateBuilder.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am mainly talking about additional, optional parameters. I can imagine searching for polygons within a circle being similar, but slightly different from search for points within a circle.

In particular, when searching for points within a circle, you will use a distance query, which involves picking some distance computation algorithm (arc or plane in Elasticsearch).
When searching for polygons within a circle, you will not use a distance query, at least not in Elasticsearch; you will use a geo_shape query. And this query does not allow you to pick a distance computation algorithm (maybe because it doesn't use one).

As a result, if we decide to expose a .distanceAlgorithm(DistanceAlgorithmEnum) method on the object returned by a call to circle, our API will be dodgy: in some cases ("point" fields) that method can be called, in some other cases ("shape" fields) it cannot.

That's just an example though: looking for points within a circle is not exactly the same as looking for shapes within a circle, so I would expect other differences that could translate into different APIs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if need be and these classes ends up being useless, we will be able to remove them.

I kinda liked the symmetry with the other types too.

/**
* Distance units.
*/
public enum DistanceUnit {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to move this and the other classes in this package to another package... e.g. org.hibernate.search.v6poc.spatial.

Up until now it was almost acceptable to have them in a backend-related package, since you basically had to be dealing with low-level interfaces to need GeoPoint and such, but now these classes also appear in the Search DSL, so it's becoming really weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/**
* Distance units.
*/
public enum DistanceUnit {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you have a look at JSR-108? From what I recall it's probably too generic, but I may be wrong...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at this one: https://github.com/unitsofmeasurement .

It's quite complex and I don't think it's worth adding a dependency for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.

@Override
protected Query buildQuery() {
return LatLonPoint.newBoxQuery( absoluteFieldPath, boundingBox.getBottomRight().getLatitude(), boundingBox.getTopLeft().getLatitude(),
boundingBox.getTopLeft().getLatitude(), boundingBox.getBottomRight().getLongitude() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see three calls to getLatitude here, is it normal? Also, if the tests didn't catch that, you might want to add some tests.
If it's normal, maybe a comment would help here...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test.


@Override
public void order(SortOrder order) {
// TODO contribute the support of descending order to Lucene
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create a ticket for that if there isn't one already.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public DistanceSortBuilder<LuceneSearchSortCollector> distance(String absoluteFieldPath, GeoPoint location) {
LuceneIndexSchemaFieldNode<?> schemaNode = searchTargetModel.getSchemaNode( absoluteFieldPath );
if ( !schemaNode.getSortContributor().supportsSortingByDistance() ) {
throw log.distanceOperationsNotSupportedByFieldType( absoluteFieldPath );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the idea of introducing a spatial-specific getter just for that, but more importantly I don't see the point. For field sorts on geo point fields, we already throw an error late (when calling contribute on the builder), so why wouldn't we do the same for distance sorts on non-geopoint fields, and throw an error when contributeDistanceSort is called?

Alternatively, if you think early failure has value in this case, we could introduce SearchSortBuilderFactories like we do for predicates. In this case that would only require two factories:

  • a "simple"/"standard" one with a "field sort" factory method that returns a builder and a "distance sort" factory method that throws an error
  • a "geopoint" one with a "field sort" factory method that throws an error and a "distance sort" factory method that returns a builder

I think that would be more future-proof regardless of whether we want early errors or not, by the way, but that's just my opinion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.


private void initData() {
ChangesetIndexWorker<? extends DocumentElement> worker = indexManager.createWorker( sessionContext );
// Important: do not index the documents in the expected order after sorts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, apparently "first/second/third" is not the expected order after sorts here... Maybe something is wrong?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hu?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message was originally put in a test where we would index in this order: "second", "first", "third"; and then (in the test method) run a search query that would result in this order: "first", "second", "third".
Here the "first", "second", "third" labels do not correspond to anything... Not the indexing order, not the sorted order. So I was just pointing out that maybe the constants should be renamed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK :). Fixed then.

query = simpleQuery( b -> b.byDistance( "geoPoint", 45.757864, 4.834496 ) );

DocumentReferencesSearchResultAssert.assertThat( query )
.hasReferencesHitsExactOrder( indexName, FIRST_ID, THIRD_ID, SECOND_ID, EMPTY_ID );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comment here, explaining why we don't test descending sorts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@yrodiere yrodiere merged commit 58e0578 into hibernate:master May 31, 2018
@yrodiere
Copy link
Member

Merged! Thanks a lot :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants