Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply boost only once for distance_feature query #63767

Merged

Conversation

mayya-sharipova
Copy link
Contributor

Currently if distance_feature query contains boost,
it incorrectly gets applied twice: in AbstractQueryBuilder::toQuery and
we also pass this boost to Lucene's LongPoint.newDistanceFeatureQuery.
As a result we get incorrect scores.

This fixes this error to ensure that boost is applied only once.

Closes #63691

Currently if distance_feature query contains boost,
it incorrectly  gets applied twice: in AbstractQueryBuilder::toQuery and
we also pass this boost to Lucene's LongPoint.newDistanceFeatureQuery.
As a result we get incorrect scores.

This fixes this error to ensure that boost is applied only once.

Closes elastic#63691
@mayya-sharipova mayya-sharipova added the :Search/Search Search-related issues that do not fall into other categories label Oct 15, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 15, 2020
@mayya-sharipova mayya-sharipova added >bug v7.1.1 v8.0.0 and removed Team:Search Meta label for search team labels Oct 15, 2020
@mayya-sharipova
Copy link
Contributor Author

I will update release notes for 7.11 with this change.

@mayya-sharipova
Copy link
Contributor Author

@nik9000 I was wondering if we need to adjust also the corresponding LongScriptFieldDistanceFeatureQuery, looks like there boost is applied twice as well.

@nik9000
Copy link
Member

nik9000 commented Oct 15, 2020

LongScriptFieldDistanceFeatureQuery

Probably!

@nik9000
Copy link
Member

nik9000 commented Oct 15, 2020

Probably!

Do you want to do it as part of this PR or should I take it as a follow up?

@mayya-sharipova
Copy link
Contributor Author

mayya-sharipova commented Oct 15, 2020

@nik9000

Do you want to do it as part of this PR or should I take it as a follow up?

Let's do it in a follow up. If you have time, please do it. If you are busy, I don't mind doing this.

@nik9000
Copy link
Member

nik9000 commented Oct 15, 2020

Let's do it in a follow up. If you have time, please do it. If you are busy, I don't mind doing this.

My stack is non-trivial at the moment but I've added it. It'll be next week probably.

@jtibshirani jtibshirani self-requested a review October 15, 2020 19:34
Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me too ! Just left a suggestion around testing.

@@ -112,7 +112,8 @@ protected Query doToQuery(QueryShardContext context) throws IOException {
if (fieldType == null) {
return Queries.newMatchNoDocsQuery("Can't run [" + NAME + "] query on unmapped fields!");
}
return fieldType.distanceFeatureQuery(origin.origin(), pivot, boost, context);
// As we already apply boost in AbstractQueryBuilder::toQuery, we always passing a boost of 1.0 to distanceFeatureQuery
return fieldType.distanceFeatureQuery(origin.origin(), pivot, 1.0f, context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we fix LongScriptFieldDistanceFeatureQuery, I think we could remove the boost parameter altogether from MappedFieldType#distanceFeatureQuery ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I or Nik will do that.

@@ -1626,6 +1627,33 @@ public void testRangeQueryWithLocaleMapping() throws Exception {
assertHitCount(searchResponse, 2L);
}

public void testDistanceFeatureQuery() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to rely on unit tests like DistanceFeatureQueryBuilderTests or a REST test instead of an integration test. We try to avoid new ESIntegTestCase cases unless a full cluster is really needed.

Copy link
Contributor Author

@mayya-sharipova mayya-sharipova Oct 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtibshirani Thanks for the suggestion. I was trying to make this test in REST yml tests, but yml tests don't have nice comparison of float scores (you can't set up delta, and also there are errors with double -> float conversions).

I also considered adding this test in DistanceFeatureQueryBuilderTests, but then I need to work on the Lucene level there. But what I wanted to test is how ES parses this query and what scores are returned. I could not find a smart way to do that in that test. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yml tests don't have nice comparison of float scores

Got it, that's good to know.

But what I wanted to test is how ES parses this query and what scores are returned.

I guess we already test in DistanceFeatureQueryBuilderTests that the query is parsed correctly, since we check LatLonPointDistanceFeatureQuery has a boost of 1.0f. I also don't see a convenient way to test query scoring in a unit test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtibshirani

I guess we already test in DistanceFeatureQueryBuilderTests that the query is parsed correctly, since we check LatLonPointDistanceFeatureQuery has a boost of 1.0f.

Indeed, the tests in DistanceFeatureQueryBuilderTests should be enough for this PR. I've removed a test from SearchQueryIT.java in 7cbe160. We can introduce this test on scores later, when we need to test the correct scores.

@jimczi jimczi added v7.10.1 and removed v7.1.1 labels Oct 16, 2020
@mayya-sharipova mayya-sharipova merged commit f42d179 into elastic:master Oct 16, 2020
@mayya-sharipova mayya-sharipova deleted the distance-feature-correct-boost branch October 16, 2020 13:49
mayya-sharipova added a commit that referenced this pull request Oct 16, 2020
Currently if distance_feature query contains boost,
it incorrectly  gets applied twice: in AbstractQueryBuilder::toQuery and
we also pass this boost to Lucene's LongPoint.newDistanceFeatureQuery.
As a result we get incorrect scores.

This fixes this error to ensure that boost is applied only once.

Closes #63691
mayya-sharipova added a commit that referenced this pull request Oct 16, 2020
Currently if distance_feature query contains boost,
it incorrectly  gets applied twice: in AbstractQueryBuilder::toQuery and
we also pass this boost to Lucene's LongPoint.newDistanceFeatureQuery.
As a result we get incorrect scores.

This fixes this error to ensure that boost is applied only once.

Closes #63691
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Oct 20, 2020
This drops the `boost` parameter of the `distance_feature` query builder
internally, relying on our query building infrastructure to wrap the
query in a `boosting` query.

Relates to elastic#63767
nik9000 added a commit that referenced this pull request Oct 21, 2020
This drops the `boost` parameter of the `distance_feature` query builder
internally, relying on our query building infrastructure to wrap the
query in a `boosting` query.

Relates to #63767
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Oct 21, 2020
This drops the `boost` parameter of the `distance_feature` query builder
internally, relying on our query building infrastructure to wrap the
query in a `boosting` query.

Relates to elastic#63767
nik9000 added a commit that referenced this pull request Oct 21, 2020
This drops the `boost` parameter of the `distance_feature` query builder
internally, relying on our query building infrastructure to wrap the
query in a `boosting` query.

Relates to #63767
@andreidan andreidan added v7.10.0 and removed v7.10.1 labels Oct 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

distance_feature queries get their boost applied twice
7 participants