Use `dfs_query_then_fetch` to make elasticsearch scoring more consistent #6112

diox · 2018-10-22T10:55:45Z

By default, Elasticsearch calculates document scores per shard, for performance reasons. This means that if you're unlucky and some of your shards contain more documents relevant to your query than others, the score, and therefore the position, of each document in the results can be inconsistent with how relevant it actually is.
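To illustrate the effect, here is a rough sketch assuming a BM25-style inverse document frequency (the shard names and document counts are made up, and term frequency and length normalization are ignored). It shows how the same term can get a very different weight depending on which shard a document happens to live on:

```python
import math


def bm25_idf(total_docs, docs_with_term):
    """BM25-style inverse document frequency for a single term."""
    return math.log(1 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))


# Hypothetical shards: (documents on the shard, documents containing the term).
shards = {"shard_a": (100, 50), "shard_b": (100, 2)}

# Per-shard scoring: each shard only sees its own statistics.
local_idf = {name: bm25_idf(n, df) for name, (n, df) in shards.items()}

# Index-wide scoring: statistics are summed over all shards first.
total_docs = sum(n for n, _ in shards.values())
total_df = sum(df for _, df in shards.values())
global_idf = bm25_idf(total_docs, total_df)

for name, idf in local_idf.items():
    print(f"{name}: local idf = {idf:.3f}")
print(f"all shards: global idf = {global_idf:.3f}")
```

In this sketch a matching document on `shard_b` gets a much higher term weight than an equally relevant one on `shard_a`, purely because of where it is stored.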

Elasticsearch provides an optional search type, `dfs_query_then_fetch`, that makes scoring consistent by computing term frequencies across all shards rather than per shard, at a small performance cost.
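For reference, the search type is passed as the `search_type` query-string parameter on the `_search` endpoint. A minimal sketch of building such a request with only the standard library follows; the host, index name, and query are placeholders and not taken from our code, and `build_search_request` is a hypothetical helper:

```python
import json
from urllib.parse import urlencode


def build_search_request(base_url, index, query, search_type="dfs_query_then_fetch"):
    """Return the (url, body) pair for a _search call using the given search type."""
    params = urlencode({"search_type": search_type})
    url = f"{base_url.rstrip('/')}/{index}/_search?{params}"
    body = json.dumps({"query": query})
    return url, body


# Placeholder host, index, and query, for illustration only.
url, body = build_search_request(
    "http://localhost:9200",
    "addons",
    {"match": {"name": "adblock"}},
)
print(url)
print(body)
```

Client libraries generally expose the same parameter, so in practice this would likely be set wherever the search API builds its query rather than by hand-assembling URLs.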

We briefly had that through a waffle but reverted it because the waffle itself had an additional performance cost, as it was executing a database/cache query in the search API. We should try enabling dfs_query_then_fetch unconditionally instead.

diox · 2018-11-08T14:10:51Z

QA: this is hard to test, but make sure search is still behaving correctly and there are no huge unexpected performance regressions with the search API.

AlexandraMoga · 2018-11-12T14:54:04Z

I didn't find regressions around search relevancy or performance on AMO -dev.
I'm marking the issue as verified fixed on AMO -dev with FF63, Win10x64.

diox added component:api component:search labels Oct 22, 2018

diox self-assigned this Oct 22, 2018

addons-robot added the triaged label Oct 23, 2018

diox mentioned this issue Oct 23, 2018

Different results after refresh in search API response for relevance #3531

Closed

diox modified the milestones: 2018.11.01, 2018.11.08, 2018.11.15 Oct 24, 2018

This was referenced Nov 8, 2018

Use dfs_query_then_fetch to make scoring consistent across shards mozilla/addons-server#9932

Merged

Terms in description are not found #731

Closed

diox closed this as completed in mozilla/addons-server#9932 Nov 8, 2018

AlexandraMoga added the state:verified_fixed label Nov 12, 2018

KevinMind added the migration:no-jira label May 4, 2024

KevinMind transferred this issue from mozilla/addons-server May 4, 2024

KevinMind added repository:addons-server migration:2024 labels May 4, 2024
