Use dfs_query_then_fetch to make elasticsearch scoring more consistent #6112
Labels: component:api, component:search, migration:no-jira, migration:2024, repository:addons-server, state:verified_fixed, triaged
By default, Elasticsearch calculates document scores per shard only, for performance reasons. This means that if you're unlucky and some of your shards contain more documents relevant to your query than the others, the score, and therefore the position, of each document in the results can be inconsistent with how relevant it is.
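To make the per-shard effect concrete, here is a minimal sketch using the Lucene-style BM25 inverse document frequency formula. The shard sizes and document frequencies are made-up numbers for illustration, not measurements from our index.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """Lucene-style BM25 IDF: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical shards: (documents in shard, documents containing the term).
shards = {"A": (1000, 5), "B": (1000, 200)}

# Default behaviour: each shard scores with its own local statistics.
local_idf = {name: bm25_idf(n_docs, df) for name, (n_docs, df) in shards.items()}

# Index-wide statistics, combined across all shards.
total_docs = sum(n_docs for n_docs, _ in shards.values())
total_df = sum(df for _, df in shards.values())
global_idf = bm25_idf(total_docs, total_df)

for name, idf in local_idf.items():
    print(f"shard {name}: local idf = {idf:.3f}")
print(f"all shards: global idf = {global_idf:.3f}")
```

With these numbers the same term is weighted much more heavily on shard A than on shard B, so otherwise identical documents would be ranked differently depending on which shard holds them.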
Elasticsearch provides an optional search type, dfs_query_then_fetch, that makes the scoring consistent by using all shards to compute term frequencies instead of each shard's own, at a small performance cost.
We briefly had that through a waffle but reverted it because the waffle itself had an additional performance cost, as it was executing a database/cache query in the search API. We should try enabling dfs_query_then_fetch unconditionally instead.
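For reference, a rough sketch of what "unconditionally" could look like at the REST level: the search type is passed as the search_type query-string parameter on the _search endpoint. The host, the index name and the search_url helper are hypothetical, and the real change would presumably go through whatever Python client the search API already uses rather than building URLs by hand.

```python
from urllib.parse import urlencode


def search_url(base_url: str, index: str,
               search_type: str = "dfs_query_then_fetch") -> str:
    """Build a _search URL with the search_type parameter always set.

    Hypothetical helper: there is no flag lookup here, so no extra
    database/cache query is needed to decide which search type to use.
    """
    query_string = urlencode({"search_type": search_type})
    return f"{base_url.rstrip('/')}/{index}/_search?{query_string}"


# Assumed host and index name, for illustration only.
url = search_url("http://localhost:9200", "addons")
print(url)
```

Hard-coding the default avoids the per-request waffle lookup that caused the earlier revert.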