Support multiplicative boosts #328

@mkr

The Querqy rewrite chain currently allows boosting result documents that match one or more boost queries. These boost queries are collected in the rewrite chain as boostUpQueries and boostDownQueries on the querqy.model.ExpandedQuery. During query conversion they are either added as optional clauses to the Lucene main query (as Occur.SHOULD queries) or applied by wrapping the main query in a QuerqyReRankQuery, where the scores of the main and boost queries are added in a separate re-rank step. In both cases boost query scores are added to the scores of the main query.

Score ranges can vary widely between queries, depending on the terms of the query, the document frequency and the specifics of your index, so the effect of additive boosts becomes hard to predict. A boost of 3 would dominate if the top 3 results have scores of 0.9, 0.5 and 0.2, respectively, but would have little impact if your top 3 results have scores of 90, 50 and 20, respectively. Multiplicative boosts are often the preferred solution: a *3 boost (conditional on matching a boost query) has the same relative impact, no matter the magnitude of the scores of the main query.
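The difference can be checked with a small, self-contained Java sketch. It is only an illustration of the arithmetic, not Querqy code: the class and method names (BoostComparison, rankAfterBoost) are made up for this example, and it assumes the boost query matches only the third-ranked document.

```java
final class BoostComparison {

    // Additive boost: the boost value is added to the main query score.
    static double additive(double score, double boost) {
        return score + boost;
    }

    // Multiplicative boost: the main query score is scaled by the factor.
    static double multiplicative(double score, double factor) {
        return score * factor;
    }

    // 1-based rank of the boosted document, given the unboosted scores of all
    // documents and the new score of the document at index `boosted`.
    static int rankAfterBoost(double[] scores, int boosted, double boostedScore) {
        int rank = 1;
        for (int i = 0; i < scores.length; i++) {
            if (i != boosted && scores[i] > boostedScore) {
                rank++;
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        double[] lowScores = {0.9, 0.5, 0.2};
        double[] highScores = {90, 50, 20};
        int boosted = 2; // assumption: only the third document matches the boost query

        for (double[] scores : new double[][] {lowScores, highScores}) {
            int additiveRank =
                    rankAfterBoost(scores, boosted, additive(scores[boosted], 3));
            int multiplicativeRank =
                    rankAfterBoost(scores, boosted, multiplicative(scores[boosted], 3));
            System.out.println("top score " + scores[0]
                    + ": rank with +3 = " + additiveRank
                    + ", rank with *3 = " + multiplicativeRank);
        }
    }
}
```

With the scores from the example above, the additive boost moves the third document to rank 1 in the low-score list and leaves it at rank 3 in the high-score list, while the *3 factor moves it to rank 2 in both.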

As a first step to support multiplicative boosts in Querqy I suggest:

  • A multiplicative up/down boost can already be modeled in a querqy.model.BoostQuery. We use the float boost to encode the boost factor. A value 0 <= n < 1 is a down-boost, a value n > 1 is an up-boost.
  • We extend the querqy.model.ExpandedQuery with a Collection<BoostQuery> multiplicativeBoostQueries to keep additive and multiplicative boosts separate and be fully backwards-compatible.
  • We support generating these multiplicative boosts in the querqy.rewrite.commonrules.CommonRulesRewriter.
  • We extend the querqy.rewrite.commonrules.model.BoostInstruction and add an enum parameter BoostMethod (either ADDITIVE or MULTIPLICATIVE).
  • We extend the querqy.rewrite.rules.factory.config.RuleParserConfig and add a flag createMultiplicativeBoosts. If it is set to true (default: false), the parser generates BoostInstructions with method MULTIPLICATIVE, otherwise ADDITIVE.
  • The instructions are parsed as they are now, but we interpret UP(10) as a factor of 10 (* 10) and DOWN(10) as a factor of 1/10 (* 1/10).
  • In the querqy.lucene.QueryParsingController we convert the BoostQueries to ValueSources of the form If(query, boostFactor, 1), wrap them in MultiplicativeBoostValueSources and combine them with the score of the main query via a FunctionScoreQuery. That is, we multiply by the boost factor if a document matches the boost query, but ignore the BM25 score of the boost query match.
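The intended semantics of the last two points could be sketched roughly as follows. This is plain Java without any Querqy or Lucene classes. BoostMethod with ADDITIVE and MULTIPLICATIVE is taken from the proposal; everything else (MultiplicativeBoostSketch, BoostDirection, Boost, toFactor, score) is a hypothetical stand-in, a boost query is reduced to a simple match predicate on a document string, and the sketch assumes boost values greater than 0.

```java
import java.util.List;
import java.util.function.Predicate;

final class MultiplicativeBoostSketch {

    // From the proposal: how a BoostInstruction should be applied.
    enum BoostMethod { ADDITIVE, MULTIPLICATIVE }

    // Hypothetical: direction of a parsed UP(n) / DOWN(n) instruction.
    enum BoostDirection { UP, DOWN }

    // UP(n) is read as a factor n, DOWN(n) as a factor 1/n.
    // Assumption of this sketch: n must be greater than 0.
    static float toFactor(BoostDirection direction, float boost) {
        if (boost <= 0f) {
            throw new IllegalArgumentException("boost must be > 0: " + boost);
        }
        return direction == BoostDirection.UP ? boost : 1f / boost;
    }

    // Hypothetical stand-in for a boost query: a match test plus its factor.
    static final class Boost {
        final Predicate<String> matches;
        final float factor;

        Boost(Predicate<String> matches, float factor) {
            this.matches = matches;
            this.factor = factor;
        }
    }

    // Mirrors If(query, boostFactor, 1): a matching boost contributes its
    // factor, a non-matching boost contributes 1. The boost query's own
    // relevance score is never used.
    static float score(String doc, float mainScore, List<Boost> boosts) {
        float product = 1f;
        for (Boost boost : boosts) {
            product *= boost.matches.test(doc) ? boost.factor : 1f;
        }
        return mainScore * product;
    }

    public static void main(String[] args) {
        List<Boost> boosts = List.of(
                new Boost(d -> d.contains("laptop"), toFactor(BoostDirection.UP, 10f)),
                new Boost(d -> d.contains("used"), toFactor(BoostDirection.DOWN, 10f)));

        System.out.println(score("laptop bag", 2f, boosts));
        System.out.println(score("used laptop", 2f, boosts));
        System.out.println(score("phone case", 2f, boosts));
    }
}
```

Because non-matching boosts contribute a neutral factor of 1, several multiplicative boosts can be combined by taking their product before scaling the main query score.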
