Description
The Querqy rewrite chain currently allows boosting result documents that match one or more boost queries. These boost queries are collected in the rewrite chain as `boostUpQueries` and `boostDownQueries` on the `querqy.model.ExpandedQuery`. During query conversion they are either added as optional clauses to the Lucene main query (as `Occur.SHOULD` queries) or applied by wrapping the main query in a `QuerqyReRankQuery`, where the scores of the main and boost queries are added in a separate re-rank step. In both cases the boost query scores are added to the scores of the main query.
As score ranges can vary widely between queries depending on the query terms, the document frequencies and the specifics of your index, the effect of additive boosts becomes hard to predict. A boost of 3 would dominate if the top 3 results have scores of 0.9, 0.5 and 0.2, respectively, but would not change the ranking if your top 3 results have scores of 90, 50 and 20, respectively. Multiplicative boosts are often the preferred solution. A boost of `*3` (conditional on matching a boost query) has the same relative impact, no matter the magnitude of the scores of the main query.
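To make the arithmetic concrete, here is a minimal sketch in plain Java. It assumes that only the third-ranked document matches the boost query; the class and method names (`BoostComparison`, `additive`, `multiplicative`, `rankAfterBoost`) are hypothetical and not part of Querqy.

```java
/** Hypothetical demo class (not part of Querqy): additive vs. multiplicative boosts. */
final class BoostComparison {

    /** Additive boost: the boost value is added to the main query score on a match. */
    static double additive(double mainScore, double boost, boolean matchesBoostQuery) {
        return matchesBoostQuery ? mainScore + boost : mainScore;
    }

    /** Multiplicative boost: the main query score is multiplied by the factor on a match. */
    static double multiplicative(double mainScore, double factor, boolean matchesBoostQuery) {
        return matchesBoostQuery ? mainScore * factor : mainScore;
    }

    /** 1-based rank of the document at {@code index} once its score is replaced by {@code boostedScore}. */
    static int rankAfterBoost(double[] scores, int index, double boostedScore) {
        int rank = 1;
        for (int i = 0; i < scores.length; i++) {
            if (i != index && scores[i] > boostedScore) {
                rank++;
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        double[] low = {0.9, 0.5, 0.2};
        double[] high = {90, 50, 20};
        int boosted = 2; // assumption: only the third document matches the boost query

        System.out.println("additive, low scores:        rank "
                + rankAfterBoost(low, boosted, additive(low[boosted], 3, true)));
        System.out.println("additive, high scores:       rank "
                + rankAfterBoost(high, boosted, additive(high[boosted], 3, true)));
        System.out.println("multiplicative, low scores:  rank "
                + rankAfterBoost(low, boosted, multiplicative(low[boosted], 3, true)));
        System.out.println("multiplicative, high scores: rank "
                + rankAfterBoost(high, boosted, multiplicative(high[boosted], 3, true)));
    }
}
```

With the additive boost the matching document jumps to rank 1 for the low score range but stays at rank 3 for the high score range. With the multiplicative boost it lands at rank 2 in both cases.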
As a first step to support multiplicative boosts in Querqy I suggest:
- A multiplicative up/down boost can already be modeled in a `querqy.model.BoostQuery`. We use the `float boost` to encode the boost factor. A value 0 <= n < 1 is a down-boost, a value n > 1 is an up-boost.
- We extend the `querqy.model.ExpandedQuery` with a `Collection<BoostQuery> multiplicativeBoostQueries` to keep additive and multiplicative boosts separate and be fully backwards-compatible.
- We support generating these multiplicative boosts in the `querqy.rewrite.commonrules.CommonRulesRewriter`.
- We extend the `querqy.rewrite.commonrules.model.BoostInstruction` and add an enum parameter `BoostMethod` (either `ADDITIVE` or `MULTIPLICATIVE`).
- We extend the `querqy.rewrite.rules.factory.config.RuleParserConfig` and add a flag `createMultiplicativeBoosts`. If it is set to `true` (default `false`) the parser generates `BoostInstruction`s with method `MULTIPLICATIVE`, otherwise `ADDITIVE`.
- The instructions are parsed as they are now but we interpret an `UP(10)` as a `* 10` and a `DOWN(10)` as a `* 1/10`.
- In the `querqy.lucene.QueryParsingController` we convert the BoostQueries to ValueSources `If(query, boostFactor, 1)`, which we wrap into `MultiplicativeBoostValueSources` and combine via `FunctionScoreQuery` with the score of the main query, i.e. we boost by the boost factor if a document matches the boost query but ignore the BM25 score of the boost query match.
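
The proposed scoring semantics can be sketched in plain Java, without any Lucene or Querqy classes. This is only an illustration under assumptions: `MultiplicativeBoostSketch`, `Direction`, `toFactor`, `ifMatches` and `score` are hypothetical names, and the boolean `matches` array stands in for the result of evaluating each boost query against a document.

```java
/** Hypothetical sketch (not Querqy or Lucene code) of the proposed multiplicative semantics. */
final class MultiplicativeBoostSketch {

    /** Direction of a boost instruction, mirroring UP(n) and DOWN(n) in the rules. */
    enum Direction { UP, DOWN }

    /** UP(n) is interpreted as a factor of n, DOWN(n) as a factor of 1/n. */
    static float toFactor(Direction direction, float value) {
        if (value <= 0f) {
            throw new IllegalArgumentException("boost value must be > 0");
        }
        return direction == Direction.UP ? value : 1f / value;
    }

    /** Mirrors If(query, boostFactor, 1): the factor applies only on a match. */
    static float ifMatches(boolean matchesBoostQuery, float boostFactor) {
        return matchesBoostQuery ? boostFactor : 1f;
    }

    /**
     * Final score: main query score times the product of all matching boost factors.
     * The score of the boost query match itself is ignored.
     */
    static double score(double mainScore, float[] factors, boolean[] matches) {
        double result = mainScore;
        for (int i = 0; i < factors.length; i++) {
            result *= ifMatches(matches[i], factors[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        float up = toFactor(Direction.UP, 10f);     // UP(10)   -> * 10
        float down = toFactor(Direction.DOWN, 10f); // DOWN(10) -> * 1/10
        float[] factors = {up, down};

        // Document matches only the up-boost query.
        System.out.println(score(2.0, factors, new boolean[] {true, false}));
        // Document matches neither boost query: score is unchanged.
        System.out.println(score(2.0, factors, new boolean[] {false, false}));
    }
}
```

A document that matches no boost query gets a combined factor of 1, so its main query score is unchanged. This is what lets the multiplicative boosts sit next to the existing additive ones without affecting them.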