Function score query functions needs offset param #18537

JnBrymn-EB · 2016-05-24T02:55:28Z

The current implementation of field_value_factor accepts factor and modifier to shape and scale the resulting function, but there is no way to offset the value of the function.

Consider the following scenario: Documents represent events that one might attend. For a given query the total document score should be a function of the base query score, the distance, and the popularity. The query score is based solely upon the text match; distance uses a geo-based decay function; and popularity is based upon a field_value_factor function with modifier: "sqrt". The function score query allows us to create a boost by either adding or multiplying the distance and popularity values. For our purposes it doesn't make sense to sum popularity and distance values -- a distant event that no one can attend would nonetheless rank highly based solely upon its high popularity. So instead we will multiply the popularity and distance boosts.

But there's a problem - popularity is based upon number of tickets already sold and if 0 tickets are sold, then the popularity will be 0. Since we are multiplying the boosts together and since the query score is multiplied by the boosts, this means that the total score for those events will also be 0. Thus new events with no tickets sold are all but eliminated from the search results.

We are prepared to resolve this issue with a script_score function, but this is not ideal. I propose introducing an offset parameter for field_value_factor so that the value of the function would be factor*modifier(field) + offset. With a positive offset, popularity could never be zero.
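To make the arithmetic concrete, here is a minimal Python sketch of the multiplicative scoring described above. It is not Elasticsearch code: the function names, the sample numbers, and the offset argument are all illustrative assumptions, with offset standing in for the parameter proposed in this issue.

```python
import math


def popularity_boost(tickets_sold, factor=1.0, offset=0.0):
    """factor * sqrt(field) + offset, where offset is the proposed (hypothetical) parameter."""
    return factor * math.sqrt(tickets_sold) + offset


def total_score(query_score, distance_boost, tickets_sold, offset=0.0):
    # Boosts are multiplied together and then multiplied into the query score.
    return query_score * distance_boost * popularity_boost(tickets_sold, offset=offset)


# A new event with no tickets sold: a good text match and a nearby venue cannot save it.
new_event = total_score(query_score=2.0, distance_boost=0.5, tickets_sold=0)

# The same event with an offset of 1.0 keeps the score contributed by text and distance.
new_event_with_offset = total_score(query_score=2.0, distance_boost=0.5, tickets_sold=0, offset=1.0)

print(new_event, new_event_with_offset)
```

Without the offset the new event scores 0 no matter how well it matches; with an offset of 1.0 it falls back to query_score * distance_boost.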

The same problem exists with decay functions. When using functions multiplicatively there are times when it would be beneficial to have a non-zero minimum value.


clintongormley · 2016-05-24T09:34:10Z

@brwe what do you think about this? I'm loath to add more parameters unless they are genuinely and generically useful, given that this can all be achieved with a script (and will be improved with #17116).

JnBrymn-EB · 2016-05-24T14:56:15Z

@clintongormley your prior issue #6955 is relevant here. There you needed a weight for all of the functions - but when using the functions multiplicatively, just having weights does not make sense. Consider that in the multiplicative use case the total score of a document is query_score*(weight1*field_value1)*(weight2*field_value2) - you can see that the final score with weighting is equal to the original non-weighted score times weight1*weight2. Every document is scaled by the same constant, so the ranking does not change.

Because weight doesn't have any effect on the ranking when the function values are multiplied together, I think there is a need for an offset parameter.
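A short Python sketch of that point, using made-up events and weights (none of these names or numbers come from Elasticsearch): multiplying each function value by a weight rescales every score by weight1*weight2 and leaves the order untouched.

```python
def multiplied_score(query_score, values, weights):
    """query_score * (weight1 * value1) * (weight2 * value2) * ..."""
    score = query_score
    for weight, value in zip(weights, values):
        score *= weight * value
    return score


# Hypothetical events mapped to (popularity value, distance value).
events = {"a": (4.0, 0.5), "b": (1.0, 3.0), "c": (2.0, 2.0)}


def ranking(weights):
    # Best-scoring event first.
    return sorted(
        events,
        key=lambda name: multiplied_score(1.0, events[name], weights),
        reverse=True,
    )


unweighted = ranking((1.0, 1.0))
weighted = ranking((5.0, 2.0))  # every score is simply 10x larger

print(unweighted, weighted)
```

Both calls produce the same ordering, which is why weights alone cannot express "popularity matters more than distance" when the functions are multiplied.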

Consider also that the most basic definition for a linear function is: y = m*x + b. I think we're missing the b.

rjernst · 2016-05-24T19:23:27Z

I don't see the point, given, as Clint said, that this can be done with a script, and even with an expression script, which will be very fast (I benchmarked this when adding expressions, specifically comparing to field_value_factor, and the perf was identical).

brwe · 2016-05-25T12:17:04Z

I also do not think that we can or should cover too many score combinations with the functions we provide so far. The idea of function score originally was to allow basic functionality out of the box and leave more sophisticated stuff to script_score. I somewhat agree that we are missing the b and that the weights do not make sense when the functions are multiplied in the end. But on the other hand the offset does not make sense either when the functions are summed up...not sure. I am more inclined to work on #17116 and leave the factor function as is.

JnBrymn-EB · 2016-05-25T14:21:28Z

Ok - presuming I'm on Elasticsearch 2.3.x which scripting approach should I use? Groovy? Painless? Lucene Expression Script? My impression is that Groovy is insecure (which actually probably doesn't matter for my case); Painless is not available yet; Lucene Expression Script is marked as undergoing development.

clintongormley · 2016-05-25T14:27:35Z

Expressions are very fast and stable, I'll remove that warning from the docs. I'd definitely use expressions if it does what you need (which it sounds like it will). The only downside is it may not support all the syntax you need, in which case your only option for the moment is Groovy. Painless will be the lang to move to once 5.0 is out.
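For readers landing here later, a rough sketch of what the expression-based workaround might look like, built as a Python dict and serialized to JSON. The field names (title, location, tickets_sold), the origin, and the scale are hypothetical, and the script layout (lang, inline, params) is my understanding of the 2.x request format, so check it against the docs for your version before relying on it.

```python
import json


def popularity_with_offset(field, factor, offset):
    """Build a script_score function computing factor * sqrt(field) + offset."""
    return {
        "script_score": {
            "script": {
                "lang": "expression",
                # Expression scripts read numeric fields via doc['field'].value;
                # factor and offset are passed in as numeric params.
                "inline": "factor * sqrt(doc['%s'].value) + offset" % field,
                "params": {"factor": factor, "offset": offset},
            }
        }
    }


request_body = {
    "query": {
        "function_score": {
            "query": {"match": {"title": "jazz festival"}},
            "functions": [
                # Distance boost from a geo decay function (hypothetical origin and scale).
                {"gauss": {"location": {"origin": {"lat": 36.16, "lon": -86.78}, "scale": "10km"}}},
                # Popularity boost that stays at 1.0 when no tickets have been sold.
                popularity_with_offset("tickets_sold", factor=1.0, offset=1.0),
            ],
            "score_mode": "multiply",  # combine the function values by multiplying
            "boost_mode": "multiply",  # then multiply the result into the query score
        }
    }
}

print(json.dumps(request_body, indent=2))
```

The offset lives in params rather than in the script text, so it can be tuned per request without changing the expression itself.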

clintongormley · 2016-05-25T14:28:56Z

Removed in cf7b13d

JnBrymn-EB · 2016-05-25T22:26:25Z

thanks, All

-John

clintongormley added discuss :Query DSL labels May 24, 2016

clintongormley closed this as completed May 25, 2016

clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Query DSL labels Feb 14, 2018
