Query DSL: custom_filters_score allow score_mode for "min" and "multiply" #1560

Closed
apatrida opened this Issue Dec 22, 2011 · 5 comments


@apatrida

Add min and multiply as score modes for custom_filters_score.

It would be nice to use custom_filters_score for a negative boost as well, so a "min" score mode that picks the worst boost is sometimes desirable. You can fake it with "first" by ordering the filters from lowest to highest boost, but "min" would be easier for people to understand (you can also use "first" to get the same effect as "max", but we don't force people to do that either).

This assumes it works with fractional boosts, which I believe it does since the boost is multiplicative.
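To illustrate the "first" workaround described above, here is a minimal Python sketch. It is not the Elasticsearch implementation: the helper names `first_mode` and `min_mode` are hypothetical, and returning 1.0 when no filter matches is an assumption made for the example.

```python
from itertools import product

def first_mode(boosts_in_order, matches):
    """Return the boost of the first matching filter (assumed 1.0 if none match)."""
    for boost, matched in zip(boosts_in_order, matches):
        if matched:
            return boost
    return 1.0

def min_mode(boosts, matches):
    """Return the smallest boost among the matching filters (assumed 1.0 if none match)."""
    matched = [b for b, m in zip(boosts, matches) if m]
    return min(matched) if matched else 1.0

# Filters listed from lowest to highest boost, as the workaround requires.
boosts = [0.5, 0.8, 2.0]

# For every combination of matching filters, "first" and "min" agree.
for pattern in product([False, True], repeat=len(boosts)):
    assert first_mode(boosts, pattern) == min_mode(boosts, pattern)
```

The equivalence only holds while the filters stay sorted by boost, which is the bookkeeping a real "min" mode would remove.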

@apatrida

Also "multiply" as a score_mode to multiply the values against each other.
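For reference, a request using the requested mode might look roughly like the sketch below, written as a Python dict and serialized to JSON. The layout follows the custom_filters_score syntax as I understand it (a `query`, a list of `filters` each with a `boost`, and a `score_mode`); the field names `discontinued` and `backordered` are hypothetical.

```python
import json

# Hypothetical request body: two fractional boosts combined with "multiply".
body = {
    "query": {
        "custom_filters_score": {
            "query": {"match_all": {}},
            "filters": [
                {"filter": {"term": {"discontinued": True}}, "boost": 0.9},
                {"filter": {"term": {"backordered": True}}, "boost": 0.8},
            ],
            "score_mode": "multiply",
        }
    }
}

request_json = json.dumps(body, indent=2)
print(request_json)
```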

@kimchy kimchy closed this in 61b2562 Dec 22, 2011
@apatrida

Working on another issue, but I need clarification first. What is the intended result when you have a custom_filters_score and you total or average the items? Does it do this:

  • for all matching filters, and on each doc, calculate the aggregate boost
  • then multiply the document score by the resulting aggregate boost

Or does it do this:

  • for all matching filters, and on each doc, calculate the boost * the document score
  • then aggregate those calculations.

For first, min, max, total and average it doesn't matter: the math works out to the same net result either way. For example, these don't care:

FIRST: B1 * score
MIN: B2 * score
MAX: B3 * score
AVG: (B1 + B2 + B3)/3 * score == (B1*score + B2*score + B3*score)/3
TOTAL: (B1 + B2 + B3) * score == B1*score + B2*score + B3*score

but, MULTIPLY would care. Is it:

B1 * B2 * B3 * score

or is it:

B1*score * B2*score * B3*score

The first answer seems right, so that you can do things like discount 10% for something using a boost of 0.9 and discount a further 20% using another boost of 0.8.
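A quick worked example of the two readings, as a Python sketch with made-up numbers (a document score of 2.0 and the 0.9 and 0.8 boosts mentioned above):

```python
import math

score = 2.0          # made-up subquery score for one document
boosts = [0.9, 0.8]  # 10% discount, then a further 20% discount

# Option 1: multiply the boosts together, then apply the result to the score once.
option_one = math.prod(boosts) * score                 # 0.72 * 2.0, about 1.44

# Option 2: multiply each boost by the score, then multiply those products.
option_two = math.prod(b * score for b in boosts)      # 1.8 * 1.6, about 2.88

print(option_one, option_two)
```

With n matching filters, option 2 ends up proportional to score**n rather than score, so the "discount" reading only survives under option 1.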

I think I need to patch my previous #1560 patch to change how multiply works. I think I have a new patch now for #1561 and could include it there, but in case that more complicated patch doesn't happen soon I should do this one on its own. Advice?

@apatrida

So multiply doesn't make sense in this context without a change to take the subquery score out of the picture until the end. It might be sane to do that for all of the score modes, then multiply the score in afterwards. That would also clean up the explain plan a lot. I'm leaving multiply broken while I finish the patch for #1561, then I'll come back and fix this so that I'm not on two conflicting versions of the same code.
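The idea of keeping the subquery score out until the end can be sketched like this. It is an illustration in Python, not the actual ScoreFunction code; `combine_boosts` and `final_score` are hypothetical names, and a factor of 1.0 when nothing matches is an assumption.

```python
import math

def combine_boosts(boosts, mode):
    """Aggregate the boosts of the matching filters into a single factor."""
    if not boosts:
        return 1.0  # assumption: no matching filter leaves the score unchanged
    if mode == "first":
        return boosts[0]
    if mode == "min":
        return min(boosts)
    if mode == "max":
        return max(boosts)
    if mode == "total":
        return sum(boosts)
    if mode == "avg":
        return sum(boosts) / len(boosts)
    if mode == "multiply":
        return math.prod(boosts)
    raise ValueError(f"unknown score_mode: {mode}")

def final_score(subquery_score, boosts, mode):
    """Apply the aggregated factor to the subquery score exactly once."""
    return combine_boosts(boosts, mode) * subquery_score
```

Because the subquery score is applied in one place, every mode, including multiply, reduces to a single "factor * score" step.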

@kimchy
Member
kimchy commented Dec 26, 2011

I agree that the first option feels better, but it's trickier to implement because of how ScoreFunction works. If you have a nice solution for it, then open a pull request only for it. #1561 is a different case and it's good to separate the two.

@apatrida

Doing #1561 first; I'm 95% done there and want to get it out of the way. It'll need some feedback, as it is tricky as well and is causing more work and more code. I don't know how things fit into your world yet (I am about to be trained by your feedback)... It covers issues with query parsing and builders not present in the codebase so far (groups that aren't real queries or filters themselves). So I'm leaving this one for a bit and will come back to it.

@alambert alambert added a commit to spindlelabs/elasticsearch that referenced this issue Jan 5, 2012
@apatrida apatrida Query DSL: custom_filters_score allow score_mode for "min" and "multiply", closes #1560

(cherry picked from commit 61b2562)
afb5214