Scorer should sum up scores into a double #12682
Thanks for looking into it! I suggested reverting 2 of your changes, but the other 2 look good to me.
@@ -266,7 +265,7 @@ public float score() throws IOException {
       score += optScorer.score();
     }

-    return score;
+    return (float) score;
Actually your change doesn't help here since this sums up two floats at most and summing up two floats is already guaranteed to be as accurate as a float can be. Let's revert changes on this file?
Makes sense to me! I think in that case we should remove the TODO as well?
I think that the TODO still makes sense, it refers to BS1 being able to handle a mix of MUST and SHOULD clauses. If it happened, then it could have more than 2 clauses so casting into a double would make sense.
Sure, I'll keep it then
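To illustrate the point above about summing at most two floats, here is a minimal sketch. The class and method names are hypothetical and not part of Lucene. It compares adding two floats directly with widening both to double, adding, and narrowing back. For a single addition the two approaches should produce the same float, which is why the cast does not help in this scorer.

```java
final class TwoFloatSumDemo {
  // Add two floats directly in float arithmetic (one rounding step).
  static float sumInFloat(float a, float b) {
    return a + b;
  }

  // Widen both operands to double, add, then narrow the result back to float.
  static float sumViaDouble(float a, float b) {
    return (float) ((double) a + (double) b);
  }

  public static void main(String[] args) {
    // Illustrative pairs only; these are not real Lucene scores.
    float[][] pairs = {
      {0.1f, 0.2f},
      {16777216f, 1f}, // 2^24 + 1 is not representable as a float
      {1e30f, 1e-30f},
      {3.4f, 1.7e-5f}
    };
    for (float[] p : pairs) {
      System.out.println(
          p[0] + " + " + p[1]
              + ": float=" + sumInFloat(p[0], p[1])
              + " viaDouble=" + sumViaDouble(p[0], p[1]));
    }
  }
}
```

With more than two clauses, as the TODO anticipates, intermediate float roundings can accumulate, so the comparison no longer holds in general.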
   float normValue = normTable[(int) (norm & 0xFF)];
-  return raw * normValue; // normalize for field
+  return (float) (raw * normValue); // normalize for field
Likewise here, float multiplication is already guaranteed to give a result that is as accurate as a float can give.
One could argue that we could get more accuracy by casting into a double before multiplying in the first multiplication, i.e. `final double raw = (double) tf(freq) * queryWeight;`. But I don't think we should do it, as similarity scores are a bit fuzzy by nature and this would very unlikely improve ranking effectiveness. The main reason why we cast into doubles when summing up scores is not really to get better accuracy, but more so that the order in which clauses are evaluated doesn't have an impact on the final score.
Let's revert changes on this file as well?
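The order-dependence argument above can be sketched with a small example. The class name, method names and score values are hypothetical, chosen only to make the effect visible, and are not taken from Lucene. Accumulating in a float gives different totals depending on the order of the clauses, while accumulating in a double and casting once at the end gives the same float for both orders with these values.

```java
final class ScoreSumOrderDemo {
  // Accumulate in a float: every addition rounds to float precision.
  static float sumInFloat(float[] scores) {
    float sum = 0f;
    for (float s : scores) {
      sum += s;
    }
    return sum;
  }

  // Accumulate in a double and narrow to float once at the end.
  static float sumInDouble(float[] scores) {
    double sum = 0d;
    for (float s : scores) {
      sum += s;
    }
    return (float) sum;
  }

  public static void main(String[] args) {
    // 16777216f is 2^24; floats just above it are spaced 2 apart.
    float[] largeFirst = {16777216f, 1f, 1f};
    float[] largeLast = {1f, 1f, 16777216f};

    // Float accumulation: the two orders disagree.
    System.out.println("float, large first:  " + sumInFloat(largeFirst));
    System.out.println("float, large last:   " + sumInFloat(largeLast));

    // Double accumulation: both orders agree after the final cast.
    System.out.println("double, large first: " + sumInDouble(largeFirst));
    System.out.println("double, large last:  " + sumInDouble(largeLast));
  }
}
```

This only shows the effect for one hand-picked input. Double accumulation is not order-independent for arbitrary inputs, but the differences are usually too small to survive the final cast to float.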
I see. Thanks for clarifying! I'll revert changes to this file too.
Force-pushed from e471943 to c2f090f.
Thanks @jpountz for the review! I have addressed the comments in the new revision.
Force-pushed from c2f090f to 36f446a.
LGTM
Thanks for the approval @jpountz!
### Description
Addresses #12675. Along with `MultiSimilarity.MultiSimScorer`, found some other candidate scorer implementations for this fix.