Removing probability field from answers in favor of score field #1340
Looks all good to me. Two questions:
- I assume Summarizer, Re-Ranker and classifier did not use any probabilities before?
- Should we always scale `score` between 0 and 1 now? I think this would simplify communication to users a lot. I assume this scaling is currently not the case for Elasticsearch (see comment).
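To make the scaling question concrete, here is a minimal sketch of one way an unbounded relevance score could be squashed into the range 0 to 1 with a logistic function. The function name `scale_to_unit_interval` and the `scaling_factor` default are hypothetical and chosen only for illustration; they are not taken from this PR or from the Elasticsearch document store.

```python
import math


def scale_to_unit_interval(raw_score: float, scaling_factor: float = 8.0) -> float:
    """Map an unbounded score into the range 0 to 1 with a logistic function.

    `scaling_factor` is an illustrative assumption: larger values flatten the
    curve, smaller values push scores towards 0 or 1 more quickly.
    """
    x = raw_score / scaling_factor
    # Use the numerically stable form so large negative inputs do not overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


if __name__ == "__main__":
    for raw in (-20.0, 0.0, 5.0, 20.0):
        print(raw, scale_to_unit_interval(raw))
```

With this particular mapping, non-negative raw scores always land at 0.5 or above, so a retriever that never emits negative scores would only use the upper half of the range. A different normalization may suit that case better.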
Thanks for the feedback! Summarizer, Re-Ranker and Classifier don't have a … The requested change in the Elasticsearch document store is included now, and I reverted the manual changes to the …
Proposed changes:
This PR drops the field `probability` from answers and documents in favor of the field `score`. The default behavior is now that the field `score` contains the value that was previously stored in `probability`. However, the FARMReader also allows switching back to the old scores by setting `use_confidence_scores` to False (default is True).

closes #1220
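The behavior described above can be sketched roughly as follows. This is not the FARMReader code: `resolve_score` and its arguments are hypothetical names, and the sketch only assumes that a reader has both a raw score and a confidence value available for each answer.

```python
def resolve_score(
    raw_score: float,
    confidence: float,
    use_confidence_scores: bool = True,
) -> float:
    """Pick the value that ends up in the single `score` field.

    Hypothetical helper: with the default (True) the value formerly exposed
    as `probability` is returned; with False the old, unbounded raw score
    is returned instead.
    """
    return confidence if use_confidence_scores else raw_score


if __name__ == "__main__":
    # Default: `score` carries the former probability value.
    print(resolve_score(raw_score=12.3, confidence=0.87))
    # Opt out: `score` carries the old raw score, which can exceed 1.0.
    print(resolve_score(raw_score=12.3, confidence=0.87, use_confidence_scores=False))
```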
Main changes:
- haystack/haystack/reader/base.py, line 45 in dda0e85
- haystack/haystack/reader/farm.py, line 55 in dda0e85
- haystack/haystack/reader/farm.py, line 577 in dda0e85
@tholor I came across some points in the code where a plain dictionary could be replaced with an Answer object:
- haystack/haystack/reader/transformers.py, line 134 in dda0e85
- haystack/haystack/generator/transformers.py, line 278 in dda0e85
- haystack/haystack/reader/base.py, line 44 in dda0e85
- haystack/haystack/reader/farm.py, line 576 in dda0e85
- haystack/haystack/pipeline.py, line 602 in dda0e85
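As a rough illustration of what replacing a plain dictionary with an answer object could look like, the sketch below uses a small dataclass. `AnswerSketch` and its fields are hypothetical stand-ins and do not reproduce Haystack's actual `Answer` class. The only point is that a typed object makes the single `score` field explicit and rejects a leftover `probability` key.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class AnswerSketch:
    """Hypothetical stand-in for an answer object with a single `score` field."""

    answer: Optional[str]
    score: Optional[float] = None
    context: Optional[str] = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "AnswerSketch":
        # Fail loudly if a caller still passes the removed `probability` key.
        if "probability" in data:
            raise ValueError("`probability` was removed; use `score` instead")
        return cls(
            answer=data.get("answer"),
            score=data.get("score"),
            context=data.get("context"),
        )

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)


if __name__ == "__main__":
    plain = {"answer": "Berlin", "score": 0.87, "context": "Berlin is the capital."}
    typed = AnswerSketch.from_dict(plain)
    print(typed.score, typed.to_dict())
```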
Status (please check what you already did):
`use_confidence_scores` in FARMReader. If `use_confidence_scores` is set to False, the old scores are used, which can be larger than 1.0.