Implement CalibrationAnalysis #417
Comments
@neubig, this is also a feature I have been hoping for. The only complicated thing is that we need …
At the moment, if an analysis is not applicable it returns …
Yes, I noticed that. But this will also lead to potential bugs when deploying the web platform, ones I have spent quite a long time debugging.
I don't think returning …
I just created #418 around the topic of … Basically I think …
I think it is better that …
Here are some draft ideas for implementing calibration analysis:
…
One comment: we may want to view a …
@neubig Subclassing and composition should be used when there is a semantically meaningful relationship between the two classes. If we only need to share code, implementing them separately keeps the whole thing better organized.
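To illustrate the design point (all names here are hypothetical, not from the ExplainaBoard codebase): when two analyses merely share some code, a plain helper function keeps them independent, whereas a common base class would imply an "is-a" relationship that may not exist.

```python
def mean(values):
    """Shared code factored into a helper instead of a common base class."""
    return sum(values) / len(values) if values else 0.0

class AccuracyAnalysis:
    # Stands alone; reuses `mean` via composition, not inheritance.
    def perform(self, correct):
        return mean([1.0 if c else 0.0 for c in correct])

class ConfidenceAnalysis:
    # Also stands alone; no semantic relationship to AccuracyAnalysis
    # is implied just because both average a list of numbers.
    def perform(self, confidences):
        return mean(confidences)
```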
This issue was resolved. |
Calibration is whether a system's confidence is well-correlated with whether the system got the answer right or not. It would be nice if we could do analyses related to calibration, such as calculating expected calibration error: https://arxiv.org/abs/1706.04599
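For concreteness, here is one way the binned ECE estimator from the linked paper (Guo et al., 2017) could be computed; the function name and the equal-width binning choice are illustrative assumptions, not part of ExplainaBoard:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE sketch: sum over bins of (bin size / N) * |accuracy - confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # right-closed bins (lo, hi]; fold confidence 0.0 into the first bin
        in_bin = (confidences > lo) & (confidences <= hi)
        if lo == 0.0:
            in_bin |= confidences == 0.0
        if in_bin.any():
            acc = correct[in_bin].mean()
            conf = confidences[in_bin].mean()
            ece += in_bin.mean() * abs(acc - conf)
    return ece
```

A bin whose average confidence matches its accuracy contributes zero, so a perfectly calibrated system scores 0.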
I think this should probably be implemented as an additional variety of analysis, which would be simple and self-contained: https://github.com/neulab/ExplainaBoard/blob/main/explainaboard/analysis/analyses.py#L45
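A rough sketch of what such a self-contained analysis variety might look like; the class name, the `perform` method, and returning `None` when no confidence scores are available are all assumptions for illustration and do not reflect the actual interface in `analyses.py`:

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class CalibrationAnalysis:
    """Hypothetical self-contained calibration analysis computing binned ECE."""
    n_bins: int = 10

    def perform(self, confidences: Sequence[float],
                correct: Sequence[bool]) -> Optional[float]:
        # Not applicable when the system reports no confidence scores.
        if not confidences:
            return None
        n = len(confidences)
        bin_totals = [0] * self.n_bins
        bin_correct = [0.0] * self.n_bins
        bin_conf = [0.0] * self.n_bins
        for c, ok in zip(confidences, correct):
            # equal-width bins; clamp confidence 1.0 into the last bin
            b = min(int(c * self.n_bins), self.n_bins - 1)
            bin_totals[b] += 1
            bin_correct[b] += 1.0 if ok else 0.0
            bin_conf[b] += c
        ece = 0.0
        for total, corr, conf in zip(bin_totals, bin_correct, bin_conf):
            if total:
                ece += (total / n) * abs(corr / total - conf / total)
        return ece
```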