Doubt Reason Based on Entropy #11
Comments
I wonder ... what's a reasonable threshold here?
I see two ways to use this:
I've been thinking about your question about the threshold, but I haven't been able to figure out a reasonable threshold value. I've been combing through some literature related to this, but where such a threshold is used, it is often just a hyperparameter that is tuned, without a theoretical argument.
Normalized entropy, as described here seems like a sound idea! Thanks for the mention 👍 I think I'm fine with keeping the threshold as a hyperparameter in this entropy-reason if that prevents adding an assumption to the stack. I think it'd be good to gather feedback anyway.
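To make the idea above concrete, here is a minimal sketch of normalized Shannon entropy: dividing by `log(n_classes)` maps the entropy into [0, 1] regardless of the number of classes, so one threshold can apply across problems. The function name `normalized_entropy` and the epsilon clipping are my own choices, not part of the library.

```python
import numpy as np

def normalized_entropy(probas, eps=1e-12):
    """Shannon entropy of each row of class probabilities,
    divided by log(n_classes) so the result lies in [0, 1]."""
    probas = np.clip(probas, eps, 1.0)  # avoid log(0)
    h = -np.sum(probas * np.log(probas), axis=1)
    return h / np.log(probas.shape[1])

probas = np.array([
    [0.98, 0.01, 0.01],       # confident prediction -> near 0
    [1 / 3, 1 / 3, 1 / 3],    # uniform -> exactly 1
])
print(normalized_entropy(probas))
```

A confident row scores close to 0 and the uniform row scores 1, which is what makes a fixed cut-off like 0.5 at least interpretable.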
I'm wondering ... is this something best addressed via WrongPredictionReason? We may want to add a hyperparameter there for this use-case.
Hi! I created a PR for version 1 of the entropy reason here. I went for a threshold of 0.5, just because it worked well for the iris dataset. 0.2 would have produced way too many non-zeros. Best
Another way to tackle the "wtf should the threshold be" problem: maybe we can specify a quantile instead of an absolute threshold like 0.5. That way we specify some quantile alpha and flag only the share alpha of samples with the highest normalized Shannon entropies.
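The quantile idea could be sketched like this, assuming the normalized entropies have already been computed and that reasons follow the 0/1 predicate-array convention. The helper name `flag_top_quantile` is hypothetical.

```python
import numpy as np

def flag_top_quantile(entropies, alpha=0.1):
    """Flag the alpha share of samples with the highest entropy.
    Returns a 0/1 array (1 = doubtful) instead of comparing
    against a fixed absolute threshold."""
    cutoff = np.quantile(entropies, 1 - alpha)
    return (entropies >= cutoff).astype(float)

entropies = np.arange(10) / 10  # ten samples, entropies 0.0 .. 0.9
print(flag_top_quantile(entropies, alpha=0.2))  # flags the top two
```

The appeal is that alpha directly controls the share of flagged samples, whereas an absolute threshold flags an unpredictable fraction depending on the dataset.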
We also the
Part of me likes the idea. But I'm worried that we may introduce a lot of hyperparameters, and at the moment it's unclear how much more useful doubt based on entropy will be compared to the margin-based reason.
I think it's possible to use the Hoover index instead of entropy: it's easier to compute, it is always in the 0-1 range, and it has a clear explanation (0 - equality/uniformity, 1 - inequality). There is also a bigger problem with this approach in the multiclass setting: assume you have 4 classes. If your probas are 0.25-0.25-0.25-0.25, then the entropy/uniformity measure will correctly find them, but if you have something like 0-0.5-0.5-0, then it will fail, even though this sample could still be mislabeled. This problem becomes even more severe with more classes. A straightforward solution would be to use a one-vs-rest scheme.
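A small sketch of the Hoover index on class probabilities, and of the failure case described above. One caveat: on probability vectors the raw Hoover index maxes out at 1 - 1/K, so the rescaling to [0, 1] below is my own assumption, not something the comment specifies.

```python
import numpy as np

def hoover_index(probas):
    """Hoover index per row of class probabilities:
    0 = uniform, 1 = all mass on one class (after rescaling,
    since the raw index maxes out at 1 - 1/K)."""
    k = probas.shape[1]
    raw = 0.5 * np.abs(probas - 1.0 / k).sum(axis=1)
    return raw / (1.0 - 1.0 / k)

# The failure case from the comment: mass split over two of four classes.
p = np.array([
    [0.25, 0.25, 0.25, 0.25],  # uniform: maximally uncertain
    [0.0, 0.5, 0.5, 0.0],      # still ambiguous between two classes
])
print(hoover_index(p))
```

The uniform row scores 0 (maximal doubt under this measure), but the 0-0.5-0.5-0 row scores about 0.67, i.e. fairly "unequal", even though the model is completely torn between two classes. That is exactly the blind spot the comment points out.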
I'm wondering ... can we come up with a situation where entropy-based doubt can address issues that the other reasons cannot?
Fixed by #24 |
If a machine learning model is very "confident", then the proba scores will have low entropy. The most uncertain outcome is a uniform distribution, which has maximal entropy. Therefore, it could be sensible to add entropy as a reason for doubt.