[RuleMetrics] Overall precision should be calculated from overall correct/incorrect #1045

For example, in the screenshot below, the overall precision should be 170 / (170 + 45) = 0.79.

I think this is more meaningful than the average of the per-rule precisions, since it weights each rule's precision by its annotated coverage. For example, if I have one 100% precise rule that covers almost the whole dataset and one 0% precise rule that covers only a single record, the current metric would show a 50% precision, even though the weak labels from a simple majority voter would almost always be correct.

Right now, I think we compute the average of the per-rule precisions, counting the NaNs as 0, which seems like a bug.

@frascuchon @leiyre Not sure who to assign this one to?
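For illustration, here is a minimal sketch of the difference between the two computations. The rule names and per-rule counts are made up (chosen so the totals match the 170/45 example above); this is not Rubrix's actual code:

```python
# Minimal sketch of the proposed fix (illustrative numbers, not Rubrix's code).
# Per-rule (correct, incorrect) counts on the annotated records.
rules = {
    "rule_a": (150, 20),  # high-coverage, high-precision rule
    "rule_b": (20, 25),   # low-coverage, noisy rule
    "rule_c": (0, 0),     # rule with no annotated coverage -> precision is NaN
}

# Current (buggy) behavior: average the per-rule precisions, treating NaN as 0.
per_rule = [c / (c + i) if (c + i) > 0 else 0.0 for c, i in rules.values()]
buggy_avg = sum(per_rule) / len(per_rule)

# Proposed behavior: sum correct/incorrect across rules first, then divide once.
total_correct = sum(c for c, _ in rules.values())
total_incorrect = sum(i for _, i in rules.values())
overall = total_correct / (total_correct + total_incorrect)

print(f"average of precisions (NaN as 0): {buggy_avg:.2f}")  # 0.44
print(f"overall precision:                {overall:.2f}")    # 0.79
```

Note that the summed version is algebraically a coverage-weighted average of the per-rule precisions (weighted by correct + incorrect), so zero-coverage rules simply drop out instead of dragging the average down.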
Comments
dcfidalgo added the type: bug label (Indicates an unexpected problem or unintended behavior) on Jan 27, 2022
leiyre added a commit that referenced this issue on Feb 2, 2022
frascuchon pushed a commit that referenced this issue on Feb 2, 2022
frascuchon pushed a commit that referenced this issue on Feb 2, 2022
frascuchon pushed a commit that referenced this issue on Feb 2, 2022
leiyre added a commit that referenced this issue on Feb 2, 2022
frascuchon pushed a commit that referenced this issue on Feb 2, 2022
frascuchon pushed a commit that referenced this issue on Feb 2, 2022
dvsrepo added a commit that referenced this issue on Feb 10, 2022
* 'master' of https://github.com/recognai/rubrix: (33 commits)
  fix(#1045): fix overall precision (#1087)
  fix(#1081): prevent add records of different task (#1085)
  fix(#1045): calculate overall precision from overall correct/incorrect in rules (#1086)
  fix(#924): parse new error format in UI (#1082)
  fix(#1054): Optimize Long records (#1080)
  docs(#949): change note to admonition (#1071)
  fix(#1053): metadata modal position (#1068)
  fix(#1067): fix rule definition link when no labels are defined (#1069)
  fix(#1065): 'B' tag for beginning tokens (#1066)
  feat(#1054): optimize long records view (#1064)
  feat(#924): parse validation error, including submitted information (#1056)
  fix(#1058): sort by % data in rules list (#1062)
  fix(#1050): generalizes entity span validation (#1055)
  fix: missing Optional import
  fix(cleanlab): set cleanlab n_jobs=1 as default (#1059)
  feat(#982): Show filters in labelling rules view (#1038)
  feat(#932): label models now modify the prediction_agent when calling LabelModel.predict (#1049)
  fix(#821): Token classifier QA 2 (#1057)
  ci: fix path filter condition
  refactor(#924): normalize API error responses (#1031)
  ...