Replies: 1 comment
Our recommendation is to just try to produce the most accurate model you can (measured against your noisy given labels). Our research papers (unsurprisingly) show that cleanlab is generally able to better detect label errors with a more accurate model, even when accuracy is measured with respect to the given noisy labels:

- *Model-Agnostic Label Quality Scoring to Detect Real-World Label Errors*
- *Confident Learning: Estimating Uncertainty in Dataset Labels*

There is no hard threshold for what is "good enough", but your model should certainly beat a dummy predictor that either emits uniform-random predictions or always predicts the class that is most common overall in the dataset. With a better model, you will get better label error detection with cleanlab. You can then address these label errors and retrain the same model to get an even better version of it!
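A minimal sketch of this sanity check using scikit-learn (the dataset, model, and cross-validation settings here are illustrative choices, not a prescription): compare your model's cross-validated accuracy on the given labels against a most-frequent-class dummy baseline before handing predicted probabilities to cleanlab.

```python
from sklearn.datasets import load_digits
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)  # y plays the role of your (possibly noisy) given labels

# Baseline: always predict the overall most common class
dummy_acc = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5).mean()

# Candidate model, scored against the same given labels
model_acc = cross_val_score(LogisticRegression(max_iter=2000), X, y, cv=5).mean()

print(f"dummy accuracy: {dummy_acc:.3f}, model accuracy: {model_acc:.3f}")

# Once the model clearly beats the dummy baseline, obtain out-of-sample
# predicted probabilities and pass them to cleanlab, e.g.:
#   pred_probs = cross_val_predict(model, X, y, cv=5, method="predict_proba")
#   from cleanlab.filter import find_label_issues
#   issues = find_label_issues(labels=y, pred_probs=pred_probs)
```

Using out-of-sample probabilities (via cross-validation) rather than in-sample predictions matters here, since a model scored on data it was trained on will look artificially confident about its own label errors.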
Since we need to train a model before applying cleanlab to find label errors, what is the benchmark for saying that my model is good enough to apply cleanlab?