errors found by cleanlab are mostly correct actually. #25
Comments
Hi @NLPpupil. Can you please share (1) examples of your psx and the matching s, (2) how you computed psx, and (3) a minimum working example of your code?
@cgnorthcutt Thank you. Rather than bother you with the examples and code, I will double-check on my side first.
Hi @cgnorthcutt, could you please explain in detail how to use cleanlab.models.fasttext.py to find label errors? I have a training file in fasttext format and I want to find the label errors in it. Thank you very much.
Hi @NLPpupil. Create an instance of the `FastTextClassifier` object in `cleanlab.models.fasttext`.
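For reference, a minimal sketch of that workflow, assuming the cleanlab 1.x API. The `train_data_fn` constructor argument and the convention of passing row indices as `X` are my reading of `cleanlab/models/fasttext.py`; verify them against the module's docstrings before relying on them:

```python
# Hedged sketch: find label errors in a fasttext-format training file.
import numpy as np
from cleanlab.models.fasttext import FastTextClassifier
from cleanlab.latent_estimation import estimate_cv_predicted_probabilities
from cleanlab.pruning import get_noise_indices

TRAIN_FILE = 'train.txt'  # lines look like: "__label__sports some text ..."

# Recover the (noisy) integer labels s from the fasttext-format file.
with open(TRAIN_FILE) as f:
    label_names = [line.split(maxsplit=1)[0].replace('__label__', '') for line in f]
classes = sorted(set(label_names))
s = np.array([classes.index(name) for name in label_names])

# Assumption: the constructor takes the training file path as `train_data_fn`.
clf = FastTextClassifier(train_data_fn=TRAIN_FILE)

# Out-of-sample predicted probabilities via cross-validation.
# Assumption: X is passed as row indices into the training file.
psx = estimate_cv_predicted_probabilities(np.arange(len(s)), s, clf=clf)

# Indices of likely label errors, most severe first.
ordered_label_errors = get_noise_indices(
    s=s, psx=psx, sorted_index_method='normalized_margin',
)
```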
I tried, but the trained model is just like a normal model trained with the fasttext command line. Below is my code:
Please provide the full error stack. Also …
I figured out the reason. The reason the "errors found by cleanlab are mostly correct" is that my data is almost clean! If I randomly replace 10% of the labels with incorrect ones and check the output of `ordered_label_errors = get_noise_indices()`, I find that 97% of the top 100 instances are real noise, while only 9% of the last 100 instances are. Thank you for your excellent work.
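For readers who want to reproduce this sanity check, here is a self-contained sketch on synthetic data; `make_classification` and `LogisticRegression` stand in for the real text dataset and classifier, and the 10%/100-instance numbers mirror the experiment described above:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from cleanlab.pruning import get_noise_indices

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=2000, n_classes=3,
                           n_informative=5, random_state=0)

# Flip 10% of the labels at random to simulate label noise.
flip_idx = rng.choice(len(y), size=len(y) // 10, replace=False)
s = y.copy()
for i in flip_idx:
    s[i] = rng.choice([c for c in range(3) if c != y[i]])

# Out-of-sample predicted probabilities, trained on the *noisy* labels.
psx = cross_val_predict(LogisticRegression(max_iter=1000), X, s,
                        cv=5, method='predict_proba')

ordered = get_noise_indices(s=s, psx=psx,
                            sorted_index_method='normalized_margin')

# How many of the top/bottom ranked "errors" were actually flipped?
flipped = set(flip_idx)
print('precision in top 100: ', np.mean([i in flipped for i in ordered[:100]]))
print('precision in last 100:', np.mean([i in flipped for i in ordered[-100:]]))
```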
I used the method from the tutorial:

```python
ordered_label_errors = get_noise_indices(
    s=numpy_array_of_noisy_labels,
    psx=numpy_array_of_predicted_probabilities,
    sorted_index_method='normalized_margin',  # Orders label errors
)
```

but the outputs that are supposed to be label errors are actually correct. What steps can I take to figure out why?
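One thing worth checking, echoed by the maintainer's question (2) above: `psx` must be out-of-sample predicted probabilities (e.g. from cross-validation), not probabilities from a model fit on the same data it predicts. A minimal sketch with scikit-learn, where `clf`, `X`, and `s` are placeholders for your own classifier and data:

```python
from sklearn.model_selection import cross_val_predict

# Each row of psx comes from a fold whose model never saw that example
# during training, which is what get_noise_indices expects.
psx = cross_val_predict(clf, X, s, cv=5, method='predict_proba')
```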