Motivation of the method #3
Hello, thanks for your interest. We agree that class prototypes and our method are quite similar under clean-data settings. However, we think prototypes are more easily contaminated by label noise, while ours is not. Furthermore, by using the eigenvector, the perturbation is estimated more accurately. Your question looks like a great discussion topic for further research.
Yes, I understand that the first eigenvector might be less contaminated by label noise. However, in a high-noise-rate scenario (e.g., 90% on the CIFAR-10 dataset), the ratio of clean data in each class is quite small. Wouldn't this prevent the eigenvector from representing the real distribution of the class?
That is a good point. In pilot experiments, we found that eigenvectors are more robust than other prototype-based methods, such as class prototypes, anchors generated from a Mahalanobis distribution, or the Minimum Covariance Determinant (MCD) method. In my view, eigendecomposition seems quite robust to noisy representations compared to a prototype, which is based on simple averaging.
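To make the comparison concrete, here is a minimal numpy sketch (not the authors' implementation; the data, dimensions, and noise ratio are all hypothetical) contrasting a class prototype (simple feature mean) with the first eigenvector of the class's gram matrix on features with a few mislabeled samples mixed in:

```python
import numpy as np

rng = np.random.default_rng(0)
# 100 features of dim 8: 90 "clean" samples along one direction, 10 noisy ones.
direction = np.array([1.0] + [0.0] * 7)
clean = direction + 0.05 * rng.standard_normal((90, 8))
noisy = rng.standard_normal((10, 8))  # features of mislabeled samples
feats = np.vstack([clean, noisy])

# Prototype: a simple average over every sample, noisy ones included.
prototype = feats.mean(axis=0)
prototype /= np.linalg.norm(prototype)

# First eigenvector of the gram matrix feats.T @ feats.
_, eigvecs = np.linalg.eigh(feats.T @ feats)
eigvec = eigvecs[:, -1]               # eigh sorts eigenvalues ascending
eigvec *= np.sign(eigvec @ direction)  # fix the sign for comparison

print("prototype alignment  :", abs(prototype @ direction))
print("eigenvector alignment:", abs(eigvec @ direction))
```

Both recover the clean direction here, but the eigenvector is driven by the dominant variance direction rather than by every sample equally, which is the intuition behind its robustness to a noisy minority.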
Thanks, I understand that eigenvectors are more robust than prototype-based methods. However, in my experiments, I find that the FINE sampling method performs much worse than the simplest small-loss methods. Have you met this dilemma in your experiments?
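For reference, the small-loss baseline mentioned above can be sketched as follows (a hypothetical minimal version; the function name and keep ratio are illustrative, not from the FINE codebase):

```python
import numpy as np

def small_loss_select(losses, keep_ratio=0.5):
    """Return indices of the keep_ratio fraction of samples with the smallest loss."""
    losses = np.asarray(losses)
    n_keep = max(1, int(len(losses) * keep_ratio))
    return np.argsort(losses)[:n_keep]

losses = [0.1, 2.3, 0.05, 1.7, 0.2, 3.0]
print(small_loss_select(losses, keep_ratio=0.5))  # indices of the 3 smallest losses
```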
In our experiments, we have not met such situations, but this can depend on the kind of dataset. The benefit of eigenvectors can also be affected by the capacity of the backbone network. So, how about using a warmup stage for the early training with a significant magnitude of weight decay (or other regularization), and then applying the FINE method to the pretrained architecture?
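To illustrate the weight-decay suggestion, here is a minimal numpy sketch of one SGD step with L2 weight decay (hypothetical names and values; in practice one would pass `weight_decay` to the framework's optimizer instead):

```python
import numpy as np

def sgd_weight_decay_step(w, grad, lr=0.1, weight_decay=5e-4):
    """One SGD step with L2 weight decay: w <- w - lr * (grad + wd * w)."""
    return w - lr * (grad + weight_decay * w)

w = np.ones(3)
w = sgd_weight_decay_step(w, grad=np.zeros(3))
print(w)  # weights shrink even when the gradient is zero
```

The decay term keeps the weights small during warmup, which tends to limit how much the network can memorize noisy labels before FINE is applied.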
Hello,
I have read your paper and find it very interesting. However, I have some confusion about your method. If I understood correctly, the first eigenvector represents the latent distribution of a class, which is similar to the function of a prototype. I have also seen methods that utilize the similarity between a sample and the class prototype to select clean samples. I would like to know what the advantage of using the eigenvector over prototypes is.
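For context, the prototype-similarity selection referred to above can be sketched like this (a hypothetical minimal version, not taken from any of the cited methods): rank each sample by cosine similarity to its class prototype and treat the least similar ones as likely noisy.

```python
import numpy as np

def cosine_to_prototype(feats):
    """Cosine similarity of each feature vector to the class mean (prototype)."""
    proto = feats.mean(axis=0)
    proto /= np.linalg.norm(proto)
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return normed @ proto

feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
sims = cosine_to_prototype(feats)
print(sims)  # the outlier [0, 1] gets the lowest similarity
```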
Thanks.