
Loss function used to train baseline classifier g #5

Closed
hssandriss opened this issue Feb 11, 2021 · 1 comment

Comments

@hssandriss

hssandriss commented Feb 11, 2021

Hello,

Great work, guys!

I have a small question regarding the baseline classifier g.

Which loss are you using to train it? Is it the vanilla CE loss? When I analyzed per-class accuracy on the imbalanced training data using the baseline classifier g with the weights you provided, I already found good accuracy values: [99.8200, 99.8332, 99.2205, 97.8644, 98.6047, 97.1576, 97.8448, 99.2806, 97.5904, 96.0000].
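For reference, this is roughly how I computed the per-class accuracies above; a minimal sketch assuming a standard PyTorch setup (the names g, loader, and num_classes are mine, not from your repo):

```python
# Minimal sketch of per-class training accuracy for the baseline classifier g.
# Assumes a standard PyTorch DataLoader over the (imbalanced) training set;
# all names here are illustrative, not taken from the M2m repository.
import torch

@torch.no_grad()
def per_class_accuracy(g, loader, num_classes, device="cuda"):
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    g.eval()
    for x, y in loader:
        preds = g(x.to(device)).argmax(dim=1).cpu()
        for c in range(num_classes):
            mask = (y == c)
            total[c] += mask.sum()
            correct[c] += (preds[mask] == c).sum()
    return (100.0 * correct / total.clamp(min=1)).tolist()
```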

Does it make sense to train it with a more sophisticated loss, such as weighted CE, to help the model pick up the features of the minority classes better? On my dataset, the baseline classifier trained with CE performs quite badly, with accuracy near 5% on the minority samples of the training set. In other words, how well should the baseline classifier perform on the minority samples of the training data?

It would be great if you could share the training procedure for reproducing the baseline classifier g (schedule, sampling method, loss, etc.).

I look forward to hearing from you.

Many many thanks
Best,
Hssan

@bbuing9
Collaborator

bbuing9 commented Jul 7, 2021

Hi, thanks for your interest in our work!

As noted in the paper, we used a vanilla classifier as g, i.e., one trained with the vanilla CE loss without any imbalance mitigation. The training script is already given in our "run.sh" file.

In our experiments, more sophisticated losses do not improve the performance of our algorithm, M2m, because capturing the features of the majority classes is more important for the baseline classifier g, as one can see in Table 3. There, we verify that the diversity of the seed samples (from the majority classes) directly affects the improvement from M2m.

However, if vanilla CE cannot fit your training dataset with high accuracy, then I recommend 1) increasing the capacity of the network or 2) trying a more sophisticated loss. Although the latter was not effective in our case, it may be different in yours.
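For option 2), a class-weighted CE is usually a one-line change in a PyTorch training script. Just as a rough sketch (this is not the configuration in our run.sh, and the class counts below are only illustrative):

```python
# Sketch of a class-weighted CE loss; inverse-frequency weights are one common choice.
# `class_counts` (samples per class in the training set) is assumed / illustrative.
import torch
import torch.nn as nn

class_counts = torch.tensor([5000., 2997., 1796., 1077., 645., 387., 232., 139., 83., 50.])
weights = class_counts.sum() / (len(class_counts) * class_counts)  # inverse-frequency weights
criterion = nn.CrossEntropyLoss(weight=weights)  # move `weights` to the same device as the logits

# In the training loop, replace the vanilla CE call with:
# loss = criterion(g(inputs), targets)
```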

Best,
Jaehyung.

bbuing9 closed this as completed on Jul 7, 2021