Imbalanced learning: mu-parameter not used, leads to unweighted crossentropy-function in "mildly" unbalanced cases #94
Comments
That is definitely an error I made while coding. Would you be able to raise a PR? I'll merge it right away onto the main branch.
Sure, will do. Do you want to keep the current default mu=0.15?
Sorry for the late reply. Got caught up with some other commitments. So, the method is actually from this StackOverflow post. There is no explanation of why 0.15, but there is a Kaggle notebook that shows why 0.14 is an okay default. But this is strictly empirical and should be treated as such.
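For context, here is a minimal sketch of the log-smoothed weighting heuristic from that StackOverflow post, assuming the common formulation `w_c = max(1, log(mu * total / count_c))`; the function name below is illustrative, not the library's:

```python
import math
from collections import Counter

def smooth_class_weights(labels, mu=0.15):
    """Log-smoothed class weights: w_c = max(1, log(mu * total / count_c))."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        cls: max(1.0, math.log(mu * total / count))
        for cls, count in counts.items()
    }

# 1:10 imbalanced binary labels (10 positives, 100 negatives)
labels = [1] * 10 + [0] * 100
print(smooth_class_weights(labels, mu=0.15))  # both classes clip to 1.0
print(smooth_class_weights(labels, mu=2))     # minority class gets ~3.09
```

With this formulation, mu=0.15 yields identical weights of 1.0 for a 1:10 binary split, which is exactly the symptom described in the issue below.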
Thanks for the links.
I agree... I was playing with this a bit yesterday and 0.15 is too small for binary classification... How about we keep
Completely agree, that is the better default for the more common binary case. Will change the PR.
PR done |
merged by @manujosephv |
Hi,
The utils function `get_class_weighted_cross_entropy(y_train, mu=0.15)` does not actually use the mu parameter, but sets it to 0.15 regardless.
See line 29 in `pytorch_tabular/pytorch_tabular/utils.py` (commit 9092543): `weights = _make_smooth_weights_for_balanced_classes(y_train, mu=0.15)`
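For reference, a sketch of the one-line change, based only on the line quoted above (the surrounding function body is not shown here):

```python
# Before (line 29): the caller's mu is silently ignored
weights = _make_smooth_weights_for_balanced_classes(y_train, mu=0.15)

# After: forward the mu argument that get_class_weighted_cross_entropy received
weights = _make_smooth_weights_for_balanced_classes(y_train, mu=mu)
```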
In my binary classification case with a 1:10 imbalance, this leads to weights of 1 and 1 for the two classes...
Also, you might want to think about setting mu higher by default, to get actual weights for non-extreme imbalances like mine.
I am using mu > 1 to actually get different weights, but that does not work due to the bug (I am setting the weights manually for now).
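A rough sketch of such a manual workaround in plain PyTorch (the weight values below are illustrative for a 1:10 binary case, not anything the library computes):

```python
import torch
import torch.nn as nn

# Manual workaround: pass class weights to CrossEntropyLoss directly,
# up-weighting the minority class of a 1:10 binary problem.
class_weights = torch.tensor([1.0, 3.0])  # [majority, minority], illustrative
criterion = nn.CrossEntropyLoss(weight=class_weights)
```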
To Reproduce
Run "get_class_weighted_cross_entropy(y_train, mu=2)" with an 1:10 imbalanced, binary y_train.
Expected behavior
Get different weights for the two classes. Instead, it returns a cross-entropy loss with `weight=[1, 1]`.