Which loss function works in multi-label classification task? #10371
Comments
The standard way to train a multi-label classifier is with sigmoid + binary_crossentropy.
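For reference, a minimal sketch of that setup (the layer sizes, input dimension, and 30-label output are placeholders, not taken from this thread):

import tensorflow as tf
from tensorflow import keras

num_labels = 30  # placeholder: size of the label set

model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(300,)),  # placeholder sizes
    keras.layers.Dense(num_labels, activation='sigmoid'),  # one independent probability per label
])

# binary_crossentropy treats each output as an independent Bernoulli variable
model.compile(optimizer='adam', loss='binary_crossentropy')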
@ismaeIfm As far as I understand, subset accuracy needs explicit {0, 1} classes, but your model outputs probabilities. How did you choose the threshold to binarize the labels? Have you tried using LRAP to evaluate your model?
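To illustrate the threshold question, a sketch using scikit-learn (the 0.5 cutoff and the toy arrays are arbitrary placeholders); accuracy_score on binary indicator matrices computes exact-match (subset) accuracy, while LRAP ranks the raw scores and needs no threshold:

import numpy as np
from sklearn.metrics import accuracy_score, label_ranking_average_precision_score

y_true = np.array([[1, 0, 1], [0, 1, 0]])               # binary indicator matrix
y_prob = np.array([[0.8, 0.2, 0.6], [0.3, 0.7, 0.4]])   # model probabilities

y_pred = (y_prob >= 0.5).astype(int)  # placeholder threshold
print(accuracy_score(y_true, y_pred))                         # subset accuracy
print(label_ranking_average_precision_score(y_true, y_prob))  # LRAP, threshold-free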
For multi-label classification, you can try tanh + hinge with {-1, 1} values in the labels, e.g. (1, -1, -1, 1).
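A minimal sketch of that suggestion, assuming labels encoded as {-1, 1} (the architecture and four-label output are placeholders); Keras's built-in hinge loss computes mean(max(1 - y_true * y_pred, 0)), so it expects exactly this encoding:

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(300,)),  # placeholder sizes
    keras.layers.Dense(4, activation='tanh'),  # outputs in (-1, 1), one per label
])

model.compile(optimizer='adam', loss='hinge')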
@ismaeIfm
@daniel410
I found an implementation of multi-label focal loss here: https://github.com/Umi-you/FocalLoss EDIT: Seems like his implementation doesn't work.
The multi-label focal loss equation doesn't seem to work. |
@dberma15 Does "focal loss doesn't work" mean it doesn't converge, or that there is an implementation error? I feel it is the latter, because of two major issues: it shouldn't use NumPy, and the implementation of the cross-entropy loss is flawed.
import tensorflow as tf

K = tf.keras.backend

class FocalLoss(object):
    def __init__(self, gamma=2, alpha=0.25):
        self._gamma = gamma
        self._alpha = alpha

    def compute_loss(self, y_true, y_pred):
        # Per-element binary cross-entropy on probabilities
        cross_entropy_loss = K.binary_crossentropy(y_true, y_pred, from_logits=False)
        # p_t: the predicted probability assigned to the true class of each label
        p_t = ((y_true * y_pred) +
               ((1 - y_true) * (1 - y_pred)))
        # (1 - p_t)^gamma down-weights well-classified labels
        modulating_factor = 1.0
        if self._gamma:
            modulating_factor = tf.pow(1.0 - p_t, self._gamma)
        # alpha balances positive vs. negative labels
        alpha_weight_factor = 1.0
        if self._alpha is not None:
            alpha_weight_factor = (y_true * self._alpha +
                                   (1 - y_true) * (1 - self._alpha))
        focal_cross_entropy_loss = (modulating_factor * alpha_weight_factor *
                                    cross_entropy_loss)
        # Reduce over the label axis; Keras averages over the batch
        return K.mean(focal_cross_entropy_loss, axis=-1)

@MrSnappingTurtle and @dberma15
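A possible usage sketch for the class above (the model is assumed to come from one of the earlier examples); any callable with the signature (y_true, y_pred) can be passed as a Keras loss:

focal = FocalLoss(gamma=2, alpha=0.25)
model.compile(optimizer='adam', loss=focal.compute_loss)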
@daniel410 Hi, would you mind sharing how you implement your focal loss for the multi-label task, if it's not too much trouble? |
It gives me an error.
You can try my implementation and let me know if it works. https://github.com/sushanttripathy/Keras_loss_functions/blob/master/focal_loss.py |
@sushanttripathy I tried your code and it works, but the output focal_loss_tensor is a 2D tensor. Should I take a mean to arrive at the final loss?
@Vishnux0pa I am not sure if the auto-differentiation requires me to provide the loss per sample (instead of per batch). I looked at categorical_crossentropy, and it seemed like that's what it was doing. I did not get convergence with the earlier version of the loss (the one that yielded a scalar). It does converge with this one though. |
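For context, a sketch of the convention being described: built-in Keras losses reduce only the label axis and return one value per sample, and Keras itself applies the final batch mean (and any sample weights):

import tensorflow as tf
K = tf.keras.backend

def per_sample_loss(y_true, y_pred):
    # Shape [batch, n_labels] -> [batch]; Keras takes the batch mean itself
    return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)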
I need to train a multi-label classifier for a text topic classification task. Having searched around the internet, I followed the suggestion to use sigmoid + binary_crossentropy. But I can't get good results (i.e., subset accuracy) on the validation set, although the loss is very small. After reading the source code in Keras, I found that the binary_crossentropy loss averages the per-label cross-entropy over the output dimension.
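Reconstructed from keras/losses.py (Keras 2.x), that definition is essentially:

import tensorflow as tf
K = tf.keras.backend

def binary_crossentropy(y_true, y_pred):
    # Element-wise cross-entropy, then a mean over the label axis
    return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)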
My doubt is whether it makes sense to use the average in a multi-label classification task. Suppose the dimension of the label set is 30 and each training sample has only two or three of the labels. Since most of the labels are zero in most of the samples, I suspect this loss encourages the classifier to predict a tiny probability in every output dimension.
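To make the concern concrete, a quick back-of-the-envelope check (the numbers are illustrative): with 30 labels of which 2 are positive, a model that predicts p = 0.01 for every label already achieves a small average loss while missing every positive.

import numpy as np

n_labels, n_pos, p = 30, 2, 0.01  # illustrative numbers
loss_pos = -np.log(p)      # ~4.61 for each of the 2 positive labels
loss_neg = -np.log(1 - p)  # ~0.01 for each of the 28 negative labels
avg = (n_pos * loss_pos + (n_labels - n_pos) * loss_neg) / n_labels
print(avg)  # ~0.32: low average loss despite missing every positive label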
Following the idea in #2826, I also gave categorical_crossentropy a try, but still had no luck.
Any tips on choosing a loss function for a multi-label classification task are more than welcome. Thanks in advance.