
loss function for multi-label problem #2166

Closed
kingfengji opened this issue Apr 2, 2016 · 21 comments

@kingfengji commented Apr 2, 2016

Hi there, how do I choose a loss function for a multi-label problem?

It's different from multi-class output: a multi-label output is a 0/1 vector with multiple ones, whereas a multi-class output is a single one-hot vector.

Thanks

@kingfengji (Author) commented Apr 2, 2016

The multi-class crossentropy is for multi-class problems, I suppose?

@NasenSpray commented Apr 2, 2016

> The multi-class crossentropy is for multi-class problems, I suppose?

Yes. What you want is `binary_crossentropy`.

@kingfengji (Author) commented Apr 2, 2016

@NasenSpray `binary_crossentropy` is for multi-class, but not multi-label, right?

@NasenSpray commented Apr 2, 2016

- `categorical_crossentropy`: 1-of-N (one-hot)
- `binary_crossentropy`: 1-or-more 0/1 labels
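
A minimal sketch of the two setups in Keras (the layer sizes and names below are hypothetical):

```python
from keras.models import Sequential
from keras.layers import Dense

num_features, num_classes = 100, 4  # hypothetical sizes

# Multi-class (1-of-N): softmax output + categorical_crossentropy
multiclass = Sequential([
    Dense(64, activation='relu', input_shape=(num_features,)),
    Dense(num_classes, activation='softmax'),
])
multiclass.compile(optimizer='adam', loss='categorical_crossentropy')

# Multi-label (0-or-more 0/1 labels): sigmoid output + binary_crossentropy,
# so each label becomes an independent yes/no decision
multilabel = Sequential([
    Dense(64, activation='relu', input_shape=(num_features,)),
    Dense(num_classes, activation='sigmoid'),
])
multilabel.compile(optimizer='adam', loss='binary_crossentropy')
```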

@keunwoochoi (Contributor) commented Apr 2, 2016

MSE/MAE also work, though binary crossentropy is generally preferred.

@alyato commented Jul 13, 2016

@kingfengji I'm doing multi-label too. Could you share your code, or give a simple example of how to implement multi-label classification? Thanks.

@suraj-deshmukh commented Nov 22, 2016

@michelleowen commented Feb 3, 2017

If we use `binary_crossentropy` as the loss function, does it mean we are minimizing the average of the cross-entropies over all classes?

@keunwoochoi (Contributor) commented Feb 3, 2017

@lipeipei31 commented Mar 31, 2017

@keunwoochoi Could you explain why binary crossentropy is preferred for multi-label classification? I thought binary crossentropy was only for binary classification, where the y label is just 0 or 1. Now that the y label is in the format [1,0,1,0,1,...], do you know how the loss is calculated with binary crossentropy?

@xidongbo commented Apr 14, 2017

Thanks. My last layer is a softmax layer. When I use `binary_crossentropy` as the loss I get 99% accuracy, but with other loss functions I get only 10% accuracy.
I want to know how the accuracy is calculated.

@xidongbo commented Apr 14, 2017

I get high accuracy, but when I look at the predicted labels, I find they are all zeros.

@keunwoochoi (Contributor) commented Apr 14, 2017

@1064950364 Yes, that's the definition of accuracy, and that's why accuracy doesn't matter for many multi-label problems. In your true labels there are many zeros, right?

@lipeipei31 More precisely, crossentropy is preferred over MAE/MSE. It's bounded, and its loss value (which is proportional to the gradient applied) is more plausible. In this case it computes the crossentropy over each output and then takes their average.
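
A small sketch of both points, with made-up numbers (10 labels, 2 of them positive):

```python
import numpy as np

# Hypothetical sparse multi-label target: 2 positives out of 10 labels
y_true = np.array([0, 1, 0, 0, 0, 0, 0, 1, 0, 0], dtype=float)
y_pred = np.full(10, 0.1)  # a model that predicts "almost nothing"

# binary_crossentropy: per-label crossentropy, then the mean over labels
eps = 1e-7
p = np.clip(y_pred, eps, 1 - eps)
bce = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Binary accuracy thresholds each output at 0.5 independently, so an
# all-zero prediction is already "right" on 8 of the 10 labels
acc = np.mean((y_pred > 0.5) == y_true.astype(bool))
print(bce, acc)  # high accuracy (0.8) despite predicting no labels at all
```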

@alyato commented May 1, 2017

@1064950364 Did you compare the two different activations in the last layer, softmax and sigmoid?

@stale stale bot commented Jul 30, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot closed this Aug 29, 2017
@sarthakahuja11 commented Jun 18, 2018

I need to classify attributes of a face, such as eye colour, hair colour, skin colour, facial hair, lighting, and so on. Each has a few sub-categories. So should I apply sigmoid directly to all the labels, or apply softmax separately to each subcategory (hair colour, eye colour, etc.)?
Which one will be better in this case?
Or should I combine both, since some subclasses are binary?

@lipeipei31 commented Jun 18, 2018

@sarthakahuja11 It sounds like you have a multi-output problem where each output is a binary or multi-class classification. I think you should use a different loss function for each output.

@sarthakahuja11 commented Jun 19, 2018

@lipeipei31 You have identified the problem correctly. So I should choose binary cross-entropy for the binary outputs and categorical cross-entropy for the multi-class outputs, and then combine them in the same model?

@lipeipei31 commented Jun 20, 2018

@sarthakahuja11 Yes, that's right. And you can do that easily with the Keras functional API: https://keras.io/getting-started/functional-api-guide/#multi-input-and-multi-output-models. The loss functions can be given as a list, or as a dictionary if you have named the outputs.
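
A minimal sketch of such a multi-output model (the attribute names, class counts, and input size below are hypothetical):

```python
from keras.models import Model
from keras.layers import Input, Dense

# Hypothetical face-attribute model
inputs = Input(shape=(2048,))               # e.g. features from a CNN backbone
x = Dense(256, activation='relu')(inputs)

hair = Dense(5, activation='softmax', name='hair_color')(x)    # 1-of-5
eyes = Dense(4, activation='softmax', name='eye_color')(x)     # 1-of-4
beard = Dense(1, activation='sigmoid', name='facial_hair')(x)  # binary

model = Model(inputs=inputs, outputs=[hair, eyes, beard])

# One loss per named output; Keras minimizes their (weighted) sum
model.compile(
    optimizer='adam',
    loss={
        'hair_color': 'categorical_crossentropy',
        'eye_color': 'categorical_crossentropy',
        'facial_hair': 'binary_crossentropy',
    },
)
```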

@sarthakahuja11 commented Jun 20, 2018

Thanks! @lipeipei31

@tomkastek commented Mar 25, 2019

Hi,

if binary cross-entropy works in Keras for multi-label problems, will `categorical_crossentropy` work for multiple one-hot encoded classes as well?

My example output is:

```
[
    [0,0,1,0],
    [0,0,0,1],
    [1,0,0,0]
]
```

So I have three one-hot encoded vectors. For a single one, the loss function to choose would be categorical cross-entropy. What will Keras do in a case like this?
