
Inconsistency CrossEntropyLoss vs BCELoss regarding logits/probability space #128493

Open
ego-thales opened this issue Jun 12, 2024 · 2 comments
ego-thales commented Jun 12, 2024

🚀 The feature, motivation and pitch

The loss CrossEntropyLoss expects logits as inputs. For the binary case, on the other hand, there are both BCELoss (which expects probabilities) and BCEWithLogitsLoss (which expects logits). But there is no counterpart to CrossEntropyLoss that expects inputs in probability space.
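
For context, a minimal sketch of the current asymmetry (the tensor shapes are just illustrative):

```python
import torch
import torch.nn as nn

# Multi-class: only a logits-based loss exists in the core API.
logits = torch.randn(4, 3)                  # raw, unnormalized scores
target = torch.randint(0, 3, (4,))
ce = nn.CrossEntropyLoss()(logits, target)  # expects logits

# Binary: both variants exist.
bin_logits = torch.randn(4)
bin_target = torch.randint(0, 2, (4,)).float()
bce_with_logits = nn.BCEWithLogitsLoss()(bin_logits, bin_target)  # logit space
bce = nn.BCELoss()(bin_logits.sigmoid(), bin_target)              # probability space
```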

Alternatives

I think it would make more sense if CrossEntropyLoss expected probability inputs and a new CrossEntropyWithLogitsLoss were introduced. But that would be a breaking change to the API.

As such, I would propose adding a with_logits option (defaulting to True to keep backward compatibility) to CrossEntropyLoss.
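
Purely to illustrate the idea, here is a rough functional sketch of the proposed option (the with_logits argument is hypothetical and not part of the current API):

```python
import torch.nn.functional as F

# Hypothetical sketch only: `with_logits` is NOT an existing argument of
# cross_entropy / CrossEntropyLoss; this just shows what the option could do.
def cross_entropy(input, target, with_logits=True):
    if with_logits:
        log_probs = F.log_softmax(input, dim=-1)   # current behaviour
    else:
        # Interpret `input` as probabilities and move to log space.
        log_probs = input.clamp_min(1e-12).log()
    return F.nll_loss(log_probs, target)
```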

What do you think?

Additional context

Finally, I ask the following questions:

  • Is it possible to easily train a neural net that outputs softmax probabilities rather than logits? Since there is no CrossEntropyLoss counterpart for probability space, the user seems forced to add an extra step and use NLLLoss, or to write a custom loss (see the sketch after this list).
  • Is the fact that CrossEntropyLoss doesn't work in probability space due to under/overflow handling in general?
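
To make the first question concrete, a minimal sketch of the extra step mentioned above (the toy Linear(10, 3) model and shapes are purely illustrative):

```python
import torch
import torch.nn as nn

# A model that outputs probabilities, trained via NLLLoss on the log of
# those probabilities.
model = nn.Sequential(nn.Linear(10, 3), nn.Softmax(dim=-1))
loss_fn = nn.NLLLoss()  # expects log-probabilities

x = torch.randn(8, 10)
target = torch.randint(0, 3, (8,))

probs = model(x)                     # probability space
loss = loss_fn(probs.log(), target)  # extra .log() step; less stable than log_softmax
loss.backward()
```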

Thanks!

cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki

@mikaylagawarecki added the module: nn and triaged labels on Jun 13, 2024
@rajveer43 (Contributor) commented:

Can I work on it?

@jbschlosser (Contributor) commented:

Note that nn.NLLLoss takes log-probabilities as inputs, and operating on these is generally more numerically stable.

Is the fact that CrossEntropyLoss doesn't work in probability space due to under/overflow handling in general?

So I'd say yes to this :)
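
For reference, a small sketch of that relationship (shapes are arbitrary):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 5)
target = torch.randint(0, 5, (8,))

# cross_entropy is equivalent to log_softmax followed by nll_loss, with the
# normalization handled in log space to avoid under/overflow.
a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=-1), target)
assert torch.allclose(a, b)
```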
