
use logsigmoid at multilabel_soft_margin_loss, and change output from shape=(N, C) to (N,) #9965

Closed

Conversation

@weiyangfb
Contributor

weiyangfb commented Jul 27, 2018

  • fixes #9141, #9301
  • use logsigmoid at multilabel_soft_margin_loss to make it more stable (NOT fixing legacy MultiLabelSoftMarginCriterion)
  • return shape (N,) instead of (N, C) to match the behavior of MultiMarginLoss
  • Note that with this PR, the following behavior is expected:
loss = F.multilabel_soft_margin_loss(outputs, labels, reduction='none')
loss_mean = F.multilabel_soft_margin_loss(outputs, labels, reduction='elementwise_mean')
loss_sum = F.multilabel_soft_margin_loss(outputs, labels, reduction='sum')

loss.sum() == loss_sum  # True
loss.mean() == loss_mean  # True
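
As a rough sketch of the behavior described above, and not the PR's exact code (the real change lives in torch/nn/functional.py; the helper name below is made up, and whether the per-sample reduction over classes ends up a sum or a mean is settled later in this thread), the stabilized form looks roughly like this:

```
import torch
import torch.nn.functional as F

def stable_multilabel_soft_margin_loss(input, target, weight=None,
                                        reduction='elementwise_mean'):
    # logsigmoid(x) is computed in a numerically safe way, unlike
    # log(sigmoid(x)), which underflows to log(0) for very negative x.
    loss = -(target * F.logsigmoid(input) + (1 - target) * F.logsigmoid(-input))
    if weight is not None:
        loss = loss * weight
    # Reduce over the class dimension so the unreduced output has shape (N,).
    # Averaging over classes here is an assumption; see the discussion below.
    loss = loss.mean(dim=1)
    if reduction == 'none':
        return loss
    return loss.sum() if reduction == 'sum' else loss.mean()
```

With reduction='none' this returns one value per sample, matching the shape=(N,) in the title.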
Contributor

facebook-github-bot left a comment

weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

if reduction == 'sum':
    return loss.sum()
else:  # elementwise_mean
    return loss.mean()

@soumith

soumith Jul 30, 2018

Member

this is not element-wise mean, it's global mean.

@weiyangfb

weiyangfb Jul 30, 2018

Author Contributor

The doc and test suggest this should be a global mean. So yeah, the args name is confusing...

@weiyangfb

weiyangfb Jul 30, 2018

Author Contributor

Created issue #10034 for the doc; we will fix this at the docathons.

Member

soumith left a comment

approved. change comment.

input = torch.sigmoid(input)
return binary_cross_entropy(input, target, weight, None, None, reduction)

loss = - (target * logsigmoid(input) + (1 - target) * logsigmoid(-input))

@soumith

soumith Jul 30, 2018

Member

Isn't it

loss = - ((target * logsigmoid(input) + (1 - target) * logsigmoid(-input)))

i.e. negative sign comes outer?

@weiyangfb

weiyangfb Jul 30, 2018

Author Contributor

Yes, the neg sign should be outer, and it is; the negation wraps the entire sum:

loss = - (target * logsigmoid(input) + (1 - target) * logsigmoid(-input))
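
For illustration (not from the PR itself), a quick check that the logsigmoid form matches the old sigmoid + binary_cross_entropy path elementwise, while staying finite where the old path would hit log(0):

```
import torch
import torch.nn.functional as F

torch.manual_seed(0)
input = torch.randn(3, 5)
target = torch.randint(0, 2, (3, 5)).float()

old = F.binary_cross_entropy(torch.sigmoid(input), target, reduction='none')
new = -(target * F.logsigmoid(input) + (1 - target) * F.logsigmoid(-input))
print(torch.allclose(old, new, atol=1e-6))     # True for moderate logits

# For a large-magnitude logit the old path underflows to log(0) = -inf,
# while logsigmoid stays finite:
big = torch.tensor([200.0])
print(torch.log(torch.sigmoid(-big)))          # tensor([-inf])
print(F.logsigmoid(-big))                      # tensor([-200.])
```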

@weiyangfb weiyangfb force-pushed the weiyangfb:stable_multilabel_soft_margin_loss branch from 1376a8b to 210f88e Jul 31, 2018

if weight is not None:
    loss = loss * weight

if reduction != 'none':

@li-roy

li-roy Jul 31, 2018

Contributor

The two lines at the top of the function parse reduce and size_average into reduction, so you only need to handle the cases for reduction.
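
For readers unfamiliar with that parsing step, a hedged sketch of the size_average / reduce to reduction mapping (the function name below is illustrative, not PyTorch's internal helper):

```
def legacy_args_to_reduction(size_average=None, reduce=None):
    # Both legacy flags defaulted to True when left unspecified.
    size_average = True if size_average is None else size_average
    reduce = True if reduce is None else reduce
    if not reduce:
        return 'none'
    # At the time of this PR the mean variant was named 'elementwise_mean'.
    return 'elementwise_mean' if size_average else 'sum'
```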

if reduction == 'sum':
    return loss.sum()
else:  # global mean
    return loss.mean()

@li-roy

li-roy Jul 31, 2018

Contributor

Let's resolve #9301 and #10034 (both are the same issue) here.

The equation in the docs as well as the reduction arg doc both indicate that it should be an elementwise mean (because there are N elements in the output, not N*C). Let's go with the doc behavior and make this a proper elementwise mean.
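
To make the N vs. N*C point concrete, a small hedged illustration (numbers are arbitrary):

```
import torch

N, C = 4, 3
unreduced = torch.rand(N, C)              # per-class losses

per_sample_mean = unreduced.mean(dim=1)   # shape (N,), mean over classes
per_sample_sum = unreduced.sum(dim=1)     # shape (N,), sum over classes

# 'elementwise_mean' then averages the N per-sample values:
print(per_sample_mean.mean())             # equals unreduced.mean() over all N*C entries
print(per_sample_sum.mean())              # equals C * unreduced.mean()
```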


@weiyangfb weiyangfb changed the title from "use logsigmoid at multilabel_soft_margin_loss" to "[wip] use logsigmoid at multilabel_soft_margin_loss, fix 'elementwise_mean'" Aug 1, 2018
@weiyangfb weiyangfb force-pushed the weiyangfb:stable_multilabel_soft_margin_loss branch from 9ea862e to 5a10812 Aug 2, 2018
@li-roy
Contributor

li-roy commented Aug 2, 2018

Why are we changing the behavior of all these losses?

@weiyangfb weiyangfb force-pushed the weiyangfb:stable_multilabel_soft_margin_loss branch from 5a10812 to c2b71e0 Aug 2, 2018

@weiyangfb weiyangfb changed the title from "[wip] use logsigmoid at multilabel_soft_margin_loss, fix 'elementwise_mean'" to "use logsigmoid at multilabel_soft_margin_loss, and change output from shape=(N, C) to (N,)" Aug 2, 2018
return binary_cross_entropy(input, target, weight, None, None, reduction)

loss = -(target * logsigmoid(input) + (1 - target) * logsigmoid(-input))
loss.sum(dim=0) # only return N loss values

@li-roy

li-roy Aug 2, 2018

Contributor

I think this should be something like
loss = loss.sum(dim=1)

@weiyangfb

weiyangfb Aug 2, 2018

Author Contributor

oh, you are right, I will change that
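
A quick illustration of the dim mix-up (hedged, just to show the shapes):

```
import torch

loss = torch.rand(4, 3)          # (N, C) = (4, 3) per-class losses
print(loss.sum(dim=0).shape)     # torch.Size([3]): C values, summed over samples
print(loss.sum(dim=1).shape)     # torch.Size([4]): N values, summed over classes (intended)
```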

weiyangfb added 2 commits Aug 2, 2018
@weiyangfb

Contributor Author

weiyangfb commented Aug 3, 2018

@li-roy I made changes accordingly, maybe take a look when you get a chance?

Contributor

li-roy left a comment

Looks good for the most part, just need to be consistent in the behavior.

loss = loss.sum(dim=1)  # only return N loss values

if reduction == 'none':
    return loss

@li-roy

li-roy Aug 3, 2018

Contributor

I'm still a bit confused by the behavior we have here. It looks like if reduction is none, we're returning a tensor of N losses, and each individual loss is the sum of the losses for each class of that sample. However, when reduction is elementwise_mean or sum, it seems that you're now averaging the losses for each class, rather than summing them. We should choose one of these and stick to it; I believe summing is the legacy behavior.

@weiyangfb

weiyangfb Aug 3, 2018

Author Contributor

oh, you're right. Yeah, I will change that to follow the legacy behavior, so that

loss = F.multilabel_soft_margin_loss(outputs, labels, reduction='none')
loss_mean = F.multilabel_soft_margin_loss(outputs, labels, reduction='elementwise_mean')
loss_sum = F.multilabel_soft_margin_loss(outputs, labels, reduction='sum')

loss.sum() == loss_sum  # True
loss.mean() == loss_mean  # True

@li-roy

li-roy Aug 3, 2018

Contributor

Unless a loss is storing its elements in a weird way, the general rule of thumb is: if loss is the unreduced loss, then loss.mean() is the elementwise mean and loss.sum() is the sum.
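
That rule of thumb can be checked against the merged behavior; a hedged check with current PyTorch versions (where the mean reduction is now spelled 'mean' rather than 'elementwise_mean'):

```
import torch
import torch.nn.functional as F

torch.manual_seed(0)
outputs = torch.randn(8, 5)
labels = torch.randint(0, 2, (8, 5)).float()

loss = F.multilabel_soft_margin_loss(outputs, labels, reduction='none')
loss_mean = F.multilabel_soft_margin_loss(outputs, labels, reduction='mean')
loss_sum = F.multilabel_soft_margin_loss(outputs, labels, reduction='sum')

print(loss.shape)                               # torch.Size([8]), one loss per sample
print(torch.allclose(loss.sum(), loss_sum))     # True
print(torch.allclose(loss.mean(), loss_mean))   # True
```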

weiyangfb added 2 commits Aug 3, 2018
@li-roy

Contributor

li-roy commented Aug 3, 2018

If you're planning to go with averaging across classes, don't forget to change the formula in the doc in torch/nn/modules/loss.py

@li-roy li-roy approved these changes Aug 3, 2018

goodlux pushed a commit to goodlux/pytorch that referenced this pull request Aug 15, 2018
… shape=(N, C) to (N,) (pytorch#9965)

Summary:
- fixes pytorch#9141, pytorch#9301
- use logsigmoid at multilabel_soft_margin_loss to make it more stable (NOT fixing legacy MultiLabelSoftMarginCriterion)
- return (N) instead of (N, C) to match the same behavior as MultiMarginLoss
- Note that with this PR, the following behavior is expected:
```
loss = F.multilabel_soft_margin_loss(outputs, labels, reduction='none')
loss_mean = F.multilabel_soft_margin_loss(outputs, labels, reduction='elementwise_mean')
loss_sum = F.multilabel_soft_margin_loss(outputs, labels, reduction='sum')

loss.sum() == loss_sum  # True
loss.mean() == loss_mean  # True
```
Pull Request resolved: pytorch#9965

Differential Revision: D9038402

Pulled By: weiyangfb

fbshipit-source-id: 0fa94c7b3cd370ea62bd6333f1a0e9bd0b8ccbb9
@ezyang ezyang added the merged label Jun 26, 2019