
Conversation

Contributor

@ailzhang ailzhang commented Nov 28, 2018

Fixes #6622 .
We used to average over all elements for KL divergence, which is not aligned with its math definition.
This PR corrects the default reduction behavior of KL divergence so that it now averages over the batch dimension.

  • In KL, the default behavior reduction='mean' averages over the batch dimension, while for most other loss functions reduction='mean' averages over all elements (a short sketch of the difference follows this list).
  • We used to support scalar tensors as well. For backward-compatibility purposes we still support them; no reduction is performed on scalar tensors.
  • Added a new reduction mode called batchmean, which has the correct behavior for KL. Also added a warning that batchmean will become the default for KL instead of mean in the next major release.
  • [deprecated] I chose not to add a new reduction option, since "mean over batch dimension" is rather special and only makes sense in a few cases like KL. We don't want to explain why there's an option "batchmean" that is not applicable to all other loss functions. I'm open to discussion on this one, as I cannot think of a perfect solution.
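
For illustration, here is a minimal sketch of how the two reductions differ (hypothetical shapes; it assumes the new 'batchmean' option is available):

import torch
import torch.nn.functional as F

# Hypothetical example: a batch of 4 distributions over 10 classes.
log_probs = F.log_softmax(torch.randn(4, 10), dim=1)  # input is expected as log-probabilities
target = F.softmax(torch.randn(4, 10), dim=1)         # target is expected as probabilities

pointwise = F.kl_div(log_probs, target, reduction='none')  # shape (4, 10), no reduction

# 'mean' divides the summed loss by the number of elements (4 * 10), while
# 'batchmean' divides only by the batch size (4), matching the math definition of KL divergence.
loss_mean = F.kl_div(log_probs, target, reduction='mean')            # == pointwise.mean()
loss_batchmean = F.kl_div(log_probs, target, reduction='batchmean')  # == pointwise.sum() / 4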

@ailzhang ailzhang added 1.0 module: bc-breaking Related to a BC-breaking change labels Nov 28, 2018
@ailzhang ailzhang requested a review from ssnl November 28, 2018 06:53
Collaborator

@ssnl ssnl left a comment


Can we get an expect test that this averages along the batch dimension?

@ailzhang
Contributor Author

@ssnl I added a test to compare it with reduction='none'. Let me know what you think, thanks!
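
A minimal sketch of what such a comparison test could look like (hypothetical shapes; not the exact test added in this PR):

import torch
import torch.nn.functional as F

def test_kl_div_batchmean_matches_unreduced_loss():
    # Hypothetical shapes; the real test in the PR may differ.
    log_probs = F.log_softmax(torch.randn(8, 5), dim=1)
    target = F.softmax(torch.randn(8, 5), dim=1)

    # 'batchmean' should equal the unreduced loss summed and divided by the batch size.
    expected = F.kl_div(log_probs, target, reduction='none').sum() / log_probs.size(0)
    actual = F.kl_div(log_probs, target, reduction='batchmean')

    assert torch.allclose(actual, expected)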

Contributor

@facebook-github-bot facebook-github-bot left a comment


@ailzhang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ailzhang ailzhang changed the title Fix kl_div default behavior Add new reduction mode in kl_div Nov 28, 2018
@ailzhang ailzhang removed the module: bc-breaking Related to a BC-breaking change label Nov 28, 2018

specifying either of those two args will override :attr:`reduction`. Default: 'mean'
'none' | 'batchmean' | 'sum' | 'mean'. 'none': no reduction will be applied,
'batchmean': the sum of the output will be divided by the number of
batches in the output, 'sum': the output will be summed, 'mean': the output will be
Member


"the number of batches" -> "the batch size" or "the number of elements in the input batch"


@yf225
Contributor

yf225 commented Dec 3, 2018

@pytorchbot retest this please

Collaborator

@ssnl ssnl left a comment


I don't think batch_mean should be an enum element. It would also introduce weird behavior for the losses still defined in TH if used with batch_mean, since the if-statements there are not written with batch_mean in mind.

None, // Do not reduce
Mean, // (Possibly weighted) mean of losses
Sum, // Sum losses
BatchMean, // Mean over batches. = Sum / batchsize
Collaborator


Can't you just hack this up in Python? It should only be a workaround for kl_div and doesn't make sense for other losses.
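
A rough sketch of the kind of Python-level workaround being suggested, written as a hypothetical standalone helper rather than the code that ended up in this PR:

import torch.nn.functional as F

def kl_div_batchmean(log_probs, target):
    # Hypothetical helper: reduce with 'sum', then divide by the batch size,
    # which is the behavior the 'batchmean' mode is meant to provide for KL divergence.
    loss = F.kl_div(log_probs, target, reduction='sum')
    if log_probs.dim() != 0:  # keep the backward-compatible behavior for scalar tensors
        loss = loss / log_probs.size(0)
    return loss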


Collaborator

@ssnl ssnl left a comment


I'm now wondering if reduction='none' is wrong... But this generally LGTM.

reduction_enum = _Reduction.legacy_get_enum(size_average, reduce)
else:
if reduction == 'mean':
warnings.warn("reduction=mean doesn't give the true kl divergence value. "
Collaborator


Better to put quotes around mean and batchmean. Also, I think this can be worded more clearly, like:

reduction: "mean" divides the total loss by both the batch size and the support size. 
"batchmean" divides only by the batch size, and aligns with the KL divergence math definition. 
"mean" will be changed to behave the same as "batchmean" in the next major release.

'mean': the output will be divided by the number of elements in the output
Note: :attr:`size_average` and :attr:`reduce` are in the process of being deprecated,
and in the meantime, specifying either of those two args will override :attr:`reduction`.
Note: `reduction='mean'` doesn't return the true kl divergence value, please use
Collaborator


I would make this a more obvious note using .. note::. Same for the module doc.

@ailzhang ailzhang deleted the fix_kl branch December 7, 2018 00:03
@ezyang ezyang added this to the 1.0 milestone Apr 1, 2019
@ezyang ezyang added the merged label Jun 25, 2019

Development

Successfully merging this pull request may close these issues: Possible KL_loss bug on output dimension average.