Fix the kl_div docs #67443
Conversation
Fixes #57459. After discussing the linked issue, we resolved that `F.kl_div` computes the right thing in order to be consistent with the rest of the losses in PyTorch. To avoid any confusion, these docs add a note discussing how the PyTorch implementation differs from the mathematical definition and the reasons for doing so. These docs also add an example that may further help understand the intended use of this loss.
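For illustration, here is a minimal sketch of how the PyTorch convention relates to the textbook definition of the KL divergence; the tensors and shapes are made up and this is not the exact example added in the docs:

```python
import torch
import torch.nn.functional as F

# Two discrete distributions over 5 outcomes.
p = torch.softmax(torch.randn(5), dim=0)  # distribution of the observations
q = torch.softmax(torch.randn(5), dim=0)  # distribution produced by the model

# Textbook definition: KL(p || q) = sum_i p_i * (log p_i - log q_i)
kl_math = torch.sum(p * (p.log() - q.log()))

# F.kl_div takes the model output first, as log-probabilities, and the
# target distribution second, mirroring the (input, target) convention
# used by the other PyTorch losses.
kl_torch = F.kl_div(q.log(), p, reduction="sum")

assert torch.allclose(kl_math, kl_torch)
```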
CI Flow Status ⚛️ CI Flow
You can add a comment to the PR and tag @pytorchbot with the following commands:

# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and triggering the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.
💊 CI failures summary and remediations: As of commit 1466581 (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚 This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.
Some style suggestions and fixes inline and one larger possible mixup. Let's check this carefully to not mess this up again.
is set to `False`, the losses are instead summed for each minibatch. Ignored
when :attr:`reduce` is `False`. Default: `True`
RST needs double backticks for inline code. Please revert here and all other occurrences.
Double backticks are broken in the PyTorch theme. See pytorch/pytorch_sphinx_theme#130.
In linalg we adopted the single-backtick partial solution until this problem is solved.
See also the first two posts in #54878.
IIUC, the formatting is only broken for inline code with spaces. You "fixed" single words here, so this should not be an issue.
This is for consistency with the line

    log-space if :attr:`log_target`\ `= True`.

Note that when there's an assignment, the previous formatting was just wrong: there is an :attr: (which, by the way, is also broken), then an equals sign without any formatting, and then the value. This happened for example in the line :attr:`reduction` = ``'mean'``.
To avoid these problems and others, we went provisionally with this formatting until the issue above is fixed. I agree that if the formatting were right, we should just avoid single backticks and :attr: altogether, but here we are.
- Can't you do something like `log_target=True`? This is the PEP 8 style when passing kwargs, e.g. `criterion = nn.KLDivLoss(log_target=True)`.
- :attr: is not broken, you are using it wrong. It is used to reference attributes of a class and not the parameters of a callable. Sphinx doesn't generate any links for parameters, so there is nothing to cross-link against. Your example should be `reduction="mean"` if you want to express that the user needs to set something, or `reduction=="mean"` for a conditional. Both should be put in double backticks, respectively.
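To make the markup discussion concrete, here is a hypothetical docstring fragment (the function name and wording are made up for illustration and do not come from the PR) contrasting the two styles:

```python
def kl_div_markup_example():
    """Illustrative docstring only.

    Formatting argued against above (an :attr: role, a bare equals sign,
    and a separately formatted value)::

        Applies when :attr:`reduction` = ``'mean'``.

    Formatting suggested instead (spell out the keyword argument and wrap
    it entirely in double backticks)::

        Applies when ``reduction="mean"``; use ``reduction=="mean"`` when
        describing a conditional.
    """
```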
Addressed the points. Thanks @pmeier!
torch/nn/modules/loss.py
>>> # Sample a batch of distributions. Usually this would come from the dataset
>>> def normalize(p):
        return p / p.sum(1, keepdim=True)
>>> return p / p.sum(1, keepdim=True)
Indented continuation lines need to start with `...`
>>> return p / p.sum(1, keepdim=True)
...     return p / p.sum(1, keepdim=True)
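Putting the suggestion together, a corrected doctest block would look roughly like the following; this is a sketch assuming the example builds an `nn.KLDivLoss` with `reduction="batchmean"`, not the exact snippet merged in the PR:

```python
>>> import torch
>>> import torch.nn as nn
>>> kl_loss = nn.KLDivLoss(reduction="batchmean")
>>> # Sample a batch of distributions. Usually this would come from the dataset
>>> def normalize(p):
...     return p / p.sum(1, keepdim=True)
>>> target = normalize(torch.rand(3, 5))
>>> # The input is expected to contain log-probabilities
>>> input = torch.log_softmax(torch.randn(3, 5), dim=1)
>>> output = kl_loss(input, target)
```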
Only nitpicks and style fixes left.
Thanks for the update! Added a few comments below
Thanks for the detailed review @jbschlosser! I just addressed the comments. Could you have a second look at this, please?
LGTM! Thanks for the in-depth improvements :)
@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@jbschlosser merged this pull request in 8e1ead8.
Stack from ghstack:

Fixes #57459

After discussing the linked issue, we resolved that `F.kl_div` computes the right thing in order to be consistent with the rest of the losses in PyTorch.

To avoid any confusion, these docs add a note discussing how the PyTorch implementation differs from the mathematical definition and the reasons for doing so.

These docs also add an example that may further help understand the intended use of this loss.

cc @brianjo @mruberry

Differential Revision: D32136888