DOC Improves shape documentation for *Flatten #60980

thomasjpfan · 2021-06-29T19:58:30Z

Fixes #60841

facebook-github-bot · 2021-06-29T19:58:38Z

💊 CI failures summary and remediations

As of commit cad4578 (more details on the Dr. CI page and at hud.pytorch.org/pr/60980):

1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

pytorch_linux_xenial_py3_clang5_asan_test2 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jul 05 15:34:54 SUMMARY: UndefinedBehaviorSanit.../jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in

Jul 05 15:34:54     #9 0x55c4718698f2 in PyEval_EvalCode /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/ceval.c:731
Jul 05 15:34:54     #10 0x55c4718d1cd5 in run_mod /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:1025
Jul 05 15:34:54     #11 0x55c4718d3d5d in PyRun_StringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:949
Jul 05 15:34:54     #12 0x55c4718d3dbb in PyRun_SimpleStringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:445
Jul 05 15:34:54     #13 0x55c4718d4926 in run_command /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:301
Jul 05 15:34:54     #14 0x55c4718d4926 in Py_Main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:749
Jul 05 15:34:54     #15 0x55c47180e196 in main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Programs/python.c:69
Jul 05 15:34:54     #16 0x7fc3fb0c083f in __libc_start_main /build/glibc-S7Ft5T/glibc-2.23/csu/../csu/libc-start.c:291
Jul 05 15:34:54     #17 0x55c47189e33d in _start (/opt/conda/bin/python3.6+0x1a733d)
Jul 05 15:34:54 
Jul 05 15:34:54 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in 
Jul 05 15:34:54 + retcode=1
Jul 05 15:34:54 + set -e
Jul 05 15:34:54 + return 1
Jul 05 15:34:54 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *-NO_AVX-* ]]
Jul 05 15:34:54 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *-NO_AVX2-* ]]
Jul 05 15:34:54 + '[' -n https://github.com/pytorch/pytorch/pull/60980 ']'
Jul 05 15:34:54 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 != *coverage* ]]
Jul 05 15:34:54 ++ mktemp
Jul 05 15:34:54 + DETERMINE_FROM=/tmp/tmp.pjAmaL36eo
Jul 05 15:34:54 + file_diff_from_base /tmp/tmp.pjAmaL36eo

Preview docs built from this PR

This comment was automatically generated by Dr. CI (expand for details).

jbschlosser

Thanks for the update! Mind making the analogous change in Flatten for consistency?

jbschlosser · 2021-06-30T19:51:25Z

torch/nn/modules/flatten.py

-        - Input: :math:`(N, *dims)`
-        - Output: :math:`(N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})`
+        - Input: :math:`(*)`. Input can be any shape.
+        - Output: :math:`(*, *unflattened\_size, *)`.


Any way to indicate that *unflattened_size is expanded at dim? Maybe something like (*, dim, *) before and (*, *unflattened\_size, *) after (where * represents any number of dimensions, including none)- wdyt?

(*, dim, *) would be inconsistent with how the size is normally specified in other Shape descriptions.

What do you think of using dim as a subscript: (*, S_{dim}, *), where S_{dim} is the "size at dimension dim".

Cool, I agree with you that using dim as a subscript is more accurate. We could even explicitly point out that S_{dim} = \prod *unflattened\_size must hold - thoughts on that?

jbschlosser

Looking good! Not to spend too much time on this, but since a bunch of doc updates are coming down the pipe soon, probably good to get a consistent set of rules in place. What are your opinions on:

- Defining explicitly what S_{i} represents (e.g. where S_{i} is the size at dimension i) - does this add anything or is it obvious?
- Using N_{i} instead of S_{i} here - does N_{i} fit with other docs or is it confusing?
- Defining explicitly what * represents - should we do this in the shape section of every doc?
- Explicitly indicating that the rest of output matches the input shape like the Linear docs do - does this add anything or is it obvious?
- Maybe there should be a section somewhere in the PyTorch docs defining the notation used in Shape sections? vs. defining everything in-place.

I guess in general there is a spectrum of notational rigor from loose + possibly implicit to tightly defined + explicit, with the end goal of having docs that are readable, understandable, and unambiguous to at least a reasonable extent. I think the strategy up until now has been taking it on a case-by-case basis, and this unfortunately has led to some inconsistencies throughout the docs.

jbschlosser · 2021-07-01T15:06:13Z

torch/nn/modules/flatten.py

-        - Input: :math:`(N, *dims)`
-        - Output: :math:`(N, \prod *dims)` (for the default case).
+        - Input: :math:`(*, S_{start}, *, S_{end}, *)`
+        - Output: :math:`(*, \prod_{i=start}^{end} S_{i}, *)` (for the default case).


nit: I don't think "(for the default case)" is needed now since the more precise edit should cover all cases I believe
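
As an illustration of the Flatten shape rule in the hunk above, here is a minimal sketch in plain Python that maps an input shape `(*, S_start, ..., S_end, *)` to `(*, \prod_{i=start}^{end} S_i, *)`. The helper name `flatten_shape` is hypothetical and not part of PyTorch; only the defaults `start_dim=1` and `end_dim=-1` mirror `nn.Flatten`.

```python
from math import prod


def flatten_shape(shape, start_dim=1, end_dim=-1):
    """Hypothetical helper: shape after flattening dims start_dim..end_dim (inclusive)."""
    ndim = len(shape)

    def normalize(dim):
        # Accept negative indices, reject anything outside [-ndim, ndim).
        if not -ndim <= dim < ndim:
            raise IndexError(f"dim {dim} is out of range for {ndim} dimensions")
        return dim + ndim if dim < 0 else dim

    start, end = normalize(start_dim), normalize(end_dim)
    if start > end:
        raise ValueError("start_dim must not come after end_dim")

    # Leading dims are kept, the flattened run collapses to one product,
    # and trailing dims are kept.
    return (*shape[:start], prod(shape[start:end + 1]), *shape[end + 1:])


default_case = flatten_shape((32, 1, 5, 5))        # flattens dims 1..3
custom_case = flatten_shape((32, 1, 5, 5), 0, 2)   # flattens dims 0..2
```

With the defaults, only the first dimension survives untouched, which is why the earlier `(N, \prod *dims)` wording described just one case of the general rule.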

thomasjpfan · 2021-07-01T16:16:27Z

> Looking good! Not to spend too much time on this, but since a bunch of doc updates are coming down the pipe soon, probably good to get a consistent set of rules in place.

I think it's good to discuss now and spend a little time on this. Looks like we are going to end up working on #59566 for the "Shape" sections, while updating for no-batch.

> Defining explicitly what S_{i} represents (e.g. where S_{i} is the size at dimension i) - does this add anything or is it obvious?

I prefer explicitly stating it.

> Using N_{i} instead of S_{i} here - does N_{i} fit with other docs or is it confusing?

N is commonly used for "batch". But as we move forward with "no batch", we would be consistently removing N and replacing it with a *. This means N_{i} can be used. I proposed "S" because of "size" and "tensor.size(1)".

> Defining explicitly what * represents - should we do this in the shape section of every doc?

I think we need to do this. Sometimes * can mean "any number of dimensions". But other times it could mean "only none or 1". For example, ReflectionPad2d would be (*, C, H_{in}, W_{in}) where * is one dimension or none.

> Maybe there should be a section somewhere in the PyTorch docs defining the notation used in Shape sections? vs. defining everything in-place.

A section defining the notation is good to have for developers regardless of whether we define it in place. I do like having definitions in-place for users.

Overall I think the "Shape" section should be more on the explicit side.
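
To make the two readings of `*` from this comment concrete, here is a small sketch in plain Python. The helper `leading_dims_ok` and its parameters are hypothetical, introduced only for illustration; the sketch simply counts how many dimensions precede the named trailing ones and compares that count against an optional cap.

```python
def leading_dims_ok(shape, named_dims, max_leading=None):
    """Hypothetical check: does `shape` fit a pattern of the form (*, d_1, ..., d_k)?

    named_dims  -- k, the number of explicitly named trailing dimensions
    max_leading -- None if * means "any number of dimensions",
                   or an int cap (1 for "one dimension or none")
    """
    leading = len(shape) - named_dims
    if leading < 0:
        return False  # not even enough dims for the named part
    return max_leading is None or leading <= max_leading


# (*, C, H, W) where * is one dimension or none
unbatched = leading_dims_ok((3, 8, 8), named_dims=3, max_leading=1)
batched = leading_dims_ok((2, 3, 8, 8), named_dims=3, max_leading=1)
too_many = leading_dims_ok((5, 2, 3, 8, 8), named_dims=3, max_leading=1)

# (*, C, H, W) where * is any number of dimensions
any_number = leading_dims_ok((5, 2, 3, 8, 8), named_dims=3)
```

The same five-dimensional shape is rejected under one reading and accepted under the other, which is the ambiguity an in-place definition of `*` would remove.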

thomasjpfan · 2021-07-01T16:19:10Z

> Explicitly indicating that the rest of output matches the input shape like the Linear docs do - does this add anything or is it obvious?

For the Linear case, I think it is obvious because of how the * lines up. I think we can restrict * to be used only at the start and end?

Something like this:

jbschlosser · 2021-07-01T19:09:53Z

> For the Linear case, I think it is obvious because of how the * lines up. I think we can restrict * to be used only at the start and end?
>
> Something like this:

Ah yes, I like this a lot!

jbschlosser

Flatten update looks good to me! Unflatten is pretty good and I do agree with your thoughts that explicit is better in general.

Only thing I'm wondering is if a U_{i} notation will help readability- wdyt? something like:

Input: (*, S_{dim}, *), where S_{dim} is the size at dimension dim
Output: (*, U_1, ..., U_n, *), where U = unflattened_size and \prod_i^n U_i = S_{dim}
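
The proposed Unflatten notation can be sketched in plain Python as follows. The helper name `unflatten_shape` is hypothetical and not part of PyTorch; the sketch only encodes the rule stated above, namely that `S_dim` is replaced by `U_1, ..., U_n` and that their product must equal `S_dim`. It makes no attempt to cover extras such as inferred sizes.

```python
from math import prod


def unflatten_shape(shape, dim, unflattened_size):
    """Hypothetical helper: shape after expanding `dim` into `unflattened_size`."""
    ndim = len(shape)
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} is out of range for {ndim} dimensions")
    d = dim + ndim if dim < 0 else dim

    # The constraint from the notation: prod(U_1..U_n) == S_dim.
    if prod(unflattened_size) != shape[d]:
        raise ValueError(
            f"product of {tuple(unflattened_size)} does not equal size {shape[d]} at dim {dim}"
        )

    # Dims before and after `dim` are unchanged; `dim` itself is replaced by U.
    return (*shape[:d], *unflattened_size, *shape[d + 1:])


expanded = unflatten_shape((2, 50), 1, (2, 5, 5))
middle = unflatten_shape((4, 6, 7), -2, (2, 3))
```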

torch/nn/modules/flatten.py

jbschlosser

LGTM!

facebook-github-bot · 2021-07-01T21:23:22Z

@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ssnl

Nice! Thanks for fixing. Btw, the formatting will look slightly nicer if you use \text{start} and \text{end}. No worries if you want to merge this as-is, since this is already a drastic improvement.

facebook-github-bot · 2021-07-06T15:29:43Z

@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2021-07-06T17:48:34Z

@jbschlosser merged this pull request in 5503a4a.

DOC Improves documentation for Unflatten

4e1151e

thomasjpfan added the module: nn and triaged labels Jun 29, 2021

thomasjpfan requested a review from jbschlosser June 29, 2021 19:58

thomasjpfan requested a review from albanD as a code owner June 29, 2021 19:58

facebook-github-bot added the cla signed label Jun 29, 2021

pytorchbot added the open source label Jun 29, 2021

thomasjpfan changed the title ~~DOC Improves documentation for Unflatten~~ DOC Improves shape documentation for Unflatten Jun 29, 2021

jbschlosser reviewed Jun 30, 2021

thomasjpfan added 2 commits July 1, 2021 10:48

DOC Updates Flatten

a2a1941

DOC Apply suggestion

4ce26bf

thomasjpfan changed the title ~~DOC Improves shape documentation for Unflatten~~ DOC Improves shape documentation for *Flatten Jul 1, 2021

jbschlosser reviewed Jul 1, 2021

thomasjpfan added 2 commits July 1, 2021 12:23

DOC Be more explicit

fc84ba5

DOC Include * description

c85bbb9

jbschlosser reviewed Jul 1, 2021

DOC Be very explicit with docstrings

4cf4b9d

jbschlosser approved these changes Jul 1, 2021

ssnl reviewed Jul 4, 2021

thomasjpfan added 2 commits July 5, 2021 10:41

DOC Adds \text to make latex slightly nicer

626cff3

Merge remote-tracking branch 'upstream/master' into unflatten_doc_update

cad4578

facebook-github-bot closed this in 5503a4a Jul 6, 2021

facebook-github-bot added the Merged label Jul 6, 2021
