
DOC Improve documentation for LayerNorm #63144


Closed
wants to merge 5 commits into from

Conversation

aspenstarss
Contributor

Fixes #59178

In this [commit](pytorch@7026995), [Line 134](https://github.com/deniskokarev/pytorch/blob/47e286d024c183cb26a464447b34fde88b80d17d/torch/nn/modules/normalization.py#L134) overwrites the `embedding` variable, which causes an error when instantiating `nn.LayerNorm`.

I suggest renaming `embedding` on [Line 133](https://github.com/deniskokarev/pytorch/blob/47e286d024c183cb26a464447b34fde88b80d17d/torch/nn/modules/normalization.py#L133) to `embedding_dim`.

The final example is:
```python
batch, sentence_length, embedding_dim = 20, 5, 10
embedding = torch.randn(batch, sentence_length, embedding_dim)
layer_norm = nn.LayerNorm(embedding_dim)
```
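To illustrate why the rename matters, here is a minimal sketch contrasting the fixed example with the original naming (editor's illustration; the exact error message varies by PyTorch version):

```python
import torch
import torch.nn as nn

# Fixed example: the dimension size and the tensor have distinct names,
# so nn.LayerNorm receives the int 10 as intended.
batch, sentence_length, embedding_dim = 20, 5, 10
embedding = torch.randn(batch, sentence_length, embedding_dim)
layer_norm = nn.LayerNorm(embedding_dim)
out = layer_norm(embedding)
print(out.shape)  # torch.Size([20, 5, 10])

# With the original naming, the second assignment rebinds `embedding`
# to a tensor, so nn.LayerNorm(embedding) no longer receives an int
# and construction fails.
```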
@facebook-github-bot
Contributor

Hi @aspenstarss!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot
Contributor

facebook-github-bot commented Aug 12, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit db20e44 (more details on the Dr. CI page):


  • 8/8 failures possibly* introduced in this PR
    • 1/8 non-scanned failure(s)

🕵️ 4 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (1/4)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Aug 12 18:53:07 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
Aug 12 18:53:07 ++++ extract_trap_cmd
Aug 12 18:53:07 ++++ printf '%s\n' ''
Aug 12 18:53:07 +++ printf '%s\n' cleanup
Aug 12 18:53:07 ++ trap -- '
Aug 12 18:53:07 cleanup' EXIT
Aug 12 18:53:07 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-build != *win-* ]]
Aug 12 18:53:07 ++ which sccache
Aug 12 18:53:07 ++ sccache --stop-server
Aug 12 18:53:07 ++ true
Aug 12 18:53:07 ++ rm /var/lib/jenkins/sccache_error.log
Aug 12 18:53:07 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
Aug 12 18:53:07 ++ true
Aug 12 18:53:07 ++ [[ -n '' ]]
Aug 12 18:53:07 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-build == *rocm* ]]
Aug 12 18:53:07 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
Aug 12 18:53:07 ++ SCCACHE_IDLE_TIMEOUT=1200
Aug 12 18:53:07 ++ RUST_LOG=sccache::server=error
Aug 12 18:53:07 ++ sccache --start-server
Aug 12 18:53:07 sccache: Starting the server...
Aug 12 18:53:07 ++ sccache --zero-stats
Aug 12 18:53:07 Compile requests                      0

See GitHub Actions build win-vs2019-cpu-py3 / test (default, 1, 2, windows.4xlarge) (2/4)

Step: "Run test scripts" (full log | diagnosis details | 🔁 rerun)

2021-08-12T19:57:07.0233221Z ls: cannot access ...d/win_tmp/ci_scripts/*': No such file or directory
2021-08-12T19:57:06.9056037Z + PYTORCH_FINAL_PACKAGE_DIR=/c/1125038136/build-results/
2021-08-12T19:57:06.9120943Z ++ cygpath -w /c/1125038136/build-results/
2021-08-12T19:57:06.9224488Z + PYTORCH_FINAL_PACKAGE_DIR_WIN='C:\1125038136\build-results\'
2021-08-12T19:57:06.9225043Z + export PYTORCH_FINAL_PACKAGE_DIR_WIN
2021-08-12T19:57:06.9225462Z + export PYTORCH_TEST_SKIP_NOARCH=1
2021-08-12T19:57:06.9225863Z + PYTORCH_TEST_SKIP_NOARCH=1
2021-08-12T19:57:06.9226459Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/build/torch
2021-08-12T19:57:06.9651681Z + CI_SCRIPTS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts
2021-08-12T19:57:06.9652581Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts
2021-08-12T19:57:06.9847479Z ++ ls '/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts/*'
2021-08-12T19:57:07.0233221Z ls: cannot access '/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts/*': No such file or directory
2021-08-12T19:57:07.0236106Z + '[' -n '' ']'
2021-08-12T19:57:07.0236866Z + export SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/.jenkins/pytorch/win-test-helpers
2021-08-12T19:57:07.0237861Z + SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/.jenkins/pytorch/win-test-helpers
2021-08-12T19:57:07.0242839Z + IN_PULL_REQUEST=
2021-08-12T19:57:07.0243139Z + '[' -n '' ']'
2021-08-12T19:57:07.0243499Z + [[ win-vs2019-cpu-py3 == *cuda11* ]]
2021-08-12T19:57:07.0245262Z + run_tests
2021-08-12T19:57:07.0246175Z + for path in '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe' /c/Windows/System32/nvidia-smi.exe
2021-08-12T19:57:07.0247004Z + [[ -x /c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe ]]
2021-08-12T19:57:07.0260923Z + '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe'

See GitHub Actions build win-vs2019-cuda10.1-py3 / test (default, 1, 1, windows.8xlarge.nvidia.gpu) (3/4)

Step: "Run test scripts" (full log | diagnosis details | 🔁 rerun)

2021-08-12T20:48:27.1444853Z ls: cannot access ...d/win_tmp/ci_scripts/*': No such file or directory
2021-08-12T20:48:27.0645671Z + PYTORCH_FINAL_PACKAGE_DIR=/c/1125038134/build-results/
2021-08-12T20:48:27.0731979Z ++ cygpath -w /c/1125038134/build-results/
2021-08-12T20:48:27.0871004Z + PYTORCH_FINAL_PACKAGE_DIR_WIN='C:\1125038134\build-results\'
2021-08-12T20:48:27.0872225Z + export PYTORCH_FINAL_PACKAGE_DIR_WIN
2021-08-12T20:48:27.0873198Z + export PYTORCH_TEST_SKIP_NOARCH=1
2021-08-12T20:48:27.0874099Z + PYTORCH_TEST_SKIP_NOARCH=1
2021-08-12T20:48:27.0875505Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/pytorch-1125038134/build/win_tmp/build/torch
2021-08-12T20:48:27.1076634Z + CI_SCRIPTS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038134/build/win_tmp/ci_scripts
2021-08-12T20:48:27.1078313Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/pytorch-1125038134/build/win_tmp/ci_scripts
2021-08-12T20:48:27.1341992Z ++ ls '/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038134/build/win_tmp/ci_scripts/*'
2021-08-12T20:48:27.1444853Z ls: cannot access '/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038134/build/win_tmp/ci_scripts/*': No such file or directory
2021-08-12T20:48:27.1449003Z + '[' -n '' ']'
2021-08-12T20:48:27.1450528Z + export SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038134/.jenkins/pytorch/win-test-helpers
2021-08-12T20:48:27.1452779Z + SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038134/.jenkins/pytorch/win-test-helpers
2021-08-12T20:48:27.1454137Z + IN_PULL_REQUEST=
2021-08-12T20:48:27.1454719Z + '[' -n '' ']'
2021-08-12T20:48:27.1455495Z + [[ win-vs2019-cuda10.1-py3 == *cuda11* ]]
2021-08-12T20:48:27.1456386Z + run_tests
2021-08-12T20:48:27.1457209Z + for path in '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe' /c/Windows/System32/nvidia-smi.exe
2021-08-12T20:48:27.1458206Z + [[ -x /c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe ]]
2021-08-12T20:48:27.1467052Z + '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe'

See GitHub Actions build win-vs2019-cpu-py3 / test (default, 2, 2, windows.4xlarge) (4/4)

Step: "Run test scripts" (full log | diagnosis details | 🔁 rerun)

2021-08-12T19:56:43.8869532Z ls: cannot access ...d/win_tmp/ci_scripts/*': No such file or directory
2021-08-12T19:56:43.8282429Z + PYTORCH_FINAL_PACKAGE_DIR=/c/1125038136/build-results/
2021-08-12T19:56:43.8346653Z ++ cygpath -w /c/1125038136/build-results/
2021-08-12T19:56:43.8450573Z + PYTORCH_FINAL_PACKAGE_DIR_WIN='C:\1125038136\build-results\'
2021-08-12T19:56:43.8451138Z + export PYTORCH_FINAL_PACKAGE_DIR_WIN
2021-08-12T19:56:43.8451551Z + export PYTORCH_TEST_SKIP_NOARCH=1
2021-08-12T19:56:43.8451933Z + PYTORCH_TEST_SKIP_NOARCH=1
2021-08-12T19:56:43.8452504Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/build/torch
2021-08-12T19:56:43.8602722Z + CI_SCRIPTS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts
2021-08-12T19:56:43.8603644Z + mkdir -p /c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts
2021-08-12T19:56:43.8799019Z ++ ls '/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts/*'
2021-08-12T19:56:43.8869532Z ls: cannot access '/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/build/win_tmp/ci_scripts/*': No such file or directory
2021-08-12T19:56:43.8872004Z + '[' -n '' ']'
2021-08-12T19:56:43.8872797Z + export SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/.jenkins/pytorch/win-test-helpers
2021-08-12T19:56:43.8873782Z + SCRIPT_HELPERS_DIR=/c/actions-runner/_work/pytorch/pytorch/pytorch-1125038136/.jenkins/pytorch/win-test-helpers
2021-08-12T19:56:43.8874396Z + IN_PULL_REQUEST=
2021-08-12T19:56:43.8874680Z + '[' -n '' ']'
2021-08-12T19:56:43.8875010Z + [[ win-vs2019-cpu-py3 == *cuda11* ]]
2021-08-12T19:56:43.8875362Z + run_tests
2021-08-12T19:56:43.8875946Z + for path in '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe' /c/Windows/System32/nvidia-smi.exe
2021-08-12T19:56:43.8876683Z + [[ -x /c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe ]]
2021-08-12T19:56:43.8877293Z + '/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe'

3 failures not recognized by patterns:

Job Step Action
GitHub Actions Lint / mypy Run mypy 🔁 rerun
GitHub Actions Lint / flake8-py3 Fail if there were any warnings 🔁 rerun
GitHub Actions win-vs2019-cuda10.1-py3 / render_test_results (default) Unzip test reports 🔁 rerun

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@aspenstarss
Contributor Author

I have signed the corporate CLA.

@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@vadimkantorov
Contributor

I suggest the docs also specify the shape of the mean/variance used and of the affine weights.

Contributor

@jbschlosser jbschlosser left a comment


Thanks for the contribution! I like @vadimkantorov's suggestion above to add shape information for the learnable affine weights.

Ideally there should be an `Attributes:` section in the docs mentioning that `weight` and `bias` are defined when `elementwise_affine=True`, with shape `normalized_shape` (see `Linear`'s docs as an example).
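For comparison, a quick sketch of the attribute shapes being documented (editor's illustration of current `nn.LayerNorm` behaviour):

```python
import torch
import torch.nn as nn

# When elementwise_affine=True (the default), LayerNorm defines learnable
# `weight` and `bias` parameters, each of shape `normalized_shape`.
ln = nn.LayerNorm(10)
print(ln.weight.shape)  # torch.Size([10])
print(ln.bias.shape)    # torch.Size([10])

# With elementwise_affine=False, no affine parameters are created.
ln_plain = nn.LayerNorm(10, elementwise_affine=False)
print(ln_plain.weight is None)  # True
```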

add an `Attributes` section in the docs to specify the shape of `weight` and `bias`
@aspenstarss
Contributor Author

Great! I have added an `Attributes` section to the docs specifying the shape of `weight` and `bias`, following `Linear`'s docs.

This is my first time committing code to an open-source repository. Although I only improved the documentation, it was a great experience.

Thanks for your reply! @vadimkantorov @jbschlosser

delete a blank line containing whitespace, which seems to fail the automated checks
@bdhirsh bdhirsh added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Aug 12, 2021
@albanD albanD removed their request for review August 12, 2021 20:53
Contributor

@jbschlosser jbschlosser left a comment


LGTM! Thanks for the fix :)

@facebook-github-bot
Contributor

@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

>>> # Activate module
>>> layer_norm(embedding)
>>>
>>> # Image Example
>>> N, C, H, W = 20, 5, 10, 10
Contributor


Just curious, is this image example realistic? I think almost all the time, LayerNorm is applied over only the embedding dim (and aggregates away the temporal / spatial dims)

Contributor Author


Yes, it is. As shown in the figure mentioned here, LayerNorm normalizes over the last three dimensions (C, H, W) in computer vision. In my opinion, LayerNorm should normalize over the last two dimensions (seq_len, emb_dim) in NLP, following the original paper. However, the LayerNorm implementation in the most popular NLP projects (such as AllenNLP) is applied over only the embedding dim.
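The two NLP conventions described above differ only in the `normalized_shape` argument; a minimal sketch (editor's illustration, sizes arbitrary):

```python
import torch
import torch.nn as nn

seq_len, emb_dim = 5, 10
x = torch.randn(20, seq_len, emb_dim)

# Common practice (e.g. most NLP libraries): normalize over the
# embedding dim only; statistics are per (batch, position).
ln_emb = nn.LayerNorm(emb_dim)

# Original-paper style: normalize over the last two dims jointly;
# statistics are per batch element.
ln_both = nn.LayerNorm([seq_len, emb_dim])

print(ln_emb(x).shape, ln_both(x).shape)  # both torch.Size([20, 5, 10])
```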

Contributor


I think the GroupNorm paper isn't a good reference, because LayerNorm seems to be used differently these days (as you mention, usually in NLP only normalising over embedding_dim).

I think for the image example, a realistic example is needed from some established / representative / recent model or paper. Otherwise, it is confusing.

The problem with normalising over height and width as well is that the affine parameters are then tied to a fixed height and width and won't support other sizes. I don't think this is the most common scenario, or at least this should be discussed.
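This concern can be illustrated directly: with `normalized_shape=[C, H, W]`, the affine parameters are shaped `(C, H, W)`, so inputs with other spatial sizes are rejected (editor's sketch, sizes arbitrary):

```python
import torch
import torch.nn as nn

N, C, H, W = 20, 5, 10, 10
ln = nn.LayerNorm([C, H, W])
print(ln.weight.shape)  # torch.Size([5, 10, 10]) -- tied to H and W

ok = ln(torch.randn(N, C, H, W))   # matching spatial size: works
try:
    ln(torch.randn(N, C, 12, 12))  # different spatial size
except RuntimeError:
    print("fails for other spatial sizes")
```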

@facebook-github-bot
Contributor

@jbschlosser merged this pull request in 72bc6dc.

alanwaketan pushed a commit that referenced this pull request Aug 17, 2021
Summary:
In this [commit](7026995) and [issue](#59178 (comment)), [Line 134](https://github.com/deniskokarev/pytorch/blob/47e286d024c183cb26a464447b34fde88b80d17d/torch/nn/modules/normalization.py#L134) overwrites the `embedding` variable, which causes an error when instantiating `nn.LayerNorm`.

I suggest renaming `embedding` on [Line 133](https://github.com/deniskokarev/pytorch/blob/47e286d024c183cb26a464447b34fde88b80d17d/torch/nn/modules/normalization.py#L133) to `embedding_dim`.

The final example is:
```python
batch, sentence_length, embedding_dim = 20, 5, 10
embedding = torch.randn(batch, sentence_length, embedding_dim)
layer_norm = nn.LayerNorm(embedding_dim)
```

Fixes #59178

Pull Request resolved: #63144

Reviewed By: bdhirsh

Differential Revision: D30288778

Pulled By: jbschlosser

fbshipit-source-id: e74b11430e302dae5661bf6e830ee5ac6c1838c4
Labels
cla signed · Merged · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
6 participants