[pytorch] correct input size check for GroupNorm #33008
This pull request was exported from Phabricator. Differential Revision: D19723407
I'm not really the right reviewer for this, but this needs a test at least.
@v0dro, you added the checks, could you please take a look at this? Thanks!
Summary: Pull Request resolved: pytorch#33008 Corrects D19373507 to allow valid use cases that fail now. Multiplies batch size by the number of elements in a group to get the correct number of elements over which statistics are computed. **Details**: The current implementation disallows GroupNorm to be applied to tensors of shape e.g. `(1, C, 1, 1)` to prevent cases where statistics are computed over 1 element and thus result in a tensor filled with zeros. However, in GroupNorm the statistics are calculated across channels. So in case where one has an input tensor of shape `(1, 256, 1, 1)` for `GroupNorm(32, 256)`, the statistics will be computed over 8 elements and thus be meaningful. One use case is [Atrous Spatial Pyramid Pooling (ASPPPooling)](https://github.com/pytorch/vision/blob/791c172a337d98012018f98ffde93b1020ba3ed5/torchvision/models/segmentation/deeplabv3.py#L50), where GroupNorm could be used in place of BatchNorm [here](https://github.com/pytorch/vision/blob/791c172a337d98012018f98ffde93b1020ba3ed5/torchvision/models/segmentation/deeplabv3.py#L55). However, now this is prohibited and results in failures. Proposed solution consists in correcting the computation of the number of elements over which statistics are computed. The number of elements per group is taken into account in the batch size. Test Plan: check that existing tests pass Differential Revision: D19723407 fbshipit-source-id: 8d241d5fd8ceb32b6c985e1a501e0c806f2626bf
82b9d60 to 9983b73
💊 CircleCI build failures summary and remediations
As of commit 9983b73: None of the build failures appear to be your fault.
Detailed failure analysis
One may explore the probable reasons each build failed interactively on the Dr. CI website.
❄️ 1 failure recognized as flaky
The following build failures have been detected as flaky and may not be your fault:
Looks good to me. The reason given for the changes seems legit.
This pull request has been merged in cfb4862.
Summary: Pull Request resolved: pytorch#33008 Corrects D19373507 to allow valid use cases that fail now. Multiplies batch size by the number of elements in a group to get the correct number of elements over which statistics are computed. **Details**: The current implementation disallows GroupNorm to be applied to tensors of shape e.g. `(1, C, 1, 1)` to prevent cases where statistics are computed over 1 element and thus result in a tensor filled with zeros. However, in GroupNorm the statistics are calculated across channels. So in case where one has an input tensor of shape `(1, 256, 1, 1)` for `GroupNorm(32, 256)`, the statistics will be computed over 8 elements and thus be meaningful. One use case is [Atrous Spatial Pyramid Pooling (ASPPPooling)](https://github.com/pytorch/vision/blob/791c172a337d98012018f98ffde93b1020ba3ed5/torchvision/models/segmentation/deeplabv3.py#L50), where GroupNorm could be used in place of BatchNorm [here](https://github.com/pytorch/vision/blob/791c172a337d98012018f98ffde93b1020ba3ed5/torchvision/models/segmentation/deeplabv3.py#L55). However, now this is prohibited and results in failures. Proposed solution consists in correcting the computation of the number of elements over which statistics are computed. The number of elements per group is taken into account in the batch size. Test Plan: check that existing tests pass Reviewed By: fmassa Differential Revision: D19723407 fbshipit-source-id: c85c244c832e6592e9aedb279d0acc867eef8f0c
Summary:
Corrects D19373507 to allow valid use cases that fail now. Multiplies batch size by the number of elements in a group to get the correct number of elements over which statistics are computed.
Details:
The current implementation disallows applying GroupNorm to tensors of shape such as `(1, C, 1, 1)`, to prevent cases where statistics are computed over a single element and the result is a tensor filled with zeros.

However, in GroupNorm the statistics are calculated across the channels of each group. For an input tensor of shape `(1, 256, 1, 1)` and `GroupNorm(32, 256)`, the statistics are computed over 8 elements (256 channels / 32 groups) and are therefore meaningful.

One use case is [Atrous Spatial Pyramid Pooling (ASPPPooling)](https://github.com/pytorch/vision/blob/791c172a337d98012018f98ffde93b1020ba3ed5/torchvision/models/segmentation/deeplabv3.py#L50), where GroupNorm could be used in place of BatchNorm [here](https://github.com/pytorch/vision/blob/791c172a337d98012018f98ffde93b1020ba3ed5/torchvision/models/segmentation/deeplabv3.py#L55). At present this is prohibited and results in failures.

The proposed solution corrects the computation of the number of elements over which statistics are computed: the number of elements per group is folded into the batch size.
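
To make the arithmetic concrete, here is a minimal pure-Python sketch of the two element counts described above. The helper names `old_check_size` and `corrected_check_size` are hypothetical and are not taken from the PyTorch source; the sketch assumes the size check rejects an input when the computed count equals 1, and that the number of channels is divisible by the number of groups.

```python
from math import prod  # Python 3.8+


def old_check_size(input_shape):
    """Count used by the check before this change (assumed): batch size times
    the spatial dimensions, ignoring the channels that share a group."""
    n, _c, *spatial = input_shape
    return n * prod(spatial)


def corrected_check_size(input_shape, num_groups):
    """Count after the correction described above: the batch size is
    multiplied by the number of channels in each group."""
    n, c, *spatial = input_shape
    if c % num_groups != 0:
        raise ValueError("num_channels must be divisible by num_groups")
    channels_per_group = c // num_groups
    return n * channels_per_group * prod(spatial)


# Shape and configuration from the description: (1, 256, 1, 1) with GroupNorm(32, 256)
shape = (1, 256, 1, 1)
print(old_check_size(shape))            # 1: would be rejected
print(corrected_check_size(shape, 32))  # 8: statistics are meaningful
```

With one channel per group, for example `corrected_check_size((1, 256, 1, 1), 256)`, the count is still 1, so the degenerate case the original check was meant to catch remains detectable.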
Test Plan: check that existing tests pass
Differential Revision: D19723407