[decomp] Use var_mean in native_batch_norm decomposition #94140
Conversation
[ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94140. ✅ No Failures as of commit c9e2a65. (Comment automatically generated by Dr. CI.)
ghstack-source-id: efcfcf7 Pull Request resolved: pytorch#94140
ghstack-source-id: 9f99f25 Pull Request resolved: pytorch#94140
Did you check that inductor perf didn't change?
```python
(torch.float16, torch.ops.aten._native_batch_norm_legit.no_stats): 1e-5,
(torch.bfloat16, torch.ops.aten.linalg_vector_norm.default): 1e-4,
(torch.float16, torch.ops.aten.linalg_vector_norm.default): 1e-4,
(torch.bfloat16, torch.ops.aten.var_mean.correction): 5e-7,
```
why does tolerance change here?
aten.var_mean uses a different algorithm from aten.mean, which ends up being slightly more precise. 5e-7 is still incredibly good for half precision, though. The default rtol for torch.testing is 1e-5.
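For intuition, here is a minimal pure-Python sketch of a single-pass, Welford-style `var_mean`. This is only an illustration of why a fused reduction can be more numerically stable than separate passes; the actual aten kernel may use a different algorithm:

```python
def welford_var_mean(xs):
    """Compute (population variance, mean) of a sequence in one pass.

    Welford's update accumulates the mean and the sum of squared
    deviations (m2) together, avoiding a second pass over the data
    and the cancellation issues of the naive E[x^2] - E[x]^2 formula.
    Illustrative sketch only, not PyTorch's implementation.
    """
    mean = 0.0
    m2 = 0.0
    for n, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / len(xs), mean  # variance with correction=0, mean

var, mean = welford_var_mean([1.0, 2.0, 3.0, 4.0])  # → (1.25, 2.5)
```

Like `torch.var_mean`, the sketch returns the variance and mean as a pair from a single traversal of the data.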
This improves perf by removing the duplicate mean calculation. However, the gain is very slight, since the second mean was being fused with the sum of squared deviations in the variance. In the following example, I see a 0.6% speedup, from 366 us to 364 us:

```python
import torch
import torch._dynamo
from torch._inductor import config

config.debug = True

a = torch.nn.BatchNorm3d(10).train().cuda()
b = torch.rand(10, 10, 16, 64, 64, device="cuda")

@torch._dynamo.optimize()
def fn(x):
    return a(x)

_ = fn(b)
%timeit fn(b)
```
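The shape of the decomposition change can be sketched in plain Python. The helper below is hypothetical: the real decomposition operates on tensors and calls `aten.var_mean` once, where the old version called `aten.mean` and then computed the variance with a second reduction:

```python
def batch_norm_normalize(x, eps=1e-5):
    """Normalize a 1-D list of floats the way the updated decomposition
    does: one var_mean-style fused pass yields both statistics, instead
    of computing the mean twice. Hypothetical sketch, not the real decomp.
    """
    n = len(x)
    mean = 0.0
    m2 = 0.0
    for i, v in enumerate(x, start=1):  # single fused pass (Welford-style)
        d = v - mean
        mean += d / i
        m2 += d * (v - mean)
    var = m2 / n                        # biased variance (correction=0)
    inv_std = (var + eps) ** -0.5
    return [(v - mean) * inv_std for v in x]

out = batch_norm_normalize([1.0, 2.0, 3.0, 4.0])
```

The normalized output has zero mean and (up to `eps`) unit variance, matching what batch norm produces in training mode.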
Stack from ghstack (oldest at bottom):