[jit][tensorexpr] Added aten::batch_norm into fuser when in inference mode #54204

huiguoo · 2021-03-17T20:11:44Z

Stack from ghstack:

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference mode #54204 [jit][tensorexpr] Added aten::batch_norm into fuser when in inference mode

Differential Revision: D27134348

… mode [ghstack-poisoned]

facebook-github-bot · 2021-03-17T20:11:51Z

💊 CI failures summary and remediations

As of commit 80161bf (more details on the Dr. CI page):

2/2 failures possibly* introduced in this PR
- 1/2 non-scanned failure(s)

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

clang-tidy (1/1)

Step: "Add annotations" (full log | diagnosis details | 🔁 rerun)

2021-03-23T03:03:51.4993713Z update-alternatives: error: no alternatives for nsight-sys

2021-03-23T03:03:51.4531092Z Setting up x11proto-damage-dev (1:2018.4-4) ...
2021-03-23T03:03:51.4557069Z Setting up cuda-nvjpeg-10-2 (10.2.89-1) ...
2021-03-23T03:03:51.4626440Z Setting up libxcb-dri3-dev:amd64 (1.13-2~ubuntu18.04) ...
2021-03-23T03:03:51.4652923Z Setting up libcublas10 (10.2.3.254-1) ...
2021-03-23T03:03:51.4681220Z Setting up libcublas-dev (10.2.3.254-1) ...
2021-03-23T03:03:51.4711141Z Setting up cuda-cufft-10-2 (10.2.89-1) ...
2021-03-23T03:03:51.4783640Z Setting up cuda-nsight-compute-10-2 (10.2.89-1) ...
2021-03-23T03:03:51.4812669Z Setting up nsight-systems-2020.4.3 (2020.4.3.7-10543b6) ...
2021-03-23T03:03:51.4905754Z update-alternatives: using /opt/nvidia/nsight-systems/2020.4.3/target-linux-x64/nsys to provide /usr/local/bin/nsys (nsys) in auto mode
2021-03-23T03:03:51.4982237Z update-alternatives: error: alternative path /opt/nvidia/nsight-systems/2020.4.3/host-linux-x64/nsight-sys doesn't exist
2021-03-23T03:03:51.4993713Z update-alternatives: error: no alternatives for nsight-sys
2021-03-23T03:03:51.5052400Z update-alternatives: using /opt/nvidia/nsight-systems/2020.4.3/host-linux-x64/nsys-ui to provide /usr/local/bin/nsys-ui (nsys-ui) in auto mode
2021-03-23T03:03:51.5093605Z Setting up libxcb-shape0-dev:amd64 (1.13-2~ubuntu18.04) ...
2021-03-23T03:03:51.5122847Z Setting up cuda-cusparse-10-2 (10.2.89-1) ...
2021-03-23T03:03:51.5194743Z Setting up cuda-cuobjdump-10-2 (10.2.89-1) ...
2021-03-23T03:03:51.5222038Z Setting up libxcb-glx0-dev:amd64 (1.13-2~ubuntu18.04) ...
2021-03-23T03:03:51.5278933Z Setting up libgles2:amd64 (1.0.0-2ubuntu2.3) ...
2021-03-23T03:03:51.5310230Z Setting up libglu1-mesa:amd64 (9.0.0-2.1build1) ...
2021-03-23T03:03:51.5339589Z Setting up libegl-mesa0:amd64 (20.0.8-0ubuntu1~18.04.1) ...
2021-03-23T03:03:51.5422164Z Setting up cuda-sanitizer-api-10-2 (10.2.89-1) ...
2021-03-23T03:03:51.5493774Z Setting up cuda-nvjpeg-dev-10-2 (10.2.89-1) ...

This comment was automatically generated by Dr. CI (expand for details).

Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

… mode ghstack-source-id: 2393e6582316f865f83c9291d5e5b8e25b7c9d5d Pull Request resolved: #54204

ZolotukhinM

Looks good, a couple of comments inline!

ZolotukhinM · 2021-03-18T01:42:13Z

torch/csrc/jit/tensorexpr/kernel.cpp

+
+            // parameter list: input, weight, bias, mean, var, training,
+            // momentum, eps, cudnn_enabled
+            Tensor* input = tensors_.at(n->inputs()[0]->unique());


Nit: use n->input(X) instead of n->inputs()[X].

ZolotukhinM · 2021-03-18T01:45:54Z

torch/csrc/jit/tensorexpr/kernel.cpp

+              Tensor* weight = tensors_.at(n->inputs()[1]->unique());
+              Tensor* bias = tensors_.at(n->inputs()[2]->unique());
+
+              auto inv_var = rsqrt(var->call(c) + eps);
+              auto weight_v = weight->call(c);
+              auto bias_v = bias->call(c);
+              auto alpha = inv_var * weight_v;
+              auto beta = bias_v - mean->call(c) * alpha;
+              auto output = input->call(axes) * alpha + beta;
+              return demoteOutput(output, n->output());


I wonder if we could merge these four cases using weight_v = 1 and bias_v = 0 as default values.

bertmaher

Looks nice! Can you check two things before landing:

How does that training boolean parameter get set? Is it from the eval() state of the model? What happens if the user switches the model between eval() and train()?
The promoteInputs question I have below; make sure we "do the right thing" if the argument types don't match.

bertmaher · 2021-03-18T04:07:23Z

torch/csrc/jit/tensorexpr/kernel.cpp

+            ExprHandle eps = constant(n->inputs()[7]);
+
+            // axes: N, C, H, W
+            std::vector<VarHandle> c(axes.begin() + 1, axes.begin() + 2);


You're just getting one axis here, right (c)? I think I'd be inclined to do VarHandle c = axes[1]; here instead of constructing a one-element vector. call has a variadic overload so nothing below should have to change (at least, I don't think so)

bertmaher · 2021-03-18T04:14:13Z

torch/csrc/jit/tensorexpr/kernel.cpp

+              auto weight_v = weight->call(c);
+              auto alpha = inv_var * weight_v;
+              auto output = input->call(axes) * alpha - mean->call(c) * alpha;
+              return demoteOutput(output, n->output());


I wonder if we need some promoteInputs as well. I'm not certain here b/c pytorch type promotion is complicated, but I wonder what you get if, e.g., input is a FloatTensor and mean is a DoubleTensor. (Not something that's likely to happen but we should handle it correctly all the same).

huiguoo · 2021-03-18T23:59:40Z

@bertmaher

How does that training boolean parameter get set? Is it from the eval() state of the model? What happens if the user switches the model between eval() and train()?

Great question! Has investigated on this and below is what I found.
There are two types of calls to aten::batch_norm in a pytorch program :
(1) nn.functional.batch_norm(input, mean, var, training=false)
(2) m = nn.BatchNorm2d(c) output=m(input)
In the first case, training parameter in aten::batch_norm is a constant and set to false in default. The state change between eval() and train() doesn't affect its value. We'll check the value of training parameter and only fuse batch_norm when training is false.
In the second case, the training parameter is determined by the model state: false in eval() and true in train(). If the user switches the model between eval() and train(), the value of training parameter will change correspondingly. In the torchscript graph, training is a prim::GetAttr node - we would not be able to fuse batch_norm unless we freeze the graph first so training becomes a constant.

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

… mode ghstack-source-id: 5507e00060377a2f519fd392359e792cd9374c16 Pull Request resolved: #54204

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

… mode ghstack-source-id: df5e2fdea5ade06ac43c7e28418cd2daff595013 Pull Request resolved: #54204

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

… mode ghstack-source-id: 459b504f3065716efc6d3a49a4c0120ed3b2640e Pull Request resolved: #54204

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

… mode ghstack-source-id: 20b079b389b951acfc16651f4b7f5aeeeac5815f Pull Request resolved: #54204

facebook-github-bot · 2021-03-23T11:44:12Z

@huiguoo merged this pull request in 2a53897.

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference…

0b34dc3

… mode [ghstack-poisoned]

facebook-github-bot added cla signed oncall: jit Add this issue/PR to JIT oncall triage queue labels Mar 17, 2021

huiguoo added a commit that referenced this pull request Mar 17, 2021

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference…

cbf214e

… mode ghstack-source-id: 2393e6582316f865f83c9291d5e5b8e25b7c9d5d Pull Request resolved: #54204

huiguoo requested review from navahgar, ZolotukhinM and bertmaher March 17, 2021 20:58

ZolotukhinM approved these changes Mar 18, 2021

View reviewed changes

bertmaher reviewed Mar 18, 2021

View reviewed changes

Update on "[jit][tensorexpr] Added aten::batch_norm into fuser when i…

bd7731d

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

huiguoo added a commit that referenced this pull request Mar 19, 2021

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference…

799e3d1

… mode ghstack-source-id: 5507e00060377a2f519fd392359e792cd9374c16 Pull Request resolved: #54204

Update on "[jit][tensorexpr] Added aten::batch_norm into fuser when i…

54e5f7b

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

huiguoo added a commit that referenced this pull request Mar 19, 2021

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference…

e6a300a

… mode ghstack-source-id: df5e2fdea5ade06ac43c7e28418cd2daff595013 Pull Request resolved: #54204

Update on "[jit][tensorexpr] Added aten::batch_norm into fuser when i…

06408a5

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

huiguoo added a commit that referenced this pull request Mar 23, 2021

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference…

db5dd6c

… mode ghstack-source-id: 459b504f3065716efc6d3a49a4c0120ed3b2640e Pull Request resolved: #54204

Update on "[jit][tensorexpr] Added aten::batch_norm into fuser when i…

7b445fd

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

Update on "[jit][tensorexpr] Added aten::batch_norm into fuser when i…

80161bf

…n inference mode" Differential Revision: [D27134348](https://our.internmc.facebook.com/intern/diff/D27134348) [ghstack-poisoned]

huiguoo added a commit that referenced this pull request Mar 23, 2021

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference…

28b0aee

… mode ghstack-source-id: 20b079b389b951acfc16651f4b7f5aeeeac5815f Pull Request resolved: #54204

facebook-github-bot closed this in 2a53897 Mar 23, 2021

facebook-github-bot added the Merged label Mar 23, 2021

facebook-github-bot deleted the gh/huiguoo/12/head branch March 26, 2021 14:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference mode #54204

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference mode #54204

huiguoo commented Mar 17, 2021 •

edited

facebook-github-bot commented Mar 17, 2021 •

edited

ZolotukhinM left a comment

ZolotukhinM Mar 18, 2021

ZolotukhinM Mar 18, 2021

bertmaher left a comment

bertmaher Mar 18, 2021

bertmaher Mar 18, 2021

huiguoo commented Mar 18, 2021

facebook-github-bot commented Mar 23, 2021

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference mode #54204

[jit][tensorexpr] Added aten::batch_norm into fuser when in inference mode #54204

Conversation

huiguoo commented Mar 17, 2021 • edited

facebook-github-bot commented Mar 17, 2021 • edited

💊 CI failures summary and remediations

🕵️ 1 new failure recognized by patterns

clang-tidy (1/1)

ZolotukhinM left a comment

Choose a reason for hiding this comment

ZolotukhinM Mar 18, 2021

Choose a reason for hiding this comment

ZolotukhinM Mar 18, 2021

Choose a reason for hiding this comment

bertmaher left a comment

Choose a reason for hiding this comment

bertmaher Mar 18, 2021

Choose a reason for hiding this comment

bertmaher Mar 18, 2021

Choose a reason for hiding this comment

huiguoo commented Mar 18, 2021

facebook-github-bot commented Mar 23, 2021

huiguoo commented Mar 17, 2021 •

edited

facebook-github-bot commented Mar 17, 2021 •

edited