CUDA BFloat16 gelu, hardswish, hardsigmoid #44997
Conversation
💊 CI failures summary and remediations
As of commit 977b89a (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patterns
The following CI failures do not appear to be due to upstream breakages:
pytorch_linux_xenial_py3_clang5_asan_build (1/3), Step: "Build" (full log | diagnosis details | 🔁 rerun)
Codecov Report

@@            Coverage Diff             @@
##           master   #44997      +/-   ##
==========================================
- Coverage   68.31%   68.19%   -0.12%
==========================================
  Files         410      410
  Lines       53582    53232     -350
==========================================
- Hits        36602    36303     -299
+ Misses      16980    16929      -51

Continue to review full report at Codecov.
test/test_torch.py
Outdated
-inputValues = [-1000, -4, -3, -2, 0, 2, 3, 4, 1000]
+inputValues = [-1000, -4, -3, -2, 0, 2, 3, 4]
+if dtype != torch.bfloat16:
+    inputValues.append(1000)
what's up with this? Error too large?
Yes
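For context, a quick illustration (not from the PR) of why an input like 1000 is awkward for bfloat16: with only about 8 significant bits, representable values around 1000 are spaced roughly 4 apart, so outputs of that magnitude can miss a tight absolute tolerance.

import torch

# bfloat16 keeps ~8 significant bits, so the spacing between representable
# values near 1000 is about 4; nearby inputs collapse onto the same value.
print(torch.tensor(1000.0).to(torch.bfloat16).item())  # 1000.0 (exactly representable)
print(torch.tensor(1001.0).to(torch.bfloat16).item())  # 1000.0 (rounded down)
print(torch.tensor(1003.0).to(torch.bfloat16).item())  # 1004.0 (rounded up)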
So maybe switching to acc_t for internal computations will actually make it ok?
Do you know if backward for hardswish and hardsigmoid is tested anywhere? Aha, looks like it is tested in test_nn, but only with gradcheck and only for float64. Ok, whatever.
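For reference, a minimal gradcheck sketch of the kind of float64 backward test referred to above (illustrative only, not the actual test in test_nn):

import torch
import torch.nn.functional as F
from torch.autograd import gradcheck

# gradcheck compares analytical gradients against finite differences;
# it needs double-precision inputs with requires_grad=True.
x = torch.randn(10, dtype=torch.float64, requires_grad=True)
assert gradcheck(F.hardswish, (x,))
assert gradcheck(F.hardsigmoid, (x,))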
// Excerpt of the hardswish backward CUDA lambda under review
// (grad_val is the incoming gradient, self_val the input saved from forward):
if (self_val < neg_three) {
  return zero;
} else if (self_val <= three) {
  return grad_val * ((self_val / three) + one_half);
unrelated to this PR, but I wonder if computations here should be done in accscalar_t
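A hedged sketch of what computing this lambda in accscalar_t might look like (for bfloat16, at::acc_type<scalar_t, true> resolves to float); this is illustrative, not the code in the PR:

#include <ATen/AccumulateType.h>

// Inside the AT_DISPATCH block, with scalar_t the tensor dtype:
using accscalar_t = at::acc_type<scalar_t, /*is_cuda=*/true>;
const accscalar_t zero(0.0f), three(3.0f), neg_three(-3.0f), one_half(0.5f);
gpu_kernel(iter, [=] GPU_LAMBDA(scalar_t grad_val_, scalar_t self_val_) -> scalar_t {
  // Promote inputs to the accumulation type so the divide and add run in
  // float even when scalar_t is bfloat16 or half.
  accscalar_t grad_val = static_cast<accscalar_t>(grad_val_);
  accscalar_t self_val = static_cast<accscalar_t>(self_val_);
  if (self_val < neg_three) {
    return zero;
  } else if (self_val <= three) {
    return grad_val * ((self_val / three) + one_half);
  } else {
    return grad_val;
  }
});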
ping @ngimel
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
No description provided.
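For readers skimming the thread, a hedged usage sketch of what this PR enables, namely gelu, hardswish, and hardsigmoid on bfloat16 CUDA tensors (illustrative; assumes a CUDA device is available):

import torch
import torch.nn.functional as F

# With this PR, these activations accept bfloat16 tensors on CUDA.
x = torch.randn(8, device="cuda", dtype=torch.bfloat16)
print(F.gelu(x))
print(F.hardswish(x))
print(F.hardsigmoid(x))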