Implement hardswish/hardsigmoid on MKLDNN tensors #55218
Conversation
Looks good!
    std::function<Tensor(Tensor)> fallback) {
  return [aten_op, fallback](Stack* stack) {
    auto a = pop(stack).toTensor();
    if (a.numel() == 0) {
If this is just to put `numel() == 0` on the hot path rather than for correctness reasons, is it worth it?
@@ -633,6 +692,10 @@ class MKLDNNSubgraphSlicer {
      return true;
    }

    if (n->kind() == Symbol::aten("hardswish")) {
nit: register `aten::hardswish` as an interned operator?
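For illustration, a minimal sketch of what this nit suggests, assuming a `hardswish` entry is added to the JIT's interned-strings list (the exact location of that list is an assumption):

```cpp
// Sketch: once aten::hardswish is interned, the runtime string lookup
//   n->kind() == Symbol::aten("hardswish")
// becomes a comparison against a compile-time symbol constant:
if (n->kind() == aten::hardswish) {
  // ... convert the node as above ...
}
```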
@@ -456,6 +510,11 @@ void ComputeSubgraphInMKLDNN(Node* subgraph_node) {
      body_node->replaceInput(1, node->outputs().at(1));
    }

    if (body_node->kind() == Symbol::aten("hardswish")) {
      body_node->replaceWithNewSymbol(Symbol::aten("MKLDNNHardSwish"));
      body_node->destroy();
We should be adding `hardswish_` here:

static std::unordered_set<Symbol> mkldnn_ops = {
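A sketch of the suggested fix, assuming the in-place variant follows the usual trailing-underscore naming:

```cpp
// Sketch: include both the functional and the in-place symbol so the
// subgraph slicer also treats aten::hardswish_ as MKLDNN-convertible.
static std::unordered_set<Symbol> mkldnn_ops = {
    Symbol::aten("hardswish"),
    Symbol::aten("hardswish_"),
    // ... existing entries ...
};
```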
@@ -238,6 +281,17 @@ Operation BroadOp(const Node* node) {
  };
}

const RegisterOperators MKLDNNHardSwishOpReg({
    torch::jit::Operator(
        "aten::MKLDNNHardSwish(Tensor a) -> Tensor",
We should have a way of in-placing this, like we already do here:

if (k == aten::relu || k == aten::sigmoid || k == aten::dropout) {

Maybe just by adding an `inplace=true` attribute to the node.
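One minimal way to realize the attribute idea (a sketch only, not the PR's actual mechanism; the `inplace` attribute name is an assumption):

```cpp
// Sketch: rather than hard-coding each op kind, tag the node and let a
// later pass swap in the in-place variant when the input value is not
// used again. Node::i_ stores an integer attribute on the node.
body_node->i_(Symbol::attr("inplace"), 1);
```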
@@ -238,6 +281,17 @@ Operation BroadOp(const Node* node) {
  };
}

const RegisterOperators MKLDNNHardSwishOpReg({
    torch::jit::Operator(
nit: we have `prim::mkldnn_convolution`; any reason not to put this in as `prim::hardswish` for consistency?
Force-pushed from 649fcd1 to 32765f7.
Nice, LGTM!
test/jit/test_freezing.py (outdated)
g = parse_ir(graph_str)
m = self.createFunctionFromGraph(g)
x = torch.rand(size)
x_copy = x.detach().clone()
nit: why are you detaching and cloning `x_copy`?
For the in-place tests, the reference implementations modify the input in `aten_op(x, inplace=inplace)`. I guess I could just always use `F.hardswish(inplace=False)`; that might look a bit cleaner.
test/jit/test_freezing.py (outdated)
mod = self.freezeAndConvert(mod_eager)
FileCheck().check("mkldnn_convolution").check_next("aten::relu_").check_next("aten::relu_").run(mod.graph)
print(mod.graph)
stray print
a great catch!
mod = self.freezeAndConvert(mod_eager)
FileCheck().check("mkldnn_convolution").check_next("aten::relu_").check_next("aten::relu_").run(mod.graph)
print(mod.graph)
FileCheck().check("mkldnn_convolution").check_next("prim::MKLDNNHardSwish_").check_next("aten::relu_").run(mod.graph)
Is the hardswish still necessary?
Force-pushed from e389016 to d9b7c11.
@Krovatkin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
This looks good and I think it is right (great job, this was tricky), but can you add a little more testing and more of a comment? Reading through the code right now, I wouldn't know what invariants we have to maintain or why the code is written the way it is.
g = parse_ir(graph_str)
m = self.createFunctionFromGraph(g)
x = torch.rand(size)
# `inplace=False` is intentional, otherwise we modify the input
Can we test the in-place version somehow? Ideally we'd also test this after running it with a Conv, where we know it's going to output a packed tensor.
Added a conv test, and we already have in-place tests.
We have in-place tests, but they aren't testing these new ops, which have a different way of running them. Maybe as a follow-up, at least in the next PR, add more tests there.
c10::impl::ExcludeDispatchKeyGuard edkg(c10::autograd_dispatch_keyset);
// we cast `a` to an `ideep::tensor`, so we can get at its descriptor
// which we then use to set up `out` tensor w/ the same props as a
auto a_it = at::native::itensor_from_mkldnn(a);
I'm going to suggest a bunch of things here for readability, just because this is IMO very tricky code that is easy to get wrong (we almost did), and which requires an understanding of both the MKLDNN packed format and aten tensors. Feel free to not accept the comments, obviously.

nit: `a_it` -> `a_ideep_tensor`
// we cast `a` to an `ideep::tensor`, so we can get at its descriptor
// which we then use to set up `out` tensor w/ the same props as a
auto a_it = at::native::itensor_from_mkldnn(a);
auto raw_data = a_it.get_data_handle();
nit: maybe `mkldnn_raw_data_handle`?
// which we then use to set up `out` tensor w/ the same props as a
auto a_it = at::native::itensor_from_mkldnn(a);
auto raw_data = a_it.get_data_handle();
auto topt = a.options().layout(c10::kStrided);
Can you rename `topt`? I don't know where `t` or `opt` is coming from. Maybe `a_options_with_strided`.
auto raw_data = a_it.get_data_handle();
auto topt = a.options().layout(c10::kStrided);
// we also wrap `a` storage into an aten tensor
auto t = at::from_blob(raw_data, {a.numel()}, topt);
Can you give `t` a more verbose name?
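Putting the naming nits above together, a sketch of how the snippet might read with the suggested renames (behavior unchanged from the quoted code):

```cpp
c10::impl::ExcludeDispatchKeyGuard edkg(c10::autograd_dispatch_keyset);
// view the MKLDNN tensor as an ideep::tensor so we can reach its
// descriptor, which describes the packed (blocked) physical layout
auto a_ideep_tensor = at::native::itensor_from_mkldnn(a);
auto mkldnn_raw_data_handle = a_ideep_tensor.get_data_handle();
auto a_options_with_strided = a.options().layout(c10::kStrided);
// wrap the raw MKLDNN buffer in a flat strided aten tensor so that a
// regular aten out-variant kernel can read from / write into it
auto input_as_strided_aten_tensor = at::from_blob(
    mkldnn_raw_data_handle, {a.numel()}, a_options_with_strided);
```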
if (!inplace) {
  // `a_it.get_desc()` will allocate a tensor
  // of the right physical size.
  auto it_empty = ideep::tensor(a_it.get_desc());
Maybe add an assertion that `it_empty`'s physical size equals the input's physical size?
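A sketch of the suggested assertion; `get_size()` on a oneDNN memory descriptor returns the buffer size in bytes:

```cpp
// `a_it.get_desc()` allocates a tensor of the right physical size;
// assert it really matches the input's packed buffer, since both
// buffers are viewed through at::from_blob with the same element count.
auto it_empty = ideep::tensor(a_it.get_desc());
TORCH_INTERNAL_ASSERT(
    it_empty.get_desc().get_size() == a_it.get_desc().get_size());
```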
// this node doesnt handle string padding yet...
if (!body_node->namedInput("padding")->type()->cast<StringType>()) {
  body_node->replaceWithNewSymbol(Symbol::prim("mkldnn_convolution"));
  auto true_pred = [](Node*) { return true; };
Nit: can you keep the conv handling how it was originally? I think it was more readable originally, and we've lost the `this node doesnt handle string padding yet` comment. If we start getting a lot more predicates, then maybe we should refactor.
"prim::MKLDNNRelu6_(Tensor(a!) self) -> Tensor(a!)", | ||
createUnaryOp( | ||
[](at::Tensor output, at::Tensor input) { | ||
at::hardtanh_out(output, input, 0.f, 6.f); |
Same comment here: just use `relu6`, since it exists.
"prim::MKLDNNRelu6(Tensor(a!) self) -> Tensor(a!)", | ||
createUnaryOp( | ||
[](at::Tensor output, at::Tensor input) { | ||
at::hardtanh_out(output, input, 0.f, 6.f); |
There is an `at::relu6`; can you use it instead of `hardtanh`?
"prim::MKLDNNHardSwish_(Tensor(a!) self) -> Tensor(a!)", | ||
createUnaryOp( | ||
[](at::Tensor output, at::Tensor input) { | ||
at::hardswish_out(output, input); |
For these operators, we could just call `at::cpu::hardswish_out`. Since we are already creating a custom operator, we might as well avoid the dispatch overhead.
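A sketch of the suggested change, assuming the codegen'd `at::cpu::hardswish_out` takes the same out-first arguments as `at::hardswish_out`:

```cpp
// Sketch: same registration shape as above, but calling the CPU kernel
// directly to skip per-call dispatcher overhead; safe here because this
// custom operator only ever runs on CPU/MKLDNN tensors.
"prim::MKLDNNHardSwish_(Tensor(a!) self) -> Tensor(a!)",
createUnaryOp(
    [](at::Tensor output, at::Tensor input) {
      at::cpu::hardswish_out(output, input);
    },
```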
Force-pushed from a1be3ba to 2dc9bce.
@Krovatkin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Codecov Report

@@           Coverage Diff            @@
##           master   #55218    +/-   ##
========================================
  Coverage   77.13%   77.13%
========================================
  Files        1912     1912
  Lines      189312   189356      +44
========================================
+ Hits       146024   146058      +34
- Misses      43288    43298      +10
@Krovatkin merged this pull request in 9d3d169.
Summary: Adding hardswish and hardsigmoid improves mobilenetv3 by ~13%.

|       | hardswish | base     |
| ----- | --------- | -------- |
| run 1 | 1305.032  | 1486.013 |
| run 2 | 1290.142  | 1491.001 |
| run 3 | 1305.51   | 1491.66  |
| run 4 | 1308.788  | 1495.577 |
| avg   | 1302.368  | 1491.063 |

hardswish / base ratio: 0.873449

Pull Request resolved: pytorch#55218

Reviewed By: albanD

Differential Revision: D27701276

Pulled By: Krovatkin

fbshipit-source-id: cde78da71d327e65461e80fbb6c3bb3429505410