Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Relu6 mkldnn op #25713

Conversation

arlesniak
Copy link
Contributor

@arlesniak arlesniak commented Jul 24, 2020

PR types

New features

PR changes

OPs

Describe

Dygraph: Added Relu6 mkldnn op used in mobilenetV2 from models repo

@paddle-bot-old
Copy link

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot-old
Copy link

paddle-bot-old bot commented Jul 24, 2020

✅ This PR's description meets the template requirements!
Please wait for other CI results.

@jczaja jczaja added Intel dygraph issues related to dygraph mode labels Jul 24, 2020
sfraczek
sfraczek previously approved these changes Jul 24, 2020
Copy link
Contributor

@sfraczek sfraczek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

sfraczek
sfraczek previously approved these changes Jul 27, 2020
Copy link
Contributor

@sfraczek sfraczek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

grygielski
grygielski previously approved these changes Jul 27, 2020
Copy link
Contributor

@grygielski grygielski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

luotao1
luotao1 previously approved these changes Jul 28, 2020
Copy link
Contributor

@luotao1 luotao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@luotao1
Copy link
Contributor

luotao1 commented Jul 28, 2020

2020-07-25 05:40:02 0. You must have one RD (XiaoguangHu01 or lanxianghit) and one TPM (saxon-zh or jzhang533 or swtkiwi or Heeenrrry or TCChenlong) approval for the api change for the management reason of API interface.
2020-07-25 05:40:02 1. You must have one TPM (saxon-zh or jzhang533 or swtkiwi or Heeenrrry or TCChenlong) approval for the api change for the management reason of API document.
2020-07-25 05:40:02 2. You must have one RD (zhiqiu (Recommend) or phlrain) approval for the api change for the opreator-related api without 'core.ops'.
2020-07-25 05:40:02 For more details, please click [https://github.com/PaddlePaddle/Paddle/wiki/paddle_api_development_manual.md]
2020-07-25 05:40:02 Related APIs: paddle.fluid.layers.relu6
2020-07-25 05:40:02 There are 3 approved errors.
2020-07-25 05:40:02 ****************
2020-07-25 05:40:02 API Difference is: 
2020-07-25 05:40:02 - paddle.fluid.layers.relu6 (ArgSpec(args=['x', 'threshold', 'name'], varargs=None, keywords=None, defaults=(6.0, None)), ('document', '09b5e5decc444b742f737ece2f5b9f7e'))
2020-07-25 05:40:02 ?                                                                                                                                       ^^ ^ ^^^^^^^^^^^^^    ^^^^^  ^^^
2020-07-25 05:40:02 
2020-07-25 05:40:02 + paddle.fluid.layers.relu6 (ArgSpec(args=['x', 'threshold', 'name', 'use_mkldnn'], varargs=None, keywords=None, defaults=(6.0, None, False)), ('document', '114eba28e873971013636dbeabb9051c'))
2020-07-25 05:40:02 ?                                                                  ++++++++++++++                                                   +++++++                  ^^^^ ^^^ ^  + +++++++++ ^^  ^^^^
2020-07-25 05:40:02 
2020-07-25 05:40:04 + approval_error=1

@zhiqiu zhiqiu self-requested a review July 28, 2020 02:42
zhiqiu
zhiqiu previously approved these changes Jul 28, 2020
Copy link
Contributor

@zhiqiu zhiqiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -9313,7 +9313,7 @@ def elu(x, alpha=1.0, name=None):


@templatedoc()
def relu6(x, threshold=6.0, name=None):
def relu6(x, threshold=6.0, name=None, use_mkldnn=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you add the attribute use_mkldnn here?

Copy link
Contributor

@sfraczek sfraczek Jul 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we add use_mkldnn here

This Relu6 is used in models repo. It is needed for mobilenetv2 here https://github.com/PaddlePaddle/models/blob/develop/dygraph/mobilenet/mobilenet_v2.py#L69.

We wouldn't have to do this if models repo would use the approach as here:
https://github.com/PaddlePaddle/models/blob/develop/dygraph/resnet/train.py#L207
which calls:
https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/dygraph/nn.py#L233.
which is passed to convolution here: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/dygraph/nn.py#L170

There are few ways how ops are added in models repo, for example this: https://github.com/PaddlePaddle/models/blob/develop/dygraph/resnet/train.py#L243
I am guessing that this is inconsistent because it isn't fully migrated yet but I am not sure.

Value of use_mkldnn has to be evaluated before this line for dygraph to use mkldnn kernel of relu6: https://github.com/PaddlePaddle/Paddle/pull/25713/files/3edaaad6bc84ed19715aaca04f9ed6bdbf06e25e#diff-f8fc1afdc1f06d4c47bc0933d53bf3c6R9359.

The case of sigmoid

For sigmoid I haven't added any use_mkldnn yet because we haven't enabled it yet. However, It will have to be handled with in the same way as we will eventually decide to handle relu6.

Old vs New approach

This use_mkldnn=False is from previous approach before we decided ot add FLAGS_use_mkldnn.
The new approach we have is to pass use_mkldnn=True to ops and check it with FLAGS_use_mkldnn. For details please refer to this PR
https://github.com/PaddlePaddle/Paddle/pull/25773/files#diff-aeec709d2310c344dd0f235478865b7cR182. We pass use_mkldnn=True and we call CheckUseMkldnnWithFlag() to compare against FLAGS_use_mkldnn.

How we can proceed

We could apply the same approach to this PR with checking against FLAGS_use_mkldnn. If you think this is correct, I can do that here.
Alternatively, maybe models repo could call relu6 in different way? Or maybe you have other ideas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the same situation here: #25773 (comment)
We should definitely decide how to deal with that so I can change both places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could apply the same approach to this PR with checking against FLAGS_use_mkldnn. If you think this is correct, I can do that here.
Alternatively, maybe models repo could call relu6 in different way? Or maybe you have other ideas?

@phlrain Please have a look, and which way do you prefer?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to use FLAGS_use_mkldnn. It's seems an easier way

@@ -504,6 +504,9 @@ class Relu6OpMaker : public framework::OpProtoAndCheckerMaker {
AddAttr<float>("threshold",
"The threshold value of Relu6. Default is 6.0. ")
.SetDefault(6.0f);
AddAttr<bool>("use_mkldnn",
"(bool, default false) Only used in mkldnn kernel")
.SetDefault(false);
Copy link
Contributor

@luotao1 luotao1 Jul 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add an attribute here? A lot of activation ops already support the use_mkldnn attribute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without it it crashes. We run modified mobilenet script:
https://github.com/PaddlePaddle/models/blob/a7fb45f650e0d32ba697b5818aa4799c2ccc9b05/dygraph/mobilenet/mobilenet_v2.py#L69
passing use_mkldnn=True in that line.
The error is:

Traceback (most recent call last):
  File "train.py", line 262, in <module>
    train_mobilenet()
  File "train.py", line 195, in train_mobilenet
    avg_loss.backward()
  File "</nfs/pdx/home/sfraczek/.local/lib/python2.7/site-packages/decorator.pyc:decorator-gen-97>", line 2, in backward
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/framework.py", line 219, in __impl__
    return func(*args, **kwargs)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/varbase_patch_methods.py", line 172, in backward
    framework._dygraph_tracer(), retain_graph)
KeyboardInterrupt

Reservation status error
sfraczek@aipg-ra-clx-86:/data/sfraczek/models/dygraph/mobilenet$ bash run_cpu_v2.sh | grep relu
Traceback (most recent call last):
  File "train.py", line 262, in <module>
    train_mobilenet()
  File "train.py", line 180, in train_mobilenet
    out = net(img)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/layers.py", line 600, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/sfraczek/models/dygraph/mobilenet/mobilenet_v2.py", line 212, in forward
    y = self._conv1(inputs, if_act=True)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/layers.py", line 600, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/sfraczek/models/dygraph/mobilenet/mobilenet_v2.py", line 69, in forward
    y = fluid.layers.relu6(y, use_mkldnn=True)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/layers/nn.py", line 9359, in relu6
    'use_mkldnn': use_mkldnn})
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/framework.py", line 2789, in append_op
    kwargs.get("stop_gradient", False))
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/tracer.py", line 45, in trace_op
    not stop_gradient)
paddle.fluid.core_avx.EnforceNotMet:

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0   std::string paddle::platform::GetTraceBackString<std::string >(std::string&&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
2   std::conditional<std::is_pointer<paddle::framework::Attribute >::value, bool const*, bool const&>::type paddle::platform::details::SafeBoostGetConst<bool, paddle::framework::Attribute >(paddle::framework::Attribute const&, char const*, char const*, int)
3   paddle::operators::GetKernelType(paddle::framework::ExecutionContext const&, paddle::framework::OperatorWithKernel const&, std::string const&)
4   paddle::operators::ActivationOp::GetExpectedKernelType(paddle::framework::ExecutionContext const&) const
5   paddle::imperative::PreparedOp paddle::imperative::PrepareOpImpl<paddle::imperative::VarBase>(paddle::imperative::details::NameVarMapTrait<paddle::imperative::VarBase>::Type const&, paddle::imperative::details::NameVarMapTrait<paddle::imperative::VarBase>::Type const&, paddle::framework::OperatorWithKernel const&, paddle::platform::Place, paddle::framework::AttributeMap const&)
6   paddle::imperative::PreparedOp::Prepare(paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::OperatorWithKernel const&, paddle::platform::Place const&, paddle::framework::AttributeMap const&)
7   paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, paddle::platform::Place const&, bool)

----------------------
Error Message Summary:
----------------------
InvalidArgumentError: boost::get failed, cannot get value (GetAttr(name)) by type bool, its type is int. at (/data/sfraczek/Paddle/paddle/fluid/framework/operator.h:259)
  [operator < relu6 > error]

Reservation status error
sfraczek@aipg-ra-clx-86:/data/sfraczek/models/dygraph/mobilenet$ bash run_cpu_v2.sh | grep relu
Traceback (most recent call last):
  File "train.py", line 262, in <module>
    train_mobilenet()
  File "train.py", line 180, in train_mobilenet
    out = net(img)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/layers.py", line 600, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/sfraczek/models/dygraph/mobilenet/mobilenet_v2.py", line 212, in forward
    y = self._conv1(inputs, if_act=True)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/layers.py", line 600, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/data/sfraczek/models/dygraph/mobilenet/mobilenet_v2.py", line 69, in forward
    y = fluid.layers.relu6(y, use_mkldnn=True)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/layers/nn.py", line 9359, in relu6
    'use_mkldnn': use_mkldnn})
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/framework.py", line 2789, in append_op
    kwargs.get("stop_gradient", False))
  File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/tracer.py", line 45, in trace_op
    not stop_gradient)
paddle.fluid.core_avx.EnforceNotMet:

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0   std::string paddle::platform::GetTraceBackString<std::string >(std::string&&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
2   std::conditional<std::is_pointer<paddle::framework::Attribute >::value, bool const*, bool const&>::type paddle::platform::details::SafeBoostGetConst<bool, paddle::framework::Attribute >(paddle::framework::Attribute const&, char const*, char const*, int)
3   paddle::operators::GetKernelType(paddle::framework::ExecutionContext const&, paddle::framework::OperatorWithKernel const&, std::string const&)
4   paddle::operators::ActivationOp::GetExpectedKernelType(paddle::framework::ExecutionContext const&) const
5   paddle::imperative::PreparedOp paddle::imperative::PrepareOpImpl<paddle::imperative::VarBase>(paddle::imperative::details::NameVarMapTrait<paddle::imperative::VarBase>::Type const&, paddle::imperative::details::NameVarMapTrait<paddle::imperative::VarBase>::Type const&, paddle::framework::OperatorWithKernel const&, paddle::platform::Place, paddle::framework::AttributeMap const&)
6   paddle::imperative::PreparedOp::Prepare(paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::OperatorWithKernel const&, paddle::platform::Place const&, paddle::framework::AttributeMap const&)
7   paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, paddle::platform::Place const&, bool)

----------------------
Error Message Summary:
----------------------
InvalidArgumentError: boost::get failed, cannot get value (GetAttr(name)) by type bool, its type is int. at (/data/sfraczek/Paddle/paddle/fluid/framework/operator.h:259)
  [operator < relu6 > error]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I think adding use_mkldnn attribute in activation_op.cc is OK.

@luotao1
Copy link
Contributor

luotao1 commented Aug 5, 2020

Discussion Summary from @phlrain and I:

  • We prefer to use FLAGS_use_mkldnn.
  • You can add use_mkldnn attribute in C++ OpMaker, such as in activation_op.cc.
  • Please don't add use_mkldnn attribute in Python API, such as in def Relu6(xxx).

@grygielski grygielski mentioned this pull request Aug 7, 2020
@luotao1
Copy link
Contributor

luotao1 commented Aug 7, 2020

Is this PR closed due to duplicated #26037? @grygielski

@grygielski
Copy link
Contributor

@luotao1 Sure, closing due to duplicate.

@grygielski grygielski closed this Aug 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dygraph issues related to dygraph mode Intel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants