Added Relu6 mkldnn op #25713
Conversation
Thanks for your contribution!
✅ This PR's description meets the template requirements!
LGTM
LGTM!
LGTM
LGTM
LGTM
@@ -9313,7 +9313,7 @@ def elu(x, alpha=1.0, name=None):
 @templatedoc()
-def relu6(x, threshold=6.0, name=None):
+def relu6(x, threshold=6.0, name=None, use_mkldnn=False):
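For context on the op whose signature is being extended, relu6 clamps its input to the range [0, threshold]. A minimal sketch of the math, illustrative only and not Paddle's actual kernel:

```python
# Minimal sketch of relu6 semantics, assuming the usual definition
# relu6(x) = min(max(x, 0), threshold) with threshold defaulting to 6.0.
# Illustrative only; not Paddle's actual kernel.
def relu6_reference(x, threshold=6.0):
    return min(max(x, 0.0), threshold)
```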
Why do you add the attribute use_mkldnn here?

- If we use FLAGS_use_mkldnn=true to globally control the model, we don't need to add a use_mkldnn attribute in each API.
- #25745 (add dnnl sigmoid (logistic) activation): the sigmoid mkldnn op does not add the use_mkldnn attribute.
Why we add use_mkldnn here

This Relu6 is used in the models repo. It is needed for MobileNetV2 here: https://github.com/PaddlePaddle/models/blob/develop/dygraph/mobilenet/mobilenet_v2.py#L69.
We wouldn't have to do this if the models repo used the approach taken here:
https://github.com/PaddlePaddle/models/blob/develop/dygraph/resnet/train.py#L207
which calls:
https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/dygraph/nn.py#L233
which is passed to convolution here: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/dygraph/nn.py#L170
There are a few ways ops are added in the models repo, for example: https://github.com/PaddlePaddle/models/blob/develop/dygraph/resnet/train.py#L243
I am guessing this is inconsistent because the migration isn't finished yet, but I am not sure.
The value of use_mkldnn has to be evaluated before this line for dygraph to use the mkldnn kernel of relu6: https://github.com/PaddlePaddle/Paddle/pull/25713/files/3edaaad6bc84ed19715aaca04f9ed6bdbf06e25e#diff-f8fc1afdc1f06d4c47bc0933d53bf3c6R9359.
The case of sigmoid

For sigmoid I haven't added any use_mkldnn yet because we haven't enabled it yet. However, it will have to be handled in the same way as we eventually decide to handle relu6.
Old vs New approach

This use_mkldnn=False is from the previous approach, before we decided to add FLAGS_use_mkldnn.
The new approach is to pass use_mkldnn=True to ops and check it against FLAGS_use_mkldnn. For details please refer to this PR:
https://github.com/PaddlePaddle/Paddle/pull/25773/files#diff-aeec709d2310c344dd0f235478865b7cR182. We pass use_mkldnn=True and call CheckUseMkldnnWithFlag() to compare against FLAGS_use_mkldnn.
How we can proceed
We could apply the same approach to this PR with checking against FLAGS_use_mkldnn. If you think this is correct, I can do that here.
Alternatively, maybe the models repo could call relu6 in a different way? Or maybe you have other ideas?
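The "new approach" described above (a per-op attribute gated by the global FLAGS_use_mkldnn flag) could be sketched roughly as follows. The function name echoes CheckUseMkldnnWithFlag() from the linked PR, but reading the flag from an environment variable is an assumption made purely for this illustration:

```python
import os

# Hypothetical sketch: an op uses its MKL-DNN kernel only when BOTH the
# per-op use_mkldnn attribute is set AND the global FLAGS_use_mkldnn flag
# is enabled. Reading the flag from the environment is an assumption here,
# not Paddle's actual mechanism.
def check_use_mkldnn_with_flag(attr_use_mkldnn: bool) -> bool:
    flag = os.environ.get("FLAGS_use_mkldnn", "0").lower() in ("1", "true")
    return attr_use_mkldnn and flag
```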
I found the same situation here: #25773 (comment)
We should definitely decide how to deal with that so I can change both places.
We could apply the same approach to this PR with checking against FLAGS_use_mkldnn. If you think this is correct, I can do that here.
Alternatively, maybe the models repo could call relu6 in a different way? Or maybe you have other ideas?
@phlrain Please have a look, and which way do you prefer?
I prefer to use FLAGS_use_mkldnn. It seems an easier way.
@@ -504,6 +504,9 @@ class Relu6OpMaker : public framework::OpProtoAndCheckerMaker {
   AddAttr<float>("threshold",
                  "The threshold value of Relu6. Default is 6.0. ")
       .SetDefault(6.0f);
+  AddAttr<bool>("use_mkldnn",
+                "(bool, default false) Only used in mkldnn kernel")
+      .SetDefault(false);
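The AddAttr/SetDefault chain above registers a typed attribute with a default value on the op's proto. A rough Python analogue of that pattern, with illustrative class and method names that are not Paddle's API:

```python
# Rough Python analogue of the AddAttr<T>(...).SetDefault(...) pattern;
# class and method names are illustrative, not Paddle's API.
class AttrEntry:
    def __init__(self, name, attr_type, doc=""):
        self.name, self.type, self.doc = name, attr_type, doc
        self.default = None

    def set_default(self, value):
        # Mirrors .SetDefault(): the default must match the declared type.
        assert isinstance(value, self.type)
        self.default = value
        return self

class OpMaker:
    def __init__(self):
        self.attrs = {}

    def add_attr(self, name, attr_type, doc=""):
        # Mirrors AddAttr<T>(name, doc): register and return the entry so
        # calls can be chained fluently.
        entry = AttrEntry(name, attr_type, doc)
        self.attrs[name] = entry
        return entry
```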
Do we need to add an attribute here? A lot of activation ops already support the use_mkldnn attribute.
Without it, it crashes. We run a modified mobilenet script:
https://github.com/PaddlePaddle/models/blob/a7fb45f650e0d32ba697b5818aa4799c2ccc9b05/dygraph/mobilenet/mobilenet_v2.py#L69
passing use_mkldnn=True in that line.
The error is:
sfraczek@aipg-ra-clx-86:/data/sfraczek/models/dygraph/mobilenet$ bash run_cpu_v2.sh | grep relu
Traceback (most recent call last):
File "train.py", line 262, in <module>
train_mobilenet()
File "train.py", line 180, in train_mobilenet
out = net(img)
File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/layers.py", line 600, in __call__
outputs = self.forward(*inputs, **kwargs)
File "/data/sfraczek/models/dygraph/mobilenet/mobilenet_v2.py", line 212, in forward
y = self._conv1(inputs, if_act=True)
File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/layers.py", line 600, in __call__
outputs = self.forward(*inputs, **kwargs)
File "/data/sfraczek/models/dygraph/mobilenet/mobilenet_v2.py", line 69, in forward
y = fluid.layers.relu6(y, use_mkldnn=True)
File "/data/sfraczek/Paddle/build/python/paddle/fluid/layers/nn.py", line 9359, in relu6
'use_mkldnn': use_mkldnn})
File "/data/sfraczek/Paddle/build/python/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/data/sfraczek/Paddle/build/python/paddle/fluid/framework.py", line 2789, in append_op
kwargs.get("stop_gradient", False))
File "/data/sfraczek/Paddle/build/python/paddle/fluid/dygraph/tracer.py", line 45, in trace_op
not stop_gradient)
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string >(std::string&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(paddle::platform::ErrorSummary const&, char const*, int)
2 std::conditional<std::is_pointer<paddle::framework::Attribute >::value, bool const*, bool const&>::type paddle::platform::details::SafeBoostGetConst<bool, paddle::framework::Attribute >(paddle::framework::Attribute const&, char const*, char const*, int)
3 paddle::operators::GetKernelType(paddle::framework::ExecutionContext const&, paddle::framework::OperatorWithKernel const&, std::string const&)
4 paddle::operators::ActivationOp::GetExpectedKernelType(paddle::framework::ExecutionContext const&) const
5 paddle::imperative::PreparedOp paddle::imperative::PrepareOpImpl<paddle::imperative::VarBase>(paddle::imperative::details::NameVarMapTrait<paddle::imperative::VarBase>::Type const&, paddle::imperative::details::NameVarMapTrait<paddle::imperative::VarBase>::Type const&, paddle::framework::OperatorWithKernel const&, paddle::platform::Place, paddle::framework::AttributeMap const&)
6 paddle::imperative::PreparedOp::Prepare(paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::OperatorWithKernel const&, paddle::platform::Place const&, paddle::framework::AttributeMap const&)
7 paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, paddle::platform::Place const&, bool)
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: boost::get failed, cannot get value (GetAttr(name)) by type bool, its type is int. at (/data/sfraczek/Paddle/paddle/fluid/framework/operator.h:259)
[operator < relu6 > error]
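The InvalidArgumentError above is a typed attribute lookup failing: the kernel asks for the attribute as a bool while the stored value is an int. A hypothetical sketch of that failure mode (illustrative, not Paddle's code):

```python
# Hypothetical sketch of a typed attribute lookup that fails when the
# requested type does not match the stored type, much like boost::get
# failing on a variant that holds an int when asked for a bool.
def get_attr(attrs, name, expected_type):
    value = attrs[name]
    # 'type(value) is expected_type' is deliberately strict: in Python,
    # bool is a subclass of int, so isinstance() would mask the mismatch.
    if type(value) is not expected_type:
        raise TypeError(
            f"cannot get value ({name}) by type {expected_type.__name__}, "
            f"its type is {type(value).__name__}")
    return value
```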
Got it. I think adding the use_mkldnn attribute in activation_op.cc is OK.
1bc2b3b
Discussion summary from @phlrain and me:
Is this PR closed due to duplicate #26037? @grygielski
@luotao1 Sure, closing due to duplicate.
PR types
New features
PR changes
OPs
Describe
Dygraph: Added Relu6 mkldnn op used in MobileNetV2 from the models repo