
MKLDNN Relu Tanh Sqrt Abs activations added #9081

Merged: 4 commits into PaddlePaddle:develop, Mar 26, 2018

Conversation

Contributor

kbinias commented Mar 14, 2018

  • Implements Relu, Tanh, Sqrt, Abs activations
  • Passes unit tests for Relu, Tanh, Sqrt, Abs

Besides unit tests, we validated these kernels by running training and inference on the MNIST dataset and comparing the results with the IntelCaffe library.

CLAassistant commented Mar 14, 2018

CLA assistant check
All committers have signed the CLA.

__macro(relu, ReluMkldnnFunctor, ReluMkldnnGradFunctor) \
__macro(tanh, TanhMkldnnFunctor, TanhMkldnnGradFunctor) \
__macro(sqrt, SqrtMkldnnFunctor, SqrtMkldnnGradFunctor) \
__macro(abs, AbsMkldnnFunctor, AbsMkldnnGradFunctor);

Contributor

Could lines 187-190 be left-justified?

#define FOR_EACH_MKLDNN_KERNEL_FUNCTOR(__macro)                   \
  __macro(relu, ReluMkldnnFunctor, ReluMkldnnGradFunctor)         \
  __macro(tanh, TanhMkldnnFunctor, TanhMkldnnGradFunctor)         \
  __macro(sqrt, SqrtMkldnnFunctor, SqrtMkldnnGradFunctor)         \
  __macro(abs, AbsMkldnnFunctor, AbsMkldnnGradFunctor);

Contributor Author

Done

AddAttr<bool>("use_mkldnn",
              "(bool, default false) Only used in mkldnn kernel")
    .SetDefault(false);
AddAttr<std::string>(

Contributor

Why does activation_op have the data_format attribute? Activations should not need to consider NCHW or NHWC.

Contributor Author

Done. Yes, it doesn't matter for MKLDNN eltwise.

@@ -484,5 +484,72 @@ def test_check_grad(self):
        self.check_grad(['X'], 'Out', max_relative_error=0.008)


#--------------------test MKLDNN--------------------
class TestMKLDNNRelu(OpTest):

Contributor

Could you re-use the above unit tests, and create MKLDNN-related unit tests like:

class TestMKLDNN(TestConv2dOp):
    def init_op_type(self):
        self.use_mkldnn = True
        self.op_type = "conv2d"

Contributor Author

kbinias commented Mar 20, 2018

Tests in test_activation_op.py inherit directly from OpTest.
I had to pass a different input size (e.g. [2, 4, 3, 5]) to numpy.random.uniform, so I needed to override setUp().

Contributor

I wonder why not inherit directly from TestRelu?

class TestMKLDNNRelu(TestRelu)

Contributor Author

Done.
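
For reference, a minimal sketch of what the inherited test might look like, assuming TestRelu's setUp() populates self.inputs/self.outputs and that the operator honors the use_mkldnn attribute discussed above (illustrative only, not the PR's exact code; the [2, 4, 3, 5] shape comes from the earlier comment):

import numpy as np

class TestMKLDNNRelu(TestRelu):
    def setUp(self):
        super(TestMKLDNNRelu, self).setUp()
        # Re-generate a 4-D input, since the MKLDNN eltwise kernels are
        # exercised with shapes like [2, 4, 3, 5].
        x = np.random.uniform(-1, 1, [2, 4, 3, 5]).astype("float32")
        self.inputs = {'X': x}
        self.outputs = {'Out': np.maximum(x, 0)}
        # Route execution to the MKLDNN kernel.
        self.attrs = {"use_mkldnn": True}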

@@ -215,7 +215,8 @@ def setUpClass(cls):
        '''Fix random seeds to remove randomness from tests'''
        cls._np_rand_state = np.random.get_state()
        cls._py_rand_state = random.getstate()

        cls.use_mkldnn = False

Contributor

What's the reason for updating op_test.py? Adding the MKLDNN unit tests for conv2d (#8451) and pool (#8879) didn't require modifying op_test.py.

Contributor Author

Done. I removed all my changes.

@@ -49,6 +76,27 @@ class ActivationKernel
  }
};

template <typename Functor>

Contributor

Could MKLDNNActivationKernel and MKLDNNActivationGradKernel be moved to a new file named mkldnn_activation_op.h?

Contributor Author

Done.

@@ -42,6 +42,7 @@ inline mkldnn::memory::desc MKLDNNMemDesc(const std::vector<int>& dims,
}

inline bool CanMKLDNNBeUsed(const framework::ExecutionContext& ctx) {
  if (!ctx.HasAttr("use_mkldnn")) return false;

Contributor

What's the purpose of line 45? Why didn't the previous conv2d (#8451) and pool (#8879) PRs need this line?

Contributor Author

kbinias commented Mar 20, 2018

Done. I removed all my changes. The problem was with the non-MKLDNN activation operators. The file activation_op.cc contains a mix of operators (all of them for CPU and CUDA, and 4 of them also for MKLDNN). By default they don't have the required MKLDNN attribute (e.g. AddAttr("use_mkldnn"), "..."). I added new ActivationWithMKLDNNOp and ActivationWithMKLDNNOpGrad classes to support this situation. An alternative solution would be to add the use_mkldnn attribute to all activation operators, but that seems ugly.
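
A rough illustration of that idea, assuming the kernel-selection pattern Paddle operators used at the time (this sketch is not the PR's exact code; everything other than CanMKLDNNBeUsed, shown in the diff above, is an assumption):

// Illustrative sketch only: a dedicated operator class so that only the
// four MKLDNN-backed activations carry and honor the use_mkldnn attribute.
class ActivationWithMKLDNNOp : public framework::OperatorWithKernel {
 public:
  using framework::OperatorWithKernel::OperatorWithKernel;

 protected:
  framework::OpKernelType GetExpectedKernelType(
      const framework::ExecutionContext& ctx) const override {
    // Fall back to the plain CPU/CUDA kernel unless use_mkldnn is set
    // and MKLDNN can actually be used (see CanMKLDNNBeUsed above).
    auto library = framework::LibraryType::kPlain;
    if (platform::CanMKLDNNBeUsed(ctx)) {
      library = framework::LibraryType::kMKLDNN;
    }
    return framework::OpKernelType(
        framework::ToDataType(ctx.Input<framework::Tensor>("X")->type()),
        ctx.GetPlace(), framework::DataLayout::kAnyLayout, library);
  }
};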

@@ -195,6 +199,10 @@ class ExecutionContext {
    return op_.Attr<T>(name);
  }

  inline bool HasAttr(const std::string& name) const {
    return op_.HasAttr(name);
  }

Contributor

What's the purpose of adding the HasAttr method? Why didn't the previous conv2d (#8451) and pool (#8879) PRs need it?

Contributor Author

Done. I removed all my changes. The idea was that it would be nice to have a function that checks for the existence of an attribute in the hashtable.

kbinias removed the request for review from tensor-tang on March 20, 2018.
kbinias force-pushed the kbinias/mkldnn-activations branch 2 times, most recently from 8be8def to aad5df7, on March 21, 2018.

Contributor Author
kbinias commented Mar 22, 2018

@luotao1 May I ask you to continue your review?

Contributor

luotao1 left a comment

@tensor-tang Can you help review activation_mkldnn_op.cc?

@@ -484,5 +484,72 @@ def test_check_grad(self):
self.check_grad(['X'], 'Out', max_relative_error=0.008)


#--------------------test MKLDNN--------------------
class TestMKLDNNRelu(OpTest):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why not inherit directly from TestRelu?

class TestMKLDNNRelu(TestRelu)

Contributor

tensor-tang left a comment

LGTM, only one question below.

@@ -127,7 +127,12 @@ function(op_library TARGET)

  # pybind USE_OP_DEVICE_KERNEL for MKLDNN
  if (WITH_MKLDNN AND ${mkldnn_cc_srcs_len} GREATER 0)
    # Append first implemented MKLDNN activation operator
    if (${MKLDNN_FILE} STREQUAL "activation_mkldnn_op")
      file(APPEND ${pybind_file} "USE_OP_DEVICE_KERNEL(relu, MKLDNN);\n")

Contributor

The condition checks for activation_mkldnn_op, so why is it named "relu", which is just one of the activations?

Contributor

The reason is that it's enough to add just one operator to pybind per *_op.cc file:

# Note that it's enough to add just one operator to pybind in a *_op.cc file.
# And for detailed pybind information, please see generated paddle/pybind/pybind.h.
file(READ ${TARGET}.cc TARGET_CONTENT)
string(REGEX MATCH "REGISTER_OP\\(.*REGISTER_OP\\(" multi_register "${TARGET_CONTENT}")
string(REGEX MATCH "REGISTER_OP\\([a-z0-9_]*," one_register "${multi_register}")
if (one_register STREQUAL "")
  string(REPLACE "_op" "" TARGET "${TARGET}")
else ()
  string(REPLACE "REGISTER_OP(" "" TARGET "${one_register}")
  string(REPLACE "," "" TARGET "${TARGET}")
endif()

Contributor

luotao1 left a comment

LGTM

luotao1 merged commit cb3bbbd into PaddlePaddle:develop on Mar 26, 2018.