
INTEL MKL: Enhance MKL BatchNorm ops with primitive reuse #19402

Merged

Conversation

@gzmkl (Contributor) commented May 18, 2018

Enable MKL BatchNorm ops with primitive reuse, to improve
(1) model training and
(2) inference at small batch sizes
by minimizing primitive-creation time.
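
For readers unfamiliar with the pattern: primitive reuse means caching the MKL-DNN primitive the first time a given shape/parameter combination is seen and fetching it from the cache afterwards. A minimal sketch follows, with hypothetical names and thread safety omitted; the real implementation lives in mkl_fused_batch_norm_op.cc:

    #include <memory>
    #include <string>
    #include <unordered_map>

    // Stand-in for a compiled MKL-DNN batch-norm primitive.
    struct BatchNormPrimitive { /* descriptor, memory formats, ... */ };

    // Return a cached primitive for the given shape/epsilon key, creating it
    // only on the first call, so later Compute() calls skip creation cost.
    BatchNormPrimitive* GetBatchNormPrimitive(const std::string& key) {
      static std::unordered_map<std::string,
                                std::unique_ptr<BatchNormPrimitive>> cache;
      auto it = cache.find(key);
      if (it == cache.end()) {
        it = cache.emplace(key,
                           std::unique_ptr<BatchNormPrimitive>(
                               new BatchNormPrimitive)).first;
      }
      return it->second.get();
    }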

Note: please review and merge PR #19399 first.

@gzmkl (Contributor Author) commented May 21, 2018

Closing temporarily, pending the conv_bwd PR.

@gzmkl gzmkl closed this May 21, 2018
@gzmkl (Contributor Author) commented May 21, 2018

mkl_conv_ops.cc has been reverted to avoid any review confusion.
The new version of mkl_conv_ops.cc is contained in PR #19399.

Thanks

@gzmkl gzmkl reopened this May 21, 2018
@drpngx drpngx requested a review from rmlarsen June 4, 2018 15:59
@drpngx drpngx added the awaiting review label Jun 4, 2018
@gzmkl (Contributor Author) commented Jun 4, 2018

Pending on #19754

@gzmkl gzmkl closed this Jun 4, 2018
@gzmkl (Contributor Author) commented Jul 4, 2018

Reopening, since PR #19399 has been merged.

@gzmkl gzmkl reopened this Jul 4, 2018
@googlebot

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@yiqianglee (Contributor)

Yes, I have already signed the CLA, and it should be OK to contribute.

@gzmkl (Contributor Author) commented Jul 18, 2018

Hi,
I am the PR submitter and confirm that I am OK with all the changes made by yiqianglee.
Thanks,
GZ

@rmlarsen (Member)

@gzmkl I think many of my comments for PR 19403 would apply to this as well, please modify accordingly.

@rmlarsen rmlarsen added cla: yes and removed cla: no labels Jul 19, 2018
@googlebot

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

@gzmkl (Contributor Author) commented Jul 19, 2018

I plan to apply the PR 19403 review comments to this PR too.

@tensorflowbutler tensorflowbutler removed the awaiting review label Jul 20, 2018

@gzmkl (Contributor Author) commented Jul 24, 2018

Latest code changes, based on the PR 19403 code review suggestions.
Summary:
(1) Changed the signature of Execute(...) so that input parameters are declared const while the output parameter is non-const (sketched below).
(2) Removed unnecessary static_cast and const_cast in many places to simplify the code.
(3) Backed out the change to the SetOp() method (mkl_util.h); using "emplace" causes unit-test failures in ops/nn_grad_test.py.
(4) Made minor changes to some comments so the descriptions are more accurate.
BTW, the related code review suggestions from PRs 19399 & 19400 have also been reflected in this PR.
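
For illustration, the const-correctness convention in (1) amounts to something like the following sketch; the parameter names here are hypothetical, not the actual kernel signature:

    // Hypothetical sketch only: inputs are declared const, the output is not.
    template <typename T>
    void Execute(const T* src_data,      // input, read-only
                 const T* weights_data,  // input, read-only
                 T* dst_data);           // output, written by the primitive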

@rmlarsen (Member)

@gzmkl Thanks for the update. Let's try to get PR 19403 merged first before we push the remaining PRs. Running tests for it again now.

int num_elements = (*diff_src_tensor)->shape().num_elements();
auto diff_src_data = (*diff_src_tensor)->flat<T>().data();
for (size_t i = 0; i < num_elements; i++)
  diff_src_data[i] = 0;
Member

I would use std::fill instead of a loop. Same everywhere.
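
For reference, the std::fill form of the zeroing loop above would look roughly like this (a sketch against the variables in the snippet, not the exact patch that was applied):

    #include <algorithm>  // for std::fill

    // Zero the buffer in one call instead of an explicit element-wise loop.
    std::fill(diff_src_data, diff_src_data + num_elements, T(0));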

Contributor Author

Sure, I will do that.
There are similar places in files that are not changed in this PR. Should I make the same change there?

Member

Please do. :-)

Contributor Author

I have refactored the code in multiple places in mkl_fused_batch_norm_op.cc.

To avoid potential merge conflicts, I did not apply this recommendation to source files not related to this PR.

We will include this suggestion, along with others (such as changing the signature of Execute() to use proper const or non-const argument declarations), in a separate "code clean-up" PR.

Thanks!

@rmlarsen (Member)

@gzmkl resolved conflict



@rmlarsen rmlarsen added the cla: yes and ready to pull labels and removed the cla: no label Aug 1, 2018

@gzmkl (Contributor Author) commented Aug 7, 2018

Hi Rasmus,

Please choose the "master" version to resolve the following conflict in mkl_util.h:

<<<<<<< primreuse_batch_norm
#include
=======
>>>>>>> master

The "#include" line seems to be duplicated.

@gzmkl (Contributor Author) commented Aug 7, 2018

Please take the branch code for the following conflict:

<<<<<<< primreuse_batch_norm
#include "tensorflow/core/platform/cpu_info.h" // Keep this line
=======
>>>>>>> master

Thanks!

@tensorflow-copybara tensorflow-copybara merged commit 8e2f587 into tensorflow:master Aug 7, 2018
tensorflow-copybara pushed a commit that referenced this pull request Aug 7, 2018
@@ -262,6 +262,7 @@ class MklFusedBatchNormOp : public OpKernel {
   }

   void MklCreateInputLayout(OpKernelContext* context) {
+    const Tensor& input = MklGetInput(context, 0);
Member

Just curious, why do we have this line? This local variable isn't used anywhere in the function.

Contributor Author

It should not be there. See my more detailed comment below.

Member

Thank you!

@@ -544,6 +545,7 @@ class MklFusedBatchNormGradOp : public OpKernel {
   }

   void MklCreateInputLayout(OpKernelContext* context) {
+    const Tensor& input = MklGetInput(context, 0);
Member

This local variable also isn't used anywhere in this function.

Contributor Author

Good catch!

Yes, this line of code should not be there. It was in a function from the MKL-ML integration, which will be removed in the long run since we now have the MKL-DNN integration.

I will clean up this code after all the primitive-reuse PRs are done (only the Relu one remains). I have a TODO list based on Rasmus's suggestions, which were applied only to the individual PRs (thus only to the related changed files), and will submit a "clean-up" PR.

Thanks!

Member

Thanks for your clarification!

@claynerobison claynerobison deleted the primreuse_batch_norm branch March 22, 2019 21:20