MKL-DNN integration: request for reviews #7931
Conversation
elemwise sum bug fixes
@piiswrong We have fixed the convergence issue in ResNet. There were some problems in the Conv and Batch Norm layers. We have also added some more optimizations for further speed-ups. The MKL-DNN version is now, on average, 15% faster than MKLML (MKL2017) for both inference and training.
@szha @piiswrong MKL-DNN doesn't support the fp64 (double) data type. Do you think this is an issue? The library team is focusing more on adding lower precisions.
```cpp
 * @param req
 * @param out_data
 */
template<typename xpu, typename DType>
```
Why do you need xpu for any MKLDNN functions? Doesn't the code always run on CPU?
This is a good point. I was following the convention of other MKLDNN operators to be consistent. I can change this to cpu only.
Having also discussed this with Young and Ashok, we would like to keep this template parameter to support future Intel devices; we may support devices other than the traditional CPU in the future.
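For illustration, here is a minimal sketch of the convention under discussion, with simplified stand-in types (the device tag, function name, and signature here are hypothetical, not the actual MXNet ones): the `xpu` template parameter is kept for forward compatibility, but only a `cpu` tag is instantiated today.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

struct cpu {};  // stand-in device tag, playing the role of mshadow::cpu

// Simplified stand-in for an MKL-DNN compute function: the xpu tag is kept
// for forward compatibility, but the kernel itself always runs on the host.
template <typename xpu, typename DType>
void ElemwiseAddSketch(const std::vector<DType>& a,
                       const std::vector<DType>& b,
                       std::vector<DType>* out) {
  out->resize(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) (*out)[i] = a[i] + b[i];
}

int main() {
  std::vector<float> a{1.f, 2.f}, b{3.f, 4.f}, out;
  ElemwiseAddSketch<cpu, float>(a, b, &out);  // cpu is the only tag in use today
  std::cout << out[0] << " " << out[1] << std::endl;  // prints: 4 6
  return 0;
}
```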
```cpp
if (req[0] == kNullOp) return;

Stream<xpu> *s = ctx.get_stream<xpu>();
```
It doesn't seem the stream is used anywhere.
This is true right now. We may need to use the stream when we try to support tensor shapes other than NCHW. I can remove it for now.
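For context on the `kNullOp` early return in the diff above, here is a minimal sketch of MXNet's write-request convention; the enum mirrors `OpReqType` from `mxnet/op_attr_types.h`, while the compute function is a simplified stand-in, not the operator's actual body:

```cpp
// Simplified mirror of MXNet's OpReqType (see mxnet/op_attr_types.h).
enum OpReqType { kNullOp, kWriteTo, kWriteInplace, kAddTo };

// Sketch of an FCompute-style body: when the caller requests no output,
// the operator can return before doing any work, including stream setup.
template <typename DType>
void ComputeSketch(OpReqType req, const DType* in, DType* out, int n) {
  if (req == kNullOp) return;  // output not needed; skip everything
  for (int i = 0; i < n; ++i) {
    if (req == kAddTo) out[i] += in[i];  // accumulate into existing output
    else               out[i]  = in[i];  // kWriteTo / kWriteInplace
  }
}

int main() {
  float in[3] = {1.f, 2.f, 3.f};
  float out[3] = {10.f, 10.f, 10.f};
  ComputeSketch(kAddTo, in, out, 3);  // out becomes {11, 12, 13}
  return 0;
}
```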
```cpp
.set_attr<FInferStorageType>("FInferStorageType",
                             ElemwiseStorageType<2, 1, true, false, false>) \
.set_attr<FCompute>("FCompute<cpu>", MKLDNNElementWiseAddCompute<cpu>) \
.set_attr<FComputeEx>("FComputeEx<cpu>", ElemwiseBinaryOp::ComputeEx<cpu, mshadow::op::plus>)
```
MKLDNN implementation is only defined for FCompute?
Yes. FComputeEx is for NDArray inputs, and all current MKLDNN operators support only TBlobs; the main benefit of the MKLDNN elemwise sum operator comes from working with other MKLDNN operators. With the upcoming sparse tensor support, we may need to make some adjustments.
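For readers unfamiliar with the two dispatch paths, below is a sketch of the distinction; the stub types are placeholders, and the functor shapes only approximate MXNet's real `FCompute`/`FComputeEx` attributes (declared in `mxnet/op_attr_types.h`):

```cpp
#include <functional>
#include <vector>

// Stub types standing in for the real MXNet classes.
struct NodeAttrs {};
struct OpContext {};
struct TBlob {};    // dense tensor view: the default-storage path
struct NDArray {};  // may carry non-default storage (e.g. sparse)
enum OpReqType { kNullOp, kWriteTo, kWriteInplace, kAddTo };

// FCompute receives plain TBlobs, so it only ever sees dense data.
using FCompute = std::function<void(const NodeAttrs&, const OpContext&,
                                    const std::vector<TBlob>&,
                                    const std::vector<OpReqType>&,
                                    const std::vector<TBlob>&)>;

// FComputeEx receives NDArrays, which is the path sparse (and, later,
// MKL-specific) storage types flow through.
using FComputeEx = std::function<void(const NodeAttrs&, const OpContext&,
                                      const std::vector<NDArray>&,
                                      const std::vector<OpReqType>&,
                                      const std::vector<NDArray>&)>;

int main() {
  FCompute dense_fn;  // would hold e.g. MKLDNNElementWiseAddCompute<cpu>
  FComputeEx ex_fn;   // would hold e.g. ElemwiseBinaryOp::ComputeEx<...>
  return (dense_fn == nullptr && ex_fn == nullptr) ? 0 : 1;
}
```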
@szha Sure, I am looking into it.
@ykim362 BTW, is the fix in mklml_lnx_2018.0.20170908.tgz? Does it make sense to upgrade the library for the MKL2017 use case? Many people are using the MKL version (with MKL2017 and the experimental flag on).
@szha MKL-DNN also utilizes MKL2017, so it would be useful to update MKL2017 (2018) as well. The fix would be in the MXNet code; I am still investigating it.
Is there an update on this issue? We are keen to include MKL in an upcoming project.
@piiswrong @szha @sbodenstein MKL-DNN now officially supports OS X, so we don't need to worry about this issue.
@ykim362: thanks! I saw that. BTW, do you have any estimate for when this PR will be ready?
@sbodenstein From my understanding, this PR is not going to be merged directly. It's going to be merged as part of another revision with sparse tensor (MKL storage) support. @piiswrong Is this correct?
@ykim362: do you know if bugs, like the ResNet convergence bug, are still unsolved with v0.11 MKL-DNN?
@sbodenstein MKL with v0.11 is quite buggy (i.e., even without MKL-DNN, MKL itself is buggy).
Closing, since @zheng-da is making a new PR for this.
This PR is a beta version for code reviews and experiments. There are several known issues which are being debugged.
If this version is built with the 'USE_MKL2017' and 'USE_MKL2017_EXPERIMENTAL' flags, it provides the same functionality and performance as the current MKLML release. If it is built with the 'USE_MKLDNN' flag, it goes through the new code path (the MKL-DNN integration).
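For illustration, build flags like these typically surface in the source as preprocessor guards along the following lines; the macro names below are assumptions for the sketch, not verified against this PR's diff:

```cpp
#include <cstdio>

// Hypothetical macro names; the actual spellings used by the build flags
// in this PR may differ.
const char* BackendName() {
#if defined(MXNET_USE_MKLDNN)
  return "mkldnn";       // new code path: MKL-DNN integration
#elif defined(MXNET_USE_MKL2017)
  return "mklml";        // existing MKLML (MKL2017) code path
#else
  return "default cpu";  // plain CPU implementation
#endif
}

int main() { std::printf("%s\n", BackendName()); }
```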
MKL-DNN
A new open-source deep learning library providing IA-optimized DNN kernels.
https://github.com/01org/mkl-dnn
Advantages
More functionalities
New functionalities will mainly be added to MKL-DNN rather than to the MKLML library.
Below are two examples.
Performance optimization
As of Sep. 18, 2017:
- AlexNet inference (BS: 256): 1474 (MKLML) --> 1568 (MKL-DNN), on a Skylake 20-core machine (6148)
- Inception-BN inference (BS: 32): 454 (MKLML) --> 483 (MKL-DNN), on a Skylake 20-core machine (6148)
- ResNet-50 inference (BS: 32): 99 (MKLML) --> 116 (MKL-DNN), on KNL 7250
Known issues
Contributors for this PR
@ashokei @karkadad @louisfeng @adstraw