Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[v1.6.x] Backport [MKL-DNN] Integrate Conv3d and Pool3d/1d (#17884) and Fix Sanity pipeline in 1.6.x #18206

Merged
merged 7 commits into from May 6, 2020

Conversation

ChaiBapchya
Copy link
Contributor

  • Integrate MKl-DNN conv3d and pool3d/1d

  • fix UT & address comments

  • clean code

  • rebase against latest master

* Integrate MKl-DNN conv3d and pool3d/1d

* fix UT & address comments

* clean code

* rebase against latest master
@mxnet-bot
Copy link

Hey @ChaiBapchya , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [sanity, centos-gpu, miscellaneous, windows-cpu, edge, website, clang, windows-gpu, unix-gpu, unix-cpu, centos-cpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@ChaiBapchya
Copy link
Contributor Author

@mxnet-bot run ci [sanity]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [sanity]

@TaoLv
Copy link
Member

TaoLv commented May 2, 2020

@wuxun-zhang please help to review ~

@ChaiBapchya ChaiBapchya changed the title [v1.6.x] Backport [MKL-DNN] Integrate Conv3d and Pool3d/1d (#17884) [v1.6.x] Backport [MKL-DNN] Integrate Conv3d and Pool3d/1d (#17884) and Fix Sanity pipeline in 1.6.x May 2, 2020
Copy link
Contributor

@wuxun-zhang wuxun-zhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still some incompatible changes derived from master branch. Please modify them according to the comments.

src/operator/nn/mkldnn/mkldnn_act.cc Outdated Show resolved Hide resolved
src/operator/nn/mkldnn/mkldnn_act.cc Outdated Show resolved Hide resolved
src/operator/nn/mkldnn/mkldnn_base-inl.h Outdated Show resolved Hide resolved
src/operator/nn/mkldnn/mkldnn_base-inl.h Show resolved Hide resolved
src/operator/nn/mkldnn/mkldnn_pooling-inl.h Outdated Show resolved Hide resolved
src/operator/subgraph/mkldnn/mkldnn_subgraph_base-inl.h Outdated Show resolved Hide resolved
@ChaiBapchya
Copy link
Contributor Author

@wuxun-zhang Thanks for pointing it out. Addressed your comments.

* Remove conv3d dilation restriction

* Remove comment
@ChaiBapchya
Copy link
Contributor Author

test_quantization_mkldnn.test_quantized_conv failed with

[2020-05-04T03:07:42.847Z] ----------------------------------------------------------------------

[2020-05-04T03:07:42.847Z] Traceback (most recent call last):

[2020-05-04T03:07:42.847Z]   File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest

[2020-05-04T03:07:42.847Z]     self.test(*self.arg)

[2020-05-04T03:07:42.847Z]   File "/usr/local/lib/python3.5/dist-packages/nose/util.py", line 620, in newfunc

[2020-05-04T03:07:42.847Z]     return func(*arg, **kw)

[2020-05-04T03:07:42.847Z]   File "/work/mxnet/tests/python/mkl/../unittest/common.py", line 177, in test_new

[2020-05-04T03:07:42.847Z]     orig_test(*args, **kwargs)

[2020-05-04T03:07:42.847Z]   File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 283, in test_quantized_conv

[2020-05-04T03:07:42.847Z]     check_quantized_conv((1, 3, 4, 28, 28), (1, 3, 3), 128, (1, 1, 1), (1, 1, 1), (2, 2, 2), False, qdtype)

[2020-05-04T03:07:42.847Z]   File "/work/mxnet/tests/python/mkl/../quantization/test_quantization.py", line 217, in check_quantized_conv

[2020-05-04T03:07:42.847Z]     arg_shapes, _, _ = conv.infer_shape(data=data_shape)

[2020-05-04T03:07:42.847Z]   File "/work/mxnet/python/mxnet/symbol/symbol.py", line 1103, in infer_shape

[2020-05-04T03:07:42.847Z]     res = self._infer_shape_impl(False, *args, **kwargs)

[2020-05-04T03:07:42.847Z]   File "/work/mxnet/python/mxnet/symbol/symbol.py", line 1267, in _infer_shape_impl

[2020-05-04T03:07:42.847Z]     ctypes.byref(complete)))

[2020-05-04T03:07:42.847Z]   File "/work/mxnet/python/mxnet/base.py", line 255, in check_call

[2020-05-04T03:07:42.847Z]     raise MXNetError(py_str(_LIB.MXGetLastError()))

[2020-05-04T03:07:42.847Z] mxnet.base.MXNetError: Error in operator conv: [03:07:30] src/operator/nn/convolution.cc:242: Check failed: param_.dilate.Size() == 1U (8 vs. 1) : Dilate is not supported in 3d convolution

Cherrypicked #17491 which removed the conv3d dilate restriction.
That should resolve unix-cpu pipeline errors.

@ChaiBapchya
Copy link
Contributor Author

@wuxun-zhang One last failure
unix-gpu pipeline
Tests / Cpp: MKLDNN+GPU

pooling-inl.h:313: Check failed: outputs.size() == GetNumOutputs(param) (1 vs. 2) :  

I was able to repro locally. Not sure why output size is mismatch for pooling.

@wuxun-zhang
Copy link
Contributor

@ChaiBapchya To fix the unix-gpu issue, you can refer to this link, like what I did when backported this PR to v1.x branch before.

@ChaiBapchya
Copy link
Contributor Author

@mxnet-bot run ci [unix-cpu, unix-gpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [unix-cpu, unix-gpu]

@ChaiBapchya
Copy link
Contributor Author

@mxnet-bot run ci [unix-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [unix-cpu]

Copy link
Contributor

@wuxun-zhang wuxun-zhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ChaiBapchya
Copy link
Contributor Author

@mxnet-label-bot add [pr-awaiting-merge]

@lanking520 lanking520 added the pr-awaiting-merge Review and CI is complete. Ready to Merge label May 6, 2020
@leezu leezu merged commit b748c45 into apache:v1.6.x May 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-merge Review and CI is complete. Ready to Merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants