Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[v1.7.x] ElementWiseSum fix for oneDNN #18777

Merged
merged 5 commits into from
Aug 5, 2020

Conversation

bgawrych
Copy link
Contributor

Description

This PR fixes bug which occurs when training gluonCV deeplab with oneDNN support.
Original issue: dmlc/gluon-cv#1368

To reproduce:

python3 train.py --dataset pascal_aug --model-zoo deeplab_resnet101_coco --aux --lr 0.001 --checkname res101 --no-cuda

where train.py comes from https://gluon-cv.mxnet.io/_downloads/1f67ebbf3e0adc5c6fa863a3bc7672a6/train.py (from https://gluon-cv.mxnet.io/build/examples_segmentation/train_fcn.html )

Checklist

Essentials

  • Changes are complete (i.e. I finished coding on this PR)
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Comments

Author of the fix is @anko-intel with my small help

@mxnet-bot
Copy link

Hey @bgawrych , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [clang, unix-cpu, windows-cpu, sanity, edge, miscellaneous, windows-gpu, website, centos-cpu, unix-gpu, centos-gpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

for (const NDArray& in : inputs) {
// if ndarray is in default storage and MKLDNN is available,
// need to make sure cpu layout data is used, instead of MKL layout
if (in.storage_type() == kDefaultStorage) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& in.IsMKLDNNData()

// if ndarray is in default storage and MKLDNN is available,
// need to make sure cpu layout data is used, instead of MKL layout
if (in.storage_type() == kDefaultStorage) {
in_data.push_back(in.Reorder2Default());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about emplace_back

@wuxun-zhang
Copy link
Contributor

Thanks for the fix. Looks that master branch also needs this fix.

@lanking520 lanking520 added the pr-awaiting-review PR is waiting for code review label Jul 26, 2020
@roy6324
Copy link

roy6324 commented Jul 27, 2020

Thanks for the fix , problem is solved.

@bgawrych
Copy link
Contributor Author

@mxnet-bot run ci [unix-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [unix-cpu]

@bgawrych
Copy link
Contributor Author

@wuxun-zhang @xinyu-intel Can you look again? I've changed fix by reordering conditions in (I think) proper way

@xinyu-intel
Copy link
Contributor

@bgawrych thx, can you please add a simple python test case for this fix? I think it should be more clear.

Copy link
Contributor

@wuxun-zhang wuxun-zhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@pengzhao-intel pengzhao-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
please add a test case too

@pengzhao-intel pengzhao-intel added this to In progress in CPU Performance and Quantization via automation Jul 30, 2020
CPU Performance and Quantization automation moved this from In progress to Reviewer approved Jul 30, 2020
@bgawrych
Copy link
Contributor Author

LGTM
please add a test case too

@pengzhao-intel Done :)

@bgawrych bgawrych mentioned this pull request Aug 4, 2020
3 tasks
@pengzhao-intel pengzhao-intel merged commit 3143aab into apache:v1.7.x Aug 5, 2020
CPU Performance and Quantization automation moved this from Reviewer approved to Done Aug 5, 2020
@ChaiBapchya
Copy link
Contributor

@pengzhao-intel @bgawrych Missing in v1.x Was it intentional?

@pengzhao-intel
Copy link
Contributor

Yes, it's better to include this PR. @anko-intel

bgawrych added a commit to bgawrych/incubator-mxnet that referenced this pull request Sep 21, 2020
* Fix ElementwiseSum for DNNL

* Fix sanity and replace push_back with emplace_back

* Change order of the data format conditions

* Add NOLINT to avoid readability error

* Add test for oneDNN ElemwiseSum

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>
bgawrych added a commit to bgawrych/incubator-mxnet that referenced this pull request Sep 21, 2020
* Fix ElementwiseSum for DNNL

* Fix sanity and replace push_back with emplace_back

* Change order of the data format conditions

* Add NOLINT to avoid readability error

* Add test for oneDNN ElemwiseSum

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>
bgawrych added a commit to bgawrych/incubator-mxnet that referenced this pull request Sep 21, 2020
* Fix ElementwiseSum for DNNL

* Fix sanity and replace push_back with emplace_back

* Change order of the data format conditions

* Add NOLINT to avoid readability error

* Add test for oneDNN ElemwiseSum

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>
bgawrych added a commit to bgawrych/incubator-mxnet that referenced this pull request Sep 21, 2020
* Fix ElementwiseSum for DNNL

* Fix sanity and replace push_back with emplace_back

* Change order of the data format conditions

* Add NOLINT to avoid readability error

* Add test for oneDNN ElemwiseSum

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>
samskalicky pushed a commit that referenced this pull request Sep 22, 2020
* Fix ElementwiseSum for DNNL

* Fix sanity and replace push_back with emplace_back

* Change order of the data format conditions

* Add NOLINT to avoid readability error

* Add test for oneDNN ElemwiseSum

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>
pengzhao-intel pushed a commit that referenced this pull request Sep 23, 2020
* Fix ElementwiseSum for DNNL

* Fix sanity and replace push_back with emplace_back

* Change order of the data format conditions

* Add NOLINT to avoid readability error

* Add test for oneDNN ElemwiseSum

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>

Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
MKLDNN pr-awaiting-review PR is waiting for code review
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

9 participants