
[MKLDNN] Support fullyconnected and element-wise ops fusion #15950

merged 5 commits into apache:master Aug 22, 2019



commented Aug 20, 2019


This PR adds support for fusing FullyConnected with element-wise ops, including activation, square, sqrt, exp, abs, and clip.
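As a rough illustration of what this fusion computes (a plain-Python sketch with hypothetical helper names, not MXNet's actual subgraph-pass code): FullyConnected produces y = x · Wᵀ + b, and the following element-wise op is absorbed as a post-op applied in the same output loop rather than as a separate graph node.

```python
import math

# Post-op table mirroring the element-wise ops this PR fuses
# (activation/square/sqrt/exp/abs; clip is handled separately below).
POST_OPS = {
    "relu":   lambda v: max(v, 0.0),   # activation
    "square": lambda v: v * v,
    "sqrt":   lambda v: math.sqrt(v),
    "exp":    lambda v: math.exp(v),
    "abs":    lambda v: abs(v),
}

def fused_fc(x, weight, bias, post_op=None, clip_range=None):
    """FullyConnected (y = x . W^T + b) with an optional fused post-op.

    x: list of input rows; weight: one row per output channel;
    bias: one value per output channel; clip_range: (a_min, a_max).
    """
    out = []
    for row in x:
        out_row = []
        for w_row, b in zip(weight, bias):
            v = sum(a * w for a, w in zip(row, w_row)) + b
            if post_op == "clip":
                a_min, a_max = clip_range
                v = min(max(v, a_min), a_max)
            elif post_op is not None:
                v = POST_OPS[post_op](v)
            out_row.append(v)
        out.append(out_row)
    return out

# FC + relu fused: the negative pre-activation is clamped in one pass.
print(fused_fc([[1.0, -2.0]], [[1.0, 1.0], [0.5, -0.5]], [0.0, 0.0], "relu"))
# → [[0.0, 1.5]]
```

The benefit of the real fused kernel is that the output tensor is written once, instead of materializing the FC result and re-reading it for the element-wise op.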



Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, a description is added explaining what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at $PR_ID/$BUILD_ID/index.html
  • To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change


  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)


@pengzhao-intel pengzhao-intel added this to In progress in CPU Performance and Quantization via automation Aug 20, 2019

    return true;
  if (new_node.op() == Op::Get("clip")) {
    const ClipParam &param = nnvm::get<ClipParam>(new_node.attrs.parsed);
    if (param.a_min == 0.f && param.a_max == 1.0f) {


ZhennanQin Aug 20, 2019


Why does a_max have to be 1.0f? I don't think it's necessary.


ciyongch Aug 21, 2019

Author Contributor

Good catch, I'll remove the a_max check for bounded_relu.
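Assuming MKL-DNN's bounded-ReLU post-op computes min(max(x, 0), alpha), clip with a_min == 0 maps onto it for any upper bound, not only 1.0f, which is what makes the check removable. A quick pure-Python sanity check (hypothetical helper names):

```python
def clip(v, a_min, a_max):
    """mxnet-style element-wise clip."""
    return min(max(v, a_min), a_max)

def bounded_relu(v, alpha):
    """Assumed MKL-DNN bounded-ReLU semantics: min(max(v, 0), alpha)."""
    return min(max(v, 0.0), alpha)

# With a_min == 0, clip matches bounded_relu for ANY upper bound,
# so the a_max == 1.0f condition in the original check is unnecessary.
for a_max in (0.5, 1.0, 6.0):
    for v in (-2.0, 0.0, 0.3, 5.0, 10.0):
        assert clip(v, 0.0, a_max) == bounded_relu(v, a_max)
print("clip(x, 0, a_max) == bounded_relu(x, a_max) for all tested values")
```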



commented Aug 21, 2019

Is there performance data for fusion?

CPU Performance and Quantization automation moved this from In progress to Reviewer approved Aug 21, 2019

TaoLv approved these changes Aug 21, 2019

LGTM and merging now.

@pengzhao-intel pengzhao-intel merged commit 434f185 into apache:master Aug 22, 2019

12 checks passed

ci/jenkins/mxnet-validation/centos-cpu Job succeeded
ci/jenkins/mxnet-validation/centos-gpu Job succeeded
ci/jenkins/mxnet-validation/clang Job succeeded
ci/jenkins/mxnet-validation/edge Job succeeded
ci/jenkins/mxnet-validation/miscellaneous Job succeeded
ci/jenkins/mxnet-validation/sanity Job succeeded
ci/jenkins/mxnet-validation/unix-cpu Job succeeded
ci/jenkins/mxnet-validation/unix-gpu Job succeeded
ci/jenkins/mxnet-validation/website Job succeeded
ci/jenkins/mxnet-validation/windows-cpu Job succeeded
ci/jenkins/mxnet-validation/windows-gpu Job succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

CPU Performance and Quantization automation moved this from Reviewer approved to Done Aug 22, 2019
