This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Cherry-pick #18310 #18355 (#18608)

Merged: 4 commits, Jul 20, 2020

MoisesHer
Contributor

@MoisesHer MoisesHer commented Jun 23, 2020

Description

Fixes issue #18120 on MXNet v1.7.x

Cherry-pick #18310
Cherry-pick #18355
Cherry-pick #18713

* Fix cmake mkldnn install target. Previously mkldnn headers were installed to CMAKE_INSTALL_INCLUDEDIR instead of CMAKE_INSTALL_INCLUDEDIR/mkldnn

* Fix pypi_package.sh pip/setup.py for mkldnn builds
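
The install-target fix described above can be sketched roughly as follows. This is a minimal illustration and not the actual MXNet CMakeLists.txt: the variable name `MKLDNN_HEADER_DIR` and the source path are assumptions made for the example. Only `CMAKE_INSTALL_INCLUDEDIR` and the `mkldnn` subdirectory come from the description.

```cmake
# Sketch only: MKLDNN_HEADER_DIR and its path are hypothetical,
# chosen for illustration rather than taken from the MXNet build files.
include(GNUInstallDirs)  # defines CMAKE_INSTALL_INCLUDEDIR (usually "include")

# Assumed location of the mkldnn public headers in the source tree.
set(MKLDNN_HEADER_DIR "${CMAKE_CURRENT_SOURCE_DIR}/3rdparty/mkldnn/include")

# Install the headers into <prefix>/include/mkldnn instead of placing
# them directly in <prefix>/include.
install(
  DIRECTORY "${MKLDNN_HEADER_DIR}/"
  DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/mkldnn"
  FILES_MATCHING
    PATTERN "*.h"
    PATTERN "*.hpp"
)
```

The trailing slash on the `DIRECTORY` argument makes CMake copy the directory's contents rather than the directory itself, so the headers end up directly under `include/mkldnn`.
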
@MoisesHer MoisesHer requested a review from szha as a code owner June 23, 2020 01:14
@mxnet-bot

Hey @MoisesHer, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [miscellaneous, edge, windows-cpu, centos-cpu, sanity, centos-gpu, windows-gpu, website, unix-gpu, unix-cpu, clang]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

Member

@eric-haibin-lin eric-haibin-lin left a comment

@leezu is there a plan to fix the CI on 1.x/1.7.x branches?

@ChaiBapchya
Contributor

Can we rebase this cherry-pick? 1.7.x seems to get commits merged regularly, so CI shouldn't be an issue.

@ChaiBapchya
Contributor

@leezu @TaoLv @ciyongch gentle ping for help.
unix-gpu: Any idea why specifically the python3 GPU tests are failing while trying to add the mkl headers?
edge: Jetson build failure: "libmxnet.a(random_generator.cu.o): error adding symbols: File in wrong format". I haven't seen this before; maybe you have some idea?

@ciyongch
Contributor

Hi @ChaiBapchya, I took a look at the [unix-gpu] failure, which showed a TVM compilation error and a GPU OOM runtime error. It looks more like a CI-side issue. Can you try re-triggering the failed jobs?
This patch is meant to fix the header issue for the GPU binary, right?

@MoisesHer
Contributor Author

@mxnet-bot run ci [edge, unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [edge, unix-gpu]

@MoisesHer
Contributor Author

> Hi @ChaiBapchya, I took a look at the [unix-gpu] failure, which showed a TVM compilation error and a GPU OOM runtime error. It looks more like a CI-side issue. Can you try re-triggering the failed jobs?
> This patch is meant to fix the header issue for the GPU binary, right?

Thank you for taking a look. I triggered the jobs again, but they are still failing.
Yes, this is to fix the headers for the GPU binary.

@MoisesHer
Contributor Author

@mxnet-bot run ci [edge, unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [edge, unix-gpu]

@ChaiBapchya
Contributor

@MoisesHer there is no point in retriggering the edge pipeline. It's not a flaky issue.

I saw the same issue in my cherry-pick PR: #18742

Please update the CMake variable as in @leezu's #18713.
That should resolve the edge issue. I cherry-picked it in 1.x.
You should do the same for 1.7.x.

Thanks.

…18713)

CMAKE_CUDA_HOST_COMPILER will be reset if CMAKE_CUDA_COMPILER is not set as of cmake 3.17.3

See https://gitlab.kitware.com/cmake/cmake/-/issues/20826
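
As a rough illustration of the commit message above, a cross-compilation toolchain file that sets both CUDA variables might look like the sketch below. The compiler names and the `nvcc` path are assumptions for illustration and are not copied from `aarch64-linux-gnu-toolchain.cmake`. The only claim taken from the source is that `CMAKE_CUDA_COMPILER` must be set for `CMAKE_CUDA_HOST_COMPILER` to survive as of cmake 3.17.3.

```cmake
# Sketch of an aarch64 cross-compilation toolchain file with CUDA.
# Compiler names and the nvcc path below are assumed, not the real values.
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)

set(CMAKE_C_COMPILER aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)

# Set the CUDA compiler explicitly. Per the commit message, as of
# cmake 3.17.3 CMAKE_CUDA_HOST_COMPILER is reset when
# CMAKE_CUDA_COMPILER is not set.
set(CMAKE_CUDA_COMPILER /usr/local/cuda/bin/nvcc)
set(CMAKE_CUDA_HOST_COMPILER aarch64-linux-gnu-g++)
```

If the host compiler were reset to the native one, CUDA objects could be built for the wrong architecture, which would be consistent with the "File in wrong format" link error reported for the Jetson build, though the thread does not confirm that diagnosis.
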
Contributor

@ChaiBapchya ChaiBapchya left a comment

Nice to see it unblocked! LGTM! Thanks!

@ChaiBapchya
Contributor

@sandeep-krishnamurthy @leezu @ciyongch
This one fixes the missing MKLDNN headers. Please help review/merge.

@ciyongch
Contributor

Hi @ChaiBapchya @MoisesHer, please check the failed job. BTW, since this is for the binary release, I don't think it's a mandatory patch for the source release (the current release candidate is rc1). We might consider including this patch in the 1.7.0 release if there's an rc2; otherwise, this patch will be in the v1.7.x branch and the binary release. What do you think?

@ChaiBapchya
Contributor

@MoisesHer sorry about hitting that flaky test. Please retrigger the unix-cpu pipeline. Hopefully that will be the last retrigger for this PR.

> We might consider including this patch in the 1.7.0 release if there's an rc2; otherwise, this patch will be in the v1.7.x branch and the binary release.

@ciyongch sounds good to me.

@MoisesHer
Contributor Author

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@MoisesHer
Contributor Author

@mxnet-bot run ci [unix-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-cpu]

@MoisesHer
Contributor Author

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@sandeep-krishnamurthy sandeep-krishnamurthy merged commit d95de55 into apache:v1.7.x Jul 20, 2020
ChaiBapchya pushed a commit to ChaiBapchya/mxnet that referenced this pull request Jul 27, 2020
* cherry-pick: Fix missing MKLDNN headers (apache#18310)

* Include all mkldnn headers in CD builds (apache#18355)

* Fix cmake mkldnn install target. Previously mkldnn headers are installed to CMAKE_INSTALL_INCLUDEDIR instead of CMAKE_INSTALL_INCLUDEDIR/mkldnn

* Fix pypi_package.sh pip/setup.py for mkldnn builds

* Set CMAKE_CUDA_COMPILER in aarch64-linux-gnu-toolchain.cmake (apache#18713)

CMAKE_CUDA_HOST_COMPILER will be reset if CMAKE_CUDA_COMPILER is not set as of cmake 3.17.3

See https://gitlab.kitware.com/cmake/cmake/-/issues/20826

Co-authored-by: Leonard Lausen <lausen@amazon.com>
szha pushed a commit that referenced this pull request Aug 3, 2020
* Cherry-pick #18310 #18355 (#18608)

* cherry-pick: Fix missing MKLDNN headers (#18310)

* Include all mkldnn headers in CD builds (#18355)

* Fix cmake mkldnn install target. Previously mkldnn headers are installed to CMAKE_INSTALL_INCLUDEDIR instead of CMAKE_INSTALL_INCLUDEDIR/mkldnn

* Fix pypi_package.sh pip/setup.py for mkldnn builds

* Set CMAKE_CUDA_COMPILER in aarch64-linux-gnu-toolchain.cmake (#18713)

CMAKE_CUDA_HOST_COMPILER will be reset if CMAKE_CUDA_COMPILER is not set as of cmake 3.17.3

See https://gitlab.kitware.com/cmake/cmake/-/issues/20826

Co-authored-by: Leonard Lausen <lausen@amazon.com>

* remove linux-gputoolchain

Co-authored-by: MoisesHer <50716238+MoisesHer@users.noreply.github.com>
Co-authored-by: Leonard Lausen <lausen@amazon.com>
7 participants