Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Disable OpenMP offloading support for 3rdparty/openmp #17098

Merged
merged 2 commits into from Dec 23, 2019
Merged

Conversation

leezu
Copy link
Contributor

@leezu leezu commented Dec 17, 2019

Description

OpenMP offloading was introduced some time during the past two years and is
enabled by default. With upgrading 3rdparty/openmp in
#17012 it was unintentionally made part of the MXNet CMake build.

But we don't use OpenMP offloading and the Cuda target in the llvm OpenMP
Offloading build is broken in our setting. Thus disable it.

Due to bug in CMake < 3.13, this requires updating to CMake 3.13. See https://cmake.org/cmake/help/latest/policy/CMP0077.html

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Disable OpenMP offloading support for 3rdparty/openmp in CMake build, which was enabled by mistake
  • Upgrade the Docker images used for edge build to latest upstream version. (Temporarily) disable pinning.

OpenMP offloading was introduced some time during the past two years and is
enabled by default. With upgrading 3rdparty/openmp in
apache#17012
it was made part of the MXNet CMake build.

But we don't use OpenMP offloading and the Cuda target in the llvm OpenMP
Offloading build is broken in our setting.
@leezu leezu added the pr-awaiting-review PR is waiting for code review label Dec 18, 2019
@szha szha merged commit 5aa3a7a into apache:master Dec 23, 2019
haojin2 added a commit that referenced this pull request Dec 24, 2019
leezu added a commit to leezu/mxnet that referenced this pull request Dec 30, 2019
* Disable OpenMP offloading support for 3rdparty/openmp

OpenMP offloading was introduced some time during the past two years and is
enabled by default. With upgrading 3rdparty/openmp in
apache#17012
it was made part of the MXNet CMake build.

But we don't use OpenMP offloading and the Cuda target in the llvm OpenMP
Offloading build is broken in our setting.

* Update CMake on CI
leezu added a commit that referenced this pull request Dec 30, 2019
* Upgrade 3rdparty/openmp to release_90 version (#17012)

Fixes #10856

* Fix omp assert issue (#17039)

* Disable OpenMP offloading support for 3rdparty/openmp (#17098)

* Disable OpenMP offloading support for 3rdparty/openmp

OpenMP offloading was introduced some time during the past two years and is
enabled by default. With upgrading 3rdparty/openmp in
#17012
it was made part of the MXNet CMake build.

But we don't use OpenMP offloading and the Cuda target in the llvm OpenMP
Offloading build is broken in our setting.

* Update CMake on CI

* Fix reshape interoperability test (#17155)

* fix reshape interoperability test

* fix for scipy import

Co-authored-by: Chris Olivier <cjolivier01@gmail.com>
Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
@samskalicky
Copy link
Contributor

samskalicky commented Jan 7, 2020

@leezu im getting this error in my windows CI runs:

Traceback (most recent call last):
  File "ci/build_windows.py", line 273, in <module>
    sys.exit(main())
  File "ci/build_windows.py", line 261, in main
    windows_build(args)
  File "ci/build_windows.py", line 174, in windows_build
    datetime.timedelta(seconds=int(time.time() - t0))))
  File "C:\Python37\lib\tempfile.py", line 805, in __exit__
    self.cleanup()
  File "C:\Python37\lib\tempfile.py", line 809, in cleanup
    _shutil.rmtree(self.name)
  File "C:\Python37\lib\shutil.py", line 513, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Python37\lib\shutil.py", line 392, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Python37\lib\shutil.py", line 392, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Python37\lib\shutil.py", line 397, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\Python37\lib\shutil.py", line 395, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 5] Access is denied: 'C:\\Windows\\TEMP\\tmpuubj2i7u\\cmake-3.16.1-win64-x64\\bin\\cmake.exe'

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwindows-gpu/detail/PR-17204/11/pipeline/

I think this is coming from here:
https://github.com/apache/incubator-mxnet/blob/bd7eedf3a43279380750dfc54f081e32891620af/ci/build_windows.py#L148-L153
when the scope exits it must try and delete stuff that it doesnt have permission for

@leezu
Copy link
Contributor Author

leezu commented Jan 7, 2020

@samskalicky above lines are a temporary workaround until the Base Windows AMI used by CI is updated. @larroy may be able to give an estimate when it's possible to update.

@leezu
Copy link
Contributor Author

leezu commented Jan 7, 2020

The reason for using a tmpdir is that multiple jobs are executed on the same Windows CI server. In an earlier stage of this PR I used a fixed path but got a race condition during the cmake installation.

I'm not sure why above cleanup error occurs rarely.

@marcoabreu
Copy link
Contributor

Any reason you are not using the Jenkins workspace? That one is guaranteed to be exclusively to the current job

@leezu
Copy link
Contributor Author

leezu commented Jan 8, 2020

@marcoabreu no reason. In principle tempfile.TemporaryDirectory has similar guarantees. I didn't expect the problems reported by Sam.

@larroy
Copy link
Contributor

larroy commented Jan 13, 2020

We will remove that when we update CMake on windows, pending some issues with CMake & Visual studio

@TaoLv
Copy link
Member

TaoLv commented Feb 10, 2020

I'm still seeing the PermissionError. Here is an issue for it. #17558

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants