Update to CUDA 11.4.1 #7197


fwyzard
Contributor

@fwyzard fwyzard commented Aug 3, 2021

Update to CUDA 11.4.1 (11.4.20210728):

  • CUDA runtime version 11.4.108
  • NVIDIA drivers version 470.57.02
  • cuDNN version 8.2.2.26

Add support for GCC 11 and clang 12.

See https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html .

@cmsbuild
Contributor

cmsbuild commented Aug 3, 2021

A new Pull Request was created by @fwyzard (Andrea Bocci) for branch IB/CMSSW_12_1_X/master.

@cmsbuild, @smuzaffar, @mrodozov, @iarspider can you please review it and eventually sign? Thanks.
@silviodonato, @dpiparo, @qliphy, @perrotta you are the release manager for this.
cms-bot commands are listed here

@fwyzard
Contributor Author

fwyzard commented Aug 3, 2021

please test

@fwyzard
Contributor Author

fwyzard commented Aug 3, 2021

I expect building CMSSW may fail, so let's check that before testing on other architectures.

@fwyzard
Contributor Author

fwyzard commented Aug 3, 2021

However, it might be worth checking a GCC 11 build in parallel.

@fwyzard
Contributor Author

fwyzard commented Aug 3, 2021

please test for CMSSW_12_1_X/slc7_amd64_gcc11

@cmsbuild
Contributor

cmsbuild commented Aug 3, 2021

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17474/summary.html
COMMIT: 75efd10
CMSSW: CMSSW_12_1_X_2021-08-02-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7197/17474/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

+ for FILE in '$FILES'
++ basename src/common.cpp
+ /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/cuda/11.4.1-274998c7f34eda4ce17dd26ec4ac9687/bin/nvcc -DALPAKA_ACC_GPU_CUDA_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -DALPAKA_DEBUG=0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/cuda/11.4.1-274998c7f34eda4ce17dd26ec4ac9687/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/tbb/v2021.3.0-13eaf94bcafc2deaec6244d3257cd1bc/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/boost/1.75.0-4f799e0d654b83bad9b3c6c2ddd3197e/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include -Iinclude -std=c++17 -O3 --generate-line-info --source-in-ptx --display-error-number --expt-relaxed-constexpr --extended-lambda -gencode 'arch=compute_60,code=[sm_60,compute_60]' -gencode 'arch=compute_70,code=[sm_70,compute_70]' -gencode 'arch=compute_75,code=[sm_75,compute_75]' -Wno-deprecated-gpu-targets -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored --cudart shared -Xcompiler '-std=c++17 -O2 -pthread -fPIC -Wall -Wextra' -x cu -c src/common.cpp -o build/cuda/common.cpp.o
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include/alpaka/event/EventGenericThreads.hpp: In instantiation of 'void alpaka::traits::generic::currentThreadWaitForDevice(const TDev&) [with TDev = alpaka::DevCpu]':
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include/alpaka/dev/cpu/Wait.hpp:33:40:   required from here
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include/alpaka/event/EventGenericThreads.hpp:280:20: error: '__T30' was not declared in this scope
280 |                 auto vQueues(dev.getAllQueues());
|                 ~~~^~~~~~~~~~~~~~~~~~~~~
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.v2wR40 (%build)




@cmsbuild
Contributor

cmsbuild commented Aug 3, 2021

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17477/summary.html
COMMIT: 75efd10
CMSSW: CMSSW_12_1_X_2021-08-02-1100/slc7_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7197/17477/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

+ for FILE in '$FILES'
++ basename src/common.cpp
+ /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/cuda/11.4.1-cb1437ae9d2977d557f82280af898abe/bin/nvcc -DALPAKA_ACC_GPU_CUDA_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -DALPAKA_DEBUG=0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/cuda/11.4.1-cb1437ae9d2977d557f82280af898abe/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/tbb/v2021.3.0-1a57fe4de5dfa06c29ac0428de2ef8c3/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/boost/1.75.0-9158d15931ebbfc4a7cd9cab205ad21f/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include -Iinclude -std=c++17 -O3 --generate-line-info --source-in-ptx --display-error-number --expt-relaxed-constexpr --extended-lambda -gencode 'arch=compute_60,code=[sm_60,compute_60]' -gencode 'arch=compute_70,code=[sm_70,compute_70]' -gencode 'arch=compute_75,code=[sm_75,compute_75]' -Wno-deprecated-gpu-targets -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored --cudart shared -Xcompiler '-std=c++17 -O2 -pthread -fPIC -Wall -Wextra' -x cu -c src/common.cpp -o build/cuda/common.cpp.o
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include/alpaka/event/EventGenericThreads.hpp: In instantiation of 'void alpaka::traits::generic::currentThreadWaitForDevice(const TDev&) [with TDev = alpaka::DevCpu]':
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include/alpaka/dev/cpu/Wait.hpp:33:36:   required from here
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include/alpaka/event/EventGenericThreads.hpp:280:20: error: '__T30' was not declared in this scope
280 |                 auto vQueues(dev.getAllQueues());
|                 ~~~^~~~~~~~~~~~~~~~~~~~~
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.ZSflaL (%build)




@fwyzard
Contributor Author

fwyzard commented Aug 3, 2021 via email

@smuzaffar
Contributor

please test
Alpaka 0.6.1 is merged

@cmsbuild
Contributor

cmsbuild commented Aug 3, 2021

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17492/summary.html
COMMIT: 75efd10
CMSSW: CMSSW_12_1_X_2021-08-03-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7197/17492/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

+ chmod -Rf a+rX,u+w,g-w,o-w .
+ '[' 11.4 '!=' 11.2 ']'
+ echo 'Incompatible CUDA version in cudnn.spec!'
Incompatible CUDA version in cudnn.spec!
+ exit 1
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.x4gVyd (%prep)


RPM build errors:
Macro %rpmbuild_libdir defined but not used within scope
Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.x4gVyd (%prep)
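
The trace above shows the `%prep` step comparing two major.minor versions (`11.4` and `11.2`) and aborting on a mismatch, presumably because the cuDNN package was still pinned to the previous CUDA release. A minimal Python sketch of that kind of guard follows. The function names and the way the versions are passed in are assumptions made for illustration; this is not the actual content of `cudnn.spec`.

```python
def major_minor(version: str) -> str:
    """Return the 'major.minor' prefix of a dotted version string."""
    parts = version.split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"not a dotted version: {version!r}")
    return ".".join(parts[:2])


def check_cuda_compat(toolkit_version: str, expected_major_minor: str) -> None:
    """Raise if the toolkit's major.minor differs from the expected one.

    Hypothetical helper: it mirrors the string comparison seen in the
    trace, where '11.4' was compared against '11.2'.
    """
    found = major_minor(toolkit_version)
    if found != expected_major_minor:
        raise RuntimeError(
            f"Incompatible CUDA version: found {found}, "
            f"expected {expected_major_minor}"
        )
```

Calling `check_cuda_compat("11.4.1", "11.2")` raises, which corresponds to the `exit 1` in the log; bumping the expected value to `"11.4"` lets the check pass.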


@fwyzard
Contributor Author

fwyzard commented Aug 3, 2021

please test

@cmsbuild
Contributor

cmsbuild commented Aug 3, 2021

Pull request #7197 was updated.

@fwyzard
Contributor Author

fwyzard commented Aug 3, 2021

please test for CMSSW_12_1_X/slc7_amd64_gcc11

@fwyzard force-pushed the IB/CMSSW_12_1_X/master_cuda_11.4.1 branch from c238307 to f3af69f on August 3, 2021 22:27
@smuzaffar
Contributor

please test with #7204 for slc7_aarch64_gcc9

@smuzaffar
Contributor

please test with #7204 for CMSSW_12_1_X/slc7_pcc64le_gcc9

@smuzaffar
Contributor

please test with #7204 for CMSSW_12_1_X/slc7_ppc64le_gcc9

@fwyzard
Contributor Author

fwyzard commented Aug 5, 2021

@smuzaffar, keep in mind that we do not want to merge CUDA 11.4.x into the GCC 9 and GCC 10 builds.
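
That constraint can be written as a small predicate over the architecture strings used in this thread (`slc7_amd64_gcc900`, `slc7_amd64_gcc11`, `slc7_ppc64le_gcc9`, `slc7_aarch64_gcc9`). The Python sketch below is hypothetical: the `os_cpu_compiler` split and, in particular, the rule that a three-digit suffix such as `gcc900` encodes major, minor and patch digits are assumptions inferred from those names, not a documented convention.

```python
import re


def gcc_major(arch: str) -> int:
    """Extract the GCC major version from a string like 'slc7_amd64_gcc11'."""
    match = re.fullmatch(r"[^_]+_[^_]+_gcc(\d+)", arch)
    if match is None:
        raise ValueError(f"unrecognised architecture string: {arch!r}")
    digits = match.group(1)
    # Assumption: three digits (e.g. '900') are major, minor, patch;
    # one or two digits (e.g. '9', '11') are the major version itself.
    return int(digits[0]) if len(digits) == 3 else int(digits)


def cuda_11_4_allowed(arch: str) -> bool:
    """True unless the build uses GCC 9 or GCC 10, as requested in the thread."""
    return gcc_major(arch) not in (9, 10)
```

Under these assumptions, `cuda_11_4_allowed("slc7_amd64_gcc11")` is true, while the `gcc9` and `gcc900` architectures are excluded.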

@cmsbuild
Contributor

cmsbuild commented Aug 5, 2021

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17573/summary.html
COMMIT: f3af69f
CMSSW: CMSSW_12_1_X_2021-08-04-1100/slc7_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7197/17573/install.sh to create a dev area with all the needed externals and cmssw changes.

Build

I found compilation error when building:

Entering library rule at CUDADataFormats/StdDictionaries
>> Building LCG reflex dict from header file src/CUDADataFormats/StdDictionaries/src/classes.h
>> Compiling LCG dictionary: tmp/slc7_amd64_gcc11/src/CUDADataFormats/StdDictionaries/src/CUDADataFormatsStdDictionaries/a/CUDADataFormatsStdDictionaries_xr.cc
>> Building  shared library tmp/slc7_amd64_gcc11/src/CUDADataFormats/StdDictionaries/src/CUDADataFormatsStdDictionaries/libCUDADataFormatsStdDictionaries.so
/cvmfs/cms-ib.cern.ch/nweek-02692/slc7_amd64_gcc11/external/gcc/11.1.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/11.1.1/../../../../x86_64-unknown-linux-gnu/bin/ld: cannot find -lHeterogeneousCoreCUDAUtilities
collect2: error: ld returned 1 exit status
gmake: *** [tmp/slc7_amd64_gcc11/src/CUDADataFormats/StdDictionaries/src/CUDADataFormatsStdDictionaries/libCUDADataFormatsStdDictionaries.so] Error 1
Leaving library rule at CUDADataFormats/StdDictionaries
>> Leaving Package CUDADataFormats/StdDictionaries
>> Package CUDADataFormats/StdDictionaries built
>> Entering Package CondFormats/DQMObjects


@cmsbuild
Contributor

cmsbuild commented Aug 6, 2021

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17595/summary.html
COMMIT: f3af69f
CMSSW: CMSSW_12_1_X_2021-08-04-2300/slc7_ppc64le_gcc9
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7197/17595/install.sh to create a dev area with all the needed externals and cmssw changes.

Build

I found compilation error when building:

/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/Core/Solve.h(72): warning #20011-D: calling a __host__ function("Eigen::PartialPivLU< ::Eigen::Matrix > ::cols() const") from a __host__ __device__ function("Eigen::Solve< ::Eigen::PartialPivLU< ::Eigen::Matrix > ,  ::Eigen::CwiseNullaryOp< ::Eigen::internal::scalar_identity_op ,  ::Eigen::Matrix > > ::rows const") is not allowed

/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed

/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): error: identifier "Eigen::fix<(int)-1> " is undefined in device code

/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed

/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): error: identifier "Eigen::fix<(int)-1> " is undefined in device code



@cmsbuild
Contributor

cmsbuild commented Aug 6, 2021

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17594/summary.html
COMMIT: f3af69f
CMSSW: CMSSW_12_1_X_2021-08-05-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7197/17594/install.sh to create a dev area with all the needed externals and cmssw changes.

Build

I found compilation error when building:

/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/Core/Solve.h(72): warning #20011-D: calling a __host__ function("Eigen::PartialPivLU< ::Eigen::Matrix > ::cols() const") from a __host__ __device__ function("Eigen::Solve< ::Eigen::PartialPivLU< ::Eigen::Matrix > ,  ::Eigen::CwiseNullaryOp< ::Eigen::internal::scalar_identity_op ,  ::Eigen::Matrix > > ::rows const") is not allowed

/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed

/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): error: identifier "Eigen::fix<(int)-1> " is undefined in device code

/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed

/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): error: identifier "Eigen::fix<(int)-1> " is undefined in device code



@cmsbuild
Contributor

cmsbuild commented Aug 6, 2021

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17598/summary.html
COMMIT: f3af69f
CMSSW: CMSSW_12_1_X_2021-08-05-2300/slc7_aarch64_gcc9
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/7197/17598/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17598/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17598/git-merge-result

Build

I found compilation error when building:

/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/Core/Solve.h(72): warning #20011-D: calling a __host__ function("Eigen::PartialPivLU< ::Eigen::Matrix > ::cols() const") from a __host__ __device__ function("Eigen::Solve< ::Eigen::PartialPivLU< ::Eigen::Matrix > ,  ::Eigen::CwiseNullaryOp< ::Eigen::internal::scalar_identity_op ,  ::Eigen::Matrix > > ::rows const") is not allowed

/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed

/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): error: identifier "Eigen::fix<(int)-1> " is undefined in device code

/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed

/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): error: identifier "Eigen::fix<(int)-1> " is undefined in device code
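
The three build reports above (ppc64le, amd64, aarch64) appear to repeat the same Eigen diagnostics under different build prefixes. A minimal Python sketch for checking that follows. It parses lines of the form `path(line): severity: message`, as seen in these logs, and keys each diagnostic by its path relative to `include/`. The regular expression and the `include/` cut are assumptions based only on the excerpts in this thread, not a general grammar for nvcc output.

```python
import re
from typing import Iterable, NamedTuple, Optional, Set


class Diagnostic(NamedTuple):
    path: str       # path relative to the last '/include/' component
    line: int
    severity: str   # 'error' or 'warning'
    message: str


# Matches e.g. '.../PartialPivLU.h(412): error: ...' and
# '.../Solve.h(72): warning #20011-D: ...'
_PATTERN = re.compile(
    r"^(?P<path>\S+)\((?P<line>\d+)\): "
    r"(?P<severity>error|warning)(?: #[\w-]+)?: (?P<message>.*)$"
)


def parse_diagnostic(text: str) -> Optional[Diagnostic]:
    """Parse one log line, or return None if it is not a diagnostic."""
    m = _PATTERN.match(text.strip())
    if m is None:
        return None
    # Drop the machine-specific build prefix so logs can be compared.
    path = m.group("path").rsplit("/include/", 1)[-1]
    return Diagnostic(path, int(m.group("line")),
                      m.group("severity"), m.group("message"))


def unique_errors(log_lines: Iterable[str]) -> Set[Diagnostic]:
    """Collect distinct error diagnostics, ignoring warnings."""
    found = set()
    for text in log_lines:
        diag = parse_diagnostic(text)
        if diag is not None and diag.severity == "error":
            found.add(diag)
    return found
```

If the excerpts are representative, feeding the lines from all three reports into `unique_errors` should leave only the `PartialPivLU.h` errors at lines 412 and 422, which would suggest a single underlying problem rather than three architecture-specific ones.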



@smuzaffar
Contributor

Let's get this in for GCC 11.
