Virtual cuda #29782

Closed
wants to merge 5 commits into from

haampie
Member

haampie commented Mar 29, 2022

With this PR there are three packages providing CUDA: cuda-toolkit, nvhpc and nvhpc-slim.

  1. cuda-toolkit: originally the cuda package; it simply "forwards" cuda
    versions
  2. nvhpc: provides multiple cuda versions as a version list, corresponding to
    those from cuda-toolkit based on the matching version.{json,txt} files
  3. nvhpc-slim: same as nvhpc, except that it provides only a single "latest"
    cuda to reduce bandwidth and storage, which makes it easier to work with
    compat bounds like cuda@11.2:

This PR basically combines #29550 and #29155 (but provides cuda unconditionally) with #29742, and fixes #19365 for the most part.

As an example:

$ spack spec spfft +cuda ^cuda@:11.4 ^cuda-toolkit
    ^cuda-toolkit@11.4.4%gcc@11.2.0~allow-unsupported-compilers~dev arch=linux-ubuntu20.04-zen2

$ spack spec spfft +cuda ^cuda@:11.4 ^nvhpc-slim
    ^nvhpc-slim@21.9%gcc@11.2.0+blas+lapack~mpi install_type=single arch=linux-ubuntu20.04-zen2

$ spack spec spfft +cuda ^cuda@:11.4 ^nvhpc
    ^nvhpc@22.3%gcc@11.2.0+blas+lapack~mpi install_type=single arch=linux-ubuntu20.04-zen2

To me, the concretization makes sense for cuda-toolkit and nvhpc-slim. For nvhpc, I'm not entirely convinced this behavior is great: it picks the newest nvhpc because that release happens to provide an old 10.x cuda toolkit. The build system likely doesn't know how to pick that toolkit up, though, so some work may be needed to get the correct nvcc in the PATH (see e.g. https://cmake.org/cmake/help/latest/module/FindCUDAToolkit.html#search-behavior).
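
To illustrate why the three specs above resolve differently, here is a minimal Python sketch of "pick the newest provider that offers any cuda within the bound". It is not Spack's concretizer, only a toy model of the outcome. The bundled CUDA versions in `NVHPC` are illustrative assumptions (not taken from this PR), `newest_provider` is a hypothetical helper, and versions are simplified to `(major, minor)` tuples.

```python
# Toy model of provider selection for a bounded virtual dependency.
# ASSUMPTION: the CUDA versions listed per nvhpc release are illustrative only.
NVHPC = {
    (21, 9): [(10, 2), (11, 0), (11, 4)],
    (22, 3): [(10, 2), (11, 0), (11, 6)],
}

# nvhpc-slim (as described above) provides only the single latest cuda.
NVHPC_SLIM = {release: [max(cudas)] for release, cudas in NVHPC.items()}


def newest_provider(table, cuda_max):
    """Return the newest release providing at least one cuda <= cuda_max.

    Hypothetical helper: mimics the preference for the newest provider
    version, given an upper bound such as cuda@:11.4.
    """
    candidates = [
        release
        for release, cudas in table.items()
        if any(cuda <= cuda_max for cuda in cudas)
    ]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    bound = (11, 4)  # models the constraint ^cuda@:11.4
    print("nvhpc     ->", newest_provider(NVHPC, bound))
    print("nvhpc-slim ->", newest_provider(NVHPC_SLIM, bound))
```

Under these assumed tables, the multi-CUDA package resolves to its newest release because an old bundled toolkit satisfies the bound, while the slim package falls back to the newest release whose single cuda fits. That mirrors the nvhpc@22.3 versus nvhpc-slim@21.9 results shown in the specs.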

haampie and others added 4 commits March 30, 2022 01:21
The still rather big but slightly smaller single CUDA providing version
of nvhpc.

Co-authored-by: Wileam Y. Phan <wphan@vols.utk.edu>
Giant package that provides multiple versions of CUDA.

Co-authored-by: Wileam Y. Phan <wphan@vols.utk.edu>
Rename cuda to cuda-toolkit and make it provide cuda.

Co-authored-by: Wileam Y. Phan <wphan@vols.utk.edu>
Co-authored-by: Wileam Y. Phan <wphan@vols.utk.edu>
@haampie
Member Author

haampie commented Mar 30, 2022

@spackbot run pipeline

@spackbot-app

spackbot-app bot commented Mar 30, 2022

I've started that pipeline for you!

@fspiga
Contributor

fspiga commented Mar 31, 2022

I am drafting a long comment on this PR. Stay tuned.

@haampie
Member Author

haampie commented Mar 31, 2022

Maybe it's better to have the discussion in #19365, since this is "just" one way to close it (and nobody is subscribed here).

@samcmill
Contributor

samcmill commented Apr 1, 2022

Adding the new nvhpc-slim package is only indirectly related to virtualizing the CUDA dependency. I would suggest splitting that out into a separate PR.

Regarding nvhpc-slim, how do you see package maintainers setting their dependencies on the NVIDIA compilers? Would they use nvhpc or nvhpc-slim, or would they be expected to support both?

@tldahlgren tldahlgren requested review from samcmill and vkallesh and removed request for samcmill April 26, 2022 23:36
@vkallesh
Contributor

The message below appeared when I tried to execute the following command:

--snippet--

CMake Error at /home/amd/spack/opt/spack/linux-ubuntu20.04-zen2/gcc-11.2.0/cmake-3.22.3-dvpfzzxzvjfzi67jmz6qfyi723bced5j/share/cmake-3.22/Modules/CMakeDetermineCUDACompiler.cmake:179 (message):
Failed to find nvcc.

Compiler requires the CUDA toolkit. Please set the CUDAToolkit_ROOT
variable.
--snippet--

Command executed:
spack install spfft +cuda ^cuda@:11.4 ^nvhpc@22.3%gcc@11.2.0+blas+lapack~mpi install_type=single arch=linux-ubuntu20.04-zen2

Please guide me on how to resolve the above issue.

@tgamblin tgamblin self-requested a review May 11, 2022 23:28
@wyphan
Contributor

wyphan commented May 19, 2022

@vkallesh Sorry for the late response, but would you mind creating a separate issue for this? You can pick "Build error" for the issue type, and the affected package is spfft.

@wyphan wyphan mentioned this pull request May 19, 2022
@wyphan
Contributor

wyphan commented May 19, 2022

Per @haampie's suggestion, I've taken over this PR with #30748.

@wyphan wyphan closed this May 19, 2022
Development

Successfully merging this pull request may close these issues.

Alternate CUDA provider
5 participants