
Add host compiler compatibility check to CUDA package. #24540

Closed
wants to merge 1 commit

Conversation

romerojosh
Contributor

nvcc officially supports only specific ranges of host compilers, and the supported ranges vary by CUDA toolkit version. Currently, Spack does not consider these host compiler restrictions during concretization. This can result in cases where users go through a long build process, only to find out later that the host compiler they are using is not supported by the CUDA toolkit version they've installed.

This PR introduces code into the CUDA package to add conflict statements to enforce host compiler dependencies based on CUDA version. To facilitate this, I've added a _supported_compilers dictionary that contains information on supported compilers based on CUDA version. I parsed this data directly from the ${CUDA_ROOT}/include/crt/host_config.h header for each CUDA version listed (as old as 8.0).
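
For illustration, here is a minimal sketch of how such a mapping could be laid out. This is not the code from the PR: the layout is an assumption, and only the gcc ranges for CUDA 10.0 and 10.2 are grounded in the conflict messages shown further down.

    # Hypothetical layout of a _supported_compilers-style mapping, keyed by CUDA
    # version, giving the supported host compiler version range per compiler.
    # Only the gcc ranges for 10.0 and 10.2 below match the errors shown later;
    # everything else about this snippet is illustrative.
    _supported_compilers = {
        '10.0': {'gcc': ':7'},   # CUDA 10.0 supports gcc up to and including 7
        '10.2': {'gcc': ':8'},   # CUDA 10.2 supports gcc up to and including 8
        # ... further CUDA versions and compilers, parsed from crt/host_config.h
    }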

It appears that the conflicts statement does not accept syntax like conflicts('^%gcc@:10', when='%gcc') (i.e. conflicts with gcc versions greater than 10 when building with gcc). As a result, I had to add a helper function invert_support_entry to effectively perform this NOT operation and then create conflicts using the inverted ranges. See the comment above the function for examples. If there is a better way to accomplish this without requiring this type of code, please let me know.
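
As a rough sketch of that NOT operation (the helper in the PR works on Spack's version machinery; the name is taken from the description above, but the behavior shown here on plain 'lo:hi' strings is an assumption):

    def invert_support_entry(supported):
        """Invert a simple 'lo:hi' version-range string into the range(s)
        outside it, e.g. ':7' -> ['8:'] and '5:8' -> [':4', '9:'].
        Sketch only: assumes integer versions on either side of the colon."""
        lo, _, hi = supported.partition(':')
        outside = []
        if lo:  # anything below the supported lower bound conflicts
            outside.append(':{0}'.format(int(lo) - 1))
        if hi:  # anything above the supported upper bound conflicts
            outside.append('{0}:'.format(int(hi) + 1))
        return outside

    # A supported gcc range of ':7' for CUDA 10.0 thus inverts to ['8:'],
    # which becomes a directive like conflicts('%gcc@8:', when='@10.0').
    print(invert_support_entry(':7'))   # ['8:']
    print(invert_support_entry('5:8'))  # [':4', '9:']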

With these conflicts in place, the build will now error out during concretization. For example, trying to install CUDA 10.0 in a container with GCC 9 will now yield the following:

$ spack install cuda@10.0.130
==> Error: Conflicts in concretized spec "cuda@10.0.130%gcc@9.3.0~dev arch=linux-ubuntu20.04-broadwell/3flnvxz"
List of matching conflicts for spec:

    cuda@10.0.130%gcc@9.3.0~dev arch=linux-ubuntu20.04-broadwell

1. "%gcc@8:" conflicts with "cuda@10.0" [gcc version is not within supported range (:7) for CUDA 10.0.130.]

I'm not sure why the error reports %gcc@8, but maybe that is expected? On the other hand, when trying to install CUDA 10.2, the reported error shows %gcc@9, which is more accurate:

$ spack install cuda@10.2.89
==> Error: Conflicts in concretized spec "cuda@10.2.89%gcc@9.3.0~dev arch=linux-ubuntu20.04-broadwell/367rjfk"
List of matching conflicts for spec:

    cuda@10.2.89%gcc@9.3.0~dev arch=linux-ubuntu20.04-broadwell
        ^libxml2@2.9.10%gcc@9.3.0~python arch=linux-ubuntu20.04-broadwell
            ^libiconv@1.16%gcc@9.3.0 arch=linux-ubuntu20.04-broadwell
            ^pkgconf@1.7.4%gcc@9.3.0 arch=linux-ubuntu20.04-broadwell
            ^xz@5.2.5%gcc@9.3.0~pic libs=shared,static arch=linux-ubuntu20.04-broadwell
            ^zlib@1.2.11%gcc@9.3.0+optimize+pic+shared arch=linux-ubuntu20.04-broadwell

1. "%gcc@9:" conflicts with "cuda@10.2" [gcc version is not within supported range (:8) for CUDA 10.2.89.]

CC: @bvanessen

@alalazo
Member

alalazo left a comment


Pinging @Rombur and @ax3l for a review. For information, there's a PR trying to relax constraints on CUDA packages that has some overlap with this one (see #19736), so I'm wondering whether those two should be worked on together.

In any case, I think that if we add constraints to Cuda in the same way we do for CudaPackage, most of the code should be factored into the same place rather than duplicated.

@alalazo alalazo self-assigned this Jun 28, 2021
@alalazo alalazo added the cuda label Jun 28, 2021
@romerojosh
Contributor Author

Thanks for the comment, @alalazo, and for sharing the info on #19736. The benefit of having the host compiler conflict code in Cuda rather than CudaPackage is that it will error out during concretization rather than during the build. This would mean that the code to respect the allow-unsupported-compilers option (and bypass these conflicts) would need to be added to Cuda as well. I guess the trick is that if the Cuda package is installed with +allow-unsupported-compilers, that information needs to be propagated to CudaPackage so that the correct flag is passed to nvcc for subsequent builds.
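
One way this could look (an assumed sketch, not code from this PR) is to expose the bypass as a variant on the cuda package itself and gate each generated conflict on it, so that +allow-unsupported-compilers skips the concretization-time check:

    # Illustrative only: directive names follow Spack's package DSL, but the
    # variant name and the exact conflict shown here are assumptions.
    class Cuda(Package):
        variant('allow-unsupported-compilers', default=False,
                description='Skip host compiler compatibility checks')

        # The conflict applies only when the variant is off, e.g. for CUDA 10.0:
        conflicts('%gcc@8:', when='@10.0.130 ~allow-unsupported-compilers',
                  msg='gcc version is not within supported range (:7) for CUDA 10.0.130.')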

@ax3l
Member

ax3l commented Jun 28, 2021

Yes, maybe we can even unify the two files so we don't need to maintain this in two places with each new release? :)

@alalazo
Member

alalazo commented May 10, 2022

Can we close this?

@alecbcs
Member

alecbcs commented Nov 19, 2023

Closing due to no response. @romerojosh happy to reopen if you'd like to continue working on this PR.

@alecbcs alecbcs closed this Nov 19, 2023