New shared compiler specs #15

Closed · 19 of 20 tasks
davidbeckingsale opened this issue Nov 14, 2022 · 27 comments

@davidbeckingsale commented Nov 14, 2022

TOSS3 -> Now TOSS4 (ruby)

Before:

  • intel/19.1.2 w/ gcc 8.3.1 toolchain
  • intel/oneapi.2022.3
  • clang/12.0.1 w/ gcc 8.3.1 toolchain
  • clang/14.0.4
  • gcc/8.3.1
  • gcc/10.2.1

-> Now:

  • intel/19.1.2 w/ gcc 8.5.0 toolchain
  • intel/oneapi.2022.3 (not available on the system)
  • clang/14.0.6
  • gcc/8.5.0
  • gcc/10.3.1

TOSS4 (corona)

  • rocm/5.4.1

TOSS4 Cray (EAS3)

  • rocm/5.4.2
  • cce-tce/15.0.0c (CPU only)

blueos

  • clang/12.0.1 w/ gcc 8.3.1 toolchain + cuda/10.1.243 (old, but system default and is being used)
  • clang/12.0.1 ^cuda/11.2.0+allow-unsupported-compilers
  • clang/14.0.5 w/ gcc 8.3.1 toolchain ^cuda/11.7+allow-unsupported-compilers
  • xl/2022.08.19 + cuda/11.2.0
  • xl/2022.08.19 + cuda/11.7.0
  • gcc/8.3.1 + cuda/11.7.0
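
For reference, a couple of the blueos entries above would translate into shared Spack specs roughly as follows (a sketch only; the package name, the cuda_arch value, and the exact variants are assumptions, not part of the list):

```
# illustrative spec-syntax sketch -- "umpire" and cuda_arch=70 are placeholder assumptions
umpire+cuda cuda_arch=70 %clang@12.0.1 ^cuda@11.2.0+allow-unsupported-compilers
umpire+cuda cuda_arch=70 %gcc@8.3.1 ^cuda@11.7.0
```
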
@rhornung67

Adding most recent info I have from several code projects.

@rhornung67 commented Jan 20, 2023

@davidbeckingsale @adrienbernede I updated the issue that was already here based on our recent DevOps survey of 32 code projects.

Note that we will need to add some things to the radiuss spack configs to support these.

rhornung67 changed the title from "Updated Compiler Specs" to "New shared compiler specs" on Jan 20, 2023
@MrBurmark

I believe that rocmcc-tce/5.4.1-cce-15.0.0c is really only using "cce" for Fortran. So it might be more realistic to use it instead of rocm/5.4.1 if we have any Fortran.

@MrBurmark

What about cuda 10.1.243?

@rhornung67

I would hope that they would update the default from that. I think all code projects are using 11.x at this point.

@MrBurmark

I think all code projects can use 11.x at this point, but that doesn't mean they are all using 11.x.

@adrienbernede

@rhornung67 should I use the IBM clang or the regular clang on blueos?
I ask because we used to have only the IBM clang in the lassen specs.

@adrienbernede

@rhornung67 what’s the cuda version you’d like to run with clang 14? That’s this entry:
clang 14.0.5 w/ gcc 8.3.1 toolchain + cuda/11.

@adrienbernede commented Jan 27, 2023

I just had to search to find out which xl version 2022.08.19 corresponds to. I suggest we move forward with this and name the spec xl@16.1.1.12.2022.08.19, which will be more accurate.

@adrienbernede

Fixed: rocm@5.4.2 not yet available on corona

adrienbernede added a commit that referenced this issue Jan 27, 2023
adrienbernede added a commit to LLNL/RAJA that referenced this issue Jan 27, 2023
@rhornung67

> @rhornung67 what’s the cuda version you’d like to run with clang 14? that’s here clang 14.0.5 w/ gcc 8.3.1 toolchain + cuda/11.

Sorry. CUDA 11.7

@rhornung67

> @rhornung67 should I use the ibm clang or regular clang on blueos. I ask because we used to have only ibm clang in the lassen specs.

We should probably use the IBM clang because that's what code projects use. Unfortunately, IBM clang fails at runtime on RAJA sort tests with OpenMP, while all regular clang versions work. I've reported this issue to LC and IBM, but have not yet been able to generate a reproducer that does not use all of RAJA. We can filter out those tests to make sure everything else is working.

@adrienbernede

@rhornung67 I got:

Error: raja '%clang@12:' conflicts with '+cuda ^cuda@:11.4.0~allow-unsupported-compilers'

That was when trying to use clang 12 with cuda 11.2; I need to increase the cuda version to 11.5.
IIRC, this is Spack strictly enforcing the cuda recommendations. @davidbeckingsale may confirm.
If codes really use clang 12 with cuda 11.2... can we have them move to 11.5?

@rhornung67

Sorry. My mistake transcribing information from my notes to this issue. Please do clang 12 + CUDA 11.5.

Clang 12 works with CUDA 10.1.243, correct? That is what some code projects reported using.

@adrienbernede

In fact, there was an error in my file, and the actual cuda used was 10.1.243...
So... if there was an error in your notes, that’s pure luck we found it!

@MrBurmark

My understanding is that we're on cuda 10.1 with clang 12 and trying to move to cuda 11.1 or 11.2 but we can't move past that due to compatibility issues with other software.

@adrienbernede

@MrBurmark, I can try 11.2, but from the error message above, Spack won’t let me.
So either you are not building the stack with Spack, or the cuda package was patched to remove those constraints. David did that in the past, IIRC.

@adrienbernede

(I’m not saying any of those is wrong, I’m just trying to see the bigger picture)

@adrienbernede

Same with clang@14: cuda in Spack requires a version strictly greater than 11.7.

raja '%clang@14:' conflicts with '+cuda ^cuda@:11.7~allow-unsupported-compilers'

@adrienbernede

I’ll take a pragmatic approach and activate "allow-unsupported-compilers" for now.

@rhornung67

I've been told by multiple users that they are building with clang 14 + cuda 11.7. None of our integrated apps that have a Fortran package can use a CUDA version greater than 11.7 due to an incompatibility between the OpenMP and CUDA runtimes.
Do you know how the Spack version constraints are formed?

What does the allow-unsupported-compilers thing do?

I just built RAJA manually and ran all its tests with clang/13.0.1-gcc-8.3.1 + cuda/11.7.0. Let's go with that. We may be able to use clang/14.0.5 by manually setting the gcc toolchain to 8.3.1, but I haven't tried.

@adrienbernede

In the Spack CUDA "build system" (not the package) you’ll find all the compiler conflicts. That’s in lib/spack/spack/build_systems/cuda.py.
All those conflicts are ignored when the +allow-unsupported-compilers variant is passed.
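
For illustration, the declarations in that file look roughly like the sketch below (not the verbatim Spack source; the exact imports, version bounds, and full conflict list depend on the Spack release, and the two conflicts shown are simply the ones from the errors quoted in this thread):

```python
# Illustrative sketch of lib/spack/spack/build_systems/cuda.py -- not the verbatim source.
from spack.package import *  # Spack package API (variant(), conflicts(), ...); exact imports vary by release


class CudaPackage(PackageBase):
    """Mixin base class for packages that can build with CUDA."""

    # Escape hatch: when this variant is set, the conflicts below no longer apply.
    variant(
        "allow-unsupported-compilers",
        default=False,
        sticky=True,
        description="Allow unsupported host compiler / CUDA version combinations",
    )

    # Host-compiler vs. CUDA compatibility constraints, enforced only while the
    # variant is off -- these two match the concretizer errors quoted above.
    conflicts("%clang@12:", when="+cuda ^cuda@:11.4.0~allow-unsupported-compilers")
    conflicts("%clang@14:", when="+cuda ^cuda@:11.7~allow-unsupported-compilers")
```
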

@adrienbernede commented Jan 27, 2023

In the end, that’s not too big of a deal: since that’s what users need and this option allows it, I’ll go for it.

@rhornung67

If it doesn't work, we can fall back to clang/13.0.1-gcc-8.3.1 + cuda/11.7.0. I know that works.

@adrienbernede

I found my mistake: I needed to both set the external cuda with +allow-unsupported-compilers and add that variant to the cuda dependency in each spec.
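
Concretely, that ends up looking roughly like this (illustrative only; the raja package name and the versions are just the ones already discussed in this thread):

```
# External cuda entry (e.g., in packages.yaml), with the variant set on the external spec:
cuda@11.2.0+allow-unsupported-compilers

# ...and the variant repeated on the cuda dependency in each shared spec:
raja+cuda %clang@12.0.1 ^cuda@11.2.0+allow-unsupported-compilers
```
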

@adrienbernede

This is because that variant is "sticky":

spack/spack#19736 (comment)

@adrienbernede

@rhornung67 I needed to add the gcc toolchain to the following specs for Umpire:

clang/12.0.1 ^cuda/11.2.0+allow-unsupported-compilers
xl/2022.08.19 + cuda/11.2.0
xl/2022.08.19 + cuda/11.7.0
