Sync meeting on EESSI ROCm support (2026 05 04)

Caspar van Leeuwen edited this page May 4, 2026 · 2 revisions

Attending: Aayush, Jan Andre, Kenneth, Caspar

  • Decided on AMD GPU support targets in EESSI: we will build for all 8 targets supported by ROCm 6.4.1 (gfx908, gfx90a, gfx942, gfx1030, gfx1100, gfx1101, gfx1200, gfx1201). See https://gitlab.com/eessi/support/-/work_items/71#note_3312063835 for extensive considerations.

  • ROCm-LLVM EasyConfig should be finalized before deploying into EESSI

    • The BUILD_LLVM_DYLIB option should be passed to the second (OpenMP-enabled) build of LLVM, to make sure that the library 'matches' the last build
    • Then we will retest this PR; if it still resolves the original issue (i.e. it DOES produce amdclang and friends), we can merge it and use it for deploying ROCm-LLVM in EESSI here
  • rompi needs work; these easyconfig changes should get proper ROCm support in there.

    • This PR needs a close look. hwloc-rocm may be included in places where it's not used. We'd need to check documentation to see which software may actually use hwloc to detect e.g. topology of how GPUs are connected
    • OSU tests based on this PR show very poor performance on LUMI, so this PR may not yet be doing what it should
  • rfoss easyconfigs PR looks OK, but:

    • LAPACK tests are still disabled for OpenBLAS. Currently, the OpenBLAS-0.3.30_better-support-llvm-flang.patch patch doesn't apply to this OpenBLAS version (0.3.29). Jan thinks that if it did, the number of test failures might be lower and we could re-enable the tests. He'll look into it.
  • TheRock PR will build all ROCm components in a single prefix.

    • This probably means fewer issues with build systems checking e.g. ROCM_PATH and expecting it to find all components in that same prefix.
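
For reference, the agreed target list could be turned into build flags roughly as follows. This is a minimal sketch, not what the EESSI build scripts actually do: the helper names are hypothetical, the target spellings assume the standard ROCm gfx identifiers (gfx908, gfx90a, gfx942, gfx1030, gfx1100, gfx1101, gfx1200, gfx1201), and whether a given package reads `--offload-arch` or a CMake variable such as `AMDGPU_TARGETS` depends on that package's build system.

```python
# Illustrative only: helper names are hypothetical and not part of EESSI or EasyBuild.
# Assumed spelling of the eight ROCm 6.4.1 targets decided on in the meeting.
ROCM_TARGETS = [
    "gfx908", "gfx90a", "gfx942", "gfx1030",
    "gfx1100", "gfx1101", "gfx1200", "gfx1201",
]


def offload_arch_flags(targets):
    """Return one --offload-arch=<target> compiler flag per GPU target."""
    return [f"--offload-arch={t}" for t in targets]


def cmake_target_list(targets):
    """Join targets with ';', the separator CMake uses for list values."""
    return ";".join(targets)


if __name__ == "__main__":
    # Flags in the style accepted by clang-based HIP compilers.
    print(" ".join(offload_arch_flags(ROCM_TARGETS)))
    # Assumption: the package honours AMDGPU_TARGETS; some use GPU_TARGETS instead.
    print(f"-DAMDGPU_TARGETS={cmake_target_list(ROCM_TARGETS)}")
```

Keeping the list in one place makes it easier to keep the compiler flags and the CMake setting consistent across packages.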