[ROCm] Custom gfx1100 kernel sample fails to build (clang-offload-bundler not found) #16899

Open
kuhar opened this issue Mar 26, 2024 · 13 comments
Assignees
Labels
codegen/rocm ROCm code generation compiler backend

Comments

@kuhar
Member

kuhar commented Mar 26, 2024

Error:

➜ ninja all iree-test-deps && ctest -j32 --label-exclude '^driver=cuda|metal' --output-on-failure 
[0/2] Re-checking globbed directories...
[53/53] Generating kernels_gfx1100.co
FAILED: samples/custom_dispatch/hip/kernels/kernels_gfx1100.co /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co 
cd /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels && /home/jakub/iree/build/relass/llvm-project/bin/clang-18 -x hip --offload-device-only --offload-arch=gfx1100 --rocm-path=/opt/rocm -fuse-cuid=none -O3 /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu -o /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
clang-18: error: unable to execute command: Executable "clang-offload-bundler" doesn't exist!
clang-18: error: amdgcn-link command failed with exit code 1 (use -v to see invocation)

My rocm installation is under /opt/rocm, the version is 5.7.1.

@kuhar kuhar added the codegen/rocm ROCm code generation compiler backend label Mar 26, 2024
@benvanik
Collaborator

is it a complete install? my windows SDK has it:
[image]

@kuhar
Member Author

kuhar commented Mar 26, 2024

Yes, I even used the cursed amdgpu-pro installer.

ls /opt/rocm/bin 
amdclang     amdclang-cpp  hipcc      hipcc_cmake_linker_helper  hipconfig.pl               hipdemangleatp      hipfc         hipvars.pm    roc-obj-extract  rocm_agent_enumerator
amdclang++   amdflang      hipcc.bin  hipconfig                  hipconvertinplace-perl.sh  hipexamine-perl.sh  hipify-clang  offload-arch  roc-obj-ls       rocminfo
amdclang-cl  amdlld        hipcc.pl   hipconfig.bin              hipconvertinplace.sh       hipexamine.sh       hipify-perl   roc-obj       rocm-smi 
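
The listing above has no clang-offload-bundler in /opt/rocm/bin. A minimal sketch for checking whether the tool exists anywhere else under the ROCm root follows; find_rocm_tool is a hypothetical helper name for illustration, not something IREE or ROCm ships.

```shell
# Hypothetical helper: print every directory under a ROCm root that contains
# a file with the given tool name. The body runs in a subshell, so the
# variables below do not leak into the calling shell.
find_rocm_tool() (
  root=$1
  tool=$2
  # -L follows symlinks, e.g. when /opt/rocm points at a versioned directory.
  find -L "$root" -type f -name "$tool" -exec dirname {} \; 2>/dev/null | sort -u
)

# Example (paths are the ones mentioned in this thread):
#   find_rocm_tool /opt/rocm clang-offload-bundler
```

An empty result would suggest the installation really does lack the tool; a non-empty one gives a directory that could be added to PATH.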

@raikonenfnu
Collaborator

I think @sogartar faced something similar?
Can you try my build script at https://gist.github.com/raikonenfnu/7d2843107929b161b12e56c057e8735d to see if the issue persists?

@kuhar
Member Author

kuhar commented Mar 26, 2024

@raikonenfnu can you first confirm where the clang-offload-bundler binary should be? Do you have it under /opt/rocm like Ben or installed system-wide?

@kuhar
Member Author

kuhar commented Mar 26, 2024

We may need to check for this during the cmake configuration step.
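
As a rough illustration of such an early check, here is a shell sketch that fails fast when a required tool cannot be found. check_rocm_tools and the ROCM_TOOL_DIRS variable are assumptions made up for this example, not existing IREE options; inside CMake itself the analogous building block would be find_program.

```shell
# Hypothetical pre-flight check: succeed only if every named tool resolves,
# either on PATH or in the extra colon-separated directories listed in
# ROCM_TOOL_DIRS (a variable invented for this sketch).
check_rocm_tools() (
  # The body runs in a subshell, so this PATH change stays local.
  PATH="$PATH${ROCM_TOOL_DIRS:+:$ROCM_TOOL_DIRS}"
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
)

# Example:
#   check_rocm_tools clang-offload-bundler || echo "configure would stop here"
```

Reporting every missing tool before returning, rather than stopping at the first one, avoids fixing the environment one failed build at a time.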

@raikonenfnu
Collaborator

raikonenfnu commented Mar 26, 2024

I only have it in /opt/rocm/llvm/bin/, not system-wide. IIRC the clang commands to generate the bitcode should not need clang-offload-bundler at all.

I also do not have clang-offload-bundler on my env and was able to compile.

@raikonenfnu
Collaborator

raikonenfnu commented Mar 26, 2024

Oh wait, you are talking about the macrokernel, not the microkernel, so my previous assumptions/comments might not be correct here. They were more about the microkernel. I need to look a bit more into the samples macrokernel.

I think it may be the --rocm-path option? I was able to compile hsaco/co with https://github.com/raikonenfnu/macroHipKernel/blob/main/generate_hsaco.sh#L2-L4

Perhaps missing a nogpulib option?

@raikonenfnu
Collaborator

raikonenfnu commented Mar 26, 2024

@kuhar Was able to repro your issue on my system as well. But if I specify export IREE_ROCM_PATH=/opt/rocm, then my error would be:

(EDIT: Deleted log from using -nogpulib)

(EDIT: this one actually works if we point to where clang-offload-bundler lives, which is /opt/rocm/llvm/bin.)
If we append the ROCm LLVM path, it compiles OK:

PATH=$PATH:/opt/rocm/llvm/bin /home/stanley/nod/iree-build-notrace/llvm-project/bin/clang-19 -x hip --offload-device-only --offload-arch=gfx1100 --rocm-path=/opt/rocm -fuse-cuid=none -O3 /home/stanley/nod/iree/samples/custom_dispatch/hip/kernels/kernels.cu -o /home/stanley/nod/iree-build-notrace/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
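
To make the same workaround persistent for a shell session without piling up duplicate entries, a small helper along these lines could be used. path_append is a hypothetical name for this sketch; the directory is the one from the command above.

```shell
# Append a directory to PATH only if it exists and is not already listed.
path_append() {
  case ":$PATH:" in
    *":$1:"*)
      # Already present: leave PATH untouched.
      ;;
    *)
      if [ -d "$1" ]; then
        PATH="$PATH:$1"
      fi
      ;;
  esac
  export PATH
}

# Example:
#   path_append /opt/rocm/llvm/bin
```

Appending, as in the command above, keeps anything already on PATH ahead of the ROCm LLVM tools, so only tools that are otherwise missing get picked up from there.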

@kuhar
Member Author

kuhar commented Mar 27, 2024

Thanks. With export PATH="$PATH:/opt/rocm/llvm/bin" set, the build makes more progress and then errors out with:

➜ ninja all                                                                                                               
[0/2] Re-checking globbed directories...
[57/332] Generating rocm_executable_cache_test.bin from executable_cache_test.mlir
FAILED: runtime/plugins/hal/drivers/rocm/cts/rocm_executable_cache_test.bin /home/jakub/iree/build/relass/runtime/plugins/hal/drivers/rocm/cts/rocm_executable_cache_test.bin 
cd /home/jakub/iree/build/relass/runtime/plugins/hal/drivers/rocm/cts && /home/jakub/iree/build/relass/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --compile-mode=hal-executable --iree-hal-target-backends=rocm --iree-rocm-target-chip=gfx908 /home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir -o rocm_executable_cache_test.bin --iree-hal-executable-object-search-path=\"/home/jakub/iree/build/relass\"
/home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir:15:1: error: cannot find ROCM bitcode files. Check your installation consistency and in the worst case, set --iree-rocm-bc-dir= to a path on your system.
hal.executable.source public @executable {
^
/home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir:15:1: error: failed to serialize executable for target backend rocm
hal.executable.source public @executable {
^
/home/jakub/iree/iree/runtime/src/iree/hal/cts/testdata/executable_cache_test.mlir:15:1: error: failed to serialize executables
hal.executable.source public @executable {
^
[58/332] Generating rocm_command_buffer_dispatch_test.bin from command_buffer_dispatch_test.mlir

I set IREE_ROCM_PATH both as a CMake variable and as an exported env var. What am I missing, @raikonenfnu?

Separately from solving this, why do we even build this test data in the all target? I'd assume it should only be a dependency for iree-test-deps, no?
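
For the "cannot find ROCM bitcode files" error above, one thing to try is pointing --iree-rocm-bc-dir= at whichever directory of the ROCm installation actually holds device bitcode. The sketch below assumes those files end in .bc and simply reports the first directory that contains any; suggest_bc_dir_flag is a hypothetical helper, and the right directory may differ between ROCm distributions.

```shell
# Hypothetical helper: print an --iree-rocm-bc-dir= flag for the first
# directory under the given ROCm root that contains *.bc files.
# Returns non-zero (and prints nothing) if no such directory is found.
suggest_bc_dir_flag() (
  root=$1
  dir=$(find -L "$root" -type f -name '*.bc' -exec dirname {} \; 2>/dev/null \
        | sort -u | head -n 1)
  [ -n "$dir" ] && printf -- '--iree-rocm-bc-dir=%s\n' "$dir"
)

# Example:
#   suggest_bc_dir_flag /opt/rocm
```

If nothing is printed, the installation probably does not include the device libraries at all, which would be consistent with an incomplete install.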

@kuhar
Member Author

kuhar commented Mar 28, 2024

OK, it does work after switching from the ROCm installation provided by the amdgpu-pro installer to https://github.com/nod-ai/TheRock/releases/tag/nightly-staging-20240328.41, setting -DIREE_ROCM_PATH, and doing a clean build.

@kuhar
Member Author

kuhar commented Mar 28, 2024

The last remaining issue is the following error:

➜  ninja iree-test-deps       
[0/2] Re-checking globbed directories...
[1266/1266] Generating kernels_gfx1100.co
FAILED: samples/custom_dispatch/hip/kernels/kernels_gfx1100.co /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co 
cd /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels && /home/jakub/iree/build/relass/llvm-project/bin/clang-19 -x hip --offload-device-only --offload-arch=gfx1100 --rocm-path=/home/jakub/bin/therock -fuse-cuid=none -O3 /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu -o /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
In file included from /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu:7:
In file included from /home/jakub/bin/therock/include/hip/hip_runtime.h:62:
In file included from /home/jakub/bin/therock/include/hip/amd_detail/amd_hip_runtime.h:432:
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:194:27: error: use of undeclared identifier 'max'; did you mean 'fmax'?
  194 |   double __logbw = _LOGBd(_fmaxd(_ABSd(__c), _ABSd(__d)));
      |                           ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:45:16: note: expanded from macro '_fmaxd'
   45 | #define _fmaxd max
      |                ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_math_forward_declares.h:73:19: note: 'fmax' declared here
   73 | __DEVICE__ double fmax(double, double);
      |                   ^
In file included from /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu:7:
In file included from /home/jakub/bin/therock/include/hip/hip_runtime.h:62:
In file included from /home/jakub/bin/therock/include/hip/amd_detail/amd_hip_runtime.h:432:
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:227:26: error: use of undeclared identifier 'max'; did you mean 'fmax'?
  227 |   float __logbw = _LOGBf(_fmaxf(_ABSf(__c), _ABSf(__d)));
      |                          ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_complex_builtins.h:46:16: note: expanded from macro '_fmaxf'
   46 | #define _fmaxf max
      |                ^
/home/jakub/iree/build/relass/llvm-project/lib/clang/19/include/__clang_cuda_math_forward_declares.h:74:18: note: 'fmax' declared here
   74 | __DEVICE__ float fmax(float, float);
      |                  ^
2 errors generated when compiling for gfx1100.
ninja: build stopped: subcommand failed

@kuhar
Member Author

kuhar commented Mar 28, 2024

@raikonenfnu @antiagainst should we disable these rocm kernels and make them experimental? They don't seem to work out of the box on a typical linux installation but are included in the main ninja targets all (sic!) and iree-test-deps.

@kuhar
Member Author

kuhar commented Apr 11, 2024

Ping. This still doesn't build for me. After manually patching the cuda kernel, I'm hitting an issue with another tool missing from path:

➜  ninja all iree-test-deps          
[0/2] Re-checking globbed directories...
[638/2136] Generating kernels_gfx1100.co
FAILED: samples/custom_dispatch/hip/kernels/kernels_gfx1100.co /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co 
cd /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels && /home/jakub/iree/build/relass/llvm-project/bin/clang-19 -x hip --offload-device-only --offload-arch=gfx1100 --rocm-path=/home/jakub/bin/therock -fuse-cuid=none -O3 /home/jakub/iree/iree/samples/custom_dispatch/hip/kernels/kernels.cu -o /home/jakub/iree/build/relass/samples/custom_dispatch/hip/kernels/kernels_gfx1100.co
/home/jakub/bin/therock/bin/clang-offload-bundler: error: unable to find 'llvm-objcopy' in path
clang-19: error: amdgcn-link command failed with exit code 1 (use -v to see invocation)
[641/2136] Building CXX object tracy/CMakeFiles/IREETracyProfiler.dir/__/__/__/third_party/tracy/profiler/src/main.cpp.o
ninja: build stopped: subcommand failed.

Seems like this needs a very specific system-wide installation.
