Git commit
90f9b88
Operating systems
Linux
GGML backends
CUDA, HIP
Problem description & steps to reproduce
The construct at https://github.com/ggerganov/llama.cpp/blob/90f9b88afb6447d3929843a2aa98c0f11074762d/ggml/src/ggml-cuda/fattn-common.cuh#L553 cannot be unrolled by LLVM for GPU targets (such as amdgcn) when ne01 is unknown at compile time. At the moment this produces several hundred warnings (one set per architecture) when compiling for ROCm. Please silence the warning in the same way as is already done in https://github.com/ggerganov/llama.cpp/blob/90f9b88afb6447d3929843a2aa98c0f11074762d/ggml/src/ggml-cuda/softmax.cu#L18
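For illustration, here is a minimal sketch of the kind of suppression being requested. It assumes the fix takes the form of a clang diagnostic pragma wrapped around the affected function. The function `row_sum` and its parameters are hypothetical stand-ins written as plain host C++, not the actual code in fattn-common.cuh. The unroll request stays in place for instantiations where the bound is a compile-time constant; only the warning is hidden for the case where it is not.

```cpp
// Hypothetical example: row_sum and its parameters are illustrative only.
// Suppress clang's "loop not unrolled" diagnostic (-Wpass-failed) for this
// region, while keeping the unroll request for instantiations where the
// trip count is known at compile time.
#ifdef __clang__
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wpass-failed"
#endif // __clang__

template <int ncols_template>
static float row_sum(const float * x, const int ncols_runtime) {
    // ncols_template == 0 means "bound only known at run time",
    // which is the case the optimizer cannot fully unroll.
    const int ncols = ncols_template == 0 ? ncols_runtime : ncols_template;

    float sum = 0.0f;
#pragma unroll
    for (int i = 0; i < ncols; ++i) {
        sum += x[i];
    }
    return sum;
}

#ifdef __clang__
#pragma clang diagnostic pop
#endif // __clang__
```

Scoping the pragma with push/pop around the one function keeps the diagnostic active for the rest of the translation unit, so unrelated failed transformations would still be reported.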
First Bad Commit
No response
Compile command
Relevant log output
llama.cpp/ggml/src/ggml-cuda/template-instances/../fattn-common.cuh:523:24: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]