rocFFT Test Suite Fails #439

Closed
tflink opened this issue Aug 22, 2023 · 42 comments

@tflink
Contributor

tflink commented Aug 22, 2023

Apologies for opening an issue about this but I couldn't find another way to ask this question.

I'm working to build and package rocFFT in Fedora. The previous build issue has been resolved (#422), but now that I'm trying to run the test client, I'm unclear on what the expected results are, or whether the test client is expected to have 0 FAILED results.

What is the expected behavior

  • The test client run passes

What actually happens

172385 tests from 50 test suites ran in 3.8 hours

  • 148480 tests PASSED
  • 309 tests SKIPPED
  • 23596 tests FAILED

How to reproduce

  • build rocFFT with Fedora dependencies, test client enabled, debug build
  • install to a directory (in this case, in my home directory)
  • ensure that the libraries are in LD_LIBRARY_PATH and binaries are in PATH
  • run the test suite with no arguments (rocfft-test-d)

Environment

Hardware:
  GPU: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon Pro VII/Radeon Instinct MI50 32GB] (rev 06)
  CPU: AMD Ryzen 7 5700X 8-Core Processor
Software:
  ROCK: no binary driver. 6.5.0-0.rc6.20230818git0e8860d2125f.47.fc40.x86_64 kernel build for Fedora rawhide (f40)
  ROCR: v5.6.0 (fedora package)
  HCC: v5.6.0 (fedora package)
  Library: v16.4 (fedora package built from fork)
  rocFFT: release/rocm-rel-5.6 branch

@tflink tflink changed the title from "rocFFT Expected Test Suite Results" to "rocFFT Test Suite Fails" Aug 22, 2023
@cgmb
Contributor

cgmb commented Aug 22, 2023

I would expect zero failing tests. Debian is still on rocFFT from ROCm 5.5.1 and it was built in release mode, but that version had 289081 passing tests and 587 skipped tests on gfx906.

@tflink
Contributor Author

tflink commented Aug 22, 2023

Thanks for confirming the expected test results. I wanted to make sure that they were expected to all pass before digging into why they're failing.

I'll close the issue and re-file if I find something that isn't self-inflicted.

@tflink tflink closed this as completed Aug 22, 2023
@evetsso
Contributor

evetsso commented Aug 22, 2023

Can you provide a couple of examples of failing tests, and any messages that rocfft-test-d printed at the time that they failed?

@tflink
Contributor Author

tflink commented Aug 22, 2023

I copied a few of the first failures into a file

I captured all stdout and stderr from the test suite with normal output. I think I also have some output with ROCFFT_LAYER=57 that has some failures, but probably not the same ones that I posted above. Let me know if you want more data.

@evetsso
Contributor

evetsso commented Aug 22, 2023

All of the test cases in your sample are half-precision, so it would be interesting to see if any half-precision test cases passed for you. All of the half-precision test cases have half in their names.

If none did, then it sounds plausible that something's wrong with half-precision arithmetic on either the host side or device side. To troubleshoot that, I would isolate a small test case to see what's going on:

rocfft-test-d --gtest_filter=pow2_1D_half/accuracy_test.vs_fftw/complex_forward_len_2_half_ip_batch_1_istride_1_CI_ostride_1_CI_idist_2_odist_2_ioffset_0_0_ooffset_0_0 --verbose 4

This will show the actual numbers that we got from both rocFFT and the reference implementation (FFTW). I would imagine one of them looks totally bogus so once we know which one, we can dig further.

@tflink
Contributor Author

tflink commented Aug 23, 2023

It looks like all of the test failures have half in their names and I can't find a single passing test case with half in the name.

Additionally, I'm noticing this at the end of the test runner output that doesn't seem right:

Random seed: 2740725319
half precision max l-inf epsilon: inf
half precision max l2 epsilon:     inf
single precision max l-inf epsilon: 1.72013e-07
single precision max l2 epsilon:     6.70508e-06
double precision max l-inf epsilon: 4.86806e-16
double precision max l2 epsilon:     1.86753e-14
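
An `inf` epsilon is what a relative error norm would report if even one compared value were non-finite. The sketch below is an assumption about the general shape of such a metric, not rocFFT's actual test code; the function names `relative_linf_error` and `relative_l2_error` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: max |result[i] - reference[i]| divided by max |reference[i]|.
double relative_linf_error(const std::vector<double>& result,
                           const std::vector<double>& reference)
{
    double max_diff = 0.0;
    double max_ref = 0.0;
    const std::size_t n = std::min(result.size(), reference.size());
    for (std::size_t i = 0; i < n; ++i) {
        max_diff = std::max(max_diff, std::fabs(result[i] - reference[i]));
        max_ref = std::max(max_ref, std::fabs(reference[i]));
    }
    return max_diff / max_ref;
}

// Hypothetical helper: Euclidean norm of the difference divided by
// the Euclidean norm of the reference.
double relative_l2_error(const std::vector<double>& result,
                         const std::vector<double>& reference)
{
    double sum_diff_sq = 0.0;
    double sum_ref_sq = 0.0;
    const std::size_t n = std::min(result.size(), reference.size());
    for (std::size_t i = 0; i < n; ++i) {
        const double diff = result[i] - reference[i];
        sum_diff_sq += diff * diff;
        sum_ref_sq += reference[i] * reference[i];
    }
    return std::sqrt(sum_diff_sq) / std::sqrt(sum_ref_sq);
}
```

With a metric of this shape, a single infinite element on one side of the comparison is enough to make both norms infinite, which would fit a host-side conversion that produces garbage values.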

I ran the test you suggested. Here is the output

@evetsso
Contributor

evetsso commented Aug 23, 2023

Hmm.. it's hard to be sure, but I think something is wrong with half-precision arithmetic on the host as opposed to the device.

The way the tests are structured, we're generating FFT input data on the GPU device. The test copies that input data to the host. rocFFT does an FFT on the device copy while FFTW works on the host copy. Then we copy the GPU output back to the host to compare them.

What's most suspicious to me is that the --verbose 4 output claims that all the input and output elements have the same value, and yet the comparison still thinks the results are incorrect. To me, that suggests that the host is unable to correctly print and/or do arithmetic with half-precision values.

rocFFT is using the _Float16 data type in its code for half-precision values. It might be instructive to see if a simple standalone test program can print and do math with _Float16 values. Since FFTW does not natively support half-precision, the test will also cast the _Float16s to regular floats and back again. And C++ iostreams don't natively support _Float16s either, so in that case we cast to double. It would be interesting to verify that those operations work.

Assuming that's all OK, I think the next step would be to dig in with a debugger (e.g. rocgdb) - in particular, right after the hipMemcpy around accuracy_test.h:1556 is done, if we can print ((_Float16*)cpu_input[0].buf)[0] and ((_Float16*)cpu_input[0].buf)[1], we can at least confirm that the _Float16s we're getting back from the device look like sensible values (they should be between -0.5 and +0.5).
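
When checking values in a debugger, it can help to decode the raw 16-bit patterns by hand, so that the check does not depend on the compiler's own _Float16 conversion path (which is the suspect here). The following is a minimal sketch of an IEEE 754 binary16 decoder; `half_bits_to_float` is a hypothetical helper name, not part of rocFFT.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decode an IEEE 754 binary16 bit pattern into a float using only
// integer operations and std::ldexp, without touching _Float16.
float half_bits_to_float(std::uint16_t bits)
{
    const int sign = (bits >> 15) & 0x1;
    const int exponent = (bits >> 10) & 0x1F;  // 5-bit biased exponent
    const int mantissa = bits & 0x3FF;         // 10 stored fraction bits

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        // Infinity or NaN
        magnitude = mantissa == 0 ? std::numeric_limits<float>::infinity()
                                  : std::numeric_limits<float>::quiet_NaN();
    } else {
        // Normal: (1024 + mantissa) * 2^(exponent - 25)
        magnitude = std::ldexp(static_cast<float>(mantissa + 1024), exponent - 25);
    }
    return sign ? -magnitude : magnitude;
}
```

For example, the bit pattern 0x3800 decodes to 0.5 and 0xB800 to -0.5, so any input element outside that pair of bounds would indicate a problem before the comparison even starts.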

Is this package (with the debug build) available for download? I wouldn't mind trying to repro this in a docker container.

@evetsso
Contributor

evetsso commented Aug 25, 2023

FYI I've reproduced this problem in a rawhide Docker container. Something seems to be wrong with optimization and the compiler runtime.

With this test program (float16.cpp):

#include <iostream>
#include <algorithm>
#include <vector>

int main()
{
  const size_t count = 4;

  std::vector<_Float16> f16(count);
  std::vector<float> f32(count);

  for(size_t i = 0 ; i < count; ++i)
    f16[i] = static_cast<_Float16>(0.123) * static_cast<_Float16>(i);
  
  std::copy_n(f16.begin(), count, f32.begin());

  for(size_t i = 0; i < count; ++i)
    {
      std::cout << "loop f16=" << static_cast<float>(f16[i]) << " f32=" << f32[i] << std::endl;
    }

  _Float16 one_f16 = 0.123;
  float one_f32 = one_f16;
  std::cout << "one f32=" << one_f32 << std::endl;
  return 0;
}

Build it several ways:

  • hipcc -O0 float16.cpp -o float16_hipcc_O0
  • hipcc -O1 float16.cpp -o float16_hipcc_O1
  • hipcc -O2 float16.cpp -o float16_hipcc_O2
  • hipcc -O3 float16.cpp -o float16_hipcc_O3
  • hipcc -Og float16.cpp -o float16_hipcc_Og

-O3 produces sensible output, but the other versions show varying levels of incorrect output. Converting a single _Float16 to float works, but things go wrong in a loop.

Setting HIPCC_VERBOSE=7 in the environment will show the command line that is ultimately passed down to clang++. --rtlib=compiler-rt is the option that hipcc is adding that's breaking things. Everything's fine when I run clang++ directly but without that option.

I also note that ROCm/rocBLAS#1350 seems to be suspiciously similar, but in rocBLAS instead.

I'm not familiar enough with how these options are supposed to interact to know what the fix is. It's quite possible that this has already been fixed upstream (in LLVM, I imagine). I'll ask some colleagues to see if they can shed some light on this problem.

@evetsso evetsso reopened this Aug 25, 2023
@scchan

scchan commented Aug 25, 2023

It might be related to the fp16 ABI change: llvm/llvm-project#56854
Depending on how the LLVM was built on Fedora, the toolchain may end up with an ABI mismatch.
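
As a rough mental model of what such a mismatch could look like (an illustration only, not real calling-convention code): suppose the caller places the 16-bit half value in an SSE register, while a conversion helper in the runtime library was built to read its argument from an integer register. The helper then converts whatever stale bits happen to be in the integer register. All names below (`Registers`, `caller_passes_half_in_sse`, and so on) are hypothetical, and the register assignment is an assumption made for the sake of the example.

```cpp
#include <cstdint>

// Hypothetical model of two argument registers; not real ABI code.
struct Registers {
    std::uint32_t edi = 0;   // integer argument register (keeps stale contents if unused)
    std::uint32_t xmm0 = 0;  // low 32 bits of the first SSE argument register
};

// Caller compiled with the assumed newer convention:
// the half bits travel in the SSE register.
void caller_passes_half_in_sse(Registers& regs, std::uint16_t half_bits)
{
    regs.xmm0 = half_bits;
}

// Callee built with the assumed older convention:
// it expects the half bits in the integer register.
std::uint16_t callee_reads_half_from_gpr(const Registers& regs)
{
    return static_cast<std::uint16_t>(regs.edi & 0xFFFFu);
}

// Callee built with the convention that matches the caller.
std::uint16_t callee_reads_half_from_sse(const Registers& regs)
{
    return static_cast<std::uint16_t>(regs.xmm0 & 0xFFFFu);
}
```

In this model the mismatched callee returns the same stale value no matter what the caller passed, while the matching callee returns the caller's bits unchanged.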

@tflink
Contributor Author

tflink commented Aug 25, 2023

> FYI I've reproduced this problem in a rawhide Docker container.

Thank you for doing that. It was taking me a while to write an example because as it turns out, I was approaching the problem backwards. On the bright side, I learned a lot about FP16 and std::format in C++23.

Since I had it set up, I ran your code in a Debian 12 amd64 VM and the console output is consistent for all optimization levels.

> Depending on how the LLVM was built on Fedora, the toolchain may end up with an ABI mismatch.

I'm not clear on what information you're asking for. Current Fedora rawhide has llvm 16.0.6. The cmake command used when it's built (newlines inserted for readability) is:

/usr/bin/cmake -S . -B redhat-linux-build -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG
-DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG
-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF
-DCMAKE_INSTALL_PREFIX:PATH=/usr -DINCLUDE_INSTALL_DIR:PATH=/usr/include
-DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc
-DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON
-G Ninja -DBUILD_SHARED_LIBS:BOOL=OFF -DLLVM_PARALLEL_LINK_JOBS=1
-DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_SKIP_RPATH:BOOL=ON -DLLVM_LIBDIR_SUFFIX=64
-DLLVM_TARGETS_TO_BUILD=all -DLLVM_ENABLE_LIBCXX:BOOL=OFF -DLLVM_ENABLE_ZLIB:BOOL=ON
-DLLVM_ENABLE_FFI:BOOL=ON -DLLVM_ENABLE_RTTI:BOOL=ON -DLLVM_USE_PERF:BOOL=ON
-DLLVM_BINUTILS_INCDIR=/usr/include -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=AVR
-DLLVM_BUILD_RUNTIME:BOOL=ON -DLLVM_INCLUDE_TOOLS:BOOL=ON
-DLLVM_BUILD_TOOLS:BOOL=ON -DLLVM_INCLUDE_TESTS:BOOL=ON -DLLVM_BUILD_TESTS:BOOL=ON
-DLLVM_INSTALL_GTEST:BOOL=ON -DLLVM_LIT_ARGS=-v -DLLVM_INCLUDE_EXAMPLES:BOOL=ON
-DLLVM_BUILD_EXAMPLES:BOOL=OFF -DLLVM_INCLUDE_UTILS:BOOL=ON
-DLLVM_INSTALL_UTILS:BOOL=ON -DLLVM_UTILS_INSTALL_DIR:PATH=/usr/bin
-DLLVM_TOOLS_INSTALL_DIR:PATH=bin -DLLVM_INCLUDE_DOCS:BOOL=ON
-DLLVM_BUILD_DOCS:BOOL=ON -DLLVM_ENABLE_SPHINX:BOOL=ON
-DLLVM_ENABLE_DOXYGEN:BOOL=OFF -DLLVM_VERSION_SUFFIX=
-DLLVM_BUILD_LLVM_DYLIB:BOOL=ON -DLLVM_LINK_LLVM_DYLIB:BOOL=ON
-DLLVM_BUILD_EXTERNAL_COMPILER_RT:BOOL=ON -DLLVM_INSTALL_TOOLCHAIN_ONLY:BOOL=OFF
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-redhat-linux-gnu -DSPHINX_WARNINGS_AS_ERRORS=OFF
-DCMAKE_INSTALL_PREFIX=/usr -DLLVM_INSTALL_SPHINX_HTML_DIR=/usr/share/doc/llvm/html
-DSPHINX_EXECUTABLE=/usr/bin/sphinx-build-3 -DLLVM_INCLUDE_BENCHMARKS=OFF

The full build logs for llvm-16.0.6

The patches used for that build

On a semi-related note, llvm17 will be landing in Fedora rawhide before much longer. llvm16 will continue to be available for a while yet, though.

@yxsamliu
Contributor

I think you may need to add -DLLVM_ENABLE_RUNTIMES="compiler-rt" to your cmake command line. Otherwise it may not build compiler-rt, then clang may use the system compiler-rt, which may cause ABI mismatch.

@tflink
Contributor Author

tflink commented Aug 28, 2023

> I think you may need to add -DLLVM_ENABLE_RUNTIMES="compiler-rt" to your cmake command line. Otherwise it may not build compiler-rt, then clang may use the system compiler-rt, which may cause ABI mismatch.

OK, sounds like a place to start. I'll make some test builds and we'll see if that fixes things.

@tflink
Contributor Author

tflink commented Aug 28, 2023

After a bit more research, it turns out that compiler-rt is built as a separate package in Fedora and they're both on llvm 16.0.6 (latest released compiler-rt build logs) on Fedora Rawhide, which my test system is running.

Is there a way to test for an ABI mismatch other than building a monolithic llvm and potentially breaking a bunch of other stuff?

@tstellar

> It might be related to the fp16 ABI change: llvm/llvm-project#56854
> Depending on how the LLVM was built on Fedora, the toolchain may end up with an ABI mismatch.

According to that bug, the ABI changed in Clang 15. So it seems like Clang 16 and newer should be unaffected. @tflink Are the tests linking against anything that was built with Clang 14?

@tflink
Contributor Author

tflink commented Aug 28, 2023

> @tflink Are the tests linking against anything that was built with Clang 14?

Not that I know of, no. AFAIK, everything built with Clang in Fedora was rebuilt with Clang 16 in the F39 mass rebuild last month and my system was updated to Rawhide as of at least 2023-08-14 (and probably more recent but I don't recall off the top of my head) when I submitted this issue.

@yxsamliu
Contributor

Maybe use -O0 -save-temps -v to compile and link that test program. Then we can check the dumped .s file to see what ABI the caller is using, and use objdump -d on the executable to see what ABI the called function is using. Then we can investigate why the callee has the wrong ABI.

@tstellar

@tflink Can you post the full build log from when you built rocFFT?

@tflink
Contributor Author

tflink commented Aug 28, 2023

> Maybe use -O0 -save-temps -v to compile and link that test program. Then we can check the dumped .s file to see what ABI the caller is using, and use objdump -d on the executable to see what ABI the called function is using. Then we can investigate why the callee has the wrong ABI.

I've done that and uploaded everything including the cpp file, the output binary and all the bits generated by hipcc.

In particular:

@tflink
Contributor Author

tflink commented Aug 28, 2023

> @tflink Can you post the full build log from when you built rocFFT?

I didn't save it when I last built rocFFT, but I can rebuild and save the output. Are there any particular options that you'd like to see? I assume -DCMAKE_VERBOSE_MAKEFILE=TRUE, -DCMAKE_BUILD_TYPE=Debug and -DBUILD_VERBOSE=ON, but I'm not sure if there are others.

@tstellar

@tflink I actually just wanted to see the build commands used to build the test, and it looks like you posted that in the previous comment.

@tflink
Contributor Author

tflink commented Aug 28, 2023

I don't know if this is useful, but I compiled the same test program on a Debian 12 VM that is NOT showing this issue.

I'm the wrong person to ask about the build details but this Debian 12 system appears to have llvm and clang 14.0 available from the debian repos but hipcc --version is showing something different (minus a python traceback which I assume is unrelated):

hipcc --version
HIP version: 5.2.21153-0
Debian clang version 15.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

@tstellar

Would you be able to test in a Fedora 37 container? Fedora 37 has clang 15.0.6 just like debian 12.

@tflink
Contributor Author

tflink commented Aug 28, 2023

> Would you be able to test in a Fedora 37 container? Fedora 37 has clang 15.0.6 just like debian 12.

I will try to do so but the hipcc package is pretty new and was never built in F37. As I understand it, quite a bit of work went into getting it package-able so I'm not terribly optimistic that it'll work but we'll find out :)

@tstellar

@tflink OK, I wouldn't go to a lot of trouble to get it built then. I thought it would be easy to try.

@scchan

scchan commented Aug 28, 2023

hipcc won't be needed. You could take the test from #439 (comment), which is just a plain CPU program and then compile with --rtlib=compiler-rt at -O2/-O3

@tflink
Contributor Author

tflink commented Aug 28, 2023

I attempted to run the tests from the earlier comment on an F37 VM (clang 15.0.7) and got the following output:

$ clang++ -O0 --rtlib=compiler-rt -o float16_clang15_O0 float16.cpp
$ clang++ -O1 --rtlib=compiler-rt -o float16_clang15_O1 float16.cpp
$ clang++ -O2 --rtlib=compiler-rt -o float16_clang15_O2 float16.cpp
$ clang++ -O3 --rtlib=compiler-rt -o float16_clang15_O3 float16.cpp
$ clang++ -Og --rtlib=compiler-rt -o float16_clang15_Og float16.cpp

$ ./float16_clang15_O0
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_O1
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_O2
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_O3
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_Og
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

Assuming that I've understood the non-hipcc reproducer correctly, the same issue doesn't show up in F37 and clang 15.
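
The printed values can be cross-checked by hand, independently of any compiler runtime. The sketch below rounds a value to the nearest half-precision number in software; `round_to_half` is a hypothetical helper that only handles finite values in the normal half-precision range, and it assumes the default round-to-nearest-even floating-point mode.

```cpp
#include <cmath>

// Round x to the nearest value representable in IEEE 754 binary16.
// Sketch only: ignores overflow, subnormals, infinities and NaN.
double round_to_half(double x)
{
    if (x == 0.0)
        return x;
    int exponent = 0;
    // x = fraction * 2^exponent with 0.5 <= |fraction| < 1
    const double fraction = std::frexp(x, &exponent);
    // binary16 keeps 11 significant bits (1 implicit + 10 stored)
    const double scaled = std::ldexp(fraction, 11);
    // nearbyint rounds to nearest, ties to even, in the default rounding mode
    const double rounded = std::nearbyint(scaled);
    return std::ldexp(rounded, exponent - 11);
}
```

Under these assumptions, `round_to_half(0.123)` is 2015/16384 (about 0.122986), doubling it gives 2015/8192 (about 0.245972), and `round_to_half(3 * round_to_half(0.123))` is 1511/4096 (about 0.368896), which agrees with the output shown above.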

@tflink
Contributor Author

tflink commented Aug 28, 2023

I'm going to try rebuilding hipcc on F39 against clang/llvm 15 instead of 16

@tflink
Contributor Author

tflink commented Aug 28, 2023

> I'm going to try rebuilding hipcc on F39 against clang/llvm 15 instead of 16

This is going to take more effort than I anticipated, if it's even possible. I'll update again if I'm able to make it work.

@tstellar

This looks like a bug in clang 16:

[root@23c79a8c3e0a ~]# clang++ float16.cpp -o libgcc
[root@23c79a8c3e0a ~]# ./libgcc 
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
[root@23c79a8c3e0a ~]# clang++ float16.cpp -rtlib=compiler-rt -o crt
[root@23c79a8c3e0a ~]# ./crt
loop f16=24.625 f32=-856
loop f16=24.625 f32=0
loop f16=24.625 f32=0
loop f16=24.625 f32=0
one f32=50688

@tstellar

I've narrowed this down further, and it looks like there is a bug in Fedora's compiler-rt package.

@TorreZuk

You could add clang flags -mf16c and -D__STDC_WANT_IEC_60559_TYPES_EXT__=1 to see if it helps the compiler but the former may be default on all your clang versions.

@tstellar

@tflink Can you install this build and see if it fixes the issue for you:

dnf install https://kojipkgs.fedoraproject.org//work/tasks/9610/105499610/compiler-rt-16.0.6-2.fc38.x86_64.rpm

@tflink
Contributor Author

tflink commented Aug 30, 2023

@tstellar I'll do a full rocFFT build and test cycle and report back either later today or tomorrow since that build+test cycle takes almost 6 hours on my machine.

In the meantime, the fix looks good for the example posted above. On my F40 machine with LLVM16, I'm now seeing the same output that I was seeing on F37 with LLVM15.

$ hipcc -O0 float16.cpp -o float16_hipcc_O0
$ hipcc -O1 float16.cpp -o float16_hipcc_O1
$ hipcc -O2 float16.cpp -o float16_hipcc_O2
$ hipcc -O3 float16.cpp -o float16_hipcc_O3
$ hipcc -Og float16.cpp -o float16_hipcc_Og

$ ./float16_hipcc_O0
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
$ ./float16_hipcc_O1
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
$ ./float16_hipcc_O2
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
$ ./float16_hipcc_O3
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
$ ./float16_hipcc_Og
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

@tflink
Contributor Author

tflink commented Aug 31, 2023

@tstellar I rebuilt rocFFT locally with the compiler-rt build you linked and the tests passed; I don't see a single failure in the test output that I captured. Additionally, the half precision stats printed at the end of the test suite look far less suspicious to me.

Random seed: 2591561259
half precision max l-inf epsilon: 0.000801052
half precision max l2 epsilon:     0.0148505
single precision max l-inf epsilon: 1.72013e-07
single precision max l2 epsilon:     5.58241e-06
double precision max l-inf epsilon: 4.86806e-16
double precision max l2 epsilon:     1.58969e-14

Unless someone sees something that I don't, I think this can be closed.

Thank you all for your help in getting this resolved.

@evetsso
Contributor

evetsso commented Aug 31, 2023

@tstellar Do you have a link to the compiler-rt change that fixes this problem? ROCm/rocBLAS#1350 seems to be hitting the same problem, but for Gentoo and I'm wondering how they can get this fix.

@tstellar

tstellar commented Sep 2, 2023

@evetsso
Contributor

evetsso commented Sep 5, 2023

As this issue appears to be caused and fixed upstream, I'm closing it now. Thanks everyone, for your input.

@evetsso evetsso closed this as completed Sep 5, 2023
@littlewu2508

compiler-rt-Fix-FLOAT16-feature-detection.patch

Hello, is this patch submitted to llvm-project upstream?

@tstellar

@littlewu2508 It's not an upstream bug. It was a bug in the Fedora builds.

@littlewu2508

> @littlewu2508 It's not an upstream bug. It was a bug in the Fedora builds.

Sorry, I don't quite understand: the patch is against upstream's CMakeLists.txt, so I would think the bug is from upstream. This patch also fixes Gentoo's problem (ROCm/rocBLAS#1350), so I think this is not specific to one distro. Upstreaming the patch would also make things easier for distro packagers.

@tstellar

@littlewu2508 Upstream does not support the build configurations that Fedora is using. I'm guessing this is probably also true for Gentoo. So, they won't accept a patch like this.

@littlewu2508

> @littlewu2508 Upstream does not support the build configurations that Fedora is using. I'm guessing this is probably also true for Gentoo. So, they won't accept a patch like this.

Thanks for the explanation! I'll contact Gentoo maintainer @mgorny to help look at this, since this is a major regression with ROCm libraries.

@mgorny Can you please take a look at https://src.fedoraproject.org/fork/tstellar/rpms/compiler-rt/blob/0459cbc5d9eb15f1ad51d74707b4988049183708/f/0001-compiler-rt-Fix-FLOAT16-feature-detection.patch and apply it to the compiler-rt package? The Gentoo ROCm-5.6 stack is having the same problem, and I haven't bumped the version because of this. It affects at least rocBLAS, as reported in ROCm/rocBLAS#1350.

thesamesam pushed a commit to llvm/llvm-project that referenced this issue Jan 24, 2024
CMAKE_TRY_COMPILE_TARGET_TYPE defaults to EXECUTABLE, which causes
any feature detection code snippet without a main function to fail,
so we need to make sure it gets explicitly set to STATIC_LIBRARY.

Bug: ROCm/rocFFT#439
Bug: ROCm/rocBLAS#1350
Bug: https://bugs.gentoo.org/916069
Closes: #69842

Reviewed by: thesamesam, mgorny
sylvestre pushed a commit to tstellar/llvm-toolchain-integration-test-suite that referenced this issue May 18, 2024