rocFFT Test Suite Fails #439

tflink · 2023-08-22T18:09:52Z

Apologies for opening an issue about this but I couldn't find another way to ask this question.

I'm working to build and package rocFFT in Fedora and while the previous build issue has been resolved (#422) I'm trying to run the test client and I'm unclear on what the expected results or if the test client is expected to have 0 FAILED results.

What is the expected behavior

The test client run passes

What actually happens

172385 tests from 50 test suites ran in 3.8 hours

148480 tests PASSED
309 tests SKIPPED
23596 tests FAILED

How to reproduce

build rocFFT with Fedora dependencies, test client enabled, debug build
install to a directory (in this case, in my home directory)
ensure that the libraries are in LD_LIBRARY_PATH and binaries are in PATH
run the test suite with no arguments (rocfft-test-d)

Environment

Hardware	description
GPU	Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon Pro VII/Radeon Instinct MI50 32GB] (rev 06)
CPU	AMD Ryzen 7 5700X 8-Core Processor

Software	version
ROCK	no binary driver. 6.5.0-0.rc6.20230818git0e8860d2125f.47.fc40.x86_64 kernel build for Fedora rawhide (f40)
ROCR	v5.6.0 (fedora package)
HCC	v5.6.0 (fedora package)
Library	v16.4 (fedora package built from fork)
rocFFT	release/rocm-rel-5.6 branch

The text was updated successfully, but these errors were encountered:

cgmb · 2023-08-22T18:31:19Z

I would expect zero failing tests. Debian is still on rocFFT from ROCm 5.5.1 and it was built in release mode, but that version had 289081 passing tests and 587 skipped tests on gfx906.

tflink · 2023-08-22T18:34:51Z

Thanks for confirming the expected test results. I wanted to make sure that they were expected to all pass before digging into why they're failing.

I'll close the issue and re-file if I find something that isn't self-inflicted.

evetsso · 2023-08-22T18:53:49Z

Can you provide a couple of examples of failing tests, and any messages that rocfft-test-d printed at the time that they failed?

tflink · 2023-08-22T19:11:17Z

I copied a few of the first failures into a file

I captured all stdout and stderr from the test suite with normal output and I think I have some output with ROCFFT_LAYER=57 that has some failures but probably not the same ones that I posted above. Let me know if you want more data

evetsso · 2023-08-22T19:57:10Z

All of the test cases in your sample are half-precision, so it would be interesting to see if any half-precision test cases passed for you. All of the half-precision test cases have half in their names.

If none did, then it sounds plausible that something's wrong with half-precision arithmetic on either the host side or device side. To troubleshoot that, I would isolate a small test case to see what's going on:

rocfft-test-d --gtest_filter=pow2_1D_half/accuracy_test.vs_fftw/complex_forward_len_2_half_ip_batch_1_istride_1_CI_ostride_1_CI_idist_2_odist_2_ioffset_0_0_ooffset_0_0 --verbose 4

This will show the actual numbers that we got from both rocFFT and the reference implementation (FFTW). I would imagine one of them looks totally bogus so once we know which one, we can dig further.

tflink · 2023-08-23T01:11:36Z

It looks like all of the test failures have half in their names and I can't find a single passing test case with half in the name.

Additionally, I'm noticing this at the end of the test runner output that doesn't seem right:

Random seed: 2740725319
half precision max l-inf epsilon: inf
half precision max l2 epsilon:     inf
single precision max l-inf epsilon: 1.72013e-07
single precision max l2 epsilon:     6.70508e-06
double precision max l-inf epsilon: 4.86806e-16
double precision max l2 epsilon:     1.86753e-14

I ran the test you suggested. Here is the output

evetsso · 2023-08-23T04:48:01Z

Hmm.. it's hard to be sure, but I think something is wrong with half-precision arithmetic on the host as opposed to the device.

The way the tests are structured, we're generating FFT input data on the GPU device. The test copies that input data to the host. rocFFT does an FFT on the device copy while FFTW works on the host copy. Then we copy the GPU output back to the host to compare them.

What's most suspicious to me is that the --verbose 4 output claims that all the input and output elements have the same value, and yet the comparison still thinks the results are incorrect. To me, that suggests that the host is unable to correctly print and/or do arithmetic with half-precision values.

rocFFT is using the _Float16 data type in its code for half-precision values. It might be instructive to see if a simple standalone test program can print and do math with _Float16 values. Since FFTW does not natively support half-precision, the test will also cast the _Float16s to regular floats and back again. And C++ iostreams don't natively support _Float16s either, so in that case we cast to double. It would be interesting to verify that those operations work.

Assuming that's all OK, I think the next step would be to dig in with a debugger (e.g. rocgdb) - in particular, right after the hipMemcpy around accuracy_test.h:1556 is done, if we can print ((_Float16*)cpu_input[0].buf)[0] and ((_Float16*)cpu_input[0].buf)[1], we can at least confirm that the _Float16s we're getting back from the device look like sensible values (they should be between -0.5 and +0.5).

Is this package (with the debug build) available for download? I wouldn't mind trying to repro this in a docker container.

evetsso · 2023-08-25T19:30:50Z

FYI I've reproduced this problem in a rawhide Docker container. Something seems to be wrong with optimization and the compiler runtime.

With this test program (float16.cpp):

#include <iostream>
#include <algorithm>
#include <vector>

int main()
{
  const size_t count = 4;

  std::vector<_Float16> f16(count);
  std::vector<float> f32(count);

  for(size_t i = 0 ; i < count; ++i)
    f16[i] = static_cast<_Float16>(0.123) * static_cast<_Float16>(i);
  
  std::copy_n(f16.begin(), count, f32.begin());

  for(size_t i = 0; i < count; ++i)
    {
      std::cout << "loop f16=" << static_cast<float>(f16[i]) << " f32=" << f32[i] << std::endl;
    }

  _Float16 one_f16 = 0.123;
  float one_f32 = one_f16;
  std::cout << "one f32=" << one_f32 << std::endl;
  return 0;
}

Build it several ways:

hipcc -O0 float16.cpp -o float16_hipcc_O0
hipcc -O1 float16.cpp -o float16_hipcc_O1
hipcc -O2 float16.cpp -o float16_hipcc_O2
hipcc -O3 float16.cpp -o float16_hipcc_O3
hipcc -Og float16.cpp -o float16_hipcc_Og

-O3 produces sensible output, but the other versions show varying levels of incorrect output. A converting a single _Float16 to float works, but things go wrong in a loop.

Setting HIPCC_VERBOSE=7 in the environment will show the command line that is ultimately passed down to clang++. --rtlib=compiler-rt is the option that hipcc is adding that's breaking things. Everything's fine when I run clang++ directly but without that option.

I also note that ROCm/rocBLAS#1350 seems to be suspiciously similar, but in rocBLAS instead.

I'm not familiar enough with how these options are supposed to interact to know what the fix is. It's quite possible that this has already been fixed upstream (in LLVM, I imagine). I'll ask some colleagues to see if they can shed some light on this problem.

scchan · 2023-08-25T21:19:00Z

It might be related to the fp16 ABI change: llvm/llvm-project#56854
Depending on how the LLVM was built on Fedora, the toolchain may end up with an ABI mismatch.

tflink · 2023-08-25T23:49:14Z

FYI I've reproduced this problem in a rawhide Docker container.

Thank you for doing that. It was taking me a while to write an example because as it turns out, I was approaching the problem backwards. On the bright side, I learned a lot about FP16 and std::format in C++23.

Since I had it set up, I ran your code in a Debian 12 amd64 VM and the console output is consistent for all optimization levels.

Depending on how the LLVM was built on Fedora, the toolchain may end up with an ABI mismatch.

I'm not clear on what information you're asking for. Current Fedora rawhide has llvm 16.0.6. The cmake command used when its built (newlines inserted for readability) is:

/usr/bin/cmake -S . -B redhat-linux-build -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG
-DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG
-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF
-DCMAKE_INSTALL_PREFIX:PATH=/usr -DINCLUDE_INSTALL_DIR:PATH=/usr/include
-DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc
-DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON
-G Ninja -DBUILD_SHARED_LIBS:BOOL=OFF -DLLVM_PARALLEL_LINK_JOBS=1
-DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_SKIP_RPATH:BOOL=ON -DLLVM_LIBDIR_SUFFIX=64
-DLLVM_TARGETS_TO_BUILD=all -DLLVM_ENABLE_LIBCXX:BOOL=OFF -DLLVM_ENABLE_ZLIB:BOOL=ON
-DLLVM_ENABLE_FFI:BOOL=ON -DLLVM_ENABLE_RTTI:BOOL=ON -DLLVM_USE_PERF:BOOL=ON
-DLLVM_BINUTILS_INCDIR=/usr/include -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=AVR
-DLLVM_BUILD_RUNTIME:BOOL=ON -DLLVM_INCLUDE_TOOLS:BOOL=ON
-DLLVM_BUILD_TOOLS:BOOL=ON -DLLVM_INCLUDE_TESTS:BOOL=ON -DLLVM_BUILD_TESTS:BOOL=ON
-DLLVM_INSTALL_GTEST:BOOL=ON -DLLVM_LIT_ARGS=-v -DLLVM_INCLUDE_EXAMPLES:BOOL=ON
-DLLVM_BUILD_EXAMPLES:BOOL=OFF -DLLVM_INCLUDE_UTILS:BOOL=ON
-DLLVM_INSTALL_UTILS:BOOL=ON -DLLVM_UTILS_INSTALL_DIR:PATH=/usr/bin
-DLLVM_TOOLS_INSTALL_DIR:PATH=bin -DLLVM_INCLUDE_DOCS:BOOL=ON
-DLLVM_BUILD_DOCS:BOOL=ON -DLLVM_ENABLE_SPHINX:BOOL=ON
-DLLVM_ENABLE_DOXYGEN:BOOL=OFF -DLLVM_VERSION_SUFFIX=
-DLLVM_BUILD_LLVM_DYLIB:BOOL=ON -DLLVM_LINK_LLVM_DYLIB:BOOL=ON
-DLLVM_BUILD_EXTERNAL_COMPILER_RT:BOOL=ON -DLLVM_INSTALL_TOOLCHAIN_ONLY:BOOL=OFF
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-redhat-linux-gnu -DSPHINX_WARNINGS_AS_ERRORS=OFF
-DCMAKE_INSTALL_PREFIX=/usr -DLLVM_INSTALL_SPHINX_HTML_DIR=/usr/share/doc/llvm/html
-DSPHINX_EXECUTABLE=/usr/bin/sphinx-build-3 -DLLVM_INCLUDE_BENCHMARKS=OFF

The full build logs for llvm-16.0.6

The patches used for that build

On a semi-related note, llvm17 will be landing in Fedora rawhide before much longer. llvm16 will continue to be available for a while yet, though.

yxsamliu · 2023-08-28T14:18:13Z

I think you may need to add -DLLVM_ENABLE_RUNTIMES="compiler-rt" to your cmake command line. Otherwise it may not build compiler-rt, then clang may use the system compiler-rt, which may cause ABI mismatch.

tflink · 2023-08-28T14:25:21Z

I think you may need to add -DLLVM_ENABLE_RUNTIMES="compiler-rt" to your cmake command line. Otherwise it may not build compiler-rt, then clang may use the system compiler-rt, which may cause ABI mismatch.

OK, sounds like a place to start. I'll make some test builds and we'll see if that fixes things.

tflink · 2023-08-28T15:01:51Z

After a bit more research, it turns out that compiler-rt is built as a separate package in Fedora and they're both on llvm 16.0.6 (latest released compiler-rt build logs) on Fedora Rawhide, which my test system is running.

Is there a way to test for an ABI mismatch other than building a monolithic llvm and potentially breaking a bunch of other stuff?

tstellar · 2023-08-28T15:52:34Z

It might be related to the fp16 ABI change: llvm/llvm-project#56854 Depending on how the LLVM was built on Fedora, the toolchain may end up with an ABI mismatch.

According to that bug, the ABI changed in Clang 15. So it seems like Clang 16 and newer should be unaffected. @tflink Are the tests linking against anything that was built with Clang 14?

tflink · 2023-08-28T16:02:11Z

@tflink Are the tests linking against anything that was built with Clang 14?

Not that I know of, no. AFAIK, everything built with Clang in Fedora was rebuilt with Clang 16 in the F39 mass rebuild last month and my system was updated to Rawhide as of at least 2023-08-14 (and probably more recent but I don't recall off the top of my head) when I submitted this issue.

yxsamliu · 2023-08-28T16:19:30Z

maybe use -O0 -save-temps -v to compile and link that test program. Then we can check the dumped .s file to see what API the caller is using. and use objdump -d with the executable to see what ABI the called function is using. Then we investigate why the callee has wrong ABI.

tstellar · 2023-08-28T16:25:17Z

@tflink Can you post the full buld log from when you built rocFFT?

tflink · 2023-08-28T16:39:03Z

maybe use -O0 -save-temps -v to compile and link that test program. Then we can check the dumped .s file to see what API the caller is using. and use objdump -d with the executable to see what ABI the called function is using. Then we investigate why the callee has wrong ABI.

I've done that and uploaded everything including the cpp file, the output binary and all the bits generated by hipcc.

In particular:

tflink · 2023-08-28T16:42:57Z

@tflink Can you post the full buld log from when you built rocFFT?

I didn't save it when I built rocFFT last but I can rebuild and save the output. Are there any particular options that you'd like to see? I assume -DCMAKE_VERBOSE_MAKEFILE=TRUE , -DCMAKE_BUILD_TYPE=Debug and -DBUILD_VERBOSE=ON but I'm not sure if there are others.

tstellar · 2023-08-28T16:44:06Z

@tflink I actually just wanted to see the build commands used to build the test, and it looks like you posted that in the previous comment.

tflink · 2023-08-28T17:10:43Z

I don't know if this is useful but I did the same test program compile on a debian12 VM that is NOT showing this issue.

I'm the wrong person to ask about the build details but this Debian 12 system appears to have llvm and clang 14.0 available from the debian repos but hipcc --version is showing something different (minus a python traceback which I assume is unrelated):

hipcc --version
HIP version: 5.2.21153-0
Debian clang version 15.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

tstellar · 2023-08-28T17:28:49Z

Would you be able to test in a Fedora 37 container? Fedora 37 has clang 15.0.6 just like debian 12.

tflink · 2023-08-28T17:36:56Z

Would you be able to test in a Fedora 37 container? Fedora 37 has clang 15.0.6 just like debian 12.

I will try to do so but the hipcc package is pretty new and was never built in F37. As I understand it, quite a bit of work went into getting it package-able so I'm not terribly optimistic that it'll work but we'll find out :)

tstellar · 2023-08-28T17:38:59Z

@tflink OK, I wouldn't go to a lot of trouble to get it built then. I thought it would be easy to try.

scchan · 2023-08-28T17:44:08Z

hipcc won't be needed. You could take the test from #439 (comment), which is just a plain CPU program and then compile with --rtlib=compiler-rt at -O2/-O3

tflink · 2023-08-28T18:21:13Z

I attempted to run the tests from the earlier comment on an F37 VM (clang 15.0.7) and got the following output:

$ clang++ -O0 --rtlib=compiler-rt -o float16_clang15_O0 float16.cpp
$ clang++ -O1 --rtlib=compiler-rt -o float16_clang15_O1 float16.cpp
$ clang++ -O2 --rtlib=compiler-rt -o float16_clang15_O2 float16.cpp
$ clang++ -O3 --rtlib=compiler-rt -o float16_clang15_O3 float16.cpp
$ clang++ -Og --rtlib=compiler-rt -o float16_clang15_Og float16.cpp

$ ./float16_clang15_O0
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_O1
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_O2
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_O3
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

$ ./float16_clang15_Og
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

Assuming that I've understood the non-hipcc reproducer correctly, the same issue doesn't show up in F37 and clang 15.

tflink · 2023-08-28T18:29:15Z

I'm going to try rebuilding hipcc on F39 against clang/llvm 15 instead of 16

tflink · 2023-08-28T19:27:58Z

I'm going to try rebuilding hipcc on F39 against clang/llvm 15 instead of 16

This is going to take more effort than I anticipated, if it's even possible. I'll update again if I'm able to make it work.

tstellar · 2023-08-28T20:39:07Z

This looks like a bug in clang 16:

[root@23c79a8c3e0a ~]# clang++ float16.cpp -o libgcc
[root@23c79a8c3e0a ~]# ./libgcc 
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
[root@23c79a8c3e0a ~]# clang++ float16.cpp -rtlib=compiler-rt -o crt
[root@23c79a8c3e0a ~]# ./crt
loop f16=24.625 f32=-856
loop f16=24.625 f32=0
loop f16=24.625 f32=0
loop f16=24.625 f32=0
one f32=50688

tstellar · 2023-08-28T22:14:18Z

I've narrowed this down further, and it looks like there is a bug in Fedora's compiler-rt package.

TorreZuk · 2023-08-29T14:18:42Z

You could add clang flags -mf16c and -D__STDC_WANT_IEC_60559_TYPES_EXT__=1 to see if it helps the compiler but the former may be default on all your clang versions.

tstellar · 2023-08-30T17:46:20Z

@tflink Can you install this build and see if it fixes the issue for you:

dnf install https://kojipkgs.fedoraproject.org//work/tasks/9610/105499610/compiler-rt-16.0.6-2.fc38.x86_64.rpm

tflink · 2023-08-30T20:22:23Z

@tstellar I'll do a full rocFFT build and test cycle and report back either later today or tomorrow since that build+test cycle takes almost 6 hours on my machine.

In the meantime, the fix looks good for the example posted above. On my F40 machine with LLVM16, I'm now seeing the same output that I was seeing on F37 with LLVM15.

$ hipcc -O0 float16.cpp -o float16_hipcc_O0
$ hipcc -O1 float16.cpp -o float16_hipcc_O1
$ hipcc -O2 float16.cpp -o float16_hipcc_O2
$ hipcc -O3 float16.cpp -o float16_hipcc_O3
$ hipcc -Og float16.cpp -o float16_hipcc_Og

$ ./float16_hipcc_O0
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
$ ./float16_hipcc_O1
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
:$ ./float16_hipcc_O2
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
$ ./float16_hipcc_O3
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986
$ ./float16_hipcc_Og
loop f16=0 f32=0
loop f16=0.122986 f32=0.122986
loop f16=0.245972 f32=0.245972
loop f16=0.368896 f32=0.368896
one f32=0.122986

tflink · 2023-08-31T14:39:50Z

@tstellar I rebuilt rocFFT locally with the compiler-rt build you linked and the tests passed; I don't see a single failure in the test output that I captured. Additionally, the half precision stats printed at the end of the test suite look far less suspicious to me.

Random seed: 2591561259
half precision max l-inf epsilon: 0.000801052
half precision max l2 epsilon:     0.0148505
single precision max l-inf epsilon: 1.72013e-07
single precision max l2 epsilon:     5.58241e-06
double precision max l-inf epsilon: 4.86806e-16
double precision max l2 epsilon:     1.58969e-14

Unless someone sees something that I don't, I think this can be closed.

Thank you all for your help in getting this resolved.

evetsso · 2023-08-31T14:48:56Z

@tstellar Do you have a link to the compiler-rt change that fixes this problem? ROCm/rocBLAS#1350 seems to be hitting the same problem, but for Gentoo and I'm wondering how they can get this fix.

From ROCm/rocFFT#439

tstellar · 2023-09-02T23:09:52Z

@evetsso https://src.fedoraproject.org/fork/tstellar/rpms/compiler-rt/blob/0459cbc5d9eb15f1ad51d74707b4988049183708/f/0001-compiler-rt-Fix-FLOAT16-feature-detection.patch

evetsso · 2023-09-05T19:19:04Z

As this issue appears to be caused and fixed upstream, I'm closing it now. Thanks everyone, for your input.

littlewu2508 · 2023-09-10T08:09:05Z

compiler-rt-Fix-FLOAT16-feature-detection.patch

Hello, is this patch submitted to llvm-project upstream?

tstellar · 2023-09-11T15:27:20Z

@littlewu2508 It's not an upstream bug. It was a bug in the Fedora builds.

littlewu2508 · 2023-09-12T00:05:47Z

@littlewu2508 It's not an upstream bug. It was a bug in the Fedora builds.

Sorry, I don't quite understand -- the patch is on upstream's CMakeLists.txt, so I would think the bug is from upstream. Also this patches fix Gentoo's problem ROCm/rocBLAS#1350, so I think this is not specific to one distro. Upstreaming the patch also make it easier to distro packages.

tstellar · 2023-09-12T00:13:08Z

@littlewu2508 Upstream does not support the build configurations that Fedora is using. I'm guessing this is probably also true for Gentoo. So, they won't accept a patch like this.

littlewu2508 · 2023-09-12T00:34:57Z

@littlewu2508 Upstream does not support the build configurations that Fedora is using. I'm guessing this is probably also true for Gentoo. So, they won't accept a patch like this.

Thanks for the explanation! I'll contact Gentoo maintainer @mgorny to help look at this, since this is a major regression with ROCm libraries.

@mgorny Can you please take a look at https://src.fedoraproject.org/fork/tstellar/rpms/compiler-rt/blob/0459cbc5d9eb15f1ad51d74707b4988049183708/f/0001-compiler-rt-Fix-FLOAT16-feature-detection.patch and apply it on compiler-rt package? Gentoo ROCm-5.6 stack is having the same problem and I don't bump the version because of this. It at least affects rocBLAS as ROCm/rocBLAS#1350 reported.

CMAKE_TRY_COMPILE_TARGET_TYPE defaults to EXECUTABLE, which causes any feature detection code snippet without a main function to fail, so we need to make sure it gets explicitly set to STATIC_LIBRARY. Bug: ROCm/rocFFT#439 Bug: ROCm/rocBLAS#1350 Bug: https://bugs.gentoo.org/916069 Closes: #69842 Reviewed by: thesamesam, mgorny

From ROCm/rocFFT#439

tflink changed the title ~~rocFFT Expected Test Suite Results~~ rocFFT Test Suite Fails Aug 22, 2023

tflink closed this as completed Aug 22, 2023

evetsso reopened this Aug 25, 2023

rkamd mentioned this issue Aug 28, 2023

[Bug]: rocBLAS fails tests badly in FP16 for distro packages ROCm/rocBLAS#1350

Open

tstellar added a commit to tstellar/llvm-toolchain-integration-test-suite that referenced this issue Sep 2, 2023

Add a test for _Float16 ABI compatibility between clang and compiler-rt

e5e3b0b

From ROCm/rocFFT#439

tstellar added a commit to tstellar/llvm-toolchain-integration-test-suite that referenced this issue Sep 2, 2023

Add a test for _Float16 ABI compatibility between clang and compiler-rt

dc4c4a7

From ROCm/rocFFT#439

tstellar mentioned this issue Sep 2, 2023

Add a test for _Float16 ABI compatibility between clang and compiler-rt opencollab/llvm-toolchain-integration-test-suite#107

Open

evetsso closed this as completed Sep 5, 2023

This was referenced Oct 19, 2023

ROCm 5.7.1 scilibs gentoo/gentoo#33400

Closed

[compiler-rt][builtins] Fix FLOAT16 feature detection llvm/llvm-project#69842

Open

sylvestre pushed a commit to tstellar/llvm-toolchain-integration-test-suite that referenced this issue May 18, 2024

Add a test for _Float16 ABI compatibility between clang and compiler-rt

9b6e402

From ROCm/rocFFT#439

rocFFT Test Suite Fails #439

rocFFT Test Suite Fails #439

Comments

tflink commented Aug 22, 2023

What is the expected behavior

What actually happens

How to reproduce

Environment

cgmb commented Aug 22, 2023

tflink commented Aug 22, 2023

evetsso commented Aug 22, 2023

tflink commented Aug 22, 2023

evetsso commented Aug 22, 2023 • edited

tflink commented Aug 23, 2023

evetsso commented Aug 23, 2023

evetsso commented Aug 25, 2023

scchan commented Aug 25, 2023

tflink commented Aug 25, 2023

yxsamliu commented Aug 28, 2023

tflink commented Aug 28, 2023

tflink commented Aug 28, 2023

tstellar commented Aug 28, 2023

tflink commented Aug 28, 2023

yxsamliu commented Aug 28, 2023

tstellar commented Aug 28, 2023

tflink commented Aug 28, 2023

tflink commented Aug 28, 2023

tstellar commented Aug 28, 2023

tflink commented Aug 28, 2023

tstellar commented Aug 28, 2023

tflink commented Aug 28, 2023

tstellar commented Aug 28, 2023

scchan commented Aug 28, 2023

tflink commented Aug 28, 2023

tflink commented Aug 28, 2023

tflink commented Aug 28, 2023

tstellar commented Aug 28, 2023

tstellar commented Aug 28, 2023

TorreZuk commented Aug 29, 2023

tstellar commented Aug 30, 2023

tflink commented Aug 30, 2023

tflink commented Aug 31, 2023

evetsso commented Aug 31, 2023

tstellar commented Sep 2, 2023

evetsso commented Sep 5, 2023

littlewu2508 commented Sep 10, 2023

tstellar commented Sep 11, 2023

littlewu2508 commented Sep 12, 2023

tstellar commented Sep 12, 2023

littlewu2508 commented Sep 12, 2023

evetsso commented Aug 22, 2023 •

edited