Commit
Use fallback libraries for archs without optimized logic
Fixes #1757.

Enables architectures that don't have optimized logic files to also produce libraries when `--separate-architectures` or `--lazy-library-loading` is turned on. Previously, both flags had to be disabled for rocBLAS to run on architectures like `gfx1010`.

Test plan:

```
cmake -GNinja -B build -S . \
  -DCMAKE_C_COMPILER=hipcc \
  -DCMAKE_CXX_COMPILER=hipcc \
  -DBUILD_CLIENTS_TESTS=OFF \
  -DBUILD_CLIENTS_BENCHMARKS=OFF \
  -DBUILD_CLIENTS_SAMPLES=OFF \
  -DBUILD_TESTING=OFF \
  -DBUILD_WITH_TENSILE=ON \
  -DTensile_PRINT_DEBUG=ON \
  -DTensile_LIBRARY_FORMAT=msgpack \
  -DTensile_CPU_THREADS=14 \
  -DTensile_LAZY_LIBRARY_LOADING=ON \
  -DAMDGPU_TARGETS="..."
```

With `AMDGPU_TARGETS` being one of the following:

- `AMDGPU_TARGETS=gfx1010`
- `AMDGPU_TARGETS=gfx1030;gfx1010`
- `AMDGPU_TARGETS=gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102`

In all three cases, `$ROCM_PATH/lib/rocblas/library/TensileLibrary_lazy_gfx1010.dat` is produced and all other `*.dat` files remain unchanged.

Signed-off-by: Gavin Zhao <git@gzgz.dev>
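
The check at the end of the test plan above could be scripted roughly as follows. This is a minimal sketch, not part of the commit: `has_lazy_library` is a hypothetical helper, the file-name pattern is copied from the `gfx1010` path in the test plan, and the `/opt/rocm` default for `ROCM_PATH` is an assumption about the install location.

```shell
#!/bin/sh
# Hypothetical helper (not part of rocBLAS): succeed if the lazy-loading
# library for one architecture exists in the given library directory.
# The TensileLibrary_lazy_<arch>.dat pattern is taken from the test plan.
has_lazy_library() {
    lib_dir=$1
    arch=$2
    [ -f "$lib_dir/TensileLibrary_lazy_$arch.dat" ]
}

# Assumption: ROCm is installed under $ROCM_PATH, defaulting to /opt/rocm.
lib_dir="${ROCM_PATH:-/opt/rocm}/lib/rocblas/library"

if has_lazy_library "$lib_dir" gfx1010; then
    echo "gfx1010: lazy library present"
else
    echo "gfx1010: lazy library missing"
fi
```

Running it after each of the three builds should report the library as present if the fallback is being generated as described.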