Use fallback libraries for archs without optimized logic
Fixes #1757.

Enables architectures without optimized logic files to also produce
libraries when `--separate-architectures` or `--lazy-library-loading` is
turned on. Previously, both flags had to be disabled for rocBLAS to run
on architectures like `gfx1010`.

Test plan:
```
cmake -GNinja -B build -S . \
    -DCMAKE_C_COMPILER=hipcc \
    -DCMAKE_CXX_COMPILER=hipcc \
    -DBUILD_CLIENTS_TESTS=OFF \
    -DBUILD_CLIENTS_BENCHMARKS=OFF \
    -DBUILD_CLIENTS_SAMPLES=OFF \
    -DBUILD_TESTING=OFF \
    -DBUILD_WITH_TENSILE=ON \
    -DTensile_PRINT_DEBUG=ON \
    -DTensile_LIBRARY_FORMAT=msgpack \
    -DTensile_CPU_THREADS=14 \
    -DTensile_LAZY_LIBRARY_LOADING=ON \
    -DAMDGPU_TARGETS="..."
```
With `AMDGPU_TARGETS` being one of the following:
- `AMDGPU_TARGETS=gfx1010`
- `AMDGPU_TARGETS=gfx1030;gfx1010`
- `AMDGPU_TARGETS=gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102`

In all three cases,
`$ROCM_PATH/lib/rocblas/library/TensileLibrary_lazy_gfx1010.dat` is produced
and all other `*.dat` files remain unchanged.
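
The check described above can be scripted. The following is a minimal sketch, not part of the commit: the helper names `expected_lazy_libraries` and `missing_libraries` are hypothetical, and it assumes every target follows the same `TensileLibrary_lazy_<arch>.dat` naming that the test plan reports for `gfx1010`, with feature suffixes such as `:xnack-` dropped from the file name.

```python
from pathlib import Path


def expected_lazy_libraries(amdgpu_targets):
    """Return the lazy-library file names assumed for an AMDGPU_TARGETS string."""
    names = []
    for target in amdgpu_targets.split(";"):
        # Drop a feature suffix such as ":xnack-" (assumption: the suffix
        # is not part of the library file name).
        base = target.split(":", 1)[0]
        name = f"TensileLibrary_lazy_{base}.dat"
        if name not in names:
            names.append(name)
    return names


def missing_libraries(library_dir, amdgpu_targets):
    """List the expected file names that are absent from library_dir."""
    root = Path(library_dir)
    return [name for name in expected_lazy_libraries(amdgpu_targets)
            if not (root / name).is_file()]
```

Calling `missing_libraries` on `$ROCM_PATH/lib/rocblas/library` with the same target string passed to CMake should return an empty list if every expected file was produced.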

Signed-off-by: Gavin Zhao <git@gzgz.dev>
GZGavinZhao committed Jan 11, 2024
1 parent d2924ce commit 9fa257d
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions Tensile/TensileCreateLibrary.py
@@ -940,11 +940,18 @@ def generateLogicDataAndSolutions(logicFiles, args):
  # logicData[problemType].append((scheduleName, deviceNames, \
  #     solutionsForSchedule, indexOrder, exactLogic, rangeLogic ))

  (archs, _) = splitArchs()
  if globalParameters["SeparateArchitectures"] or globalParameters["LazyLibraryLoading"]:
    if "fallback" in masterLibraries.keys():
      for key, value in masterLibraries.items():
        if key != "fallback":
          value.merge(deepcopy(masterLibraries["fallback"]))
      for archName in archs:
        archName = archName.split('-', 1)[0]
        if archName not in masterLibraries:
          print1("Using fallback for arch: " + archName)
          masterLibraries[archName] = deepcopy(masterLibraries["fallback"])
          masterLibraries[archName].version = args.version

      masterLibraries.pop("fallback")
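
To make the effect of the hunk easier to see in isolation, here is a self-contained sketch of the same control flow. It is an illustration under stated assumptions, not Tensile's actual code: `ToyLibrary` is a hypothetical stand-in for the master library class, `apply_fallback` is a hypothetical wrapper name, the arch list is passed in directly instead of coming from `splitArchs()`, and the version string is a placeholder.

```python
from copy import deepcopy


class ToyLibrary:
    """Hypothetical stand-in for a master library: a solution list plus a version."""

    def __init__(self, solutions, version=None):
        self.solutions = list(solutions)
        self.version = version

    def merge(self, other):
        # Simplified merge: append the other library's solutions.
        self.solutions.extend(other.solutions)


def apply_fallback(master_libraries, archs, version):
    """Mirror the hunk: merge the fallback into tuned libraries, then give
    every requested arch that has no library its own copy of the fallback."""
    if "fallback" not in master_libraries:
        return master_libraries
    fallback = master_libraries["fallback"]
    for key, value in master_libraries.items():
        if key != "fallback":
            value.merge(deepcopy(fallback))
    for arch_name in archs:
        # Strip a feature suffix such as "-xnack+" to get the base arch name.
        arch_name = arch_name.split("-", 1)[0]
        if arch_name not in master_libraries:
            master_libraries[arch_name] = deepcopy(fallback)
            master_libraries[arch_name].version = version
    master_libraries.pop("fallback")
    return master_libraries


libs = {
    "gfx1030": ToyLibrary(["tuned"]),
    "fallback": ToyLibrary(["generic"]),
}
apply_fallback(libs, ["gfx1030", "gfx1010", "gfx90a-xnack+"], version="0.0.0")
```

In this toy run, `gfx1030` keeps its tuned solution and gains the fallback one, while `gfx1010` and `gfx90a` receive independent copies of the fallback. The `deepcopy` matters: each arch gets its own library object, so setting `version` or merging into one does not affect the others.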
