cublaslt_gemm fp8 does not work with RTX 40 #526

@edisonchan

What's the issue, what's expected?:

Running cublaslt_gemm with -t fp8e4m3 on a GeForce RTX 4090 fails in the cuBLASLt heuristic query with the following error:

cuBLAS call cublasLtMatmulAlgoGetHeuristic(handle_.get(), op_desc_.get(), a_desc_.get(), b_desc_.get(), c_desc_.get(), d_desc_.get(), preference_.get(), max_algorithm_count, results.data(), &found_algorithm_count) failed at /opt/superbench/superbench/benchmarks/micro_benchmarks/cublaslt_gemm/cublaslt_utils.cc:98 'the requested functionality is not supported'

How to reproduce it?:

docker run -it --rm --gpus all -e NVIDIA_VISIBLE_DEVICES=0 --shm-size=1g --ulimit memlock=-1 superbench/superbench:v0.8.0-cuda12.1
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/superbench/lib
ldconfig
cd bin
./cublaslt_gemm -b 64 -m 2560 -n 2560 -k 16384 -i 1000 -t fp8e4m3

Additional information:
OS: Ubuntu 22.04 64-bit
GPU: GeForce RTX 4090 with driver 530.41.03
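
For triage, here is a minimal sketch of how the environment details above could be collected from inside the container. The helper name has_fp8_hardware is hypothetical and not part of SuperBench. It only checks that the reported compute capability is at least 8.9 (Ada), which is where FP8 tensor cores first appear. It does not tell you whether the cuBLASLt build in the image ships FP8 kernels for that architecture; that should be checked against the cuBLAS release notes. The script also assumes an nvidia-smi recent enough to accept the compute_cap query field.

```shell
# Hypothetical helper (not part of SuperBench): succeeds when a compute
# capability string such as "8.9" is at least 8.9, i.e. Ada or newer.
has_fp8_hardware() {
  major=${1%%.*}   # text before the first dot
  minor=${1#*.}    # text after the first dot
  [ "$major" -gt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -ge 9 ]; }
}

# Report GPU name, compute capability and driver version.
# Assumption: this nvidia-smi supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,compute_cap,driver_version --format=csv,noheader
fi

# List the cuBLASLt libraries known to the dynamic linker, if any.
ldconfig -p 2>/dev/null | grep -i cublasLt || echo "libcublasLt not found in linker cache"
```

For the GPU in this report, has_fp8_hardware 8.9 succeeds, so the hardware side is not the obvious blocker; the library version listed by the last command is the next thing to compare.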

Labels: wontfix (This will not be worked on)
