hipblasdgemm not getting close to peak #1705

JorgeG94 · 2023-04-06T00:29:02Z

What is the expected behavior

I would expect a dgemm with a sizeable input to get close to the 47.9 TFLOP/s peak.

What actually happens

Using the code provided in https://github.com/JorgeG94/calum_performance_tool, rocm/5.4.0 reaches only 38 TFLOP/s.

How to reproduce

The repo https://github.com/JorgeG94/calum_performance_tool has a README with the details, but in short:
hipcc -L/opt/rocm-5.4.3/lib -lhipblas --offload-arch=gfx90a performance.cpp
./a.out 36000 14400 36000 10 T T

Environment

Hardware	description
GPU	MI250x
CPU	AMD Optimized 3rd Gen EPYC

Software	version
ROCM	v5.4.0

JorgeG94 · 2023-04-06T00:34:06Z

I've tried larger sizes, and at some point the code simply fails, without ever exceeding 40 TFLOP/s.

daineAMD · 2023-04-06T15:28:15Z

Hi @JorgeG94, thanks for opening this issue.

hipBLAS is just a wrapper library for the rocBLAS/cuBLAS backends. rocBLAS in turn uses the Tensile library for calls to gemm. Since you're looking for better performance in dgemm, I think it will be best if I transfer this issue to the Tensile library, where they can hopefully help you out. Performance tuning done there will be realized in rocBLAS and in hipBLAS with the AMD backend.

Thanks,
Daine

nakajee · 2023-04-11T17:09:00Z

I will check this on my side.
Does the performance drop happen only with this size?
Have you checked other sizes and/or orientations?

daineAMD pinned this issue Apr 6, 2023

daineAMD unpinned this issue Apr 6, 2023

daineAMD transferred this issue from ROCm/hipBLAS Apr 6, 2023
