I'm trying to build this library so that I can link it to rocBLAS and have a functioning gemm implementation in HIP (similarly to this PR) for gfx1010/gfx1012.
I saw a commit titled "Kernels now working on gfx1010." and I'm wondering what set of environment variables and compilers is needed to accomplish this.
I'm using Linux Mint 19.3 with ROCm 3.3.0 on an RX 5500 XT (gfx1012) and an RX 5700 XT (gfx1010). The host processor is a Ryzen 9 3900X.
I first tried:
python3 ../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_asm_only.yaml ./
(Attachment: Tensile_default_output.TXT.zip)
Most notably, I don't think that HCC supports gfx1010 and newer, so I see a lot of compiler errors that look like this:
'['/opt/rocm/bin/hcc', '-x', 'assembler', '-target', 'amdgcn-amd-amdhsa', '-mno-code-object-v3', '-mcpu=gfx1010', '-mwavefrontsize64', '-c', '-o', '/home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT64x32x32_SE_AMAS0_EPS1_GRVW1_K1_PGR1_TT8_4_VW1_WG8_8_4_WGM1.o', '/home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT64x32x32_SE_AMAS0_EPS1_GRVW1_K1_PGR1_TT8_4_VW1_WG8_8_4_WGM1.s']' returned non-zero exit status 1.
/home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT32x32x32_SE_AMAS0_EPS0_GRVW1_K1_PGR0_TT4_4_VW1_WG8_8_4_WGM8.s:1461:1: /home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT32x64x32_SE_AMAS0_EPS1_GRVW1_K1_PGR1_TT4_8_VW1_WG8_8_4_WGM1.s:2346:1: error: instruction not supported on this GPU
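Because the kernels are assembled in parallel, the error lines get interleaved as above, which makes the log hard to read. Something like the following could summarize it. This is only a rough sketch: summarize_asm_errors is a hypothetical helper of my own (not part of Tensile), and it assumes every error line ends with "error: <message>" and that kernel sources appear as "<path>.s:<line>:<col>:".

```python
import re
from collections import Counter

# Assumed log shapes, taken from the errors quoted above:
#   "<path>.s:<line>:<col>:"  for a source location
#   "error: <message>"        at the end of a diagnostic line
LOCATION = re.compile(r"(\S+?\.s):(\d+):(\d+):")
ERROR = re.compile(r"error: (.+)$")


def summarize_asm_errors(log_text):
    """Count distinct assembler error messages and collect the .s files named on those lines.

    Hypothetical helper for triage only; it does not call the compiler.
    """
    messages = Counter()
    files = set()
    for line in log_text.splitlines():
        err = ERROR.search(line)
        if not err:
            continue  # not a diagnostic line
        messages[err.group(1).strip()] += 1
        # A single line may mention several files when parallel output interleaves.
        for path, _lineno, _col in LOCATION.findall(line):
            files.add(path.rsplit("/", 1)[-1])
    return messages, files
```

Usage would be along the lines of `messages, files = summarize_asm_errors(open("build.log").read())`, where build.log is whatever file the Tensile output was captured to.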
Additionally, the generated run.sh script tries to set the graphics card clock to an invalid level (it requests sclk level 7, but the card reports a maximum level of 2):
+ /opt/rocm/bin/rocm-smi -d 0 --setfan 255 --setsclk 7
[sudo] password for mihir:
========================ROCm System Management Interface========================
ERROR: GPU[0] : Unable to set clock level
ERROR: GPU[0] : Max clock level is 2
GPU[0] : Successfully set fan control to 'manual'
GPU[0] : Successfully set fan speed to Level 255
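One possible workaround would be to clamp the requested level to what the card reports before calling rocm-smi. A minimal sketch, assuming the "Max clock level is N" wording in the output above is stable; max_clock_level and clamp_sclk are hypothetical names of mine, not anything Tensile or rocm-smi provides.

```python
import re


def max_clock_level(smi_output):
    """Return the maximum clock level reported in rocm-smi output, or None if absent.

    Assumes the 'Max clock level is N' message format seen in the log above.
    """
    match = re.search(r"Max clock level is (\d+)", smi_output)
    return int(match.group(1)) if match else None


def clamp_sclk(requested, smi_output):
    """Clamp a requested sclk level to the reported maximum (hypothetical helper)."""
    limit = max_clock_level(smi_output)
    if limit is None:
        return requested  # no limit reported, keep the request unchanged
    return min(requested, limit)
```

With the output above, the requested level of 7 would be clamped to 2, and that value could then be passed to --setsclk instead.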
I also tried setting --cxx-compiler to hipcc with $HIP_PLATFORM set to clang, but the CMake-generated flags include -hc, which the compiler doesn't recognize. I got a little further by editing TensileCreateLibrary.py to pass -D__HIP_VDI__, manually editing the generated flags.make in the build files, and setting the benchmark config not to build a new client, but I inevitably run into one of the above issues.
Is there a recommended way to directly build a basic gemm kernel in pure HIP that can be used by rocBLAS without using the benchmarking driver program?