Basic build for gfx1010 #926

Open
mihirparadkar opened this issue Apr 14, 2020 · 0 comments


I'm trying to build this library so that I can link it to rocBLAS and get a functioning GEMM implementation in HIP (similar to this PR) for gfx1010/gfx1012.

I saw a commit titled "Kernels now working on gfx1010." and I'm wondering which environment variables and compilers are needed to accomplish this.

I'm using Linux Mint 19.3 with ROCm 3.3.0 on an RX 5500 XT (gfx1012) and an RX 5700 XT (gfx1010). The host processor is a Ryzen 9 3900X.

I first tried:

python3 ../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_asm_only.yaml ./

Attachment: Tensile_default_output.TXT.zip

Most notably, I don't think HCC supports gfx1010 and up, so I see a lot of assembler errors that look like this:

'['/opt/rocm/bin/hcc', '-x', 'assembler', '-target', 'amdgcn-amd-amdhsa', '-mno-code-object-v3', '-mcpu=gfx1010', '-mwavefrontsize64', '-c', '-o', '/home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT64x32x32_SE_AMAS0_EPS1_GRVW1_K1_PGR1_TT8_4_VW1_WG8_8_4_WGM1.o', '/home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT64x32x32_SE_AMAS0_EPS1_GRVW1_K1_PGR1_TT8_4_VW1_WG8_8_4_WGM1.s']' returned non-zero exit status 1.
/home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT32x32x32_SE_AMAS0_EPS0_GRVW1_K1_PGR0_TT4_4_VW1_WG8_8_4_WGM8.s:1461:1: /home/mihir/Programs/TensileOcl/Tensile/build_hip/1_BenchmarkProblems/Cijk_Ailk_Bljk_SB_00/00_BF/sourceTmp/assembly/Cijk_Ailk_Bljk_SB_MT32x64x32_SE_AMAS0_EPS1_GRVW1_K1_PGR1_TT4_8_VW1_WG8_8_4_WGM1.s:2346:1: error: instruction not supported on this GPU
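Because the build runs many assembler jobs at once, the diagnostics can interleave, as in the line above. A minimal sketch for summarizing such a log follows, assuming the clang-style `path.s:line:col: error: message` layout seen in this output. The function name `summarize_asm_errors` and the sample paths are hypothetical and not part of Tensile.

```python
import re
from collections import Counter

# Clang-style diagnostic: "<path>.s:<line>:<col>: error: <message>".
# Assumption: paths contain no spaces or colons, as in the log above.
DIAG = re.compile(
    r"(?P<path>[^\s:]+\.s):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.+)"
)

def summarize_asm_errors(log_text):
    """Count errors per message and list the kernel files that triggered them."""
    counts = Counter()
    files = set()
    for line in log_text.splitlines():
        for m in DIAG.finditer(line):
            counts[m.group("msg").strip()] += 1
            # Keep only the file name; the directory prefix is the same for all kernels.
            files.add(m.group("path").rsplit("/", 1)[-1])
    return counts, sorted(files)

if __name__ == "__main__":
    # Hypothetical, shortened sample in the same shape as the log above.
    sample = (
        "/tmp/asm/kernel_A.s:1461:1: /tmp/asm/kernel_B.s:2346:1: "
        "error: instruction not supported on this GPU\n"
        "/tmp/asm/kernel_C.s:10:5: error: instruction not supported on this GPU\n"
    )
    counts, files = summarize_asm_errors(sample)
    print(counts, files)
```

A location that is cut off by interleaved output (`kernel_A.s` in the sample) is skipped, since no error message can be attributed to it.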

Additionally, the generated run.sh script tries to set the graphics card clock to an invalid level (7, while the maximum level on this card is 2):

+ /opt/rocm/bin/rocm-smi -d 0 --setfan 255 --setsclk 7
[sudo] password for mihir:          


========================ROCm System Management Interface========================
ERROR: GPU[0] 		: Unable to set clock level
ERROR: GPU[0]	: Max clock level is 2
GPU[0] 		: Successfully set fan control to 'manual'
GPU[0] 		: Successfully set fan speed to Level 255
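One possible workaround is to clamp the requested level to the maximum that rocm-smi reports. The sketch below only parses the "Max clock level is N" text shown in the output above; whether other rocm-smi versions print the same message is an assumption, and the helper names `max_clock_level` and `clamp_sclk_level` are hypothetical and not part of Tensile or rocm-smi.

```python
import re

def max_clock_level(smi_output):
    """Return N from a 'Max clock level is N' line, or None if no such line exists."""
    m = re.search(r"Max clock level is (\d+)", smi_output)
    return int(m.group(1)) if m else None

def clamp_sclk_level(requested, smi_output):
    """Lower the requested sclk level to the reported maximum, if one was reported."""
    limit = max_clock_level(smi_output)
    return requested if limit is None else min(requested, limit)

if __name__ == "__main__":
    # Error text copied from the rocm-smi output above.
    smi_output = (
        "ERROR: GPU[0] \t\t: Unable to set clock level\n"
        "ERROR: GPU[0]\t: Max clock level is 2\n"
    )
    print(clamp_sclk_level(7, smi_output))
```

With the output from this card, a requested level of 7 is reduced to 2; if the message is absent, the requested level is returned unchanged.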

I also tried setting --cxx-compiler to hipcc with $HIP_PLATFORM set to clang, but the CMake-generated flags include -hc, which the compiler doesn't recognize. I got a little further by editing TensileCreateLibrary.py to pass in -D__HIP_VDI__, manually editing the generated flags.make in the build files, and setting the benchmark config to not build a new client, but I inevitably ran into one of the above issues.

Is there a recommended way to directly build a basic gemm kernel in pure HIP that can be used by rocBLAS without using the benchmarking driver program?
