
[AMDGPU] kernel regression after enabling amdgpu-uniform-intrinsic-combine #166665

@raiseirql


We (Modular) integrated LLVM commit 2108c623e618265c4146c405f196953a9c157e73 into our build. That commit includes the change that turns on amdgpu-uniform-intrinsic-combine by default (#162819).

The kernel located here regressed as a result: #166657 (comment)

The pass removed some of the readfirstlane intrinsics from this kernel. Later compiler passes then ended up using VGPRs for some of the calculations, which in turn triggered si-fix-sgpr-copies to generate code like this at every buffer load instruction:

.LBB0_9:                                ;   Parent Loop BB0_7 Depth=1
                                        ; =>  This Inner Loop Header: Depth=2
        v_readfirstlane_b32 s4, v0
        v_readfirstlane_b32 s5, v1
        v_readfirstlane_b32 s6, v254
        v_readfirstlane_b32 s7, v255
        v_cmp_eq_u64_e32 vcc, s[4:5], v[0:1]
        s_nop 0
        v_cmp_eq_u64_e64 s[2:3], s[6:7], v[254:255]
        s_and_b64 s[2:3], vcc, s[2:3]
        s_and_saveexec_b64 s[2:3], s[2:3]
        buffer_load_dwordx4 v[6:9], v10, s[4:7], s48 offen
        s_xor_b64 exec, exec, s[2:3]
        s_cbranch_execnz .LBB0_9
; %bb.10:                               ;   in Loop: Header=BB0_7 Depth=1
        s_cmp_lg_u32 s55, 0
        s_mov_b64 exec, s[38:39]
        s_cselect_b32 s55, 1, 0
        s_mov_b64 s[38:39], exec
.LBB0_11:              
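
The sequence above has the shape of what is usually called a waterfall loop: read the resource descriptor from the first active lane, find every lane holding the same value, issue the load for just those lanes, then mask them off and repeat. Below is a minimal Python sketch of that control flow, assuming a simplified model where each lane carries one hashable descriptor value. The function name `waterfall_iterations` and the lane model are illustrative only and are not taken from LLVM.

```python
def waterfall_iterations(descriptors):
    """Model the loop at .LBB0_9 for one wave.

    descriptors: per-lane descriptor values (hypothetical model; in the
    listing this is the 128-bit value held in v[0:1] and v[254:255]).
    Returns (iteration_count, groups of lanes serviced per iteration).
    """
    active = list(range(len(descriptors)))  # lanes still enabled in exec
    serviced = []
    iterations = 0
    while active:
        # v_readfirstlane_b32: take the value from the first active lane
        first = descriptors[active[0]]
        # v_cmp_eq_u64 + s_and_saveexec_b64: keep lanes that match it
        matching = [lane for lane in active if descriptors[lane] == first]
        # buffer_load_dwordx4 executes only for the matching lanes
        serviced.append(matching)
        # s_xor_b64 exec: drop the serviced lanes, loop if any remain
        active = [lane for lane in active if descriptors[lane] != first]
        iterations += 1
    return iterations, serviced


if __name__ == "__main__":
    # A descriptor that is actually uniform still pays for one full trip.
    print(waterfall_iterations([7] * 64)[0])
    # Divergent descriptors cost one trip per distinct value.
    print(waterfall_iterations([1, 2, 1, 3])[0])
```

In this model, a descriptor that is in fact uniform needs only a single trip through the loop, but each buffer load still pays for the readfirstlane, compare, and exec-mask instructions that a descriptor kept in SGPRs would not need.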

We have worked around this locally by turning off the pass, but the pass has value and we would like to keep it enabled. Either the original determination that these values were uniform was wrong, or later instruction selection should have picked a scalar form of some instruction to avoid generating the above sequence.

I compared the code generated by these two invocations:

llc -O3 -mcpu=gfx950 mha.ll --amdgpu-enable-uniform-intrinsic-combine=false
llc -O3 -mcpu=gfx950 mha.ll --amdgpu-enable-uniform-intrinsic-combine=true
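
One rough way to quantify the difference between the two outputs is to count instruction mnemonics in each assembly file and report the ones whose counts changed. The sketch below assumes the two commands were run with `-o off.s` and `-o on.s` added; those file names and the helper names are hypothetical, not part of the original report.

```python
from collections import Counter


def count_mnemonics(asm_text):
    """Count instruction mnemonics in AMDGPU assembly text.

    Skips comments (after ';'), labels (ending in ':'), and
    assembler directives (starting with '.').
    """
    counts = Counter()
    for line in asm_text.splitlines():
        code = line.split(";", 1)[0].strip()
        if not code or code.endswith(":") or code.startswith("."):
            continue
        counts[code.split()[0]] += 1
    return counts


def mnemonic_delta(asm_off, asm_on):
    """Return {mnemonic: count_on - count_off} for mnemonics that changed."""
    off, on = count_mnemonics(asm_off), count_mnemonics(asm_on)
    return {m: on[m] - off[m] for m in set(off) | set(on) if on[m] != off[m]}


def compare_files(path_off="off.s", path_on="on.s"):
    """Print the changed mnemonics; the default paths are assumptions."""
    with open(path_off) as f_off, open(path_on) as f_on:
        delta = mnemonic_delta(f_off.read(), f_on.read())
    for mnemonic, change in sorted(delta.items(), key=lambda kv: -abs(kv[1])):
        print(f"{mnemonic:32s} {change:+d}")
```

A large positive delta for `v_readfirstlane_b32` and `s_and_saveexec_b64` in the `=true` output would be consistent with the loop shown earlier being emitted around each buffer load.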
