[AMDGPU] Fix image_msaa_load waitcnt insertion for pre-gfx12
llvm#90201 made some fixes for gfx12 image_msaa_load waitcnt insertion.
That fix might break in some situations for pre-gfx12. This commit fixes
that by explicitly checking for VSAMPLE, which always requires an
s_wait_samplecnt, and leaves the previous logic intact for non-gfx12.
dstutt committed May 1, 2024
1 parent a051425 commit 480d565
Showing 2 changed files with 7 additions and 6 deletions.
12 changes: 6 additions & 6 deletions llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -187,12 +187,12 @@ VmemType getVmemType(const MachineInstr &Inst) {
const AMDGPU::MIMGInfo *Info = AMDGPU::getMIMGInfo(Inst.getOpcode());
const AMDGPU::MIMGBaseOpcodeInfo *BaseInfo =
AMDGPU::getMIMGBaseOpcodeInfo(Info->BaseOpcode);
-  // The test for MSAA here is because gfx12+ image_msaa_load is actually
-  // encoded as VSAMPLE and requires the appropriate s_waitcnt variant for that.
-  // Pre-gfx12 doesn't care since all vmem types result in the same s_waitcnt.
-  return BaseInfo->BVH ? VMEM_BVH
-         : BaseInfo->Sampler || BaseInfo->MSAA ? VMEM_SAMPLER
-                                               : VMEM_NOSAMPLER;
+  // We have to make an additional check for isVSAMPLE here since some
+  // instructions don't have a sampler, but are still classified as sampler
+  // instructions for the purposes of e.g. waitcnt.
+  return BaseInfo->BVH ? VMEM_BVH
+         : (BaseInfo->Sampler || SIInstrInfo::isVSAMPLE(Inst)) ? VMEM_SAMPLER
+                                                               : VMEM_NOSAMPLER;
}

unsigned &getCounterRef(AMDGPU::Waitcnt &Wait, InstCounterType T) {
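
The effect of the hunk above can be sketched outside of LLVM. The following is a minimal, self-contained model, not the real code: `ImageInst` and its boolean fields are hypothetical stand-ins for `MIMGBaseOpcodeInfo` and `SIInstrInfo::isVSAMPLE`, and the two sample instructions assume (as the old and new comments suggest) that image_msaa_load is encoded as VSAMPLE on gfx12 but not on gfx11.

```cpp
#include <cassert>

// Mirrors the names used in SIInsertWaitcnts.cpp; everything else here is
// a simplified, hypothetical stand-in for the real LLVM data structures.
enum VmemType { VMEM_NOSAMPLER, VMEM_SAMPLER, VMEM_BVH };

struct ImageInst {
  bool BVH;       // stands in for BaseInfo->BVH
  bool Sampler;   // stands in for BaseInfo->Sampler
  bool MSAA;      // stands in for BaseInfo->MSAA
  bool IsVSAMPLE; // stands in for SIInstrInfo::isVSAMPLE(Inst)
};

// Classification as it was before this commit: any MSAA instruction is
// treated as a sampler instruction, regardless of target.
constexpr VmemType classifyBefore(const ImageInst &I) {
  return I.BVH ? VMEM_BVH
         : (I.Sampler || I.MSAA) ? VMEM_SAMPLER
                                 : VMEM_NOSAMPLER;
}

// Classification after this commit: only the encoding decides whether a
// sampler-less instruction is counted as a sampler instruction.
constexpr VmemType classifyAfter(const ImageInst &I) {
  return I.BVH ? VMEM_BVH
         : (I.Sampler || I.IsVSAMPLE) ? VMEM_SAMPLER
                                      : VMEM_NOSAMPLER;
}

// Assumed properties of image_msaa_load on the two targets.
constexpr ImageInst MsaaLoadGfx11{false, false, true, false};
constexpr ImageInst MsaaLoadGfx12{false, false, true, true};

static_assert(classifyBefore(MsaaLoadGfx11) == VMEM_SAMPLER,
              "old logic groups the gfx11 MSAA load with sampler loads");
static_assert(classifyAfter(MsaaLoadGfx11) == VMEM_NOSAMPLER,
              "new logic restores the pre-gfx12 classification");
static_assert(classifyAfter(MsaaLoadGfx12) == VMEM_SAMPLER,
              "gfx12 MSAA load is still treated as a sampler instruction");
```

Under these assumptions, the gfx12 behaviour is unchanged, while a gfx11 image_msaa_load is once again a different VMEM type from a preceding image_sample.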
1 change: 1 addition & 0 deletions llvm/test/CodeGen/AMDGPU/waitcnt-sample-waw.mir
@@ -13,6 +13,7 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: S_WAITCNT 0
; GFX11-NEXT: renamable $vgpr0_vgpr1_vgpr2_vgpr3 = IMAGE_SAMPLE_V4_V1_gfx11 killed renamable $vgpr0, renamable $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, killed renamable $sgpr8_sgpr9_sgpr10_sgpr11, 15, 0, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load (s128), addrspace 8)
+ ; GFX11-NEXT: S_WAITCNT 1015
; GFX11-NEXT: renamable $vgpr0_vgpr1_vgpr2_vgpr3 = IMAGE_MSAA_LOAD_V4_V2_gfx11 killed renamable $vgpr4_vgpr5, killed renamable $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7, 4, 7, -1, 0, 0, -1, 0, 0, 0, implicit $exec :: (dereferenceable load (s128), addrspace 8)
; GFX11-NEXT: S_WAITCNT 1015
; GFX11-NEXT: SI_RETURN_TO_EPILOG killed $vgpr0, killed $vgpr1, killed $vgpr2, killed $vgpr3
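
To read the `S_WAITCNT` immediates in the check lines above, it helps to split them into their counter fields. The sketch below assumes the gfx11 `s_waitcnt` layout is vmcnt in bits 15:10, lgkmcnt in bits 9:4 and expcnt in bits 2:0; that layout is not stated in this commit and should be checked against the ISA documentation or LLVM's AMDGPU utilities. `DecodedWaitcnt`, `field` and `decodeGfx11` are hypothetical helper names, not LLVM API.

```cpp
#include <cassert>

// Hypothetical container for the three gfx11 s_waitcnt counter fields.
struct DecodedWaitcnt {
  unsigned VmCnt;
  unsigned ExpCnt;
  unsigned LgkmCnt;
};

// Extract Width bits of Imm starting at bit Shift.
constexpr unsigned field(unsigned Imm, unsigned Shift, unsigned Width) {
  return (Imm >> Shift) & ((1u << Width) - 1u);
}

// Assumed gfx11 layout: vmcnt [15:10], lgkmcnt [9:4], expcnt [2:0].
constexpr DecodedWaitcnt decodeGfx11(unsigned Imm) {
  return DecodedWaitcnt{field(Imm, 10, 6), field(Imm, 0, 3),
                        field(Imm, 4, 6)};
}

// 1015 == 0x3F7: vmcnt is 0 while the other counters are at their maximum,
// i.e. only outstanding vector memory operations are waited on.
static_assert(decodeGfx11(1015).VmCnt == 0, "vmcnt(0)");
static_assert(decodeGfx11(1015).ExpCnt == 7, "expcnt left at maximum");
static_assert(decodeGfx11(1015).LgkmCnt == 63, "lgkmcnt left at maximum");
```

With that layout, `S_WAITCNT 1015` is a wait on vmcnt only, and `S_WAITCNT 0` waits for all three counters to reach zero.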