Merged
6 changes: 6 additions & 0 deletions llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -1868,6 +1868,12 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
    return GFX1250Insts && getGeneration() == GFX12;
  }

  // src_flat_scratch_hi cannot be used as a source in SALU producing a 64-bit
  // result.
  bool hasFlatScratchHiInB64InstHazard() const {
    return GFX1250Insts && getGeneration() == GFX12;
  }

  /// \returns true if the subtarget supports clusters of workgroups.
  bool hasClusters() const { return HasClusters; }

11 changes: 11 additions & 0 deletions llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -6256,6 +6256,17 @@ bool SIInstrInfo::isLegalRegOperand(const MachineInstr &MI, unsigned OpIdx,
      (int)OpIdx == AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::src0) &&
      RI.isSGPRReg(MRI, MO.getReg()))
    return false;

  if (ST.hasFlatScratchHiInB64InstHazard() &&
      MO.getReg() == AMDGPU::SRC_FLAT_SCRATCH_BASE_HI && isSALU(MI)) {
    if (const MachineOperand *Dst = getNamedOperand(MI, AMDGPU::OpName::sdst)) {
      if (AMDGPU::getRegBitWidth(*MRI.getRegClass(Dst->getReg())) == 64)

Contributor:

This should be a register class check without assuming this is a virtual register.

Also should cover this in a verifier test?

Collaborator Author:

> This should be a register class check without assuming this is a virtual register.

Indeed: #170395

> Also should cover this in a verifier test?

Why? We do not do it for other hazards.

Contributor:

We do and should cover missed cases. The operand legality is primarily a verification function.

Collaborator Author:

Well, shouldn't we then just iterate over all operands in verifyInstruction() and call isOperandLegal()? Why duplicate the code?
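
The suggestion above (walk every operand in the verifier and reuse one legality predicate instead of duplicating checks) can be sketched in isolation. This is a minimal standalone model, not LLVM code: `Operand`, `Inst`, `LegalityFn`, `findIllegalOperand`, and `rejectRegister` are hypothetical names invented for illustration, and the real `verifyInstruction()` and `isOperandLegal()` have different signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM classes.
struct Operand {
  bool IsReg;   // true for register operands, false for immediates
  unsigned Reg; // register number, meaningful only when IsReg is true
};

struct Inst {
  std::vector<Operand> Operands;
};

// A single legality predicate shared by transformation and verification code.
using LegalityFn = std::function<bool(const Inst &, std::size_t)>;

// Verifier-style loop: returns the index of the first operand the predicate
// rejects, or -1 when every operand is accepted.
int findIllegalOperand(const Inst &MI, const LegalityFn &IsOperandLegal) {
  for (std::size_t I = 0; I < MI.Operands.size(); ++I)
    if (!IsOperandLegal(MI, I))
      return static_cast<int>(I);
  return -1;
}

// Example predicate (an assumption): reject one specific register.
LegalityFn rejectRegister(unsigned BannedReg) {
  return [BannedReg](const Inst &MI, std::size_t Idx) {
    assert(Idx < MI.Operands.size() && "operand index out of range");
    const Operand &MO = MI.Operands[Idx];
    return !(MO.IsReg && MO.Reg == BannedReg);
  };
}
```

With this shape, a new hazard only has to be taught to the predicate; the loop picks it up without a second copy of the rule.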

        return false;
    }
    if (Opc == AMDGPU::S_BITCMP0_B64 || Opc == AMDGPU::S_BITCMP1_B64)
      return false;
  }

  return true;
}
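
The rule added in this hunk can be read as a small decision function. The sketch below restates it as standalone C++, assuming simplified stand-in types: `Opcode`, `Reg`, `ScalarInst`, and `isLegalScalarSource` are hypothetical and are not part of LLVM. A destination width of 0 models an instruction without an `sdst` operand, which is how `S_BITCMP0_B64` and `S_BITCMP1_B64` appear in the tests in this PR.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; none of these are LLVM types.
enum class Opcode {
  S_ASHR_I64,
  S_LSHL_B32,
  S_BITCMP0_B32,
  S_BITCMP0_B64,
  S_BITCMP1_B64
};

enum class Reg { SGPR0, SRC_FLAT_SCRATCH_BASE_HI };

struct ScalarInst {
  Opcode Opc;
  unsigned SDstBits; // width of sdst in bits; 0 when there is no sdst
};

// Returns false when reading Src from this scalar instruction would hit the
// hazard modelled here; otherwise returns true.
bool isLegalScalarSource(bool HasHazard, const ScalarInst &MI, Reg Src) {
  assert((MI.SDstBits == 0 || MI.SDstBits == 32 || MI.SDstBits == 64) &&
         "unexpected sdst width in this toy model");

  // Subtargets without the hazard, and all other registers, are unaffected.
  if (!HasHazard || Src != Reg::SRC_FLAT_SCRATCH_BASE_HI)
    return true;

  // Case 1: the instruction writes a 64-bit scalar destination.
  if (MI.SDstBits == 64)
    return false;

  // Case 2: 64-bit bit-compare opcodes, rejected by opcode.
  if (MI.Opc == Opcode::S_BITCMP0_B64 || MI.Opc == Opcode::S_BITCMP1_B64)
    return false;

  return true;
}
```

The opcode check is separate from the width check because the bit-compare instructions have no destination register whose width could be inspected.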

145 changes: 145 additions & 0 deletions llvm/test/CodeGen/AMDGPU/hazard-gfx1250-flat-scr-hi.mir
@@ -0,0 +1,145 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 6
# RUN: llc -mtriple=amdgcn -mcpu=gfx1250 -run-pass si-fold-operands %s -o - | FileCheck -check-prefix=GCN %s

---
name: s_ashr_i64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_ashr_i64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_ASHR_I64_:%[0-9]+]]:sreg_64 = S_ASHR_I64 undef %2:sreg_64, [[COPY]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %2:sreg_64 = S_ASHR_I64 undef %1:sreg_64, %0, implicit-def $scc

Contributor:

Can you avoid using undef operands in SSA tests? I want to eventually ban that in the verifier.

...

---
name: s_lshl_b64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_lshl_b64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_LSHL_B64_:%[0-9]+]]:sreg_64 = S_LSHL_B64 undef %2:sreg_64, [[COPY]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %2:sreg_64 = S_LSHL_B64 undef %1:sreg_64, %0, implicit-def $scc
...

---
name: s_lshr_b64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_lshr_b64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_LSHR_B64_:%[0-9]+]]:sreg_64 = S_LSHR_B64 undef %2:sreg_64, [[COPY]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %2:sreg_64 = S_LSHR_B64 undef %1:sreg_64, %0, implicit-def $scc
...

---
name: s_bfe_i64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bfe_i64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_BFE_I64_:%[0-9]+]]:sreg_64 = S_BFE_I64 undef %2:sreg_64, [[COPY]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %2:sreg_64 = S_BFE_I64 undef %1:sreg_64, %0, implicit-def $scc
...

---
name: s_bfe_u64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bfe_u64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_BFE_U64_:%[0-9]+]]:sreg_64 = S_BFE_U64 undef %2:sreg_64, [[COPY]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %2:sreg_64 = S_BFE_U64 undef %1:sreg_64, %0, implicit-def $scc
...

---
name: s_bfm_b64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bfm_b64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_BFM_B64_:%[0-9]+]]:sreg_64 = S_BFM_B64 [[COPY]], 1, implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %1:sreg_64 = S_BFM_B64 %0, 1, implicit-def $scc
...

---
name: s_bitcmp0_b64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bitcmp0_b64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: S_BITCMP0_B64 undef %1:sreg_64, [[COPY]], implicit undef $scc, implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    S_BITCMP0_B64 undef %1:sreg_64, %0, implicit undef $scc, implicit-def $scc
...

---
name: s_bitcmp1_b64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bitcmp1_b64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: S_BITCMP1_B64 undef %1:sreg_64, [[COPY]], implicit undef $scc, implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    S_BITCMP1_B64 undef %1:sreg_64, %0, implicit undef $scc, implicit-def $scc
...

---
name: s_bitreplicate_b64_b32
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bitreplicate_b64_b32
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_BITREPLICATE_B64_B32_:%[0-9]+]]:sreg_64 = S_BITREPLICATE_B64_B32 [[COPY]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %2:sreg_64 = S_BITREPLICATE_B64_B32 %0, implicit-def $scc
...

---
name: s_bitset0_b64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bitset0_b64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_BITSET0_B64_:%[0-9]+]]:sreg_64 = S_BITSET0_B64 [[COPY]], undef [[S_BITSET0_B64_]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %1:sreg_64 = S_BITSET0_B64 %0, undef %1:sreg_64, implicit-def $scc
...

---
name: s_bitset1_b64
tracksRegLiveness: true
body: |
  bb.0:
    ; GCN-LABEL: name: s_bitset1_b64
    ; GCN: [[COPY:%[0-9]+]]:sreg_32 = COPY $src_flat_scratch_base_hi
    ; GCN-NEXT: [[S_BITSET1_B64_:%[0-9]+]]:sreg_64 = S_BITSET1_B64 [[COPY]], undef [[S_BITSET1_B64_]], implicit-def $scc
    %0:sreg_32 = COPY $src_flat_scratch_base_hi
    %1:sreg_64 = S_BITSET1_B64 %0, undef %1:sreg_64, implicit-def $scc
...
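
Every CHECK block in this test expects the `COPY` from `$src_flat_scratch_base_hi` to survive `si-fold-operands`, with the user still reading the virtual register. The toy model below illustrates that expectation in standalone C++. It is only a sketch: `User`, `canReadFlatScratchHi`, and `tryFoldCopy` are hypothetical names, operands are plain strings, and the assumption that a non-64-bit user would accept the fold is made for contrast and is not something this test exercises.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of an instruction that reads one foldable operand.
struct User {
  std::string Mnemonic;   // e.g. "S_LSHL_B64"
  std::string SrcOperand; // e.g. "%0" or "$src_flat_scratch_base_hi"
  unsigned ResultBits;    // width of the register result; 0 when there is none
};

// Toy legality rule: 64-bit results and the 64-bit bit-compare opcodes must
// not read src_flat_scratch_base_hi directly.
bool canReadFlatScratchHi(const User &U) {
  const bool IsBitCmp64 =
      U.Mnemonic == "S_BITCMP0_B64" || U.Mnemonic == "S_BITCMP1_B64";
  return U.ResultBits != 64 && !IsBitCmp64;
}

// Models folding `CopyDst = COPY $src_flat_scratch_base_hi` into U.
// Returns true and rewrites the operand only when the fold is legal;
// otherwise U keeps reading the copy's virtual register.
bool tryFoldCopy(User &U, const std::string &CopyDst) {
  assert(!CopyDst.empty() && "copy destination must be named");
  if (U.SrcOperand != CopyDst || !canReadFlatScratchHi(U))
    return false;
  U.SrcOperand = "$src_flat_scratch_base_hi";
  return true;
}
```

In this model, a user such as `S_LSHL_B64` keeps `%0` as its operand after the attempted fold, which corresponds to the `[[COPY]]` operand in the GCN-NEXT lines above.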