Description
Running the si-fix-sgpr-copies pass on the following machine IR changes the %7:vgpr_16 operand of the second V_CNDMASK_B16_t16_e64 into a register of class sgpr_lo16, which is invalid.
test.mir
---
name: sgpr_copy_invalid_type
tracksRegLiveness: true
body: |
  bb.0:
    %0:vgpr_32 = IMPLICIT_DEF
    %1:sreg_32 = IMPLICIT_DEF
    %2:sreg_32_xm0_xexec = IMPLICIT_DEF
    %3:sgpr_lo16 = COPY undef %0.lo16
    %4:sreg_32 = COPY undef %3
    %5:sreg_32 = S_AND_B32 undef %4, killed undef %1, implicit-def dead $scc
    %6:sgpr_32 = S_CVT_F32_F16 killed undef %5, implicit $mode
    %7:vgpr_16 = COPY undef %3
    %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7, 0, 1, killed undef %2, 0, implicit $exec
    S_ENDPGM 0

  bb.2:
    successors: %bb.0(0x80000000)

    S_CMP_LG_U32 killed undef %4, killed undef %4, implicit-def $scc
    S_CBRANCH_SCC1 %bb.0, implicit undef $scc
    S_ENDPGM 0
...
llc invocation
llc -mtriple=amdgcn -mcpu=gfx1150 -mattr=+real-true16 -print-changed=cdiff -run-pass=si-fix-sgpr-copies -debug-only=si-fix-sgpr-copies -verify-machineinstrs test.mir
Error message from machine verifier
*** Bad machine code: Illegal virtual register for instruction ***
- function: sgpr_copy_invalid_type
- basic block: %bb.0 (0x557e9fcd29b8)
- instruction: %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
- operand 2: undef %7:sgpr_lo16
Expected a VS_16 register, but got a SGPR_LO16 register
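When bisecting or reducing a test like this, it can help to pull the register-class mismatch out of the verifier log automatically. The following is a minimal sketch, not part of LLVM: the function name find_class_mismatches and the regular expressions are hypothetical and assume the verifier words its output exactly as in the message above; other LLVM versions may format it differently.

```python
import re

# Assumption: the verifier prints the mismatch exactly as in this report,
# e.g. "Expected a VS_16 register, but got a SGPR_LO16 register".
MISMATCH_RE = re.compile(r"Expected a (\S+) register, but got a (\S+) register")
# Assumption: the offending operand is reported as "- operand N: <text>".
OPERAND_RE = re.compile(r"- operand (\d+):\s*(.+)")

# The verifier message quoted in this report, used as sample input.
SAMPLE_LOG = """\
*** Bad machine code: Illegal virtual register for instruction ***
- function: sgpr_copy_invalid_type
- basic block: %bb.0 (0x557e9fcd29b8)
- instruction: %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
- operand 2: undef %7:sgpr_lo16
Expected a VS_16 register, but got a SGPR_LO16 register
"""


def find_class_mismatches(log_text):
    """Return (operand_index, operand_text, expected_class, actual_class) tuples."""
    results = []
    operand = None  # most recent "- operand N:" line seen, if any
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        operand_match = OPERAND_RE.match(line)
        if operand_match:
            operand = (int(operand_match.group(1)), operand_match.group(2).strip())
            continue
        mismatch = MISMATCH_RE.search(line)
        if mismatch:
            index, text = operand if operand else (None, None)
            results.append((index, text, mismatch.group(1), mismatch.group(2)))
            operand = None  # do not reuse the operand for a later mismatch
    return results


if __name__ == "__main__":
    for entry in find_class_mismatches(SAMPLE_LOG):
        print(entry)
```

In practice the input would be the stderr captured from the llc invocation above rather than the embedded sample.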
Debug output from si-fix-sgpr-copies
V2S copy %3:sgpr_lo16 = COPY undef %0.lo16:vgpr_32
is being turned to v_readfirstlane_b32 Score: 3
*** IR Dump After SI Fix SGPR copies (si-fix-sgpr-copies) on sgpr_copy_invalid_type ***
# Machine code for function sgpr_copy_invalid_type: IsSSA, NoPHIs, TracksLiveness
bb.0:
; predecessors: %bb.1
%0:vgpr_32 = IMPLICIT_DEF
%1:sreg_32 = IMPLICIT_DEF
%2:sreg_32_xm0_xexec = IMPLICIT_DEF
- %3:sgpr_lo16 = COPY undef %0.lo16:vgpr_32
- %4:sreg_32 = COPY undef %3:sgpr_lo16
+ %10:vgpr_16 = IMPLICIT_DEF
+ %9:vgpr_32 = REG_SEQUENCE %0.lo16:vgpr_32, %subreg.lo16, %10:vgpr_16, %subreg.hi16
+ %3:sreg_32_xm0 = V_READFIRSTLANE_B32 %9:vgpr_32, implicit $exec
+ %4:sreg_32 = COPY undef %3:sreg_32_xm0
%5:sreg_32 = S_AND_B32 undef %4:sreg_32, killed undef %1:sreg_32, implicit-def dead $scc
%6:sgpr_32 = S_CVT_F32_F16 killed undef %5:sreg_32, implicit $mode
- %7:vgpr_16 = COPY undef %3:sgpr_lo16
- %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:vgpr_16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
+ %7:sgpr_lo16 = COPY undef %3:sreg_32_xm0
+ %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
S_ENDPGM 0
bb.1:
successors: %bb.0(0x80000000); %bb.0(100.00%)
S_CMP_LG_U32 killed undef %4:sreg_32, killed undef %4:sreg_32, implicit-def $scc
S_CBRANCH_SCC1 %bb.0, implicit undef $scc
S_ENDPGM 0
# End machine code for function sgpr_copy_invalid_type.
Related commit
[AMDGPU][True16][CodeGen] readfirstlane for vgpr16 copy to sgpr32 by @broxigarchen