[AMDGPU][True16] si-fix-sgpr-copies: invalid sgpr_lo16 copy destination #144561

Open
@frederik-h

Description

Running the si-fix-sgpr-copies pass on the following Machine IR changes the register class of %7, which is used as an operand of V_CNDMASK_B16_t16_e64, from vgpr_16 to sgpr_lo16. This is invalid: the machine verifier expects a VS_16 register for that operand.

test.mir

---
name: sgpr_copy_invalid_type
tracksRegLiveness: true
body:             |
  bb.0:
    %0:vgpr_32 = IMPLICIT_DEF
    %1:sreg_32 = IMPLICIT_DEF
    %2:sreg_32_xm0_xexec = IMPLICIT_DEF

    %3:sgpr_lo16 = COPY undef %0.lo16
    %4:sreg_32 = COPY undef %3
    %5:sreg_32 = S_AND_B32 undef %4, killed undef %1, implicit-def dead $scc
    %6:sgpr_32 = S_CVT_F32_F16 killed undef %5, implicit $mode

    %7:vgpr_16 = COPY undef %3
    %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7, 0, 1, killed undef %2, 0, implicit $exec

    S_ENDPGM 0

  bb.2:
    successors: %bb.0(0x80000000)

    S_CMP_LG_U32 killed undef %4, killed undef %4, implicit-def $scc
    S_CBRANCH_SCC1 %bb.0, implicit undef $scc
    S_ENDPGM 0
...

llc invocation

llc -mtriple=amdgcn -mcpu=gfx1150 -mattr=+real-true16 -print-changed=cdiff -run-pass=si-fix-sgpr-copies -debug-only=si-fix-sgpr-copies -verify-machineinstrs test.mir

Error message from machine verifier

*** Bad machine code: Illegal virtual register for instruction ***
- function:    sgpr_copy_invalid_type
- basic block: %bb.0  (0x557e9fcd29b8)
- instruction: %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
- operand 2:   undef %7:sgpr_lo16
Expected a VS_16 register, but got a SGPR_LO16 register
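
For triage, the verifier report above can be broken into its fields mechanically. The following is a minimal sketch, not part of LLVM: the helper name `parse_verifier_report` is hypothetical, and it assumes the report keeps exactly the line layout shown in this issue.

```python
import re

# Verifier report copied verbatim from this issue.
REPORT = """\
*** Bad machine code: Illegal virtual register for instruction ***
- function:    sgpr_copy_invalid_type
- basic block: %bb.0  (0x557e9fcd29b8)
- instruction: %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
- operand 2:   undef %7:sgpr_lo16
Expected a VS_16 register, but got a SGPR_LO16 register
"""


def parse_verifier_report(text):
    """Hypothetical helper: extract the key fields of one verifier report."""
    fields = {}
    for line in text.splitlines():
        # Header line: "*** Bad machine code: <message> ***"
        header = re.match(r"\*\*\* Bad machine code: (.+) \*\*\*$", line)
        if header:
            fields["error"] = header.group(1)
            continue
        # Detail lines: "- <key>: <value>"
        item = re.match(r"- ([^:]+):\s+(.*)$", line)
        if item:
            fields[item.group(1)] = item.group(2)
            continue
        # Final line naming the expected and the actual register class.
        classes = re.match(
            r"Expected a (\S+) register, but got a (\S+) register$", line
        )
        if classes:
            fields["expected"], fields["got"] = classes.groups()
    return fields


if __name__ == "__main__":
    report = parse_verifier_report(REPORT)
    print(report["operand 2"], "expected", report["expected"], "got", report["got"])
```

For the report in this issue, the parsed fields show that operand 2 (`undef %7:sgpr_lo16`) has class SGPR_LO16 where VS_16 is expected.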

Debug output from si-fix-sgpr-copies

V2S copy %3:sgpr_lo16 = COPY undef %0.lo16:vgpr_32
 is being turned to v_readfirstlane_b32 Score: 3
*** IR Dump After SI Fix SGPR copies (si-fix-sgpr-copies) on sgpr_copy_invalid_type ***
 # Machine code for function sgpr_copy_invalid_type: IsSSA, NoPHIs, TracksLiveness
 bb.0:
 ; predecessors: %bb.1

   %0:vgpr_32 = IMPLICIT_DEF
   %1:sreg_32 = IMPLICIT_DEF
   %2:sreg_32_xm0_xexec = IMPLICIT_DEF
-  %3:sgpr_lo16 = COPY undef %0.lo16:vgpr_32
-  %4:sreg_32 = COPY undef %3:sgpr_lo16
+  %10:vgpr_16 = IMPLICIT_DEF
+  %9:vgpr_32 = REG_SEQUENCE %0.lo16:vgpr_32, %subreg.lo16, %10:vgpr_16, %subreg.hi16
+  %3:sreg_32_xm0 = V_READFIRSTLANE_B32 %9:vgpr_32, implicit $exec
+  %4:sreg_32 = COPY undef %3:sreg_32_xm0
   %5:sreg_32 = S_AND_B32 undef %4:sreg_32, killed undef %1:sreg_32, implicit-def dead $scc
   %6:sgpr_32 = S_CVT_F32_F16 killed undef %5:sreg_32, implicit $mode
-  %7:vgpr_16 = COPY undef %3:sgpr_lo16
-  %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:vgpr_16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
+  %7:sgpr_lo16 = COPY undef %3:sreg_32_xm0
+  %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
   S_ENDPGM 0

 bb.1:
   successors: %bb.0(0x80000000); %bb.0(100.00%)

   S_CMP_LG_U32 killed undef %4:sreg_32, killed undef %4:sreg_32, implicit-def $scc
   S_CBRANCH_SCC1 %bb.0, implicit undef $scc
   S_ENDPGM 0

 # End machine code for function sgpr_copy_invalid_type.
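
To see at a glance which virtual registers were given a different register class by the pass, the changed lines of the cdiff can be compared by definition. This is a rough sketch under stated assumptions: the helper `changed_def_classes` and the regular expression are hypothetical, and they only handle single-definition lines of the form `%N:class = ...` as they appear in the dump above.

```python
import re

# Changed lines copied from the -print-changed=cdiff output in this issue.
CDIFF = """\
-  %3:sgpr_lo16 = COPY undef %0.lo16:vgpr_32
-  %4:sreg_32 = COPY undef %3:sgpr_lo16
+  %10:vgpr_16 = IMPLICIT_DEF
+  %9:vgpr_32 = REG_SEQUENCE %0.lo16:vgpr_32, %subreg.lo16, %10:vgpr_16, %subreg.hi16
+  %3:sreg_32_xm0 = V_READFIRSTLANE_B32 %9:vgpr_32, implicit $exec
+  %4:sreg_32 = COPY undef %3:sreg_32_xm0
-  %7:vgpr_16 = COPY undef %3:sgpr_lo16
-  %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:vgpr_16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
+  %7:sgpr_lo16 = COPY undef %3:sreg_32_xm0
+  %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
"""

# Matches a removed or added defining line such as "-  %7:vgpr_16 = COPY ...".
DEF_RE = re.compile(r"^([+-])\s+(%\d+):(\w+) = ")


def changed_def_classes(cdiff):
    """Hypothetical helper: map each redefined vreg to (old class, new class)."""
    before, after = {}, {}
    for line in cdiff.splitlines():
        match = DEF_RE.match(line)
        if match:
            sign, reg, reg_class = match.groups()
            (before if sign == "-" else after)[reg] = reg_class
    # Keep only registers defined on both sides whose class differs.
    return {
        reg: (before[reg], after[reg])
        for reg in before
        if reg in after and before[reg] != after[reg]
    }


if __name__ == "__main__":
    for reg, (old, new) in changed_def_classes(CDIFF).items():
        print(f"{reg}: {old} -> {new}")
```

On the dump above this reports two changes: %3 goes from sgpr_lo16 to sreg_32_xm0 (the V_READFIRSTLANE_B32 result), and %7 goes from vgpr_16 to sgpr_lo16, which is the class the verifier rejects.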

Related commit

[AMDGPU][True16][CodeGen] readfirstlane for vgpr16 copy to sgpr32 by @broxigarchen
