Skip to content

RegScavenger ignores earlyclobber when scavenging virtual registers from FIE #172511

@dominik-steenken

Description

@dominik-steenken

This is something i noticed while implementing the global stack protector feature for SystemZ.

When doing frame index elimination, the SystemZ backend may realize that an instruction with a 12-bit displacement may read from a stack slot that is out of reach. In that case, it may insert an lay instruction to materialize the address into a register, or an llilh to prepare an index register, and then make the read relative to that. That introduces new virtual registers post-RA though: one in the operand slot that used to contain the frame index, and one def associated with the lay/llilh inserted above that.

scavengeFrameVirtualRegsInBlock (RegisterScavenging.cpp:402) is then called to savenge physical registers for new VRegs thus introduced. If the original instruction with the 12-bit displacement, which now has the new VReg as an operand, also defines a physical register rX, which is marked early-clobber, then the VReg must not be assigned to rX, in order to avoid an overlap between input and output forbidden by early-clobber.

However, exactly that happens. As stated above, i found this on expanding new pseudoinstructions post-RA for the purpose of implementing the global stack protector, but the same effect can be demonstrated, e.g. with inline asm:

__attribute__((noinline)) uint64_t repro(uint64_t in) {
  // Big char buffer to trigger lay insertion
  volatile char big[16384];

  uint64_t out;

  // Use a memory operand accessing the big buffer
  asm volatile(
      "  lg %0, %1\n\t"
      : "=&r"(out)                  // <-- early-clobber output GPR
      : "m"(big[5000])              // <-- memory operand -> far FI
      : // Broad clobber to force the problematic situation.
        // Correct response would be to spill, but this does not happen.
        "r2","r3","r4","r5","r6","r7","r8",
        "r9","r10","r11","r12","r13","r14",
        "cc","memory");

  return out + in;
}

This will, during Prologue-Epilogue Insertion, insert a LAY instruction before the inline asm block, which targets exactly the physical register that is also marked early-clobber, i.e. the MIR will go from

INLINEASM &"  lg $0, $1\0A\09", [...], def early-clobber renamable $r1d, 262174 /* mem:m */, %stack.0, [...]

to

$r1d = LAY $r15d, 4096, $noreg
INLINEASM &"  lg $0, $1\0A\09", [...] def early-clobber renamable $r1d, 262174 /* mem:m */, killed $r1d, [...]

Looking at RegisterScavenging.cpp, it seems the core of the search for a register to assign is the call to findSurvivorBackwards in RegScavenger::scavengeRegisterBackwards:

  std::pair<MCPhysReg, MachineBasicBlock::iterator> P = findSurvivorBackwards(
      *MRI, std::prev(MBBI), To, LiveUnits, AllocationOrder, RestoreAfter);

Here To is the instruction defining the VReg (LAY above), while MBBI stands on the instruction using the VReg (INLINEASM above). Note that the search starts at std::prev(MBBI), i.e. it does not include the using instruction, and thus cannot include the early-clobber constraint associated with it. The LiveRegUnits instance LiveUnits also does not contain this information.

I have found that both

  • extending the search to include the using instruction (i.e. using MBBI instead of std::prev(MBBI)), and
  • temporarily adding any early clobber def to LiveUnits

solve the problem, with the former breaking an x86 test case in ways that i am not sure is problematic, as well as needing additional safeguards (e.g., to account for cases like MBBI == MBB.end(). But, i am not sure what the best way to proceed here is.

I have a repro that is attached to this issue that demonstrates the inline-asm case outlined above:
repro.tar.gz

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions