Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64][SVE] Failed to reserve a spill slot for an address of an SVE load/store #60918

Closed
ytmukai opened this issue Feb 22, 2023 · 3 comments
Closed
Labels
backend:AArch64 crash Prefer [crash-on-valid] or [crash-on-invalid] SVE ARM Scalable Vector Extensions

Comments

@ytmukai
Copy link
Contributor

ytmukai commented Feb 22, 2023

https://llvm.godbolt.org/z/E394ec4r3

A spill slot for a scratch register for an address of an SVE load/store may not be reserved, resulting in the following crash:

LLVM ERROR: Error while trying to spill X8 from class GPR64: Cannot scavenge register without an emergency spill slot!

Detailed conditions are as follows:

  • SVE load/store instruction
  • Accessing SP relative addresses
  • The offset exceeds the immediate range (i.e. requires a scratch register to generate the final address)
  • Requires a spill to allocate the scratch register
  • The stack object is not for scalable vector (i.e. StackID != TargetStackID::ScalableVector)
  • The size of the stack frame does not exceed the range of the immediate offset of the normal load/store instructions

The instruction LD1W_IMM in the code below met the above conditions.

; llc bug.llc --stop-before=prologepilog

  bb.2.vec.epilog.ph178:
    liveins: $p0, $p1, $s0, $w8, $w9, $w10, $w11, $w12, $w13, $w23, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x24, $x25, $z1

    STRWui $wzr, %stack.0, 7 :: (store (s32) into %stack.0 + 28)
    STRWui $wzr, %stack.0, 6 :: (store (s32) into %stack.0 + 24, align 8)
    STRWui $wzr, %stack.0, 5 :: (store (s32) into %stack.0 + 20)
    STRWui $wzr, %stack.0, 4 :: (store (s32) into %stack.0 + 16, align 16)
    STRWui $wzr, %stack.0, 3 :: (store (s32) into %stack.0 + 12)
    renamable $x26 = COPY renamable $x22
    STRWui $wzr, %stack.0, 2 :: (store (s32) into %stack.0 + 8, align 8)
    STRWui $wzr, %stack.0, 1 :: (store (s32) into %stack.0 + 4)
    renamable $x27 = COPY renamable $x19
    STRSui renamable $s0, %stack.0, 0 :: (store (s32) into %stack.0, align 32)
    renamable $x28 = COPY renamable $x16
    renamable $lr = COPY renamable $x6
    dead renamable $z2 = LD1W_IMM renamable $p1, %stack.0, 0 :: (load (s256) from %stack.0)

In this case, a spill slot is required, but since the BigStack in the following code is set to false, the slot is not allocated.

// Conservatively always assume BigStack when there are SVE spills.
bool BigStack = SVEStackSize || (EstimatedStackSize + CSStackSize +
CalleeStackUsed) > EstimatedStackSizeLimit;
if (BigStack || !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF))
AFI->setHasStackFrame(true);
// Estimate if we might need to scavenge a register at some point in order
// to materialize a stack offset. If so, either spill one additional
// callee-saved register or reserve a special spill slot to facilitate
// register scavenging. If we already spilled an extra callee-saved register
// above to keep the number of spills even, we don't need to do anything else
// here.
if (BigStack) {

The problem can be solved by ensuring that BigStack is true in the presence of such an SVE load/store.
However, I am not familiar with these codes and have no idea how to correct them properly.

Initially, this bug was reported by @ggouaillardet:
https://reviews.llvm.org/rG3f561996bf7193091bc6670a2e7804b0cb0bb936#1173798

The machine model modifications in that patch made scheduling more aggressive and thus more prone to problems.
At the source code level, the problem may occur when a small trip count loop is vectorized and fully unrolled with the -msve-vector-bits= option, as in the following case:
https://reviews.llvm.org/rG68469a80cb74#1173363

@ytmukai ytmukai added backend:AArch64 crash Prefer [crash-on-valid] or [crash-on-invalid] labels Feb 22, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 22, 2023

@llvm/issue-subscribers-backend-aarch64

@sdesmalen-arm
Copy link
Collaborator

Hi @ytmukai thanks for reporting. I've created https://reviews.llvm.org/D145497 to address this.

@EugeneZelenko EugeneZelenko added the SVE ARM Scalable Vector Extensions label Mar 7, 2023
sdesmalen-arm added a commit that referenced this issue Mar 15, 2023
This is an alternative fix to D145497, which also addresses
  #60918

In D124457 which added the original code for this, @efriedma pointed
out that it wasn't safe to assume that FI #0 would be allocated at offset
0, but that part of the patch went in without any changes.

The downside of this solution is that any access to an object on the
stack that has been allocated at SP + 0, still gets moved to a separate
register first, which degrades performance.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D146056
CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Mar 17, 2023
This is an alternative fix to D145497, which also addresses
  llvm/llvm-project#60918

In D124457 which added the original code for this, @efriedma pointed
out that it wasn't safe to assume that FI #0 would be allocated at offset
0, but that part of the patch went in without any changes.

The downside of this solution is that any access to an object on the
stack that has been allocated at SP + 0, still gets moved to a separate
register first, which degrades performance.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D146056
agozillon pushed a commit to ROCm-Developer-Tools/llvm-project that referenced this issue Mar 17, 2023
This is an alternative fix to D145497, which also addresses
  llvm/llvm-project#60918

In D124457 which added the original code for this, @efriedma pointed
out that it wasn't safe to assume that FI #0 would be allocated at offset
0, but that part of the patch went in without any changes.

The downside of this solution is that any access to an object on the
stack that has been allocated at SP + 0, still gets moved to a separate
register first, which degrades performance.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D146056
@ytmukai
Copy link
Contributor Author

ytmukai commented Apr 4, 2023

Sorry for the late reply. Thanks for the fix!
I confirmed that the problem does not occur with the original source code (https://reviews.llvm.org/rG68469a80cb74#1173363).
(The reproduction code for this issue seems to cause no problems in the trunk due to another modification.)

@ytmukai ytmukai closed this as completed Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 crash Prefer [crash-on-valid] or [crash-on-invalid] SVE ARM Scalable Vector Extensions
Projects
None yet
Development

No branches or pull requests

4 participants