Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64][SME] Don't use glue for call lowering's SMSTART. #68630

Closed
wants to merge 1 commit into from

Conversation

aemerson
Copy link
Contributor

@aemerson aemerson commented Oct 9, 2023

The glue operand causes the SMSTART instruction to have implicit register definitions added by InstrEmitter, since the actual call instruction uses those physregs.

I've been working on LLVM for over a decade now, and I've still yet to understand what Glue is, so I'm not sure if this change is correct.

The glue operand causes the SMSTART instruction to have implicit register
definitions added by InstrEmitter, since the actual call instruction uses
those physregs.
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 9, 2023

@llvm/pr-subscribers-backend-aarch64

Changes

The glue operand causes the SMSTART instruction to have implicit register definitions added by InstrEmitter, since the actual call instruction uses those physregs.

I've been working on LLVM for over a decade now, and I've still yet to understand what Glue is, so I'm not sure if this change is correct.


Full diff: https://github.com/llvm/llvm-project/pull/68630.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+2-3)
  • (modified) llvm/test/CodeGen/AArch64/sme-streaming-interface.ll (+4-8)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 9cda43e58d27a43..9ecc53f3ee4e8b2 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7687,14 +7687,13 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   if (!MemOpChains.empty())
     Chain = DAG.getNode(ISD::TokenFactor, DL, MVT::Other, MemOpChains);
 
-  SDValue InGlue;
   if (RequiresSMChange) {
     SDValue NewChain = changeStreamingMode(DAG, DL, *RequiresSMChange, Chain,
-                                           InGlue, PStateSM, true);
+                                           SDValue(), PStateSM, true);
     Chain = NewChain.getValue(0);
-    InGlue = NewChain.getValue(1);
   }
 
+  SDValue InGlue;
   // Build a sequence of copy-to-reg nodes chained together with token chain
   // and flag operands which copy the outgoing args into the appropriate regs.
   for (auto &RegToPass : RegsToPass) {
diff --git a/llvm/test/CodeGen/AArch64/sme-streaming-interface.ll b/llvm/test/CodeGen/AArch64/sme-streaming-interface.ll
index 102ed896ce7b3e8..46b7241076a1bd3 100644
--- a/llvm/test/CodeGen/AArch64/sme-streaming-interface.ll
+++ b/llvm/test/CodeGen/AArch64/sme-streaming-interface.ll
@@ -368,15 +368,11 @@ define i8 @call_to_non_streaming_pass_sve_objects(ptr nocapture noundef readnone
 ; CHECK-NEXT:    stp d9, d8, [sp, #48] // 16-byte Folded Spill
 ; CHECK-NEXT:    stp x29, x30, [sp, #64] // 16-byte Folded Spill
 ; CHECK-NEXT:    addvl sp, sp, #-3
-; CHECK-NEXT:    rdsvl x8, #1
-; CHECK-NEXT:    addvl x9, sp, #2
-; CHECK-NEXT:    addvl x10, sp, #1
-; CHECK-NEXT:    mov x11, sp
+; CHECK-NEXT:    rdsvl x3, #1
 ; CHECK-NEXT:    smstop sm
-; CHECK-NEXT:    mov x0, x9
-; CHECK-NEXT:    mov x1, x10
-; CHECK-NEXT:    mov x2, x11
-; CHECK-NEXT:    mov x3, x8
+; CHECK-NEXT:    addvl x0, sp, #2
+; CHECK-NEXT:    addvl x1, sp, #1
+; CHECK-NEXT:    mov x2, sp
 ; CHECK-NEXT:    bl foo
 ; CHECK-NEXT:    smstart sm
 ; CHECK-NEXT:    ptrue p0.b

Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been working on LLVM for over a decade now, and I've still yet to understand what Glue is, so I'm not sure if this change is correct.

The meaning of Glue is that it directs the SelectionDAG scheduler to schedule a node straight after another node, without anything scheduled in between. In contrast, a Chain just specifies that A needs be scheduled before B, it still allows other unchained nodes to be scheduled in between. Glue is stronger in that respect.

; CHECK-NEXT: mov x1, x10
; CHECK-NEXT: mov x2, x11
; CHECK-NEXT: mov x3, x8
; CHECK-NEXT: addvl x0, sp, #2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is semantically different because addvl has a different meaning after the smstop then it did before.
Before the smstop it would have calculated the address of %Data1, %Data2 and %Data3 using the streaming VL (SVL), whereas now it uses the regular (non-streaming) VL. If SVL != VL, then this is a different address.

That's also the reason for having this Glue, so that these are not scheduled after the smstop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Do you have any suggestions for how to fix this GPR move issue? It causes unnecessary size bloat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants