[AArch64][GlobalISel] Select llvm.aarch64.neon.ld* intrinsics #65630
Conversation
@@ -3897,6 +3906,31 @@ MachineInstr *AArch64InstructionSelector::emitScalarToVector(
  }
}

MachineInstr *
Please add a comment documenting this helper function.
; CHECK-SD-LABEL: ld2lane_16b:
; CHECK-SD:       // %bb.0:
; CHECK-SD-NEXT:    // kill: def $q1 killed $q1 killed $q0_q1 def $q0_q1
; CHECK-SD-NEXT:    // kill: def $q0 killed $q0 killed $q0_q1 def $q0_q1
; CHECK-SD-NEXT:    ld2.b { v0, v1 }[1], [x0]
; CHECK-SD-NEXT:    ret
;
; CHECK-GI-LABEL: ld2lane_16b:
; CHECK-GI:       // %bb.0:
; CHECK-GI-NEXT:    // kill: def $q0 killed $q0 killed $q0_q1 def $q0_q1
; CHECK-GI-NEXT:    // kill: def $q1 killed $q1 killed $q0_q1 def $q0_q1
; CHECK-GI-NEXT:    ld2.b { v0, v1 }[1], [x0]
; CHECK-GI-NEXT:    ret
Do you happen to know why we print these kill comments in the opposite order? It's annoying that this difference prevents the checks from merging.
I've noticed this weird difference too. ScheduleDAGRRList puts these COPY instructions in reverse order during REG_SEQUENCE scheduling. In GlobalISel, they get IRTranslate'd as

```
%0:_(<16 x s8>) = COPY $q0
%1:_(<16 x s8>) = COPY $q1
```

and afterward, the order doesn't change.
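For reference, IR along these lines (a hypothetical minimal reproducer, not necessarily the exact test in the patch) produces the two COPYs above, since both vector operands of the intrinsic arrive in `$q0`/`$q1`:

```llvm
; Sketch of a caller of the ld2lane intrinsic; names are illustrative.
define { <16 x i8>, <16 x i8> } @ld2lane_16b(<16 x i8> %L1, <16 x i8> %L2, ptr %A) {
  %ld2 = call { <16 x i8>, <16 x i8> }
      @llvm.aarch64.neon.ld2lane.v16i8.p0(<16 x i8> %L1, <16 x i8> %L2, i64 1, ptr %A)
  ret { <16 x i8>, <16 x i8> } %ld2
}
```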
def : Pat<(v8i8 (AArch64dup (i8 (load (am_indexed8 GPR64sp:$Rn))))),
          (LD1Rv8b GPR64sp:$Rn)>;
def : Pat<(v16i8 (AArch64dup (i8 (load GPR64sp:$Rn)))),
          (LD1Rv16b GPR64sp:$Rn)>;
def : Pat<(v4i16 (AArch64dup (i16 (load GPR64sp:$Rn)))),
          (LD1Rv4h GPR64sp:$Rn)>;
def : Pat<(v8i16 (AArch64dup (i16 (load GPR64sp:$Rn)))),
          (LD1Rv8h GPR64sp:$Rn)>;
def : Pat<(v2i32 (AArch64dup (i32 (load GPR64sp:$Rn)))),
          (LD1Rv2s GPR64sp:$Rn)>;
def : Pat<(v4i32 (AArch64dup (i32 (load GPR64sp:$Rn)))),
          (LD1Rv4s GPR64sp:$Rn)>;
def : Pat<(v2i64 (AArch64dup (i64 (load GPR64sp:$Rn)))),
          (LD1Rv2d GPR64sp:$Rn)>;
def : Pat<(v1i64 (AArch64dup (i64 (load GPR64sp:$Rn)))),
          (LD1Rv1d GPR64sp:$Rn)>;

class Ld1Lane64PatGISel<SDPatternOperator scalar_load, Operand VecIndex,
                        ValueType VTy, ValueType STy, Instruction LD1>
  : Pat<(insertelt (VTy VecListOne64:$Rd),
                   (STy (scalar_load GPR64sp:$Rn)), VecIndex:$idx),
        (EXTRACT_SUBREG
            (LD1 (SUBREG_TO_REG (i32 0), VecListOne64:$Rd, dsub),
                 (UImmS1XForm VecIndex:$idx), GPR64sp:$Rn),
            dsub)>;

class Ld1Lane128PatGISel<Operand VecIndex, ValueType VTy,
                         ValueType STy, Instruction LD1>
  : Pat<(insertelt (VTy VecListOne128:$Rd),
                   (STy (load GPR64sp:$Rn)), VecIndex:$idx),
        (LD1 VecListOne128:$Rd, (UImmS1XForm VecIndex:$idx), GPR64sp:$Rn)>;

Are these necessary to support the AArch64 intrinsics? They look like they match generic code to me.
```
def : Pat<(v8i8 (AArch64dup (i8 (load (am_indexed8 GPR64sp:$Rn))))),
          (LD1Rv8b GPR64sp:$Rn)>;
def : Pat<(v16i8 (AArch64dup (i8 (load GPR64sp:$Rn)))),
          (LD1Rv16b GPR64sp:$Rn)>;
def : Pat<(v4i16 (AArch64dup (i16 (load GPR64sp:$Rn)))),
          (LD1Rv4h GPR64sp:$Rn)>;
def : Pat<(v8i16 (AArch64dup (i16 (load GPR64sp:$Rn)))),
          (LD1Rv8h GPR64sp:$Rn)>;
```
These lines differ from what we have in AArch64InstrInfo.td: there, loads return i32, so those patterns use extloadi8/extloadi16 instead of load, whereas in gMIR G_DUP takes i8 and i16 respectively.
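To illustrate the contrast, the DAG-side patterns have roughly this shape (a paraphrased sketch, not the exact contents of AArch64InstrInfo.td):

```
// Hypothetical sketch: on the SelectionDAG side the narrow load is widened
// to i32, so the dup patterns match extloadi8/extloadi16 rather than load.
def : Pat<(v8i8 (AArch64dup (i32 (extloadi8 GPR64sp:$Rn)))),
          (LD1Rv8b GPR64sp:$Rn)>;
def : Pat<(v4i16 (AArch64dup (i32 (extloadi16 GPR64sp:$Rn)))),
          (LD1Rv4h GPR64sp:$Rn)>;
```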
```
def : Pat<(v2i32 (AArch64dup (i32 (load GPR64sp:$Rn)))),
```
This and the following three patterns I've added just to have the full set of patterns to select G_DUP. Should I remove them?
Ld1Lane64PatGISel and Ld1Lane128PatGISel are slightly different from Ld1Lane64Pat/Ld1Lane128Pat because the vector_insert used in the latter patterns isn't imported into GlobalISel. If I replace it with insertelt there, the pattern no longer fires in SelectionDAG.
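For context, a sketch of the gMIR shape these GISel patterns are meant to match (register names, bank annotations, and the index are illustrative, not taken from the patch):

```
%vec:fpr(<4 x s32>) = COPY $q0
%ptr:gpr(p0) = COPY $x0
%idx:gpr(s64) = G_CONSTANT i64 1
%val:fpr(s32) = G_LOAD %ptr(p0) :: (load (s32))
%res:fpr(<4 x s32>) = G_INSERT_VECTOR_ELT %vec, %val(s32), %idx(s64)
```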
I was checking whether this patch could be split up, since I would have thought matching intrinsics would be separate from matching generic load patterns. Not a big deal either way.
Oh, I see, did that.
Thanks, it's easier to review when we keep the patches independent.
Similar to llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp.