-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ISel] Add pattern matching for depositing subreg value #75978
Conversation
@llvm/pr-subscribers-backend-x86 Author: David Li (david-xl) ChangesDepositing value into the lowest byte/word is a common code pattern. This patch improves the code generation for it to avoid redundant AND and OR operations. Full diff: https://github.com/llvm/llvm-project/pull/75978.diff 2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86InstrMisc.td b/llvm/lib/Target/X86/X86InstrMisc.td
index 2ea10e317e12b4..f9ae88a8fa8ae8 100644
--- a/llvm/lib/Target/X86/X86InstrMisc.td
+++ b/llvm/lib/Target/X86/X86InstrMisc.td
@@ -561,6 +561,16 @@ def MOV64rm : RI<0x8B, MRMSrcMem, (outs GR64:$dst), (ins i64mem:$src),
[(set GR64:$dst, (load addr:$src))]>;
}
+def : Pat<(or (and GR64:$dst, -256),
+ (i64 (zextloadi8 addr:$src))),
+ (INSERT_SUBREG (i64 (COPY $dst)), (MOV8rm i8mem:$src), sub_8bit)
+>;
+
+def : Pat<(or (and GR64:$dst, -65536),
+ (i64 (zextloadi16 addr:$src))),
+ (INSERT_SUBREG (i64 (COPY $dst)), (MOV16rm i16mem:$src), sub_16bit)
+>;
+
let SchedRW = [WriteStore] in {
def MOV8mr : I<0x88, MRMDestMem, (outs), (ins i8mem :$dst, GR8 :$src),
"mov{b}\t{$src, $dst|$dst, $src}",
diff --git a/llvm/test/CodeGen/X86/insert.ll b/llvm/test/CodeGen/X86/insert.ll
new file mode 100644
index 00000000000000..30b0bca8c63bfe
--- /dev/null
+++ b/llvm/test/CodeGen/X86/insert.ll
@@ -0,0 +1,35 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+;RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefixes=X64
+
+define i64 @sub8(i64 noundef %res, ptr %byte) {
+; X64-LABEL: sub8:
+; X64: # %bb.0: # %entry
+; X64-NEXT: movq %rdi, %rax
+; X64-NEXT: movb (%rsi), %al
+; X64-NEXT: retq
+entry:
+ %and = and i64 %res, -256
+ %d = load i8, ptr %byte, align 1
+ %conv2 = zext i8 %d to i64
+ %or = or i64 %and, %conv2
+ ret i64 %or
+}
+
+
+define i64 @sub16(i64 noundef %res, ptr %byte) {
+; X64-LABEL: sub16:
+; X64: # %bb.0: # %entry
+; X64-NEXT: movq %rdi, %rax
+; X64-NEXT: movw (%rsi), %ax
+; X64-NEXT: retq
+entry:
+ %and = and i64 %res, -65536
+ %d = load i16, ptr %byte, align 1
+ %conv2 = zext i16 %d to i64
+ %or = or i64 %and, %conv2
+ ret i64 %or
+}
+
+
+
+
|
llvm/test/CodeGen/X86/insert.ll
Outdated
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 | ||
;RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefixes=X64 | ||
|
||
define i64 @sub8(i64 noundef %res, ptr %byte) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra space before i64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
llvm/test/CodeGen/X86/insert.ll
Outdated
} | ||
|
||
|
||
define i64 @sub16(i64 noundef %res, ptr %byte) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
llvm/lib/Target/X86/X86InstrMisc.td
Outdated
@@ -561,6 +561,16 @@ def MOV64rm : RI<0x8B, MRMSrcMem, (outs GR64:$dst), (ins i64mem:$src), | |||
[(set GR64:$dst, (load addr:$src))]>; | |||
} | |||
|
|||
def : Pat<(or (and GR64:$dst, -256), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about GR32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patterns probably belong in X86InstrCompiler.td where we keep most of these kinds of patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
32bit move has implicit zero extension, so won't be applicable.
Moved the change to X86InstrCompiler.td
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant GPR32 for the destination type
def : Pat<(or (and GPR32:$dst, -256),
(i32 (zextloadi8 addr:$src))),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I misunderstood. Fixed now.
Also updated the test. Note that sub16_32 case does not yet produce the optimized code for i386 because the pattern change (due to arg passing).
llvm/test/CodeGen/X86/insert.ll
Outdated
@@ -0,0 +1,34 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 | |||
;RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefixes=X64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space between ; and RUN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use the default check prefix? X64
is kind of misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
x64 is used elsewhere too. Anyway change it to x86_64 and I386 for clarity.
llvm/test/CodeGen/X86/insert.ll
Outdated
|
||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drop trailing new lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// Depositing value to 8/16 bit subreg: | ||
def : Pat<(or (and GR64:$dst, -256), | ||
(i64 (zextloadi8 addr:$src))), | ||
(INSERT_SUBREG (i64 (COPY $dst)), (MOV8rm i8mem:$src), sub_8bit)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align with (or
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
d3e5dcc
to
7c26b93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
llvm/test/CodeGen/X86/insert.ll
Outdated
@@ -0,0 +1,93 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 | |||
; RUN: llc < %s -mtriple=i386-unknown-unknown | FileCheck %s --check-prefixes=I386 | |||
; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefixes=X86_64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(style) We try to use 'X86' for 32-bit triple checks and 'X64' for 64-bit triple checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Depositing value into the lowest byte/word is a common code pattern. This patch improves the code generation for it to avoid redundant AND and OR operations.