[ISel] Add pattern matching for depositing subreg value #75978

david-xl · 2023-12-19T22:50:32Z

Depositing a value into the lowest byte or word of a register is a common code pattern. This patch improves code generation for it by avoiding redundant AND and OR operations.
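As an illustration of the pattern described above, here is a minimal C sketch of source code that would typically lower to the `and`/`zext`/`or` IR sequence this patch matches. The function names `deposit8`, `deposit16`, and `deposit_selfcheck` are hypothetical and do not appear in the patch; the masks `~0xFF` and `~0xFFFF` correspond to the i64 constants `-256` and `-65536` used in the patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the low 8 bits of res with the byte loaded from *byte.
 * ~0xFF as a 64-bit value equals the i64 constant -256. */
uint64_t deposit8(uint64_t res, const uint8_t *byte) {
    return (res & ~UINT64_C(0xFF)) | (uint64_t)*byte;
}

/* Replace the low 16 bits of res with the halfword loaded from *half.
 * ~0xFFFF as a 64-bit value equals the i64 constant -65536. */
uint64_t deposit16(uint64_t res, const uint16_t *half) {
    return (res & ~UINT64_C(0xFFFF)) | (uint64_t)*half;
}

/* Small self-check; the input values are chosen for illustration only. */
void deposit_selfcheck(void) {
    const uint8_t b = 0xAB;
    const uint16_t h = 0xCDEF;
    assert(deposit8(UINT64_C(0x1122334455667788), &b) ==
           UINT64_C(0x11223344556677AB));
    assert(deposit16(UINT64_C(0x1122334455667788), &h) ==
           UINT64_C(0x112233445566CDEF));
}
```

With the patterns in this patch, the intent is that each function compiles to a register copy followed by a single narrow load into the low subregister, rather than a load, a mask, and an OR.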

llvmbot · 2023-12-19T22:51:03Z

@llvm/pr-subscribers-backend-x86

Author: David Li (david-xl)

Full diff: https://github.com/llvm/llvm-project/pull/75978.diff

2 Files Affected:

(modified) llvm/lib/Target/X86/X86InstrMisc.td (+10)
(added) llvm/test/CodeGen/X86/insert.ll (+35)

diff --git a/llvm/lib/Target/X86/X86InstrMisc.td b/llvm/lib/Target/X86/X86InstrMisc.td
index 2ea10e317e12b4..f9ae88a8fa8ae8 100644
--- a/llvm/lib/Target/X86/X86InstrMisc.td
+++ b/llvm/lib/Target/X86/X86InstrMisc.td
@@ -561,6 +561,16 @@ def MOV64rm : RI<0x8B, MRMSrcMem, (outs GR64:$dst), (ins i64mem:$src),
                  [(set GR64:$dst, (load addr:$src))]>;
 }
 
+def : Pat<(or (and GR64:$dst, -256), 
+              (i64 (zextloadi8 addr:$src))),
+      (INSERT_SUBREG (i64 (COPY $dst)), (MOV8rm  i8mem:$src), sub_8bit)
+>; 
+
+def : Pat<(or (and GR64:$dst, -65536), 
+              (i64 (zextloadi16 addr:$src))),
+      (INSERT_SUBREG (i64 (COPY $dst)), (MOV16rm  i16mem:$src), sub_16bit)
+>; 
+
 let SchedRW = [WriteStore] in {
 def MOV8mr  : I<0x88, MRMDestMem, (outs), (ins i8mem :$dst, GR8 :$src),
                 "mov{b}\t{$src, $dst|$dst, $src}",
diff --git a/llvm/test/CodeGen/X86/insert.ll b/llvm/test/CodeGen/X86/insert.ll
new file mode 100644
index 00000000000000..30b0bca8c63bfe
--- /dev/null
+++ b/llvm/test/CodeGen/X86/insert.ll
@@ -0,0 +1,35 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+;RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefixes=X64
+
+define  i64 @sub8(i64 noundef %res, ptr %byte) {
+; X64-LABEL: sub8:
+; X64:       # %bb.0: # %entry
+; X64-NEXT:    movq %rdi, %rax
+; X64-NEXT:    movb (%rsi), %al
+; X64-NEXT:    retq
+entry:
+  %and = and i64 %res, -256
+  %d = load i8, ptr %byte, align 1
+  %conv2 = zext i8 %d to i64
+  %or = or i64 %and, %conv2
+  ret i64 %or
+}
+
+
+define  i64 @sub16(i64 noundef %res, ptr %byte) {
+; X64-LABEL: sub16:
+; X64:       # %bb.0: # %entry
+; X64-NEXT:    movq %rdi, %rax
+; X64-NEXT:    movw (%rsi), %ax
+; X64-NEXT:    retq
+entry:
+  %and = and i64 %res, -65536
+  %d = load i16, ptr %byte, align 1
+  %conv2 = zext i16 %d to i64
+  %or = or i64 %and, %conv2
+  ret i64 %or
+}
+
+
+
+

topperc · 2023-12-19T22:55:00Z

llvm/test/CodeGen/X86/insert.ll

extra space before i64

topperc · 2023-12-19T22:55:22Z

llvm/lib/Target/X86/X86InstrMisc.td

What about GR32?

The patterns probably belong in X86InstrCompiler.td where we keep most of these kinds of patterns.

A 32-bit move implicitly zero-extends into the upper 32 bits of the 64-bit register, so this approach won't be applicable to a 32-bit deposit.

Moved the change to X86InstrCompiler.td

I meant GR32 for the destination type:

def : Pat<(or (and GR32:$dst, -256),
              (i32 (zextloadi8 addr:$src))),

Sorry I misunderstood. Fixed now.

Also updated the test. Note that the sub16_32 case does not yet produce the optimized code for i386 because the matched pattern changes there (due to argument passing).
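The exchange above turns on how x86-64 treats writes to subregisters: 8-bit and 16-bit writes leave the upper bits of the full register unchanged, while a 32-bit write zeroes the upper 32 bits. The following C sketch models that behavior on a plain 64-bit integer. The function names are hypothetical and this is only a model of the architectural rule, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an 8-bit register write (e.g. movb into %al):
 * the upper 56 bits of the 64-bit register are preserved. */
uint64_t write_low8(uint64_t reg, uint8_t v) {
    return (reg & ~UINT64_C(0xFF)) | (uint64_t)v;
}

/* Model of a 16-bit register write (e.g. movw into %ax):
 * the upper 48 bits are preserved. */
uint64_t write_low16(uint64_t reg, uint16_t v) {
    return (reg & ~UINT64_C(0xFFFF)) | (uint64_t)v;
}

/* Model of a 32-bit register write (e.g. movl into %eax):
 * the upper 32 bits are zeroed, so the old contents are lost. */
uint64_t write_low32(uint64_t reg, uint32_t v) {
    (void)reg; /* the previous register value does not survive */
    return (uint64_t)v;
}
```

This is why a narrow load can implement the byte and word deposits directly, whereas a 32-bit load into a 64-bit register cannot preserve the upper half. A 32-bit destination (GR32) with an 8-bit or 16-bit deposit is still covered, since those narrow writes preserve the remaining bits.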

KanRobert · 2023-12-20T01:25:52Z

llvm/test/CodeGen/X86/insert.ll

space between ; and RUN.

Could we use the default check prefix? X64 is kind of misleading.

Done.

X64 is used elsewhere too. Anyway, I changed the prefixes to x86_64 and I386 for clarity.

KanRobert · 2023-12-20T01:27:26Z

llvm/test/CodeGen/X86/insert.ll

drop trailing new lines

KanRobert · 2023-12-20T01:29:21Z

llvm/lib/Target/X86/X86InstrCompiler.td

align with (or

KanRobert

LGTM

RKSimon · 2023-12-20T10:33:36Z

llvm/test/CodeGen/X86/insert.ll

(style) We try to use 'X86' for 32-bit triple checks and 'X64' for 64-bit triple checks.

david-xl requested a review from RKSimon December 19, 2023 22:50

llvmbot added the backend:X86 label Dec 19, 2023

topperc reviewed Dec 19, 2023

david-xl force-pushed the main branch from 1e2d9bf to fa59020 Compare December 20, 2023 00:42

KanRobert reviewed Dec 20, 2023

david-xl force-pushed the main branch 2 times, most recently from d3e5dcc to 7c26b93 Compare December 20, 2023 04:47

KanRobert approved these changes Dec 20, 2023

RKSimon reviewed Dec 20, 2023

david-xl force-pushed the main branch from 7c26b93 to f66fe02 Compare December 20, 2023 19:27

ISel improvement for subreg insertion pattern

91f1aa0

david-xl force-pushed the main branch from f66fe02 to 91f1aa0 Compare December 20, 2023 22:36

david-xl merged commit f44079d into llvm:main Dec 21, 2023
