Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve TODO: Use TokenFactor for inline memset #87002

Merged
merged 1 commit into from
May 26, 2024

Conversation

AtariDreams
Copy link
Contributor

@AtariDreams AtariDreams commented Mar 28, 2024

We can rewrite this as a TokenFactor like memcpy is.

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 28, 2024

@llvm/pr-subscribers-backend-x86

Author: AtariDreams (AtariDreams)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/87002.diff

1 Files Affected:

  • (modified) llvm/lib/Target/X86/X86SelectionDAGInfo.cpp (+23-20)
diff --git a/llvm/lib/Target/X86/X86SelectionDAGInfo.cpp b/llvm/lib/Target/X86/X86SelectionDAGInfo.cpp
index 7c630a2b0da080..0f46110f22dc93 100644
--- a/llvm/lib/Target/X86/X86SelectionDAGInfo.cpp
+++ b/llvm/lib/Target/X86/X86SelectionDAGInfo.cpp
@@ -67,7 +67,7 @@ SDValue X86SelectionDAGInfo::EmitTargetCodeForMemset(
   // The libc version is likely to be faster for these cases. It can use the
   // address value and run time information about the CPU.
   if (Alignment < Align(4) || !ConstantSize ||
-      ConstantSize->getZExtValue() > Subtarget.getMaxInlineSizeThreshold()) 
+      ConstantSize->getZExtValue() > Subtarget.getMaxInlineSizeThreshold())
     return SDValue();
 
   uint64_t SizeVal = ConstantSize->getZExtValue();
@@ -128,26 +128,29 @@ SDValue X86SelectionDAGInfo::EmitTargetCodeForMemset(
   InGlue = Chain.getValue(1);
 
   SDVTList Tys = DAG.getVTList(MVT::Other, MVT::Glue);
-  SDValue Ops[] = { Chain, DAG.getValueType(AVT), InGlue };
-  Chain = DAG.getNode(X86ISD::REP_STOS, dl, Tys, Ops);
-
-  if (BytesLeft) {
-    // Handle the last 1 - 7 bytes.
-    unsigned Offset = SizeVal - BytesLeft;
-    EVT AddrVT = Dst.getValueType();
-    EVT SizeVT = Size.getValueType();
-
-    Chain =
-        DAG.getMemset(Chain, dl,
-                      DAG.getNode(ISD::ADD, dl, AddrVT, Dst,
-                                  DAG.getConstant(Offset, dl, AddrVT)),
-                      Val, DAG.getConstant(BytesLeft, dl, SizeVT), Alignment,
-                      isVolatile, AlwaysInline,
-                      /* isTailCall */ false, DstPtrInfo.getWithOffset(Offset));
-  }
+  SDValue Ops[] = {Chain, DAG.getValueType(AVT), InGlue};
+  SDValue RepStos = DAG.getNode(X86ISD::REP_STOS, dl, Tys, Ops);
+
+  /// RepStos can process the whole length.
+  if (BytesLeft == 0)
+    return RepStos;
 
-  // TODO: Use a Tokenfactor, as in memcpy, instead of a single chain.
-  return Chain;
+  // Handle the last 1 - 7 bytes.
+  SmallVector<SDValue, 4> Results;
+  Results.push_back(RepStos);
+  unsigned Offset = SizeVal - BytesLeft;
+  EVT AddrVT = Dst.getValueType();
+  EVT SizeVT = Size.getValueType();
+
+  Results.push_back(
+      DAG.getMemset(Chain, dl,
+                    DAG.getNode(ISD::ADD, dl, AddrVT, Dst,
+                                DAG.getConstant(Offset, dl, AddrVT)),
+                    Val, DAG.getConstant(BytesLeft, dl, SizeVT), Alignment,
+                    isVolatile, AlwaysInline,
+                    /* isTailCall */ false, DstPtrInfo.getWithOffset(Offset)));
+
+  return DAG.getNode(ISD::TokenFactor, dl, MVT::Other, Results);
 }
 
 /// Emit a single REP MOVS{B,W,D,Q} instruction.

@AtariDreams AtariDreams force-pushed the AArch64-2 branch 2 times, most recently from 0596ef7 to da9aee1 Compare March 28, 2024 22:28
@XiaodongLoong
Copy link
Contributor

Could you please add a .ll test for the PR?

@AtariDreams
Copy link
Contributor Author

Could you please add a .ll test for the PR?

It's a refactor.

@AtariDreams AtariDreams changed the title [X86] Resolve TODO: Use Tokenfactor rather than the whole chain [X86] (NFC) Resolve TODO: Use Tokenfactor rather than the whole chain Apr 2, 2024
@AtariDreams AtariDreams changed the title [X86] (NFC) Resolve TODO: Use Tokenfactor rather than the whole chain [X86] [NFC] Resolve TODO: Use Tokenfactor rather than the whole chain Apr 2, 2024
@phoebewang
Copy link
Contributor

I don't see problem to refactor to TokenFactor, but it's not a NFC. Maybe it's OK if it's not easy to create a test case, but should remove [NFC] from title.

@AtariDreams AtariDreams changed the title [X86] [NFC] Resolve TODO: Use Tokenfactor rather than the whole chain [X86] Resolve TODO: Use Tokenfactor rather than the whole chain Apr 6, 2024
@AtariDreams AtariDreams force-pushed the AArch64-2 branch 2 times, most recently from b44273d to cb239d9 Compare April 7, 2024 14:24
@AtariDreams AtariDreams changed the title [X86] Resolve TODO: Use Tokenfactor rather than the whole chain Resolve TODO: Use TokenFactor rather than the whole chain Apr 14, 2024
Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@RKSimon
Copy link
Collaborator

RKSimon commented Apr 15, 2024

@AtariDreams Please can you update the patch subject to be a more clear description of where/what the patch is doing

@AtariDreams AtariDreams changed the title Resolve TODO: Use TokenFactor rather than the whole chain Resolve TODO: Use TokenFactor for inline memset May 2, 2024
@AtariDreams
Copy link
Contributor Author

@AtariDreams Please can you update the patch subject to be a more clear description of where/what the patch is doing

Done!

We can rewrite this using TokenFactor like memcpy is.
@AtariDreams AtariDreams merged commit 972f297 into llvm:main May 26, 2024
4 checks passed
@AtariDreams AtariDreams deleted the AArch64-2 branch May 26, 2024 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants