Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[X86] Refine CLD insertion to trigger only when the direction flag is used #86557

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AtariDreams
Copy link
Contributor

@AtariDreams AtariDreams commented Mar 25, 2024

Rather than try to update every instruction that is affected by the direction flag, we could instead use findRegisterUseOperand.

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 25, 2024

@llvm/pr-subscribers-backend-x86

Author: AtariDreams (AtariDreams)

Changes

=Rather than try to update every instruction that is affected by the direction flag, we could instead use findRegisterUseOperand.


Full diff: https://github.com/llvm/llvm-project/pull/86557.diff

1 Files Affected:

  • (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+28-39)
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index d914e1b61ab075..9302d0d2e27573 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -1418,34 +1418,6 @@ bool X86FrameLowering::needsDwarfCFI(const MachineFunction &MF) const {
   return !isWin64Prologue(MF) && MF.needsFrameMoves();
 }
 
-/// Return true if an opcode is part of the REP group of instructions
-static bool isOpcodeRep(unsigned Opcode) {
-  switch (Opcode) {
-  case X86::REPNE_PREFIX:
-  case X86::REP_MOVSB_32:
-  case X86::REP_MOVSB_64:
-  case X86::REP_MOVSD_32:
-  case X86::REP_MOVSD_64:
-  case X86::REP_MOVSQ_32:
-  case X86::REP_MOVSQ_64:
-  case X86::REP_MOVSW_32:
-  case X86::REP_MOVSW_64:
-  case X86::REP_PREFIX:
-  case X86::REP_STOSB_32:
-  case X86::REP_STOSB_64:
-  case X86::REP_STOSD_32:
-  case X86::REP_STOSD_64:
-  case X86::REP_STOSQ_32:
-  case X86::REP_STOSQ_64:
-  case X86::REP_STOSW_32:
-  case X86::REP_STOSW_64:
-    return true;
-  default:
-    break;
-  }
-  return false;
-}
-
 /// emitPrologue - Push callee-saved registers onto the stack, which
 /// automatically adjust the stack pointer. Adjust the stack pointer to allocate
 /// space for local variables. Also emit labels used by the exception handler to
@@ -2225,33 +2197,50 @@ void X86FrameLowering::emitPrologue(MachineFunction &MF,
   // Create "cld" instruction only in these cases:
   // 1. The interrupt handling function uses any of the "rep" instructions.
   // 2. Interrupt handling function calls another function.
-  // 3. If there are any inline asm blocks, as we do not know what they do
+  // 3. If DF is clobbered by any instruction.
   //
-  // TODO: We should also emit cld if we detect the use of std, but as of now,
-  // the compiler does not even emit that instruction or even define it, so in
-  // practice, this would only happen with inline asm, which we cover anyway.
   if (Fn.getCallingConv() == CallingConv::X86_INTR) {
     bool NeedsCLD = false;
 
     for (const MachineBasicBlock &B : MF) {
       for (const MachineInstr &MI : B) {
-        if (MI.isCall()) {
+        if (MI.isInlineAsm()) {
           NeedsCLD = true;
           break;
         }
 
-        if (isOpcodeRep(MI.getOpcode())) {
-          NeedsCLD = true;
+        if (MI.findRegisterDefOperand(X86::DF)) {
+          // We do not need CLD because we clobber DF anyway before the flag is
+          // even used.
+          // FIXME: Is this even possible? Only cld and std can do this.
+          NeedsCLD = false;
           break;
         }
 
-        if (MI.isInlineAsm()) {
-          // TODO: Parse asm for rep instructions or call sites?
-          // For now, let's play it safe and emit a cld instruction
-          // just in case.
+        if (MI.isCall()) {
           NeedsCLD = true;
           break;
         }
+
+        if (MI.findRegisterUseOperand(X86::DF)) {
+          // Because eflags being pushed and popped save the instruction, it
+          // counts as a use, but we ignore them because the purpose is to
+          // preserve eflags.
+          switch (MI.getOpcode()) {
+          case X86::PUSHF16:
+          case X86::PUSHF32:
+          case X86::PUSHF64:
+          case X86::PUSHFS16:
+          case X86::PUSHFS32:
+          case X86::PUSHFS64:
+            break;
+          default:
+            NeedsCLD = true;
+            break;
+          }
+          if (NeedsCLD)
+            break;
+        }
       }
     }
 

Copy link

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

✅ With the latest revision this PR passed the Python code formatter.

… used

Rather than try to update every instruction that is affected by the direction flag, we could instead use findRegisterUseOperand.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants