Skip to content

Conversation

woruyu
Copy link
Member

@woruyu woruyu commented Sep 22, 2025

Summary

This PR resolves #158585.

@llvmbot
Copy link
Member

llvmbot commented Sep 22, 2025

@llvm/pr-subscribers-backend-x86

Author: woruyu (woruyu)

Changes

Summary

This PR resolves #158585.


Full diff: https://github.com/llvm/llvm-project/pull/160039.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+18)
  • (added) llvm/test/MC/X86/encoder-fail-VEX-EVEX.s (+16)
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index ce5e92135f706..d1ebe6bdb2dfd 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -4044,6 +4044,24 @@ bool X86AsmParser::validateInstruction(MCInst &Inst, const OperandVector &Ops) {
     }
   }
 
+  unsigned Enc = TSFlags & X86II::EncodingMask;
+  if (Enc == X86II::VEX || Enc == X86II::EVEX || Enc == X86II::XOP) {
+    unsigned NumOps = Inst.getNumOperands();
+    for (unsigned i = 0; i != NumOps; ++i) {
+      const MCOperand &MO = Inst.getOperand(i);
+      if (!MO.isReg())
+        continue;
+      MCRegister Reg = MO.getReg();
+      if (Reg == X86::AH || Reg == X86::BH || Reg == X86::CH ||
+          Reg == X86::DH) {
+        StringRef RegName = X86IntelInstPrinter::getRegisterName(Reg);
+        return Error(Ops[0]->getStartLoc(),
+                     "can't encode '" + RegName +
+                         "' in a VEX/EVEX-prefixed instruction");
+      }
+    }
+  }
+
   if ((Opcode == X86::PREFETCHIT0 || Opcode == X86::PREFETCHIT1)) {
     const MCOperand &MO = Inst.getOperand(X86::AddrBaseReg);
     if (!MO.isReg() || MO.getReg() != X86::RIP)
diff --git a/llvm/test/MC/X86/encoder-fail-VEX-EVEX.s b/llvm/test/MC/X86/encoder-fail-VEX-EVEX.s
new file mode 100644
index 0000000000000..90512741c9c6c
--- /dev/null
+++ b/llvm/test/MC/X86/encoder-fail-VEX-EVEX.s
@@ -0,0 +1,16 @@
+// RUN: not llvm-mc -triple x86_64-unknown-unknown -x86-asm-syntax=intel --show-encoding %s 2>&1 | FileCheck %s
+
+// CHECK: error: can't encode 'ah' in a VEX/EVEX-prefixed instruction
+add ah, ah, ah
+
+// CHECK: error: can't encode 'ah' in a VEX/EVEX-prefixed instruction
+and ah, byte ptr [-13426159], ah
+
+// CHECK: error: can't encode 'ah' in a VEX/EVEX-prefixed instruction
+ccmpa {dfv=of,cf} byte ptr [r8 + 4*rax + 291], ah
+
+// CHECK: error: can't encode 'ah' in a VEX/EVEX-prefixed instruction
+ccmpae {dfv=of,cf} byte ptr [r8 + 4*rax + 291], ah
+
+// CHECK: error: can't encode 'ah' in a VEX/EVEX-prefixed instruction
+sar ah, byte ptr [-13426159]
\ No newline at end of file

@woruyu woruyu self-assigned this Sep 22, 2025
Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

// * REX2
// VEX/XOP don't use REX; they are excluded from the legacy check.
const unsigned Enc = TSFlags & X86II::EncodingMask;
if (Enc != X86II::VEX && Enc != X86II::XOP) {
MCRegister HReg;
bool UsesRex = TSFlags & X86II::REX_W;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems an incomplete fix. In addition to REX_W, we may use REX prefix when other bit is set, e.g. REX_R/X/B.

Copy link
Contributor

@phoebewang phoebewang Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't understand this. Because I think REX_R/X/B means you use r8..r15 in opreand, so in current codes for add ah, byte ptr [r8], you will get UsesRex = true,HReg is noZero, so you will get error msg in the second if branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean we don't need extra change here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I missed the line 4041

@woruyu woruyu force-pushed the fix/EVEX-VEX-prefixes branch from 3feb2a2 to c9bfee5 Compare September 24, 2025 11:52
@phoebewang
Copy link
Contributor

The failures look unrelated. Please merge main and try again.

"REX prefix");
if (HReg) {
if (Enc == X86II::EVEX || ForcedOpcodePrefix == OpcodePrefix_REX2 ||
ForcedOpcodePrefix == OpcodePrefix_REX) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we || UsesRex here? Or move ForcedOpcodePrefix == OpcodePrefix_REX to below condition?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! combine all check.

Copy link
Contributor

@KanRobert KanRobert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general

@woruyu woruyu force-pushed the fix/EVEX-VEX-prefixes branch from c9bfee5 to 130d292 Compare September 25, 2025 02:14
@phoebewang phoebewang merged commit 7b2b375 into llvm:main Sep 25, 2025
9 checks passed
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[MC][x86-64] Clang silently mis-assembles instructions with high-byte registers and EVEX/VEX prefixes in release mode and crashes with assertion enabled

4 participants