
trcrsired
Contributor

@trcrsired trcrsired commented Oct 11, 2025

My WebAssembly Memory Tagging paper has been accepted at CCSW 2025. I would like to find out whether this work can be merged upstream, so that the LLVM and Wasm teams can give me advice and the feature can eventually land in LLVM itself.

Wasm memory tagging is more broadly useful than ARM MTE: ARM MTE is ARM-only, whereas Wasm memory tagging works on all platforms, and its instructions are translated to ARM MTE when the hardware supports it.

With the rise of Progressive Web Apps, Wasm memory tagging is the future of memory safety for C/C++.

Here is the abstract of the paper:

  • WebAssembly has become increasingly popular in web development, offering a versatile and efficient platform for executing code in languages beyond JavaScript, such as C and C++. However, WebAssembly's flat memory model exposes memory-safety vulnerabilities. While C and C++ code are primarily susceptible to memory safety issues, Rust, though improved, still presents vulnerabilities. Inspired by the ARM Memory Tagging Extension, this paper proposes a memory tagging solution for WebAssembly. Our evaluation indicates that the proposed memory tagging mechanism introduces average time overheads of 48.91% for Wasm64 and 72.38% for Wasm32 in pure software implementations. On ARM Memory Tagging Extension-supported CPUs, the time overheads decrease to 5.71% for Wasm64 and 18.05% for Wasm32. Additionally, we compare our WebAssembly memory tagging to the host Address Sanitizer to demonstrate the efficacy of our approach. Finally, we conduct a case study on real-world CVEs to demonstrate the impact of our work.
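
To make the idea in the abstract concrete, here is a minimal software-only sketch of memory tagging. It is not the paper's implementation. It assumes ARM MTE-like parameters (4-bit tags, 16-byte granules) and assumes the tag sits in bits 56 to 59 of a 64-bit address; the names `TaggedMemory`, `insertTag`, `extractTag`, `stripTag`, `storeTag`, and `checkAccess` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed parameters, modeled on ARM MTE. The real proposal may differ.
constexpr unsigned kTagBits = 4;
constexpr unsigned kTagShift = 56; // assumed tag position in a 64-bit address
constexpr std::uint64_t kGranule = 16;
constexpr std::uint64_t kTagMask = (std::uint64_t{1} << kTagBits) - 1;

// A toy linear memory with one shadow tag per 16-byte granule.
struct TaggedMemory {
  std::vector<std::uint8_t> bytes;
  std::vector<std::uint8_t> tags;

  explicit TaggedMemory(std::size_t size)
      : bytes(size), tags((size + kGranule - 1) / kGranule) {}

  // Place a tag into the assumed tag bits of an address.
  static std::uint64_t insertTag(std::uint64_t addr, std::uint64_t tag) {
    return (addr & ~(kTagMask << kTagShift)) | ((tag & kTagMask) << kTagShift);
  }
  // Read the tag carried by a pointer.
  static std::uint64_t extractTag(std::uint64_t ptr) {
    return (ptr >> kTagShift) & kTagMask;
  }
  // Recover the plain address by clearing the tag bits.
  static std::uint64_t stripTag(std::uint64_t ptr) {
    return ptr & ~(kTagMask << kTagShift);
  }

  // Tag the region [ptr, ptr + len) with the tag carried in ptr.
  // Both the address and the length must be granule-aligned.
  void storeTag(std::uint64_t ptr, std::uint64_t len) {
    const std::uint64_t addr = stripTag(ptr);
    assert(addr % kGranule == 0 && len % kGranule == 0);
    assert(addr + len <= bytes.size());
    for (std::uint64_t a = addr; a < addr + len; a += kGranule)
      tags[a / kGranule] = static_cast<std::uint8_t>(extractTag(ptr));
  }

  // Returns true if a 1-byte access through ptr passes the tag check.
  bool checkAccess(std::uint64_t ptr) const {
    const std::uint64_t addr = stripTag(ptr);
    if (addr >= bytes.size())
      return false;
    return tags[addr / kGranule] == extractTag(ptr);
  }
};
```

In this sketch an access one granule past a tagged allocation fails the check because the neighbouring granule carries a different tag. A pure-software engine would run a check like this on every load and store, which is consistent with the higher overheads the abstract reports for software-only mode.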

trcrsired and others added 30 commits January 7, 2024 10:52
The reason is that ARM MTE allows operating directly on the tag bits,
which is very difficult to emulate when the architecture changes.
I do not want users to assume the size of the memtag, because other
architectures may use 8 bits instead of 4 bits for the memtag in the future.

However, I found that it is mainly used for tagging the stack, to avoid repeatedly
generating random numbers. This hint can be very useful, since the VM can simply
emit the hint index modulo the actual tag size, which works.
I forgot that the offset can still be useful because it avoids an add.
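
One possible reading of the hint commit message above, sketched in C++: stack slots reuse a single random base tag and derive their own tags from a small hint index, reduced modulo the number of tags the engine actually supports. The helper `tagFromHint` and its parameters are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: derive a slot's tag from one base tag plus a hint
// index, reduced modulo the number of available tags (2^tagBits).
// Assumes tagBits < 32. Code that uses it never hard-codes the tag width,
// so the same module would work with 4-bit or 8-bit tags.
constexpr std::uint32_t tagFromHint(std::uint32_t baseTag,
                                    std::uint32_t hintIdx, unsigned tagBits) {
  const std::uint32_t tagCount = std::uint32_t{1} << tagBits;
  return (baseTag + hintIdx) % tagCount;
}

// With 4-bit tags the derived tag wraps around after 15.
static_assert(tagFromHint(15, 1, 4) == 0, "wraps modulo 16");
```

With this scheme only one random number is drawn per frame, and adjacent stack slots still receive different tags as long as their hint indices differ modulo the tag count.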

Fix
They provide deterministic results, so there is no need to add side effects.
trcrsired and others added 25 commits October 16, 2024 19:30
@trcrsired trcrsired marked this pull request as draft October 11, 2025 04:55

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions inc,cpp,h -- llvm/lib/Target/WebAssembly/WebAssemblyGlobalsTagging.cpp llvm/lib/Target/WebAssembly/WebAssemblyStackTagging.cpp clang/lib/CodeGen/TargetBuiltins/WebAssembly.cpp clang/lib/Driver/SanitizerArgs.cpp clang/lib/Driver/ToolChain.cpp clang/lib/Driver/ToolChains/CommonArgs.cpp compiler-rt/lib/builtins/fp_compare_impl.inc lldb/source/Host/android/HostInfoAndroid.cpp llvm/include/llvm/TargetParser/Triple.h llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp llvm/lib/MC/MCWasmStreamer.cpp llvm/lib/Target/WebAssembly/WebAssembly.h llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp mlir/lib/ExecutionEngine/CRunnerUtils.cpp

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/clang/lib/CodeGen/TargetBuiltins/WebAssembly.cpp b/clang/lib/CodeGen/TargetBuiltins/WebAssembly.cpp
index e490391e5..8d74104f1 100644
--- a/clang/lib/CodeGen/TargetBuiltins/WebAssembly.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/WebAssembly.cpp
@@ -674,34 +674,34 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
   case WebAssembly::BI__builtin_wasm_memtag_status: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_status,
-      {ConvertType(E->getType())});
+                                        {ConvertType(E->getType())});
     return Builder.CreateCall(Callee, {Index});
   }
   case WebAssembly::BI__builtin_wasm_memtag_tagbits: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_tagbits,
-      {ConvertType(E->getType())});
+                                        {ConvertType(E->getType())});
     return Builder.CreateCall(Callee, {Index});
   }
   case WebAssembly::BI__builtin_wasm_memtag_startbit: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_startbit,
-      {ConvertType(E->getType())});
+                                        {ConvertType(E->getType())});
     return Builder.CreateCall(Callee, {Index});
   }
   case WebAssembly::BI__builtin_wasm_memtag_extract: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_extract,
-      {ConvertType(E->getType())});
+                                        {ConvertType(E->getType())});
     return Builder.CreateCall(Callee, {Index, Ptr});
   }
   case WebAssembly::BI__builtin_wasm_memtag_insert: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *Newtag = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_insert,
-      {Newtag->getType()});
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_insert, {Newtag->getType()});
     return Builder.CreateCall(Callee, {Index, Ptr, Newtag});
   }
   case WebAssembly::BI__builtin_wasm_memtag_copy: {
@@ -712,7 +712,8 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     return Builder.CreateCall(Callee, {Index, Ptr0, Ptr1});
   }
   case WebAssembly::BI__builtin_wasm_memtag_sub: {
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_sub, ConvertType(E->getType()));
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_sub, ConvertType(E->getType()));
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr0 = EmitScalarExpr(E->getArg(1));
     Value *Ptr1 = EmitScalarExpr(E->getArg(2));
@@ -728,14 +729,16 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *B16 = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_store, B16->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_store, B16->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, B16});
   }
   case WebAssembly::BI__builtin_wasm_memtag_storez: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *B16 = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_storez, B16->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_storez, B16->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, B16});
   }
   case WebAssembly::BI__builtin_wasm_memtag_untag: {
@@ -748,14 +751,16 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *B16 = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_untagstore, B16->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_untagstore, B16->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, B16});
   }
   case WebAssembly::BI__builtin_wasm_memtag_untagstorez: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *B16 = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_untagstorez, B16->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_untagstorez, B16->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, B16});
   }
   case WebAssembly::BI__builtin_wasm_memtag_random: {
@@ -768,21 +773,24 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *B16 = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_randomstore, B16->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_randomstore, B16->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, B16});
   }
   case WebAssembly::BI__builtin_wasm_memtag_randomstorez: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *B16 = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_randomstorez, B16->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_randomstorez, B16->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, B16});
   }
   case WebAssembly::BI__builtin_wasm_memtag_randommask: {
     Value *Index = EmitScalarExpr(E->getArg(0));
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *Mask = EmitScalarExpr(E->getArg(2));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_random, Mask->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_random, Mask->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, Mask});
   }
   case WebAssembly::BI__builtin_wasm_memtag_randommaskstore: {
@@ -791,7 +799,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *B16 = EmitScalarExpr(E->getArg(2));
     Value *Mask = EmitScalarExpr(E->getArg(3));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_randomstore,
-      {B16->getType(), Mask->getType()});
+                                        {B16->getType(), Mask->getType()});
     return Builder.CreateCall(Callee, {Index, Ptr, B16, Mask});
   }
   case WebAssembly::BI__builtin_wasm_memtag_randommaskstorez: {
@@ -799,8 +807,8 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *B16 = EmitScalarExpr(E->getArg(2));
     Value *Mask = EmitScalarExpr(E->getArg(3));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_randomstorez, 
-      {B16->getType(), Mask->getType()});
+    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_randomstorez,
+                                        {B16->getType(), Mask->getType()});
     return Builder.CreateCall(Callee, {Index, Ptr, B16, Mask});
   }
   case WebAssembly::BI__builtin_wasm_memtag_add: {
@@ -808,7 +816,8 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *PtrOffset = EmitScalarExpr(E->getArg(2));
     Value *TagOffset = EmitScalarExpr(E->getArg(3));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_add, TagOffset->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_add, TagOffset->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, PtrOffset, TagOffset});
   }
   case WebAssembly::BI__builtin_wasm_memtag_addstore: {
@@ -818,7 +827,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *PtrOffset = EmitScalarExpr(E->getArg(3));
     Value *TagOffset = EmitScalarExpr(E->getArg(4));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_addstore,
-      {B16->getType(), TagOffset->getType()});
+                                        {B16->getType(), TagOffset->getType()});
     return Builder.CreateCall(Callee, {Index, Ptr, B16, PtrOffset, TagOffset});
   }
   case WebAssembly::BI__builtin_wasm_memtag_addstorez: {
@@ -828,7 +837,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *PtrOffset = EmitScalarExpr(E->getArg(3));
     Value *TagOffset = EmitScalarExpr(E->getArg(4));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_addstorez,
-      {B16->getType(), TagOffset->getType()});
+                                        {B16->getType(), TagOffset->getType()});
     return Builder.CreateCall(Callee, {Index, Ptr, B16, PtrOffset, TagOffset});
   }
   case WebAssembly::BI__builtin_wasm_memtag_hint: {
@@ -836,7 +845,8 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *Ptr = EmitScalarExpr(E->getArg(1));
     Value *HintPtr = EmitScalarExpr(E->getArg(2));
     Value *HintIdx = EmitScalarExpr(E->getArg(3));
-    Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_hint, HintIdx->getType());
+    Function *Callee =
+        CGM.getIntrinsic(Intrinsic::wasm_memtag_hint, HintIdx->getType());
     return Builder.CreateCall(Callee, {Index, Ptr, HintPtr, HintIdx});
   }
   case WebAssembly::BI__builtin_wasm_memtag_hintstore: {
@@ -846,7 +856,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *HintPtr = EmitScalarExpr(E->getArg(3));
     Value *HintIdx = EmitScalarExpr(E->getArg(4));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_hintstore,
-      {B16->getType(), HintIdx->getType()});
+                                        {B16->getType(), HintIdx->getType()});
     return Builder.CreateCall(Callee, {Index, Ptr, B16, HintPtr, HintIdx});
   }
   case WebAssembly::BI__builtin_wasm_memtag_hintstorez: {
@@ -856,7 +866,7 @@ Value *CodeGenFunction::EmitWebAssemblyBuiltinExpr(unsigned BuiltinID,
     Value *HintPtr = EmitScalarExpr(E->getArg(3));
     Value *HintIdx = EmitScalarExpr(E->getArg(4));
     Function *Callee = CGM.getIntrinsic(Intrinsic::wasm_memtag_hintstorez,
-      {B16->getType(), HintIdx->getType()});
+                                        {B16->getType(), HintIdx->getType()});
     return Builder.CreateCall(Callee, {Index, Ptr, B16, HintPtr, HintIdx});
   }
   default:
diff --git a/clang/lib/Driver/SanitizerArgs.cpp b/clang/lib/Driver/SanitizerArgs.cpp
index fe23ae76d..28d2098ae 100644
--- a/clang/lib/Driver/SanitizerArgs.cpp
+++ b/clang/lib/Driver/SanitizerArgs.cpp
@@ -1577,8 +1577,7 @@ void SanitizerArgs::addArgs(const ToolChain &TC, const llvm::opt::ArgList &Args,
         << "-fvisibility=";
   }
 
-  if (Sanitizers.has(SanitizerKind::MemtagStack) &&
-      !TC.getTriple().isWasm() &&
+  if (Sanitizers.has(SanitizerKind::MemtagStack) && !TC.getTriple().isWasm() &&
       !hasTargetFeatureMTE(CmdArgs))
     TC.getDriver().Diag(diag::err_stack_tagging_requires_hardware_feature);
 }
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 2abdcd45e..0c07670ef 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1753,17 +1753,14 @@ bool tools::addSanitizerRuntimes(const ToolChain &TC, const ArgList &Args,
 
   if (SanArgs.hasMemTag()) {
     if (TC.getTriple().isWasm()) {
-    }
-    else if (TC.getTriple().isAndroid()) {
-      CmdArgs.push_back(
-          Args.MakeArgString("--android-memtag-mode=" + SanArgs.getMemtagMode()));
+    } else if (TC.getTriple().isAndroid()) {
+      CmdArgs.push_back(Args.MakeArgString("--android-memtag-mode=" +
+                                           SanArgs.getMemtagMode()));
       if (SanArgs.hasMemtagHeap())
         CmdArgs.push_back("--android-memtag-heap");
       if (SanArgs.hasMemtagStack())
         CmdArgs.push_back("--android-memtag-stack");
-    }
-    else
-    {
+    } else {
       TC.getDriver().Diag(diag::err_drv_unsupported_opt_for_target)
           << "-fsanitize=memtag*" << TC.getTriple().str();
     }
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 9dc48e46a..b3510a49c 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -783,8 +783,7 @@ void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) {
 
     if (T.isWasm()) {
       supportMemtagGlobals = true;
-    }
-    else if (arch == Triple::aarch64) {
+    } else if (arch == Triple::aarch64) {
       supportMemtagGlobals = true;
     }
 
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyGlobalsTagging.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyGlobalsTagging.cpp
index 01da537c9..6f878aaf2 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyGlobalsTagging.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyGlobalsTagging.cpp
@@ -1,4 +1,5 @@
-//===- WebAssemblyGlobalsTagging.cpp - Global tagging in IR -------------------===//
+//===- WebAssemblyGlobalsTagging.cpp - Global tagging in IR
+//-------------------===//
 //
 //                     The LLVM Compiler Infrastructure
 //
@@ -106,7 +107,9 @@ public:
 
   bool runOnModule(Module &M) override;
 
-  StringRef getPassName() const override { return "WebAssembly Globals Tagging"; }
+  StringRef getPassName() const override {
+    return "WebAssembly Globals Tagging";
+  }
 
 private:
   std::set<GlobalVariable *> GlobalsToTag;

@trcrsired
Contributor Author

@nikic @fmayer

@fmayer
Contributor

fmayer commented Oct 14, 2025

@pcc FYI
