[TySan] A Type Sanitizer (Clang) #76260

Open
wants to merge 2 commits into
base: users/fhahn/tysan-a-type-sanitizer-llvm

fhahn
Contributor

@fhahn fhahn commented Dec 22, 2023

This patch introduces the Clang components of a type sanitizer: a sanitizer for type-based aliasing violations.

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of a given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help.
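
As an illustration of the kind of code the rules above forbid, here is a minimal sketch (not taken from the patch or its tests; the function names are hypothetical). The first function reads a `float` object through a `std::uint32_t` lvalue, which is the classic type-punning violation; the second obtains the same bits with `std::memcpy`, which stays within the rules.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical example: violates the type-based aliasing rules, because a
// float object is read through an lvalue of type std::uint32_t.
// It is shown for illustration only and is deliberately never called here.
std::uint32_t bitsByCast(const float *f) {
  return *reinterpret_cast<const std::uint32_t *>(f);
}

// Well-defined alternative: copy the object representation byte by byte.
std::uint32_t bitsByMemcpy(const float *f) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
  std::uint32_t u;
  std::memcpy(&u, f, sizeof u);
  return u;
}
```

Code shaped like `bitsByCast` is the sort of access a type sanitizer is meant to flag at run time; on platforms where `float` is IEEE-754 binary32, `bitsByMemcpy` returns `0x3f800000` for `1.0f`.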

The Clang changes seem mostly formulaic; the one specific change is that when the type sanitizer is enabled, TBAA is always generated, even at -O0.
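
The "always generated, even at -O0" rule can be modelled as two small predicates. This is a standalone sketch that mirrors the conditions in the `CodeGenModule.cpp` and `CodeGenTBAA.cpp` hunks of this patch; the struct and function names are hypothetical and are not Clang APIs.

```cpp
// Hypothetical stand-in for the relevant LangOptions / CodeGenOptions bits.
struct TBAAInputs {
  bool Tsan;             // -fsanitize=thread
  bool Tysan;            // -fsanitize=type
  bool RelaxedAliasing;  // -fno-strict-aliasing
  unsigned OptLevel;     // 0 for -O0
};

// Models the CodeGenModule constructor: the TBAA builder is created when
// either sanitizer is on, or when strict aliasing is active above -O0.
bool moduleCreatesTBAA(const TBAAInputs &In) {
  return In.Tsan || In.Tysan || (!In.RelaxedAliasing && In.OptLevel > 0);
}

// Models CodeGenTBAA::getTypeInfo: for regular types, type info is skipped
// at -O0 or under relaxed aliasing, unless the type sanitizer is enabled.
bool typeInfoIsEmitted(const TBAAInputs &In) {
  if (!In.Tysan && (In.OptLevel == 0 || In.RelaxedAliasing))
    return false;
  return true;
}
```

In this model, enabling the type sanitizer overrides both the optimization level and `-fno-strict-aliasing`, whereas enabling only the thread sanitizer creates the TBAA builder but still skips type info for regular types at -O0.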

Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer.

Based on https://reviews.llvm.org/D32199.

#76259 (LLVM support)

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen labels Dec 22, 2023
@llvmbot
Collaborator

llvmbot commented Dec 22, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Florian Hahn (fhahn)

Changes


Patch is 22.30 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76260.diff

17 Files Affected:

  • (modified) clang/include/clang/Basic/Features.def (+1)
  • (modified) clang/include/clang/Basic/Sanitizers.def (+3)
  • (modified) clang/include/clang/Driver/SanitizerArgs.h (+1)
  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+6)
  • (modified) clang/lib/CodeGen/CGDecl.cpp (+2-1)
  • (modified) clang/lib/CodeGen/CGDeclCXX.cpp (+4)
  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+2)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+7-5)
  • (modified) clang/lib/CodeGen/CodeGenTBAA.cpp (+4-2)
  • (modified) clang/lib/CodeGen/SanitizerMetadata.cpp (+34-10)
  • (modified) clang/lib/CodeGen/SanitizerMetadata.h (+7-6)
  • (modified) clang/lib/Driver/SanitizerArgs.cpp (+10-5)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+5-1)
  • (modified) clang/lib/Driver/ToolChains/Darwin.cpp (+5)
  • (modified) clang/lib/Driver/ToolChains/Linux.cpp (+2)
  • (added) clang/test/CodeGen/sanitize-type-attr.cpp (+74)
  • (modified) clang/test/Driver/sanitizer-ld.c (+23)
diff --git a/clang/include/clang/Basic/Features.def b/clang/include/clang/Basic/Features.def
index 06efac0cf1abd7..6a1bc04963af68 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -98,6 +98,7 @@ FEATURE(nullability_nullable_result, true)
 FEATURE(memory_sanitizer,
         LangOpts.Sanitize.hasOneOf(SanitizerKind::Memory |
                                    SanitizerKind::KernelMemory))
+FEATURE(type_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Type))
 FEATURE(thread_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Thread))
 FEATURE(dataflow_sanitizer, LangOpts.Sanitize.has(SanitizerKind::DataFlow))
 FEATURE(scudo, LangOpts.Sanitize.hasOneOf(SanitizerKind::Scudo))
diff --git a/clang/include/clang/Basic/Sanitizers.def b/clang/include/clang/Basic/Sanitizers.def
index c2137e3f61f645..a0f54d9e86327a 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -73,6 +73,9 @@ SANITIZER("fuzzer", Fuzzer)
 // libFuzzer-required instrumentation, no linking.
 SANITIZER("fuzzer-no-link", FuzzerNoLink)
 
+// TypeSanitizer
+SANITIZER("type", Type)
+
 // ThreadSanitizer
 SANITIZER("thread", Thread)
 
diff --git a/clang/include/clang/Driver/SanitizerArgs.h b/clang/include/clang/Driver/SanitizerArgs.h
index 07070ec4fc0653..52b482a0e8a1a9 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -86,6 +86,7 @@ class SanitizerArgs {
   bool needsHwasanAliasesRt() const {
     return needsHwasanRt() && HwasanUseAliases;
   }
+  bool needsTysanRt() const { return Sanitizers.has(SanitizerKind::Type); }
   bool needsTsanRt() const { return Sanitizers.has(SanitizerKind::Thread); }
   bool needsMsanRt() const { return Sanitizers.has(SanitizerKind::Memory); }
   bool needsFuzzer() const { return Sanitizers.has(SanitizerKind::Fuzzer); }
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 480410db1021b7..d7c233b4da3ff5 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -79,6 +79,7 @@
 #include "llvm/Transforms/Instrumentation/SanitizerBinaryMetadata.h"
 #include "llvm/Transforms/Instrumentation/SanitizerCoverage.h"
 #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
+#include "llvm/Transforms/Instrumentation/TypeSanitizer.h"
 #include "llvm/Transforms/ObjCARC.h"
 #include "llvm/Transforms/Scalar/EarlyCSE.h"
 #include "llvm/Transforms/Scalar/GVN.h"
@@ -685,6 +686,11 @@ static void addSanitizers(const Triple &TargetTriple,
       MPM.addPass(createModuleToFunctionPassAdaptor(ThreadSanitizerPass()));
     }
 
+    if (LangOpts.Sanitize.has(SanitizerKind::Type)) {
+      MPM.addPass(ModuleTypeSanitizerPass());
+      MPM.addPass(createModuleToFunctionPassAdaptor(TypeSanitizerPass()));
+    }
+
     auto ASanPass = [&](SanitizerMask Mask, bool CompileKernel) {
       if (LangOpts.Sanitize.has(Mask)) {
         bool UseGlobalGC = asanUseGlobalsGC(TargetTriple, CodeGenOpts);
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index a5da0aa2965a00..3f8e58db65fe1b 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -481,7 +481,8 @@ void CodeGenFunction::EmitStaticVarDecl(const VarDecl &D,
   LocalDeclMap.find(&D)->second = Address(castedAddr, elemTy, alignment);
   CGM.setStaticLocalDeclAddress(&D, castedAddr);
 
-  CGM.getSanitizerMetadata()->reportGlobal(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToASan(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToTySan(var, D);
 
   // Emit global variable debug descriptor for static vars.
   CGDebugInfo *DI = getDebugInfo();
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index e08a1e5f42df20..08b3e06cb5a450 100644
--- a/clang/lib/CodeGen/CGDeclCXX.cpp
+++ b/clang/lib/CodeGen/CGDeclCXX.cpp
@@ -472,6 +472,10 @@ llvm::Function *CodeGenModule::CreateGlobalInitOrCleanUpFunction(
       !isInNoSanitizeList(SanitizerKind::MemtagStack, Fn, Loc))
     Fn->addFnAttr(llvm::Attribute::SanitizeMemTag);
 
+  if (getLangOpts().Sanitize.has(SanitizerKind::Type) &&
+      !isInNoSanitizeList(SanitizerKind::Type, Fn, Loc))
+    Fn->addFnAttr(llvm::Attribute::SanitizeType);
+
   if (getLangOpts().Sanitize.has(SanitizerKind::Thread) &&
       !isInNoSanitizeList(SanitizerKind::Thread, Fn, Loc))
     Fn->addFnAttr(llvm::Attribute::SanitizeThread);
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 2199d7b58fb96e..26825802fc3d93 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -790,6 +790,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
       Fn->addFnAttr(llvm::Attribute::SanitizeMemTag);
     if (SanOpts.has(SanitizerKind::Thread))
       Fn->addFnAttr(llvm::Attribute::SanitizeThread);
+    if (SanOpts.has(SanitizerKind::Type))
+      Fn->addFnAttr(llvm::Attribute::SanitizeType);
     if (SanOpts.hasOneOf(SanitizerKind::Memory | SanitizerKind::KernelMemory))
       Fn->addFnAttr(llvm::Attribute::SanitizeMemory);
   }
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index d78f2594a23764..891af7252b24a6 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -391,8 +391,8 @@ CodeGenModule::CodeGenModule(ASTContext &C,
   if (LangOpts.HLSL)
     createHLSLRuntime();
 
-  // Enable TBAA unless it's suppressed. ThreadSanitizer needs TBAA even at O0.
-  if (LangOpts.Sanitize.has(SanitizerKind::Thread) ||
+  // Enable TBAA unless it's suppressed. TSan and TySan need TBAA even at O0.
+  if (LangOpts.Sanitize.hasOneOf(SanitizerKind::Thread | SanitizerKind::Type) ||
       (!CodeGenOpts.RelaxedAliasing && CodeGenOpts.OptimizationLevel > 0))
     TBAA.reset(new CodeGenTBAA(Context, TheModule, CodeGenOpts, getLangOpts(),
                                getCXXABI().getMangleContext()));
@@ -4924,7 +4924,7 @@ CodeGenModule::GetOrCreateLLVMGlobal(StringRef MangledName, llvm::Type *Ty,
   }
 
   if (D)
-    SanitizerMD->reportGlobal(GV, *D);
+    SanitizerMD->reportGlobalToASan(GV, *D);
 
   LangAS ExpectedAS =
       D ? D->getType().getAddressSpace()
@@ -5465,7 +5465,8 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D,
   if (NeedsGlobalCtor || NeedsGlobalDtor)
     EmitCXXGlobalVarDeclInitFunc(D, GV, NeedsGlobalCtor);
 
-  SanitizerMD->reportGlobal(GV, *D, NeedsGlobalCtor);
+  SanitizerMD->reportGlobalToASan(GV, *D, NeedsGlobalCtor);
+  SanitizerMD->reportGlobalToTySan(GV, *D);
 
   // Emit global variable debug information.
   if (CGDebugInfo *DI = getModuleDebugInfo())
@@ -6341,7 +6342,8 @@ CodeGenModule::GetAddrOfConstantStringFromLiteral(const StringLiteral *S,
   if (Entry)
     *Entry = GV;
 
-  SanitizerMD->reportGlobal(GV, S->getStrTokenLoc(0), "<string literal>");
+  SanitizerMD->reportGlobalToASan(GV, S->getStrTokenLoc(0), "<string literal>");
+  // FIXME: Should we also report to the TySan?
 
   return ConstantAddress(castStringLiteralToDefaultAddressSpace(*this, GV),
                          GV->getValueType(), Alignment);
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index dc288bc3f6157a..861b2b35a27d13 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -226,8 +226,10 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
 }
 
 llvm::MDNode *CodeGenTBAA::getTypeInfo(QualType QTy) {
-  // At -O0 or relaxed aliasing, TBAA is not emitted for regular types.
-  if (CodeGenOpts.OptimizationLevel == 0 || CodeGenOpts.RelaxedAliasing)
+  // At -O0 or relaxed aliasing, TBAA is not emitted for regular types (unless
+  // we're running TypeSanitizer).
+  if (!Features.Sanitize.has(SanitizerKind::Type) &&
+      (CodeGenOpts.OptimizationLevel == 0 || CodeGenOpts.RelaxedAliasing))
     return nullptr;
 
   // If the type has the may_alias attribute (even on a typedef), it is
diff --git a/clang/lib/CodeGen/SanitizerMetadata.cpp b/clang/lib/CodeGen/SanitizerMetadata.cpp
index 53161c316c58a4..c0d38d3c747dea 100644
--- a/clang/lib/CodeGen/SanitizerMetadata.cpp
+++ b/clang/lib/CodeGen/SanitizerMetadata.cpp
@@ -34,11 +34,11 @@ SanitizerMask expandKernelSanitizerMasks(SanitizerMask Mask) {
   return Mask;
 }
 
-void SanitizerMetadata::reportGlobal(llvm::GlobalVariable *GV,
-                                     SourceLocation Loc, StringRef Name,
-                                     QualType Ty,
-                                     SanitizerMask NoSanitizeAttrMask,
-                                     bool IsDynInit) {
+void SanitizerMetadata::reportGlobalToASan(llvm::GlobalVariable *GV,
+                                           SourceLocation Loc, StringRef Name,
+                                           QualType Ty,
+                                           SanitizerMask NoSanitizeAttrMask,
+                                           bool IsDynInit) {
   SanitizerSet FsanitizeArgument = CGM.getLangOpts().Sanitize;
   if (!isAsanHwasanOrMemTag(FsanitizeArgument))
     return;
@@ -75,8 +75,8 @@ void SanitizerMetadata::reportGlobal(llvm::GlobalVariable *GV,
   GV->setSanitizerMetadata(Meta);
 }
 
-void SanitizerMetadata::reportGlobal(llvm::GlobalVariable *GV, const VarDecl &D,
-                                     bool IsDynInit) {
+void SanitizerMetadata::reportGlobalToASan(llvm::GlobalVariable *GV,
+                                           const VarDecl &D, bool IsDynInit) {
   if (!isAsanHwasanOrMemTag(CGM.getLangOpts().Sanitize))
     return;
   std::string QualName;
@@ -94,10 +94,34 @@ void SanitizerMetadata::reportGlobal(llvm::GlobalVariable *GV, const VarDecl &D,
     return NoSanitizeMask;
   };
 
-  reportGlobal(GV, D.getLocation(), OS.str(), D.getType(), getNoSanitizeMask(D),
-               IsDynInit);
+  reportGlobalToASan(GV, D.getLocation(), OS.str(), D.getType(),
+                     getNoSanitizeMask(D), IsDynInit);
+}
+
+void SanitizerMetadata::reportGlobalToTySan(llvm::GlobalVariable *GV,
+                                            const VarDecl &D) {
+  if (!CGM.getLangOpts().Sanitize.has(SanitizerKind::Type))
+    return;
+
+  for (auto Attr : D.specific_attrs<NoSanitizeAttr>())
+    if (Attr->getMask() & SanitizerKind::Type)
+      return;
+
+  QualType QTy = D.getType();
+  llvm::MDNode *TBAAInfo = CGM.getTBAATypeInfo(QTy);
+  if (!TBAAInfo || TBAAInfo == CGM.getTBAATypeInfo(CGM.getContext().CharTy))
+    return;
+
+  llvm::Metadata *GlobalMetadata[] = {llvm::ConstantAsMetadata::get(GV),
+                                      TBAAInfo};
+
+  llvm::MDNode *ThisGlobal =
+      llvm::MDNode::get(CGM.getLLVMContext(), GlobalMetadata);
+  llvm::NamedMDNode *TysanGlobals =
+      CGM.getModule().getOrInsertNamedMetadata("llvm.tysan.globals");
+  TysanGlobals->addOperand(ThisGlobal);
 }
 
 void SanitizerMetadata::disableSanitizerForGlobal(llvm::GlobalVariable *GV) {
-  reportGlobal(GV, SourceLocation(), "", QualType(), SanitizerKind::All);
+  reportGlobalToASan(GV, SourceLocation(), "", QualType(), SanitizerKind::All);
 }
diff --git a/clang/lib/CodeGen/SanitizerMetadata.h b/clang/lib/CodeGen/SanitizerMetadata.h
index 000f02cf8dcf11..9de087c518c6ad 100644
--- a/clang/lib/CodeGen/SanitizerMetadata.h
+++ b/clang/lib/CodeGen/SanitizerMetadata.h
@@ -37,12 +37,13 @@ class SanitizerMetadata {
 
 public:
   SanitizerMetadata(CodeGenModule &CGM);
-  void reportGlobal(llvm::GlobalVariable *GV, const VarDecl &D,
-                    bool IsDynInit = false);
-  void reportGlobal(llvm::GlobalVariable *GV, SourceLocation Loc,
-                    StringRef Name, QualType Ty = {},
-                    SanitizerMask NoSanitizeAttrMask = {},
-                    bool IsDynInit = false);
+  void reportGlobalToASan(llvm::GlobalVariable *GV, const VarDecl &D,
+                          bool IsDynInit = false);
+  void reportGlobalToASan(llvm::GlobalVariable *GV, SourceLocation Loc,
+                          StringRef Name, QualType Ty = {},
+                          SanitizerMask NoSanitizeAttrMask = {},
+                          bool IsDynInit = false);
+  void reportGlobalToTySan(llvm::GlobalVariable *GV, const VarDecl &D);
   void disableSanitizerForGlobal(llvm::GlobalVariable *GV);
 };
 } // end namespace CodeGen
diff --git a/clang/lib/Driver/SanitizerArgs.cpp b/clang/lib/Driver/SanitizerArgs.cpp
index ad68c086b71790..6865451b8bb7e4 100644
--- a/clang/lib/Driver/SanitizerArgs.cpp
+++ b/clang/lib/Driver/SanitizerArgs.cpp
@@ -42,14 +42,14 @@ static const SanitizerMask NotAllowedWithExecuteOnly =
 static const SanitizerMask RequiresPIE =
     SanitizerKind::DataFlow | SanitizerKind::Scudo;
 static const SanitizerMask NeedsUnwindTables =
-    SanitizerKind::Address | SanitizerKind::HWAddress | SanitizerKind::Thread |
-    SanitizerKind::Memory | SanitizerKind::DataFlow;
+    SanitizerKind::Address | SanitizerKind::HWAddress | SanitizerKind::Type |
+    SanitizerKind::Thread | SanitizerKind::Memory | SanitizerKind::DataFlow;
 static const SanitizerMask SupportsCoverage =
     SanitizerKind::Address | SanitizerKind::HWAddress |
     SanitizerKind::KernelAddress | SanitizerKind::KernelHWAddress |
-    SanitizerKind::MemtagStack | SanitizerKind::MemtagHeap |
-    SanitizerKind::MemtagGlobals | SanitizerKind::Memory |
-    SanitizerKind::KernelMemory | SanitizerKind::Leak |
+    SanitizerKind::Type | SanitizerKind::MemtagStack |
+    SanitizerKind::MemtagHeap | SanitizerKind::MemtagGlobals |
+    SanitizerKind::Memory | SanitizerKind::KernelMemory | SanitizerKind::Leak |
     SanitizerKind::Undefined | SanitizerKind::Integer | SanitizerKind::Bounds |
     SanitizerKind::ImplicitConversion | SanitizerKind::Nullability |
     SanitizerKind::DataFlow | SanitizerKind::Fuzzer |
@@ -178,6 +178,7 @@ static void addDefaultIgnorelists(const Driver &D, SanitizerMask Kinds,
                      {"memtag_ignorelist.txt", SanitizerKind::MemTag},
                      {"msan_ignorelist.txt", SanitizerKind::Memory},
                      {"tsan_ignorelist.txt", SanitizerKind::Thread},
+                     {"tysan_blacklist.txt", SanitizerKind::Type},
                      {"dfsan_abilist.txt", SanitizerKind::DataFlow},
                      {"cfi_ignorelist.txt", SanitizerKind::CFI},
                      {"ubsan_ignorelist.txt",
@@ -516,6 +517,10 @@ SanitizerArgs::SanitizerArgs(const ToolChain &TC,
   std::pair<SanitizerMask, SanitizerMask> IncompatibleGroups[] = {
       std::make_pair(SanitizerKind::Address,
                      SanitizerKind::Thread | SanitizerKind::Memory),
+      std::make_pair(SanitizerKind::Type,
+                     SanitizerKind::Address | SanitizerKind::KernelAddress |
+                         SanitizerKind::Memory | SanitizerKind::Leak |
+                         SanitizerKind::Thread | SanitizerKind::KernelAddress),
       std::make_pair(SanitizerKind::Thread, SanitizerKind::Memory),
       std::make_pair(SanitizerKind::Leak,
                      SanitizerKind::Thread | SanitizerKind::Memory),
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 6eb0ed8f3fed9a..bb2b85c0553d72 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1337,8 +1337,10 @@ collectSanitizerRuntimes(const ToolChain &TC, const ArgList &Args,
     if (SanArgs.needsScudoRt()) {
       SharedRuntimes.push_back("scudo_standalone");
     }
-    if (SanArgs.needsTsanRt())
+    if (SanArgs.needsTsanRt() && SanArgs.linkRuntimes())
       SharedRuntimes.push_back("tsan");
+    if (SanArgs.needsTysanRt())
+      StaticRuntimes.push_back("tysan");
     if (SanArgs.needsHwasanRt()) {
       if (SanArgs.needsHwasanAliasesRt())
         SharedRuntimes.push_back("hwasan_aliases");
@@ -1403,6 +1405,8 @@ collectSanitizerRuntimes(const ToolChain &TC, const ArgList &Args,
     if (SanArgs.linkCXXRuntimes())
       StaticRuntimes.push_back("tsan_cxx");
   }
+  if (!SanArgs.needsSharedRt() && SanArgs.needsTysanRt())
+    StaticRuntimes.push_back("tysan");
   if (!SanArgs.needsSharedRt() && SanArgs.needsUbsanRt()) {
     if (SanArgs.requiresMinimalRuntime()) {
       StaticRuntimes.push_back("ubsan_minimal");
diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp
index 65846cace461e3..02028bc1a50cc7 100644
--- a/clang/lib/Driver/ToolChains/Darwin.cpp
+++ b/clang/lib/Driver/ToolChains/Darwin.cpp
@@ -1505,6 +1505,8 @@ void DarwinClang::AddLinkRuntimeLibArgs(const ArgList &Args,
              "Static sanitizer runtimes not supported");
       AddLinkSanitizerLibArgs(Args, CmdArgs, "tsan");
     }
+    if (Sanitize.needsTysanRt())
+      AddLinkSanitizerLibArgs(Args, CmdArgs, "tysan");
     if (Sanitize.needsFuzzer() && !Args.hasArg(options::OPT_dynamiclib)) {
       AddLinkSanitizerLibArgs(Args, CmdArgs, "fuzzer", /*shared=*/false);
 
@@ -3373,6 +3375,9 @@ SanitizerMask Darwin::getSupportedSanitizers() const {
        isTargetTvOSSimulator() || isTargetWatchOSSimulator())) {
     Res |= SanitizerKind::Thread;
   }
+  if ((IsX86_64 || IsAArch64) && isTargetMacOSBased()) {
+    Res |= SanitizerKind::Type;
+  }
   return Res;
 }
 
diff --git a/clang/lib/Driver/ToolChains/Linux.cpp b/clang/lib/Driver/ToolChains/Linux.cpp
index 735af54f114cef..97a8d2a29f1b95 100644
--- a/clang/lib/Driver/ToolChains/Linux.cpp
+++ b/clang/lib/Driver/ToolChains/Linux.cpp
@@ -803,6 +803,8 @@ SanitizerMask Linux::getSupportedSanitizers() const {
   if (IsX86_64 || IsMIPS64 || IsAArch64 || IsPowerPC64 || IsSystemZ ||
       IsLoongArch64 || IsRISCV64)
     Res |= SanitizerKind::Thread;
+  if (IsX86_64 || IsAArch64)
+    Res |= SanitizerKind::Type;
   if (IsX86_64 || IsSystemZ)
     Res |= SanitizerKind::KernelMemory;
   if (IsX86_64 || IsMIPS64 || IsAArch64 || IsX86 || IsMIPS || IsArmArch ||
diff --git a/clang/test/CodeGen/sanitize-type-attr.cpp b/clang/test/CodeGen/sanitize-type-attr.cpp
new file mode 100644
index 00000000000000..4da8488e1f9486
--- /dev/null
+++ b/clang/test/CodeGen/sanitize-type-attr.cpp
@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s | FileCheck -check-prefix=WITHOUT %s
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=type | FileCheck -check-prefix=TYSAN %s
+// RUN: echo "src:%s" | sed -e 's/\\/\\\\/g' > %t
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=type -fsanitize-blacklist=%t | FileCheck -check-prefix=BL %s
+
+// The sanitize_type attribute should be attached to functions
+// when TypeSanitizer is enabled, unless no_sanitize("type") attribute
+// is present.
+
+// WITHOUT:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+// TYSAN:  NoTYSAN1{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("type"))) int NoTYSAN1(int *a) { return *a; }
+
+// WITHOUT:  NoTYSAN2{{.*}}) [[NOATTR]]
+// BL:  NoTYSAN2{{.*}}) [[NOATTR]]
+// TYSAN:  NoTYSAN2{{.*}}) [[NOATTR]]
+__attribute__((no_sanitize("type"))) int NoTYSAN2(int *a);
+int NoTYSAN2(int *a) { return *a; }
+
+// WITHOUT:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// BL:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+// TYSAN:  NoTYSAN3{{.*}}) [[NOATTR:#[0-9]+]]
+__attribute__((no_sanitize("type"))) int NoTYSAN3(int *a) { return *a; }
+
+// WITHOUT:  TYSANOk{{.*}}) [[NOATTR]]
+// BL:  TYSANOk{{.*}}) [[NOATTR]]
+// TYSAN: TYSANOk{{.*}}) [[WITH:#[0-9]+]]
+int TYSANOk(int *a) { return *a; }
+
+// WITHOUT:  TemplateTYSANOk{{.*}}) [[NOATTR]]
+// BL:  TemplateTYSANOk{{.*}}) [[NOATTR]]
+// TYSAN: TemplateTYSANOk{{.*}}) [[WITH]]
+template <int i>
+int TemplateTYSANOk() { return i; }
+
+// WITHOUT:  TemplateNoTYSAN{{.*}}) [[NOATTR]]
+// BL:  TemplateNoTYSAN{{.*}}) [[NOATTR]]
+// TYSAN: TemplateNoTYSAN{{.*}}) [[NOATTR]]
+template <int i>
+__attribute__((no_sanitize("type"))) int TemplateNoTYSAN() { return i; }
+
+int force_instance = TemplateTYSANOk<42>() + TemplateNoTYSAN<42>();
+
+// Check that __cxx_global_var_init* get the sanitize_type attribute.
+int global1 = 0;
+int global2 = *(int *)((char *)&global1 + 1);
+// WITHOUT: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// BL: @__cxx_global_var_init{{.*}}[[NOATTR:#[0-9]+]]
+// TYSAN: @__cxx_global_var_init{{.*}}[[WITH:#[0-9]+]]
+
+// Make sure that we don't add globals to the list for which we don't have a
+// specific type description.
+// FIXME: We now have a type description for this type and a global...
[truncated]
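
For readers of the diff above, here is a short sketch of how user code might interact with two of the user-visible pieces it adds: the `type_sanitizer` feature name from `Features.def` and the `no_sanitize("type")` attribute exercised in `sanitize-type-attr.cpp`. The macro, constant, and function names below are hypothetical, and the sketch assumes a Clang build that includes this patch; on other compilers it falls back to plain functions.

```cpp
// Compilers without __has_feature get a fallback that always answers "no".
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(type_sanitizer)
// Built with -fsanitize=type: opt individual functions out of checking.
#define EXAMPLE_NO_SANITIZE_TYPE __attribute__((no_sanitize("type")))
constexpr bool kBuiltWithTySan = true;
#else
#define EXAMPLE_NO_SANITIZE_TYPE
constexpr bool kBuiltWithTySan = false;
#endif

// Mirrors NoTYSAN1 in the test: should not receive the sanitize_type attribute.
EXAMPLE_NO_SANITIZE_TYPE int loadUnchecked(int *a) { return *a; }

// Mirrors TYSANOk in the test: instrumented when the sanitizer is enabled.
int loadChecked(int *a) { return *a; }
```

Gating the attribute on the feature check keeps the code free of unknown-sanitizer warnings on compilers that predate the patch.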

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 58839969d139a2e492ac217a5b49ac345ed73497 4ee0decc0c0a3e7aad9fc5705720752de8c15b04 -- clang/test/CodeGen/sanitize-type-attr.cpp clang/include/clang/Driver/SanitizerArgs.h clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CGDecl.cpp clang/lib/CodeGen/CGDeclCXX.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenTBAA.cpp clang/lib/CodeGen/SanitizerMetadata.cpp clang/lib/CodeGen/SanitizerMetadata.h clang/lib/Driver/SanitizerArgs.cpp clang/lib/Driver/ToolChains/CommonArgs.cpp clang/lib/Driver/ToolChains/Darwin.cpp clang/lib/Driver/ToolChains/Linux.cpp clang/test/Driver/sanitizer-ld.c
View the diff from clang-format here.
diff --git a/clang/lib/Driver/SanitizerArgs.cpp b/clang/lib/Driver/SanitizerArgs.cpp
index 6865451b8b..802f46735d 100644
--- a/clang/lib/Driver/SanitizerArgs.cpp
+++ b/clang/lib/Driver/SanitizerArgs.cpp
@@ -550,10 +550,10 @@ SanitizerArgs::SanitizerArgs(const ToolChain &TC,
                          SanitizerKind::Leak | SanitizerKind::Thread |
                          SanitizerKind::Memory | SanitizerKind::KernelAddress |
                          SanitizerKind::Scudo | SanitizerKind::SafeStack),
-      std::make_pair(SanitizerKind::MemTag,
-                     SanitizerKind::Address | SanitizerKind::KernelAddress |
-                         SanitizerKind::HWAddress |
-                         SanitizerKind::KernelHWAddress),
+      std::make_pair(SanitizerKind::MemTag, SanitizerKind::Address |
+                                                SanitizerKind::KernelAddress |
+                                                SanitizerKind::HWAddress |
+                                                SanitizerKind::KernelHWAddress),
       std::make_pair(SanitizerKind::KCFI, SanitizerKind::Function)};
   // Enable toolchain specific default sanitizers if not explicitly disabled.
   SanitizerMask Default = TC.getDefaultSanitizers() & ~AllRemove;

fhahn added a commit to fhahn/llvm-project that referenced this pull request Dec 22, 2023
Pull Request: llvm#76260
@thesamesam thesamesam added the TBAA Type-Based Alias Analysis / Strict Aliasing label Dec 26, 2023
Collaborator

@erichkeane erichkeane left a comment

The clang changes are ok, but this needs some level of documentation/release notes, which I don't see in the clang release. As this is a part of a larger feature, do we intend to push that later?

Also, the clang-format suggestion makes sense.

@fhahn fhahn requested a review from Endilll as a code owner April 18, 2024 21:59
@fhahn fhahn force-pushed the users/fhahn/tysan-a-type-sanitizer-clang branch from 76b7017 to 96912ae Compare April 18, 2024 22:01
@fhahn fhahn changed the base branch from users/fhahn/main.tysan-a-type-sanitizer-clang to users/fhahn/tysan-a-type-sanitizer-llvm April 18, 2024 22:01
@fhahn fhahn force-pushed the users/fhahn/tysan-a-type-sanitizer-clang branch from 7f93cb0 to ee7ed21 Compare April 18, 2024 22:04
@fhahn fhahn force-pushed the users/fhahn/tysan-a-type-sanitizer-llvm branch from eb97b07 to b1465a7 Compare June 27, 2024 16:09
@fhahn fhahn force-pushed the users/fhahn/tysan-a-type-sanitizer-clang branch from ee7ed21 to fb7da09 Compare June 27, 2024 16:09
Labels
clang:codegen clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category TBAA Type-Based Alias Analysis / Strict Aliasing