[LLVM][Instrumentation] Add numerical sanitizer #85916

Open: wants to merge 1 commit into base: main

alexander-shaposhnikov (Collaborator) commented Mar 20, 2024

This PR introduces the numerical sanitizer originally proposed by Clement Courbet in https://reviews.llvm.org/D97854
(paper: https://arxiv.org/abs/2102.12782).

The main additions include:

  • Migration to LLVM opaque pointers
  • Migration to various updated APIs
  • Extended coverage for LLVM instructions/intrinsics, which got the tests passing again

The tool is still very experimental, and its coverage (e.g. for intrinsics and library functions) is incomplete.

Test plan:

  1. ninja check-all
  2. ninja check-nsan
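
For readers new to the approach, the idea can be sketched outside LLVM: each floating-point value is paired with a higher-precision shadow, the same operations are applied to both, and a large relative difference is reported. The snippet below is a hand-written illustration under that assumption, not the instrumentation pass itself; `Shadowed`, `relativeError`, and `cancellationExample` are hypothetical names that do not appear in the patch.

```cpp
#include <cmath>

// Hypothetical illustration: a float paired with a double "shadow".
struct Shadowed {
  float Value;   // what the application computes
  double Shadow; // the same computation carried out in higher precision
};

// Every arithmetic operation is mirrored on the shadow.
inline Shadowed operator+(Shadowed A, Shadowed B) {
  return {A.Value + B.Value, A.Shadow + B.Shadow};
}
inline Shadowed operator-(Shadowed A, Shadowed B) {
  return {A.Value - B.Value, A.Shadow - B.Shadow};
}

// Relative difference between the application value and its shadow.
// Falls back to the absolute difference when the shadow is zero.
inline double relativeError(Shadowed X) {
  double Diff = std::fabs(static_cast<double>(X.Value) - X.Shadow);
  double Scale = std::fabs(X.Shadow);
  return Scale == 0.0 ? Diff : Diff / Scale;
}

// Cancellation: (Big + Small) - Big drops Small in float,
// because 1.0 is below the float spacing near 1e8.
inline Shadowed cancellationExample() {
  Shadowed Big{1.0e8f, 1.0e8};
  Shadowed Small{1.0f, 1.0};
  return (Big + Small) - Big;
}
```

On a target that evaluates `float` arithmetic in single precision (such as x86-64 with SSE), the application value of `cancellationExample()` differs from its shadow by the full magnitude of the result, which is the kind of discrepancy a check on the shadow is meant to surface.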

@llvmbot llvmbot added clang Clang issues not falling into any other category compiler-rt clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen compiler-rt:sanitizer llvm:ir llvm:transforms labels Mar 20, 2024
llvmbot (Collaborator) commented Mar 20, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-compiler-rt-sanitizer
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-clang-codegen

Author: Alexander Shaposhnikov (alexander-shaposhnikov)

Patch is 344.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85916.diff

77 Files Affected:

  • (modified) clang/include/clang/Basic/Features.def (+1)
  • (modified) clang/include/clang/Basic/Sanitizers.def (+3)
  • (modified) clang/include/clang/Driver/SanitizerArgs.h (+3)
  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+7)
  • (modified) clang/lib/CodeGen/CGDeclCXX.cpp (+4)
  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+2)
  • (modified) clang/lib/Driver/SanitizerArgs.cpp (+5-2)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+3)
  • (modified) clang/lib/Driver/ToolChains/Linux.cpp (+4)
  • (modified) clang/runtime/CMakeLists.txt (+2-1)
  • (modified) compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake (+1)
  • (modified) compiler-rt/cmake/config-ix.cmake (+12-1)
  • (added) compiler-rt/include/sanitizer/nsan_interface.h (+75)
  • (added) compiler-rt/lib/nsan/CMakeLists.txt (+61)
  • (added) compiler-rt/lib/nsan/nsan.cc (+828)
  • (added) compiler-rt/lib/nsan/nsan.h (+224)
  • (added) compiler-rt/lib/nsan/nsan.syms.extra (+2)
  • (added) compiler-rt/lib/nsan/nsan_flags.cc (+78)
  • (added) compiler-rt/lib/nsan/nsan_flags.h (+35)
  • (added) compiler-rt/lib/nsan/nsan_flags.inc (+49)
  • (added) compiler-rt/lib/nsan/nsan_interceptors.cc (+363)
  • (added) compiler-rt/lib/nsan/nsan_platform.h (+135)
  • (added) compiler-rt/lib/nsan/nsan_stats.cc (+158)
  • (added) compiler-rt/lib/nsan/nsan_stats.h (+92)
  • (added) compiler-rt/lib/nsan/nsan_suppressions.cc (+76)
  • (added) compiler-rt/lib/nsan/nsan_suppressions.h (+31)
  • (added) compiler-rt/lib/nsan/tests/CMakeLists.txt (+54)
  • (added) compiler-rt/lib/nsan/tests/NSanUnitTest.cpp (+67)
  • (added) compiler-rt/lib/nsan/tests/nsan_unit_test_main.cpp (+18)
  • (added) compiler-rt/test/nsan/CMakeLists.txt (+33)
  • (added) compiler-rt/test/nsan/alloca.cc (+24)
  • (added) compiler-rt/test/nsan/cadna_ex1.cc (+21)
  • (added) compiler-rt/test/nsan/cadna_ex2.cc (+52)
  • (added) compiler-rt/test/nsan/cadna_ex3.cc (+48)
  • (added) compiler-rt/test/nsan/cadna_ex4.cc (+37)
  • (added) compiler-rt/test/nsan/cadna_ex5.cc (+97)
  • (added) compiler-rt/test/nsan/cadna_ex6.cc (+67)
  • (added) compiler-rt/test/nsan/cadna_ex7.cc (+110)
  • (added) compiler-rt/test/nsan/cancellation_fn_ptr.cc (+66)
  • (added) compiler-rt/test/nsan/cancellation_libm.cc (+51)
  • (added) compiler-rt/test/nsan/cancellation_ok.cc (+53)
  • (added) compiler-rt/test/nsan/compare.cc (+28)
  • (added) compiler-rt/test/nsan/compute_pi.cc (+45)
  • (added) compiler-rt/test/nsan/helpers.h (+15)
  • (added) compiler-rt/test/nsan/infinity.cc (+24)
  • (added) compiler-rt/test/nsan/intercept_libc_str.cc (+149)
  • (added) compiler-rt/test/nsan/interface_dump_shadow_mem.cc (+62)
  • (added) compiler-rt/test/nsan/jmmuller.cc (+35)
  • (added) compiler-rt/test/nsan/lit.cfg.py (+45)
  • (added) compiler-rt/test/nsan/lit.site.cfg.py.in (+11)
  • (added) compiler-rt/test/nsan/memcpy.cc (+83)
  • (added) compiler-rt/test/nsan/memset_nonzero.cc (+23)
  • (added) compiler-rt/test/nsan/memset_zero.cc (+24)
  • (added) compiler-rt/test/nsan/rump_royal_pain.cc (+37)
  • (added) compiler-rt/test/nsan/simd.cc (+25)
  • (added) compiler-rt/test/nsan/stable_sort.cc (+52)
  • (added) compiler-rt/test/nsan/stack.cc (+20)
  • (added) compiler-rt/test/nsan/sums.cc (+82)
  • (added) compiler-rt/test/nsan/swap.cc (+46)
  • (added) compiler-rt/test/nsan/type_punning.cc (+26)
  • (added) compiler-rt/test/nsan/uninstrumented_write.cc (+22)
  • (added) compiler-rt/test/nsan/vector_push_back.cc (+17)
  • (added) compiler-rt/test/nsan/verificarlo_case4.cc (+29)
  • (modified) llvm/include/llvm/Bitcode/LLVMBitCodes.h (+1)
  • (modified) llvm/include/llvm/IR/Attributes.td (+4)
  • (added) llvm/include/llvm/Transforms/Instrumentation/NumericalStabilitySanitizer.h (+40)
  • (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+2)
  • (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+2)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
  • (modified) llvm/lib/Passes/PassRegistry.def (+2)
  • (modified) llvm/lib/Transforms/Instrumentation/CMakeLists.txt (+1)
  • (added) llvm/lib/Transforms/Instrumentation/NumericalStabilitySanitizer.cpp (+2261)
  • (modified) llvm/lib/Transforms/Utils/CodeExtractor.cpp (+1)
  • (added) llvm/test/Instrumentation/NumericalStabilitySanitizer/basic.ll (+930)
  • (added) llvm/test/Instrumentation/NumericalStabilitySanitizer/cfg.ll (+113)
  • (added) llvm/test/Instrumentation/NumericalStabilitySanitizer/invoke.ll (+148)
  • (added) llvm/test/Instrumentation/NumericalStabilitySanitizer/memory.ll (+405)
diff --git a/clang/include/clang/Basic/Features.def b/clang/include/clang/Basic/Features.def
index eeed5f4751f2f4..c75bc2475ef74d 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -102,6 +102,7 @@ FEATURE(thread_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Thread))
 FEATURE(dataflow_sanitizer, LangOpts.Sanitize.has(SanitizerKind::DataFlow))
 FEATURE(scudo, LangOpts.Sanitize.hasOneOf(SanitizerKind::Scudo))
 FEATURE(ptrauth_intrinsics, LangOpts.PointerAuthIntrinsics)
+FEATURE(numericalstability_sanitizer, LangOpts.Sanitize.has(SanitizerKind::NumericalStability))
 FEATURE(swiftasynccc,
   PP.getTargetInfo().checkCallingConvention(CC_SwiftAsync) ==
   clang::TargetInfo::CCCR_OK)
diff --git a/clang/include/clang/Basic/Sanitizers.def b/clang/include/clang/Basic/Sanitizers.def
index c2137e3f61f645..5f6249523e1d10 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -76,6 +76,9 @@ SANITIZER("fuzzer-no-link", FuzzerNoLink)
 // ThreadSanitizer
 SANITIZER("thread", Thread)
 
+// Numerical stability sanitizer.
+SANITIZER("numerical", NumericalStability)
+
 // LeakSanitizer
 SANITIZER("leak", Leak)
 
diff --git a/clang/include/clang/Driver/SanitizerArgs.h b/clang/include/clang/Driver/SanitizerArgs.h
index 07070ec4fc0653..47ef175302679f 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -103,6 +103,9 @@ class SanitizerArgs {
   bool needsCfiDiagRt() const;
   bool needsStatsRt() const { return Stats; }
   bool needsScudoRt() const { return Sanitizers.has(SanitizerKind::Scudo); }
+  bool needsNsanRt() const {
+    return Sanitizers.has(SanitizerKind::NumericalStability);
+  }
 
   bool hasMemTag() const {
     return hasMemtagHeap() || hasMemtagStack() || hasMemtagGlobals();
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 82b30b8d815629..9ae3d3a6f82e73 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -75,6 +75,7 @@
 #include "llvm/Transforms/Instrumentation/KCFI.h"
 #include "llvm/Transforms/Instrumentation/MemProfiler.h"
 #include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
+#include "llvm/Transforms/Instrumentation/NumericalStabilitySanitizer.h"
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
 #include "llvm/Transforms/Instrumentation/RemoveTrapsPass.h"
 #include "llvm/Transforms/Instrumentation/SanitizerBinaryMetadata.h"
@@ -734,6 +735,12 @@ static void addSanitizers(const Triple &TargetTriple,
     if (LangOpts.Sanitize.has(SanitizerKind::DataFlow)) {
       MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles));
     }
+
+    if (LangOpts.Sanitize.has(SanitizerKind::NumericalStability)) {
+      MPM.addPass(NumericalStabilitySanitizerPass());
+      MPM.addPass(
+          createModuleToFunctionPassAdaptor(NumericalStabilitySanitizerPass()));
+    }
   };
   if (ClSanitizeOnOptimizerEarlyEP) {
     PB.registerOptimizerEarlyEPCallback(
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index e08a1e5f42df20..70e45e8f322a52 100644
--- a/clang/lib/CodeGen/CGDeclCXX.cpp
+++ b/clang/lib/CodeGen/CGDeclCXX.cpp
@@ -476,6 +476,10 @@ llvm::Function *CodeGenModule::CreateGlobalInitOrCleanUpFunction(
       !isInNoSanitizeList(SanitizerKind::Thread, Fn, Loc))
     Fn->addFnAttr(llvm::Attribute::SanitizeThread);
 
+  if (getLangOpts().Sanitize.has(SanitizerKind::NumericalStability) &&
+      !isInNoSanitizeList(SanitizerKind::NumericalStability, Fn, Loc))
+    Fn->addFnAttr(llvm::Attribute::SanitizeNumericalStability);
+
   if (getLangOpts().Sanitize.has(SanitizerKind::Memory) &&
       !isInNoSanitizeList(SanitizerKind::Memory, Fn, Loc))
     Fn->addFnAttr(llvm::Attribute::SanitizeMemory);
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index fad26c43da3d34..c2522708a83641 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -795,6 +795,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
       Fn->addFnAttr(llvm::Attribute::SanitizeMemTag);
     if (SanOpts.has(SanitizerKind::Thread))
       Fn->addFnAttr(llvm::Attribute::SanitizeThread);
+    if (SanOpts.has(SanitizerKind::NumericalStability))
+      Fn->addFnAttr(llvm::Attribute::SanitizeNumericalStability);
     if (SanOpts.hasOneOf(SanitizerKind::Memory | SanitizerKind::KernelMemory))
       Fn->addFnAttr(llvm::Attribute::SanitizeMemory);
   }
diff --git a/clang/lib/Driver/SanitizerArgs.cpp b/clang/lib/Driver/SanitizerArgs.cpp
index 8bfe9f02a091d1..52b5345bd3ddc4 100644
--- a/clang/lib/Driver/SanitizerArgs.cpp
+++ b/clang/lib/Driver/SanitizerArgs.cpp
@@ -41,7 +41,8 @@ static const SanitizerMask NotAllowedWithExecuteOnly =
     SanitizerKind::Function | SanitizerKind::KCFI;
 static const SanitizerMask NeedsUnwindTables =
     SanitizerKind::Address | SanitizerKind::HWAddress | SanitizerKind::Thread |
-    SanitizerKind::Memory | SanitizerKind::DataFlow;
+    SanitizerKind::Memory | SanitizerKind::DataFlow |
+    SanitizerKind::NumericalStability;
 static const SanitizerMask SupportsCoverage =
     SanitizerKind::Address | SanitizerKind::HWAddress |
     SanitizerKind::KernelAddress | SanitizerKind::KernelHWAddress |
@@ -53,7 +54,8 @@ static const SanitizerMask SupportsCoverage =
     SanitizerKind::DataFlow | SanitizerKind::Fuzzer |
     SanitizerKind::FuzzerNoLink | SanitizerKind::FloatDivideByZero |
     SanitizerKind::SafeStack | SanitizerKind::ShadowCallStack |
-    SanitizerKind::Thread | SanitizerKind::ObjCCast | SanitizerKind::KCFI;
+    SanitizerKind::Thread | SanitizerKind::ObjCCast | SanitizerKind::KCFI |
+    SanitizerKind::NumericalStability;
 static const SanitizerMask RecoverableByDefault =
     SanitizerKind::Undefined | SanitizerKind::Integer |
     SanitizerKind::ImplicitConversion | SanitizerKind::Nullability |
@@ -175,6 +177,7 @@ static void addDefaultIgnorelists(const Driver &D, SanitizerMask Kinds,
                      {"hwasan_ignorelist.txt", SanitizerKind::HWAddress},
                      {"memtag_ignorelist.txt", SanitizerKind::MemTag},
                      {"msan_ignorelist.txt", SanitizerKind::Memory},
+                     {"nsan_ignorelist.txt", SanitizerKind::NumericalStability},
                      {"tsan_ignorelist.txt", SanitizerKind::Thread},
                      {"dfsan_abilist.txt", SanitizerKind::DataFlow},
                      {"cfi_ignorelist.txt", SanitizerKind::CFI},
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 4478865313636d..f09af3f251b511 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1526,11 +1526,14 @@ collectSanitizerRuntimes(const ToolChain &TC, const ArgList &Args,
     if (SanArgs.linkCXXRuntimes())
       StaticRuntimes.push_back("msan_cxx");
   }
+  if (SanArgs.needsNsanRt())
+    StaticRuntimes.push_back("nsan");
   if (!SanArgs.needsSharedRt() && SanArgs.needsTsanRt()) {
     StaticRuntimes.push_back("tsan");
     if (SanArgs.linkCXXRuntimes())
       StaticRuntimes.push_back("tsan_cxx");
   }
+
   if (!SanArgs.needsSharedRt() && SanArgs.needsUbsanRt()) {
     if (SanArgs.requiresMinimalRuntime()) {
       StaticRuntimes.push_back("ubsan_minimal");
diff --git a/clang/lib/Driver/ToolChains/Linux.cpp b/clang/lib/Driver/ToolChains/Linux.cpp
index 6c2f23e57bce05..7ad1df71ae9fef 100644
--- a/clang/lib/Driver/ToolChains/Linux.cpp
+++ b/clang/lib/Driver/ToolChains/Linux.cpp
@@ -825,6 +825,10 @@ SanitizerMask Linux::getSupportedSanitizers() const {
   if (IsX86_64 || IsAArch64) {
     Res |= SanitizerKind::KernelHWAddress;
   }
+  if (IsX86_64) {
+    Res |= SanitizerKind::NumericalStability;
+  }
+
   // Work around "Cannot represent a difference across sections".
   if (getTriple().getArch() == llvm::Triple::ppc64)
     Res &= ~SanitizerKind::Function;
diff --git a/clang/runtime/CMakeLists.txt b/clang/runtime/CMakeLists.txt
index 65fcdc2868f031..0248655d1df931 100644
--- a/clang/runtime/CMakeLists.txt
+++ b/clang/runtime/CMakeLists.txt
@@ -122,7 +122,7 @@ if(LLVM_BUILD_EXTERNAL_COMPILER_RT AND EXISTS ${COMPILER_RT_SRC_ROOT}/)
                            COMPONENT compiler-rt)
 
   # Add top-level targets that build specific compiler-rt runtimes.
-  set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan ubsan ubsan-minimal)
+  set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan nsan profile tsan ubsan ubsan-minimal)
   foreach(runtime ${COMPILER_RT_RUNTIMES})
     get_ext_project_build_command(build_runtime_cmd ${runtime})
     add_custom_target(${runtime}
@@ -149,6 +149,7 @@ if(LLVM_BUILD_EXTERNAL_COMPILER_RT AND EXISTS ${COMPILER_RT_SRC_ROOT}/)
         check-hwasan
         check-lsan
         check-msan
+        check-nsan
         check-profile
         check-safestack
         check-sanitizer
diff --git a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
index 423171532c2028..304ebd8f1fe737 100644
--- a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+++ b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
@@ -61,6 +61,7 @@ else()
 endif()
 set(ALL_MSAN_SUPPORTED_ARCH ${X86_64} ${MIPS64} ${ARM64} ${PPC64} ${S390X}
     ${LOONGARCH64})
+set(ALL_NSAN_SUPPORTED_ARCH ${X86} ${X86_64})
 set(ALL_HWASAN_SUPPORTED_ARCH ${X86_64} ${ARM64} ${RISCV64})
 set(ALL_MEMPROF_SUPPORTED_ARCH ${X86_64})
 set(ALL_PROFILE_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM32} ${ARM64} ${PPC32} ${PPC64}
diff --git a/compiler-rt/cmake/config-ix.cmake b/compiler-rt/cmake/config-ix.cmake
index 4f47142850a55e..2430ed55dc5829 100644
--- a/compiler-rt/cmake/config-ix.cmake
+++ b/compiler-rt/cmake/config-ix.cmake
@@ -621,6 +621,9 @@ if(APPLE)
   list_intersect(MSAN_SUPPORTED_ARCH
     ALL_MSAN_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
+  list_intersect(NSAN_SUPPORTED_ARCH
+    ALL_NSAN_SUPPORTED_ARCH
+    SANITIZER_COMMON_SUPPORTED_ARCH)
   list_intersect(HWASAN_SUPPORTED_ARCH
     ALL_HWASAN_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
@@ -686,6 +689,7 @@ else()
   filter_available_targets(SHADOWCALLSTACK_SUPPORTED_ARCH
     ${ALL_SHADOWCALLSTACK_SUPPORTED_ARCH})
   filter_available_targets(GWP_ASAN_SUPPORTED_ARCH ${ALL_GWP_ASAN_SUPPORTED_ARCH})
+  filter_available_targets(NSAN_SUPPORTED_ARCH ${ALL_NSAN_SUPPORTED_ARCH})
   filter_available_targets(ORC_SUPPORTED_ARCH ${ALL_ORC_SUPPORTED_ARCH})
 endif()
 
@@ -720,7 +724,7 @@ if(COMPILER_RT_SUPPORTED_ARCH)
 endif()
 message(STATUS "Compiler-RT supported architectures: ${COMPILER_RT_SUPPORTED_ARCH}")
 
-set(ALL_SANITIZERS asan;dfsan;msan;hwasan;tsan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;asan_abi)
+set(ALL_SANITIZERS asan;dfsan;msan;hwasan;tsan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;nsan;asan_abi)
 set(COMPILER_RT_SANITIZERS_TO_BUILD all CACHE STRING
     "sanitizers to build if supported on the target (all;${ALL_SANITIZERS})")
 list_replace(COMPILER_RT_SANITIZERS_TO_BUILD all "${ALL_SANITIZERS}")
@@ -898,4 +902,11 @@ if (GWP_ASAN_SUPPORTED_ARCH AND
 else()
   set(COMPILER_RT_HAS_GWP_ASAN FALSE)
 endif()
+
+if (COMPILER_RT_HAS_SANITIZER_COMMON AND NSAN_SUPPORTED_ARCH AND
+    OS_NAME MATCHES "Linux")
+  set(COMPILER_RT_HAS_NSAN TRUE)
+else()
+  set(COMPILER_RT_HAS_NSAN FALSE)
+endif()
 pythonize_bool(COMPILER_RT_HAS_GWP_ASAN)
diff --git a/compiler-rt/include/sanitizer/nsan_interface.h b/compiler-rt/include/sanitizer/nsan_interface.h
new file mode 100644
index 00000000000000..057ca0473bb3c6
--- /dev/null
+++ b/compiler-rt/include/sanitizer/nsan_interface.h
@@ -0,0 +1,75 @@
+//===-- sanitizer/nsan_interface.h ------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Public interface for nsan.
+//
+//===----------------------------------------------------------------------===//
+#ifndef SANITIZER_NSAN_INTERFACE_H
+#define SANITIZER_NSAN_INTERFACE_H
+
+#include <sanitizer/common_interface_defs.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/// User-provided default option settings.
+///
+/// You can provide your own implementation of this function to return a string
+/// containing NSan runtime options (for example,
+/// <c>verbosity=1:halt_on_error=0</c>).
+///
+/// \returns Default options string.
+const char *__nsan_default_options(void);
+
+// Dumps nsan shadow data for a block of `size_bytes` bytes of application
+// memory at location `addr`.
+//
+// Each line contains application address, shadow types, then values.
+// Unknown types are shown as `__`, while known values are shown as
+// `f`, `d`, `l` for float, double, and long double respectively. Position is
+// shown as a single hex digit. The shadow value itself appears on the line that
+// contains the first byte of the value.
+// FIXME: Show both shadow and application value.
+//
+// Example: `__nsan_dump_shadow_mem(addr, 32, 8, 0)` might print:
+//
+//  0x0add7359:  __ f0 f1 f2 f3 __ __ __   (42.000)
+//  0x0add7361:  __ d1 d2 d3 d4 d5 d6 d7
+//  0x0add7369:  d8 f0 f1 f2 f3 __ __ f2   (-1.000) (12.5)
+//  0x0add7371:  f3 __ __ __ __ __ __ __
+//
+// This means that there is:
+//   - a shadow double for the float at address 0x0add7360, with value 42;
+//   - a shadow float128 for the double at address 0x0add7362, with value -1;
+//   - a shadow double for the float at address 0x0add736a, with value 12.5;
+// There was also a shadow double for the float at address 0x0add736e, but bytes
+// f0 and f1 were overwritten by one or several stores, so that the shadow value
+// is no longer valid.
+// The argument `reserved` can be any value. Its true value is provided by the
+// instrumentation.
+void __nsan_dump_shadow_mem(const char *addr, size_t size_bytes,
+                            size_t bytes_per_line, size_t reserved);
+
+// Explicitly dumps a value.
+// FIXME: vector versions ?
+void __nsan_dump_float(float value);
+void __nsan_dump_double(double value);
+void __nsan_dump_longdouble(long double value);
+
+// Explicitly checks a value.
+// FIXME: vector versions ?
+void __nsan_check_float(float value);
+void __nsan_check_double(double value);
+void __nsan_check_longdouble(long double value);
+
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
+#endif // SANITIZER_NSAN_INTERFACE_H
diff --git a/compiler-rt/lib/nsan/CMakeLists.txt b/compiler-rt/lib/nsan/CMakeLists.txt
new file mode 100644
index 00000000000000..00b16473bff0ef
--- /dev/null
+++ b/compiler-rt/lib/nsan/CMakeLists.txt
@@ -0,0 +1,61 @@
+add_compiler_rt_component(nsan)
+
+include_directories(..)
+
+set(NSAN_SOURCES
+  nsan.cc
+  nsan_flags.cc
+  nsan_interceptors.cc
+  nsan_stats.cc
+  nsan_suppressions.cc
+)
+
+set(NSAN_HEADERS
+  nsan.h
+  nsan_flags.h
+  nsan_flags.inc
+  nsan_platform.h
+  nsan_stats.h
+  nsan_suppressions.h
+)
+
+append_list_if(COMPILER_RT_HAS_FPIC_FLAG -fPIC NSAN_CFLAGS)
+
+set(NSAN_DYNAMIC_LINK_FLAGS ${SANITIZER_COMMON_LINK_FLAGS})
+
+set(NSAN_CFLAGS ${SANITIZER_COMMON_CFLAGS})
+#-fno-rtti -fno-exceptions
+#    -nostdinc++ -pthread -fno-omit-frame-pointer)
+
+# Remove -stdlib= which is unused when passing -nostdinc++.
+# string(REGEX REPLACE "-stdlib=[a-zA-Z+]*" "" CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS})
+
+if (COMPILER_RT_HAS_NSAN)
+  foreach(arch ${NSAN_SUPPORTED_ARCH})
+    add_compiler_rt_runtime(
+      clang_rt.nsan
+      STATIC
+      ARCHS ${arch}
+      SOURCES ${NSAN_SOURCES}
+              $<TARGET_OBJECTS:RTInterception.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommon.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonLibc.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonCoverage.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonSymbolizer.${arch}>
+              $<TARGET_OBJECTS:RTUbsan.${arch}>
+      ADDITIONAL_HEADERS ${NSAN_HEADERS}
+      CFLAGS ${NSAN_CFLAGS}
+      PARENT_TARGET nsan
+    )
+  endforeach()
+
+  add_compiler_rt_object_libraries(RTNsan
+      ARCHS ${NSAN_SUPPORTED_ARCH}
+      SOURCES ${NSAN_SOURCES}
+      ADDITIONAL_HEADERS ${NSAN_HEADERS}
+      CFLAGS ${NSAN_CFLAGS})
+endif()
+
+if(COMPILER_RT_INCLUDE_TESTS)
+  add_subdirectory(tests)
+endif()
diff --git a/compiler-rt/lib/nsan/nsan.cc b/compiler-rt/lib/nsan/nsan.cc
new file mode 100644
index 00000000000000..29351ca111a3ff
--- /dev/null
+++ b/compiler-rt/lib/nsan/nsan.cc
@@ -0,0 +1,828 @@
+//===-- nsan.cc -----------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// NumericalStabilitySanitizer runtime.
+//
+// This implements:
+//  - The public nsan interface (include/sanitizer/nsan_interface.h).
+//  - The private nsan interface (./nsan.h).
+//  - The internal instrumentation interface. These are function emitted by the
+//    instrumentation pass:
+//        * __nsan_get_shadow_ptr_for_{float,double,longdouble}_load
+//          These return the shadow memory pointer for loading the shadow value,
+//          after checking that the types are consistent. If the types are not
+//          consistent, returns nullptr.
+//        * __nsan_get_shadow_ptr_for_{float,double,longdouble}_store
+//          Sets the shadow types appropriately and returns the shadow memory
+//          pointer for storing the shadow value.
+//        * __nsan_internal_check_{float,double,long double}_{f,d,l} checks the
+//          accuracy of a value against its shadow and emits a warning depending
+//          on the runtime configuration. The middle part indicates the type of
+//          the application value, the suffix (f,d,l) indicates the type of the
+//          shadow, and depends on the instrumentation configuration.
+//        * __nsan_fcmp_fail_* emits a warning for an fcmp instruction whose
+//          corresponding shadow fcmp result differs.
+//
+//===----------------------------------------------------------------------===//
+
+#include <assert.h>
+#include <math.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "sanitizer_common/sanitizer_atomic.h"
+#include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_libc.h"
+#include "sanitizer_common/sanitizer_report_decorator.h"
+#include "sanitizer_common/sanitizer_stacktrace.h"
+#include "sanitizer_common/sanitizer_symbolizer.h"
+
+#include "nsan/nsan.h"
+#include "nsan/nsan_flags.h"
+#include "nsan/nsan_stats.h"
+#include "nsan/nsan_suppressions.h"
+
+using namespace __sanitizer;
+using namespace __nsan;
+
+static constexpr const int kMaxVectorWidth = 8;
+
+// When copying application memory, we also copy its shadow and shadow type.
+// FIXME: We could provide fixed-size versions that would nicely
+// vectorize for known sizes.
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+__nsan_copy_values(const char *daddr, const char *saddr, uptr size) {
+  internal_memmove((void *)getShadowTypeAddrFor(daddr),
+                   getShadowTypeAddrFor(saddr), size);
+  internal_memmove((void *)getShadowAddrFor(daddr), getShadowAddrFor(saddr),
+                   size * kShadowScale);
+}
+
+// FIXME: We could provide fixed-size versions that would nicely
+// vectorize for known sizes.
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+__nsan_set_value_unknown(const char *addr, uptr size) {
+  internal_memset((void *)getShadowTypeAddrFor(addr), 0, size);
+}
+
+namespace __nsan {
+
+const char *FTInfo<float>::kCppTypeName = "float";
+const char *FTInfo<double>::kCppTypeName = "double";
+const char *FTInfo<long double>::kCppTypeName = "long double";
+const char *FTInfo<__float128>::kCppTypeName = "__float128";
+
+const char FTInfo<float>::kTypePattern[sizeof(float)];
+const char FTInfo<double>::kTypePattern[sizeof(double)];
+const char FTInfo<long double>::kTypePattern[sizeof(long double)];
+
+// Helper for __nsan_dump_shadow_mem: Reads the value at address `Ptr`,
+// identified by its type id.
+template <ty...
[truncated]

github-actions bot commented Mar 20, 2024

✅ With the latest revision this PR passed the Python code formatter.

vitalybuka (Collaborator)

Can we split the patch into smaller pieces,
e.g. clang, transform, runtime?

This PR already has comments on the transform, so we can keep it for the transform.

vitalybuka (Collaborator) commented Mar 26, 2024

The clang changes LGTM, but they are missing tests.

alexander-shaposhnikov (Collaborator, Author)

@vitalybuka, @arsenm: thanks a lot. Sure, I can split the patch.
P.S. I will address the comments soon.

github-actions bot commented May 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff d4f5cf267936a082196b0c22fe45c730b24b9fe0 bc248edaa5a2669eadec85178208a99c93a8dec1 -- llvm/include/llvm/Transforms/Instrumentation/NumericalStabilitySanitizer.h llvm/lib/Transforms/Instrumentation/NumericalStabilitySanitizer.cpp llvm/include/llvm/Bitcode/LLVMBitCodes.h llvm/lib/Bitcode/Reader/BitcodeReader.cpp llvm/lib/Bitcode/Writer/BitcodeWriter.cpp llvm/lib/Passes/PassBuilder.cpp llvm/lib/Transforms/Utils/CodeExtractor.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/Instrumentation/NumericalStabilitySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/NumericalStabilitySanitizer.cpp
index 8d5be4cb3c..9cf93ecfef 100644
--- a/llvm/lib/Transforms/Instrumentation/NumericalStabilitySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/NumericalStabilitySanitizer.cpp
@@ -1245,8 +1245,7 @@ Value *NumericalStabilitySanitizer::handleExt(FPExtInst &Ext, Type *VT,
   //     fpext double  %1 to x86_fp80  ->  fpext float    s(%1) to double
 
   // See (A) above.
-  Value *Source =
-      ExtendedSourceTy ? Map.getShadow(OrigSource) : OrigSource;
+  Value *Source = ExtendedSourceTy ? Map.getShadow(OrigSource) : OrigSource;
   Type *SourceTy = ExtendedSourceTy ? ExtendedSourceTy : OrigSourceTy;
   // See (B) above.
   if (SourceTy == ExtendedVT)

alexander-shaposhnikov (Collaborator, Author)

@arsenm, @vitalybuka: I've split the clang and compiler-rt changes out of this PR
and addressed the comments (except one).

@alexander-shaposhnikov alexander-shaposhnikov force-pushed the rebased_nsan branch 2 times, most recently from 3c21e54 to ca29922 Compare May 15, 2024 05:54
jcranmer-intel (Contributor) left a comment

Not a comprehensive look yet:

// `dqq`.
static cl::opt<std::string> ClShadowMapping(
    "nsan-shadow-type-mapping", cl::init("dqq"),
    cl::desc("One shadow type id for each of `float`, `double`, `long double`. "
Contributor:

At the LLVM level, long double can map to one of four different types (double, x86_fp80, ppc_fp128, and fp128), and no information available in the IR makes clear which one is intended.

I'd recommend using the LLVM names for these types instead of the C names here.

Collaborator (Author):

The current scheme stays close to the source code: the whole tool is roughly equivalent, at least conceptually, to what one could do at the source level by switching to higher-precision types.
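
To make the mapping under discussion concrete, here is a small standalone sketch of how a three-character string such as the default `dqq` could be decoded, with one shadow type id for each of `float`, `double`, and `long double`. The helper names are hypothetical. Reading `d` as `double` and `q` as `fp128` follows the header comments in this patch, while `l` as `x86_fp80` is an assumption that only holds on x86 targets, which is the ambiguity raised in the review.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Assumed meaning of each shadow type id (hypothetical helper).
// Returns nullptr for an id this sketch does not know.
inline const char *shadowTypeName(char Id) {
  switch (Id) {
  case 'd':
    return "double";
  case 'l':
    return "x86_fp80"; // assumption: long double on x86
  case 'q':
    return "fp128";
  default:
    return nullptr;
  }
}

// Decodes a mapping such as "dqq" into one shadow type name each for
// float, double, and long double. Returns false on a malformed mapping;
// Out may be partially written in that case.
inline bool parseShadowMapping(const std::string &Mapping,
                               std::array<const char *, 3> &Out) {
  if (Mapping.size() != 3)
    return false;
  for (std::size_t I = 0; I < 3; ++I) {
    const char *Name = shadowTypeName(Mapping[I]);
    if (!Name)
      return false;
    Out[I] = Name;
  }
  return true;
}
```

Keying the option on LLVM type names instead, as suggested above, would only change what the three positions stand for and not the decoding.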

alexander-shaposhnikov (Collaborator, Author) commented May 29, 2024

@arsenm, @jcranmer-intel: thanks a lot for the review.
I've addressed most of the comments and added a few inline replies. I would like to proceed here if possible.

  case PPC128ShadowConfig::kNsanTypeId:
    return std::make_unique<PPC128ShadowConfig>();
  }
  errs() << "nsan: invalid shadow type id'" << TypeId << "'\n";
Contributor:

Existing instances of bad practice aren't a justification for adding new ones. Most of those contexts do not look like this type of use. If you want lazy error handling, I don't see why you wouldn't just use report_fatal_error.
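
The shape of that suggestion can be sketched in a standalone way: the factory returns a config for every known id and otherwise reaches a single call that never returns, instead of printing to `errs()` and carrying on. Everything below is a stand-in written for this comment. `ShadowConfig`, `F64ShadowConfig`, `F128ShadowConfig`, and `fatalError` are hypothetical, and in the real pass the last of these would be `llvm::report_fatal_error`.

```cpp
#include <cstdio>
#include <cstdlib>
#include <memory>

// Hypothetical stand-ins for the pass's shadow configuration classes.
struct ShadowConfig {
  virtual ~ShadowConfig() = default;
  virtual char typeId() const = 0;
};
struct F64ShadowConfig : ShadowConfig {
  static constexpr char kNsanTypeId = 'd';
  char typeId() const override { return kNsanTypeId; }
};
struct F128ShadowConfig : ShadowConfig {
  static constexpr char kNsanTypeId = 'q';
  char typeId() const override { return kNsanTypeId; }
};

// Stand-in for llvm::report_fatal_error: prints a message and aborts.
[[noreturn]] inline void fatalError(const char *Msg, char TypeId) {
  std::fprintf(stderr, "%s '%c'\n", Msg, TypeId);
  std::abort();
}

// Returns the config for a known id; an unknown id is a hard error, so
// callers never have to handle a null result.
inline std::unique_ptr<ShadowConfig> makeShadowConfig(char TypeId) {
  switch (TypeId) {
  case F64ShadowConfig::kNsanTypeId:
    return std::make_unique<F64ShadowConfig>();
  case F128ShadowConfig::kNsanTypeId:
    return std::make_unique<F128ShadowConfig>();
  }
  fatalError("nsan: invalid shadow type id", TypeId);
}
```

Because the failure path is marked `[[noreturn]]`, the function needs no dummy return value after the switch.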

@alexander-shaposhnikov alexander-shaposhnikov force-pushed the rebased_nsan branch 2 times, most recently from 1c76a07 to 0a788e2 Compare May 30, 2024 21:27
alexander-shaposhnikov (Collaborator, Author)

Addressed comments.

@alexander-shaposhnikov alexander-shaposhnikov changed the title Add numerical sanitizer [Instrumentation] Add numerical sanitizer May 30, 2024
@alexander-shaposhnikov alexander-shaposhnikov changed the title [Instrumentation] Add numerical sanitizer [LLVM][Instrumentation] Add numerical sanitizer May 30, 2024