@melver melver commented Oct 6, 2025

Introduce the "alloc-token" sanitizer kind, in preparation for wiring it
up. Currently this is a no-op, and any attempt to enable it will result
in failure:

clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu'

In this step we can already wire up the sanitize_alloc_token IR
attribute where the instrumentation is enabled. Subsequent changes will
complete wiring up the AllocToken pass.


This change is part of the following series:

  1. [AllocToken] Introduce sanitize_alloc_token attribute and alloc_token metadata #160131
  2. [AllocToken] Introduce AllocToken instrumentation pass #156838
  3. [Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098
  4. [Clang][CodeGen] Emit !alloc_token for new expressions #162099
  5. [Clang] Wire up -fsanitize=alloc-token #156839
  6. [AllocToken, Clang] Implement TypeHashPointerSplit mode #156840
  7. [AllocToken, Clang] Infer type hints from sizeof expressions and casts #156841
  8. [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id #156842

melver added 2 commits October 6, 2025 16:55
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1

llvmbot commented Oct 6, 2025

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Marco Elver (melver)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/162098.diff

2 Files Affected:

  • (modified) clang/include/clang/Basic/Sanitizers.def (+3)
  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+2)
diff --git a/clang/include/clang/Basic/Sanitizers.def b/clang/include/clang/Basic/Sanitizers.def
index 1d0e97cc7fb4c..da85431625026 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -195,6 +195,9 @@ SANITIZER_GROUP("bounds", Bounds, ArrayBounds | LocalBounds)
 // Scudo hardened allocator
 SANITIZER("scudo", Scudo)
 
+// AllocToken
+SANITIZER("alloc-token", AllocToken)
+
 // Magic group, containing all sanitizers. For example, "-fno-sanitize=all"
 // can be used to disable all the sanitizers.
 SANITIZER_GROUP("all", All, ~SanitizerMask())
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index b2fe9171372d8..acf8de4dee147 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -846,6 +846,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
       Fn->addFnAttr(llvm::Attribute::SanitizeNumericalStability);
     if (SanOpts.hasOneOf(SanitizerKind::Memory | SanitizerKind::KernelMemory))
       Fn->addFnAttr(llvm::Attribute::SanitizeMemory);
+    if (SanOpts.has(SanitizerKind::AllocToken))
+      Fn->addFnAttr(llvm::Attribute::SanitizeAllocToken);
   }
   if (SanOpts.has(SanitizerKind::SafeStack))
     Fn->addFnAttr(llvm::Attribute::SafeStack);
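
For readers unfamiliar with the `.def` pattern touched by the first hunk, the sketch below shows in miniature how one `SANITIZER(...)`-style list can be expanded twice: once into an enum of ordinals and once into a table of flag spellings, with a bitmask set queried in the style of `SanOpts.has(...)`. This is a simplified stand-in and not Clang's actual `SanitizerKind` implementation: `DEMO_SANITIZERS`, `DemoKind`, `demoName`, and `DemoSet` are hypothetical names, the list is inlined as a macro instead of being included from a file, and the entries and their order are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, inlined stand-in for a .def file: one X(name, id) per entry.
#define DEMO_SANITIZERS(X)       \
  X("scudo", Scudo)              \
  X("alloc-token", AllocToken)   \
  X("safe-stack", SafeStack)

// First expansion: each entry becomes an enumerator, numbered in list order.
enum class DemoKind : unsigned {
#define X(NAME, ID) ID,
  DEMO_SANITIZERS(X)
#undef X
  Count
};

// Second expansion: the same list yields the flag spelling for each ordinal.
inline std::string demoName(DemoKind K) {
  static const char *const Names[] = {
#define X(NAME, ID) NAME,
      DEMO_SANITIZERS(X)
#undef X
  };
  assert(K != DemoKind::Count && "Count is not a real kind");
  return Names[static_cast<unsigned>(K)];
}

// A set of enabled kinds stored as one bit per ordinal.
struct DemoSet {
  std::uint64_t Mask = 0;

  static std::uint64_t bit(DemoKind K) {
    assert(K != DemoKind::Count && "Count is not a real kind");
    return std::uint64_t{1} << static_cast<unsigned>(K);
  }
  void enable(DemoKind K) { Mask |= bit(K); }
  bool has(DemoKind K) const { return (Mask & bit(K)) != 0; }
};
```

In this toy version, inserting an entry in the middle of the list renumbers every entry after it, which is the kind of implicit layout change that matters later in this thread.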

melver added a commit to melver/llvm-project that referenced this pull request Oct 7, 2025
Introduce the "alloc-token" sanitizer kind, in preparation of wiring it
up. Currently this is a no-op, and any attempt to enable it will result
in failure:

  clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu'

In this step we can already wire up the `sanitize_alloc_token` IR
attribute where the instrumentation is enabled. Subsequent changes will
complete wiring up the AllocToken pass.

Pull Request: llvm#162098
melver added 4 commits October 7, 2025 11:53
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
melver added a commit that referenced this pull request Oct 7, 2025
… metadata (#160131)

In preparation for adding the "AllocToken" pass, add the prerequisite
`sanitize_alloc_token` function attribute and `alloc_token` metadata.

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
melver added 2 commits October 7, 2025 12:56
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…alloc_token metadata (#160131)

melver added a commit that referenced this pull request Oct 7, 2025
Introduce `AllocToken`, an instrumentation pass designed to provide
tokens to memory allocators enabling various heap organization
strategies, such as heap partitioning.

Initially, the pass instruments functions marked with a new attribute
`sanitize_alloc_token` by rewriting allocation calls to include a token
ID, appended as a function argument with the default ABI.

The design aims to provide a flexible framework for implementing
different token generation schemes. It currently supports the following
token modes:

- TypeHash (default): token IDs based on a hash of the allocated type
- Random: statically-assigned pseudo-random token IDs
- Increment: incrementing token IDs per TU

For the `TypeHash` mode introduce support for `!alloc_token` metadata:
the metadata can be attached to allocation calls to provide richer
semantic information to be consumed by the AllocToken pass. Optimization
remarks can be enabled to show where no metadata was available.
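
The three token modes listed above can be pictured with a small sketch. Everything in it is an assumption made for illustration: FNV-1a stands in for whatever hash the pass really uses, the type is identified by a plain name string, the IDs are reduced modulo a hypothetical `MaxTokens`, and `typeHashToken`, `IncrementTokens`, and `RandomTokens` are invented names that do not appear in the pass.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical helper: 64-bit FNV-1a over a type name (a stand-in hash).
inline std::uint64_t fnv1a(const std::string &S) {
  std::uint64_t H = 0xcbf29ce484222325ULL;  // FNV offset basis
  for (unsigned char C : S) {
    H ^= C;
    H *= 0x100000001b3ULL;  // FNV prime
  }
  return H;
}

// TypeHash-style: the same type name always maps to the same token ID.
inline std::uint64_t typeHashToken(const std::string &TypeName,
                                   std::uint64_t MaxTokens) {
  assert(MaxTokens > 0);
  return fnv1a(TypeName) % MaxTokens;
}

// Increment-style: each allocation site in a TU takes the next ID.
// Wrapping at MaxTokens is an assumption of this sketch.
struct IncrementTokens {
  std::uint64_t Next = 0;
  std::uint64_t MaxTokens;
  explicit IncrementTokens(std::uint64_t Max) : MaxTokens(Max) {
    assert(Max > 0);
  }
  std::uint64_t next() { return Next++ % MaxTokens; }
};

// Random-style: IDs drawn once per site from a seeded generator, so they
// are fixed at compile time of the TU in this model.
struct RandomTokens {
  std::mt19937_64 Gen;
  std::uint64_t MaxTokens;
  RandomTokens(std::uint64_t Seed, std::uint64_t Max)
      : Gen(Seed), MaxTokens(Max) {
    assert(Max > 0);
  }
  std::uint64_t next() { return Gen() % MaxTokens; }
};
```

The point of the contrast is that only the TypeHash-style ID depends on what is being allocated; the other two depend on where (or in which order) the allocation site is encountered.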

An alternative "fast ABI" is provided, where instead of passing the
token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the
token ID is directly encoded into the name of the called function (e.g.,
`__alloc_token_0_malloc(size)`). Where the maximum number of tokens is
small, this offers more efficient instrumentation by avoiding the overhead
of passing an additional argument at each allocation site.
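
To make the two call shapes concrete, here is a minimal allocator-side sketch. It deliberately uses `demo_`-prefixed names instead of the `__alloc_token_*` symbols quoted above, and the token type (`std::uint64_t`), the partition count, and the bookkeeping array are all assumptions of this sketch, not details confirmed by the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

constexpr std::uint64_t kDemoPartitions = 4;  // hypothetical small token space
static std::size_t DemoPartitionCount[kDemoPartitions] = {};

// Shared backend: a real allocator would pick a heap region per partition;
// this sketch only records which partition was chosen and defers to malloc.
static void *demoPartitionAlloc(std::size_t Size, std::uint64_t Token) {
  assert(Token < kDemoPartitions);
  ++DemoPartitionCount[Token];
  return std::malloc(Size);
}

// Default-ABI shape: the token ID arrives as an extra trailing argument.
void *demo_alloc_token_malloc(std::size_t Size, std::uint64_t Token) {
  return demoPartitionAlloc(Size, Token);
}

// Fast-ABI shape: one entry point per token ID, so the call site passes
// no extra argument and the ID is implied by the symbol name.
void *demo_alloc_token_0_malloc(std::size_t Size) {
  return demoPartitionAlloc(Size, 0);
}
void *demo_alloc_token_1_malloc(std::size_t Size) {
  return demoPartitionAlloc(Size, 1);
}
```

The trade-off shown here is that the fast-ABI shape needs one symbol per possible token, which is why it only makes sense when the token space is small.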

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1]

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
@melver melver changed the base branch from users/melver/spr/main.clangcodegen-introduce-the-alloctoken-sanitizerkind to main October 7, 2025 11:30
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…56838)

Created using spr 1.3.8-beta.1
@melver melver merged commit 0cee4db into main Oct 7, 2025
9 checks passed
@melver melver deleted the users/melver/spr/clangcodegen-introduce-the-alloctoken-sanitizerkind branch October 7, 2025 18:22
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…162098)

melver added a commit that referenced this pull request Oct 7, 2025
For new expressions, the allocated type is syntactically known and we
can trivially emit the !alloc_token metadata. A subsequent change will
wire up the AllocToken pass and introduce appropriate tests.

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…62099)

@thurstond

This change is causing a buildbot clang crash: https://lab.llvm.org/buildbot/#/builders/169/builds/15726

(I manually re-ran the buildbot at this change - 0cee4db - which crashed; it did not crash on the immediately preceding commit, 93f2e0a)

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang -fsyntax-only -I /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/Inputs/print-header-json -isystem /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/Inputs/print-header-json/system -fmodules -fimplicit-module-maps -fmodules-cache-path=/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/test/Preprocessor/Output/print-header-json.c.tmp /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/print-header-json.c -o /dev/null
1.	/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/Inputs/print-header-json/system/system0.h:2:2: current parser token 'include'
 #0 0x000064d4020a8a76 ___interceptor_backtrace /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:4530:13
 #1 0x000064d4097963f8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:834:13
 #2 0x000064d40978fec9 llvm::sys::RunSignalHandlers() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Signals.cpp:0:5
 #3 0x000064d409794494 llvm::sys::CleanupOnSignal(unsigned long) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
 #4 0x000064d4095de721 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
 #5 0x000064d4095dee07 CrashRecoverySignalHandler(int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:391:1
 #6 0x00007c3738e458d0 (/lib/x86_64-linux-gnu/libc.so.6+0x458d0)
 #7 0x00007c3738ea49bc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0xa49bc)
 #8 0x00007c3738e4579e raise (/lib/x86_64-linux-gnu/libc.so.6+0x4579e)
 #9 0x00007c3738e288cd abort (/lib/x86_64-linux-gnu/libc.so.6+0x288cd)
#10 0x000064d40212b75c (/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang+0x11fcc75c)
#11 0x000064d4021295fe __sanitizer::Die() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:52:5
#12 0x000064d40210a25b push_back /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common.h:543:7
#13 0x000064d40210a25b __asan::ScopedInErrorReport::~ScopedInErrorReport() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_report.cpp:193:29
#14 0x000064d40210c0ed __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_report.cpp:536:1
#15 0x000064d40210cfb6 __asan_report_load16 /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:132:1
#16 0x000064d40bcb6d1f __copy_non_overlapping_range<const unsigned long *, const unsigned long *> /home/b/sanitizer-x86_64-linux-fast/build/libcxx_install_asan_ubsan/include/c++/v1/string:2144:38
#17 0x000064d40bcb6d1f void std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__init_with_size[abi:nn220000]<unsigned long const*, unsigned long const*>(unsigned long const*, unsigned long const*, unsigned long) /home/b/sanitizer-x86_64-linux-fast/build/libcxx_install_asan_ubsan/include/c++/v1/string:2685:18
#18 0x000064d40bb77198 clang::ASTReader::ReadString(llvm::SmallVectorImpl<unsigned long> const&, unsigned int&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:10172:7
#19 0x000064d40bb9229b clang::ASTReader::ParseLanguageOptions(llvm::SmallVector<unsigned long, 64u> const&, llvm::StringRef, bool, clang::ASTReaderListener&, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6475:12
#20 0x000064d40bb83454 clang::ASTReader::ReadOptionsBlock(llvm::BitstreamCursor&, llvm::StringRef, unsigned int, bool, clang::ASTReaderListener&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:0:11
#21 0x000064d40bb994b9 clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:3249:15
#22 0x000064d40bb9e1d3 clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long, long, clang::ASTFileSignature, unsigned int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:5182:7
#23 0x000064d40bbb3678 clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, clang::serialization::ModuleFile**) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:4828:11
#24 0x000064d40b69c575 clang::CompilerInstance::findOrCompileModuleAndReadAST(llvm::StringRef, clang::SourceLocation, clang::SourceLocation, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1805:27
#25 0x000064d40b69fcf0 clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<clang::IdentifierLoc>, clang::Module::NameVisibilityKind, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1956:31
#26 0x000064d4129e38fd clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:2426:5
#27 0x000064d4129d7003 clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:2101:17
#28 0x000064d4129d8147 clang::Preprocessor::HandleDirective(clang::Token&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:0:14
#29 0x000064d41293d29d clang::Lexer::LexTokenInternal(clang::Token&, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/Lexer.cpp:4514:7
#30 0x000064d412933fec clang::Lexer::Lex(clang::Token&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/Lexer.cpp:3731:3
#31 0x000064d412a69ddb clang::Preprocessor::Lex(clang::Token&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/Preprocessor.cpp:896:3
#32 0x000064d40f16f731 clang::ParseAST(clang::Sema&, bool, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseAST.cpp:164:5
#33 0x000064d40b7c8e73 clang::FrontendAction::Execute() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1315:10
#34 0x000064d40b6913be getPtr /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Error.h:278:42
#35 0x000064d40b6913be operator bool /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Error.h:241:16
#36 0x000064d40b6913be clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1008:23
#37 0x000064d40bacf613 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:310:25
#38 0x000064d4021572d5 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/cc1_main.cpp:300:15
#39 0x000064d40214c620 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:227:12
#40 0x000064d402154760 release /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:232:9
#41 0x000064d402154760 ~IntrusiveRefCntPtr /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:196:27
#42 0x000064d402154760 operator() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:369:5
#43 0x000064d402154760 int llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::callback_fn<clang_main(int, char**, llvm::ToolContext const&)::$_0>(long, llvm::SmallVectorImpl<char const*>&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
#44 0x000064d40b3b7df5 operator() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Job.cpp:436:30
#45 0x000064d40b3b7df5 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::__1::optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const::$_0>(long) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
#46 0x000064d4095de4d6 operator() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:0:12
#47 0x000064d4095de4d6 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
#48 0x000064d40b3b4dfd clang::driver::CC1Command::Execute(llvm::ArrayRef<std::__1::optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Job.cpp:436:7
#49 0x000064d40b30eaab clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Compilation.cpp:196:15
#50 0x000064d40b30f0f7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&, bool) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Compilation.cpp:246:13
#51 0x000064d40b34613f empty /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:82:46
#52 0x000064d40b34613f clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Driver.cpp:2244:23
#53 0x000064d40214b141 clang_main(int, char**, llvm::ToolContext const&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:407:21
#54 0x000064d402177126 main /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/tools/driver/clang-driver.cpp:17:10
#55 0x00007c3738e2a578 (/lib/x86_64-linux-gnu/libc.so.6+0x2a578)
#56 0x00007c3738e2a63b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a63b)
#57 0x000064d40205f0e5 _start (/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang+0x11f000e5)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 22.0.0git
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin
Build config: +assertions, +asan, +ubsan
clang: note: diagnostic msg: 
********************

thurstond added a commit that referenced this pull request Oct 8, 2025
)

Reverts #162099

Reason: this commit depends on #162098, which I am reverting due to
build breakage (see
#162098 (comment)).
thurstond added a commit that referenced this pull request Oct 8, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 8, 2025
…ions" (#162412)

thurstond added a commit that referenced this pull request Oct 8, 2025
@thurstond

Due to the time zone difference, I've gone ahead and reverted this patch and its dependent patch (#162099)

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 8, 2025

melver commented Oct 8, 2025

> Due to the time zone difference, I've gone ahead and reverted this patch and its dependent patch (#162099)

Sigh, this is a brittle test, or rather an unfortunate side-effect of incrementally building & testing on a CI where the test outputs are not cleared. I was able to reproduce this when I checked out 93f2e0a, then checked out this change, and retested with the zorg scripts. Then, if I run:

rm -rf zorg-test/llvm_build_asan_ubsan/tools/clang/test/Preprocessor/Output/print-header-json.c.tmp/

And rerun the tests, the tests pass:

../zorg-test/llvm_build_asan_ubsan/bin/llvm-lit -v clang/test/Preprocessor/print-header-json.c
-- Testing: 1 tests, 1 workers --
PASS: Clang :: Preprocessor/print-header-json.c (1 of 1)

Testing Time: 3.57s

Total Discovered Tests: 1
  Passed: 1 (100.00%)

We could try to fix the test to clear the cache dir or fix the test scripts. I suspect fixing the test is the better option, because everyone who does incremental build + test will have this problem.

To summarize the problem: a patch that changes the binary format of PCMs (such as one adding a new sanitizer kind, because PCMs track codegen options) leaves any stale cached PCM binary-incompatible with the newly built compiler. Here, adding a new sanitizer option altered the implicit binary layout of the serialized LangOptions. The build & test system is oblivious to this. When the new compiler attempted to read the old module file, it misinterpreted the data due to the layout mismatch, resulting in a heap-buffer-overflow.
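
A toy model of this failure mode, assuming an invented record layout (a run of option values followed by a length-prefixed string, one element per character). It is not the real PCM format, and `Record`, `writeRecord`, and `readString` are hypothetical names. The reader here is bounds-checked so the sketch stays safe; the comment marks where an unchecked reader would run off the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Record = std::vector<std::uint64_t>;

// Writer: all option values first, then a length-prefixed string.
inline Record writeRecord(const std::vector<std::uint64_t> &Options,
                          const std::string &S) {
  Record R(Options);
  R.push_back(S.size());
  for (unsigned char C : S)
    R.push_back(C);
  return R;
}

// Reader: skips NumOptions values, then reads the string. Returns false
// instead of reading out of bounds when the length field does not fit.
inline bool readString(const Record &R, std::size_t NumOptions,
                       std::string &Out) {
  std::size_t Idx = NumOptions;
  if (Idx >= R.size())
    return false;
  std::uint64_t Len = R[Idx++];
  if (Len > R.size() - Idx)
    return false;  // an unchecked reader would overrun the buffer here
  Out.clear();
  for (std::uint64_t I = 0; I < Len; ++I)
    Out.push_back(static_cast<char>(R[Idx + I]));
  return true;
}
```

With a record written for two options and the string "hi", a reader that also expects two options recovers "hi". A reader that expects three options (as if one had been added since the record was written) takes the character 'h' (104) as the string length, which is far larger than the one element that remains.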

TL;DR: Clang's PCM binary format doesn't encode a version, and attempting to load version-incompatible PCMs from previous test invocations after an implicit change results in a heap buffer overflow and assorted failures.

Labels: clang:codegen, clang:frontend, clang