Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ASan][X86] Mark asan_globals section large #74514

Merged
merged 2 commits into from
Dec 7, 2023
Merged

Conversation

aeubanks
Copy link
Contributor

@aeubanks aeubanks commented Dec 5, 2023

We'd like to make the asan_globals section large to make it not
contribute to relocation pressure since there are no direct PC32
references to it.

Following #74498, we can do that by marking the code model for the
global explicitly large.

Without this change, asan_globals gets placed between .data and .bss.
With this change, it gets placed after .bss.

We'd like to make the asan_globals section large to make it not
contribute to relocation pressure since there are no direct PC32
references to it.

Following llvm#74498, we can do that by marking the code model for the
global explicitly large.

Without this change, asan_globals gets placed between .data and .bss.
With this change, it gets placed after .bss.
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 5, 2023

@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-llvm-transforms

Author: Arthur Eubanks (aeubanks)

Changes

We'd like to make the asan_globals section large to make it not
contribute to relocation pressure since there are no direct PC32
references to it.

Following #74498, we can do that by marking the code model for the
global explicitly large.

Without this change, asan_globals gets placed between .data and .bss.
With this change, it gets placed after .bss.


Full diff: https://github.com/llvm/llvm-project/pull/74514.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp (+5)
  • (added) llvm/test/Instrumentation/AddressSanitizer/global_metadata_code_model.ll (+10)
  • (modified) llvm/test/Instrumentation/AddressSanitizer/global_with_comdat.ll (+2-2)
diff --git a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
index da157c966bfcb..a86198591808a 100644
--- a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -2144,6 +2144,11 @@ ModuleAddressSanitizer::CreateMetadataGlobal(Module &M, Constant *Initializer,
       M, Initializer->getType(), false, Linkage, Initializer,
       Twine("__asan_global_") + GlobalValue::dropLLVMManglingEscape(OriginalName));
   Metadata->setSection(getGlobalMetadataSection());
+  // Move metadata section out of the way for x86-64 ELF binaries to mitigate
+  // relocation pressure.
+  if (TargetTriple.getArch() == Triple::x86_64 &&
+      TargetTriple.getObjectFormat() == Triple::ELF)
+    Metadata->setCodeModel(CodeModel::Large);
   return Metadata;
 }
 
diff --git a/llvm/test/Instrumentation/AddressSanitizer/global_metadata_code_model.ll b/llvm/test/Instrumentation/AddressSanitizer/global_metadata_code_model.ll
new file mode 100644
index 0000000000000..c1e7694d4cd53
--- /dev/null
+++ b/llvm/test/Instrumentation/AddressSanitizer/global_metadata_code_model.ll
@@ -0,0 +1,10 @@
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -passes=asan -S | FileCheck %s --check-prefix=LARGE
+; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -passes=asan -S | FileCheck %s --check-prefix=NORMAL
+; RUN: opt < %s -mtriple=x86_64-pc-windows -passes=asan -S | FileCheck %s --check-prefix=NORMAL
+
+; check that asan globals metadata are emitted to a large section for x86-64 ELF
+
+; LARGE: @__asan_global_global = {{.*}}global {{.*}}, code_model "large"
+; NORMAL-NOT: code_model "large"
+
+@global = global i32 0, align 4
diff --git a/llvm/test/Instrumentation/AddressSanitizer/global_with_comdat.ll b/llvm/test/Instrumentation/AddressSanitizer/global_with_comdat.ll
index 699b8287d358a..74f8fc9997d40 100644
--- a/llvm/test/Instrumentation/AddressSanitizer/global_with_comdat.ll
+++ b/llvm/test/Instrumentation/AddressSanitizer/global_with_comdat.ll
@@ -102,8 +102,8 @@ target triple = "x86_64-unknown-linux-gnu"
 ;; Don't place the instrumented globals in a comdat when the unique module ID is empty.
 ; NOMODULEID:      @.str = internal constant { [4 x i8], [28 x i8] } { [4 x i8] c"str\00", [28 x i8] zeroinitializer }, align 32
 ; NOMODULEID:      @_ZL3buf = internal global { [4 x i8], [28 x i8] } zeroinitializer, align 32
-; NOMODULEID:      @__asan_global_.str = private global {{.*}}, section "asan_globals", !associated !0
-; NOMODULEID:      @__asan_global__ZL3buf = private global {{.*}}, section "asan_globals", !associated !1
+; NOMODULEID:      @__asan_global_.str = private global {{.*}}, section "asan_globals"{{.*}}, !associated !0
+; NOMODULEID:      @__asan_global__ZL3buf = private global {{.*}}, section "asan_globals"{{.*}}, !associated !1
 ; NOMODULEID:      @llvm.compiler.used = appending global [4 x ptr] [ptr @.str, ptr @_ZL3buf, ptr @__asan_global_.str, ptr @__asan_global__ZL3buf]
 
 ; NOMODULEID:      define internal void @asan.module_ctor() #[[#]] comdat {

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a cl::opt to control this, or is linker support for SHF_X86_64_LARGE sufficiently widespread that we don't need to worry? @MaskRay

@aeubanks aeubanks requested a review from MaskRay December 5, 2023 19:08
@aeubanks
Copy link
Contributor Author

aeubanks commented Dec 5, 2023

I imagine linkers that don't support SHF_X86_64_LARGE would just pass it through without doing anything about it, which is fine.

@aeubanks
Copy link
Contributor Author

aeubanks commented Dec 5, 2023

using bfd still places asan_globals between .data and .bss while passing through SHF_X86_64_LARGE, so that seems right

@@ -2144,6 +2144,11 @@ ModuleAddressSanitizer::CreateMetadataGlobal(Module &M, Constant *Initializer,
M, Initializer->getType(), false, Linkage, Initializer,
Twine("__asan_global_") + GlobalValue::dropLLVMManglingEscape(OriginalName));
Metadata->setSection(getGlobalMetadataSection());
// Move metadata section out of the way for x86-64 ELF binaries to mitigate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Place metadata in a large section for ... to mitigate ...

@MaskRay
Copy link
Member

MaskRay commented Dec 7, 2023

Do we need a cl::opt to control this, or is linker support for SHF_X86_64_LARGE sufficiently widespread that we don't need to worry? @MaskRay

binutils has supported for SHF_X86_64_LARGE since 2005 and there appears to be mostly no change since then. We do not need to worry about compatibility.

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@aeubanks aeubanks merged commit 4de7d4e into llvm:main Dec 7, 2023
3 of 4 checks passed
@aeubanks aeubanks deleted the largeasan branch December 7, 2023 21:48
aeubanks added a commit to aeubanks/llvm-project that referenced this pull request Dec 7, 2023
We'd like to make various instrprof globals large to make them not
contribute to relocation pressure since there are no direct accesses
to them in the module.

Similar to what was done for asan_globals in llvm#74514.

This affects the __llvm_prf_vals, __llvm_prf_vnds, and __llvm_prf_names
sections.
aeubanks added a commit that referenced this pull request Dec 8, 2023
We'd like to make various instrprof globals large to make them not
contribute to relocation pressure since there are no direct accesses
to them in the module.

Similar to what was done for asan_globals in #74514.

This affects the __llvm_prf_vals, __llvm_prf_vnds, and __llvm_prf_names
sections.
aeubanks added a commit that referenced this pull request Dec 8, 2023
…74778)

We'd like to make various instrprof globals large to make them not
contribute to relocation pressure since there are no direct accesses
to them in the module.

Similar to what was done for asan_globals in #74514.

This affects the __llvm_prf_vals, __llvm_prf_vnds, and __llvm_prf_names
sections.

The reland fixes platform.ll.
aeubanks added a commit to aeubanks/llvm-project that referenced this pull request Dec 14, 2023
…ge code models

In llvm#74514 and llvm#74778 we marked various instrumentation-added sections as
large. This causes an extra PT_LOAD segment if using the small code
model. Since people using the small code model presumably aren't hitting
relocation limits, disable this when using the small code model to avoid
the extra segment.

This is annoying API-wise because we need to pass TargetMachine around
to any pass that wants to do this. Module::getCodeModel(), like anything
else that reads Module metadata, is unreliable as a source of truth.
aeubanks added a commit that referenced this pull request Dec 15, 2023
…ge code models (#75542)

In #74514 and #74778 we marked various instrumentation-added sections as
large. This causes an extra PT_LOAD segment if using the small code
model. Since people using the small code model presumably aren't hitting
relocation limits, disable this when using the small code model to avoid
the extra segment.

This uses Module::getCodeModel() which isn't necessarily reliable since
it reads module metadata (which right now only the clang frontend sets),
but it would be nice to get to a point where we reliably put this sort
of information (e.g. PIC/code model/etc) in the IR. This requires
duplicating the existing tests since opt/llc currently don't set these
metadata. If we get to a point where they do set the code model metadata
based on command line arguments then we can deduplicate these tests.
aeubanks added a commit to aeubanks/llvm-project that referenced this pull request Jan 16, 2024
… section

If multiple globals are placed in an explicit section, there's a chance
that the large data threshold will cause the different globals to be
inconsistent in whether they're large or small. Mixing sections with
mismatched large section flags can cause undesirable issues like
increased relocation pressure because there may be 32-bit references to
the section in some TUs, but the section is considered large since input
section flags are unioned and other TUs added the large section flag.

An explicit code model on the global still overrides the decision. We
can do this for globals without any references to them, like what we did
with asan_globals in llvm#74514. If we have some precompiled small code
model files where asan_globals is not considered large mixed with
medium/large code model files, that's ok because the section is
considered large and placed farther. However, overriding the code model
for globals in some TUs but not others and having references to them
from code will still result in the above undesired behavior.

This mitigates a whole class of mismatched large section flag issues
like what llvm#77986 was trying to fix.

This ends up not adding the SHF_X86_64_LARGE section flag on explicit
sections in the medium/large code model. This is ok for the large code
model since all references from large text must use 64-bit relocations
anyway.
aeubanks added a commit that referenced this pull request Jan 17, 2024
… section (#78348)

If multiple globals are placed in an explicit section, there's a chance
that the large data threshold will cause the different globals to be
inconsistent in whether they're large or small. Mixing sections with
mismatched large section flags can cause undesirable issues like
increased relocation pressure because there may be 32-bit references to
the section in some TUs, but the section is considered large since input
section flags are unioned and other TUs added the large section flag.

An explicit code model on the global still overrides the decision. We
can do this for globals without any references to them, like what we did
with asan_globals in #74514. If we have some precompiled small code
model files where asan_globals is not considered large mixed with
medium/large code model files, that's ok because the section is
considered large and placed farther. However, overriding the code model
for globals in some TUs but not others and having references to them
from code will still result in the above undesired behavior.

This mitigates a whole class of mismatched large section flag issues
like what #77986 was trying to fix.

This ends up not adding the SHF_X86_64_LARGE section flag on explicit
sections in the medium/large code model. This is ok for the large code
model since all references from large text must use 64-bit relocations
anyway.
ampandey-1995 pushed a commit to ampandey-1995/llvm-project that referenced this pull request Jan 19, 2024
… section (llvm#78348)

If multiple globals are placed in an explicit section, there's a chance
that the large data threshold will cause the different globals to be
inconsistent in whether they're large or small. Mixing sections with
mismatched large section flags can cause undesirable issues like
increased relocation pressure because there may be 32-bit references to
the section in some TUs, but the section is considered large since input
section flags are unioned and other TUs added the large section flag.

An explicit code model on the global still overrides the decision. We
can do this for globals without any references to them, like what we did
with asan_globals in llvm#74514. If we have some precompiled small code
model files where asan_globals is not considered large mixed with
medium/large code model files, that's ok because the section is
considered large and placed farther. However, overriding the code model
for globals in some TUs but not others and having references to them
from code will still result in the above undesired behavior.

This mitigates a whole class of mismatched large section flag issues
like what llvm#77986 was trying to fix.

This ends up not adding the SHF_X86_64_LARGE section flag on explicit
sections in the medium/large code model. This is ok for the large code
model since all references from large text must use 64-bit relocations
anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants