[X86] Place data in large sections for large code model #70265

aeubanks · 2023-10-25T22:23:52Z

This allows better interoperability mixing small/medium/large code model
code since large code model data can be put into separate large sections.

It also respects the large data threshold under the large code model.
GCC does this as well: https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html.

See https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU.

llvmbot · 2023-10-25T22:25:06Z

@llvm/pr-subscribers-backend-x86

Author: Arthur Eubanks (aeubanks)

Changes

This allows better interoperability mixing small/medium/large code model
code since small code model data won't be mixed with large code model
data.

Full diff: https://github.com/llvm/llvm-project/pull/70265.diff

2 Files Affected:

(modified) llvm/lib/Target/TargetMachine.cpp (+13-6)
(modified) llvm/test/CodeGen/X86/code-model-elf-sections.ll (+5-2)

diff --git a/llvm/lib/Target/TargetMachine.cpp b/llvm/lib/Target/TargetMachine.cpp
index 45fb612cb91da19..e11355de2eed3b5 100644
--- a/llvm/lib/Target/TargetMachine.cpp
+++ b/llvm/lib/Target/TargetMachine.cpp
@@ -42,13 +42,20 @@ TargetMachine::~TargetMachine() = default;
 bool TargetMachine::isLargeData(const GlobalVariable *GV) const {
   if (getTargetTriple().getArch() != Triple::x86_64 || GV->isThreadLocal())
     return false;
-  // Large data under the large code model still needs to be thought about, so
-  // restrict this to medium.
-  if (getCodeModel() != CodeModel::Medium)
+
+  switch (getCodeModel()) {
+  case CodeModel::Large: {
+    return true;
+  }
+  case CodeModel::Medium: {
+    const DataLayout &DL = GV->getParent()->getDataLayout();
+    uint64_t Size = DL.getTypeSizeInBits(GV->getValueType()) / 8;
+    return Size == 0 || Size > LargeDataThreshold;
+  }
+  default: {
     return false;
-  const DataLayout &DL = GV->getParent()->getDataLayout();
-  uint64_t Size = DL.getTypeSizeInBits(GV->getValueType()) / 8;
-  return Size == 0 || Size > LargeDataThreshold;
+  }
+  }
 }
 
 bool TargetMachine::isPositionIndependent() const {
diff --git a/llvm/test/CodeGen/X86/code-model-elf-sections.ll b/llvm/test/CodeGen/X86/code-model-elf-sections.ll
index fe659fa9a46e727..5d02fed05814039 100644
--- a/llvm/test/CodeGen/X86/code-model-elf-sections.ll
+++ b/llvm/test/CodeGen/X86/code-model-elf-sections.ll
@@ -7,14 +7,17 @@
 ; RUN: llc < %s -relocation-model=pic -filetype=obj -code-model=medium -large-data-threshold=80 -o %t
 ; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=SMALL
 ; RUN: llc < %s -relocation-model=pic -filetype=obj -code-model=large -o %t
-; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=SMALL
+; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=LARGE
+; Check that the large code model ignores -large-data-threshold.
+; RUN: llc < %s -relocation-model=pic -filetype=obj -code-model=large -large-data-threshold=800 -o %t
+; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=LARGE
 
 ; RUN: llc < %s -relocation-model=pic -filetype=obj -code-model=small -data-sections -o %t
 ; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=SMALL-DS
 ; RUN: llc < %s -relocation-model=pic -filetype=obj -code-model=medium -data-sections -o %t
 ; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=LARGE-DS
 ; RUN: llc < %s -relocation-model=pic -filetype=obj -code-model=large -data-sections -o %t
-; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=SMALL-DS
+; RUN: llvm-readelf -S %t | FileCheck %s --check-prefix=LARGE-DS
 
 ; SMALL: .data {{.*}} WA {{.*}}
 ; SMALL: foo {{.*}} WA {{.*}}
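
For readers who want to poke at the logic outside of LLVM, the first revision of `isLargeData` shown in the diff above can be modeled roughly as follows. This is only a sketch: `CodeModel`, `GlobalInfo`, and `isLargeDataFirstRevision` are hypothetical stand-ins introduced here, not LLVM types or functions, and the size is passed in directly instead of being computed from a `DataLayout`.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::CodeModel (illustration only).
enum class CodeModel { Small, Kernel, Medium, Large };

// Hypothetical summary of the GlobalVariable properties the check reads.
struct GlobalInfo {
  bool IsX86_64;        // models getTargetTriple().getArch() == Triple::x86_64
  bool IsThreadLocal;   // models GV->isThreadLocal()
  uint64_t SizeInBytes; // models the computed size; 0 is treated as large
};

// Models the first revision in the diff: under the large code model every
// eligible global is large, and the threshold only matters for medium.
bool isLargeDataFirstRevision(const GlobalInfo &GV, CodeModel CM,
                              uint64_t LargeDataThreshold) {
  if (!GV.IsX86_64 || GV.IsThreadLocal)
    return false;
  switch (CM) {
  case CodeModel::Large:
    return true;
  case CodeModel::Medium:
    return GV.SizeInBytes == 0 || GV.SizeInBytes > LargeDataThreshold;
  default:
    return false;
  }
}
```

In this model a 4-byte global is large under `CodeModel::Large` no matter what threshold is passed, which is what the added `-large-data-threshold=800` RUN line checks.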

MaskRay · 2023-10-25T22:36:30Z

I have recently changed GCC (https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html) so that
-mcmodel=large also respects -mlarge-data-threshold=

MaskRay · 2023-10-25T22:37:04Z

Consider adding a link to https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU

> This allows better interoperability mixing small/medium/large code model code since small code model data won't be mixed with large code model data.

aeubanks · 2023-11-10T22:11:55Z

I'm still of the opinion that large code model data should never contribute toward relocation pressure, but I'll follow gcc here.

aeubanks · 2023-11-17T22:21:22Z

ping

rnk · 2023-11-17T23:14:52Z

llvm/lib/Target/TargetMachine.cpp

-  // restrict this to medium.
-  if (getCodeModel() != CodeModel::Medium)
+
+  if (getCodeModel() != CodeModel::Medium && getCodeModel() != CodeModel::Large)


What controls the default for the large data size threshold, under the large code model? I think we discussed making it zero, so effectively all globals will be large data under the large code model, which sounds like the right behavior. I don't know which way GCC went on that.

I think we want to keep the .bss/.data/.rodata prefix checks, even under the large code model. I expect folks to mix large and small code, and we don't want to accidentally make small sections large.

aeubanks · 2023-11-17

We're relying on the frontend to set the large data threshold in TargetMachine; otherwise it's just 0.

GCC currently defaults to 65536 as the large data threshold for the large code model.
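
To make the effect of the default concrete, here is a minimal sketch of the threshold-respecting check. It assumes the condition quoted in the review snippet is followed by the same size comparison as the original code. `CodeModel` and `isLargeDataRevised` are hypothetical names for illustration, and the x86-64 and thread-local checks are left out for brevity.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::CodeModel (illustration only).
enum class CodeModel { Small, Kernel, Medium, Large };

// Models a check where medium and large code models both consult the
// threshold; other code models never produce large data.
bool isLargeDataRevised(CodeModel CM, uint64_t SizeInBytes,
                        uint64_t LargeDataThreshold) {
  if (CM != CodeModel::Medium && CM != CodeModel::Large)
    return false;
  // A size of 0 is treated as large, as in the original code.
  return SizeInBytes == 0 || SizeInBytes > LargeDataThreshold;
}
```

With a threshold of 0, every global with a nonzero size compares greater than the threshold, so all globals are classified as large under the medium and large code models. With a threshold of 65536, a 4-byte global stays in a regular section while a 100000-byte one is classified as large.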

MaskRay · 2023-11-17T23:33:08Z

> I'm still of the opinion that large code model data should never contribute toward relocation pressure, but I'll follow gcc here.

The decision is deliberate. Some users may expect .data.* section names and build features on top of them. These users (say, JIT programs) may not care about the small/large code model mixing story.
-mlarge-data-threshold= gives users a choice.

aeubanks requested review from rnk and MaskRay October 25, 2023 22:24

llvmbot added the backend:X86 label Oct 25, 2023

[X86] Treat all data under large code model as large

63d4532

This allows better interoperability mixing small/medium/large code model code since small code model data won't be mixed with large code model data.

aeubanks force-pushed the largedata branch from e9afa7d to 63d4532 Compare November 10, 2023 22:09

aeubanks changed the title ~~[X86] Treat all data under large code model as large~~ [X86] Respect large data threshold under large code model Nov 10, 2023

aeubanks changed the title ~~[X86] Respect large data threshold under large code model~~ [X86] Place data in large sections for large code model Nov 15, 2023

aeubanks requested a review from jyknight November 15, 2023 00:13

rnk approved these changes Nov 17, 2023

MaskRay approved these changes Nov 17, 2023

aeubanks merged commit 635756e into llvm:main Nov 17, 2023
3 checks passed
