[MC][AArch64] Segregate constant pool caches by size. #86832

Merged
merged 1 commit into llvm:main on Mar 28, 2024

Conversation

statham-arm

If you write a 32- and a 64-bit LDR instruction that both refer to the same constant or symbol using the `=` syntax:

  ldr w0, =something
  ldr x1, =something

then the first call to `ConstantPool::addEntry` will insert the constant into its cache of existing entries, and the second will find the cached entry and reuse it. The result is a 64-bit load from a 32-bit constant pool slot, which reads nonsense into the upper half of the target register.

In this patch I've done the simplest fix: include the size of the constant pool entry as part of the key used to index the cache. So now 32- and 64-bit constant loads will never share a constant pool entry.
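
To make the keying change concrete, here is a minimal self-contained sketch of the idea, using made-up names (`ToyPool`, `.Ltmp` labels) rather than the actual LLVM classes:

```cpp
// Toy sketch of the keying change; this is not the LLVM code itself.
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <utility>

struct ToyPool {
  // Old scheme (buggy): a map keyed by value alone, so a 4-byte and an
  // 8-byte request for the same value alias one slot.
  // New scheme: key by (value, size) so they never alias.
  std::map<std::pair<int64_t, unsigned>, std::string> Cache;
  unsigned NextSlot = 0;

  std::string addEntry(int64_t Value, unsigned Size) {
    auto Key = std::make_pair(Value, Size);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second; // reuse only an entry of the same size
    std::string Label = ".Ltmp" + std::to_string(NextSlot++);
    Cache[Key] = Label;
    return Label;
  }
};

int main() {
  ToyPool P;
  std::cout << P.addEntry(1234567890, 4) << '\n'; // .Ltmp0 (32-bit slot)
  std::cout << P.addEntry(1234567890, 8) << '\n'; // .Ltmp1 (distinct 64-bit slot)
  std::cout << P.addEntry(1234567890, 4) << '\n'; // .Ltmp0 again (cache hit)
}
```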

There's scope for doing this better, in principle: you could imagine merging the two slots with appropriate overlap, so that the 32-bit load reads the least-significant word (LSW) of the 64-bit value. But that's much more complicated: you have to take endianness into account, and maybe also adjust the size of an existing entry. This is the simplest fix that restores correctness.
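
To illustrate why that's hairy, here is a hedged sketch of the sub-slot offset such a merged scheme would need to compute; nothing like this exists in the patch:

```cpp
// Hypothetical sketch only: the patch deliberately does NOT do this.
// If a 32-bit load were to share a 64-bit slot, the assembler would have
// to point it at the least-significant word, whose position depends on
// endianness.
#include <iostream>

unsigned subSlotOffset(unsigned WideSize, unsigned NarrowSize, bool BigEndian) {
  // Little-endian: low-order bytes sit at the start of the slot (offset 0).
  // Big-endian: low-order bytes sit at the end of the slot.
  return BigEndian ? WideSize - NarrowSize : 0;
}

int main() {
  std::cout << subSlotOffset(8, 4, /*BigEndian=*/false) << '\n'; // 0
  std::cout << subSlotOffset(8, 4, /*BigEndian=*/true) << '\n';  // 4
}
```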


llvmbot commented Mar 27, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Simon Tatham (statham-arm)

Full diff: https://github.com/llvm/llvm-project/pull/86832.diff

3 Files Affected:

  • (modified) llvm/include/llvm/MC/ConstantPools.h (+7-2)
  • (modified) llvm/lib/MC/ConstantPools.cpp (+5-4)
  • (added) llvm/test/MC/AArch64/constant-pool-sizes.s (+25)
diff --git a/llvm/include/llvm/MC/ConstantPools.h b/llvm/include/llvm/MC/ConstantPools.h
index 7eac75362effdc..ff21ccda07a836 100644
--- a/llvm/include/llvm/MC/ConstantPools.h
+++ b/llvm/include/llvm/MC/ConstantPools.h
@@ -43,8 +43,13 @@ struct ConstantPoolEntry {
 class ConstantPool {
   using EntryVecTy = SmallVector<ConstantPoolEntry, 4>;
   EntryVecTy Entries;
-  std::map<int64_t, const MCSymbolRefExpr *> CachedConstantEntries;
-  DenseMap<const MCSymbol *, const MCSymbolRefExpr *> CachedSymbolEntries;
+
+  // Caches of entries that already exist, indexed by their contents
+  // and also the size of the constant.
+  std::map<std::pair<int64_t, unsigned>, const MCSymbolRefExpr *>
+      CachedConstantEntries;
+  DenseMap<std::pair<const MCSymbol *, unsigned>, const MCSymbolRefExpr *>
+      CachedSymbolEntries;
 
 public:
   // Initialize a new empty constant pool
diff --git a/llvm/lib/MC/ConstantPools.cpp b/llvm/lib/MC/ConstantPools.cpp
index f895cc6413d74f..824d2463f30fc5 100644
--- a/llvm/lib/MC/ConstantPools.cpp
+++ b/llvm/lib/MC/ConstantPools.cpp
@@ -43,14 +43,15 @@ const MCExpr *ConstantPool::addEntry(const MCExpr *Value, MCContext &Context,
 
   // Check if there is existing entry for the same constant. If so, reuse it.
   if (C) {
-    auto CItr = CachedConstantEntries.find(C->getValue());
+    auto CItr = CachedConstantEntries.find(std::make_pair(C->getValue(), Size));
     if (CItr != CachedConstantEntries.end())
       return CItr->second;
   }
 
   // Check if there is existing entry for the same symbol. If so, reuse it.
   if (S) {
-    auto SItr = CachedSymbolEntries.find(&(S->getSymbol()));
+    auto SItr =
+        CachedSymbolEntries.find(std::make_pair(&(S->getSymbol()), Size));
     if (SItr != CachedSymbolEntries.end())
       return SItr->second;
   }
@@ -60,9 +61,9 @@ const MCExpr *ConstantPool::addEntry(const MCExpr *Value, MCContext &Context,
   Entries.push_back(ConstantPoolEntry(CPEntryLabel, Value, Size, Loc));
   const auto SymRef = MCSymbolRefExpr::create(CPEntryLabel, Context);
   if (C)
-    CachedConstantEntries[C->getValue()] = SymRef;
+    CachedConstantEntries[std::make_pair(C->getValue(), Size)] = SymRef;
   if (S)
-    CachedSymbolEntries[&(S->getSymbol())] = SymRef;
+    CachedSymbolEntries[std::make_pair(&(S->getSymbol()), Size)] = SymRef;
   return SymRef;
 }
 
diff --git a/llvm/test/MC/AArch64/constant-pool-sizes.s b/llvm/test/MC/AArch64/constant-pool-sizes.s
new file mode 100644
index 00000000000000..279402af025f34
--- /dev/null
+++ b/llvm/test/MC/AArch64/constant-pool-sizes.s
@@ -0,0 +1,25 @@
+// RUN: llvm-mc -triple aarch64-none-linux-gnu %s | FileCheck %s
+
+  ldr w0, =symbol
+  ldr x1, =symbol
+
+  ldr w2, =1234567890
+  ldr x3, =1234567890
+
+// CHECK:             ldr     w0, .Ltmp0
+// CHECK:             ldr     x1, .Ltmp1
+// CHECK:             ldr     w2, .Ltmp2
+// CHECK:             ldr     x3, .Ltmp3
+
+// CHECK:             .p2align        2, 0x0
+// CHECK-NEXT:.Ltmp0:
+// CHECK-NEXT:        .word   symbol
+// CHECK:             .p2align        3, 0x0
+// CHECK-NEXT:.Ltmp1:
+// CHECK-NEXT:        .xword  symbol
+// CHECK:             .p2align        2, 0x0
+// CHECK-NEXT:.Ltmp2:
+// CHECK-NEXT:        .word   1234567890
+// CHECK:             .p2align        3, 0x0
+// CHECK-NEXT:.Ltmp3:
+// CHECK-NEXT:        .xword  1234567890


llvmbot commented Mar 27, 2024

@llvm/pr-subscribers-mc

@efriedma-quic left a comment

LGTM

If we want to optimize on AArch64 specifically, we could teach the assembler to rewrite `ldr x0, =smallconstant` to `ldr w0, =smallconstant`. But it's probably not worth the effort.
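
As a hedged sketch of the legality check such a rewrite would need (the helper below is hypothetical, not anything in LLVM):

```cpp
// Hypothetical helper, not part of this patch or any existing rewrite:
// on AArch64, a write to a W register zero-extends into the full X
// register, so `ldr x0, =C` could only be narrowed to `ldr w0, =C`
// when C survives zero-extension from 32 bits.
#include <cstdint>

constexpr bool canNarrowTo32BitLoad(uint64_t C) {
  return C <= UINT32_MAX; // upper 32 bits must end up zero
}

static_assert(canNarrowTo32BitLoad(1234567890), "fits in 32 bits");
static_assert(!canNarrowTo32BitLoad(0x100000000ULL), "needs a 64-bit slot");
```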


AtariDreams commented Mar 27, 2024

> LGTM
>
> If we want to optimize on AArch64 specifically, we could teach the assembler to rewrite `ldr x0, =smallconstant` to `ldr w0, =smallconstant`. But it's probably not worth the effort.

NASM does this for x86, and I think something similar exists for AArch64 too.

(NASM is also nice in that it optimizes `mov rdx, 7` to `mov edx, 7` by default, among other similar rewrites.)

@rengolin left a comment

You do end up with a little bloat in your constant pool, but that's better than reading garbage. :)

I'd treat the fusion of these entries as a future optimization, since endianness is not a trivial thing to handle, and this fixes an actual correctness bug.

@statham-arm statham-arm merged commit 88b10f3 into llvm:main Mar 28, 2024
7 checks passed
@AtariDreams

/cherry-pick 88b10f3


llvmbot commented Mar 30, 2024

> /cherry-pick 88b10f3

Error: Command failed due to missing milestone.

Labels: backend:AArch64, mc