[MC][AArch64] Segregate constant pool caches by size. #86832
Conversation
If you write a 32- and a 64-bit LDR instruction that both refer to the same constant or symbol using the `=` syntax:

```
  ldr w0, =something
  ldr x1, =something
```

then the first call to `ConstantPool::addEntry` will insert the constant into its cache of existing entries, and the second one will find the cache entry and reuse it. This results in a 64-bit load from a 32-bit constant, reading nonsense into the other half of the target register.

In this patch I've done the simplest fix: include the size of the constant pool entry as part of the key used to index the cache. So now 32- and 64-bit constant loads will never share a constant pool entry.

There's scope for doing this better, in principle: you could imagine merging the two slots with appropriate overlap, so that the 32-bit load loads the LSW of the 64-bit value. But that's much more complicated: you have to take endianness into account, and maybe also adjust the size of an existing entry. This is the simplest fix that restores correctness.
@llvm/pr-subscribers-backend-aarch64

Author: Simon Tatham (statham-arm)

Changes

If you write a 32- and a 64-bit LDR instruction that both refer to the same constant or symbol using the `=` syntax, then the first call to `ConstantPool::addEntry` will insert the constant into its cache of existing entries, and the second one will find the cache entry and reuse it. This results in a 64-bit load from a 32-bit constant, reading nonsense into the other half of the target register.

In this patch I've done the simplest fix: include the size of the constant pool entry as part of the key used to index the cache. So now 32- and 64-bit constant loads will never share a constant pool entry.

There's scope for doing this better, in principle: you could imagine merging the two slots with appropriate overlap, so that the 32-bit load loads the LSW of the 64-bit value. But that's much more complicated: you have to take endianness into account, and maybe also adjust the size of an existing entry. This is the simplest fix that restores correctness.

Full diff: https://github.com/llvm/llvm-project/pull/86832.diff

3 Files Affected:
diff --git a/llvm/include/llvm/MC/ConstantPools.h b/llvm/include/llvm/MC/ConstantPools.h
index 7eac75362effdc..ff21ccda07a836 100644
--- a/llvm/include/llvm/MC/ConstantPools.h
+++ b/llvm/include/llvm/MC/ConstantPools.h
@@ -43,8 +43,13 @@ struct ConstantPoolEntry {
class ConstantPool {
using EntryVecTy = SmallVector<ConstantPoolEntry, 4>;
EntryVecTy Entries;
- std::map<int64_t, const MCSymbolRefExpr *> CachedConstantEntries;
- DenseMap<const MCSymbol *, const MCSymbolRefExpr *> CachedSymbolEntries;
+
+ // Caches of entries that already exist, indexed by their contents
+ // and also the size of the constant.
+ std::map<std::pair<int64_t, unsigned>, const MCSymbolRefExpr *>
+ CachedConstantEntries;
+ DenseMap<std::pair<const MCSymbol *, unsigned>, const MCSymbolRefExpr *>
+ CachedSymbolEntries;
public:
// Initialize a new empty constant pool
diff --git a/llvm/lib/MC/ConstantPools.cpp b/llvm/lib/MC/ConstantPools.cpp
index f895cc6413d74f..824d2463f30fc5 100644
--- a/llvm/lib/MC/ConstantPools.cpp
+++ b/llvm/lib/MC/ConstantPools.cpp
@@ -43,14 +43,15 @@ const MCExpr *ConstantPool::addEntry(const MCExpr *Value, MCContext &Context,
// Check if there is existing entry for the same constant. If so, reuse it.
if (C) {
- auto CItr = CachedConstantEntries.find(C->getValue());
+ auto CItr = CachedConstantEntries.find(std::make_pair(C->getValue(), Size));
if (CItr != CachedConstantEntries.end())
return CItr->second;
}
// Check if there is existing entry for the same symbol. If so, reuse it.
if (S) {
- auto SItr = CachedSymbolEntries.find(&(S->getSymbol()));
+ auto SItr =
+ CachedSymbolEntries.find(std::make_pair(&(S->getSymbol()), Size));
if (SItr != CachedSymbolEntries.end())
return SItr->second;
}
@@ -60,9 +61,9 @@ const MCExpr *ConstantPool::addEntry(const MCExpr *Value, MCContext &Context,
Entries.push_back(ConstantPoolEntry(CPEntryLabel, Value, Size, Loc));
const auto SymRef = MCSymbolRefExpr::create(CPEntryLabel, Context);
if (C)
- CachedConstantEntries[C->getValue()] = SymRef;
+ CachedConstantEntries[std::make_pair(C->getValue(), Size)] = SymRef;
if (S)
- CachedSymbolEntries[&(S->getSymbol())] = SymRef;
+ CachedSymbolEntries[std::make_pair(&(S->getSymbol()), Size)] = SymRef;
return SymRef;
}
diff --git a/llvm/test/MC/AArch64/constant-pool-sizes.s b/llvm/test/MC/AArch64/constant-pool-sizes.s
new file mode 100644
index 00000000000000..279402af025f34
--- /dev/null
+++ b/llvm/test/MC/AArch64/constant-pool-sizes.s
@@ -0,0 +1,25 @@
+// RUN: llvm-mc -triple aarch64-none-linux-gnu %s | FileCheck %s
+
+ ldr w0, =symbol
+ ldr x1, =symbol
+
+ ldr w2, =1234567890
+ ldr x3, =1234567890
+
+// CHECK: ldr w0, .Ltmp0
+// CHECK: ldr x1, .Ltmp1
+// CHECK: ldr w2, .Ltmp2
+// CHECK: ldr x3, .Ltmp3
+
+// CHECK: .p2align 2, 0x0
+// CHECK-NEXT:.Ltmp0:
+// CHECK-NEXT: .word symbol
+// CHECK: .p2align 3, 0x0
+// CHECK-NEXT:.Ltmp1:
+// CHECK-NEXT: .xword symbol
+// CHECK: .p2align 2, 0x0
+// CHECK-NEXT:.Ltmp2:
+// CHECK-NEXT: .word 1234567890
+// CHECK: .p2align 3, 0x0
+// CHECK-NEXT:.Ltmp3:
+// CHECK-NEXT: .xword 1234567890
@llvm/pr-subscribers-mc

Author: Simon Tatham (statham-arm)
LGTM

If we want to optimize on AArch64 specifically, we could teach the assembler to rewrite `ldr x0, =smallconstant` to `ldr w0, =smallconstant`. But it's probably not worth the effort.

NASM does this for x86, and I think for AArch64 too. (NASM also nicely optimizes `mov rdx, 7` to `mov edx, 7` by default, among other similar rewrites.)
You do end up with some small bloat in your constant pool, but it's better than reading garbage. :)

I'd treat the fusion of these entries as a future optimization, since endianness is not a trivial thing and this fixes an actual bug with correct behaviour.
/cherry-pick 88b10f3

Error: Command failed due to missing milestone.