Skip to content

Commit 1c21d5c

Browse files
authored
[GlobalISel] Remove GI known bits cache (#157352)
There is a cache on the known-bit computed by global-isel. It only works inside a single query to computeKnownBits, which limits its usefulness, and according to the tests can sometimes limit the effectiveness of known-bits queries. (Although some AMD tests look longer). Keeping the cache valid and clearing it at the correct times can also require being careful about the functions called inside known-bits queries. I measured compile-time of removing it and came up with: ``` 7zip 2.06405E+11 2.06436E+11 0.015018992 Bullet 1.01298E+11 1.01186E+11 -0.110236169 ClamAV 57942466667 57848066667 -0.16292023 SPASS 45444466667 45402966667 -0.091320249 consumer 35432466667 35381233333 -0.144594317 kimwitu++ 40858833333 40927933333 0.169118877 lencod 70022366667 69950633333 -0.102443457 mafft 38439900000 38413233333 -0.069372362 sqlite3 35822266667 35770033333 -0.145812474 tramp3d 82083133333 82045600000 -0.045726 Average -0.068828739 ``` The last column is % difference between with / without the cache. So in total it seems to be costing slightly more to keep the current known-bits cache than if it was removed. (Measured in instruction count, similar to llvm-compile-time-tracker). The hit rate wasn't terrible - higher than I expected. In the llvm-test-suite+external projects it was hit 4791030 times out of 91107008 queries, slightly more than 5%. Note that as globalisel increases in complexity, more known bits calls might be made and the numbers might shift. If that is the case it might be better to have a cache that works across calls, providing it doesn't make effectiveness worse.
1 parent 6931bad commit 1c21d5c

File tree

8 files changed

+327
-351
lines changed

8 files changed

+327
-351
lines changed

llvm/include/llvm/CodeGen/GlobalISel/GISelValueTracking.h

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -37,8 +37,6 @@ class LLVM_ABI GISelValueTracking : public GISelChangeObserver {
3737
const TargetLowering &TL;
3838
const DataLayout &DL;
3939
unsigned MaxDepth;
40-
/// Cache maintained during a computeKnownBits request.
41-
SmallDenseMap<Register, KnownBits, 16> ComputeKnownBitsCache;
4240

4341
void computeKnownBitsMin(Register Src0, Register Src1, KnownBits &Known,
4442
const APInt &DemandedElts, unsigned Depth = 0);

llvm/lib/CodeGen/GlobalISel/GISelValueTracking.cpp

Lines changed: 0 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -93,12 +93,8 @@ KnownBits GISelValueTracking::getKnownBits(Register R) {
9393
KnownBits GISelValueTracking::getKnownBits(Register R,
9494
const APInt &DemandedElts,
9595
unsigned Depth) {
96-
// For now, we only maintain the cache during one request.
97-
assert(ComputeKnownBitsCache.empty() && "Cache should have been cleared");
98-
9996
KnownBits Known;
10097
computeKnownBitsImpl(R, Known, DemandedElts, Depth);
101-
ComputeKnownBitsCache.clear();
10298
return Known;
10399
}
104100

@@ -187,14 +183,6 @@ void GISelValueTracking::computeKnownBitsImpl(Register R, KnownBits &Known,
187183
#endif
188184

189185
unsigned BitWidth = DstTy.getScalarSizeInBits();
190-
auto CacheEntry = ComputeKnownBitsCache.find(R);
191-
if (CacheEntry != ComputeKnownBitsCache.end()) {
192-
Known = CacheEntry->second;
193-
LLVM_DEBUG(dbgs() << "Cache hit at ");
194-
LLVM_DEBUG(dumpResult(MI, Known, Depth));
195-
assert(Known.getBitWidth() == BitWidth && "Cache entry size doesn't match");
196-
return;
197-
}
198186
Known = KnownBits(BitWidth); // Don't know anything
199187

200188
// Depth may get bigger than max depth if it gets passed to a different
@@ -254,16 +242,6 @@ void GISelValueTracking::computeKnownBitsImpl(Register R, KnownBits &Known,
254242
// point of the pipeline, otherwise the main live-range will be
255243
// defined more than once, which is against SSA.
256244
assert(MI.getOperand(0).getSubReg() == 0 && "Is this code in SSA?");
257-
// Record in the cache that we know nothing for MI.
258-
// This will get updated later and in the meantime, if we reach that
259-
// phi again, because of a loop, we will cut the search thanks to this
260-
// cache entry.
261-
// We could actually build up more information on the phi by not cutting
262-
// the search, but that additional information is more a side effect
263-
// than an intended choice.
264-
// Therefore, for now, save on compile time until we derive a proper way
265-
// to derive known bits for PHIs within loops.
266-
ComputeKnownBitsCache[R] = KnownBits(BitWidth);
267245
// PHI's operand are a mix of registers and basic blocks interleaved.
268246
// We only care about the register ones.
269247
for (unsigned Idx = 1; Idx < MI.getNumOperands(); Idx += 2) {
@@ -700,9 +678,6 @@ void GISelValueTracking::computeKnownBitsImpl(Register R, KnownBits &Known,
700678
}
701679

702680
LLVM_DEBUG(dumpResult(MI, Known, Depth));
703-
704-
// Update the cache.
705-
ComputeKnownBitsCache[R] = Known;
706681
}
707682

708683
static bool outputDenormalIsIEEEOrPosZero(const MachineFunction &MF, LLT Ty) {

llvm/test/CodeGen/AArch64/rem-by-const.ll

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -88,7 +88,7 @@ define i8 @ui8_7(i8 %a, i8 %b) {
8888
; CHECK-GI-NEXT: sub w9, w0, w8
8989
; CHECK-GI-NEXT: ubfx w9, w9, #1, #7
9090
; CHECK-GI-NEXT: add w8, w9, w8
91-
; CHECK-GI-NEXT: ubfx w8, w8, #2, #6
91+
; CHECK-GI-NEXT: lsr w8, w8, #2
9292
; CHECK-GI-NEXT: lsl w9, w8, #3
9393
; CHECK-GI-NEXT: sub w8, w9, w8
9494
; CHECK-GI-NEXT: sub w0, w0, w8
@@ -207,7 +207,7 @@ define i16 @ui16_7(i16 %a, i16 %b) {
207207
; CHECK-GI-NEXT: sub w9, w0, w8
208208
; CHECK-GI-NEXT: ubfx w9, w9, #1, #15
209209
; CHECK-GI-NEXT: add w8, w9, w8
210-
; CHECK-GI-NEXT: ubfx w8, w8, #2, #14
210+
; CHECK-GI-NEXT: lsr w8, w8, #2
211211
; CHECK-GI-NEXT: lsl w9, w8, #3
212212
; CHECK-GI-NEXT: sub w8, w9, w8
213213
; CHECK-GI-NEXT: sub w0, w0, w8

0 commit comments

Comments
 (0)