Merged
60 commits
f433ff8
[LoopIdiom] Add test for simplifying SCEV during expansion with flags.
fhahn Aug 19, 2025
b73cf05
[IndVars,LV] Add tests for missed SCEV simplifications with muls.
fhahn Aug 25, 2025
efc6ff0
[SCEV] Try to re-use existing LCSSA phis when expanding SCEVAddRecExp…
fhahn Jul 17, 2025
6f4e97f
[LAA] Add test variant with backward dep with overlap in loop.
fhahn Jul 21, 2025
86aa208
[SCEV] Don't require NUW at first add when checking A+C1 < (A+C2)<nuw…
fhahn Jul 23, 2025
ee153c8
[LV] Add additional SCEV expansion tests for #147824.
fhahn Jul 25, 2025
bcf033b
[SCEV] Try to re-use pointer LCSSA phis when expanding SCEVs. (#147824)
fhahn Jul 25, 2025
110e585
[LV] Add test for re-using existing phi for SCEV Add.
fhahn Jul 25, 2025
59fcf96
[SCEV] Make sure LCSSA is preserved when re-using phi if needed.
fhahn Jul 28, 2025
9184074
[SCEV] Add test for pushing constant add into zext.
fhahn Jul 30, 2025
4c941bf
[SECV] Try to push the op into ZExt: A + zext (-A + B) -> zext (B) (#…
fhahn Jul 30, 2025
e160af6
[SCEV] Use pattern match to check ZExt(Add()). (NFC)
fhahn Jul 31, 2025
f0e4da7
[SCEV] Allow adds of constants in tryToReuseLCSSAPhi. (#150693)
fhahn Jul 31, 2025
a996352
[SCEVExp] Check if getPtrToIntExpr resulted in CouldNotCompute.
fhahn Aug 26, 2025
0a096eb
[SCEV] Try to push op into ZExt: C * zext (A + B) -> zext (A*C + B*C)…
fhahn Aug 26, 2025
83fdf69
[SCEV] Add tests for applying guards to SCEVAddExpr sub-expressions.
fhahn Aug 29, 2025
9703b98
[SCEV] Rewrite some SCEVAdd sub-expressions using loop guards. (#156013)
fhahn Sep 1, 2025
2ed9951
[LAA] Rename var used to retry with RT-checks (NFC) (#147307)
artagnon Jul 22, 2025
68d4d9e
[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#1…
fhahn Aug 1, 2025
75a5fcd
[SCEV][LAA] Support multiplication overflow computation (#155236)
annamthomas Aug 27, 2025
6cd6e8b
[LV] Add additional tests for reasoning about dereferenceable loads.
fhahn Sep 3, 2025
57bd0c8
Reapply "[LAA,Loads] Use loop guards and max BTC if needed when check…
fhahn Sep 3, 2025
29e4e6a
[LAA] Support assumptions with non-constant deref sizes. (#156758)
fhahn Sep 4, 2025
b18c2ca
[SCEV] Fold (C * A /u C) -> A, if A is a multiple of C and C a pow-of…
fhahn Sep 5, 2025
4be0a04
[VPlan] Make VPInstruction::AnyOf poison-safe. (#154156)
fhahn Aug 25, 2025
f6c35e2
[SCEV] Add tests for folding multiply/divide by constants.
fhahn Sep 5, 2025
4918512
[SCEV] Cover more multipler/divisor combinations in folding test.
fhahn Sep 5, 2025
c38a2e9
[SCEV] Generalize (C * A /u C) -> A fold to (C1 * A /u C2) -> C1/C2 *…
fhahn Sep 9, 2025
af12a13
[SCEV] Fold ((-1 * C1) * D / C1) -> -1 * D. (#157555)
fhahn Sep 10, 2025
ca229b8
[SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1. (#157656)
fhahn Sep 11, 2025
41b6b30
[SCEV] Add more tests for MUL/UDIV folds.
fhahn Sep 17, 2025
d3587ec
Reapply "[SCEV] Fold (C1 * A /u C2) -> A /u (C2 /u C1), if C2 > C1."…
fhahn Sep 17, 2025
ad95f85
[InstComb] Fold inttoptr (add (ptrtoint %B), %O) -> GEP for ICMP user…
fhahn Aug 21, 2025
c0c10b1
[InstComb] Allow more user for (add (ptrtoint %B), %O) to GEP transfo…
fhahn Aug 22, 2025
1be7a05
[LSR] Move test from Analysis/ScalarEvolution to Transforms, regen.
fhahn Sep 3, 2025
0d207e0
[SCEV] Add tests that benefit from rewriting SCEVAddExpr with guards.
fhahn Sep 20, 2025
2e0796d
[SCEV] Add additional test with guards for 3-op AddRec.
fhahn Sep 22, 2025
14064d5
[LV] Add test showing missed optimization due to missing info from guard
fhahn Sep 22, 2025
921ddad
[SCEV] Add tests for computing trip counts with align assumptions.
fhahn Sep 26, 2025
037fffd
[SCEV] Pass loop pred branch as context instruction to getMinTrailing…
fhahn Oct 8, 2025
8d201e5
[SCEV] Use getConstantMultiple in to get divisibility info from guard…
fhahn Oct 9, 2025
b0ea535
[SCEV] Use APInt for DividesBy when collecting loop guard info (NFC).…
fhahn Oct 12, 2025
d245be6
[SCEV] Add test with ptrtoint guards and their order swapped.
fhahn Oct 13, 2025
198f96b
[SCEV] Add m_scev_Trunc pattern matcher. (#163169)
fhahn Oct 13, 2025
2af416b
[SCEV] Move URem matching to ScalarEvolutionPatternMatch.h (#163170)
fhahn Oct 13, 2025
d3f487a
[SCEV] Collect guard info for ICMP NE w/o constants. (#160500)
fhahn Oct 14, 2025
354788f
[SCEV] Use context instruction for SCEVUnknowns in getConstantMultipl…
fhahn Oct 14, 2025
711a060
[SCEV] Add tests with multiple NE guards and different orders.
fhahn Oct 16, 2025
96dbb37
[SCEV] Use m_scev_Mul in a few more places. (NFC) (#163364)
fhahn Oct 16, 2025
98e17c6
[SCEV] Rewrite A - B = UMin(1, A - B) lazily for A != B loop guards. …
fhahn Oct 18, 2025
ba69e1d
[SCEV] Preserve divisor info when adding guard info for ICMP_NE via S…
fhahn Oct 20, 2025
ea3b9c0
[SCEV] Add extra test coverage with URem & AddRec guards.
fhahn Oct 20, 2025
f2ef597
[SCEV] Move and clarify names of prev/next divisor helpers (NFC).
fhahn Oct 20, 2025
a738a80
[SCEV] Fix switch formatting in collectFromBlock (NFC).
fhahn Oct 31, 2025
208a5b2
[SCEV] Improve handling of divisibility information from loop guards.…
fhahn Nov 2, 2025
d67f605
[Loads] Add additional test coverage for assumptions.
fhahn Sep 2, 2025
ee66714
[LAA] Check if Ptr can be freed between Assume and CtxI. (#161725)
fhahn Oct 3, 2025
5c9f0df
[Clang] Allow non-constant sizes for __builtin_assume_dereferenceable…
fhahn Sep 5, 2025
c83eacd
[Loads] Check if Ptr can be freed between Assume and CtxI. (#161255)
fhahn Oct 7, 2025
dec6e12
Update tests.
fhahn Nov 4, 2025
2 changes: 2 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -389,6 +389,8 @@ Non-comprehensive list of changes in this release
this build without optimizations (i.e. use `-O0` or use the `optnone` function
attribute) or use the `fno-sanitize-merge=` flag in optimized builds.

- ``__builtin_assume_dereferenceable`` now accepts non-constant size operands.

New Compiler Flags
------------------
- New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
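The release note above, together with the `Builtins.td` prototype change that follows (dropping `_Constant` from the size parameter), means the size argument of `__builtin_assume_dereferenceable` no longer has to be a compile-time constant. A minimal sketch of what this enables; the function and variable names are illustrative, not from the PR:

```cpp
#include <cstddef>

// The byte count may now be a runtime value; Clang lowers the call to
// llvm.assume with a "dereferenceable"(ptr, i64) operand bundle, as the
// CodeGen tests further down show.
int sum_prefix(int *p, std::size_t n_bytes) {
  __builtin_assume_dereferenceable(p, n_bytes);
  return p[0] + p[1];
}
```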
2 changes: 1 addition & 1 deletion clang/include/clang/Basic/Builtins.td
@@ -854,7 +854,7 @@ def BuiltinAssumeAligned : Builtin {
def BuiltinAssumeDereferenceable : Builtin {
let Spellings = ["__builtin_assume_dereferenceable"];
let Attributes = [NoThrow, Const];
let Prototype = "void(void const*, _Constant size_t)";
let Prototype = "void(void const*, size_t)";
}

def BuiltinFree : Builtin {
59 changes: 59 additions & 0 deletions clang/test/CodeGen/builtin-assume-dereferenceable.c
@@ -32,3 +32,62 @@ int test2(int *a) {
__builtin_assume_dereferenceable(a, 32ull);
return a[0];
}

// CHECK-LABEL: @test3(
// CHECK-NEXT: entry:
// CHECK-NEXT: [[A_ADDR:%.*]] = alloca ptr, align 8
// CHECK-NEXT: [[N_ADDR:%.*]] = alloca i32, align 4
// CHECK-NEXT: store ptr [[A:%.*]], ptr [[A_ADDR]], align 8
// CHECK-NEXT: store i32 [[N:%.*]], ptr [[N_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[N_ADDR]], align 4
// CHECK-NEXT: [[CONV:%.*]] = sext i32 [[TMP1]] to i64
// CHECK-NEXT: call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[TMP0]], i64 [[CONV]]) ]
// CHECK-NEXT: [[TMP2:%.*]] = load ptr, ptr [[A_ADDR]], align 8
// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP2]], i64 0
// CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
// CHECK-NEXT: ret i32 [[TMP3]]
//
int test3(int *a, int n) {
__builtin_assume_dereferenceable(a, n);
return a[0];
}

// CHECK-LABEL: @test4(
// CHECK-NEXT: entry:
// CHECK-NEXT: [[A_ADDR:%.*]] = alloca ptr, align 8
// CHECK-NEXT: [[N_ADDR:%.*]] = alloca i64, align 8
// CHECK-NEXT: store ptr [[A:%.*]], ptr [[A_ADDR]], align 8
// CHECK-NEXT: store i64 [[N:%.*]], ptr [[N_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load i64, ptr [[N_ADDR]], align 8
// CHECK-NEXT: call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[TMP0]], i64 [[TMP1]]) ]
// CHECK-NEXT: [[TMP2:%.*]] = load ptr, ptr [[A_ADDR]], align 8
// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP2]], i64 0
// CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
// CHECK-NEXT: ret i32 [[TMP3]]
//
int test4(int *a, unsigned long long n) {
__builtin_assume_dereferenceable(a, n);
return a[0];
}

// CHECK-LABEL: @test5(
// CHECK-NEXT: entry:
// CHECK-NEXT: [[A_ADDR:%.*]] = alloca ptr, align 8
// CHECK-NEXT: [[N_ADDR:%.*]] = alloca float, align 4
// CHECK-NEXT: store ptr [[A:%.*]], ptr [[A_ADDR]], align 8
// CHECK-NEXT: store float [[N:%.*]], ptr [[N_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load float, ptr [[N_ADDR]], align 4
// CHECK-NEXT: [[CONV:%.*]] = fptoui float [[TMP1]] to i64
// CHECK-NEXT: call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[TMP0]], i64 [[CONV]]) ]
// CHECK-NEXT: [[TMP2:%.*]] = load ptr, ptr [[A_ADDR]], align 8
// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP2]], i64 0
// CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
// CHECK-NEXT: ret i32 [[TMP3]]
//
int test5(int *a, float n) {
__builtin_assume_dereferenceable(a, n);
return a[0];
}
9 changes: 7 additions & 2 deletions clang/test/SemaCXX/builtin-assume-dereferenceable.cpp
@@ -18,12 +18,12 @@ int test3(int *a) {
}

int test4(int *a, unsigned size) {
a = __builtin_assume_dereferenceable(a, size); // expected-error {{argument to '__builtin_assume_dereferenceable' must be a constant integer}}
__builtin_assume_dereferenceable(a, size);
return a[0];
}

int test5(int *a, unsigned long long size) {
a = __builtin_assume_dereferenceable(a, size); // expected-error {{argument to '__builtin_assume_dereferenceable' must be a constant integer}}
__builtin_assume_dereferenceable(a, size);
return a[0];
}

@@ -53,3 +53,8 @@ constexpr void *l = __builtin_assume_dereferenceable(p, 4); // expected-error {{
void *foo() {
return l;
}

int test10(int *a) {
__builtin_assume_dereferenceable(a, a); // expected-error {{cannot initialize a parameter of type 'unsigned long' with an lvalue of type 'int *'}}
return a[0];
}
59 changes: 43 additions & 16 deletions llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -180,11 +180,15 @@ class MemoryDepChecker {
const SmallVectorImpl<Instruction *> &Instrs) const;
};

MemoryDepChecker(PredicatedScalarEvolution &PSE, const Loop *L,
MemoryDepChecker(PredicatedScalarEvolution &PSE, AssumptionCache *AC,
DominatorTree *DT, const Loop *L,
const DenseMap<Value *, const SCEV *> &SymbolicStrides,
unsigned MaxTargetVectorWidthInBits)
: PSE(PSE), InnermostLoop(L), SymbolicStrides(SymbolicStrides),
MaxTargetVectorWidthInBits(MaxTargetVectorWidthInBits) {}
unsigned MaxTargetVectorWidthInBits,
std::optional<ScalarEvolution::LoopGuards> &LoopGuards)
: PSE(PSE), AC(AC), DT(DT), InnermostLoop(L),
SymbolicStrides(SymbolicStrides),
MaxTargetVectorWidthInBits(MaxTargetVectorWidthInBits),
LoopGuards(LoopGuards) {}

/// Register the location (instructions are given increasing numbers)
/// of a write access.
@@ -236,8 +240,8 @@ class MemoryDepChecker {

/// In same cases when the dependency check fails we can still
/// vectorize the loop with a dynamic array access check.
bool shouldRetryWithRuntimeCheck() const {
return FoundNonConstantDistanceDependence &&
bool shouldRetryWithRuntimeChecks() const {
return ShouldRetryWithRuntimeChecks &&
Status == VectorizationSafetyStatus::PossiblySafeWithRtChecks;
}

@@ -288,6 +292,15 @@ class MemoryDepChecker {
return PointerBounds;
}

DominatorTree *getDT() const {
assert(DT && "requested DT, but it is not available");
return DT;
}
AssumptionCache *getAC() const {
assert(AC && "requested AC, but it is not available");
return AC;
}

private:
/// A wrapper around ScalarEvolution, used to add runtime SCEV checks, and
/// applies dynamic knowledge to simplify SCEV expressions and convert them
@@ -296,6 +309,10 @@ class MemoryDepChecker {
/// example we might assume a unit stride for a pointer in order to prove
/// that a memory access is strided and doesn't wrap.
PredicatedScalarEvolution &PSE;

AssumptionCache *AC;
DominatorTree *DT;

const Loop *InnermostLoop;

/// Reference to map of pointer values to
@@ -327,9 +344,9 @@
uint64_t MaxStoreLoadForwardSafeDistanceInBits =
std::numeric_limits<uint64_t>::max();

/// If we see a non-constant dependence distance we can still try to
/// vectorize this loop with runtime checks.
bool FoundNonConstantDistanceDependence = false;
/// Whether we should try to vectorize the loop with runtime checks, if the
/// dependencies are not safe.
bool ShouldRetryWithRuntimeChecks = false;

/// Result of the dependence checks, indicating whether the checked
/// dependences are safe for vectorization, require RT checks or are known to
@@ -358,7 +375,7 @@
PointerBounds;

/// Cache for the loop guards of InnermostLoop.
std::optional<ScalarEvolution::LoopGuards> LoopGuards;
std::optional<ScalarEvolution::LoopGuards> &LoopGuards;

/// Check whether there is a plausible dependence between the two
/// accesses.
@@ -516,8 +533,9 @@ class RuntimePointerChecking {
AliasSetId(AliasSetId), Expr(Expr), NeedsFreeze(NeedsFreeze) {}
};

RuntimePointerChecking(MemoryDepChecker &DC, ScalarEvolution *SE)
: DC(DC), SE(SE) {}
RuntimePointerChecking(MemoryDepChecker &DC, ScalarEvolution *SE,
std::optional<ScalarEvolution::LoopGuards> &LoopGuards)
: DC(DC), SE(SE), LoopGuards(LoopGuards) {}

/// Reset the state of the pointer runtime information.
void reset() {
@@ -631,6 +649,9 @@ class RuntimePointerChecking {
/// Holds a pointer to the ScalarEvolution analysis.
ScalarEvolution *SE;

/// Cache for the loop guards of the loop.
std::optional<ScalarEvolution::LoopGuards> &LoopGuards;

/// Set of run-time checks required to establish independence of
/// otherwise may-aliasing pointers in the loop.
SmallVector<RuntimePointerCheck, 4> Checks;
@@ -670,7 +691,7 @@ class LoopAccessInfo {
LLVM_ABI LoopAccessInfo(Loop *L, ScalarEvolution *SE,
const TargetTransformInfo *TTI,
const TargetLibraryInfo *TLI, AAResults *AA,
DominatorTree *DT, LoopInfo *LI,
DominatorTree *DT, LoopInfo *LI, AssumptionCache *AC,
bool AllowPartial = false);

/// Return true we can analyze the memory accesses in the loop and there are
@@ -806,6 +827,9 @@ class LoopAccessInfo {

Loop *TheLoop;

/// Cache for the loop guards of TheLoop.
std::optional<ScalarEvolution::LoopGuards> LoopGuards;

/// Determines whether we should generate partial runtime checks when not all
/// memory accesses could be analyzed.
bool AllowPartial;
@@ -922,7 +946,9 @@ LLVM_ABI std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
const Loop *Lp, const SCEV *PtrExpr, Type *AccessTy, const SCEV *BTC,
const SCEV *MaxBTC, ScalarEvolution *SE,
DenseMap<std::pair<const SCEV *, Type *>,
std::pair<const SCEV *, const SCEV *>> *PointerBounds);
std::pair<const SCEV *, const SCEV *>> *PointerBounds,
DominatorTree *DT, AssumptionCache *AC,
std::optional<ScalarEvolution::LoopGuards> &LoopGuards);

class LoopAccessInfoManager {
/// The cache.
@@ -935,12 +961,13 @@ class LoopAccessInfoManager {
LoopInfo &LI;
TargetTransformInfo *TTI;
const TargetLibraryInfo *TLI = nullptr;
AssumptionCache *AC;

public:
LoopAccessInfoManager(ScalarEvolution &SE, AAResults &AA, DominatorTree &DT,
LoopInfo &LI, TargetTransformInfo *TTI,
const TargetLibraryInfo *TLI)
: SE(SE), AA(AA), DT(DT), LI(LI), TTI(TTI), TLI(TLI) {}
const TargetLibraryInfo *TLI, AssumptionCache *AC)
: SE(SE), AA(AA), DT(DT), LI(LI), TTI(TTI), TLI(TLI), AC(AC) {}

LLVM_ABI const LoopAccessInfo &getInfo(Loop &L, bool AllowPartial = false);

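The `LoopAccessAnalysis.h` changes above thread an `AssumptionCache` and `DominatorTree` through `MemoryDepChecker` and `LoopAccessInfoManager`, rename `shouldRetryWithRuntimeCheck()` to `shouldRetryWithRuntimeChecks()`, and make `LoopAccessInfo` own a single `std::optional<ScalarEvolution::LoopGuards>` that the dependence checker and runtime-check machinery reference. A rough sketch of the shared, lazily populated guards cache this enables; illustrative only, not the actual LAA code:

```cpp
#include "llvm/Analysis/ScalarEvolution.h"
#include <optional>

using namespace llvm;

// Owned in one place (e.g. by LoopAccessInfo); MemoryDepChecker and
// RuntimePointerChecking hold a reference to the same optional, so the
// guards for the loop are collected at most once and then reused.
static const ScalarEvolution::LoopGuards &
getOrCollectGuards(std::optional<ScalarEvolution::LoopGuards> &Cache,
                   const Loop *L, ScalarEvolution &SE) {
  if (!Cache)
    Cache.emplace(ScalarEvolution::LoopGuards::collect(L, SE));
  return *Cache;
}
```

The rename itself is mechanical for callers; per the updated member comment, `ShouldRetryWithRuntimeChecks` now describes the retry decision in general rather than only the non-constant-distance case.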
21 changes: 12 additions & 9 deletions llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -1000,10 +1000,14 @@ class ScalarEvolution {
/// (at every loop iteration). It is, at the same time, the minimum number
/// of times S is divisible by 2. For example, given {4,+,8} it returns 2.
/// If S is guaranteed to be 0, it returns the bitwidth of S.
LLVM_ABI uint32_t getMinTrailingZeros(const SCEV *S);
/// If \p CtxI is not nullptr, return a constant multiple valid at \p CtxI.
LLVM_ABI uint32_t getMinTrailingZeros(const SCEV *S,
const Instruction *CtxI = nullptr);

/// Returns the max constant multiple of S.
LLVM_ABI APInt getConstantMultiple(const SCEV *S);
/// Returns the max constant multiple of S. If \p CtxI is not nullptr, return
/// a constant multiple valid at \p CtxI.
LLVM_ABI APInt getConstantMultiple(const SCEV *S,
const Instruction *CtxI = nullptr);

// Returns the max constant multiple of S. If S is exactly 0, return 1.
LLVM_ABI APInt getNonZeroConstantMultiple(const SCEV *S);
@@ -1339,6 +1343,7 @@ class ScalarEvolution {

class LoopGuards {
DenseMap<const SCEV *, const SCEV *> RewriteMap;
SmallDenseSet<std::pair<const SCEV *, const SCEV *>> NotEqual;
bool PreserveNUW = false;
bool PreserveNSW = false;
ScalarEvolution &SE;
@@ -1525,8 +1530,10 @@ class ScalarEvolution {
/// Return the Value set from which the SCEV expr is generated.
ArrayRef<Value *> getSCEVValues(const SCEV *S);

/// Private helper method for the getConstantMultiple method.
APInt getConstantMultipleImpl(const SCEV *S);
/// Private helper method for the getConstantMultiple method. If \p CtxI is
/// not nullptr, return a constant multiple valid at \p CtxI.
APInt getConstantMultipleImpl(const SCEV *S,
const Instruction *Ctx = nullptr);

/// Information about the number of times a particular loop exit may be
/// reached before exiting the loop.
@@ -2310,10 +2317,6 @@ class ScalarEvolution {
/// an add rec on said loop.
void getUsedLoops(const SCEV *S, SmallPtrSetImpl<const Loop *> &LoopsUsed);

/// Try to match the pattern generated by getURemExpr(A, B). If successful,
/// Assign A and B to LHS and RHS, respectively.
LLVM_ABI bool matchURem(const SCEV *Expr, const SCEV *&LHS, const SCEV *&RHS);

/// Look for a SCEV expression with type `SCEVType` and operands `Ops` in
/// `UniqueSCEVs`. Return if found, else nullptr.
SCEV *findExistingSCEVInCache(SCEVTypes SCEVType, ArrayRef<const SCEV *> Ops);
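The `ScalarEvolution.h` changes give `getMinTrailingZeros` and `getConstantMultiple` an optional context instruction, so divisibility facts established by dominating guards or assumptions can refine the result, and add a `NotEqual` set to `LoopGuards` for the lazily applied `A != B` rewrites. A hedged sketch of how a client might use the context-sensitive query; the surrounding setup is assumed and not part of this PR:

```cpp
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/ScalarEvolution.h"

using namespace llvm;

// Ask for the constant multiple / trailing zeros of a pointer SCEV as of the
// loop predecessor's terminator, so e.g. an alignment assumption that
// dominates the loop can sharpen the answer.
static void queryDivisibility(ScalarEvolution &SE, const Loop *L, Value *Ptr) {
  const SCEV *S = SE.getSCEV(Ptr);
  const Instruction *CtxI = nullptr;
  if (const BasicBlock *Pred = L->getLoopPredecessor())
    CtxI = Pred->getTerminator();
  APInt Multiple = SE.getConstantMultiple(S, CtxI); // max constant multiple valid at CtxI
  uint32_t TZ = SE.getMinTrailingZeros(S, CtxI);    // known trailing zero bits at CtxI
  (void)Multiple;
  (void)TZ;
}
```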