C++: Range analysis measure bounds #20645

paldepind · 2025-10-15T07:50:10Z

This PR intends to address (some) performance issues in the simple range analysis library.

The basic idea is rather simple:

- Do a simple pre-analysis that estimates how many bounds the range analysis would produce for a given expression. For instance, the number of bounds for `e1 + e2` is the number of bounds for `e1` times the number of bounds for `e2`.
- If the estimate is large for a given expression, we turn on widening for that expression. This ensures that the estimated blowup does not happen when the range analysis runs.
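
To make the two steps above concrete, here is a minimal Python model of the idea. It is not the QL implementation: the `Leaf` and `Add` classes, the function names, and the strict `>` comparison are assumptions made for illustration. Only the product rule for `e1 + e2` and the 2^40 limit come from this PR.

```python
from dataclasses import dataclass
from typing import Union

# The PR's getBoundsLimit() uses 2^40; the constant name here is hypothetical.
BOUNDS_LIMIT = 2.0 ** 40


@dataclass(frozen=True)
class Leaf:
    """An operand whose number of candidate bounds is already known."""
    bounds: float


@dataclass(frozen=True)
class Add:
    """A binary addition e1 + e2."""
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, Add]


def estimate_bounds(e: Expr) -> float:
    """Estimate how many bounds the range analysis could produce for e."""
    if isinstance(e, Leaf):
        return e.bounds
    # Every bound of the left operand may combine with every bound of the right.
    return estimate_bounds(e.left) * estimate_bounds(e.right)


def should_widen(e: Expr) -> bool:
    """Turn on widening once the estimate exceeds the limit (assumed strict)."""
    return estimate_bounds(e) > BOUNDS_LIMIT


def chain(additions: int) -> Expr:
    """Build a left-nested sum with `additions` plus signs; each leaf has 2 bounds."""
    e: Expr = Leaf(2.0)
    for _ in range(additions):
        e = Add(e, Leaf(2.0))
    return e
```

In this model a chain with 39 additions has an estimate of exactly 2^40 and is left alone, while one more addition pushes the estimate to 2^41 and triggers widening.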

Note to reviewers:

- Per-commit review is probably easiest.
- I've tried to thoroughly document things in comments (which I won't repeat here). The comment for `nrOfBoundsExpr` is a good place to start.
- A good chunk of the code is quite straightforward. The tricky bits are around how phi nodes are handled. Things are a bit heuristic-y there, more optimal approaches might be possible, and there are a few details about the phi nodes that I'm still a bit puzzled by. I've tried to add comments, but I think it's OK if a reviewer doesn't follow all the details there.
- The limit beyond which widening is turned on is very large. This is to be conservative. We could lower it later.

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

+  /**
+   * Finds any expression that has a lower bound, but where `nrOfBounds` does
+   * not compute an estimate.
+   */


cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

Copilot

Pull Request Overview

This PR addresses performance issues in the C++ simple range analysis library by implementing a pre-analysis to estimate bounds growth and applying widening when the estimate exceeds a threshold. The solution prevents combinatorial explosions that could cause analysis timeouts.

Key changes:

- Adds bounds estimation logic (`BoundsEstimate` module) that estimates the potential bounds count before running the full analysis
- Implements selective widening based on bounds estimates to prevent performance issues
- Updates test cases to reflect the new analysis behavior, with widening applied to expressions with many estimated bounds

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| SimpleRangeAnalysis.qll | Core implementation adding bounds estimation module and widening logic |
| test.c | New test cases demonstrating combinatorial explosion scenarios |
| upperBound.expected | Updated expected results reflecting widening behavior |
| lowerBound.expected | Updated expected results reflecting widening behavior |
| ternaryUpper.expected | Updated expected results for ternary expressions |
| ternaryLower.expected | Updated expected results for ternary expressions |
| nrOfBounds.ql | New test query for bounds estimation debugging |

Comments suppressed due to low confidence (1)

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll:1

Corrected spelling of 'anncuracies' to 'inaccuracies'.

/**

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

geoffw0

Strategy seems sensible to me (but @MathiasVP has spent much more time working with this library). I will review the (second) DCA run when it finishes.

paldepind · 2025-10-17T07:52:06Z

Thanks for the review @geoffw0 with some great catches 👍. I've applied your suggestions.

MathiasVP

A few minor comments. Thanks a lot for this, Simon!

I think it would be good to add an inline expectations test which shows the number of bounds for a given expression. This would also make it easier to spot problems with non-functionality of nrOfBoundsExpr. Would you mind adding such an inline expectations test while you're here?

MathiasVP · 2025-10-20T10:40:39Z

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

+  float getBoundsLimit() {
+    // This limit is arbitrary, but low enough that it prevents timeouts on
+    // specific observed customer databases (and the in the tests).
+    result = 2.0.pow(40)


I think this is a perfectly fine threshold to start with. FWIW when I introduce an "arbitrary threshold" like this I like to do a small amount of investigation into the underlying distribution. See for example what I did in this PR from last year where I plotted "number of nested bitwise operations" for each bitwise operation in a database. It would be interesting to see a similar plot for "number of bounds" for each expression on a database or two.

... but as I said: I think this very high arbitrary threshold is perfectly fine as a start.

That's a good point. @andersfugmann also suggested that we could create a statistics query and potentially get telemetry for this to make a more quantified upper bound.
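
As a rough illustration of the distribution investigation discussed above, the following Python sketch buckets per-expression estimates by order of magnitude. It assumes the estimates have already been exported from a database by some query; the function names and the input list are hypothetical, and nothing here is part of this PR.

```python
import math
from collections import Counter


def log2_histogram(estimates):
    """Count estimates per floor(log2(n)) bucket; values below 1 go to bucket 0."""
    hist = Counter()
    for n in estimates:
        bucket = int(math.floor(math.log2(n))) if n >= 1 else 0
        hist[bucket] += 1
    return dict(sorted(hist.items()))


def fraction_above(estimates, limit=2.0 ** 40):
    """Fraction of expressions whose estimate exceeds the widening limit."""
    if not estimates:
        return 0.0
    return sum(1 for n in estimates if n > limit) / len(estimates)


# Hypothetical sample; real input would come from a statistics query.
sample = [1, 2, 3, 4, 1024, 2.0 ** 41]
histogram = log2_histogram(sample)
```

A histogram like this would show how far typical expressions sit below 2^40, which is the kind of evidence that could justify lowering the limit later.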

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

MathiasVP · 2025-10-20T11:09:56Z

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

+        or
+        exists(ConditionalExpr condExpr |
+          e = condExpr and
+          result = nrOfBoundsExpr(condExpr.getThen()) * nrOfBoundsExpr(condExpr.getElse())


This likely doesn't give us the best possible join ordering since this recursion is non-linear, but we've never bothered to actually fix this in the main recursion of SimpleRangeAnalysis itself so it's probably all fine.
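
To see how quickly the product rule in the snippet above grows, here is a small Python model of the recurrence. It is a sketch, not the QL predicate: the balanced nesting shape and the choice of 2 bounds per leaf are assumptions made for illustration.

```python
def nested_ternary_estimate(depth: int, leaf_bounds: float = 2.0) -> float:
    """Estimate for a ternary whose then- and else-branches are ternaries of depth - 1.

    Mirrors the quoted rule: estimate(c ? t : e) = estimate(t) * estimate(e).
    """
    if depth == 0:
        return leaf_bounds
    sub = nested_ternary_estimate(depth - 1, leaf_bounds)
    return sub * sub


def first_depth_over(limit: float, leaf_bounds: float = 2.0) -> int:
    """Smallest nesting depth whose estimate exceeds the limit."""
    depth = 0
    while nested_ternary_estimate(depth, leaf_bounds) <= limit:
        depth += 1
    return depth
```

With these assumptions the estimate squares at every level (2, 4, 16, 256, ...), so it passes 2^40 at depth 6.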

paldepind · 2025-10-21T14:56:18Z

I've kicked off another DCA run for good measure, but assuming that one doesn't show anything then I think we're good to merge?

There is the option of doing a QA run as well, but the change seems low-risk enough that it may not be worth it. Thoughts?

paldepind · 2025-10-22T09:05:51Z

There's some failures in DCA now and retrying didn't make them go away. Looking at the errors, they don't really look like they're caused by the QL changes.

jketema · 2025-10-22T09:43:12Z

> There's some failures in DCA now and retrying didn't make them go away. Looking at the errors, they don't really look like they're caused by the QL changes.

Those errors were fixed late yesterday afternoon. Could you re-run?

paldepind · 2025-10-22T09:45:52Z

Thanks @jketema. I triggered retries 2 hours ago, should I start a new DCA run or just do the retry commands again?

jketema · 2025-10-22T09:50:11Z

I don't think retries work in this case: you'll need to start a new experiment.

paldepind · 2025-10-23T12:11:18Z

I think this is ready to merge now. Please double-check that y'all agree with my assessment of the DCA report.

geoffw0

DCA LGTM.

@MathiasVP were all your questions addressed?

MathiasVP · 2025-10-23T17:17:44Z

> @MathiasVP were all your questions addressed?

The only thing I am missing is this part of my review:

> I think it would be good to add an inline expectations test which shows the number of bounds for a given expression. This would also make it easier to spot problems with non-functionality of nrOfBoundsExpr. Would you mind adding such an inline expectations test while you're here?

paldepind · 2025-10-24T06:55:01Z

I agree that inline expectations would be nice (perhaps even more so for the lower/upper bounds themselves). But, is it ok if we don't do that for now?

geoffw0

The additional test can be done as follow-up.

github-actions bot added the C++ label Oct 15, 2025

github-advanced-security bot found potential problems Oct 15, 2025

paldepind added 2 commits October 15, 2025 11:11

C++: Factor out widening of bounds

8aaf9f6

C++: Add range analysis examples that explode

70a8c4f

paldepind force-pushed the cpp/range-analysis-measure branch 3 times, most recently from ab09ae5 to 4864e82 October 16, 2025 10:44

github-advanced-security bot found potential problems Oct 16, 2025

paldepind force-pushed the cpp/range-analysis-measure branch from 0ac00f8 to ab836bb October 16, 2025 12:17

paldepind added 5 commits October 16, 2025 15:05

C++: Apply widening based on number of bounds measure

7eacd87

C++: Add number of bounds test to simple range analysis

8896a72

C++: Add additional test for range analysis

99103a5

C++: Handle guard phi nodes differently

c1f0f3d

C++: Add debug predicates

9502d83

paldepind force-pushed the cpp/range-analysis-measure branch from ab836bb to 9502d83 October 16, 2025 13:07

paldepind marked this pull request as ready for review October 16, 2025 13:47

Copilot AI review requested due to automatic review settings October 16, 2025 13:47

paldepind requested a review from a team as a code owner October 16, 2025 13:47

paldepind requested review from MathiasVP and geoffw0 October 16, 2025 13:47

Copilot AI reviewed Oct 16, 2025

cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll

C++: Add change note

68d4240

github-actions bot added the documentation label Oct 16, 2025

geoffw0 reviewed Oct 16, 2025

C++: Apply suggested fixes from review

979b05c

MathiasVP reviewed Oct 20, 2025

paldepind added 2 commits October 21, 2025 09:47

C++: Address review comments

0badcfd

C++: Accept test changes

f207404

paldepind requested a review from geoffw0 October 23, 2025 12:10

geoffw0 reviewed Oct 23, 2025

geoffw0 approved these changes Oct 24, 2025

paldepind merged commit a0a6f28 into github:main Oct 24, 2025
16 checks passed

paldepind deleted the cpp/range-analysis-measure branch October 24, 2025 13:30

paldepind mentioned this pull request Oct 24, 2025

C++: A few small refactors to the simple range analysis library #20691

Merged
