[clang][dataflow] Strengthen widening of boolean values. #73484

martinboehme · 2023-11-27T06:32:38Z

Before we widen to top, we now check if both values can be proved either true or
false in their respective environments; if so, widening returns a true or false
literal. The idea is that we avoid losing information if possible.

This patch includes a test that fails without this change to widening.

This change does mean that we call the SAT solver in more places, but this seems
acceptable given the additional precision we gain.

In tests on an internal codebase, the number of SAT solver timeouts we observe
with Crubit's nullability checker does increase by about 25%. They can be
brought back to the previous level by doubling the SAT solver work limit.
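
The decision described above can be sketched as a small standalone model. This is an illustrative toy, not the code in the patch: `Proof`, `Widened`, `widenBool` and `demo` are hypothetical names, and each `Proof` value stands in for the result of asking an environment's SAT solver whether a formula or its negation can be proved. The sketch also assumes the "previous value is already Top" case has been handled before this point.

```cpp
#include <cassert>

// Hypothetical stand-in for what the solver can establish about a boolean
// value in one environment: provably true, provably false, or neither.
enum class Proof { True, False, Unknown };

// Hypothetical result of widening two distinct boolean values.
enum class Widened { TrueLiteral, FalseLiteral, Top };

// Widen only as far as necessary: keep a literal when both the previous and
// the current value agree on a provable truth value, otherwise go to Top.
constexpr Widened widenBool(Proof Prev, Proof Cur) {
  if (Prev == Proof::True && Cur == Proof::True)
    return Widened::TrueLiteral;
  if (Prev == Proof::False && Cur == Proof::False)
    return Widened::FalseLiteral;
  return Widened::Top;
}

// Small usage example mirroring the loop in the new test: `b` is provably
// false both before the loop and at the end of each iteration.
inline void demo() {
  assert(widenBool(Proof::False, Proof::False) == Widened::FalseLiteral);
  // If either side cannot be decided, information is lost as before.
  assert(widenBool(Proof::False, Proof::Unknown) == Widened::Top);
}
```

In this model the extra precision costs up to two solver queries per environment, which matches the observation above that the solver is called in more places.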

llvmbot · 2023-11-27T06:33:07Z

@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clang

Author: None (martinboehme)

Changes

Before we widen to top, we now check if both values can be proved either true or
false in their respective environments; if so, widening returns a true or false
literal. The idea is that we avoid losing information if possible.

This patch includes a test that fails without this change to widening.

This change does mean that we call the SAT solver in more places, but this seems
acceptable given the additional precision we gain.

In tests on an internal codebase, the number of SAT solver timeouts we observe
with Crubit's nullability checker does increase by about 25%. They can be
brought back to the previous level by doubling the SAT solver work limit.

Full diff: https://github.com/llvm/llvm-project/pull/73484.diff

2 Files Affected:

(modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+17-4)
(modified) clang/unittests/Analysis/FlowSensitive/TransferTest.cpp (+28)

diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index 1a38be9c1374f65..525ab188b01b8aa 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -157,12 +157,25 @@ static Value &widenDistinctValues(QualType Type, Value &Prev,
                                   Environment &CurrentEnv,
                                   Environment::ValueModel &Model) {
   // Boolean-model widening.
-  if (isa<BoolValue>(&Prev)) {
-    assert(isa<BoolValue>(Current));
-    // Widen to Top, because we know they are different values. If previous was
-    // already Top, re-use that to (implicitly) indicate that no change occured.
+  if (auto *PrevBool = dyn_cast<BoolValue>(&Prev)) {
+    // If previous value was already Top, re-use that to (implicitly) indicate
+    // that no change occurred.
     if (isa<TopBoolValue>(Prev))
       return Prev;
+
+    // We may need to widen to Top, but before we do so, check whether both
+    // values are implied to be either true or false in the current environment.
+    // In that case, we can simply return a literal instead.
+    auto &CurBool = cast<BoolValue>(Current);
+    bool TruePrev = PrevEnv.proves(PrevBool->formula());
+    bool TrueCur = CurrentEnv.proves(CurBool.formula());
+    if (TruePrev && TrueCur)
+      return CurrentEnv.getBoolLiteralValue(true);
+    if (!TruePrev && !TrueCur &&
+        PrevEnv.proves(PrevEnv.arena().makeNot(PrevBool->formula())) &&
+        CurrentEnv.proves(CurrentEnv.arena().makeNot(CurBool.formula())))
+      return CurrentEnv.getBoolLiteralValue(false);
+
     return CurrentEnv.makeTopBoolValue();
   }
 
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index ade0d202ced2f37..8da55953a329869 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -4167,6 +4167,34 @@ TEST(TransferTest, LoopWithShortCircuitedConditionConverges) {
   ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code), llvm::Succeeded());
 }
 
+TEST(TransferTest, LoopCanProveInvariantForBoolean) {
+  // Check that we can prove `b` is always false in the loop.
+  // This test exercises the logic in `widenDistinctValues()` that preserves
+  // information if the boolean can be proved to be either true or false in both
+  // the previous and current iteration.
+  std::string Code = R"cc(
+    int return_int();
+    void target() {
+      bool b = return_int() == 0;
+      if (b) return;
+      while (true) {
+        b;
+        // [[p]]
+        b = return_int() == 0;
+        if (b) return;
+      }
+    }
+  )cc";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {
+        const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
+        auto &BVal = getValueForDecl<BoolValue>(ASTCtx, Env, "b");
+        EXPECT_TRUE(Env.proves(Env.arena().makeNot(BVal.formula())));
+      });
+}
+
 TEST(TransferTest, DoesNotCrashOnUnionThisExpr) {
   std::string Code = R"(
     union Union {

This will be required after llvm/llvm-project#73484, which adds calls to the SAT solver from widening. Empirically, that change increases the number of SAT solver timeouts in the nullability check by about 25% on an internal codebase. Doubling the work limit brings SAT solver timeouts back to around their previous level. PiperOrigin-RevId: 585631104 Change-Id: I72dd2aaf3a4883029cb59ea8933dd7a12c95a8f4

llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Nov 27, 2023

martinboehme requested review from ymand and Xazax-hun November 27, 2023 06:33

Xazax-hun approved these changes Nov 27, 2023

martinboehme merged commit 5bd643e into llvm:main Nov 27, 2023
6 checks passed
