Skip to content

Conversation

@necto
Copy link
Contributor

@necto necto commented Oct 10, 2025

To avoid information loss, introduce a difference between unset stats and 0 for statistics that are supposed to be set once per entry point. Now, if the statistic is not set for an entry point, the corresponding CSV cell will be empty, and not 0.

Thanks to this differentiation, I noticed that PathRunningTime was actually never set, and fixed that.
Additionally, this patch enables the timers if DumpEntryPointStatsToCSV is set, because in most cases you dump these stats to get a detailed view on analyzer performance.

Finally, I added a dedicated debug checker that demonstrates the use of a statistic and tested the set and unset scenarios explicitly.

--

CPP-7097

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Oct 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 10, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Arseniy Zaostrovnykh (necto)

Changes

To avoid information loss, introduce a difference between unset stats and 0 for statistics that are supposed to be set once per entry point. Now, if the statistic is not set for an entry point, the corresponding CSV cell will be empty, and not 0.

Thanks to this differentiation, I noticed that PathRunningTime was actually never set, and fixed that.
Additionally, this patch enables the timers if DumpEntryPointStatsToCSV is set, because in most cases you dump these stats to get a detailed view on analyzer performance.

Finally, I added a dedicated debug checker that demonstrates the use of a statistic and tested the set and unset scenarios explicitly.

--

CPP-7097


Full diff: https://github.com/llvm/llvm-project/pull/162839.diff

8 Files Affected:

  • (modified) clang/docs/analyzer/developer-docs/Statistics.rst (+1-1)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (+4)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h (+1-1)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (+1)
  • (added) clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp (+59)
  • (modified) clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp (+49-20)
  • (modified) clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp (+4-1)
  • (modified) clang/test/Analysis/analyzer-stats/entry-point-stats.cpp (+33-16)
diff --git a/clang/docs/analyzer/developer-docs/Statistics.rst b/clang/docs/analyzer/developer-docs/Statistics.rst
index 4f2484a89a6af..355759d468282 100644
--- a/clang/docs/analyzer/developer-docs/Statistics.rst
+++ b/clang/docs/analyzer/developer-docs/Statistics.rst
@@ -22,7 +22,7 @@ However, note that with ``LLVM_ENABLE_STATS`` disabled, only storage of the valu
 If you want to define a statistic only for entry point, EntryPointStats.h has four classes at your disposal:
 
 
-- ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body".
+- ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body". If no value is assigned during analysis of an entry point, the corresponding CSV cell will be empty.
 - ``CounterEPStat`` - an additive statistic. It starts with 0 and you can add to it as many times as needed. For example: "the number of bugs discovered".
 - ``UnsignedMaxEPStat`` - a maximizing statistic. It starts with 0 and when you join it with a value, it picks the maximum of the previous value and the new one. For example, "the longest execution path of a bug".
 
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 4473c54d8d6e3..0db6794b2310a 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -1502,6 +1502,10 @@ def TaintTesterChecker : Checker<"TaintTest">,
   HelpText<"Mark tainted symbols as such.">,
   Documentation<NotDocumented>;
 
+def UnsignedStatTesterChecker : Checker<"UnsignedStatTester">,
+  HelpText<"Test checker for demonstrating UnsignedEPStat usage.">,
+  Documentation<NotDocumented>;
+
 // This checker *technically* depends on SteamChecker, but we don't allow
 // dependency checkers to emit diagnostics, and a debug checker isn't worth
 // the chore needed to create a modeling portion on its own. Since this checker
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
index 389f17d36e65a..0a45deb33fcc9 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
@@ -85,7 +85,7 @@ class UnsignedEPStat : public EntryPointStat {
 
 public:
   explicit UnsignedEPStat(llvm::StringLiteral Name);
-  unsigned value() const { return Value.value_or(0); }
+  std::optional<unsigned> value() const { return Value; }
   void reset() { Value.reset(); }
   void set(unsigned V) {
     assert(!Value.has_value());
diff --git a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
index 29d2c4512d470..f70ae1a3e8a96 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
+++ b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
@@ -125,6 +125,7 @@ add_clang_library(clangStaticAnalyzerCheckers
   UninitializedObject/UninitializedPointee.cpp
   UnixAPIChecker.cpp
   UnreachableCodeChecker.cpp
+  UnsignedStatTesterChecker.cpp
   VforkChecker.cpp
   VLASizeChecker.cpp
   VAListChecker.cpp
diff --git a/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp
new file mode 100644
index 0000000000000..a302f506a285a
--- /dev/null
+++ b/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp
@@ -0,0 +1,59 @@
+//== UnsignedStatTesterChecker.cpp --------------------------- -*- C++ -*--=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This checker demonstrates the use of UnsignedEPStat for per-entry-point
+// statistics. It conditionally sets a statistic based on the entry point name.
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/Checker.h"
+#include "clang/StaticAnalyzer/Core/CheckerManager.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
+
+using namespace clang;
+using namespace ento;
+
+#define DEBUG_TYPE "UnsignedStatTester"
+
+static UnsignedEPStat DemoStat("DemoStat");
+
+namespace {
+class UnsignedStatTesterChecker : public Checker<check::BeginFunction> {
+public:
+  void checkBeginFunction(CheckerContext &C) const;
+};
+} // namespace
+
+void UnsignedStatTesterChecker::checkBeginFunction(CheckerContext &C) const {
+  const Decl *D = C.getLocationContext()->getDecl();
+  if (!D)
+    return;
+
+  std::string Name =
+      D->getAsFunction() ? D->getAsFunction()->getNameAsString() : "";
+
+  // Conditionally set the statistic based on the function name
+  if (Name == "func_one") {
+    DemoStat.set(1);
+  } else if (Name == "func_two") {
+    DemoStat.set(2);
+  } else if (Name == "func_three") {
+    DemoStat.set(3);
+  }
+  // For any other function (e.g., "func_none"), don't set the statistic
+}
+
+void ento::registerUnsignedStatTesterChecker(CheckerManager &mgr) {
+  mgr.registerChecker<UnsignedStatTesterChecker>();
+}
+
+bool ento::shouldRegisterUnsignedStatTesterChecker(const CheckerManager &mgr) {
+  return true;
+}
diff --git a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
index abfb176d6384d..33f5ccf6002ba 100644
--- a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
+++ b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
@@ -24,15 +24,21 @@ using namespace ento;
 
 namespace {
 struct Registry {
+  std::vector<UnsignedEPStat *> ExplicitlySetStats;
+  std::vector<UnsignedMaxEPStat *> MaxStats;
   std::vector<CounterEPStat *> CounterStats;
-  std::vector<UnsignedMaxEPStat *> UnsignedMaxStats;
-  std::vector<UnsignedEPStat *> UnsignedStats;
 
   bool IsLocked = false;
 
   struct Snapshot {
     const Decl *EntryPoint;
-    std::vector<unsigned> UnsignedStatValues;
+    // Explicitly set statistics may not have a value set, so they are separate
+    // from other unsigned statistics
+    std::vector<std::optional<unsigned>> ExplicitlySetStatValues;
+    // These are counting and maximizing statistics that initialize to 0, which
+    // is meaningful even if they are never updated, so their value is always
+    // present.
+    std::vector<unsigned> MaxOrCountStatValues;
 
     void dumpAsCSV(llvm::raw_ostream &OS) const;
   };
@@ -46,9 +52,12 @@ static llvm::ManagedStatic<Registry> StatsRegistry;
 
 namespace {
 template <typename Callback> void enumerateStatVectors(const Callback &Fn) {
+  // This order is important, it matches the order of the Snapshot fields:
+  // - ExplicitlySetStatValues
+  Fn(StatsRegistry->ExplicitlySetStats);
+  // - MaxOrCountStatValues
+  Fn(StatsRegistry->MaxStats);
   Fn(StatsRegistry->CounterStats);
-  Fn(StatsRegistry->UnsignedMaxStats);
-  Fn(StatsRegistry->UnsignedStats);
 }
 } // namespace
 
@@ -101,30 +110,37 @@ UnsignedMaxEPStat::UnsignedMaxEPStat(llvm::StringLiteral Name)
     : EntryPointStat(Name) {
   assert(!StatsRegistry->IsLocked);
   assert(!isRegistered(Name));
-  StatsRegistry->UnsignedMaxStats.push_back(this);
+  StatsRegistry->MaxStats.push_back(this);
 }
 
 UnsignedEPStat::UnsignedEPStat(llvm::StringLiteral Name)
     : EntryPointStat(Name) {
   assert(!StatsRegistry->IsLocked);
   assert(!isRegistered(Name));
-  StatsRegistry->UnsignedStats.push_back(this);
+  StatsRegistry->ExplicitlySetStats.push_back(this);
 }
 
-static std::vector<unsigned> consumeUnsignedStats() {
-  std::vector<unsigned> Result;
-  Result.reserve(StatsRegistry->CounterStats.size() +
-                 StatsRegistry->UnsignedMaxStats.size() +
-                 StatsRegistry->UnsignedStats.size());
-  for (auto *M : StatsRegistry->CounterStats) {
+static std::vector<std::optional<unsigned>>
+consumeExplicitlySetStats() {
+  std::vector<std::optional<unsigned>> Result;
+  Result.reserve(StatsRegistry->ExplicitlySetStats.size());
+  for (auto *M : StatsRegistry->ExplicitlySetStats) {
     Result.push_back(M->value());
     M->reset();
   }
-  for (auto *M : StatsRegistry->UnsignedMaxStats) {
+  return Result;
+}
+
+static std::vector<unsigned> consumeMaxAndCounterStats() {
+  std::vector<unsigned> Result;
+  Result.reserve(StatsRegistry->CounterStats.size() +
+                 StatsRegistry->MaxStats.size());
+  // Order is important, it must match the order in enumerateStatVectors
+  for (auto *M : StatsRegistry->MaxStats) {
     Result.push_back(M->value());
     M->reset();
   }
-  for (auto *M : StatsRegistry->UnsignedStats) {
+  for (auto *M : StatsRegistry->CounterStats) {
     Result.push_back(M->value());
     M->reset();
   }
@@ -150,20 +166,33 @@ static std::string getUSR(const Decl *D) {
 }
 
 void Registry::Snapshot::dumpAsCSV(llvm::raw_ostream &OS) const {
+  auto printAsUnsignOpt = [&OS](std::optional<unsigned> U) {
+    OS << (U.has_value() ? std::to_string(*U) : "");
+  };
+  auto commaIfNeeded = [&OS](const auto &Vec1, const auto &Vec2) {
+    if (!Vec1.empty() && !Vec2.empty())
+      OS << ",";
+  };
+  auto printAsUnsigned = [&OS](unsigned U) { OS << U; };
+
   OS << '"';
   llvm::printEscapedString(getUSR(EntryPoint), OS);
   OS << "\",\"";
   OS << StatsRegistry->EscapedCPPFileName << "\",\"";
   llvm::printEscapedString(
       clang::AnalysisDeclContext::getFunctionName(EntryPoint), OS);
-  OS << "\"";
-  OS << (UnsignedStatValues.empty() ? "" : ",");
-  llvm::interleave(UnsignedStatValues, OS, [&OS](unsigned U) { OS << U; }, ",");
+  OS << "\",";
+  llvm::interleave(ExplicitlySetStatValues, OS, printAsUnsignOpt, ",");
+  commaIfNeeded(ExplicitlySetStatValues, MaxOrCountStatValues);
+  llvm::interleave(MaxOrCountStatValues, OS, printAsUnsigned, ",");
 }
 
 void EntryPointStat::takeSnapshot(const Decl *EntryPoint) {
-  auto UnsignedValues = consumeUnsignedStats();
-  StatsRegistry->Snapshots.push_back({EntryPoint, std::move(UnsignedValues)});
+  auto ExplicitlySetValues = consumeExplicitlySetStats();
+  auto MaxOrCounterValues = consumeMaxAndCounterStats();
+  StatsRegistry->Snapshots.push_back({EntryPoint,
+                                      std::move(ExplicitlySetValues),
+                                      std::move(MaxOrCounterValues)});
 }
 
 void EntryPointStat::dumpStatsAsCSV(llvm::StringRef FileName) {
diff --git a/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp b/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
index cf01e2f37c662..e97004f1bbe52 100644
--- a/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
+++ b/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
@@ -39,6 +39,7 @@
 #include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/Timer.h"
 #include "llvm/Support/raw_ostream.h"
+#include <cmath>
 #include <memory>
 #include <utility>
 
@@ -142,7 +143,7 @@ class AnalysisConsumer : public AnalysisASTConsumer,
     DigestAnalyzerOptions();
 
     if (Opts.AnalyzerDisplayProgress || Opts.PrintStats ||
-        Opts.ShouldSerializeStats) {
+        Opts.ShouldSerializeStats || !Opts.DumpEntryPointStatsToCSV.empty()) {
       AnalyzerTimers = std::make_unique<llvm::TimerGroup>(
           "analyzer", "Analyzer timers");
       SyntaxCheckTimer = std::make_unique<llvm::Timer>(
@@ -788,6 +789,8 @@ void AnalysisConsumer::RunPathSensitiveChecks(Decl *D,
     ExprEngineTimer->stopTimer();
     llvm::TimeRecord ExprEngineEndTime = ExprEngineTimer->getTotalTime();
     ExprEngineEndTime -= ExprEngineStartTime;
+    PathRunningTime.set(static_cast<unsigned>(
+        std::lround(ExprEngineEndTime.getWallTime() * 1000)));
     DisplayTime(ExprEngineEndTime);
   }
 
diff --git a/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp b/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
index 9cbe04550a8d3..395d2b2fefa65 100644
--- a/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
+++ b/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
@@ -1,13 +1,21 @@
 // REQUIRES: asserts
-// RUN: %clang_analyze_cc1 -analyzer-checker=core \
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.UnsignedStatTester \
 // RUN:    -analyzer-config dump-entry-point-stats-to-csv="%t.csv" \
 // RUN:    -verify %s
 // RUN: %csv2json "%t.csv" | FileCheck --check-prefix=CHECK %s
 //
 // CHECK:      {
 // CHECK-NEXT:   "c:@F@fib#i#": {
-// CHECK-NEXT:     "File": "{{.*}}entry-point-stats.cpp",
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
 // CHECK-NEXT:     "DebugName": "fib(unsigned int)",
+// CHECK-NEXT:     "DemoStat": "",
+// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocks": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocksUnreachable": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumCTUSteps": "{{[0-9]+}}",
@@ -33,18 +41,31 @@
 // CHECK-NEXT:     "NumTimesZ3SpendsTooMuchTimeOnASingleEQClass": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumTimesZ3TimedOut": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumZ3QueriesDone": "{{[0-9]+}}",
-// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}",
+// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}"
+// CHECK-NEXT:   },
+// CHECK-NEXT:   "c:@F@func_one#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "func_one()",
+// CHECK:          "DemoStat": "1",
+//             .... not interesting statistics
+// CHECK:        },
+// CHECK-NEXT:   "c:@F@func_two#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "func_two()",
+// CHECK:          "DemoStat": "2",
+//             .... not interesting statistics
+// CHECK:        },
+// CHECK-NEXT:   "c:@F@main#I#**C#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "main(int, char **)",
+// CHECK-NEXT:     "DemoStat": "",
+// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}"
-// CHECK-NEXT:   },
-// CHECK-NEXT:   "c:@F@main#I#**C#": {
-// CHECK-NEXT:     "File": "{{.*}}entry-point-stats.cpp",
-// CHECK-NEXT:     "DebugName": "main(int, char **)",
 // CHECK-NEXT:     "NumBlocks": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocksUnreachable": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumCTUSteps": "{{[0-9]+}}",
@@ -70,14 +91,7 @@
 // CHECK-NEXT:     "NumTimesZ3SpendsTooMuchTimeOnASingleEQClass": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumTimesZ3TimedOut": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumZ3QueriesDone": "{{[0-9]+}}",
-// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}"
+// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}"
 // CHECK-NEXT:   }
 // CHECK-NEXT: }
 // CHECK-NOT: non_entry_point
@@ -102,3 +116,6 @@ int main(int argc, char **argv) {
   int i = non_entry_point(argc);
   return i;
 }
+
+void func_one() {}
+void func_two() {}

@llvmbot
Copy link
Member

llvmbot commented Oct 10, 2025

@llvm/pr-subscribers-clang

Author: Arseniy Zaostrovnykh (necto)

Changes

To avoid information loss, introduce a difference between unset stats and 0 for statistics that are supposed to be set once per entry point. Now, if the statistic is not set for an entry point, the corresponding CSV cell will be empty, and not 0.

Thanks to this differentiation, I noticed that PathRunningTime was actually never set, and fixed that.
Additionally, this patch enables the timers if DumpEntryPointStatsToCSV is set, because in most cases you dump these stats to get a detailed view on analyzer performance.

Finally, I added a dedicated debug checker that demonstrates the use of a statistic and tested the set and unset scenarios explicitly.

--

CPP-7097


Full diff: https://github.com/llvm/llvm-project/pull/162839.diff

8 Files Affected:

  • (modified) clang/docs/analyzer/developer-docs/Statistics.rst (+1-1)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (+4)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h (+1-1)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (+1)
  • (added) clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp (+59)
  • (modified) clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp (+49-20)
  • (modified) clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp (+4-1)
  • (modified) clang/test/Analysis/analyzer-stats/entry-point-stats.cpp (+33-16)
diff --git a/clang/docs/analyzer/developer-docs/Statistics.rst b/clang/docs/analyzer/developer-docs/Statistics.rst
index 4f2484a89a6af..355759d468282 100644
--- a/clang/docs/analyzer/developer-docs/Statistics.rst
+++ b/clang/docs/analyzer/developer-docs/Statistics.rst
@@ -22,7 +22,7 @@ However, note that with ``LLVM_ENABLE_STATS`` disabled, only storage of the valu
 If you want to define a statistic only for entry point, EntryPointStats.h has four classes at your disposal:
 
 
-- ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body".
+- ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body". If no value is assigned during analysis of an entry point, the corresponding CSV cell will be empty.
 - ``CounterEPStat`` - an additive statistic. It starts with 0 and you can add to it as many times as needed. For example: "the number of bugs discovered".
 - ``UnsignedMaxEPStat`` - a maximizing statistic. It starts with 0 and when you join it with a value, it picks the maximum of the previous value and the new one. For example, "the longest execution path of a bug".
 
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 4473c54d8d6e3..0db6794b2310a 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -1502,6 +1502,10 @@ def TaintTesterChecker : Checker<"TaintTest">,
   HelpText<"Mark tainted symbols as such.">,
   Documentation<NotDocumented>;
 
+def UnsignedStatTesterChecker : Checker<"UnsignedStatTester">,
+  HelpText<"Test checker for demonstrating UnsignedEPStat usage.">,
+  Documentation<NotDocumented>;
+
 // This checker *technically* depends on SteamChecker, but we don't allow
 // dependency checkers to emit diagnostics, and a debug checker isn't worth
 // the chore needed to create a modeling portion on its own. Since this checker
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
index 389f17d36e65a..0a45deb33fcc9 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
@@ -85,7 +85,7 @@ class UnsignedEPStat : public EntryPointStat {
 
 public:
   explicit UnsignedEPStat(llvm::StringLiteral Name);
-  unsigned value() const { return Value.value_or(0); }
+  std::optional<unsigned> value() const { return Value; }
   void reset() { Value.reset(); }
   void set(unsigned V) {
     assert(!Value.has_value());
diff --git a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
index 29d2c4512d470..f70ae1a3e8a96 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
+++ b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
@@ -125,6 +125,7 @@ add_clang_library(clangStaticAnalyzerCheckers
   UninitializedObject/UninitializedPointee.cpp
   UnixAPIChecker.cpp
   UnreachableCodeChecker.cpp
+  UnsignedStatTesterChecker.cpp
   VforkChecker.cpp
   VLASizeChecker.cpp
   VAListChecker.cpp
diff --git a/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp
new file mode 100644
index 0000000000000..a302f506a285a
--- /dev/null
+++ b/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp
@@ -0,0 +1,59 @@
+//== UnsignedStatTesterChecker.cpp --------------------------- -*- C++ -*--=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This checker demonstrates the use of UnsignedEPStat for per-entry-point
+// statistics. It conditionally sets a statistic based on the entry point name.
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/Checker.h"
+#include "clang/StaticAnalyzer/Core/CheckerManager.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
+
+using namespace clang;
+using namespace ento;
+
+#define DEBUG_TYPE "UnsignedStatTester"
+
+static UnsignedEPStat DemoStat("DemoStat");
+
+namespace {
+class UnsignedStatTesterChecker : public Checker<check::BeginFunction> {
+public:
+  void checkBeginFunction(CheckerContext &C) const;
+};
+} // namespace
+
+void UnsignedStatTesterChecker::checkBeginFunction(CheckerContext &C) const {
+  const Decl *D = C.getLocationContext()->getDecl();
+  if (!D)
+    return;
+
+  std::string Name =
+      D->getAsFunction() ? D->getAsFunction()->getNameAsString() : "";
+
+  // Conditionally set the statistic based on the function name
+  if (Name == "func_one") {
+    DemoStat.set(1);
+  } else if (Name == "func_two") {
+    DemoStat.set(2);
+  } else if (Name == "func_three") {
+    DemoStat.set(3);
+  }
+  // For any other function (e.g., "func_none"), don't set the statistic
+}
+
+void ento::registerUnsignedStatTesterChecker(CheckerManager &mgr) {
+  mgr.registerChecker<UnsignedStatTesterChecker>();
+}
+
+bool ento::shouldRegisterUnsignedStatTesterChecker(const CheckerManager &mgr) {
+  return true;
+}
diff --git a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
index abfb176d6384d..33f5ccf6002ba 100644
--- a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
+++ b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
@@ -24,15 +24,21 @@ using namespace ento;
 
 namespace {
 struct Registry {
+  std::vector<UnsignedEPStat *> ExplicitlySetStats;
+  std::vector<UnsignedMaxEPStat *> MaxStats;
   std::vector<CounterEPStat *> CounterStats;
-  std::vector<UnsignedMaxEPStat *> UnsignedMaxStats;
-  std::vector<UnsignedEPStat *> UnsignedStats;
 
   bool IsLocked = false;
 
   struct Snapshot {
     const Decl *EntryPoint;
-    std::vector<unsigned> UnsignedStatValues;
+    // Explicitly set statistics may not have a value set, so they are separate
+    // from other unsigned statistics
+    std::vector<std::optional<unsigned>> ExplicitlySetStatValues;
+    // These are counting and maximizing statistics that initialize to 0, which
+    // is meaningful even if they are never updated, so their value is always
+    // present.
+    std::vector<unsigned> MaxOrCountStatValues;
 
     void dumpAsCSV(llvm::raw_ostream &OS) const;
   };
@@ -46,9 +52,12 @@ static llvm::ManagedStatic<Registry> StatsRegistry;
 
 namespace {
 template <typename Callback> void enumerateStatVectors(const Callback &Fn) {
+  // This order is important, it matches the order of the Snapshot fields:
+  // - ExplicitlySetStatValues
+  Fn(StatsRegistry->ExplicitlySetStats);
+  // - MaxOrCountStatValues
+  Fn(StatsRegistry->MaxStats);
   Fn(StatsRegistry->CounterStats);
-  Fn(StatsRegistry->UnsignedMaxStats);
-  Fn(StatsRegistry->UnsignedStats);
 }
 } // namespace
 
@@ -101,30 +110,37 @@ UnsignedMaxEPStat::UnsignedMaxEPStat(llvm::StringLiteral Name)
     : EntryPointStat(Name) {
   assert(!StatsRegistry->IsLocked);
   assert(!isRegistered(Name));
-  StatsRegistry->UnsignedMaxStats.push_back(this);
+  StatsRegistry->MaxStats.push_back(this);
 }
 
 UnsignedEPStat::UnsignedEPStat(llvm::StringLiteral Name)
     : EntryPointStat(Name) {
   assert(!StatsRegistry->IsLocked);
   assert(!isRegistered(Name));
-  StatsRegistry->UnsignedStats.push_back(this);
+  StatsRegistry->ExplicitlySetStats.push_back(this);
 }
 
-static std::vector<unsigned> consumeUnsignedStats() {
-  std::vector<unsigned> Result;
-  Result.reserve(StatsRegistry->CounterStats.size() +
-                 StatsRegistry->UnsignedMaxStats.size() +
-                 StatsRegistry->UnsignedStats.size());
-  for (auto *M : StatsRegistry->CounterStats) {
+static std::vector<std::optional<unsigned>>
+consumeExplicitlySetStats() {
+  std::vector<std::optional<unsigned>> Result;
+  Result.reserve(StatsRegistry->ExplicitlySetStats.size());
+  for (auto *M : StatsRegistry->ExplicitlySetStats) {
     Result.push_back(M->value());
     M->reset();
   }
-  for (auto *M : StatsRegistry->UnsignedMaxStats) {
+  return Result;
+}
+
+static std::vector<unsigned> consumeMaxAndCounterStats() {
+  std::vector<unsigned> Result;
+  Result.reserve(StatsRegistry->CounterStats.size() +
+                 StatsRegistry->MaxStats.size());
+  // Order is important, it must match the order in enumerateStatVectors
+  for (auto *M : StatsRegistry->MaxStats) {
     Result.push_back(M->value());
     M->reset();
   }
-  for (auto *M : StatsRegistry->UnsignedStats) {
+  for (auto *M : StatsRegistry->CounterStats) {
     Result.push_back(M->value());
     M->reset();
   }
@@ -150,20 +166,33 @@ static std::string getUSR(const Decl *D) {
 }
 
 void Registry::Snapshot::dumpAsCSV(llvm::raw_ostream &OS) const {
+  auto printAsUnsignOpt = [&OS](std::optional<unsigned> U) {
+    OS << (U.has_value() ? std::to_string(*U) : "");
+  };
+  auto commaIfNeeded = [&OS](const auto &Vec1, const auto &Vec2) {
+    if (!Vec1.empty() && !Vec2.empty())
+      OS << ",";
+  };
+  auto printAsUnsigned = [&OS](unsigned U) { OS << U; };
+
   OS << '"';
   llvm::printEscapedString(getUSR(EntryPoint), OS);
   OS << "\",\"";
   OS << StatsRegistry->EscapedCPPFileName << "\",\"";
   llvm::printEscapedString(
       clang::AnalysisDeclContext::getFunctionName(EntryPoint), OS);
-  OS << "\"";
-  OS << (UnsignedStatValues.empty() ? "" : ",");
-  llvm::interleave(UnsignedStatValues, OS, [&OS](unsigned U) { OS << U; }, ",");
+  OS << "\",";
+  llvm::interleave(ExplicitlySetStatValues, OS, printAsUnsignOpt, ",");
+  commaIfNeeded(ExplicitlySetStatValues, MaxOrCountStatValues);
+  llvm::interleave(MaxOrCountStatValues, OS, printAsUnsigned, ",");
 }
 
 void EntryPointStat::takeSnapshot(const Decl *EntryPoint) {
-  auto UnsignedValues = consumeUnsignedStats();
-  StatsRegistry->Snapshots.push_back({EntryPoint, std::move(UnsignedValues)});
+  auto ExplicitlySetValues = consumeExplicitlySetStats();
+  auto MaxOrCounterValues = consumeMaxAndCounterStats();
+  StatsRegistry->Snapshots.push_back({EntryPoint,
+                                      std::move(ExplicitlySetValues),
+                                      std::move(MaxOrCounterValues)});
 }
 
 void EntryPointStat::dumpStatsAsCSV(llvm::StringRef FileName) {
diff --git a/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp b/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
index cf01e2f37c662..e97004f1bbe52 100644
--- a/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
+++ b/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
@@ -39,6 +39,7 @@
 #include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/Timer.h"
 #include "llvm/Support/raw_ostream.h"
+#include <cmath>
 #include <memory>
 #include <utility>
 
@@ -142,7 +143,7 @@ class AnalysisConsumer : public AnalysisASTConsumer,
     DigestAnalyzerOptions();
 
     if (Opts.AnalyzerDisplayProgress || Opts.PrintStats ||
-        Opts.ShouldSerializeStats) {
+        Opts.ShouldSerializeStats || !Opts.DumpEntryPointStatsToCSV.empty()) {
       AnalyzerTimers = std::make_unique<llvm::TimerGroup>(
           "analyzer", "Analyzer timers");
       SyntaxCheckTimer = std::make_unique<llvm::Timer>(
@@ -788,6 +789,8 @@ void AnalysisConsumer::RunPathSensitiveChecks(Decl *D,
     ExprEngineTimer->stopTimer();
     llvm::TimeRecord ExprEngineEndTime = ExprEngineTimer->getTotalTime();
     ExprEngineEndTime -= ExprEngineStartTime;
+    PathRunningTime.set(static_cast<unsigned>(
+        std::lround(ExprEngineEndTime.getWallTime() * 1000)));
     DisplayTime(ExprEngineEndTime);
   }
 
diff --git a/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp b/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
index 9cbe04550a8d3..395d2b2fefa65 100644
--- a/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
+++ b/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
@@ -1,13 +1,21 @@
 // REQUIRES: asserts
-// RUN: %clang_analyze_cc1 -analyzer-checker=core \
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.UnsignedStatTester \
 // RUN:    -analyzer-config dump-entry-point-stats-to-csv="%t.csv" \
 // RUN:    -verify %s
 // RUN: %csv2json "%t.csv" | FileCheck --check-prefix=CHECK %s
 //
 // CHECK:      {
 // CHECK-NEXT:   "c:@F@fib#i#": {
-// CHECK-NEXT:     "File": "{{.*}}entry-point-stats.cpp",
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
 // CHECK-NEXT:     "DebugName": "fib(unsigned int)",
+// CHECK-NEXT:     "DemoStat": "",
+// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocks": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocksUnreachable": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumCTUSteps": "{{[0-9]+}}",
@@ -33,18 +41,31 @@
 // CHECK-NEXT:     "NumTimesZ3SpendsTooMuchTimeOnASingleEQClass": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumTimesZ3TimedOut": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumZ3QueriesDone": "{{[0-9]+}}",
-// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}",
+// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}"
+// CHECK-NEXT:   },
+// CHECK-NEXT:   "c:@F@func_one#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "func_one()",
+// CHECK:          "DemoStat": "1",
+//             .... not interesting statistics
+// CHECK:        },
+// CHECK-NEXT:   "c:@F@func_two#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "func_two()",
+// CHECK:          "DemoStat": "2",
+//             .... not interesting statistics
+// CHECK:        },
+// CHECK-NEXT:   "c:@F@main#I#**C#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "main(int, char **)",
+// CHECK-NEXT:     "DemoStat": "",
+// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}"
-// CHECK-NEXT:   },
-// CHECK-NEXT:   "c:@F@main#I#**C#": {
-// CHECK-NEXT:     "File": "{{.*}}entry-point-stats.cpp",
-// CHECK-NEXT:     "DebugName": "main(int, char **)",
 // CHECK-NEXT:     "NumBlocks": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocksUnreachable": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumCTUSteps": "{{[0-9]+}}",
@@ -70,14 +91,7 @@
 // CHECK-NEXT:     "NumTimesZ3SpendsTooMuchTimeOnASingleEQClass": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumTimesZ3TimedOut": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumZ3QueriesDone": "{{[0-9]+}}",
-// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}"
+// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}"
 // CHECK-NEXT:   }
 // CHECK-NEXT: }
 // CHECK-NOT: non_entry_point
@@ -102,3 +116,6 @@ int main(int argc, char **argv) {
   int i = non_entry_point(argc);
   return i;
 }
+
+void func_one() {}
+void func_two() {}

@necto
Copy link
Contributor Author

necto commented Oct 10, 2025

FYI, I used ClaudeCode to write the example checker (UnsignedStatTesterChecker) that exercises an unsigned explicitly-set statistic.

This is a second patch out of three, the first one (#162817) removing the unneeded boolean statistic type, and the final one will add SyntaxRunningTime similar to how PathRunningTime is recorded now.

@github-actions
Copy link

github-actions bot commented Oct 10, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@necto
Copy link
Contributor Author

necto commented Oct 10, 2025

I recommend review commit-by-commit

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me overall, but I have several minor stylistic suggestions, especially in the tests generated by Claude.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the new test.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Wait for another approval

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

(Note that I didn't go through the inline comments that are marked as "outdated", I just checked the end result.)

@necto necto merged commit bb3b020 into llvm:main Oct 13, 2025
11 checks passed
@necto necto deleted the az/empty-metrics branch October 13, 2025 15:24
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
…missing PathRunningTime metric (llvm#162839)

To avoid information loss, introduce a difference between unset stats
and 0 for statistics that are supposed to be set once per entry point.
Now, if the statistic is not set for an entry point, the corresponding
CSV cell will be empty, and not 0.

Thanks to this differentiation, I noticed that `PathRunningTime` was
actually never set, and fixed that.
Additionally, this patch enables the timers if
`DumpEntryPointStatsToCSV` is set, because in most cases you dump these
stats to get a detailed view on analyzer performance.

Finally, I added a dedicated debug checker that demonstrates the use of
a statistic and tested the set and unset scenarios explicitly.

--

CPP-7097

---------

Co-authored-by: Donát Nagy <donat.nagy@ericsson.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:static analyzer clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants