@necto necto commented Oct 10, 2025

The complexity of maintaining an extra kind of metric has not justified itself, as it is not used upstream, and we have only a single use of boolean stats per entry point downstream.

As I will do downstream, you can use an unsigned statistic type with values 0 and 1 to model a boolean flag.

--

CPP-7097
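
The suggested replacement can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual `UnsignedEPStat` API: `SetOnceUnsignedStat` and `setFlag` are hypothetical names, and the set-once behavior is modeled on the removed `BoolEPStat` shown in the diff.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a set-once unsigned statistic. It is NOT the real
// clang::ento::UnsignedEPStat; it only mirrors the "assigned at most once per
// entry point" shape of the removed BoolEPStat.
class SetOnceUnsignedStat {
  std::string Name;
  std::optional<unsigned> Value;

public:
  explicit SetOnceUnsignedStat(std::string Name) : Name(std::move(Name)) {}

  const std::string &name() const { return Name; }

  // A statistic that was never set reads as 0.
  unsigned value() const { return Value.value_or(0); }

  void set(unsigned V) {
    assert(!Value.has_value() && "statistic may be set only once");
    Value = V;
  }

  // Called between entry points so the statistic can be set again.
  void reset() { Value.reset(); }
};

// Hypothetical helper: encode a boolean flag as 0 or 1.
inline void setFlag(SetOnceUnsignedStat &Stat, bool Flag) {
  Stat.set(Flag ? 1u : 0u);
}
```

One visible difference from the old boolean kind is that such a flag would appear in the CSV as `0` or `1` rather than `false` or `true`.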

llvmbot added the clang and clang:static analyzer labels Oct 10, 2025

llvmbot commented Oct 10, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Arseniy Zaostrovnykh (necto)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/162817.diff

3 Files Affected:

  • (modified) clang/docs/analyzer/developer-docs/Statistics.rst (-1)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h (-13)
  • (modified) clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp (+3-26)
diff --git a/clang/docs/analyzer/developer-docs/Statistics.rst b/clang/docs/analyzer/developer-docs/Statistics.rst
index 595b44dd95753..4f2484a89a6af 100644
--- a/clang/docs/analyzer/developer-docs/Statistics.rst
+++ b/clang/docs/analyzer/developer-docs/Statistics.rst
@@ -22,7 +22,6 @@ However, note that with ``LLVM_ENABLE_STATS`` disabled, only storage of the valu
 If you want to define a statistic only for entry point, EntryPointStats.h has four classes at your disposal:
 
 
-- ``BoolEPStat`` - a boolean value assigned at most once per entry point. For example: "has the inline limit been reached".
 - ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body".
 - ``CounterEPStat`` - an additive statistic. It starts with 0 and you can add to it as many times as needed. For example: "the number of bugs discovered".
 - ``UnsignedMaxEPStat`` - a maximizing statistic. It starts with 0 and when you join it with a value, it picks the maximum of the previous value and the new one. For example, "the longest execution path of a bug".
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
index 448e40269ca2d..389f17d36e65a 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
@@ -42,19 +42,6 @@ class EntryPointStat {
   llvm::StringLiteral Name;
 };
 
-class BoolEPStat : public EntryPointStat {
-  std::optional<bool> Value = {};
-
-public:
-  explicit BoolEPStat(llvm::StringLiteral Name);
-  unsigned value() const { return Value && *Value; }
-  void set(bool V) {
-    assert(!Value.has_value());
-    Value = V;
-  }
-  void reset() { Value = {}; }
-};
-
 // used by CounterEntryPointTranslationUnitStat
 class CounterEPStat : public EntryPointStat {
   using EntryPointStat::EntryPointStat;
diff --git a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
index 62ae62f2f2154..abfb176d6384d 100644
--- a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
+++ b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
@@ -24,7 +24,6 @@ using namespace ento;
 
 namespace {
 struct Registry {
-  std::vector<BoolEPStat *> BoolStats;
   std::vector<CounterEPStat *> CounterStats;
   std::vector<UnsignedMaxEPStat *> UnsignedMaxStats;
   std::vector<UnsignedEPStat *> UnsignedStats;
@@ -33,7 +32,6 @@ struct Registry {
 
   struct Snapshot {
     const Decl *EntryPoint;
-    std::vector<bool> BoolStatValues;
     std::vector<unsigned> UnsignedStatValues;
 
     void dumpAsCSV(llvm::raw_ostream &OS) const;
@@ -48,7 +46,6 @@ static llvm::ManagedStatic<Registry> StatsRegistry;
 
 namespace {
 template <typename Callback> void enumerateStatVectors(const Callback &Fn) {
-  Fn(StatsRegistry->BoolStats);
   Fn(StatsRegistry->CounterStats);
   Fn(StatsRegistry->UnsignedMaxStats);
   Fn(StatsRegistry->UnsignedStats);
@@ -94,12 +91,6 @@ void EntryPointStat::lockRegistry(llvm::StringRef CPPFileName) {
   return Result;
 }
 
-BoolEPStat::BoolEPStat(llvm::StringLiteral Name) : EntryPointStat(Name) {
-  assert(!StatsRegistry->IsLocked);
-  assert(!isRegistered(Name));
-  StatsRegistry->BoolStats.push_back(this);
-}
-
 CounterEPStat::CounterEPStat(llvm::StringLiteral Name) : EntryPointStat(Name) {
   assert(!StatsRegistry->IsLocked);
   assert(!isRegistered(Name));
@@ -165,28 +156,14 @@ void Registry::Snapshot::dumpAsCSV(llvm::raw_ostream &OS) const {
   OS << StatsRegistry->EscapedCPPFileName << "\",\"";
   llvm::printEscapedString(
       clang::AnalysisDeclContext::getFunctionName(EntryPoint), OS);
-  OS << "\",";
-  auto PrintAsBool = [&OS](bool B) { OS << (B ? "true" : "false"); };
-  llvm::interleave(BoolStatValues, OS, PrintAsBool, ",");
-  OS << ((BoolStatValues.empty() || UnsignedStatValues.empty()) ? "" : ",");
+  OS << "\"";
+  OS << (UnsignedStatValues.empty() ? "" : ",");
   llvm::interleave(UnsignedStatValues, OS, [&OS](unsigned U) { OS << U; }, ",");
 }
 
-static std::vector<bool> consumeBoolStats() {
-  std::vector<bool> Result;
-  Result.reserve(StatsRegistry->BoolStats.size());
-  for (auto *M : StatsRegistry->BoolStats) {
-    Result.push_back(M->value());
-    M->reset();
-  }
-  return Result;
-}
-
 void EntryPointStat::takeSnapshot(const Decl *EntryPoint) {
-  auto BoolValues = consumeBoolStats();
   auto UnsignedValues = consumeUnsignedStats();
-  StatsRegistry->Snapshots.push_back(
-      {EntryPoint, std::move(BoolValues), std::move(UnsignedValues)});
+  StatsRegistry->Snapshots.push_back({EntryPoint, std::move(UnsignedValues)});
 }
 
 void EntryPointStat::dumpStatsAsCSV(llvm::StringRef FileName) {
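
For readers skimming the `dumpAsCSV` hunk, the resulting row layout can be approximated with the standalone sketch below. `formatRow` is a hypothetical helper, not part of the patch; it skips the escaping done by `llvm::printEscapedString`, and it assumes the opening quote of the file name is emitted just before the hunk's context lines.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper approximating one CSV row after this change:
// two quoted cells (file name, entry-point name) followed by the unsigned
// values. No escaping is performed here, unlike the real code.
std::string formatRow(const std::string &File, const std::string &Function,
                      const std::vector<unsigned> &Values) {
  std::ostringstream OS;
  OS << '"' << File << "\",\"" << Function << '"';
  // Equivalent to "emit a comma only if there are values, then interleave
  // the values with commas": every value is preceded by exactly one comma.
  for (unsigned V : Values)
    OS << ',' << V;
  return OS.str();
}
```

With no statistics registered, the row ends right after the closing quote of the entry-point name, which is the case the new `UnsignedStatValues.empty()` check handles.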

necto commented Oct 10, 2025

This is the first of three steps I plan as a replacement for #162089. The end goal is to record both PathRunningTime and SyntaxRunningTime per entry point.

  • in this step, I remove boolean statistics, as I don't find them carrying their weight in terms of complexity.
  • in the next step, I will propose a more accurate representation in the CSV of unsigned statistics that were never set for a particular entry point: as empty cells, rather than 0
  • in the final step, I will add the SyntaxRunningTime statistic in addition to the existing PathRunningTime statistic; it will be set only for entry points that underwent syntax analysis
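
The second step is only planned here, so the following is a guess at its shape rather than the follow-up patch: a sketch of how never-set values could be printed as empty CSV cells, assuming each snapshot value is held as a `std::optional<unsigned>`. `MaybeUnsigned` and `formatCells` are hypothetical names.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical: a snapshot value that remembers whether it was ever set.
using MaybeUnsigned = std::optional<unsigned>;

// Hypothetical helper: joins the values with commas, printing nothing for
// a value that was never set, so "unset" and "set to 0" stay distinguishable.
std::string formatCells(const std::vector<MaybeUnsigned> &Values) {
  std::ostringstream OS;
  bool First = true;
  for (const MaybeUnsigned &V : Values) {
    if (!First)
      OS << ',';
    First = false;
    if (V)
      OS << *V;
  }
  return OS.str();
}
```

Under this assumption, a statistic such as SyntaxRunningTime would yield an empty cell for entry points that never underwent syntax analysis, instead of a misleading 0.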

@NagyDonat NagyDonat left a comment

LGTM, very straightforward change. Thanks for this cleanup!

necto commented Oct 10, 2025

@balazs-benics-sonarsource is traveling right now, but we discussed this change today and he agrees with it, so I will merge to unblock my follow-up PR.

@necto necto merged commit e297184 into llvm:main Oct 10, 2025
14 checks passed
@necto necto deleted the az/remove-bool-ep-metrics branch October 10, 2025 12:53
DharuniRAcharya pushed a commit to DharuniRAcharya/llvm-project that referenced this pull request Oct 13, 2025
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025