[analyzer] Bring unix.cstring.UninitializedRead checker out of alpha#196292
[analyzer] Bring unix.cstring.UninitializedRead checker out of alpha#196292gamesh411 wants to merge 1 commit into
Conversation
There have been recent improvements (llvm#186802) and fixes (llvm#191061) related to this checker. The reports are no longer noisy, as evaluated on 14 OS projects.
|
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-static-analyzer-1 Author: Endre Fülöp (gamesh411) ChangesThere have been recent improvements (#186802) and fixes (#191061) related to this checker. The reports are no longer noisy, as evaluated on 14 OS projects. Full diff: https://github.com/llvm/llvm-project/pull/196292.diff 11 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index cb19b80b7e994..b4fe7f2ace1c1 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -764,6 +764,12 @@ Crash and bug fixes
- Improvements
- Moved checkers
+
+Moved checkers
+^^^^^^^^^^^^^^
+
+- The checker ``unix.cstring.UninitializedRead`` is now out of alpha.
+
.. _release-notes-sanitizers:
Sanitizers
diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index 61f591916018e..a7b1e1c882e17 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -2353,6 +2353,40 @@ Check for null pointers being passed as arguments to C string functions:
return strlen(0); // warn
}
+.. _unix-cstring-UninitializedRead:
+
+unix.cstring.UninitializedRead (C)
+""""""""""""""""""""""""""""""""""""""""
+Check for uninitialized reads from common memory copy/manipulation functions such as:
+ ``memcpy, mempcpy, memmove, memcmp, strcmp, strncmp, strcpy, strlen, strsep`` and many more.
+
+.. code-block:: c
+
+ void test() {
+ char src[10];
+ char dst[5];
+ memcpy(dst,src,sizeof(dst)); // warn: Bytes string function accesses uninitialized/garbage values
+ }
+
+Limitations:
+
+ - Due to limitations of the memory modeling in the analyzer, one can likely
+ observe some false-positives of the following kind:
+
+ .. code-block:: c
+
+ void false_positive() {
+ int src[] = {1, 2, 3, 4};
+ int dst[5] = {0};
+ memcpy(dst, src, 4 * sizeof(int)); // false-positive:
+ // The 'src' buffer was correctly initialized, yet we cannot conclude
+ // that since the analyzer could not see a direct initialization of the
+ // very last byte of the source buffer.
+ }
+
+ More details at the corresponding `GitHub issue <https://github.com/llvm/llvm-project/issues/43459>`_.
+
+
.. _unix-StdCLibraryFunctions:
unix.StdCLibraryFunctions (C)
@@ -3701,39 +3735,6 @@ the analyzer cannot detect embedded NULL characters when determining the string
memcpy(buffer, str, sizeof(str)); // warn
}
-.. _alpha-unix-cstring-UninitializedRead:
-
-alpha.unix.cstring.UninitializedRead (C)
-""""""""""""""""""""""""""""""""""""""""
-Check for uninitialized reads from common memory copy/manipulation functions such as:
- ``memcpy, mempcpy, memmove, memcmp, strcmp, strncmp, strcpy, strlen, strsep`` and many more.
-
-.. code-block:: c
-
- void test() {
- char src[10];
- char dst[5];
- memcpy(dst,src,sizeof(dst)); // warn: Bytes string function accesses uninitialized/garbage values
- }
-
-Limitations:
-
- - Due to limitations of the memory modeling in the analyzer, one can likely
- observe a lot of false-positive reports like this:
-
- .. code-block:: c
-
- void false_positive() {
- int src[] = {1, 2, 3, 4};
- int dst[5] = {0};
- memcpy(dst, src, 4 * sizeof(int)); // false-positive:
- // The 'src' buffer was correctly initialized, yet we cannot conclude
- // that since the analyzer could not see a direct initialization of the
- // very last byte of the source buffer.
- }
-
- More details at the corresponding `GitHub issue <https://github.com/llvm/llvm-project/issues/43459>`_.
-
alpha.WebKit
^^^^^^^^^^^^
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 6b9e0b50e1f59..84c152cd72bd1 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -460,6 +460,11 @@ def CStringSyntaxChecker : Checker<"BadSizeArg">,
Dependencies<[CStringModeling]>,
Documentation<HasDocumentation>;
+def CStringUninitializedRead : Checker<"UninitializedRead">,
+ HelpText<"Checks if the string manipulation function would read uninitialized bytes">,
+ Dependencies<[CStringModeling]>,
+ Documentation<HasDocumentation>;
+
} // end "unix.cstring"
let ParentPackage = CStringAlpha in {
@@ -474,11 +479,6 @@ def CStringBufferOverlap : Checker<"BufferOverlap">,
Dependencies<[CStringModeling]>,
Documentation<HasDocumentation>;
-def CStringUninitializedRead : Checker<"UninitializedRead">,
- HelpText<"Checks if the string manipulation function would read uninitialized bytes">,
- Dependencies<[CStringModeling]>,
- Documentation<HasDocumentation>;
-
} // end "alpha.unix.cstring"
let ParentPackage = Unix in {
diff --git a/clang/test/Analysis/analyzer-enabled-checkers.c b/clang/test/Analysis/analyzer-enabled-checkers.c
index c1ed882069073..8371b0e7a410a 100644
--- a/clang/test/Analysis/analyzer-enabled-checkers.c
+++ b/clang/test/Analysis/analyzer-enabled-checkers.c
@@ -55,6 +55,7 @@
// CHECK-NEXT: unix.cstring.BadSizeArg
// CHECK-NEXT: unix.cstring.NotNullTerminated
// CHECK-NEXT: unix.cstring.NullArg
+// CHECK-NEXT: unix.cstring.UninitializedRead
int main() {
int i;
diff --git a/clang/test/Analysis/bstring.c b/clang/test/Analysis/bstring.c
index 810241accffa2..b337c71eb02c7 100644
--- a/clang/test/Analysis/bstring.c
+++ b/clang/test/Analysis/bstring.c
@@ -1,32 +1,32 @@
// RUN: %clang_analyze_cc1 -verify %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=unix.cstring \
+// RUN: -analyzer-disable-checker=unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=alpha.unix.cstring \
-// RUN: -analyzer-disable-checker=alpha.unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false
//
// RUN: %clang_analyze_cc1 -verify %s -DUSE_BUILTINS \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=unix.cstring \
+// RUN: -analyzer-disable-checker=unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=alpha.unix.cstring \
-// RUN: -analyzer-disable-checker=alpha.unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false
//
// RUN: %clang_analyze_cc1 -verify %s -DVARIANT \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=unix.cstring \
+// RUN: -analyzer-disable-checker=unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=alpha.unix.cstring \
-// RUN: -analyzer-disable-checker=alpha.unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false
//
// RUN: %clang_analyze_cc1 -verify %s -DUSE_BUILTINS -DVARIANT \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=unix.cstring \
+// RUN: -analyzer-disable-checker=unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=alpha.unix.cstring \
-// RUN: -analyzer-disable-checker=alpha.unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false
diff --git a/clang/test/Analysis/bstring.cpp b/clang/test/Analysis/bstring.cpp
index 2f1712648d8e1..a5fd56a19eb1a 100644
--- a/clang/test/Analysis/bstring.cpp
+++ b/clang/test/Analysis/bstring.cpp
@@ -19,12 +19,13 @@
// RUN: %{analyzer} \
// RUN: -analyzer-checker=alpha.unix.cstring.BufferOverlap \
// RUN: -analyzer-checker=unix.cstring.NotNullTerminated \
+// RUN: -analyzer-disable-checker=unix.cstring.UninitializedRead \
// RUN: -verify=expected,no-oob %s
// UninitializedRead enabled without OutOfBounds: verifies that
// UninitializedRead works independently of OutOfBounds.
// RUN: %{analyzer} \
-// RUN: -analyzer-checker=alpha.unix.cstring.UninitializedRead \
+// RUN: -analyzer-checker=unix.cstring.UninitializedRead \
// RUN: -verify=expected,no-oob,uninit %s
#include "Inputs/system-header-simulator-cxx.h"
diff --git a/clang/test/Analysis/bstring_UninitRead.c b/clang/test/Analysis/bstring_UninitRead.c
index 45e38dd316298..7557c9641781c 100644
--- a/clang/test/Analysis/bstring_UninitRead.c
+++ b/clang/test/Analysis/bstring_UninitRead.c
@@ -1,5 +1,5 @@
// RUN: %clang_analyze_cc1 -verify %s \
-// RUN: -analyzer-checker=core,alpha.unix.cstring
+// RUN: -analyzer-checker=core,unix.cstring.UninitializedRead
//===----------------------------------------------------------------------===//
// mempcpy() using character array. This is the easiest case, as memcpy
diff --git a/clang/test/Analysis/cstring-uninitread-notes.c b/clang/test/Analysis/cstring-uninitread-notes.c
index b62519a85c8cc..6a934078566c3 100644
--- a/clang/test/Analysis/cstring-uninitread-notes.c
+++ b/clang/test/Analysis/cstring-uninitread-notes.c
@@ -1,5 +1,5 @@
// RUN: %clang_analyze_cc1 -verify %s \
-// RUN: -analyzer-checker=core,alpha.unix.cstring \
+// RUN: -analyzer-checker=unix.cstring.UninitializedRead \
// RUN: -analyzer-output=text
#include "Inputs/system-header-simulator.h"
@@ -19,7 +19,7 @@ void returning_without_writing_to_memcpy(const char *src, unsigned size) {
maybeWrite(src, size, block); // expected-note{{Returning from 'maybeWrite'}}
int buf[8 * 8];
- memcpy(buf, &block[0], 8); // expected-warning{{The first element of the 2nd argument is undefined [alpha.unix.cstring.UninitializedRead]}}
+ memcpy(buf, &block[0], 8); // expected-warning{{The first element of the 2nd argument is undefined [unix.cstring.UninitializedRead]}}
// expected-note@-1{{The first element of the 2nd argument is undefined}}
// expected-note@-2{{Other elements might also be undefined}}
}
diff --git a/clang/test/Analysis/infeasible-crash.c b/clang/test/Analysis/infeasible-crash.c
index d4e6a66f85bcf..062d13f6fe63b 100644
--- a/clang/test/Analysis/infeasible-crash.c
+++ b/clang/test/Analysis/infeasible-crash.c
@@ -1,6 +1,6 @@
// RUN: %clang_analyze_cc1 %s \
// RUN: -analyzer-checker=core \
-// RUN: -analyzer-checker=alpha.unix.cstring.OutOfBounds,alpha.unix.cstring.UninitializedRead \
+// RUN: -analyzer-checker=alpha.unix.cstring.OutOfBounds,unix.cstring.UninitializedRead \
// RUN: -analyzer-config eagerly-assume=false \
// RUN: -verify
diff --git a/clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c b/clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c
index 4de004e00687a..e1f365cdfbcf6 100644
--- a/clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c
+++ b/clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c
@@ -63,6 +63,7 @@
// CHECK-NEXT: unix.cstring.BadSizeArg
// CHECK-NEXT: unix.cstring.NotNullTerminated
// CHECK-NEXT: unix.cstring.NullArg
+// CHECK-NEXT: unix.cstring.UninitializedRead
int main() {
int i;
diff --git a/clang/test/Analysis/wstring.c b/clang/test/Analysis/wstring.c
index 9c60d39ff502e..340a01c047a7e 100644
--- a/clang/test/Analysis/wstring.c
+++ b/clang/test/Analysis/wstring.c
@@ -2,7 +2,7 @@
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=unix.cstring \
// RUN: -analyzer-checker=alpha.unix.cstring \
-// RUN: -analyzer-disable-checker=alpha.unix.cstring.UninitializedRead \
+// RUN: -analyzer-disable-checker=unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false
//
@@ -10,7 +10,7 @@
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=unix.cstring \
// RUN: -analyzer-checker=alpha.unix.cstring \
-// RUN: -analyzer-disable-checker=alpha.unix.cstring.UninitializedRead \
+// RUN: -analyzer-disable-checker=unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false
|
|
At first glance it has similar problems as the ArrayBound. It's frequently difficult to keep track of the buffer and where it points to, and incidentally trick the engine to believe that it points to outside of the buffer - I speculate. I looked at some of the samples, and for example:
As a generic note, the error message could/should be probably improved, because right now the
If you all still believe that these issues should not block this move, I could look at the other examples to form a grounded opinion, but right now I'm a bit concerned. |
|
Quick replies to to the quick reply (Iwill try to give a proper review later 😅):
IIRC that message comes from the bug reporter visitors and we get it automagically via
EDIT: I misremembered the internals of the engine and the stuff below is completely nonsense, the engine doesn't perform repeated bounds checking. The undefined values are coming from some other bug of the
|
|
I looked at a few of the analysis reports together with Endre (= @gamesh411) and we found that:
Endre is also working on changing the "was initialized here" message to "was left uninitialized here" in the notes. We will revisit this PR (and the inspection of the rest of the analysis results) when Endre is done with these two subtasks. |
The error message seems to deliberately only mention if either the beginning of the read buffer is uninitialized, or the end of this same buffer is uninitialized. Tracking a more precise location could be done, but that would need a linear scan over that range of the buffer, and this is why I think the decision was made to not do it, even though it would provide a more precise location of where reading uninitialized values start. |
There have been recent improvements (#186802) and fixes (#191061) related to this checker. The reports are no longer noisy, as evaluated on 14 OS projects.