Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

false positive: Argument to free() is the address of ..., which is not memory allocated by malloc() #81597

Closed
phlax opened this issue Feb 13, 2024 · 12 comments · Fixed by #82614 or #88416
Assignees
Labels
clang:static analyzer false-positive Warning fires when it should not

Comments

@phlax
Copy link

phlax commented Feb 13, 2024

we are seeing the above error in our CI (running clang-tidy on the https://github.com/envoyproxy/envoy project)

it would appear this is a false positive that should not trigger on a member function

full log:

/b/f/w/source/common/stats/symbol_table.cc:623:3: error: Argument to free() is the address of a local stack variable, which is not memory allocated by malloc() [clang-analyzer-unix.Malloc,-warnings-as-errors]
  table.free(statName());
  ^
/b/f/w/source/common/stats/symbol_table.cc:629:5: note: Calling 'StatNameStorage::free'
    storage.free(symbol_table_);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:623:3: note: Argument to free() is the address of a local stack variable, which is not memory allocated by malloc()
  table.free(statName());
  ^          ~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:736:5: error: Argument to free() is the address of the parameter 'stat_name', which is not memory allocated by malloc() [clang-analyzer-unix.Malloc,-warnings-as-errors]
    symbol_table.free(stat_name);
    ^
/b/f/w/source/common/stats/symbol_table.cc:735:3: note: Calling 'StatNameList::iterate'
  iterate([&symbol_table](StatName stat_name) -> bool {
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:725:24: note: Assuming 'i' is < 'num_elements'
  for (uint32_t i = 0; i < num_elements; ++i) {
                       ^~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:725:3: note: Loop condition is true.  Entering loop body
  for (uint32_t i = 0; i < num_elements; ++i) {
  ^
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:736:5: note: Argument to free() is the address of the parameter 'stat_name', which is not memory allocated by malloc()
    symbol_table.free(stat_name);
    ^                 ~~~~~~~~~
@phlax phlax changed the title false positive: `Argument to free() is the address of ..., which is not memory allocated by malloc() false positive: Argument to free() is the address of ..., which is not memory allocated by malloc() Feb 13, 2024
@phlax
Copy link
Author

phlax commented Feb 13, 2024

cc @ravenblackx

@chrchr-github
Copy link

Reduced:

struct S {};
struct T {
	void free(const S& s);
};
void f(T& t) {
	S s;
	t.free(s);
}
<source>:7:2: warning: Argument to free() is the address of the local variable 's', which is not memory allocated by malloc() [clang-analyzer-unix.Malloc]
    7 |         t.free(s);
      |         ^      ~
<source>:7:2: note: Argument to free() is the address of the local variable 's', which is not memory allocated by malloc()
    7 |         t.free(s);
      |         ^      ~

https://godbolt.org/z/qvWrEG3a6

@phlax
Copy link
Author

phlax commented Feb 13, 2024

for ref:

$ bazel run //tools/clang-tidy -- --version
...
LLVM (http://llvm.org/):
  LLVM version 14.0.6

@EugeneZelenko EugeneZelenko added clang:static analyzer false-positive Warning fires when it should not and removed new issue labels Feb 13, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 13, 2024

@llvm/issue-subscribers-clang-static-analyzer

Author: None (phlax)

we are seeing the above error in our CI (running clang-tidy on the https://github.com/envoyproxy/envoy project)

it would appear this is a false positive that should not trigger on a member function

full log:

/b/f/w/source/common/stats/symbol_table.cc:623:3: error: Argument to free() is the address of a local stack variable, which is not memory allocated by malloc() [clang-analyzer-unix.Malloc,-warnings-as-errors]
  table.free(statName());
  ^
/b/f/w/source/common/stats/symbol_table.cc:629:5: note: Calling 'StatNameStorage::free'
    storage.free(symbol_table_);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:623:3: note: Argument to free() is the address of a local stack variable, which is not memory allocated by malloc()
  table.free(statName());
  ^          ~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:736:5: error: Argument to free() is the address of the parameter 'stat_name', which is not memory allocated by malloc() [clang-analyzer-unix.Malloc,-warnings-as-errors]
    symbol_table.free(stat_name);
    ^
/b/f/w/source/common/stats/symbol_table.cc:735:3: note: Calling 'StatNameList::iterate'
  iterate([&amp;symbol_table](StatName stat_name) -&gt; bool {
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:725:24: note: Assuming 'i' is &lt; 'num_elements'
  for (uint32_t i = 0; i &lt; num_elements; ++i) {
                       ^~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:725:3: note: Loop condition is true.  Entering loop body
  for (uint32_t i = 0; i &lt; num_elements; ++i) {
  ^
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/b/f/w/source/common/stats/symbol_table.cc:736:5: note: Argument to free() is the address of the parameter 'stat_name', which is not memory allocated by malloc()
    symbol_table.free(stat_name);
    ^                 ~~~~~~~~~

@NagyDonat
Copy link
Contributor

NagyDonat commented Feb 13, 2024

The reproducer also triggers the bug on recent clang main; I'll try to investigate and fix the bug.

@NagyDonat NagyDonat self-assigned this Feb 13, 2024
@NagyDonat
Copy link
Contributor

This kind of confusion between well-known functions and methods that accidentally have the same name probably affects many other checkers as well, because the method bool CallDescription::matches(const CallEvent &) doesn't distinguish function calls and method calls.

I'll try to survey all the 30 checkers that use CallDescriptions and design a solution that would eliminate all occurrences of this issue.

@steakhal
Copy link
Contributor

because the method bool CallDescription::matches(const CallEvent &) doesn't distinguish function calls and method calls.

That was on purpose, and exploited in the llvm-convections checker I think. I agree that this shouldn't be the default behavior. CallDescriptions should be strict about what they match, and be explicit about when you want a fuzzy match.

@NagyDonat
Copy link
Contributor

Yes, I see that there are several checkers (the one checking the getAs() and castAs() methods, and also ContainterModeling.cpp, and probably others as well) that use CallDescriptions to match methods.

I will pay attention to preserve their current behavior while I'm fixing all the other checkers. My rough plan is that by default call descriptions only match functions, and it's possible to specify "is a method call" (and perhaps "may be a method call or a function call") as a flag that's passed to their constructor; but this may evolve as I survey all applications of CallDescriptions.

NagyDonat added a commit to Ericsson/llvm-project that referenced this issue Feb 22, 2024
To fix llvm#81597, I'm planning
to refactor the usage of CallDescription; and as I was preparing for
this I noticed that there are two superfluous references to this header.
@NagyDonat NagyDonat reopened this Feb 22, 2024
@NagyDonat
Copy link
Contributor

Github automatically closed this when I referenced this ticket in a commit that's tangentially related to it (but far from a full solution).

@mjklemm mjklemm closed this as completed Feb 26, 2024
@mjklemm mjklemm reopened this Feb 26, 2024
@mjklemm
Copy link
Contributor

mjklemm commented Feb 26, 2024

Sorry, for some reason GitHub closes tickets when pushing a branch to a private copy of the repository. Does anyone know how to prevent this?

@steakhal
Copy link
Contributor

Sorry, for some reason GitHub closes tickets when pushing a branch to a private copy of the repository. Does anyone know how to prevent this?

Yea, thats a hard learning for us, I guess.
I think the best we can do is to create a copy of this ticket and close this xD
(While crossreferencing that here)

@NagyDonat
Copy link
Contributor

That sounds to be a github bug... 🐛

Why does it close a PR on this repository when a commit is pushed to a different repository?? That sounds like a doctor saying "the medicine will cure your illness" after giving the medicine to a different person... 🙃

NagyDonat added a commit that referenced this issue Mar 4, 2024
The class `CallDescription` is used to define patterns that are used for
matching `CallEvent`s. For example, a
`CallDescription{{"std", "find_if"}, 3}`
matches a call to `std::find_if` with 3 arguments.

However, these patterns are somewhat fuzzy, so this pattern could also
match something like `std::__1::find_if` (with an additional namespace
layer), or, unfortunately, a `CallDescription` for the well-known
function `free()` can match a C++ method named `free()`:
#81597

To prevent this kind of ambiguity this commit introduces the enum
`CallDescription::Mode` which can limit the pattern matching to
non-method function calls (or method calls etc.). After this NFC change,
one or more follow-up commits will apply the right pattern matching
modes in the ~30 checkers that use `CallDescription`s.

Note that `CallDescription` previously had a `Flags` field which had
only two supported values:
 - `CDF_None` was the default "match anything" mode,
 - `CDF_MaybeBuiltin` was a "match only C library functions and accept
some inexact matches" mode.
This commit preserves `CDF_MaybeBuiltin` under the more descriptive
name `CallDescription::Mode::CLibrary` (or `CDM::CLibrary`).

Instead of this "Flags" model I'm switching to a plain enumeration
becasue I don't think that there is a natural usecase to combine the
different matching modes. (Except for the default "match anything" mode,
which is currently kept for compatibility, but will be phased out in the
follow-up commits.)
NagyDonat added a commit to Ericsson/llvm-project that referenced this issue Apr 11, 2024
This commit ensures that the `CallDescription`s in `MallocChecker` are
matched with the mode `CDM::CLibrary`, so:
- they don't match methods or functions within user-defined namespaces;
- they also match builtin variants of these functions (if any), so the
  checker can model `__builtin_alloca()` like `alloca()`.

This change fixes llvm#81597.
New tests were added to verify that `std::malloc` and `std::free` (from
<cstdlib>) are modeled, but a method that's named e.g. `free` isn't
confused with the memory release function.

The responsibility for modeling `__builtin_alloca` and
`__builtin_alloca_with_align` was moved from `BuiltinFunctionChecker` to
`MallocChecker`, to avoid buggy interactions between the checkers and
ensure that the builtin and non-builtin variants are handled by exactly
the same logic.

This change might be a step backwards for the users who don't have
`unix.Malloc` enabled; but I suspect that `__builtin_alloca()` is so
rare that it would be a waste of time to implement backwards
compatibility for them.

There were several test files that relied on `__builtin_alloca()` calls
to get an `AllocaRegion`, these were modified to enable `unix.Malloc`.
One of these files (cxx-uninitialized-object-ptr-ref.cpp) had some tests
that relied on the fact that `malloc()` was treated as a "black box" in
them, these were updated to use `calloc()` (to get initialized memory)
and `free()` (to avoid memory leak reports).

While I was developing this change, I found a very suspicious assert in
`MallocChecker`. As it isn't blocking the goals of this commit, I just
marked it with a FIXME, but I'll try to investigate and fix it in a
follow-up change.
NagyDonat added a commit that referenced this issue Apr 16, 2024
This commit ensures that the `CallDescription`s in `MallocChecker` are
matched with the mode `CDM::CLibrary`, so:
- they don't match methods or functions within user-defined namespaces;
- they also match builtin variants of these functions (if any), so the
checker can model `__builtin_alloca()` like `alloca()`.

This change fixes #81597. New
tests were added to verify that `std::malloc` and `std::free` (from
`<cstdlib>`) are modeled, but a method that's named e.g. `free` isn't
confused with the memory release function.

The responsibility for modeling `__builtin_alloca` and
`__builtin_alloca_with_align` was moved from `BuiltinFunctionChecker` to
`MallocChecker`, to avoid buggy interactions between the checkers and
ensure that the builtin and non-builtin variants are handled by exactly
the same logic.

This change might be a step backwards for the users who don't have
`unix.Malloc` enabled; but I suspect that `__builtin_alloca()` is so
rare that it would be a waste of time to implement backwards
compatibility for them.

There were several test files that relied on `__builtin_alloca()` calls
to get an `AllocaRegion`, these were modified to enable `unix.Malloc`.
One of these files (cxx-uninitialized-object-ptr-ref.cpp) had some tests
that relied on the fact that `malloc()` was treated as a "black box" in
them, these were updated to use `calloc()` (to get initialized memory)
and `free()` (to avoid memory leak reports).

While I was developing this change, I found a very suspicious assert in
`MallocChecker`. As it isn't blocking the goals of this commit, I just
marked it with a FIXME, but I'll try to investigate and fix it in a
follow-up change.
qihangkong pushed a commit to rvgpu/llvm that referenced this issue Apr 18, 2024
The class `CallDescription` is used to define patterns that are used for
matching `CallEvent`s. For example, a `CallEvent{{"std", "find_if"}, 3}`
matches a call to `std::find_if` with 3 arguments.

However, these patterns are somewhat fuzzy, so this pattern could also
match something like `std::__1::find_if` (with an additional namespace
layer), or, unfortunately, a `CallDescription` for the well-known
function `free()` can match a C++ method named `free()`:
llvm/llvm-project#81597

To prevent this kind of ambiguity this commit introduces the enum
`CallDescription::Mode` which can limit the pattern matching to
non-method function calls (or method calls etc.). After this NFC change,
one or more follow-up commits will apply the right pattern matching
modes in the ~30 checkers that use `CallDescription`s.

Note that `CallDescription` previously had a `Flags` field which had
only two supported values:
 - CDF_None was the default "match anything" mode,
 - CDF_MaybeBuiltin was a "match only C library functions and accept
   some inexact matches" mode.
This commit preserves `CDF_MaybeBuiltin` under the more descriptive name
`CallDescription::Mode::CLibrary` (or `CDM::CLibrary`).

Instead of this "Flags" model I'm switching to a plain enumeration
becasue I don't think that there is a natural usecase to combine the
different matching modes. (Except for the default "match anything" mode,
which is currently kept for compatibility, but will be phased out in the
follow-up commits.)
NagyDonat added a commit to Ericsson/llvm-project that referenced this issue May 16, 2024
This commit deletes the "simple" constructor of `CallDescription` which
did not require a `CallDescription::Mode` argument and always used the
"wildcard" mode `CDM::Unspecified`.

A few months ago, this vague matching mode was used by many checkers,
which caused bugs like llvm#81597 and llvm#88181. Since then, my commits
improved the available matching modes and ensured that all checkers
explicitly specify the right matching mode.

After those commits, the only remaining references to the "simple"
constructor were some unit tests; this commit updates them to use an
explicitly specified matching mode (often `CDM::SimpleFunc`).

The mode `CDM::Unspecified` was not deleted in this commit because it's
still a reasonable choice in `GenericTaintChecker` and a few unit tests.
NagyDonat added a commit that referenced this issue May 17, 2024
…92454)

This commit deletes the "simple" constructor of `CallDescription` which
did not require a `CallDescription::Mode` argument and always used the
"wildcard" mode `CDM::Unspecified`.

A few months ago, this vague matching mode was used by many checkers,
which caused bugs like #81597
and #88181. Since then, my
commits improved the available matching modes and ensured that all
checkers explicitly specify the right matching mode.

After those commits, the only remaining references to the "simple"
constructor were some unit tests; this commit updates them to use an
explicitly specified matching mode (often `CDM::SimpleFunc`).

The mode `CDM::Unspecified` was not deleted in this commit because it's
still a reasonable choice in `GenericTaintChecker` and a few unit tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer false-positive Warning fires when it should not
Projects
None yet
7 participants