Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuzzer enhancement: Explicitly check output for uninitialized memory #22064

Open
guidovranken opened this issue May 25, 2021 · 6 comments
Open

Comments

@guidovranken
Copy link
Contributor

Is your feature request related to a problem? Please describe.

Both MemorySanitizer and Valgrind will only detect uninitialized memory if it is used for branching or IO.

E.g. the following program performs a computation using an uninitialized variable (a) but this won't trigger MSAN/Valgrind:

int main(void)
{
    int a; int b = a + 10;
    return 0;
}

Describe the solution you'd like

Call

extern "C" void __msan_check_mem_is_initialized(const volatile void *x, size_t size);

on the data to make MSAN evaluate it.

Describe alternatives you've considered

Alternative solution that also works with Valgrind: write the data to /dev/null:

#include <stdio.h>

int main(void)
{
    int a; int b = a + 10;
    FILE* fp = fopen("/dev/null", "wb");
    fwrite(&b, sizeof(b), 1, fp);
    fclose(fp);
    return 0;
}

Additional context

Proposal: Create a wrapper for __msan_check_mem_is_initialized (as a C++ method), e.g.:

void TestMsan(const void* data, const size_t size) {
   __msan_check_mem_is_initialized(x, size);
}

And use overloaded methods for special types, e.g.

void TestMsan(const std::string& s) {
   TestMsan(s.data(), s.size());
}

Then edit all fuzzer harnesses and call TestMsan with the output of each non-void method.

E.g. the parse_script harness would become:

// Copyright (c) 2009-2020 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.

#include <core_io.h>
#include <script/script.h>
#include <test/fuzz/fuzz.h>

FUZZ_TARGET(parse_script)
{
    const std::string script_string(buffer.begin(), buffer.end());
    try {
        TestMsan(ParseScript(script_string));
    } catch (const std::runtime_error&) {
    }
}

The same concept can be applied to the unit tests.

@maflcko
Copy link
Member

maflcko commented Jun 5, 2021

Concept ACK, obviously. Though, I am worried that this will make our code overly verbose, and hard to maintain. Maybe it would help if there was a compiler knob to adjust the aggressiveness of the memory sanitizers?

For "easy" memory violations, -ftrivial-auto-var-init=pattern might be enough to corrupt values and then cause a logic error down the line. At least it will detect the memory issue fixed in commit 3737126.

@practicalswift
Copy link
Contributor

@guidovranken This technique was used in #23152 (comment). Thanks! :)

@maflcko
Copy link
Member

maflcko commented Sep 27, 2023

For reference -ftrivial-auto-var-init=pattern, caught another issue (in leveldb) that both valgrind and msan missed: #28359

As mentioned previously, I don't think there is anything that can be done here, other than adding a compiler flag upstream.

In theory the wrapper code can be enforced with a clang-tidy plugin in fuzz code (cc @dergoegge), but the downsides of being incomplete and making the code overly verbose still hold.

@maflcko maflcko added Tests and removed Feature labels Sep 27, 2023
@dergoegge
Copy link
Member

In theory the wrapper code can be enforced with a clang-tidy plugin in fuzz code (cc @dergoegge), but the downsides of being incomplete and making the code overly verbose still hold.

(maybe this is what you mean by "enforce" but) My idea was to have the clang-tidy plugin auto refactor all our code to insert the wrappers prior to running the msan/valgrind job in CI. I think this should be possible and would avoid the verbosity of having the wrappers present in the actual code.

@maflcko
Copy link
Member

maflcko commented Sep 28, 2023

refactor all our code

So every statement in the source code is wrapped and the memory is read by msan? Probably fine, but I'd suspect a massive slow-down.

I guess doing this once per release can't hurt.

@real-or-random
Copy link
Contributor

real-or-random commented Apr 9, 2024

Alternative solution that also works with Valgrind: write the data to /dev/null:

There's also VALGRIND_CHECK_MEM_IS_DEFINED, see also https://github.com/bitcoin-core/secp256k1/blob/master/src/checkmem.h for an abstraction layer that works with both MSan and Valgrind.


What's suggested in this issue, i.e., reporting every read of an uninitialized value, may just be too much. The Valgrind FAQ says this:

As for eager reporting of copies of uninitialised memory values, this has been suggested multiple times. Unfortunately, almost all programs legitimately copy uninitialised memory values around (because compilers pad structs to preserve alignment) and eager checking leads to hundreds of false positives. Therefore Memcheck does not support eager checking at this time.

But starting with clang 16, at least MSan gets us closer to this. Returning an uninitialized variables from a function, or passing uninitialized values to a function as a parameter is now considered a "use" of uninitialized memory, and MSan will report it by default. See the Clang 16.0.09 Release Notes:

-fsanitize-memory-param-retval is turned on by default. With -fsanitize=memory, passing uninitialized variables to functions and returning uninitialized variables from functions is more aggressively reported. -fno-sanitize-memory-param-retval restores the previous behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants
@real-or-random @maflcko @guidovranken @practicalswift @dergoegge and others