Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: Fix an uninitialized read in ProcessMessage(…, "tx", …) when receiving a transaction we already have #17624

Merged

Conversation

@practicalswift
Copy link
Member

practicalswift commented Nov 27, 2019

Fix an uninitialized read in ProcessMessage(…, "tx", …) when receiving a transaction we already have.

The uninitialized value is read and used on L2526 in the case of AlreadyHave(inv) == true.

Proof of concept being run against a bitcoind built with MemorySanitizer (-fsanitize=memory):

$ ./p2p-uninit-read-in-conditional-poc.py
Usage: ./p2p-uninit-read-in-conditional-poc.py <dstaddr> <dstport> <net>
$ bitcoind -regtest &
$ ./p2p-uninit-read-in-conditional-poc.py 127.0.0.1 18444 regtest
SUMMARY: MemorySanitizer: use-of-uninitialized-value
[1]+  Exit 77                 bitcoind -regtest
$

Proof of concept being run against a bitcoind running under Valgrind (valgrind --exit-on-first-error):

$ valgrind -q --exit-on-first-error=yes --error-exitcode=1 bitcoind -regtest &
$ ./p2p-uninit-read-in-conditional-poc.py 127.0.0.1 18444 regtest
==27351== Conditional jump or move depends on uninitialised value(s)
[1]+  Exit 1                  valgrind -q --exit-on-first-error=yes --error-exitcode=1 bitcoind -regtest
$ 

Proof of concept script:

#!/usr/bin/env python3

import sys

from test_framework.mininode import NetworkThread
from test_framework.mininode import P2PDataStore
from test_framework.messages import CTransaction, CTxIn, CTxOut, msg_tx


def send_duplicate_tx(dstaddr="127.0.0.1", dstport=18444, net="regtest"):
    network_thread = NetworkThread()
    network_thread.start()

    node = P2PDataStore()
    node.peer_connect(dstaddr=dstaddr, dstport=dstport, net=net)()
    node.wait_for_verack()

    tx = CTransaction()
    tx.vin.append(CTxIn())
    tx.vout.append(CTxOut())
    node.send_message(msg_tx(tx))
    node.send_message(msg_tx(tx))
    node.peer_disconnect()
    network_thread.close()


if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("Usage: {} <dstaddr> <dstport> <net>".format(sys.argv[0]))
        sys.exit(0)
    send_duplicate_tx(sys.argv[1], int(sys.argv[2]), sys.argv[3])

Note that the transaction in the proof of concept is the simplest possible, but really any transaction can be used. It does not have to be a valid transaction.

This bug was introduced in #15921 ("validation: Tidy up ValidationState interface") which was merged in to master 28 days ago.

Luckily this bug was caught before being part of any Bitcoin Core release :)

@DrahtBot

This comment has been minimized.

Copy link
Contributor

DrahtBot commented Nov 27, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #17557 (util: Refactor message hashing into a utility function by jkczyz)
  • #17399 (validation: Templatize ValidationState instead of subclassing by jkczyz)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jnewbery

This comment has been minimized.

Copy link
Member

jnewbery commented Nov 28, 2019

utACK 73b96c9

Great catch @practicalswift . Are the else if (state.GetResult() == TxValidationResult::TX_MISSING_INPUTS) and else branches really not covered by any of our tests? If so, that seems like a big hole in our testing. We should add tests for duplicate TX messages received.

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Nov 28, 2019

ACK 73b96c9, thanks for discovering and reporting this before it ended up in a release.

(travis fail is unrelated, restarted …)

@practicalswift

This comment has been minimized.

Copy link
Member Author

practicalswift commented Nov 28, 2019

@jnewbery

Both branches are covered by our functional tests (specifically test/functional/p2p_segwit.py), but unfortunately we're not running the functional tests under MemorySanitizer or Valgrind :(

I found this issue after observing some weirdness on a mainnet when running a live node with MemorySanitizer instrumentation enabled. Receiving duplicate transactions on mainnet happens quite regularly. I then reproduced the issue on regtest.

Since discovering the issue on mainnet I've "rediscovered" it 1.) using static analysis tooling (Coverity is one example), 2.) by running the ordinary functional tests under MemorySanitizer, and 3.) by running the ordinary functional tests under Valgrind.

My personal view is that we really really underuse the excellent modern tooling that is typically used in security critical C++ projects to guard against introduction of bugs like this. I find that a bit surprising and I promise to do my best to help improve that situation going forward :)


Rediscovery 1. Finding the issue using static analysis (Coverity in this example)

________________________________________________________________________________________________________
*** CID 350378:  Uninitialized members  (UNINIT_CTOR)
/src/consensus/validation.h: 117 in TxValidationState::TxValidationState()()
111     };
112     
113     inline ValidationState::~ValidationState() {};
114     
115     class TxValidationState : public ValidationState {
116     private:
>>>     CID 350378:  Uninitialized members  (UNINIT_CTOR)
>>>     The compiler-generated constructor for this class does not initialize "m_result".
117         TxValidationResult m_result;
118     public:
119         bool Invalid(TxValidationResult result,
120                      const std::string &reject_reason="",
121                      const std::string &debug_message="")
122         {
** CID 350377:  Uninitialized variables  (UNINIT)
________________________________________________________________________________________________________
*** CID 350377:  Uninitialized variables  (UNINIT)
/src/net_processing.cpp: 2526 in ProcessMessage(CNode *, const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> &, CDataStream &, long, const CChainParams &, CConnman *, BanMan *, const std::atomic<bool> &)()
2520                     tx.GetHash().ToString(),
2521                     mempool.size(), mempool.DynamicMemoryUsage() / 1000);
2522     
2523                 // Recursively process any orphan transactions that depended on this one
2524                 ProcessOrphanTx(connman, pfrom->orphan_work_set, lRemovedTxn);
2525             }
>>>     CID 350377:  Uninitialized variables  (UNINIT)
>>>     Using uninitialized value "state.m_result" when calling "GetResult".
2526             else if (state.GetResult() == TxValidationResult::TX_MISSING_INPUTS)
2527             {
2528                 bool fRejectedParents = false; // It may be the case that the orphans parents have all been rejected
2529                 for (const CTxIn& txin : tx.vin) {
2530                     if (recentRejects->contains(txin.prevout.hash)) {
2531                         fRejectedParents = true;

Rediscovery 2. Finding the issue using dynamic analysis (MemorySanitizer)

Running test/functional/test_runner.py with bitcoind compiled with MSAN instrumentation:

$ test/functional/test_runner.py
…
2020-01-02T04:05:06.123456Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
2020-01-02T04:05:06.123456Z TestFramework (ERROR): Assertion failed
…
SUMMARY: MemorySanitizer: use-of-uninitialized-value

Rediscovery 3. Finding the issue using dynamic analysis (Valgrind)

Running BITCOIND=bitcoind_valgrind test/functional/test_runner.py where bitcoind_valgrind is a shell script wrapper doing valgrind -q --exit-on-first-error=yes --error-exitcode=1 --gen-suppressions=all --leak-check=full --show-leak-kinds=all --suppressions=contrib/valgrind.supp src/bitcoind "$@":

$ BITCOIND=bitcoind_valgrind test/functional/test_runner.py
…
2020-01-02T04:05:06.123456Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
2020-01-02T04:05:06.123456Z TestFramework (ERROR): Assertion failed
…
==47281== Conditional jump or move depends on uninitialised value(s)
laanwj added a commit that referenced this pull request Nov 28, 2019
…", …) when receiving a transaction we already have

73b96c9 net: Fix uninitialized read in ProcessMessage(...) (practicalswift)

Pull request description:

  Fix an uninitialized read in `ProcessMessage(…, "tx", …)` when receiving a transaction we already have.

  The uninitialized value is read and used on [L2526 in the case of `AlreadyHave(inv) == true`](https://github.com/bitcoin/bitcoin/blob/d8a66626d63135fd245d5afc524b88b9a94d208b/src/net_processing.cpp#L2494-L2526).

  Proof of concept being run against a `bitcoind` built with MemorySanitizer (`-fsanitize=memory`):

  ```
  $ ./p2p-uninit-read-in-conditional-poc.py
  Usage: ./p2p-uninit-read-in-conditional-poc.py <dstaddr> <dstport> <net>
  $ bitcoind -regtest &
  $ ./p2p-uninit-read-in-conditional-poc.py 127.0.0.1 18444 regtest
  SUMMARY: MemorySanitizer: use-of-uninitialized-value
  [1]+  Exit 77                 bitcoind -regtest
  $
  ```

  Proof of concept being run against a `bitcoind` running under Valgrind (`valgrind --exit-on-first-error`):

  ```
  $ valgrind -q --exit-on-first-error=yes --error-exitcode=1 bitcoind -regtest &
  $ ./p2p-uninit-read-in-conditional-poc.py 127.0.0.1 18444 regtest
  ==27351== Conditional jump or move depends on uninitialised value(s)
  [1]+  Exit 1                  valgrind -q --exit-on-first-error=yes --error-exitcode=1 bitcoind -regtest
  $
  ```

  Proof of concept script:

  ```
  #!/usr/bin/env python3

  import sys

  from test_framework.mininode import NetworkThread
  from test_framework.mininode import P2PDataStore
  from test_framework.messages import CTransaction, CTxIn, CTxOut, msg_tx

  def send_duplicate_tx(dstaddr="127.0.0.1", dstport=18444, net="regtest"):
      network_thread = NetworkThread()
      network_thread.start()

      node = P2PDataStore()
      node.peer_connect(dstaddr=dstaddr, dstport=dstport, net=net)()
      node.wait_for_verack()

      tx = CTransaction()
      tx.vin.append(CTxIn())
      tx.vout.append(CTxOut())
      node.send_message(msg_tx(tx))
      node.send_message(msg_tx(tx))
      node.peer_disconnect()
      network_thread.close()

  if __name__ == "__main__":
      if len(sys.argv) != 4:
          print("Usage: {} <dstaddr> <dstport> <net>".format(sys.argv[0]))
          sys.exit(0)
      send_duplicate_tx(sys.argv[1], int(sys.argv[2]), sys.argv[3])
  ```

  Note that the transaction in the proof of concept is the simplest possible, but really any transaction can be used. It does not have to be a valid transaction.

  This bug was introduced in #15921 ("validation: Tidy up ValidationState interface") which was merged in to `master` 28 days ago.

  Luckily this bug was caught before being part of any Bitcoin Core release :)

ACKs for top commit:
  jnewbery:
    utACK 73b96c9
  laanwj:
    ACK 73b96c9, thanks for discovering and reporting this before it ended up in a release.

Tree-SHA512: 7ce6b8f260bcdd9b2ec4ff4b941a891bbef578acf4456df33b7a8d42b248237ec4949e65e2445b24851d1639b10681c701ad500b1c0b776ff050ef8c3812c795
@laanwj laanwj merged commit 73b96c9 into bitcoin:master Nov 28, 2019
1 of 2 checks passed
1 of 2 checks passed
continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
@jnewbery jnewbery mentioned this pull request Dec 2, 2019
MarcoFalke added a commit that referenced this pull request Dec 10, 2019
…s under Valgrind

5db506b tests: Add option --valgrind to run nodes under valgrind in the functional tests (practicalswift)

Pull request description:

  What is better than fixing bugs? Fixing entire bug classes of course! :)

  Add option `--valgrind` to run the functional tests under Valgrind.

  Regular functional testing under Valgrind would have caught many of the uninitialized reads we've seen historically.

  Let's kill this bug class once and for all: let's never use an uninitialized value ever again. Or at least not one that would be triggered by running the functional tests! :)

  My hope is that this addition will make it super-easy to run the functional tests under Valgrind and thus increase the probability of people making use of it :)

  Hopefully `test/functional/test_runner.py --valgrind` will become a natural part of the pre-release QA process.

  **Usage:**

  ```
  $ test/functional/test_runner.py --help
  …
    --valgrind            run nodes under the valgrind memory error detector:
                          expect at least a ~10x slowdown, valgrind 3.14 or
                          later required
  ```

  **Live demo:**

  First, let's re-introduce a memory bug by reverting the recent P2P uninitialized read bug fix from PR #17624 ("net: Fix an uninitialized read in ProcessMessage(…, "tx", …) when receiving a transaction we already have").

  ```
  $ git diff
  diff --git a/src/consensus/validation.h b/src/consensus/validation.h
  index 3401eb64c..940adea33 100644
  --- a/src/consensus/validation.h
  +++ b/src/consensus/validation.h
  @@ -114,7 +114,7 @@ inline ValidationState::~ValidationState() {};

   class TxValidationState : public ValidationState {
   private:
  -    TxValidationResult m_result = TxValidationResult::TX_RESULT_UNSET;
  +    TxValidationResult m_result;
   public:
       bool Invalid(TxValidationResult result,
                    const std::string &reject_reason="",
  ```

  Second, let's test as normal without Valgrind:

  ```
  $ test/functional/p2p_segwit.py -l INFO
  2019-11-28T09:30:42.810000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test__fc8q3qo
  …
  2019-11-28T09:31:57.187000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  …
  2019-11-28T09:32:08.265000Z TestFramework (INFO): Tests successful
  ```

  Third, let's test with `--valgrind` and see if the test fail (as we expect) when the unitialized value is used:

  ```
  $ test/functional/p2p_segwit.py -l INFO --valgrind
  2019-11-28T09:32:33.018000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_gtjecx2l
  …
  2019-11-28T09:40:36.702000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  2019-11-28T09:40:37.813000Z TestFramework (ERROR): Assertion failed
  ConnectionRefusedError: [Errno 111] Connection refused
  ```

ACKs for top commit:
  MarcoFalke:
    ACK 5db506b
  jonatack:
    ACK 5db506b

Tree-SHA512: 2eaecacf4da166febad88b2a8ee6d7ac2bcd38d4c1892ca39516b6343e8f8c8814edf5eaf14c90f11a069a0389d24f0713076112ac284de987e72fc5f6cc3795
sidhujag added a commit to syscoin/syscoin that referenced this pull request Dec 10, 2019
…al tests under Valgrind

5db506b tests: Add option --valgrind to run nodes under valgrind in the functional tests (practicalswift)

Pull request description:

  What is better than fixing bugs? Fixing entire bug classes of course! :)

  Add option `--valgrind` to run the functional tests under Valgrind.

  Regular functional testing under Valgrind would have caught many of the uninitialized reads we've seen historically.

  Let's kill this bug class once and for all: let's never use an uninitialized value ever again. Or at least not one that would be triggered by running the functional tests! :)

  My hope is that this addition will make it super-easy to run the functional tests under Valgrind and thus increase the probability of people making use of it :)

  Hopefully `test/functional/test_runner.py --valgrind` will become a natural part of the pre-release QA process.

  **Usage:**

  ```
  $ test/functional/test_runner.py --help
  …
    --valgrind            run nodes under the valgrind memory error detector:
                          expect at least a ~10x slowdown, valgrind 3.14 or
                          later required
  ```

  **Live demo:**

  First, let's re-introduce a memory bug by reverting the recent P2P uninitialized read bug fix from PR bitcoin#17624 ("net: Fix an uninitialized read in ProcessMessage(…, "tx", …) when receiving a transaction we already have").

  ```
  $ git diff
  diff --git a/src/consensus/validation.h b/src/consensus/validation.h
  index 3401eb64c..940adea33 100644
  --- a/src/consensus/validation.h
  +++ b/src/consensus/validation.h
  @@ -114,7 +114,7 @@ inline ValidationState::~ValidationState() {};

   class TxValidationState : public ValidationState {
   private:
  -    TxValidationResult m_result = TxValidationResult::TX_RESULT_UNSET;
  +    TxValidationResult m_result;
   public:
       bool Invalid(TxValidationResult result,
                    const std::string &reject_reason="",
  ```

  Second, let's test as normal without Valgrind:

  ```
  $ test/functional/p2p_segwit.py -l INFO
  2019-11-28T09:30:42.810000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test__fc8q3qo
  …
  2019-11-28T09:31:57.187000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  …
  2019-11-28T09:32:08.265000Z TestFramework (INFO): Tests successful
  ```

  Third, let's test with `--valgrind` and see if the test fail (as we expect) when the unitialized value is used:

  ```
  $ test/functional/p2p_segwit.py -l INFO --valgrind
  2019-11-28T09:32:33.018000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_gtjecx2l
  …
  2019-11-28T09:40:36.702000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  2019-11-28T09:40:37.813000Z TestFramework (ERROR): Assertion failed
  ConnectionRefusedError: [Errno 111] Connection refused
  ```

ACKs for top commit:
  MarcoFalke:
    ACK 5db506b
  jonatack:
    ACK 5db506b

Tree-SHA512: 2eaecacf4da166febad88b2a8ee6d7ac2bcd38d4c1892ca39516b6343e8f8c8814edf5eaf14c90f11a069a0389d24f0713076112ac284de987e72fc5f6cc3795
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.