Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: Add option --valgrind to run the functional tests under Valgrind #17633

Merged
merged 1 commit into from Dec 10, 2019

Conversation

@practicalswift
Copy link
Member

practicalswift commented Nov 29, 2019

What is better than fixing bugs? Fixing entire bug classes of course! :)

Add option --valgrind to run the functional tests under Valgrind.

Regular functional testing under Valgrind would have caught many of the uninitialized reads we've seen historically.

Let's kill this bug class once and for all: let's never use an uninitialized value ever again. Or at least not one that would be triggered by running the functional tests! :)

My hope is that this addition will make it super-easy to run the functional tests under Valgrind and thus increase the probability of people making use of it :)

Hopefully test/functional/test_runner.py --valgrind will become a natural part of the pre-release QA process.

Usage:

$ test/functional/test_runner.py --help
…
  --valgrind            run nodes under the valgrind memory error detector:
                        expect at least a ~10x slowdown, valgrind 3.14 or
                        later required

Live demo:

First, let's re-introduce a memory bug by reverting the recent P2P uninitialized read bug fix from PR #17624 ("net: Fix an uninitialized read in ProcessMessage(…, "tx", …) when receiving a transaction we already have").

$ git diff
diff --git a/src/consensus/validation.h b/src/consensus/validation.h
index 3401eb64c..940adea33 100644
--- a/src/consensus/validation.h
+++ b/src/consensus/validation.h
@@ -114,7 +114,7 @@ inline ValidationState::~ValidationState() {};

 class TxValidationState : public ValidationState {
 private:
-    TxValidationResult m_result = TxValidationResult::TX_RESULT_UNSET;
+    TxValidationResult m_result;
 public:
     bool Invalid(TxValidationResult result,
                  const std::string &reject_reason="",

Second, let's test as normal without Valgrind:

$ test/functional/p2p_segwit.py -l INFO
2019-11-28T09:30:42.810000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test__fc8q3qo
…
2019-11-28T09:31:57.187000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
…
2019-11-28T09:32:08.265000Z TestFramework (INFO): Tests successful

Third, let's test with --valgrind and see if the test fail (as we expect) when the unitialized value is used:

$ test/functional/p2p_segwit.py -l INFO --valgrind
2019-11-28T09:32:33.018000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_gtjecx2l
…
2019-11-28T09:40:36.702000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
2019-11-28T09:40:37.813000Z TestFramework (ERROR): Assertion failed
ConnectionRefusedError: [Errno 111] Connection refused
@fanquake fanquake added the Tests label Nov 29, 2019
@DrahtBot

This comment has been minimized.

Copy link
Contributor

DrahtBot commented Nov 29, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #12134 (Build previous releases and run functional tests by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link

paymog left a comment

Do all tests currently pass with this new valgrind option?

@practicalswift practicalswift force-pushed the practicalswift:functional-valgrind branch from 61a48c1 to 5db506b Dec 1, 2019
@practicalswift

This comment has been minimized.

Copy link
Member Author

practicalswift commented Dec 2, 2019

@paymog

There are a few failing tests: most of them due to timing assumptions that fail in the presence of the bitcoind slowdown caused by Valgrind. I plan to fix these failing tests by increasing timeouts where required.

Note that doing that has the added nice benefit of turning a few flaky tests into stable ones since said timing assumptions are intermittently violated under normal test execution due to high load, etc.

@jnewbery jnewbery mentioned this pull request Dec 2, 2019
@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Dec 3, 2019

ACK 5db506b

Copy link
Member

jonatack left a comment

Concept ACK, thank you for working on adding valgrind for the tests. Unfortunately, every test run with --valgrind fails for me with AssertionError: [node 0] Error: no RPC connection. Pointers welcome :)

$ valgrind --version
valgrind-3.14.0
"..", "..", "..", "contrib", "valgrind.supp")
suppressions_file = os.getenv("VALGRIND_SUPPRESSIONS_FILE",
default_suppressions_file)
self.args = ["valgrind", "--suppressions={}".format(suppressions_file),

This comment has been minimized.

Copy link
@jonatack

jonatack Dec 3, 2019

Member

suggestion if you have to retouch this:

-            self.args = ["valgrind", "--suppressions={}".format(suppressions_file),
-                         "--gen-suppressions=all", "--exit-on-first-error=yes",
-                         "--error-exitcode=1", "--quiet"] + self.args
+            self.args = [
+                "valgrind",
+                "--suppressions={}".format(suppressions_file),
+                "--gen-suppressions=all",
+                "--exit-on-first-error=yes",
+                "--error-exitcode=1",
+                "--quiet",
+            ] + self.args

(like lines 89-97 before it)

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Dec 3, 2019

@jonatack What does combine_logs.py tell you, and are the directories stdout and stderr in the test dir empty?

@jonatack

This comment has been minimized.

Copy link
Member

jonatack commented Dec 3, 2019

@MarcoFalke thanks! I don't see stdout or stderr directories in /test and /test/functional. Here is the combine_logs.py output for p2p_segwit.py (which passes without the --valgrind option).

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Dec 3, 2019

Try tail /tmp/bitcoin_func_test_u513zycz/node*/std* or similar (Reminds me that this should ideally be done by combine_logs.py)

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Dec 3, 2019

Also, could you try with the suppressions file from https://github.com/bitcoin/bitcoin/pull/17490/files#diff-8982317ece972c9b26bd8545350812f2R31 ? Maybe remove the line I linked to if it still doesn't work.

@jonatack

This comment has been minimized.

Copy link
Member

jonatack commented Dec 3, 2019

ACK 5db506b

Reviewed code, reproduced the PR description steps after modifying contrib/valgrind.supp (diff)... perhaps I have a depends issue in my build.

Thank you @MarcoFalke for your suggestions.

@practicalswift

This comment has been minimized.

Copy link
Member Author

practicalswift commented Dec 10, 2019

Try tail /tmp/bitcoin_func_test_u513zycz/node*/std* or similar (Reminds me that this should ideally be done by combine_logs.py)

I'm probably missing something, but isn't that what print_node_warnings in combine_logs.py does? :)

def print_node_warnings(tmp_dir, colors):
"""Print nodes' errors and warnings"""
warnings = []
for stream in ['stdout', 'stderr']:
for i in itertools.count():
folder = "{}/node{}/{}".format(tmp_dir, i, stream)
if not os.path.isdir(folder):
break
for (_, _, fns) in os.walk(folder):
for fn in fns:
warning = pathlib.Path('{}/{}'.format(folder, fn)).read_text().strip()
if warning:
warnings.append(("node{} {}".format(i, stream), warning))

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Dec 10, 2019

Ah right. Forgot that I already wrote that. And it indeed was also included in https://gist.github.com/jonatack/915df3494f9f7274a07e29ae092a2a47#file-pr17633-combine_log_output-txt-L263

MarcoFalke added a commit that referenced this pull request Dec 10, 2019
…s under Valgrind

5db506b tests: Add option --valgrind to run nodes under valgrind in the functional tests (practicalswift)

Pull request description:

  What is better than fixing bugs? Fixing entire bug classes of course! :)

  Add option `--valgrind` to run the functional tests under Valgrind.

  Regular functional testing under Valgrind would have caught many of the uninitialized reads we've seen historically.

  Let's kill this bug class once and for all: let's never use an uninitialized value ever again. Or at least not one that would be triggered by running the functional tests! :)

  My hope is that this addition will make it super-easy to run the functional tests under Valgrind and thus increase the probability of people making use of it :)

  Hopefully `test/functional/test_runner.py --valgrind` will become a natural part of the pre-release QA process.

  **Usage:**

  ```
  $ test/functional/test_runner.py --help
  …
    --valgrind            run nodes under the valgrind memory error detector:
                          expect at least a ~10x slowdown, valgrind 3.14 or
                          later required
  ```

  **Live demo:**

  First, let's re-introduce a memory bug by reverting the recent P2P uninitialized read bug fix from PR #17624 ("net: Fix an uninitialized read in ProcessMessage(…, "tx", …) when receiving a transaction we already have").

  ```
  $ git diff
  diff --git a/src/consensus/validation.h b/src/consensus/validation.h
  index 3401eb64c..940adea33 100644
  --- a/src/consensus/validation.h
  +++ b/src/consensus/validation.h
  @@ -114,7 +114,7 @@ inline ValidationState::~ValidationState() {};

   class TxValidationState : public ValidationState {
   private:
  -    TxValidationResult m_result = TxValidationResult::TX_RESULT_UNSET;
  +    TxValidationResult m_result;
   public:
       bool Invalid(TxValidationResult result,
                    const std::string &reject_reason="",
  ```

  Second, let's test as normal without Valgrind:

  ```
  $ test/functional/p2p_segwit.py -l INFO
  2019-11-28T09:30:42.810000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test__fc8q3qo
  …
  2019-11-28T09:31:57.187000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  …
  2019-11-28T09:32:08.265000Z TestFramework (INFO): Tests successful
  ```

  Third, let's test with `--valgrind` and see if the test fail (as we expect) when the unitialized value is used:

  ```
  $ test/functional/p2p_segwit.py -l INFO --valgrind
  2019-11-28T09:32:33.018000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_gtjecx2l
  …
  2019-11-28T09:40:36.702000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  2019-11-28T09:40:37.813000Z TestFramework (ERROR): Assertion failed
  ConnectionRefusedError: [Errno 111] Connection refused
  ```

ACKs for top commit:
  MarcoFalke:
    ACK 5db506b
  jonatack:
    ACK 5db506b

Tree-SHA512: 2eaecacf4da166febad88b2a8ee6d7ac2bcd38d4c1892ca39516b6343e8f8c8814edf5eaf14c90f11a069a0389d24f0713076112ac284de987e72fc5f6cc3795
@MarcoFalke MarcoFalke merged commit 5db506b into bitcoin:master Dec 10, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
sidhujag added a commit to syscoin/syscoin that referenced this pull request Dec 10, 2019
…al tests under Valgrind

5db506b tests: Add option --valgrind to run nodes under valgrind in the functional tests (practicalswift)

Pull request description:

  What is better than fixing bugs? Fixing entire bug classes of course! :)

  Add option `--valgrind` to run the functional tests under Valgrind.

  Regular functional testing under Valgrind would have caught many of the uninitialized reads we've seen historically.

  Let's kill this bug class once and for all: let's never use an uninitialized value ever again. Or at least not one that would be triggered by running the functional tests! :)

  My hope is that this addition will make it super-easy to run the functional tests under Valgrind and thus increase the probability of people making use of it :)

  Hopefully `test/functional/test_runner.py --valgrind` will become a natural part of the pre-release QA process.

  **Usage:**

  ```
  $ test/functional/test_runner.py --help
  …
    --valgrind            run nodes under the valgrind memory error detector:
                          expect at least a ~10x slowdown, valgrind 3.14 or
                          later required
  ```

  **Live demo:**

  First, let's re-introduce a memory bug by reverting the recent P2P uninitialized read bug fix from PR bitcoin#17624 ("net: Fix an uninitialized read in ProcessMessage(…, "tx", …) when receiving a transaction we already have").

  ```
  $ git diff
  diff --git a/src/consensus/validation.h b/src/consensus/validation.h
  index 3401eb64c..940adea33 100644
  --- a/src/consensus/validation.h
  +++ b/src/consensus/validation.h
  @@ -114,7 +114,7 @@ inline ValidationState::~ValidationState() {};

   class TxValidationState : public ValidationState {
   private:
  -    TxValidationResult m_result = TxValidationResult::TX_RESULT_UNSET;
  +    TxValidationResult m_result;
   public:
       bool Invalid(TxValidationResult result,
                    const std::string &reject_reason="",
  ```

  Second, let's test as normal without Valgrind:

  ```
  $ test/functional/p2p_segwit.py -l INFO
  2019-11-28T09:30:42.810000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test__fc8q3qo
  …
  2019-11-28T09:31:57.187000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  …
  2019-11-28T09:32:08.265000Z TestFramework (INFO): Tests successful
  ```

  Third, let's test with `--valgrind` and see if the test fail (as we expect) when the unitialized value is used:

  ```
  $ test/functional/p2p_segwit.py -l INFO --valgrind
  2019-11-28T09:32:33.018000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_gtjecx2l
  …
  2019-11-28T09:40:36.702000Z TestFramework (INFO): Subtest: test_non_standard_witness_blinding (Segwit active = True)
  2019-11-28T09:40:37.813000Z TestFramework (ERROR): Assertion failed
  ConnectionRefusedError: [Errno 111] Connection refused
  ```

ACKs for top commit:
  MarcoFalke:
    ACK 5db506b
  jonatack:
    ACK 5db506b

Tree-SHA512: 2eaecacf4da166febad88b2a8ee6d7ac2bcd38d4c1892ca39516b6343e8f8c8814edf5eaf14c90f11a069a0389d24f0713076112ac284de987e72fc5f6cc3795
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.