Python P2P testing framework #5981

Merged
laanwj merged 5 commits into bitcoin:master from python-testing-framework-4 on Apr 30, 2015

sdaftuar
Member

@sdaftuar sdaftuar commented Apr 7, 2015

Motivated by the discussion in #4545, I've been working on building a python testing framework that will allow us to write tests that exercise the p2p code in bitcoind, so that we can perform end-to-end testing and perform comparisons of behavior across bitcoind versions. This code builds on the existing RPC testing functionality, so that tests using both RPC calls and p2p messages can be written.

I've split this pull into 3 commits to try to make this easier to review.

  1. The first commit adds a file called mininode.py, which I grabbed from jgarzik's mini-node branch of the pynode repository, and modified. This file does a few things:
  • Defines a class, NodeConn, which manages connectivity to a bitcoin node.
  • Defines a callback prototype, NodeConnCB, which is used for message delivery. The idea is that you can write a class that inherits from NodeConnCB and pass it in to a NodeConn object, and then be notified when events of interest arrive.
  • Defines all the data structures from bitcoin core that pass over the network, e.g. CBlock, CTransaction, etc.
  • Defines all the serialization/deserialization code.

It's possible to write (crude) tests using mininode.py alone, if you want. Also, mininode supports testing outside of regtest (it can communicate on testnet and mainnet as well), so that theoretically allows for a broader category of testing than we currently can do. All the tests I've been working on have been focused on regtest, however.

Also, in the first commit, I provide one example test (maxblocksinflight.py) that shows how you can use mininode. This is largely a proof-of-concept example, but it does test something useful: 0.10 used to fail this test until #5507 was merged.

  2. The second commit adds support for a comparison-tool style testing framework, similar to the comparison-tool framework that we use from bitcoinj in the pull-tester. The code in this commit is designed to add structure to test writing, so that tests are both easier to read and write (compared to the free-form structure of maxblocksinflight.py). I include in this commit one example test written using the framework, which tests the processing of two different types of invalid blocks.

EDIT: I forgot to mention that I use py-leveldb in blockstore.py, a file I introduced in this commit that provides disk-backed storage for blocks. I think I had to manually install that package on my machine, so I wanted to flag this dependency in case others think this would be a problem.

  3. The third commit adds some script processing routines (script.py and its dependency, bignum.py -- the latter could probably be removed with a little work) which I have copied and modified from python-bitcoinlib. With these additional tools, I was able to write a test, script_test.py, which reads all the script tests from the unit test data directory (script_valid.json and script_invalid.json) and inserts them in blocks delivered to two nodes, and for each test case compares whether the nodes agree on whether the block containing the test transaction passes consensus checks. This test is very slow (perhaps 40 minutes to run), so this is largely a proof of concept that demonstrates the kinds of tests we ought to be able to write. (Unfortunately, this test makes use of RPC calls that only have existed since 0.10, so it is not possible to run this test in its current form to compare 0.9-or-older versions.)
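The callback idea from the first commit (write a class that inherits from NodeConnCB, pass it to a NodeConn, and get notified as messages arrive) can be illustrated without a running node. This is only a rough sketch of the pattern, not the code in mininode.py: `CallbackBase`, `InvCounter`, `FakeConn`, and the `on_<command>` method naming are all assumptions made for the example.

```python
# Hypothetical sketch of callback-based message delivery.
# None of these names are taken from mininode.py.


class CallbackBase(object):
    """Dispatches each message to a method named on_<command>, if one exists."""

    def deliver(self, conn, command, payload):
        handler = getattr(self, "on_" + command, None)
        if handler is not None:
            handler(conn, payload)


class InvCounter(CallbackBase):
    """Example test callback: records every 'inv' announcement it sees."""

    def __init__(self):
        self.seen = []

    def on_inv(self, conn, payload):
        self.seen.append(payload)


class FakeConn(object):
    """Stand-in for a real connection; it only forwards messages to the callback."""

    def __init__(self, callback):
        self.callback = callback

    def receive(self, command, payload):
        self.callback.deliver(self, command, payload)


counter = InvCounter()
conn = FakeConn(counter)
conn.receive("inv", "block-hash-1")
conn.receive("ping", 42)  # no on_ping handler is defined, so this is ignored
conn.receive("inv", "block-hash-2")
print(counter.seen)
```

A test written this way only overrides the handlers it cares about, which is presumably what keeps free-form tests like maxblocksinflight.py short.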

If this framework looks okay, then I think a future project would be to re-implement the pull-tester's comparison test in this framework.

I realize this is a lot of code, so if there's anything more I can do to aid in review please let me know.

@jonasschnelli
Contributor

Nice!
conceptual ACK.
Have plans to test this soon.

@laanwj
Member

laanwj commented Apr 8, 2015

Nice work!

I forgot to mention that I use py-leveldb in blockstore.py

I'd rather not add any external dependencies for the tests. Do we need any kind of persistent block storage in the tests? If we really do, I'd rather hack something together with python.

@laanwj laanwj added the Tests label Apr 8, 2015
@sdaftuar
Member Author

sdaftuar commented Apr 9, 2015

@laanwj I'd prefer to leave the disk-backed storage there, even though it's probably not needed for the existing tests I've included in this pull, because it will be needed to write longer tests like the comparison test which currently runs in the pull-tester.

I pushed up a commit that replaces leveldb with dbm, which I believe is a standard python package; does that seem like an acceptable option?

@laanwj
Member

laanwj commented Apr 9, 2015

Sure, using a python standard package is fine, my gripe was with using an external dependency. Using disk-backed storage is no problem as long as subsequent tests don't interfere with each other.
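The arrangement discussed here (a standard-library database, with storage isolated per test run) might look roughly like the sketch below. It is written for Python 3's `dbm` module and is not the blockstore.py from this pull; `TinyStore`, the file name `blocks`, and the hex-style keys are assumptions made for illustration.

```python
# Hypothetical sketch: a disk-backed hash -> bytes map using only the
# standard library. This is not the blockstore.py from the pull request.
import dbm
import os
import shutil
import tempfile


class TinyStore(object):
    def __init__(self, datadir):
        # "c" opens the database read/write and creates it if needed.
        self.db = dbm.open(os.path.join(datadir, "blocks"), "c")

    def put(self, block_hash, raw_block):
        self.db[block_hash] = raw_block

    def get(self, block_hash):
        try:
            return self.db[block_hash]
        except KeyError:
            return None

    def close(self):
        self.db.close()


# Each run gets its own temporary directory, so tests cannot interfere.
datadir = tempfile.mkdtemp(prefix="blockstore-sketch-")
try:
    store = TinyStore(datadir)
    store.put(b"ab" * 32, b"serialized block bytes")
    found = store.get(b"ab" * 32)
    missing = store.get(b"cd" * 32)
    store.close()
finally:
    shutil.rmtree(datadir)
```

Creating the database under a fresh temporary directory and removing it afterwards is one simple way to meet the "subsequent tests don't interfere" condition.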

@laanwj
Member

laanwj commented Apr 16, 2015

Maybe add the new tests (whenever they don't take too long) to the pull tester e.g. qa/pull-tester/rpc-tests.sh. For example invalidblockrequest.py seems to finish quickly.

The bipdersig-p2p.py test seems to hang indefinitely here at:

Initializing test directory /tmp/testBxTNIB
MiniNode: Connecting to Bitcoin Node IP # 127.0.0.1:11916
Test 1: PASS [98]

@sdaftuar
Member Author

@laanwj The bipdersig-p2p.py test was failing because this needed a rebase, due to the setgenerate()/generate() regtest rpc change (now fixed).

I've also shortened maxblocksinflight.py, and added both it and invalidblockrequest.py to the pull tester's rpc tests.

EDIT: I see that this errored out in travis; will investigate.

@laanwj
Member

laanwj commented Apr 21, 2015

Ah yes, the generate switcharoo. I think we should add an explicit error when setgenerate() is used on regtest, so that bugs like this are tripped up immediately instead of through vague timeouts.

The travis error seems to occur on 64-bit Linux - no detailed information unfortunately

Running testscript maxblocksinflight.py...

Initializing test directory /tmp/testIyXJqc

No output has been received in the last 10 minutes, this potentially indicates a stalled build or something wrong with the build itself.

@laanwj
Member

laanwj commented Apr 28, 2015

I'd really like to merge this. Do you have trouble solving the travis error? If so, maybe @theuni can help.

@sdaftuar sdaftuar force-pushed the python-testing-framework-4 branch 2 times, most recently from b7f3812 to 96c34c1 Compare April 28, 2015 16:07
@sdaftuar
Member Author

This needed to be rebased anyway, so I did that and now travis succeeds. I'm not sure what to make of that, since I didn't actually fix anything, and it seems odd to think that this was just a spurious travis failure that happened to only affect a new test I've introduced in this pull.

I'll try rebasing again, since I wanted to fold the 4th and 5th commits in anyway, and when I push we'll get one more data point about how travis does with the new tests.

mininode.py provides a framework for connecting to a bitcoin node over the p2p
network. NodeConn is the main object that manages connectivity to a node and
provides callbacks; the interface for those callbacks is defined by NodeConnCB.
Defined also are all data structures from bitcoin core that pass on the network
(CBlock, CTransaction, etc), along with de-/serialization functions.
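To give a feel for the de-/serialization layer, here is a small sketch of how a bitcoin p2p message is framed on the wire: 4 magic bytes, a 12-byte null-padded command, a 4-byte little-endian payload length, and a 4-byte checksum taken from the double SHA-256 of the payload. The function names (`frame_message`, `parse_message`, `sha256d`) are assumptions for this example and are not taken from mininode.py; the magic value is the regtest network magic.

```python
# Illustrative sketch of p2p message framing; function names are hypothetical.
import hashlib
import struct

REGTEST_MAGIC = b"\xfa\xbf\xb5\xda"


def sha256d(data):
    """Double SHA-256, as used for the message checksum."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def frame_message(command, payload, magic=REGTEST_MAGIC):
    """Build header + payload for one message. command and payload are bytes."""
    if len(command) > 12:
        raise ValueError("command too long")
    header = magic
    header += command.ljust(12, b"\x00")       # null-padded command name
    header += struct.pack("<I", len(payload))  # little-endian payload length
    header += sha256d(payload)[:4]             # first 4 bytes of the hash
    return header + payload


def parse_message(data, magic=REGTEST_MAGIC):
    """Inverse of frame_message; raises ValueError on malformed input."""
    if data[:4] != magic:
        raise ValueError("bad magic")
    command = data[4:16].rstrip(b"\x00")
    (length,) = struct.unpack("<I", data[16:20])
    checksum = data[20:24]
    payload = data[24:24 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    if sha256d(payload)[:4] != checksum:
        raise ValueError("bad checksum")
    return command, payload


raw = frame_message(b"verack", b"")
print(parse_message(raw))
```

A real client also has to buffer partial reads from the socket and serialize each payload type; this sketch covers only the fixed 24-byte header.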

maxblocksinflight.py is an example test using this framework that tests whether
a node is limiting the maximum number of in-flight block requests.

This also adds support to util.py for specifying the binary to use when
starting nodes (for tests that compare the behavior of different bitcoind
versions), and adds maxblocksinflight.py to the pull tester.

comptool.py creates a tool for running a test suite on top of the mininode p2p
framework.  It supports two types of tests: those for which we expect certain
behavior (acceptance or rejection of a block or transaction) and those for
which we are just comparing that the behavior of 2 or more nodes is the same.
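The two kinds of tests described above can be sketched as a single check. This is a simplified illustration, not the logic in comptool.py: `judge` and its arguments are hypothetical, with each node's verdict reduced to a boolean (True meaning the node accepted the block or transaction).

```python
# Hypothetical sketch of the two test styles; not taken from comptool.py.


def judge(results, expected=None):
    """Decide whether a test instance passes.

    results:  one boolean per node, True if that node accepted the object.
    expected: True or False when a specific outcome is required, or None
              when the nodes only need to agree with each other.
    """
    if not results:
        raise ValueError("need at least one node result")
    if expected is None:
        # Comparison mode: every node must match the first one.
        return all(r == results[0] for r in results)
    # Expected-outcome mode: every node must match the stated outcome.
    return all(r == expected for r in results)


print(judge([False], expected=False))  # single node, rejection expected
print(judge([True, True]))             # two nodes agree
print(judge([True, False]))            # two nodes disagree
```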

blockstore.py defines BlockStore and TxStore, which provide db-backed maps
between block/tx hashes and the corresponding block or tx.

blocktools.py defines utility functions for creating and manipulating blocks
and transactions.

invalidblockrequest.py is an example test in the comptool framework, which
tests the behavior of a single node when sent two different types of invalid
blocks (a block with a duplicated transaction and a block with a bad coinbase
value).

script.py is modified from the code in python-bitcoinlib, and provides tools
for manipulating and creating CScript objects.

bignum.py is a dependency for script.py

script_test.py is an example test that uses the script tools for running a test
that compares the behavior of two nodes, in a comptool-style test, for each of
the test cases in the bitcoin unit test script files, script_valid.json and
script_invalid.json.  (This test is very slow to run, but is a proof of concept
for how we can write tests to compare consensus-critical behavior between
different versions of bitcoind.)

bipdersig-p2p.py is another example test in the comptool framework, which tests
deployment of BIP DERSIG for a single node.  It uses the script.py tools for
manipulating signatures to be non-DER compliant.
@sdaftuar
Member Author

Well, on the first attempt, travis failed again with the 10-minute no-data-received timeout, but this time it was not involving the new p2p tests (it died while the java comparison tool was running). I amended the last commit to force a re-run, and it ran successfully, so now I am a little more willing to believe that these may just be random travis failures that I'm encountering, and not a problem in the code introduced here.

@theuni
Member

theuni commented Apr 28, 2015

I came too late.. what was the original error?
Edit: nevermind, posted same time.

Adds printing to the console before/after calls to bitcoin-cli -rpcwait,
if the PYTHON_DEBUG environment variable is initialized.
@sdaftuar
Member Author

After discussing with @theuni on IRC, I added a commit that adds some optional debugging (which I enabled for travis) to calls to bitcoin-cli -rpcwait (inside util.py), and then another addressing the bug he caught with the default binary names.
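The optional debugging described here amounts to printing before and after an external call when an environment variable is set. A rough sketch of that pattern follows; `debug_enabled` and `run_logged` are hypothetical helpers rather than the code added to util.py, and a harmless Python subprocess stands in for bitcoin-cli -rpcwait.

```python
# Hypothetical sketch of environment-controlled debug output around a
# subprocess call; not the code added to util.py.
import os
import subprocess
import sys


def debug_enabled():
    """True when PYTHON_DEBUG is set to a non-empty value."""
    return bool(os.getenv("PYTHON_DEBUG", ""))


def run_logged(args):
    """Run a command, printing before and after it when debugging is on."""
    if debug_enabled():
        print("calling: " + " ".join(args))
    subprocess.check_call(args)
    if debug_enabled():
        print("completed: " + " ".join(args))


# Stand-in command; the real call would be bitcoin-cli -rpcwait.
run_logged([sys.executable, "-c", "pass"])
```

Because the output brackets the blocking call, a stalled build log would show which invocation never returned.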

@laanwj
Member

laanwj commented Apr 29, 2015

ACK

@laanwj laanwj merged commit 2703412 into bitcoin:master Apr 30, 2015
laanwj added a commit that referenced this pull request Apr 30, 2015
2703412 Fix default binary in p2p tests to use environment variable (Suhas Daftuar)
29bff0e Add some travis debugging for python scripts (Suhas Daftuar)
d76412b Add script manipulation tools for use in mininode testing framework (Suhas Daftuar)
b93974c Add comparison tool test runner, built on mininode (Suhas Daftuar)
6c1d1ba Python p2p testing framework (Suhas Daftuar)
@jonasschnelli
Contributor

1st: this is impressive work. Nice!
I think the quick merge honors this job.

Nevertheless here are some first post-ACK reports (more detailed report will follow):

Running maxblocksinflight.py does fail on my local machine:

Running testscript maxblocksinflight.py...
Initializing test directory /var/folders/hp/kb9p9q8x4k3_z_ccy588hxrc0000gn/T/testVtWBqX
MiniNode: Connecting to Bitcoin Node IP # 127.0.0.1:11650
Round 0: success (total requests: 8)
Round 1: success (total requests: 16)
Round 2: success (total requests: 16)
Round 3: success (total requests: 16)
Stopping nodes
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/Users/jonasschnelli/Documents/bitcoin/_bitcoin/qa/rpc-tests/mininode.py", line 1237, in run
    asyncore.loop(0.1, True)
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/asyncore.py", line 216, in loop
    poll_fun(timeout, map)
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/asyncore.py", line 145, in poll
    r, w, e = select.select(r, w, e, timeout)
error: (9, 'Bad file descriptor')

same for invalidblockrequest.py

Running testscript invalidblockrequest.py...
Initializing test directory /var/folders/hp/kb9p9q8x4k3_z_ccy588hxrc0000gn/T/testqJOevS
MiniNode: Connecting to Bitcoin Node IP # 127.0.0.1:11506
Test 1: PASS [1]
Test 2: PASS [101]
Assertion failed: Test failed at test 3
  File "/Users/jonasschnelli/Documents/bitcoin/_bitcoin/qa/rpc-tests/test_framework.py", line 119, in main
    self.run_test()
  File "/Users/jonasschnelli/Documents/bitcoin/_bitcoin/qa/rpc-tests/invalidblockrequest.py", line 39, in run_test
    test.run()
  File "/Users/jonasschnelli/Documents/bitcoin/_bitcoin/qa/rpc-tests/comptool.py", line 284, in run
    raise AssertionError("Test failed at test %d" % test_number)
Stopping nodes

''' Can either run this test as 1 node with expected answers, or two and compare them.
Change the "outcome" variable from each TestInstance object to only do the comparison. '''
def __init__(self):
    self.num_nodes = 1
Contributor

Is it somehow possible to set this to 2 if --refbinary is set?
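One way this could work, sketched under assumptions: derive the node count from the parsed options rather than hard-coding it. The `pick_num_nodes` helper and the use of `argparse` are illustrative only; how the test framework actually parses and exposes --refbinary is not shown in this thread.

```python
# Hypothetical sketch: choose the node count from the command line.
import argparse


def pick_num_nodes(argv):
    """Return 2 when a reference binary is given, otherwise 1."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--refbinary", default=None,
                        help="bitcoind binary to compare against")
    options = parser.parse_args(argv)
    return 2 if options.refbinary else 1


print(pick_num_nodes([]))
print(pick_num_nodes(["--refbinary", "/path/to/reference/bitcoind"]))
```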

@sdaftuar
Member Author

@jonasschnelli Thanks, and thanks for running the tests too. Looks to me like the first error message is happening in how the test cleans up, that probably just needs to be made more robust; I'll take a look. The second error message is more concerning though, as the substantive part of the test is failing.

Could you run invalidblockrequest.py --tracerpc (which will produce a lot of debug output, basically every p2p message sent or received, along with the rpc calls) and upload it somewhere for me to analyze? That should help me debug what's going on here.


class NetworkThread(Thread):
    def run(self):
        asyncore.loop(0.1, True)
Contributor

After discussing this on IRC, it turned out that OSes without polling support (OSX) have problems with timeouts below 1 because they always use select.select().
Should be changed to asyncore.loop(1, True)

@jonasschnelli
Contributor

Post merge ACK

@sdaftuar
Member Author

@jonasschnelli FYI - I did some more digging, and I think I've identified a race condition in the invalidblockrequest.py test (which you triggered the first time you ran, and got the assertion failure at test 3).

I'll PR a cleanup of these issues shortly.
