New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python P2P testing framework #5981
Conversation
Nice! |
Nice work!
I'd rather not add any external dependencies for the tests. Do we need any kind of persistent block storage in the tests? If we really do, I'd rather hack something together with python. |
@laanwj I'd prefer to leave the disk-backed storage there, even though it's probably not needed for the existing tests I've included in this pull, because it will be needed to write longer tests like the comparison test which currently runs in the pull-tester. I pushed up a commit that replaces leveldb with dbm, which I believe is a standard python package; does that seem like an acceptable option? |
Sure, using a python standard package is fine, my gripe was with using an external dependency. Using disk-backed storage is no problem as long as subsequent tests don't interfere with each other. |
6070e7b
to
9fbbdb9
Compare
Maybe add the new tests (whenever they don't take too long) to the pull tester e.g. The
|
9fbbdb9
to
cde34ff
Compare
@laanwj The I've also shortened EDIT: I see that this errored out in travis; will investigate. |
Ah yes, the The travis error seems to occur on 64-bit Linux - no detailed information unfortunately
|
I'd really like to merge this. Do you have trouble solving the travis error? If so, maybe @theuni can help. |
b7f3812
to
96c34c1
Compare
This needed to be rebased anyway, so I did that and now travis succeeds. I'm not sure what to make of that, since I didn't actually fix anything, and it seems odd to think that this was just a spurious travis failure that happened to only affect a new test I've introduced in this pull. I'll try rebasing again, since I wanted to fold the 4th and 5th commits in anyway, and when I push we'll get one more data point about how travis does with the new tests. |
mininode.py provides a framework for connecting to a bitcoin node over the p2p network. NodeConn is the main object that manages connectivity to a node and provides callbacks; the interface for those callbacks is defined by NodeConnCB. Defined also are all data structures from bitcoin core that pass on the network (CBlock, CTransaction, etc), along with de-/serialization functions. maxblocksinflight.py is an example test using this framework that tests whether a node is limiting the maximum number of in-flight block requests. This also adds support to util.py for specifying the binary to use when starting nodes (for tests that compare the behavior of different bitcoind versions), and adds maxblocksinflight.py to the pull tester.
comptool.py creates a tool for running a test suite on top of the mininode p2p framework. It supports two types of tests: those for which we expect certain behavior (acceptance or rejection of a block or transaction) and those for which we are just comparing that the behavior of 2 or more nodes is the same. blockstore.py defines BlockStore and TxStore, which provide db-backed maps between block/tx hashes and the corresponding block or tx. blocktools.py defines utility functions for creating and manipulating blocks and transactions. invalidblockrequest.py is an example test in the comptool framework, which tests the behavior of a single node when sent two different types of invalid blocks (a block with a duplicated transaction and a block with a bad coinbase value).
96c34c1
to
17f463d
Compare
script.py is modified from the code in python-bitcoinlib, and provides tools for manipulating and creating CScript objects. bignum.py is a dependency for script.py script_test.py is an example test that uses the script tools for running a test that compares the behavior of two nodes, in a comptool- style test, for each of the test cases in the bitcoin unit test script files, script_valid.json and script_invalid.json. (This test is very slow to run, but is a proof of concept for how we can write tests to compare consensus-critical behavior between different versions of bitcoind.) bipdersig-p2p.py is another example test in the comptool framework, which tests deployment of BIP DERSIG for a single node. It uses the script.py tools for manipulating signatures to be non-DER compliant.
17f463d
to
d76412b
Compare
Well, on the first attempt, travis failed again with the 10-minute no-data-received timeout, but this time it was not involving the new p2p tests (it died while the java comparison tool was running). I amended the last commit to force a re-run, and it ran successfully, so now I am a little more willing to believe that these may just be random travis failures that I'm encountering, and not a problem in the code introduced here. |
I came too late.. what was the original error? |
Adds printing to the console before/after calls to bitcoin-cli -rpcwait, if the PYTHON_DEBUG environment variable is initialized.
After discussing with @theuni on IRC, I added a commit that adds some optional debugging (which I enabled for travis) to calls to bitcoin-cli -rpcwait (inside util.py), and then another addressing the bug he caught with the default binary names. |
ACK |
2703412 Fix default binary in p2p tests to use environment variable (Suhas Daftuar) 29bff0e Add some travis debugging for python scripts (Suhas Daftuar) d76412b Add script manipulation tools for use in mininode testing framework (Suhas Daftuar) b93974c Add comparison tool test runner, built on mininode (Suhas Daftuar) 6c1d1ba Python p2p testing framework (Suhas Daftuar)
1st: this is impressive work. Nice! Nevertheless here are some first post-ACK reports (more detailed report will follow): Running
same for invalidblockrequest.py
|
''' Can either run this test as 1 node with expected answers, or two and compare them. | ||
Change the "outcome" variable from each TestInstance object to only do the comparison. ''' | ||
def __init__(self): | ||
self.num_nodes = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it somehow possible to set this to 2 if --refbinary
is set?
@jonasschnelli Thanks, and thanks for running the tests too. Looks to me like the first error message is happening in how the test cleans up, that probably just needs to be made more robust; I'll take a look. The second error message is more concerning though, as the substantive part of the test is failing. Could you run |
|
||
class NetworkThread(Thread): | ||
def run(self): | ||
asyncore.loop(0.1, True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussion this on IRC it turned out, that OSes without polling support (OSX) have problems with timeouts below 1 because they always use select.select()
.
Should be changed to asyncore.loop(1, True)
Post merge ACK |
@jonasschnelli FYI - I did some more digging, and I think I've identified a race condition in the invalidblockrequest.py test (which you triggered the first time you ran, and got the assertion failure at test 3). I'll PR a cleanup of these issues shortly. |
Motivated by the discussion in #4545, I've been working on building a python testing framework that will allow us to write tests that exercise the p2p code in bitcoind, so that we can perform end-to-end testing and perform comparisons of behavior across bitcoind versions. This code builds on the existing RPC testing functionality, so that tests using both RPC calls and p2p messages can be written.
I've split this pull into 3 commits to try to make this easier to review.
It's possible to write (crude) tests using mininode.py alone, if you want. Also, mininode supports testing outside of regtest (it can communicate on testnet and mainnet as well), so that theoretically allows for a broader category of testing than we currently can do. All the tests I've been working on have been focused on regtest, however.
Also, in the first commit, I provide one example test (maxblocksinflight.py) that shows how you can use mininode. This is largely a proof-of-concept example, but it does test something useful: 0.10 used to fail this test until #5507 was merged.
EDIT: I forgot to mention that I use py-leveldb in blockstore.py, a file I introduced in this commit that provides disk-backed storage for blocks. I think I had to manually install that package on my machine, so I wanted to flag this dependency in case others think this would be a problem.
If this framework looks okay, then I think a future project would be to re-implement the pull-tester's comparison test in this framework.
I realize this is a lot of code, so if there's anything more I can do to aid in review please let me know.