Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build previous releases and run functional tests #12134

Open
wants to merge 5 commits into
base: master
from

Conversation

Projects
None yet
@Sjors
Copy link
Member

commented Jan 9, 2018

Includes test for upgrading v0.17.1 wallets and opening v0.18.0 wallets with older versions.

Additional scenarios where it's useful to run tests against earlier releases:

  • creating a wallet with #11403's segwit implementation, copying it to an older node and making sure the user didn't lose any funds (although this PR doesn't support v0.15.1)
  • future consensus changes
  • P2P changes (e.g. to make sure we don't accidentally ban old nodes)

Usage:

contrib/devtools/backwards_compatibility.py -b -f v0.17.1
test/functional/backwards_compatibility.py

Having to build these old versions on developer machines is unacceptably slow and brittle, so tests that involve older versions don't run by default. Travis can cache earlier releases, so it should be able to run these tests with little performance impact.

@ryanofsky

This comment has been minimized.

Copy link
Contributor

commented Jan 9, 2018

Can you explain why exactly the patching is needed in this case? It would definitely seem better to test against unpatched previous releases if possible.

@Sjors

This comment has been minimized.

Copy link
Member Author

commented Jan 9, 2018

@ryanofsky the regtest parameters were changed to activate SegWit at the genesis block, which causes nodes after #11389 to reject blocks created by v0.15.1 and older nodes (and vice versa). Maybe there's a less drastic way to change the regtest consensus params for these older releases?

@ryanofsky

This comment has been minimized.

Copy link
Contributor

commented Jan 9, 2018

Not sure, but is passing -vbparams=segwit:0:0 an option?

@sdaftuar

This comment has been minimized.

Copy link
Member

commented Jan 9, 2018

@Sjors I've spent some time looking into how feasible it would be to write functional tests that would run against older versions of our software, which I think would be great if we could pull off -- for instance, I thought it would be nice in something like the p2p-segwit.py test for us to use an actual old binary in the upgrade_after_activation test, rather than mocking it with a current binary by setting the bip9 parameters to 0.

However, in my experience this proved to be impractical. It seems like we periodically make changes to the RPC test framework infrastructure itself, in such ways that cause problems with backward compatibility. Here are a few examples I have run into, so you get a flavor of some of the issues:

  • sync_blocks() was rewritten to use an rpc call, waitforblockheight, that was introduced in 0.14. So no test that uses sync_blocks can run on a pre-0.14 version of the code (which is basically every test we have!).

  • Even if you solve that problem somehow (eg by rewriting sync_blocks, or avoiding it somehow), we also introduced in 0.14 support for RPC named arguments. This also breaks compatibility with pre-0.14 software (so without a patch, I believe no RPC calls work at all).

  • I might have the details of this one slightly wrong, but I think we made a change at some point to support parsing numbers as strings, versus floats, and then I think our JSON library started to take advantage of that as well? Not sure if that's exactly right, but I remember having an issue running newer tests against older releases when using RPC calls involving numbers, and needing a patch for that.

So my takeaway from this exercise was that if we want to support running modern tests against older binaries, we need to change the way we do things, so that we (a) have a better split between the python test code, and the python utilities/RPC handlers/etc that allow us to talk to a node, and (b) come up with a way to use older python utility code when talking to older nodes.

@ryanofsky

This comment has been minimized.

Copy link
Contributor

commented Jan 9, 2018

So no test that uses sync_blocks can run on a pre-0.14 version of the code (which is basically every test we have!).

I don't understand why this would matter. I think the goal here is just to be able to write a small test that brings up a specific version of bitcoind (0.15) and tests a few things with it, not to be able to run large swathes of the test framework against arbitrary bitcoin releases. It doesn't seem it would require a lot of code to support this, and if the tests are run on travis it doesn't seem like whatever code is added could get broken easily.

@sdaftuar

This comment has been minimized.

Copy link
Member

commented Jan 9, 2018

@ryanofsky Perhaps I phrased that sentence poorly -- my point wasn't that we'd want to run all the tests against older code (I think that would be absurd), just that we take it for granted that we can call sync_blocks in all sorts of places, so to not have such a basic utility at your disposal can be surprising and hard to work around.

Another way to look at it: imagine you wanted to run a test against 0.15 now, so you write one and you can even use sync_blocks and everything works fine (I agree that this shouldn't be too hard, as 0.15 was very recent). Based on our history, I imagine there will come a time when someone wants to do something which will break the test, because some utility code -- like sync_blocks, or the way we make RPC calls, or something else -- will change, and the test will be busted, and we won't have a good way to fix it.

@jnewbery

This comment has been minimized.

Copy link
Member

commented Jan 9, 2018

This seems possible for small, limited tests which don't use the full range of test framework capabilities (for example, if we just want to test downgrading to v0.15 to test segwit wallets). As Suhas has pointed out, there are lots of reasons why this is difficult in the general case, but I think we can work around them for a targeted test cases:

sync_blocks() was rewritten to use an rpc call, waitforblockheight, that was introduced in 0.14.

Not relevant if we only want to test as far back as v0.15, but we can add a waitforblockheight method to TestNode for versions that don't have a waitforblockheight RPC method.

Even if you solve that problem somehow (eg by rewriting sync_blocks, or avoiding it somehow), we also introduced in 0.14 support for RPC named arguments.

We could be careful to not use named arguments in tests where we want to use earlier versions. Or we could add a shim to TestNode to convert named -> positional.

I think we made a change at some point to support parsing numbers as strings, versus floats, and then I think our JSON library started to take advantage of that as well?

As above.

I imagine there will come a time when someone wants to do something which will break the test, because some utility code -- like sync_blocks, or the way we make RPC calls, or something else -- will change, and the test will be busted, and we won't have a good way to fix it.

I expect we'll only want to have tests covering the last one or two releases, so I don't see this as a huge problem.

I think the goal here is ... not to be able to run large swathes of the test framework against arbitrary bitcoin releases.

Yes - I agree that would be almost impossible.

@Sjors - if the goal is only to test official releases, why not download the binaries directly from https://bitcoincore.org/bin/ rather than building them locally?

@sdaftuar

This comment has been minimized.

Copy link
Member

commented Jan 9, 2018

This is an aside if the goal here is just to make 0.15 work, but:

We could be careful to not use named arguments in tests where we want to use earlier versions. Or we could add a shim to TestNode to convert named -> positional.

The change to support named args actually affected RPC invocations that don't use named args -- I think we send an empty dictionary instead of an empty list in authproxy.py which an old server wouldn't accept.

Anyway I don't mean to be overly pessimistic if you all think we can make something work! I'm definitely in favor of the overall goal, I just haven't come across a simple solution that I thought was robust.

@sdaftuar

This comment has been minimized.

Copy link
Member

commented Jan 9, 2018

I expect we'll only want to have tests covering the last one or two releases, so I don't see this as a huge problem.

I don't follow this point though -- in something like the p2p-segwit.py test, you'd have to do the upgrade after activation test on an 0.13.0 node or earlier. So if we'd written the test in such a way as to use an old binary, we'd have to just decommission that test at some point...? Obviously replacing with a later binary (that has the feature you want to test upgrade from) would mean you can't do that test anymore.

@jnewbery

This comment has been minimized.

Copy link
Member

commented Jan 9, 2018

in something like the p2p-segwit.py test, you'd have to do the upgrade after activation test on an 0.13.0 node or earlier. So if we'd written the test in such a way as to use an old binary, we'd have to just decommission that test at some point...?

Yes - my expectation is that we'd decommission that test at some point. I don't think that testing versions older than a couple of releases is something that we should do as part of travis or developer local testing (although there's nothing stopping anyone from having a custom test rig to do that sort of thing).

@Sjors Sjors force-pushed the Sjors:previous-release branch 14 times, most recently Jan 10, 2018

@Sjors

This comment has been minimized.

Copy link
Member Author

commented Jan 10, 2018

@jnewbery compiling from source seems more flexible than fetching binaries, especially if we need to patch things. There might also be configure flags that are optimal for the functional tests. It can be expanded to fetch code from other repositories (on your own test rig).

Downloading binaries would be faster, but it also involves figuring out which OS you're on and checking the checksum, so it's not necessarily easier to write a script for. That said, the one-off build is painfully slow on Travis atm, so I might add an option to just fetch a binary.

@ryanofsky -vbparams=segwit:0:0 -vbparams=csv:0:0 did the trick, as long as I generate blocks on the modern node (which is fine). So no patch needed for v0.15.1 at the moment.

@sdaftuar I ran into the issue you mentioned for versions before v0.14 and added a comment to the test file to warn about that.

I changed the demonstration test to simply check if old nodes sync blocks from the new node. I'll try some SegWit wallet related scenarios in a different PR.

I added a commit that makes one Travis instance compile v0.14.2 and v0.15.1 (cached) and then run the relevant functional test. It works, but I'm a bit confused as to what the right place is on Travis to put these older builds.

.travis.yml Outdated
@@ -7,6 +7,7 @@ cache:
- depends/built
- depends/sdk-sources
- $HOME/.ccache
- build/releases

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Jan 10, 2018

Member

That cache should be covered by .ccache, no?

This comment has been minimized.

Copy link
@Sjors

Sjors Jan 11, 2018

Author Member
  • will look into ccache behavior.

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Jan 11, 2018

Member

Forget what I said. It is probably better to cache the resulting binary than the individual obj files, since we are building specific tags that don't change. For master or other branch build, the commits often only change parts of the code, so the other obj files can be cached and re-used. For tag builds everything is static, so caching the binaries works as well.

@MarcoFalke
Copy link
Member

left a comment

Concept ACK. Just noting that every change to the rpc interface is a breaking change in compatibility tests, that potentially needs attention. So we need to run those tests in the pull request travis run.

test/functional/backwards_compatibility.py Outdated
"""Override this method to customize test node setup"""
self.add_nodes(self.num_nodes, self.extra_args, None, None, [
None,
"build/releases/v0.15.1/src/bitcoind",

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Jan 10, 2018

Member

Would be nice to be able to pass those paths in. Could overwrite add_options in test_framework.

This comment has been minimized.

Copy link
@Sjors

Sjors Jan 11, 2018

Author Member

Do you mean making --srcdir accept multiple paths? That's probably a useful PR by itself, I made a note.

test/functional/backwards_compatibility.py Outdated
]

def setup_nodes(self):
"""Override this method to customize test node setup"""

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Jan 10, 2018

Member

Comment seems misplaced

This comment has been minimized.

Copy link
@Sjors

Sjors Jan 11, 2018

Author Member

Removed.

test/functional/backwards_compatibility.py Outdated

def setup_nodes(self):
"""Override this method to customize test node setup"""
self.add_nodes(self.num_nodes, self.extra_args, None, None, [

This comment has been minimized.

Copy link
@MarcoFalke

MarcoFalke Jan 10, 2018

Member

Use named args for those unnamed arguments to provide documentation and robustness against future interface changes.

This comment has been minimized.

Copy link
@Sjors

Sjors Jan 11, 2018

Author Member

Done.

@jonasschnelli

This comment has been minimized.

Copy link
Member

commented Jan 10, 2018

Concept ACK, haven't looked at the code.

I think not testing with older versions is the biggest lack in our test env.
Things like the UTXO migration (now done) or testing how "older" nodes act on the (new) p2p network.
Also, up- and downgrading wallets is not covered by our tests and could be with something like this.

@fanquake fanquake added the Tests label Jan 11, 2018

@meshcollider

This comment has been minimized.

Copy link
Member

commented Jan 11, 2018

Concept ACK, cross version testing will be a very nice addition to the testing suite IMO

@Sjors

This comment has been minimized.

Copy link
Member Author

commented Jan 5, 2019

Rebased for v0.17.1

ef7c54c1eceb0eb938dd4e8312283c7433fe0c8f added a wait argument to stop (cc @promag), which causes the v0.17.1 node to fail upon shutdown. I worked around this by adding an optional version to TestNode().

@Sjors Sjors force-pushed the Sjors:previous-release branch 2 times, most recently Jan 5, 2019

Show resolved Hide resolved .travis/test_06_script.sh Outdated

@DrahtBot DrahtBot removed the Needs rebase label Jan 6, 2019

@Sjors Sjors force-pushed the Sjors:previous-release branch to 6df9a48 Jan 7, 2019

@jnewbery

This comment has been minimized.

Copy link
Member

commented Jan 7, 2019

@Sjors - what's your intention for this branch? Do you want it to get merged into master, or are you maintaining it as a separate branch for your own testing? (if so perhaps you could add a [do not merge] tag to the PR title)

The version logic that you've added to TestNode is fairly limited now, but I expect that if you want to maintain this branch and have it remain compatible with different versions of bitcoind, the complexity will start to creep up. I think this kind of testing is useful, particularly for any future softfork, but my personal preference would still be to not have this merged into master.

@Sjors

This comment has been minimized.

Copy link
Member Author

commented Jan 9, 2019

@jnewbery I would like it to get merged, but I'm willing to maintain it for the time being even if it doesn't get merged. I think people were worried about maintainability if this is merged. So far it's not too bad in my experience.

I don't think the goal is to support an infinite number of old bitcoind versions, so complexity can probably be kept low by dropping old versions and simplifying the test patches.

@Sjors

This comment has been minimized.

Copy link
Member Author

commented Feb 2, 2019

I'll rebase after 0.18 binaries are up, unless we want to merge this sooner.

I might also add a test for #15226: create a blank wallet in 0.18 and master and check that they can't be opened by 0.17.

@Sjors Sjors force-pushed the Sjors:previous-release branch from 6df9a48 to 41cbc9a Feb 20, 2019

@DrahtBot DrahtBot removed the Needs rebase label Feb 20, 2019

@Sjors Sjors force-pushed the Sjors:previous-release branch from 41cbc9a to 11b758f Feb 20, 2019

@Sjors

This comment has been minimized.

Copy link
Member Author

commented Feb 20, 2019

Rebased and prepared for v0.18.0 release (just need to uncomment a few lines after the binaries are up, but it can be merged before that if we want).

I added tests for opening regular, private key disabled and blank wallets in v0.17.1 (cc @achow101, @instagibbs). They work by creating wallets in v0.18.0 and on master, copying and then opening with v0.18.0 and v0.17.1.

I also added a test to upgrade a v0.17.1 wallet to master, in particular to check if the key origin data is upgraded.

@Sjors Sjors force-pushed the Sjors:previous-release branch from 11b758f to 547f0e5 Feb 20, 2019

@Sjors Sjors force-pushed the Sjors:previous-release branch from 547f0e5 to c28fc7b Mar 7, 2019

@DrahtBot DrahtBot removed the Needs rebase label Mar 7, 2019

Sjors added some commits Jul 6, 2018

@Sjors Sjors force-pushed the Sjors:previous-release branch from c28fc7b to ddeba7b May 3, 2019

@Sjors

This comment has been minimized.

Copy link
Member Author

commented May 3, 2019

It now uses v0.18.0 release.

@practicalswift

This comment has been minimized.

Copy link
Member

commented May 3, 2019

Concept ACK

Nice work!

@instagibbs

This comment has been minimized.

Copy link
Member

commented Jun 12, 2019

and the test will be busted, and we won't have a good way to fix it

I think this is an argument for keeping the scope reduced but not convincing to keep out entirely. I sincerely hope that the functional suite can support semi-modern versions with respect to block sync checks, for example.

concept ACK

lack of wallet upgrade tests are some of the most painful points right now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.