Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: create deterministic addrman in the functional tests #29007

Merged
merged 7 commits into from Mar 11, 2024

Conversation

stratospher
Copy link
Contributor

@stratospher stratospher commented Dec 6, 2023

An address is placed in a [bucket,position] in the addrman table (new table or tried table) using the addpeeraddress RPC. This [bucket,position] is calculated using nKey(and other metrics) for the addrman which is chosen randomly during every run.

Supposing there are 2 addresses to be placed in an addrman table. During every test run, a different [bucket,position] would be calculated for each address.These calculated [bucket,position] could even be the same for the 2 addresses in some test runs and result in collisions in the addrman. We wouldn't be able to predict when the collisions are going to happen because we can't predict the nKey value which is chosen at random. This can cause flaky tests.

Because of these non deterministic collisions, we are limited in what we can do to test addrman functionality. Currently in our tests don't add a second address to prevent these collisions from happening - we only keep 1 address in the new table and 1 address in the tried table. See #26988 (comment), #23084, #22831(comment).

This PR lets us create a deterministic addrman with fixed nKey so that we can know the [bucket,position] collisions beforehand, safely add more addresses in an addrman table and write more extensive tests.

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 6, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK 0xB10C, amitiuttarwar, mzumsande, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28998 (rpc: addpeeraddress tried return error on failure by 0xB10C)
  • #28802 (ArgsManager: support subcommand-specific options by ajtowns)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

src/init.cpp Outdated
@@ -579,7 +579,7 @@ void SetupServerArgs(ArgsManager& argsman)
argsman.AddArg("-limitancestorsize=<n>", strprintf("Do not accept transactions whose size with all in-mempool ancestors exceeds <n> kilobytes (default: %u)", DEFAULT_ANCESTOR_SIZE_LIMIT_KVB), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-limitdescendantcount=<n>", strprintf("Do not accept transactions if any ancestor would have <n> or more in-mempool descendants (default: %u)", DEFAULT_DESCENDANT_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-limitdescendantsize=<n>", strprintf("Do not accept transactions if any ancestor would have more than <n> kilobytes of in-mempool descendants (default: %u).", DEFAULT_DESCENDANT_SIZE_LIMIT_KVB), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-addrmantest", "Allows to test address relay on localhost", ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-addrtest=<option>", strprintf("Test general address related functionalities. Only used in tests. When -addrtest=relay is specified, test address relay on localhost. When -addrtest=addrman is specified, use a deterministic addrman. (default: %s)", "relay"), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this can be generalized to pass test-only options (along with a helper HasTestOption, similar to IsDeprecatedRPCEnabled?

The user-facing debug-help output of this seems the wrong place to document test-only options, and the source code seems sufficient to be self-documenting.

Suggested change
argsman.AddArg("-addrtest=<option>", strprintf("Test general address related functionalities. Only used in tests. When -addrtest=relay is specified, test address relay on localhost. When -addrtest=addrman is specified, use a deterministic addrman. (default: %s)", "relay"), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-test=<option>", "Pass a test-only option", ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, (default: %s)", "relay" is wrong, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this can be generalized to pass test-only options (along with a helper HasTestOption, similar to IsDeprecatedRPCEnabled?

what would a function like HasTestOption do though? we could introduce it here to check if we're using a known test option("relay", "addrman"). even if we use unknown test options, we don't need to throw an error right.

In any case, (default: %s)", "relay" is wrong, no?

done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would a function like HasTestOption do though? we could introduce it here to check if we're using a known test option("relay", "addrman"). even if we use unknown test options, we don't need to throw an error right.

It would just be a one-line function to hold the code you already wrote in two places:

return any_of(args.GetArgs("test"), test_option);

Copy link
Contributor

@mzumsande mzumsande Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting to move other existing test-only options (-acceptstalefeeestimates or -fastprune come to mind) under this command?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. (Not here, obviously, but in a follow-up possibly)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would just be a one-line function to hold the code you already wrote in two places:

oh got it now.

I wonder if this can be generalized to pass test-only options (along with a helper HasTestOption, similar to IsDeprecatedRPCEnabled?

good idea - liked how we can make sure users don't accidentally use these test-only options in their config file.
done.

src/addrdb.cpp Outdated Show resolved Hide resolved
test/functional/rpc_net.py Outdated Show resolved Hide resolved
@stratospher
Copy link
Contributor Author

thanks @maflcko, @0xB10C, @brunoerg. I've updated the PR to address your comments. Open to more thoughts/suggestions in #29007 (comment).

Comment on lines 457 to 477
},
{
"address": "fc00:1:2:3:4:5:6:7",
"port": 8333,
"services": 9,
"network": "cjdns",
"source": "fc00:1:2:3:4:5:6:7",
"source_network": "cjdns",
},
{
"address": "c4gfnttsuwqomiygupdqqqyy5y5emnk5c73hrfvatri67prd7vyq.b32.i2p",
"port": 8333,
"services": 9,
"network": "i2p",
"source": "c4gfnttsuwqomiygupdqqqyy5y5emnk5c73hrfvatri67prd7vyq.b32.i2p",
"source_network": "i2p",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about how to clear the addrman tables in a functional test. I think it would be ideal if we could avoid the addpeeraddress test leaking into getaddrmaninfo and getrawaddrman tests.

This also becomes relevant in the context of #28998 where we might want to produce a tried table collision on purpose to have coverage of the addpeeraddress RPC failure path. The easiest way of producing a collision is by adding more and more addresses to the tried table (as we do in the unit test). This would leave us with quite a few additional addresses already in the tried table.

I currently see two possibilities for clearing the addrman tables between tests:

  1. shutdown the node, delete peers.dat, and restart
  2. have a third node for this test that's only used for getrawaddrman.

Copy link
Contributor

@0xB10C 0xB10C Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're doing option 1. in the feature_addrman.py functional test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be ideal if we could avoid the addpeeraddress test leaking into getaddrmaninfo and getrawaddrman tests.

it would be cleaner to not need to know the existing context of addrman when dealing with getaddrmaninfo/getrawaddrman tests. both the options sound good to me!

The easiest way of producing a collision is by adding more and more addresses to the tried table (as we do in the unit test). This would leave us with quite a few additional addresses already in the tried table.

these collisions would only affect future tests added after getaddrmaninfo/getrawaddrman tests though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The easiest way of producing a collision is by adding more and more addresses to the tried table (as we do in the unit test). This would leave us with quite a few additional addresses already in the tried table.

these collisions would only affect future tests added after getaddrmaninfo/getrawaddrman tests though.

We'd need to clean up after we found a collision.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Needs rebase due to a silent conflict with #27581 (feature_asmap.py).

src/net.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@amitiuttarwar amitiuttarwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approach ACK. yay for finally having addrman determinism in the functional tests 🙌🏽
couple thoughts to consider-

1. clearer subtest addrman setup in rpc_net.py

@0xB10C

I think it would be ideal if we could avoid the addpeeraddress test leaking into getaddrmaninfo and getrawaddrman tests.

agreed. right now, the 3 subtests (test_addpeeraddress, test_getaddrmaninfo & test_getrawaddrman) have a lot of implicit dependencies, making the overall test harder to understand & develop on.

here's a suggestion with the intention of making the addrman setup more clear in each subtest:

  • introduce a helper, eg. seed_addrman which uses addpeeraddress to add addresses from all the different networks to addrman.
  • in the beginning of test_getaddrmaninfo & test_getrawaddrman, get a clean addrman (by using either method mentioned by @0xB10C) and call seed_addrman to populate it in an expected way. then each subtest would check the results with the relevant RPC

2. remove addrmantest / test=localaddr

I am not convinced that this test-only behavior is worth maintaining. the only thing that p2p_node_network_limited.py is testing is that 1. an addr message is received & 2. the expected services are advertised. # 1 is forced to true via the test-only codepath in GetLocalAddrForPeer, and the value of # 2 is nominal due to how easily the test-only code could diverge from the mainnet logic.

My vote is to remove this functionality & related code. Curious to hear what you think @stratospher, as well as the opinions of other reviewers.

"""
self.log.info("Test addpeeraddress")
self.restart_node(1, ["-checkaddrman=1"])
self.restart_node(1, ["-checkaddrman=1", "-cjdnsreachable", "-test=addrman"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these init params are added here, but are only relevant until a couple subtests later in test_getaddrmaninfo. having unrelated init params can lead to unexpected behaviors, like in this recent example. implicit dependencies between subtests can also make future development confusing, such as updating setup in an earlier test unexpectedly causing failures in seemingly unrelated tests.

leaving suggestions for improving the structure between subtests in the overall review comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense! added an option in restart_node() to restart the node with an empty addrman which could potentially be useful in other tests. addresses added in addpeeraddress test don't leak into addrman tests now.

test/functional/rpc_net.py Outdated Show resolved Hide resolved
src/init.cpp Outdated
@@ -611,6 +611,7 @@ void SetupServerArgs(ArgsManager& argsman)
argsman.AddArg("-limitdescendantcount=<n>", strprintf("Do not accept transactions if any ancestor would have <n> or more in-mempool descendants (default: %u)", DEFAULT_DESCENDANT_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-limitdescendantsize=<n>", strprintf("Do not accept transactions if any ancestor would have more than <n> kilobytes of in-mempool descendants (default: %u).", DEFAULT_DESCENDANT_SIZE_LIMIT_KVB), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-addrmantest", "Allows to test address relay on localhost", ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
argsman.AddArg("-test=<option>", "Pass a test-only option", ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach of using -test to group test-only options

two suggestions for improvement:

  • include the options in the help text (eg. how -onlynet or -debug handles)
  • throw some sort of error (or at least warning?) if an invalid option is passed through

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, i've included both improvements!

@stratospher stratospher force-pushed the deterministic-addrman branch 2 times, most recently from 20a8427 to 1b49149 Compare December 21, 2023 06:16
@stratospher
Copy link
Contributor Author

stratospher commented Dec 21, 2023

thanks for the thoughtful feedback @amitiuttarwar, @0xB10C! I've updated the PR to address your comments.

Main changes were: (git diff)

  1. adding an option to restart the node with an empty addrman (method 1 suggested here test: create deterministic addrman in the functional tests #29007 (comment)) to prevent addpeeraddress test leaking into getaddrmaninfo and getrawaddrman tests.
  2. listing options in -help-debug for -test=<option>

I am not convinced that this test-only behavior is worth maintaining. the only thing that p2p_node_network_limited.py is testing is that 1. an addr message is received & 2. the expected services are advertised. # 1 is forced to true via the test-only codepath in GetLocalAddrForPeer, and the value of # 2 is nominal due to how easily the test-only code could diverge from the mainnet logic.

My vote is to remove this functionality & related code. Curious to hear what you think @stratospher, as well as the opinions of other reviewers.

yes, i agree. don't think it's useful to have a command line arg which is used in only 1 line in 1 test. we would want to know if the address with expected services are really advertised in real world logic and not in hard coded test path logic. looks like the option was introduced in this commit.

@stratospher
Copy link
Contributor Author

Updated the PR to remove -addrmantest (git diff) in 5d27993.

Copy link
Contributor

@0xB10C 0xB10C left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 2cc8ca1

Code review and played around with the new -test option.

@@ -1024,6 +1024,22 @@ bool AppInitParameterInteraction(const ArgsManager& args)
if (args.GetBoolArg("-peerbloomfilters", DEFAULT_PEERBLOOMFILTERS))
nLocalServices = ServiceFlags(nLocalServices | NODE_BLOOM);

if (args.IsArgSet("-test")) {
if (chainparams.GetChainType() != ChainType::REGTEST) {
return InitError(Untranslated("-test=<option> should only be used in functional tests"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this makes the error slightly clearer

Suggested change
return InitError(Untranslated("-test=<option> should only be used in functional tests"));
return InitError(Untranslated("-test=<option> can only be used when on regtest"));

Copy link
Contributor

@mzumsande mzumsande Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i had the same suggestion independently (we have functional tests that don't use regtest but testnet)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

included in #29636

@amitiuttarwar
Copy link
Contributor

ACK 2cc8ca1

reviewed the code & tested the init option

@DrahtBot DrahtBot removed the request for review from amitiuttarwar February 8, 2024 06:57
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK 2cc8ca1

@@ -1024,6 +1024,22 @@ bool AppInitParameterInteraction(const ArgsManager& args)
if (args.GetBoolArg("-peerbloomfilters", DEFAULT_PEERBLOOMFILTERS))
nLocalServices = ServiceFlags(nLocalServices | NODE_BLOOM);

if (args.IsArgSet("-test")) {
if (chainparams.GetChainType() != ChainType::REGTEST) {
return InitError(Untranslated("-test=<option> should only be used in functional tests"));
Copy link
Contributor

@mzumsande mzumsande Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i had the same suggestion independently (we have functional tests that don't use regtest but testnet)

auto addrman{std::make_unique<AddrMan>(netgroupman, /*deterministic=*/false, /*consistency_check_ratio=*/check_addrman)};
bool deterministic = HasTestOption(args, "addrman"); // use a deterministic addrman only for tests

auto addrman{std::make_unique<AddrMan>(netgroupman, /*deterministic=*/deterministic, /*consistency_check_ratio=*/check_addrman)};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't know if we have recomendations about this, but I find /*deterministic=*/deterministic a bit silly, might as well drop the named argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

included in #29636

@DrahtBot DrahtBot requested review from mzumsande and removed request for mzumsande February 13, 2024 18:37
@maflcko maflcko added this to the 28.0 milestone Feb 22, 2024
@achow101
Copy link
Member

ACK 2cc8ca1

@achow101 achow101 merged commit a945f09 into bitcoin:master Mar 11, 2024
15 of 16 checks passed
Comment on lines +48 to +55
node.addpeeraddress(address="1.2.3.4", tried=True, port=8333)
node.addpeeraddress(address="2.0.0.0", port=8333)
node.addpeeraddress(address="1233:3432:2434:2343:3234:2345:6546:4534", tried=True, port=8333)
node.addpeeraddress(address="2803:0:1234:abcd::1", port=45324)
node.addpeeraddress(address="fc00:1:2:3:4:5:6:7", port=8333)
node.addpeeraddress(address="pg6mmjiyjmcrsslvykfwnntlaru7p5svn6y2ymmju6nubxndf4pscryd.onion", tried=True, port=8333)
node.addpeeraddress(address="nrfj6inpyf73gpkyool35hcmne5zwfmse3jl3aw23vk7chdemalyaqad.onion", port=45324, tried=True)
node.addpeeraddress(address="c4gfnttsuwqomiygupdqqqyy5y5emnk5c73hrfvatri67prd7vyq.b32.i2p", port=8333)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we should check that these addresses are successfully added to the addrman. They work with the current deterministic addrman, but this might break silently when the addrman bucket/position calculation changes and these cause a collision.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

included in #29636

@0xB10C
Copy link
Contributor

0xB10C commented Mar 12, 2024

Maybe this reduces someone's time spent debugging: I noticed in #28998 that restarting a node with -test=addrman doesn't make an existing addrman deterministic. You also need to set clear_addrman=True to start off with fresh and deterministic addrman.

self.restart_node(x, ["-test=addrman"], clear_addrman=True)

0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request Mar 12, 2024
Just having deterministic is enough. See bitcoin#29007 (comment)
0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request Mar 12, 2024
@0xB10C 0xB10C mentioned this pull request Mar 12, 2024
0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request Mar 13, 2024
Just having deterministic is enough. See bitcoin#29007 (comment)
0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request Mar 13, 2024
0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request Mar 23, 2024
Just having deterministic is enough. See bitcoin#29007 (comment)
0xB10C added a commit to 0xB10C/bitcoin that referenced this pull request Mar 23, 2024
fanquake added a commit that referenced this pull request Mar 25, 2024
9a44a20 init: clarify -test error (0xb10c)
3047c3e addrman: drop /*deterministic=*/ comment (0xb10c)
89b84ea test: check that addrman seeding is successful (0xb10c)

Pull request description:

  A few, small follow-ups to #29007. See commit messages for details.

ACKs for top commit:
  maflcko:
    lgtm ACK 9a44a20
  stratospher:
    tested ACK 9a44a20.
  mzumsande:
    Code Review ACK 9a44a20

Tree-SHA512: 987245e035da08fa7fe541a1dc3b7c2d90f703a6f9813875048d286335c63ffa5201db702a3f368087c00fa02c3fdafb06cf54dc7a92922749a94588b1500e98
janus pushed a commit to BitgesellOfficial/bitgesell that referenced this pull request Apr 6, 2024
janus pushed a commit to BitgesellOfficial/bitgesell that referenced this pull request Apr 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants