Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't query all DNS seeds at once #15558

Merged
merged 1 commit into from Sep 23, 2019

Conversation

@sipa
Copy link
Member

commented Mar 7, 2019

Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them.

Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on.

This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them.

@fanquake fanquake added the P2P label Mar 7, 2019
@DrahtBot

This comment has been minimized.

Copy link
Contributor

commented Mar 7, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@sipa sipa force-pushed the sipa:201903_dnsoneatatime branch from 70d24bd to 9f36b04 Mar 7, 2019
@sipa sipa changed the title Query DNS seeds one by one Don't query all DNS seeds at on ce Mar 8, 2019
@fanquake fanquake changed the title Don't query all DNS seeds at on ce Don't query all DNS seeds at once Mar 8, 2019
@luke-jr

This comment has been minimized.

Copy link
Member

commented Mar 8, 2019

Wouldn't this mean if one DNS seed is compromised, it could potentially feed sybil addresses to nodes and its victims wouldn't have any honest nodes (eg, from other DNS seeds)?

Does the issue it intends to address actually exist? I would think DNS queries would get cached and proxied by ISP DNS servers sufficiently to deny any information to the real DNS servers, no?

@jimpo

This comment has been minimized.

Copy link
Contributor

commented Mar 8, 2019

3 seems like an OK number to query for at once, but a reasonable thing to bikeshed a bit.

Code looks good to me.

@gmaxwell

This comment has been minimized.

Copy link
Contributor

commented Mar 8, 2019

Yeah, my comment in IRC was that one creates an immediate eclipse vulnerability. Three sounds reasonable.

We also really should look at why this is triggered as much as it is, locally I see it run on every restart these days.

@laanwj

This comment has been minimized.

Copy link
Member

commented Mar 9, 2019

Concept ACK.

Yeah, my comment in IRC was that one creates an immediate eclipse vulnerability. Three sounds reasonable.

But only for new nodes, I suppose? If it already has gossiped peers before, then 'query one DNS seed' seems more reasonable. But okay, 3 is a safer option in any case.

We also really should look at why this is triggered as much as it is, locally I see it run on every restart these days.

My guess would be that the peers it tries initially to connect to might not be selected for connectability. Maybe 11 seconds is also too little. I think it would be fine to increase this to say, 30 or even 60 seconds.

@gmaxwell

This comment has been minimized.

Copy link
Contributor

commented Mar 9, 2019

It may just be that we need to try connections faster initially. ... I dunno I really have the impression that it was frequently not using dns seeds in the past but is now, but I don't have hard data to support that. In theory feelers should mean that our tried table is better than ever before, but it might just be that there has been a lot of node churn lately.

@jgarzik

This comment has been minimized.

Copy link
Contributor

commented Mar 9, 2019

When the code was added circa 2014 in 2e7009d, 11 seconds was completely arbitrary. It was always the intent to adjust that number based on field experience. It sounds like it needs to increase.

The driving theory is to make DNS seeds a fallback and "give it a good try" to avoid querying DNS seeds at all.

The code already skips the delay for an empty AddrMan -- ie. fresh bitcoind install -- thus narrowing the user impact of a longer delay to existing nodes that presumably have addrman data to sort through.

It boils down to a question of how much dependence should a node have on DNS seeds, in general? 300 seconds is a reasonable answer for the new delay, if we want to give AddrMan time to "warm up" and preserve privacy a bit more.

@luke-jr

This comment has been minimized.

Copy link
Member

commented Mar 10, 2019

It boils down to a question of how much dependence should a node have on DNS seeds, in general?

IMO nodes should only consult DNS seeds at first run, and then never again.

@jonasschnelli

This comment has been minimized.

Copy link
Member

commented Mar 15, 2019

Concept ACK

@Sjors

This comment has been minimized.

Copy link
Member

commented Mar 15, 2019

Concept ACK with > 1 seed per round.

utACK 9f36b04 (3 seeds)

The frequent DNS queries recently have hopefully been addressed with #15486.

IMO nodes should only consult DNS seeds at first run, and then never again.

I like that idea as well, but for another PR, as it requires much more thought than this change.

11 seconds was completely arbitrary. It was always the intent to adjust that number based on field experience. It sounds like it needs to increase

As a GUI use I'd get antsy if "nothing happens" for 15+ seconds, but as long as there's some indicator that we're trying various new peers that should be fine. In that case we can wait much longer before trying the DNS again.

Copy link
Member

left a comment

utACK, with a couple of minor suggestions

src/net.cpp Outdated Show resolved Hide resolved
src/net.cpp Outdated Show resolved Hide resolved
@fanquake fanquake added this to the 0.19.0 milestone Aug 4, 2019
@fanquake

This comment has been minimized.

Copy link
Member

commented Aug 4, 2019

@naumenkogs Did you want to take a look here?

This has a number of Concept ACKs, have tagged with 0.19.0.

@practicalswift

This comment has been minimized.

Copy link
Member

commented Aug 4, 2019

Concept ACK

@naumenkogs

This comment has been minimized.

Copy link
Contributor

commented Aug 5, 2019

First of all, could we update the initial message of the PR so that it's clear that it suggests connecting to 3 (not one) without reading full discussion? Maybe strike-through would work.

The choice of 3 seeds and the general idea seems reasonable to me.
utACK

Instead, when necessary, query 3. If that leads to a sufficient number
of connects, stop. If not, query 3 more, and so on.
@sipa sipa force-pushed the sipa:201903_dnsoneatatime branch from 9f36b04 to 6170ec5 Aug 7, 2019
@sipa

This comment has been minimized.

Copy link
Member Author

commented Aug 7, 2019

Addressed the comments above.

@Sjors

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

ACK 6170ec5

With -forcednsseed=1 it now fetches all (testnet) seeds.

When I restart it sometimes fetches (3) seeds, about 10 seconds after dnsseed thread start. Other times it finds several outbound nodes and the DNS seed exited.

When I delete peers.dat it uses all DNS seeds.

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2019

One high-level concern with this is the monoculture of DNS seed software we have, if you select three at random you're almost certain to get three seeds serving from sipa's seeder implementation, whereas we should really be trying to diversify a little bit (not to mention get things like dnssec going, which I dont believe sipa's seeder support). Otherwise, Concept ACK.

@fanquake

This comment has been minimized.

Copy link
Member

commented Sep 13, 2019

@naumenkogs @luke-jr Did you want to re-review here?

@sdaftuar This might also interest you?

@sdaftuar

This comment has been minimized.

Copy link
Member

commented Sep 18, 2019

ACK 6170ec5

@jonasschnelli

This comment has been minimized.

Copy link
Member

commented Sep 18, 2019

utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model).

Copy link
Member

left a comment

ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without -forcednsseed and/or a peers.dat and monitored the DNS activity.

src/bitcoind -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:36:24Z Loaded 0 banned node ips/subnets from banlist.dat  0ms
2019-09-23T04:36:24Z net: setting try another outbound peer=false
2019-09-23T04:36:29Z init message: Loading P2P addresses...
2019-09-23T04:36:29Z Loaded 6640 addresses from peers.dat  18ms
2019-09-23T04:36:29Z init message: Starting network threads...
2019-09-23T04:36:29Z net thread start
2019-09-23T04:36:29Z dnsseed thread start
2019-09-23T04:36:40Z P2P peers available. Skipped DNS seeding.
2019-09-23T04:36:40Z dnsseed thread exit
src/bitcoind -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:20:17Z Loaded 0 banned node ips/subnets from banlist.dat  0ms
2019-09-23T04:20:17Z net: setting try another outbound peer=false
2019-09-23T04:20:22Z init message: Loading P2P addresses...
2019-09-23T04:20:22Z Loaded 2863 addresses from peers.dat  7ms
2019-09-23T04:20:22Z init message: Starting network threads...
2019-09-23T04:20:22Z net thread start
2019-09-23T04:20:22Z dnsseed thread start
2019-09-23T04:20:33Z Loading addresses from DNS seed seed.btc.petertodd.org
2019-09-23T04:20:33Z Loading addresses from DNS seed dnsseed.emzy.de
2019-09-23T04:20:33Z Loading addresses from DNS seed dnsseed.bluematt.me
2019-09-23T04:20:44Z P2P peers available. Skipped DNS seeding.
2019-09-23T04:20:44Z dnsseed thread exit
src/bitcoind -forcednsseed -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:09:52Z Loaded 0 banned node ips/subnets from banlist.dat  0ms
2019-09-23T04:09:52Z net: setting try another outbound peer=false
2019-09-23T04:09:58Z init message: Loading P2P addresses...
2019-09-23T04:09:58Z Loaded 9738 addresses from peers.dat  23ms
2019-09-23T04:09:58Z init message: Starting network threads...
2019-09-23T04:09:58Z net thread start
2019-09-23T04:09:58Z dnsseed thread start
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoin.sipa.be
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.btc.petertodd.org
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoinstats.com
2019-09-23T04:09:58Z Loading addresses from DNS seed dnsseed.bitcoin.dashjr.org
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoin.jonasschnelli.ch
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoin.sprovoost.nl
2019-09-23T04:09:58Z Loading addresses from DNS seed dnsseed.bluematt.me
2019-09-23T04:09:59Z Loading addresses from DNS seed dnsseed.emzy.de
2019-09-23T04:09:59Z 294 addresses found from DNS seeds
2019-09-23T04:09:59Z dnsseed thread exit
src/bitcoind -forcednsseed -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:11:34Z Loaded 0 banned node ips/subnets from banlist.dat  0ms
2019-09-23T04:11:34Z net: setting try another outbound peer=false
2019-09-23T04:11:39Z init message: Loading P2P addresses...
2019-09-23T04:11:39Z ERROR: DeserializeFileDB: Failed to open file /Users/michael/Library/Application Support/Bitcoin/peers.dat
2019-09-23T04:11:39Z Invalid or missing peers.dat; recreating
2019-09-23T04:11:39Z Flushed 0 addresses to peers.dat  4ms
2019-09-23T04:11:39Z init message: Starting network threads...
2019-09-23T04:11:39Z dnsseed thread start
2019-09-23T04:11:39Z net thread start
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoin.sprovoost.nl
2019-09-23T04:11:39Z Loading addresses from DNS seed dnsseed.emzy.de
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoin.jonasschnelli.ch
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoin.sipa.be
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoinstats.com
2019-09-23T04:11:40Z Loading addresses from DNS seed dnsseed.bluematt.me
2019-09-23T04:11:40Z Loading addresses from DNS seed dnsseed.bitcoin.dashjr.org
2019-09-23T04:11:40Z Loading addresses from DNS seed seed.btc.petertodd.org
2019-09-23T04:11:40Z 294 addresses found from DNS seeds
2019-09-23T04:11:40Z dnsseed thread exit
const std::vector<std::string> &vSeeds = Params().DNSSeeds();
int found = 0;
for (const std::string& seed : seeds) {
// goal: only query DNS seed if address need is acute

This comment has been minimized.

Copy link
@fanquake

fanquake Sep 23, 2019

Member

goal: only query DNS seed if address need is acute

From my understanding, "acute" here is essentially on first run, or if you've got a peers.dat with low quality/unlikely to connect quickly peers?

@fanquake

This comment has been minimized.

Copy link
Member

commented Sep 23, 2019

One high-level concern with this is the monoculture of DNS seed software we have, if you select three at random you're almost certain to get three seeds serving from sipa's seeder implementation, whereas we should really be trying to diversify a little bit (not to mention get things like dnssec going, which I dont believe sipa's seeder support).

Opened #16938 if anyone wants to brainstorm / discuss DNS seeder software diversity.
This PR is also related to #15434.

fanquake added a commit that referenced this pull request Sep 23, 2019
6170ec5 Do not query all DNS seed at once (Pieter Wuille)

Pull request description:

  Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them.

  Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on.

  This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them.

ACKs for top commit:
  Sjors:
    ACK 6170ec5
  sdaftuar:
    ACK 6170ec5
  jonasschnelli:
    utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model).
  fanquake:
    ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without `-forcednsseed` and/or a `peers.dat` and monitored the DNS activity.

Tree-SHA512: 33f6be5f924a85d312303ce272aa8f8d5e04cb616b4b492be98832e3ff37558d13d2b16ede68644ad399aff2bf5ff0ad33844e55eb40b7f8e3fddf9ae43add57
@fanquake fanquake merged commit 6170ec5 into bitcoin:master Sep 23, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@MarcoFalke

This comment has been minimized.

Copy link
Member

commented Sep 23, 2019

Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on

I guess this should say "try again with another three"


2019-09-23T14:06:29Z Loading addresses from DNS seed seed.bitcoin.sipa.be
2019-09-23T14:06:29Z Loading addresses from DNS seed seed.bitcoin.sprovoost.nl
2019-09-23T14:06:29Z Loading addresses from DNS seed dnsseed.bluematt.me



2019-09-23T14:06:40Z Loading addresses from DNS seed seed.bitcoinstats.com
2019-09-23T14:06:40Z Loading addresses from DNS seed seed.bitcoin.jonasschnelli.ch
2019-09-23T14:06:40Z Loading addresses from DNS seed seed.btc.petertodd.org



2019-09-23T14:06:51Z Loading addresses from DNS seed dnsseed.bitcoin.dashjr.org
2019-09-23T14:06:51Z Loading addresses from DNS seed dnsseed.emzy.de
2019-09-23T14:06:51Z 0 addresses found from DNS seeds
2019-09-23T14:06:51Z dnsseed thread exit

sidhujag added a commit to syscoin/syscoin that referenced this pull request Sep 23, 2019
6170ec5 Do not query all DNS seed at once (Pieter Wuille)

Pull request description:

  Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them.

  Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on.

  This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them.

ACKs for top commit:
  Sjors:
    ACK 6170ec5
  sdaftuar:
    ACK 6170ec5
  jonasschnelli:
    utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model).
  fanquake:
    ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without `-forcednsseed` and/or a `peers.dat` and monitored the DNS activity.

Tree-SHA512: 33f6be5f924a85d312303ce272aa8f8d5e04cb616b4b492be98832e3ff37558d13d2b16ede68644ad399aff2bf5ff0ad33844e55eb40b7f8e3fddf9ae43add57
// Avoiding DNS seeds when we don't need them improves user privacy by
// creating fewer identifying DNS requests, reduces trust by giving seeds
// less influence on the network topology, and reduces traffic to the seeds.
if (addrman.size() > 0 && seeds_right_now == 0) {

This comment has been minimized.

Copy link
@ajtowns

ajtowns Sep 24, 2019

Contributor

Post-merge review: I think there's a bug here: if this loop is hit with addrman.size()==0 and seeds_right_now==0 it will then query the first seed, and set seeds_right_now==-1, at which point this condition will keep failing on future loops, preventing a delay between batches of 3 seeds, and preventing early exit. I think this means that new nodes with no peers.dat will still always immediately query all dns seeds. In #16939 I've made the logic if (addrman==0 && seeds_right_now==0) { current code } else if (seeds_right_now == 0) { seeds_right_now += 3; } which should give more sensible behaviour.

This comment has been minimized.

Copy link
@sdaftuar

sdaftuar Sep 24, 2019

Member

My understanding was that this was intentional behavior (eg @Sjors points this out in #15558 (comment)), but perhaps this should be better commented.

It made sense to me that we would query all dns seeds when starting up for the first time, which is consistent with prior behavior, and that the purpose of this PR was to limit dns seed usage only in the case of nodes that have been online before.

This comment has been minimized.

Copy link
@ajtowns

ajtowns Sep 24, 2019

Contributor

Seems surprising to me. If it's intentional, could simplify the logic a bit, and make it:

if (forcednsseed || addrman.size() == 0) seeds_right_now = seeds.size();
for (seed : seeds) {
    if (seeds_right_now == 0) { sleep(); check(); seeds_right_now += 3; }
    lookup();
    --seeds_right_now;
}

This comment has been minimized.

Copy link
@jgarzik

jgarzik Sep 24, 2019

Contributor

Aligning with the code comments that are quoted above + @luke-jr 's comments in related PR, this seems like an adequate summary?

  • Newly initialized nodes with no addrman data should query seeds immediately
  • Nodes with healthy addrman should not query seeds at all

The latter point seems to maximally align with the code comment & user privacy

@jonatack jonatack referenced this pull request Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.