Remove bitnodes.io from dnsseeds. #5545
Conversation
I'm not comfortable with retaining this entry.
Hi Greg, care to elaborate more? I notice it could be the surge of a few getaddr.bitnodes.io crawlers from AWS, but they are not mine. The crawler for the Bitnodes project should only come from 148.251.238.178.
Please explain to me why 148.251.238.178 is still staying pinned up to every node in the network. It was my understanding that after the last round of concerns you said you would discontinue that behaviour. Are you saying you have no hosts other than 148.251.238.178 connecting outbound at large scale to the network?
I only operate the Bitnodes crawler from 148.251.238.178. I am not sure about the AWS ones; anyone could run the crawler using the same user agent: https://github.com/ayeowch/bitnodes/blob/master/protocol.py#L103 At the moment, my crawler keeps at most one active connection with each reachable node to aggregate the metrics for https://getaddr.bitnodes.io/dashboard/. It keeps bandwidth usage to a minimum, sending no further data after the handshake except for periodic ping/pong messages. You will see an inbound connection from 148.251.238.178.
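The user agent mentioned above is simply a self-reported string inside the P2P `version` message, which is why anyone can run a crawler that claims to be bitnodes.io. Below is a minimal, illustrative sketch (not the Bitnodes code; the field layout follows the public Bitcoin P2P protocol, and protocol version 70002 is chosen arbitrarily) of constructing such a message in Python:

```python
import hashlib
import struct
import time

MAGIC_MAIN = b"\xf9\xbe\xb4\xd9"  # Bitcoin mainnet network magic


def checksum(payload: bytes) -> bytes:
    # First 4 bytes of double-SHA256, per the P2P message framing.
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def serialize_msg(command: bytes, payload: bytes) -> bytes:
    # 24-byte header: magic, zero-padded command, payload length, checksum.
    return (MAGIC_MAIN
            + command.ljust(12, b"\x00")
            + struct.pack("<I", len(payload))
            + checksum(payload)
            + payload)


def version_payload(user_agent: bytes, start_height: int = 0) -> bytes:
    # Minimal 'version' payload; both network addresses are zeroed out,
    # which is enough for a probe that only wants the handshake.
    p = struct.pack("<iQq", 70002, 0, int(time.time()))  # version, services, timestamp
    p += b"\x00" * 26                                    # addr_recv (zeroed)
    p += b"\x00" * 26                                    # addr_from (zeroed)
    p += struct.pack("<Q", 0)                            # nonce
    p += bytes([len(user_agent)]) + user_agent           # var_str user agent (< 0xfd bytes)
    p += struct.pack("<i", start_height)                 # start_height
    return p


# Anyone can put any string here -- the user agent proves nothing
# about who actually operates the crawler.
msg = serialize_msg(b"version", version_payload(b"/bitnodes.io:0.1/"))
```

This is why multiple unrelated hosts can all show up in `getpeerinfo` with a bitnodes.io user agent.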
Um, if bitnodes is maintaining a connection to every peer they can, then ACK ACK this PR...
@luke-jr Most seeders, e.g. /bitcoinseeder:0.01/, enumerate all reachable nodes periodically anyway to get a snapshot of the network. Bitnodes simply uses this data to come up with the network metrics, allowing us to know the state/size of the network at any one time. I will ACK this PR anyway since it has come up again and again. It's quite silly.
There's a big difference between crawling the network periodically and maintaining a constant connection. You'll note my DNS seed provides the same network metrics (for longer than you have, actually) without keeping any persistent connections...
@luke-jr My background is in networking, so I may be looking at this from a different perspective: I imagine it is inefficient to disconnect from nodes you are already connected to if you are going to connect to them again after hearing about them in an addr response from a peer in the next crawl.
@ayeowch The problem with staying connected is that you're holding up a connection slot permanently that could otherwise be used by a node or SPV wallet. It is better-behaved to poll once in a reasonable while, request e.g. the version, and immediately disconnect.
@ayeowch You're right that there is a certain overhead involved in connecting and disconnecting from a peer, but that's almost nothing compared to the resources you're requiring from them by staying connected (a few MB of RAM per connection, plus all the CPU involved in sending and tracking advertised blocks and transactions). In addition, as @laanwj already mentioned: connection slots are a limited resource that is valuable to the network. Please don't waste it.
ACK, based on criteria provided. |
@laanwj @sipa Thanks for the valuable input. I had in the past configured the crawler to connect and disconnect after a full handshake. This meant the crawler had to reconnect after each full snapshot of the network was taken, which made it appear abusive in nature to some users. @sipa I have monitored the transferred bytes closely between the crawler and each node, but didn't realize the undesirable RAM/CPU usage incurred just to maintain the connection. I am going to revisit my approach in the next couple of days and, at the very least, revert the crawler to its previous behaviour, i.e. connect to and disconnect from each node to take a snapshot of the network.
ACK for now, but NACK if problems are fixed in the next few days. |
@ayeowch Thanks for considering changing your approach. Let's wait a few days before merging.
I will continue to be uncomfortable even if the behaviour is changed. I really don't want to get into a debate here, but we've had several instances where the behaviour of the bitnodes crawlers was bad in a way that was very obvious to all (or nearly all) of the regular contributors to the project. In each case there was some discussion, and at least I left with the impression that the behaviour would be fixed. Then later we found it doing something substantially similar, unchanged, or even worse. Clearly there are some consistent problems with communication and/or understanding.

In the case of the dnsseed itself, much of its behaviour is even more difficult to monitor or audit than the crawler's. Plus, we have no great and urgent need for more dnsseeds; in fact, our reliance on them is lower than it was in the past. With that in mind, I'd rather not waste my cycles trying to figure out ways to determine whether this seed is behaving correctly, and I suspect that Ayeowch would probably rather not spend time defending the seed service, which is nothing but a cost to him, especially with us being demanding and seemingly unappreciative.

I'm currently working on tools that will help protect the users/network from abusive or intrusive behaviour like we've seen from this crawler software, regardless of how well things are or aren't communicated or what people's intentions are. So I do not consider the behaviour of those particular hosts gating here (though it should also be improved), but rather just a sign that there is some kind of persistent, serious misunderstanding and miscommunication, and I'd rather not have a DNS seed operating under that. Cheers.
@gmaxwell Well said, Greg :) I am glad that you already have work underway to better establish what is considered abusive/non-abusive towards the network. As much as I wanted to defend myself, and certainly want to help the network by offering the DNS seeder, I agree much of the problem with the crawler was due to miscommunication. English is not my first language, so that could be another reason. I'm still working on the crawler right now to avoid keeping connections open. The only immediate gating issue for me is the node alert feature on the Bitnodes site. The feature currently sends an alert to a subscribed node owner as soon as their node is disconnected from the crawler, so I will have to think of another way to do this. If you have thoughts about this or any issues regarding the crawler, please feel free to let me know (ayeowch@gmail.com). At any rate, I am now less than comfortable having my DNS seeder in the reference implementation, so let's proceed to ACK the removal for now.
I think it's probably fair to take up a connection slot on a node that someone has manually requested monitoring of. |
Using the new peers view I see 10 incoming connections with the user agent bitnodes.io; that app needs fixing! Besides that, it makes no sense to have a crawler in the seeds. They should find nodes, not the other way around. ACK.
@zander I notice the 10 crawlers coming from 54.x.x.x too; they are not mine and do not contribute to the stats on the project website at getaddr.bitnodes.io. I mentioned this in my second comment in this issue. I reverted my crawler earlier today to maintain only one instance, which performs a full handshake with each reachable node and disconnects immediately. It will reconnect again after a full snapshot of the network is taken, i.e. approx. 5 to 8 minutes later. You should be able to see it appearing in your getpeerinfo output as an inbound connection from 148.251.238.178.
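The reverted behaviour described above (one handshake per node per snapshot, disconnecting immediately and reconnecting only on the next cycle) amounts to a simple polling loop. A minimal sketch, not the Bitnodes code: `probe` stands in for a hypothetical connect/handshake/disconnect routine, and the `now`/`sleep` hooks exist only to keep the sketch testable without real time passing:

```python
import time
from typing import Callable, Iterable, List


def snapshot_cycle(addrs: Iterable[str],
                   probe: Callable[[str], bool],
                   interval: float = 300.0,
                   now: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Probe every known node exactly once, disconnecting after each
    handshake, then wait out the remainder of the snapshot interval
    (roughly the 5-8 minute cadence described in the thread)."""
    start = now()
    reachable = [addr for addr in addrs if probe(addr)]
    elapsed = now() - start
    if elapsed < interval:
        # No persistent connections are held between cycles; the crawler
        # simply sleeps until it is time for the next snapshot.
        sleep(interval - elapsed)
    return reachable
```

The design choice laanwj and sipa argued for is exactly this shape: the per-node cost is one short-lived connection per cycle, rather than a permanently occupied connection slot.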
a094b3d Remove bitnodes.io from dnsseeds. (Gregory Maxwell)
I think I will be keeping Addy's seed in both bitcoinj and Bitcoin XT for a couple of reasons:
Addy, for what it's worth, your English is excellent. I don't think the issue here is language related.