This repository has been archived by the owner on Jun 14, 2024. It is now read-only.

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Better peer discovery mechanism #32

Closed
adambabik opened this issue Nov 16, 2019 · 18 comments

Comments

@adambabik

Problem

At the very beginning, each Status client had only a single list of peers it could connect to. Mailservers and Whisper nodes were separated. As you can imagine, this is not a very clever way of sharing peers' addresses, but it is simple and works well as long as nothing goes wrong.

However, we later started thinking about issues that might occur, and some that had actually occurred. Namely, many of our peers hosted in AWS or GCP were blocked in Russia and China. So, we introduced another set of server peers in Asia and implemented the old PoC Discovery V5 algorithm from geth. When we tested it, it turned out that it consumes a lot of bandwidth as well as other resources like CPU and memory. So, we went with a strategy of only turning on Discovery V5 if there are not enough peers connected, and we also looked at other peer discovery algorithms like Rendezvous.

All that effort brought more and more complexity, and it became harder and harder to measure the source of bandwidth usage and battery drain. It also decreased our velocity.

We need a cleaner, simpler solution for this task.

Acceptance criteria

  • We don't need an extremely versatile and dynamic solution because we, Status, will very likely continue to be the main maintainer of the infrastructure for the Status clients. However, we need the ability to share new peers' addresses without updating the clients.
  • Updating the list of peers must not consume a lot of bandwidth and resources. Ideally, it should not require starting a node, so that bootnodes are not necessary either.
  • It should be safe, i.e. it should be impossible to spoof the peers' addresses when obtaining them.
  • It should be easy to implement and manage.
  • The act of obtaining peers should be private and encrypted.

Possible Solutions

  1. Node Discovery via DNS + DNS over HTTPS. This is actually only part of the story because it suggests sharing only bootnodes this way and then using a dynamic peer discovery mechanism. In our case, this might be sufficient as an end-to-end solution.
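
To make the DNS option a bit more concrete, here is a rough sketch of how a client might classify the TXT record values of a DNS node list. The entry shapes (`enrtree-root:v1`, `enrtree-branch:`, `enrtree://`, `enr:`) follow my reading of EIP-1459; the function name `parse_entry` and the returned dict layout are made up for illustration. It deliberately does not verify the root signature or decode the records, both of which a real client would need in order to meet the "impossible to spoof" criterion.

```python
import re

# Root entry shape as I understand it from EIP-1459 (assumption, not a full validator).
ROOT_RE = re.compile(r"^enrtree-root:v1 e=(\S+) l=(\S+) seq=(\d+) sig=(\S+)$")


def parse_entry(txt: str) -> dict:
    """Classify one TXT record value from a DNS node list (hypothetical helper)."""
    m = ROOT_RE.match(txt)
    if m:
        # Signature is only extracted here, NOT verified.
        return {"kind": "root", "enr_root": m.group(1), "link_root": m.group(2),
                "seq": int(m.group(3)), "sig": m.group(4)}
    if txt.startswith("enrtree-branch:"):
        # A branch lists the hashes of its child entries, comma separated.
        children = [c for c in txt[len("enrtree-branch:"):].split(",") if c]
        return {"kind": "branch", "children": children}
    if txt.startswith("enrtree://"):
        # A link points at another node list: enrtree://<key>@<domain>
        key, _, domain = txt[len("enrtree://"):].partition("@")
        return {"kind": "link", "key": key, "domain": domain}
    if txt.startswith("enr:"):
        # A leaf carrying one encoded node record; left undecoded in this sketch.
        return {"kind": "node", "record": txt}
    raise ValueError(f"unrecognised entry: {txt!r}")
```

In this scheme a client would start at the root, walk the branches, and collect the `enr:` leaves as its peer list (Waku nodes instead of bootnodes in our case).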
@oskarth
Contributor

oskarth commented Nov 16, 2019

Thanks for the issue!

Right now this issue seems very Status/client specific. Can we rewrite it to be more general, where the Status client and its specific circumstances are just one instance? It references Status infrastructure and "us" running the cluster, which is more of a client/downstream/deployment concern.

The problem description makes it seem like the issue is complexity, but it doesn't really explain what the actual issue is with the current setup (I'm not saying there aren't any). As far as I can tell, the current strategy is:

  • Use Discovery v5 upon startup
  • Turn it off once some peers are found

Which seems like a reasonable algorithm, assuming Discovery v5 is well documented and sane and meets our needs. If it isn't or doesn't, then that seems like a more relevant spec bug/feature request.
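
As a rough illustration of that on/off strategy (not the actual status-go code), the toggle could look something like the sketch below. The class name `DiscoveryToggle` and the two thresholds are assumptions; using separate start and stop thresholds is my own addition to avoid flapping, it is not something described in this thread.

```python
from dataclasses import dataclass


@dataclass
class DiscoveryToggle:
    """Decides whether discovery should be running, given the peer count."""
    min_peers: int          # hypothetical: start discovery below this count
    max_peers: int          # hypothetical: stop discovery at or above this count
    running: bool = False

    def update(self, connected_peers: int) -> bool:
        # Start discovery only when we have too few peers.
        if not self.running and connected_peers < self.min_peers:
            self.running = True
        # Stop it again once enough peers have been found.
        elif self.running and connected_peers >= self.max_peers:
            self.running = False
        return self.running
```

A caller would invoke `update()` on every peer connect/disconnect event and start or stop the discovery service whenever the returned value changes.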

Can we be more precise about the problems with that approach?

I also don't think we are likely to change the peer discovery mechanism in version zero, but for future iterations for sure. It also seems like a much larger and more general issue than Waku mode per se, i.e. it's an ongoing topic in Eth2 afaik, etc.

@adambabik
Author

adambabik commented Nov 17, 2019

but it doesn't really explain what the actual issue is with the current setup

I did describe that, maybe not in a single place. The currently used Discovery V5 is not sustainable on mobile; it uses too much bandwidth and too many resources. And the complexity is a problem because the current approach is a hybrid: Discovery V5 + Rendezvous + caching + peer pools + no spec.

We can try to use the new Discovery V5, but as far as I know it's still experimental and not even in the go-ethereum master branch. We would still need to confirm that Discovery V5 works fine in a case where nodes supporting a specific protocol are sparse.

It references Status infrastructure and "us" running the cluster, which is more of a client/downstream/deployment concern.

Because that's the reality. We can generalize it in the spec of course. I think we still should optimize for a single entity providing Waku nodes. Which entity would do that does not matter.

I also don't think we are likely to change peer discovery mechanism in version zero, but for future iterations for sure.

I think we should because it's gonna be a problem for Nim. As far as I know they only have Discovery V4 and that's useless for us (Disc v4 is not a topic-based peer discovery algorithm).

I think we should keep it as simple as possible, so either static nodes or EIP 1459: Node Discovery via DNS, but providing Waku nodes instead of bootnodes.

@oskarth
Contributor

oskarth commented Nov 21, 2019

I think we should keep it as simple as possible, so either static nodes or EIP 1459: Node Discovery via DNS, but providing Waku nodes instead of bootnodes.

Agree with KISS, I wonder how this plays into the finding closest mailserver change?

--

This also seems to touch on a larger research issue, which is finding a discovery mechanism for resource-restricted devices.

See Eth2 and libp2p (https://github.com/ethereum/eth2.0-specs/blob/02bb92e71455adaa7da101563a6c367efe9e1cc7/specs/networking/p2p-interface.md#the-discovery-domain-discv5) as well as the Swarm Kademlia light connection strategy.

What about splitting this into two issues, one vac/research and another KISS one for status/spec deploy, which is more like static nodes or possibly the DNS thing? cc @decanus too

@adambabik
Author

Agree with KISS, I wonder how this plays into the finding closest mailserver change?

It can still be applied. You get a list of Waku nodes via DNS and HTTPS, collect RTTs for all of them, and pick the closest one.
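
A minimal sketch of that selection step, assuming the node list has already been fetched and that "closest" simply means the lowest time for one TCP handshake. The names `tcp_rtt` and `pick_closest` are hypothetical, and a real client would probably take several samples per node rather than one.

```python
import socket
import time
from typing import Callable, Iterable, Optional, Tuple

Node = Tuple[str, int]  # (host, port)


def tcp_rtt(node: Node, timeout: float = 2.0) -> Optional[float]:
    """Time a single TCP handshake in seconds; None if the node is unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection(node, timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def pick_closest(
    nodes: Iterable[Node],
    measure: Callable[[Node], Optional[float]] = tcp_rtt,
) -> Optional[Node]:
    """Return the reachable node with the lowest measured RTT, or None."""
    best = None
    for node in nodes:
        rtt = measure(node)
        # Skip unreachable nodes; keep the lowest RTT seen so far.
        if rtt is not None and (best is None or rtt < best[0]):
            best = (rtt, node)
    return best[1] if best else None
```

The `measure` parameter is injectable so the selection logic can be exercised without touching the network.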

What about splitting this into two issues, one vac/research and another KISS one for status/spec deploy, which is more like static nodes or possibly the DNS thing?

Yes, that is exactly the approach I would like to take for the time being.

@decanus
Contributor

decanus commented Nov 22, 2019

@adambabik why would a rendezvous only protocol not work?

@adambabik
Author

@decanus that would work, but this protocol (1) is not used by anyone else and (2) is not researched very well; for example, I have no idea what kind of security guarantees it makes.

To me, it seems like EIP 1459 is a safer choice. With rendezvous you also need to have bootnodes, so the solution described in EIP 1459 would still be needed. Basically, having EIP 1459 is a prerequisite to whatever dynamic peer discovery mechanism we decide on.

@oskarth
Contributor

oskarth commented Dec 2, 2019

@decanus what's the state of this issue?

@oskarth
Contributor

oskarth commented Dec 5, 2019

Issue not well defined, too broad. Should be factored out into a more minimal set of acceptance criteria (and a more long-term research question).

@oskarth
Contributor

oskarth commented Dec 16, 2019

@decanus @adambabik what's happening with this issue? Are you going to factor it out into something more minimal? It's been pending for two weeks with no updates...

@adambabik
Author

We want Waku/0 with both geth and Nimbus, so we should figure out the easiest solution. Like I said in the comment, EIP-1459 seems the easiest and will stay useful in the future. It requires development in both status-go and Nimbus, but considerably less than implementing the current peer discovery algorithm in Nimbus.

@decanus
Contributor

decanus commented Dec 17, 2019

@adambabik so I think a problem with the DNS option is one we have also documented in the rendezvous issue: blocking by an adversary becomes rather simple. I am wondering what other schemes we can use in case an adversary blocks access to certain DNS entries. See status-im/rendezvous#8

@adambabik
Author

@decanus not sure we should try to solve this problem in this issue. We don't even have the simplest possible way to share node IPs except for hardcoding them in the code/app itself.

With the current solution, an adversary can block our bootnode IPs and we would need to release a new version of the app or share the new bootnode IPs with the community in some way. Blocking DNS entries might be a bit more tricky and one can always switch DNS resolver.
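
To illustrate the "switch DNS resolver" point, here is a sketch of trying several resolvers in order until one returns records. Everything here is hypothetical: `lookup_with_fallback` is a made-up name, and each resolver is just a callable that takes a domain and returns TXT record values (how it talks to a DoH endpoint or a plain DNS server is left out). It obviously does not help if an adversary blocks every resolver or the node IPs themselves.

```python
from typing import Callable, List, Sequence

# A resolver is any callable mapping a domain to its TXT record values (assumption).
Lookup = Callable[[str], List[str]]


def lookup_with_fallback(domain: str, resolvers: Sequence[Lookup]) -> List[str]:
    """Try each resolver in order; return the first non-empty answer."""
    failures = 0
    for resolve in resolvers:
        try:
            records = resolve(domain)
        except Exception:
            # Blocked, timed out, or otherwise failed: move on to the next resolver.
            failures += 1
            continue
        if records:
            return records
    raise LookupError(
        f"no resolver returned records for {domain} ({failures} raised errors)"
    )
```

The answers would still need to be authenticated (e.g. via the signed root in EIP 1459), since a fallback resolver is not necessarily more trustworthy than the first one.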

In my opinion, using DNS, for now, improves the current situation and also allows us to remove a huge chunk of the complicated architecture we need to maintain and, in the case of Nim, develop.

@decanus
Contributor

decanus commented Dec 17, 2019

@adambabik I agree that DNS is probably a good solution, for now at least, and that it solves most of our issues. I'm highlighting that concern here so that it is persisted somewhere.

@decanus
Contributor

decanus commented Dec 18, 2019

I am currently looking into various protocols that use some form of DNS discovery. Both libp2p and Bitcoin come to mind, although the libp2p DNS spec is for LAN-based discovery (see mDNS). It probably makes sense to make ours similar to other formats that follow an existing standard.

@kdeme
Contributor

kdeme commented Dec 18, 2019

I don't mind initially going for static nodes or for EIP 1459, and using it together with Discovery v5 in the longer run.
Regarding the latter, Discovery v5 (latest spec) is already far along in development in nim-eth. It comes with ENR support, which is also needed for EIP-1459.

@decanus
Contributor

decanus commented Dec 19, 2019

Oh, also, I thought that instead of using ENR we could use multiaddrs. Seems like a more versatile standard. @kdeme what do you think?

@decanus
Contributor

decanus commented Dec 19, 2019

@adambabik @oskarth @kdeme, I decided the best way forward here was to quickly write up a forum post on how I would suggest doing this: https://forum.vac.dev/t/node-discovery-via-dns/29. If this is fine we can spec it out; if not, we can discuss further.

@fjl

fjl commented Jan 13, 2020

I'd be so happy to see Waku use the devp2p discovery stack as-is. If you already have an implementation of discv5 proper, you can tune it for your needs.

@oskarth changed the title from "Waku 0: peers discovery mechanism" to "Better peer discovery mechanism" on Apr 20, 2020
@oskarth closed this as completed on Mar 31, 2021
@vacp2p locked and limited conversation to collaborators on Mar 31, 2021
