Efficiency goals #339

Open · pipermerriam opened this issue Dec 28, 2020 · 1 comment

@pipermerriam (Member):

The current master (5738cd8) fully demonstrates how the network is intended to operate:

  • Nodes broadcast advertisements for content they have stored locally.
  • Nodes maintain a database of advertisements for content that is "near" them.
  • Nodes gather and serve content that is "near" them.
  • Nodes can look up content and retrieve it from the network.

However, the network does not operate at an efficiency level suitable for hosting the entirety of the mainnet chain history.

Mainnet Numbers

The mainnet has ~12 million blocks, each contributing a header, a block body, and receipts (~36 million items). We can optionally add the canonical chain index to this list, which adds another 12 million. That gives us ~48 million things that need to be stored today. To be future proof, we can scale this up to 100 million things, which should be adequate for the immediate foreseeable future.

These things average about 1kb in size: block headers are about 550 bytes, early blocks and receipts are very small, and later blocks and receipts are in the 50kb range. In total this data is about 100GB.

For the network to operate healthily, the data needs to be replicated across a few nodes. We'll pick 10 as our desired minimum replication factor.

So, with 100 million things and a 10x replication factor, we need to be able to store 1 billion things in the network. The current mainnet comprises about 10,000 nodes, which we can use as a baseline, meaning that each node on the network would need to store 100,000 things.

Under the current network architecture, which uses advertisements to locate data, each node needs to be able to fully advertise its content in a reasonable amount of time. One hour seems to be a reasonable bound to place on this, which gives us a minimum advertisement rate: 100,000 things / 1 hour ≈ 28 things/second.
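
To keep these back-of-envelope numbers honest, here is the arithmetic spelled out (plain Python; every constant is an estimate from above, not a measured value):

```python
# Back-of-envelope sizing; all constants are estimates from the text above.
BLOCKS = 12_000_000                 # mainnet blocks today
ITEM_TYPES = 4                      # header, block, receipts, canonical index entry
ITEMS_FUTURE_PROOF = 100_000_000    # ~48M today, rounded up for headroom
AVG_ITEM_SIZE = 1_000               # ~1kb average per item
REPLICATION_FACTOR = 10             # desired minimum copies of each item
NODE_COUNT = 10_000                 # baseline network size
ADVERTISE_WINDOW = 60 * 60          # fully re-advertise within one hour

items_today = BLOCKS * ITEM_TYPES                        # ~48 million
total_bytes = ITEMS_FUTURE_PROOF * AVG_ITEM_SIZE         # ~100 GB
stored_items = ITEMS_FUTURE_PROOF * REPLICATION_FACTOR   # 1 billion
items_per_node = stored_items // NODE_COUNT              # 100,000
ads_per_second = items_per_node / ADVERTISE_WINDOW       # ~27.8

print(f"{items_today:,} items today, {total_bytes / 1e9:.0f} GB total")
print(f"{items_per_node:,} items/node -> {ads_per_second:.1f} ads/second")
```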

At present the client is running at about 1 thing/second.

There is plenty of room to gain efficiency here. Currently, each advertisement comprises the following calls (a sketch follows the list):

  • 1x Network.broadcast
    • 1x Network.explore to source nodes for the broadcast
      • 1x Network.recursive_find_nodes to seed the exploration
      • 1x Network.ping/pong exchange with each node that we find, to check liveness
      • 1+ Network.find_nodes for each node to further seed the exploration
    • 1x Network.advertise to send the advertisement
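
For illustration, here is roughly what that call tree looks like in code. The method names mirror the list above; the signatures and async structure are assumptions, not the actual client API:

```python
# Illustrative sketch only: method names come from the list above, but the
# signatures and control flow are assumed, not taken from the client.
async def broadcast(network, advertisement):
    # 1x Network.explore: source nodes for the broadcast.
    nodes = await explore(network, advertisement.content_id)
    # 1x Network.advertise per node: send the advertisement.
    for node in nodes:
        await network.advertise(node, advertisement)


async def explore(network, content_id):
    live_nodes = []
    # 1x Network.recursive_find_nodes: seed the exploration.
    for node in await network.recursive_find_nodes(content_id):
        # 1x Network.ping/pong with each node found: liveness check.
        if not await network.ping(node):
            continue
        live_nodes.append(node)
        # 1+ Network.find_nodes per node: further seed the exploration.
        await network.find_nodes(node, content_id)
    return live_nodes
```

Every advertisement pays for the full exploration and liveness checking, which is where the optimizations below come in.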

Places where we can gain efficiency:

  • The Network.explore can be cached by continually maintaining a slightly out-of-date snapshot of the network and returning data from the snapshot.
  • The Network.ping/pong liveness check can be skipped if we know the node was recently reachable (see the sketch after this list).
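
A minimal sketch of both shortcuts, assuming hypothetical snapshot and last-seen bookkeeping (none of this is actual client code):

```python
import time

LIVENESS_TTL = 300.0  # assumed: "recently reachable" = within 5 minutes

class CachedExplorer:
    """Answer explore() from a periodically refreshed, slightly stale
    snapshot of the network, and skip ping/pong for recently seen nodes."""

    def __init__(self, network):
        self.network = network
        self.snapshot = []    # slightly out-of-date view of known nodes
        self.last_seen = {}   # node_id -> monotonic time of last response

    def mark_alive(self, node_id):
        self.last_seen[node_id] = time.monotonic()

    async def is_alive(self, node):
        # Skip the ping/pong exchange if the node was recently reachable.
        seen = self.last_seen.get(node.id)
        if seen is not None and time.monotonic() - seen < LIVENESS_TTL:
            return True
        if await self.network.ping(node):
            self.mark_alive(node.id)
            return True
        return False
```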

These things should easily give us a significant boost in advertisements/second.

@pipermerriam (Member Author):

I think the first step here is to measure our maximum messages/second throughput, both in the core protocol and in the alexandria sub-protocol, so that we know the theoretical limit.
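
Something like the following could give a first-order number, assuming a hypothetical client.ping(node) coroutine (the real harness would need to hook into both protocols):

```python
import time

# Assumed harness: `client.ping` is a placeholder for a single
# request/response round trip in whichever protocol is being measured.
async def measure_throughput(client, node, duration=10.0):
    count = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        await client.ping(node)  # one message round trip
        count += 1
    return count / duration  # messages/second
```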
