
[ProjectTracking]: Resolve network concerns for stateless validation #61

Open
Tracked by #46
Longarithm opened this issue Mar 19, 2024 · 3 comments

@Longarithm
Member

Longarithm commented Mar 19, 2024

Goals

Make sure that increased network usage and network latencies don't impact mainnet stability.

Initial analysis of the problem space

  • Latency: We have observed chunk production failures even when the state witness is small or empty, which could be due to network latency being too high. @Tayfun Elmas (tayfunelmas)'s PR #10824 should help determine whether that is actually the case. @Jan Ciołek (jancionear)'s proposal of switching to TIER1 could significantly help with latency in general, not only due to the use of direct connections but also because TIER2 routing only minimizes hop count, not latency. I also wonder whether we have observed anything indicating that the state witness is arriving, but arriving too late?
  • Throughput: Alex observed very poor throughput for certain pairs of locations. This is interrelated with latency, because TCP (and QUIC as well) adjusts its send rate based on observed round-trip time.
  • Loss: We also see significant packet loss (?) between certain pairs of locations. I think this is the part that has received the least discussion; could you share or point me to your findings, @Alex Logunov (Longarithm)? Packet loss would also have a significant impact on TCP send rate.
  • Bandwidth Usage: The lower bound on the bandwidth usage to distribute a payload of size S to N nodes is S * N, in both total egress and total ingress. Nodes pay for bandwidth, and it is not cheap. Due to the low diameter of the TIER2 network, I would expect that even today the bandwidth usage is only roughly 2 * S * N. As Robin pointed out, we can optimize it to roughly S * (N + sqrt(N)) with some kind of fanout. As I understand it, this is not an important direction at the moment, since monetary cost is currently not a blocker for mainnet, and there should only be roughly 2x room for improvement in this area from network efficiency.
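The bandwidth figures above can be sanity-checked numerically. A minimal sketch, assuming an illustrative payload of S = 2 MB and N = 100 recipient nodes (both assumed values, not measurements from the thread):

```python
import math

# Back-of-the-envelope egress estimates for distributing a payload of
# size s_mb to n nodes. All concrete figures here are assumptions.

def lower_bound_mb(s_mb: float, n: int) -> float:
    """Theoretical minimum total egress: every node must receive S once."""
    return s_mb * n

def tier2_today_mb(s_mb: float, n: int) -> float:
    """Rough current usage on the low-diameter TIER2 network: ~2 * S * N."""
    return 2 * s_mb * n

def fanout_mb(s_mb: float, n: int) -> float:
    """Approximate usage with a sqrt(N)-style fanout: ~S * (N + sqrt(N))."""
    return s_mb * (n + math.sqrt(n))

S, N = 2.0, 100  # assumed: 2 MB payload, 100 recipients
print(lower_bound_mb(S, N))  # 200.0
print(tier2_today_mb(S, N))  # 400.0
print(fanout_mb(S, N))       # 220.0
```

With these assumptions the fanout scheme (220 MB) sits close to the 200 MB lower bound, consistent with the claim that only roughly 2x improvement is available over today's ~2 * S * N.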

Issues

  • Statelessnet shard 5 stalls under high load: the only chunk producer's upload network queues overflow because of large state witnesses.
  • Localnet with 2 distant validators misses 25% of chunks, apparently because 0.6 s is not enough to endorse even an empty chunk.

Links to external documentation and discussions

https://near.zulipchat.com/#narrow/stream/407237-core.2Fstateless-validation/topic/mainnet.20traffic/near/421904005

Estimated effort

TBD

Assumptions

Average state witness size is 2 MB and the worst case is 16 MB. So,

  • A chunk producer periodically has to send 200 MB worth of data (2 MB to each of ~100 validators) within 1 s on average, and 1.6 GB within 1 s in the worst case.
  • A chunk validator has to receive 2 MB/s on average and 16 MB/s in the worst case.
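The 200 MB figure implies roughly 100 chunk validators (200 MB / 2 MB); a sketch of the arithmetic, with the validator count as an assumption inferred from that ratio:

```python
avg_witness_mb = 2      # average state witness size (from the assumptions)
worst_witness_mb = 16   # worst-case state witness size
validators = 100        # assumed, implied by 200 MB / 2 MB

# The chunk producer pushes one full witness copy to every validator
# within roughly a 1-second window.
producer_avg_mb = avg_witness_mb * validators      # 200 MB
producer_worst_mb = worst_witness_mb * validators  # 1600 MB = 1.6 GB

# Each chunk validator only has to receive a single copy.
validator_avg_mbps = avg_witness_mb      # 2 MB/s
validator_worst_mbps = worst_witness_mb  # 16 MB/s

print(producer_avg_mb, producer_worst_mb)  # 200 1600
```

The asymmetry is the core problem: receive-side load stays modest while the producer's send-side burst scales linearly with the validator count.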

Pre-requisites

#46

Out of scope

NA

@walnut-the-cat

@saketh-are and @shreyan-gupta, is there a different GitHub issue we are using for network optimization? If not, let's update this one as we make progress.

@walnut-the-cat

Current plans (by @saketh-are )

  • Based on the current experiments, we can tune our P2P TCP connections to handle traffic bursts of size 2 MB without latency degradation. We can push this limit further, but doing so requires allocating more and more memory for send/recv buffers, which at some point becomes an issue when multiplied by the number of connections. Assume 2 MB for now.
  • We will use erasure coding to break the state witness into 50 parts, each of which is sent on a separate direct connection to a separate chunk validator. Each CV then forwards its part to all other CVs. This 2-hop strategy is what is done in prod for chunk distribution, which we know works reasonably well to deliver chunk data within the block production timeframe. The erasure coding redundancy adds 50% overhead; that is, if we can support a 2 MB message, the useful payload we actually deliver is 2 MB * 2/3 ≈ 1.33 MB.
  • So far we have tested handling of a burst of traffic on a single connection. It remains to be seen to what extent performance degrades when the chunk producer has to send on 50 different connections at once. I'll try to obtain some numbers on this today. Pending those results, an upper bound on the state witness size we may be able to handle after implementing the erasure coding currently looks like 2 MB * 50 * 2/3 ≈ 66 MB.
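Putting the plan's numbers together (2 MB burst per connection, 50 parts, rate-2/3 erasure code, all taken from the bullets above):

```python
burst_per_conn_mb = 2.0  # tested per-connection burst without latency degradation
parts = 50               # one erasure-coded part per chunk validator
code_rate = 2 / 3        # 1.5x redundancy => only 2/3 of sent bytes are payload

# Useful payload that fits within one connection's burst budget.
useful_per_conn_mb = burst_per_conn_mb * code_rate

# Upper bound on total state witness size across all 50 connections,
# assuming per-connection performance holds when sending on all at once.
max_witness_mb = burst_per_conn_mb * parts * code_rate

print(round(useful_per_conn_mb, 2))  # 1.33
print(round(max_witness_mb, 1))      # 66.7
```

Note the caveat from the last bullet: the ~66 MB bound only holds if sending on 50 connections simultaneously does not degrade the per-connection 2 MB figure.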

Benefit of erasure encoding (by @shreyan-gupta )

  • It significantly reduces the instantaneous network requirements for a single chunk producer. Without it, we needed to send 2 MB * 50 validators = 100 MB of data from one node to all other validators. With erasure coding, we need to send (2 MB * 1.5 overhead / 50 parts) * 50 validators of data, which is just ~3 MB.
  • It's "fairer" to validators, in the sense that random connection drops would not cause individual validators to miss the state witness. In the future this opens up the possibility of rewarding validators based on whether they generated an endorsement that gets included in the block.
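The reduction in the producer's instantaneous send volume can be sketched as follows (2 MB witness, 50 validators, 1.5x overhead, per the comment above):

```python
witness_mb = 2.0   # state witness size
validators = 50    # chunk validators, one erasure-coded part each
overhead = 1.5     # redundancy factor of the erasure code

# Without erasure coding: the producer sends the full witness to everyone.
without_ec_mb = witness_mb * validators  # 100 MB

# With erasure coding: one small part per validator; validators then
# re-share parts among themselves on the second hop.
part_mb = witness_mb * overhead / validators
with_ec_mb = part_mb * validators  # ~3 MB total from the producer

print(without_ec_mb)      # 100.0
print(round(with_ec_mb))  # 3
```

The total traffic across the network is unchanged to first order; the scheme moves the bulk of the fanout from the single producer onto the second hop, spread across all validators.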
