Network Improvements to TaskManager in Neo 3.0 - Prefer lower latency & higher availability nodes #542
Labels
- discussion - initial issue state: proposed but not yet accepted
- enhancement - type: changes that may affect performance, usability, or add new features to existing modules
- p2p - module: peer-to-peer message exchange and network optimisations, at TCP or UDP level (not HTTP)
While the network should still communicate with nodes randomly to ensure decentralization, it could weight its random selection based on node health as determined by latency, availability, and similar metrics.
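A minimal sketch of what weighted-but-still-random selection could look like. The health score formula, field names, and values below are illustrative assumptions, not anything currently in TaskManager:

```python
import random

def health_score(avg_latency_ms, availability):
    # Assumed scoring: lower latency and higher availability (0..1)
    # give a higher score; the +1 avoids division by zero.
    return availability / (avg_latency_ms + 1.0)

def pick_peer(peers):
    # peers: list of (peer_id, avg_latency_ms, availability).
    # random.choices keeps every peer selectable (decentralization),
    # but healthier peers are drawn proportionally more often.
    weights = [health_score(lat, avail) for _, lat, avail in peers]
    return random.choices([pid for pid, _, _ in peers], weights=weights, k=1)[0]

peers = [("fast", 20, 0.99), ("slow", 400, 0.60)]
counts = {pid: 0 for pid, _, _ in peers}
for _ in range(10_000):
    counts[pick_peer(peers)] += 1
# "fast" dominates, but "slow" still gets some share of requests.
```

The point of weighting rather than hard filtering is that an unhealthy node is de-prioritized, not excluded, so the network still probes all peers.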
(Following two paragraphs taken from one of my responses on #522)
By measuring network latency and the full "data request/response" latency, and tracking these per remote node, we can optimize the network and potentially protect against bad actors. Recently, to prevent waiting too long for block data, a change was made that allows a block to be requested from up to 3 remote nodes. If we knew more about the health (average network and data latencies) of the nodes from which blocks are requested, we could weight down talking to unhealthy nodes and have better confidence using less redundancy. This would require a number of changes to TaskManager.
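One cheap way to keep both averages per remote node is an exponential moving average, which stays current without storing sample history. The class, field names, and smoothing factor here are assumptions for illustration, not existing TaskManager code:

```python
class NodeHealth:
    """Assumed per-peer health record tracking two EMAs:
    network latency (e.g. ping round-trip) and full data
    request/response latency, both in milliseconds."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha           # EMA smoothing factor (assumed value)
        self.network_latency = None  # EMA of transport-level latency, ms
        self.data_latency = None     # EMA of getdata request/response, ms

    def _ema(self, current, sample):
        # First sample seeds the average; later samples blend in.
        if current is None:
            return float(sample)
        return (1 - self.alpha) * current + self.alpha * sample

    def record_network(self, ms):
        self.network_latency = self._ema(self.network_latency, ms)

    def record_data(self, ms):
        self.data_latency = self._ema(self.data_latency, ms)

h = NodeHealth()
for sample in (100, 100, 500):  # one slow response nudges the average, not dominates it
    h.record_data(sample)
```

An EMA also degrades gracefully for a node that turns bad: its average climbs with each slow response, so its selection weight falls without any explicit ban list.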
Currently, the first nodes to respond with block hashes are the ones that get the task to return block data. This already has the effect of somewhat favoring nodes with lower latency; however, responding to getblocks can be mostly an in-memory operation, while responding to getdata for a block far in the past may be more of a disk operation. So a node asked for blocks far back in the chain may be faster than others to respond to getblocks yet slower than others to respond to getdata. Keeping per-node 'data request/response latency' metrics could factor this in, if a scheduling algorithm were built into TaskManager that considered more information about all the connected remote nodes when making data requests.
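To make the getblocks/getdata distinction concrete, here is a hypothetical scheduling helper that assigns the getdata task by measured data latency rather than by whoever answered getblocks first. The function name, default value, and peer ids are all illustrative assumptions:

```python
def assign_getdata(candidates, data_latency_ms):
    """Pick which announcing peer should serve the getdata request.

    candidates: peer ids that announced the needed block hash.
    data_latency_ms: EMA of full getdata request/response latency per
    peer; peers with no measurement yet get a neutral default so that
    new peers still receive tasks and get measured.
    """
    DEFAULT_MS = 200.0  # assumed neutral prior for unmeasured peers
    return min(candidates, key=lambda p: data_latency_ms.get(p, DEFAULT_MS))

# "b" answered getblocks quickly (in-memory) but serves old block data
# slowly (disk-bound); "c" has no history yet.
latencies = {"a": 50.0, "b": 900.0}
chosen = assign_getdata(["a", "b", "c"], latencies)
```

Under this sketch, "a" gets the task even if "b" was first to return block hashes, which is exactly the case the paragraph above describes.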