Expected behaviour
When a sealer node is down or has a persistent network problem, out-of-turn signing requests keep the network from stalling while every sealer waits for the missing one. After a "wiggle" time, the other sealers start sealing a new block. To keep all sealers from signing out of turn at the same moment, each one adds a randomized delay.
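For reference, the delay in question can be sketched roughly as follows. This is a simplified, self-contained model and not the actual Clique code: the 500ms `wiggleTime` unit and the `numSigners/2+1` factor are assumptions, chosen because they reproduce the wiggle=1.5s seen with 4 sealers, and `wiggleFor` and `outOfTurnDelay` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// wiggleTime is the assumed per-signer unit of random delay. 500ms is an
// assumption that reproduces the wiggle=1.5s reported for 4 sealers.
const wiggleTime = 500 * time.Millisecond

// wiggleFor returns the upper bound of the random out-of-turn delay
// for a network with numSigners sealers (hypothetical helper).
func wiggleFor(numSigners int) time.Duration {
	return time.Duration(numSigners/2+1) * wiggleTime
}

// outOfTurnDelay draws a uniformly random extra delay in [0, wiggle).
// Every sealer draws independently, so nothing prevents two sealers
// from ending up with nearly the same value.
func outOfTurnDelay(numSigners int) time.Duration {
	return time.Duration(rand.Int63n(int64(wiggleFor(numSigners))))
}

func main() {
	fmt.Println("wiggle:", wiggleFor(4))
	for i := 0; i < 3; i++ {
		fmt.Printf("sealer %d extra delay: %v\n", i, outOfTurnDelay(4))
	}
}
```

The point of the sketch is that each node only knows its own draw, so the spread between nodes is left entirely to chance.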
Actual behaviour
Unfortunately, the randomization doesn't reliably spread the delays across the nodes. When out-of-turn signing requests happen often, there are also many "collisions" between the out-of-turn blocks, because different nodes calculate more or less the same delay.
Steps to reproduce the behaviour
Let's assume we have 4 sealers in our network with a block time of 5s, and 1 of the sealers is not reachable. Then we have an out-of-turn signing request every 20s, i.e. once per round of 4 blocks.
With wiggle=1.5s, we can expect the delays on the 3 remaining nodes to differ by less than 0.1s in roughly every 5th attempt (I'm not sure my mental calculation is correct, but I'm pretty sure it happens often).
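As a rough check of that estimate, here is a small sketch under stated assumptions: the 3 delays are independent and uniform on [0, wiggle), and a "collision" means the two earliest delays are less than 0.1s apart (my interpretation of the estimate above). Under those assumptions the probability has the closed form 1 - (1 - gap/wiggle)^n, which for n=3, gap=0.1s, wiggle=1.5s comes to about 0.19, so "every 5th attempt" looks about right. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// closeCallProbability returns the probability that the two earliest of
// n independent uniform delays on [0, wiggle) are less than gap apart.
// It relies on the spacing between adjacent uniform order statistics:
// P(spacing > gap) = (1 - gap/wiggle)^n.
func closeCallProbability(n int, wiggle, gap float64) float64 {
	if gap >= wiggle {
		return 1
	}
	return 1 - math.Pow(1-gap/wiggle, float64(n))
}

// simulate estimates the same probability by drawing the delays
// repeatedly with a seeded generator. It requires n >= 2.
func simulate(n int, wiggle, gap float64, rounds int, seed int64) float64 {
	rng := rand.New(rand.NewSource(seed))
	delays := make([]float64, n)
	hits := 0
	for r := 0; r < rounds; r++ {
		for i := range delays {
			delays[i] = rng.Float64() * wiggle
		}
		sort.Float64s(delays)
		// Only the gap between the two earliest sealers is counted.
		if delays[1]-delays[0] < gap {
			hits++
		}
	}
	return float64(hits) / float64(rounds)
}

func main() {
	fmt.Printf("closed form: %.3f\n", closeCallProbability(3, 1.5, 0.1))
	fmt.Printf("simulated:   %.3f\n", simulate(3, 1.5, 0.1, 200000, 1))
}
```

If any pair of delays within 0.1s counts as a collision instead, the probability is higher, so the estimate is on the conservative side.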
Suggestion
Maybe we can use the position of each sealer in the snap.Signers array to get a reliably different delay on each node. However, we also have to make sure that the 1st sealer in the array doesn't always get the minimum delay, because then the same node would always seal the out-of-turn blocks. Maybe also do a blockNumber % (number-of-sealers + 1) or something similar.
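A minimal sketch of what this could look like, purely as an illustration of the idea and not a tested patch: `positionalDelay` is a hypothetical function, `index` stands for the sealer's position in the signer list, and the way the delay slots are spaced (wiggle divided by the number of sealers) is my assumption.

```go
package main

import (
	"fmt"
	"time"
)

// exampleWiggle matches the wiggle=1.5s quoted above for 4 sealers.
const exampleWiggle = 1500 * time.Millisecond

// positionalDelay is a hypothetical replacement for the random delay.
// Each sealer gets a distinct slot derived from its position in the
// signer list, so two sealers never compute the same delay.
func positionalDelay(index, numSigners int, blockNumber uint64, wiggle time.Duration) time.Duration {
	// Rotate the slots with the block number. The modulus numSigners+1
	// keeps the offset changing even when the out-of-turn blocks are
	// always exactly numSigners apart (one sealer permanently down).
	offset := int(blockNumber % uint64(numSigners+1))
	rank := (index + offset) % numSigners
	return time.Duration(rank) * wiggle / time.Duration(numSigners)
}

func main() {
	// Assume sealer 3 of 4 is down, so blocks 3, 7, 11, ... need an
	// out-of-turn signer from the remaining sealers 0..2.
	for _, number := range []uint64{3, 7, 11, 15} {
		fmt.Printf("block %d:", number)
		for index := 0; index < 3; index++ {
			fmt.Printf(" sealer%d=%v", index, positionalDelay(index, 4, number, exampleWiggle))
		}
		fmt.Println()
	}
}
```

With these numbers the delays are always at least 375ms apart, and the sealer with the smallest delay changes from block to block. The sketch ignores any other constraints on who is currently allowed to sign, so it would need to be checked against the real sealing rules.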
In the meantime we've voted out one sealer, so there are three left. All sealers now have direct connections to each other, but the out-of-turn signing requests still happen after every round (which is now every 15s).
System information
Geth version:
OS & Version: RHEL 7.9