New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/optimise consensus sigcheck #4467
Conversation
- do not pass multisigner as parameter since it is not needed anymore - no need to verify agg sig when created only with valid sigs
…n-on-consensus Optimistic signature aggregation by leader
the mechanism shouldn't be susceptible to exploits, of this nature (to trigger node drops) when the nodes behave-follow the protocol, as the blacklisting relies on 2 signatures verification, where one of the signatures (p2p - ecdsa w/ secp256k1 verifies) but the BLS sig is invalid, which is clearly malicious behavior, and cannot happen when nodes follow the protocol. As for how many nodes can be dropped (jailed due to the blacklisting) - there is no limit, as many as there are. Even if all nodes would be jailed, there is one protection where part of the jailed nodes are still able to participate in consensus in order to maintain/satisfy the 400 nodes/shard constraint |
…-log-message fix consensus message pubkey in log trace
…closing fix peer blacklist handler closing
…-consensus Merge 1.4.0 into optimize consensus
consensus/message.proto
Outdated
@@ -22,4 +22,5 @@ message Message { | |||
bytes AggregateSignature = 11; | |||
bytes LeaderSignature = 12; | |||
bytes OriginatorPid = 13; | |||
bytes InvalidSigners = 14; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invalid indentation here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
singleSigner := sr.SingleSigner() | ||
err = singleSigner.Verify(pubKey, cnsMsg.BlockHeaderHash, cnsMsg.SignatureShare) | ||
if err != nil { | ||
log.Trace("verifyInvalidSigner: confirmed that node provided invalid signature", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe turn this to debug level? This should not happen normally, and when it happens we should see the print only once/node/round
consensus/blacklist/blacklist.go
Outdated
case <-ctx.Done(): | ||
log.Debug("peerBlacklist's go routine is stopping...") | ||
return | ||
case <-time.After(durationSweepP2PBlacklist): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion to replace the time.After with a timer.
Example: https://github.com/ElrondNetwork/elrond-go/blob/bd762dcc88845e11bc2ad51fe0d9af3dedbf37c8/common/statistics/resourceMonitor.go#L173
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a timer here
} | ||
|
||
// BlacklistPeer will blacklist a peer for a certain amount of time | ||
func (pb *peerBlacklist) BlacklistPeer(peer core.PeerID, duration time.Duration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a lot to remember where I'd seen this. Found it in the initP2PAntiFloodComponents
function and in the peerTimeCache
and p2pAntiflood
structs. We might reuse this code in those antiflooding components one day.
…se-consensus-sigcheck fixes after review: fix indentation + use timer in blacklist
…-consensus Merge 1.5.0 feat optimize consensus
…ix-integration-tests Fix integration tests
2809726
Description of the reasoning behind the pull request
This PR brings several changes on the way block signatures are handler during consensus in order to optimize the CPU usage.
Each signature share used in the aggregation of signatures for the block consensus proof, was previously individually verified.
Assuming all validators are honest, the verification takes ~3ms per signature. In metachain, each block accumulates ~400 signatures which is individually checked by all participants in consensus, taking a cumulative ~1.2s CPU time for each block that passes consensus.
Proposed Changes
This 1.2 seconds for metachain and ~200 ms for shards each round (for the synchronized nodes) can be optimized out, if instead the aggregation is done assuming all signature shares are valid, and verification is done on the aggregation instead.
The cost of the verification of the aggregated signature was optimized by the previous PR #4314 down to ~3ms as well.
This however can only be saved if all aggregated signatures are valid, otherwise, if the aggregated signature does not verify we will have an extra cost of 3ms, as the leader will aggregate first the signatures and then verify the aggregated signature. If the aggregated signature does not verify, the leader will have to check the signatures individually.
In order to avoid the increase and instead always have this optimization, a penalty for providing invalid signatures will be implemented which we call a pseudo slashing, that should provide enough disincentive.
The pseudo slashing is implementing through the addition of an extra message with the proof for received invalid signatures, sent on the consensus. This message is not mandatory to be sent during a consensus round, but if it is sent, the proof will be verified and used by nodes listening to the consensus messages to temporarily blacklist the misbehaving nodes.
While blacklisted, the messages from the nodes in the blacklist will be dropped and this will cause rating drop and the eventual jailing, that penalize the node through the unjailing fee, missing block rewards/fees and queue time for becoming eligible for consensus again.
Testing procedure
Standard system test
run with log trace for consensus
Custom system test with invalid signers
optimistic-consensus-sigcheck-malicious-node
TDB