-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consensus: require multiple receipts to seal #450
Conversation
…re sealing; * fixed logic for re-requesting second receipt (as a single one is insufficient now); * added unit tests
requester.WithRetryInitial(2*time.Second), | ||
requester.WithRetryMaximum(30*time.Second), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default for the RetryMaximum
was 2min, which is way too slow as it directly impacts time to seal , which is critical for NBA
engine/consensus/matching/core.go
Outdated
// However, as soon as we have at least 2 receipts, the matching Engine will produce a candidate | ||
// seal and put it in the mempool. To prevent spamming the logs, we log only when there is | ||
// no seal in the IncorporatedResultSeals mempool: | ||
_, sealAlreadyGenerated := c.seals.ByID(incorporatedResult.ID()) // IncorporatedResultSeal has the same ID as incorporatedResult |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This caused some confusion in the previous version of this PR (see comments). I have revised the comment in response, specifically this part:
flow-go/engine/consensus/matching/core.go
Lines 736 to 738 in 5df7490
// seal and put it in the mempool. To prevent spamming the logs, we log only when there is | |
// no seal in the IncorporatedResultSeals mempool: | |
_, sealAlreadyGenerated := c.seals.ByID(incorporatedResult.ID()) // IncorporatedResultSeal has the same ID as incorporatedResult |
Additional context:
Please note the naming convention for matching.Core
:
- the persistent storage components have names ending in
DB
, such as:flow-go/engine/consensus/matching/core.go
Line 71 in 5df7490
headersDB storage.Headers // used to check sealed headers - Components without
DB
refer to mempools. Specifically,c.seals
has the following type:
flow-go/engine/consensus/matching/core.go
Line 76 in 5df7490
seals mempool.IncorporatedResultSeals // holds candidate seals for incorporated results that have acquired sufficient approvals; candidate seals are constructed without consideration of the sealability of parent results IncorporatedResultSeal
s, whose ID method is defined as
flow-go/model/flow/incorporated_result_seal.go
Lines 16 to 20 in 86124f6
// ID implements flow.Entity.ID for IncorporatedResultSeal to make it capable of // being stored directly in mempools and storage. func (s *IncorporatedResultSeal) ID() Identifier { return s.IncorporatedResult.ID() }
Co-authored-by: Yura <yuraolex@gmail.com>
engine/consensus/matching/core.go
Outdated
return false | ||
} | ||
|
||
l := c.log.Info(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this generate lots of info log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to change the log level. Let me know what you prefer.
The logging should essentially be almost identical to the initial version you built: #363 (comment) (including logging on info Level). The only noteworthy differences between your version and the revised behaviour as of this PR:
FirstUnmatchedChunkIndex: -1
is now omitted.SealedByEmergency
is only included when we actually evaluated the emergency sealing condition (as opposed to before, where the default value was false even if we didn't check whether the result qualified for emergency sealing)- we also have information whether 2 receipts for the result are known
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to use debug
log.
The log I built only print logs for next unsealed result, whereas the added log here will print log for all unsealed results.
Since you've added extensive logging for the next unsealed result, I think this logging could be debug level
… result has sufficient approvals (for all chunks)
# Conflicts: # engine/consensus/matching/indexer.go # storage/badger/receipts.go # storage/mock/execution_receipts.go # storage/receipts.go
ExecutedBlock *flow.Header // block the incorporated Result is for | ||
IncorporatedResult *flow.IncorporatedResult // the incorporated result | ||
|
||
// SufficientApprovalsForSealing: True iff all chunks in the result have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// SufficientApprovalsForSealing: True iff all chunks in the result have | |
// SufficientApprovalsForSealing: True if all chunks in the result have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is intentional and not a typo 😉 :
[Wiki]: In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false.
// FirstUnmatchedChunkIndex: Index of first chunk that hasn't received | ||
// sufficient approval (ordered by chunk index). Optional value: only set | ||
// if SufficientApprovalsForSealing == False and nil otherwise. | ||
FirstUnmatchedChunkIndex *uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we have pointers here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a proper value for FirstUnmatchedChunkIndex
in some cases. It only has a well-defined value if one of the chunks is missing approvals. If all chunks have sufficient approvals, what should we set FirstUnmatchedChunkIndex
to? Previously, we used to set it to -1
. Personally, I prefer to have this expressed more explicitly.
In Java, I would have implemented this using Optional
. Did some looking around, and it seems people tend to use pointer types for this in Go (-> stackoverflow) ... open to alternative suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To main purpose of the SealingTracker
is to provide logging information. We don't include the values if they haven't been set by the sealing logic:
flow-go/engine/consensus/matching/sealingtracker/sealing_tracker.go
Lines 86 to 94 in bedd704
if record.firstUnmatchedChunkIndex != nil { | |
kvps["first_unmatched_chunk_index"] = *record.firstUnmatchedChunkIndex | |
} | |
if record.qualifiesForEmergencySealing != nil { | |
kvps["qualifies_for_emergency_sealing"] = *record.qualifiesForEmergencySealing | |
} | |
if record.hasMultipleReceipts != nil { | |
kvps["has_multiple_receipts"] = *record.hasMultipleReceipts | |
} |
Looks good to me, just a few comments related to tests and some minor stuff. |
if rs == nil { | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if rs
is nil, then rs.SetHasMultipleReceipts
would hit nil pinter first
if rs == nil { | |
return | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, having nil
as legitimate value for SealingRecord
is just too confusing and error-prone. Now working only with non-nil
values SealingRecord
s. Hope that is clearer now 😅
ExecutedBlock: executedBlock, | ||
IncorporatedResult: incorporatedResult, | ||
} | ||
st.records = append(st.records, record) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when do we remove the record from it, would the memory keeps growing as we record and hold more?
engine/consensus/matching/core.go
Outdated
// have a child yet. Then, the chunk assignment cannot be computed. | ||
// - All other errors are unexpected and symptoms of internal bugs, uncovered edge cases, | ||
// or a corrupted internal node state. These are all fatal failures. | ||
func (c *Core) hasEnoughApprovals(incorporatedResult *flow.IncorporatedResult, sealingTracker *sealingtracker.SealingTracker) (*sealingtracker.SealingRecord, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is too complex as it takes in a stateful sealingTracker
type and returns another sealingRecord
type.
I would suggest not to depend on those types in order to keep it simple.
This caller seems only care about whether there is enough approvals for the given result, and the FirstUnmatchedChunkIndex
if not enough. In other words, we don't care about the chunk that had enough approvals, right?
So we could let this function return (FirstUnmatchedChunkIndex int, hasEnoughApprovals bool)
.
If we had enough approvals, we return (-1, true)
; if we the 3rd chunk didn't have enough approval, we return (2, false)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Agree. I was concerned about the complexity here as well.
I have refactored hasEnoughApprovals
to:
flow-go/engine/consensus/matching/core.go
Line 639 in 9fd1bbb
func (c *Core) hasEnoughApprovals(incorporatedResult *flow.IncorporatedResult) (*sealingtracker.SealingRecord, error) { |
While I still return the SealingRecord
for the incorporatedResult
, I have made several modifications which hopefully address the heart of your concern:
- removed the stateful
sealingTracker
from the arguments - Regarding the usage of
SealingRecord
, I have applied significant simplifications:SealingRecord
is now directly generated byhasEnoughApprovals
(i.e. I removedSealingTracker
as a factory forSealingRecord
s)SealingRecord
is never nil and always reflects accurately the sealing status of the incorporated Result- Essentially, the
SealingRecord
holds the values(FirstUnmatchedChunkIndex int, hasEnoughApprovals bool)
plus some other flags for later sealing checks.
Please let me know whether the code is clear enough with these revisions.
Int("sealable_incorporated_results", len(sealedBlockIDs)). | ||
Int64("duration_ms", time.Since(startTime).Milliseconds()). | ||
Str("next_unsealed_results", sealingTracker.String()). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does sealingTracker.String()
include all records, or just the next unsealed result?
Could you provide a sample log?
Better not all records, because there will be too much noise, as we only care about the next unsealed when debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only track the sealing status of the IncorporatedResult
(s) for the next unsealed finalized block:
flow-go/engine/consensus/matching/sealingtracker/sealing_tracker.go
Lines 58 to 65 in 3b33804
// Track tracks the given SealingRecord, provided the it should be tracked | |
// according to the SealingTracker's internal policy. | |
func (st *SealingTracker) Track(sealingRecord *SealingRecord) { | |
executedBlockID := sealingRecord.IncorporatedResult.Result.BlockID | |
if st.isRelevant(executedBlockID) { | |
st.records = append(st.records, sealingRecord) | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logging behaviour should be almost identical to the initial version you built: #363 (comment)
When missing approval for the next unsealed block
[{
"executed_block_height":2190957487,
"executed_block_id":"9efa9526c0ab51db9c5a904bfc589f95955c1659f6b439196c0146dc8e2a3694",
"first_unmatched_chunk_index":2,
"incorporated_result_id":"b159f2623bd9d660fa1125d05487d7658e5725fcba479568da139e880df8422e",
"number_chunks":17,
"qualifies_for_emergency_sealing":false,
"result_id":"53c8d0dd81bf4aef407331c2388bcdd53f4bd5ae5cb81ad5242a9ee6655706db",
"sufficient_approvals_for_sealing":false}]
When sealed by emergency
The SealedByEmergency
becomes true
[{
"executed_block_height":2190957487,
"executed_block_id":"9efa9526c0ab51db9c5a904bfc589f95955c1659f6b439196c0146dc8e2a3694",
"first_unmatched_chunk_index":2,
"has_multiple_receipts":true,
"incorporated_result_id":"b159f2623bd9d660fa1125d05487d7658e5725fcba479568da139e880df8422e",
"number_chunks":17,
"qualifies_for_emergency_sealing":true,
"result_id":"53c8d0dd81bf4aef407331c2388bcdd53f4bd5ae5cb81ad5242a9ee6655706db",
"sufficient_approvals_for_sealing":false}]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks for providing the log
engine/consensus/matching/core.go
Outdated
// contains a receipt that commits to this result. | ||
assignment, err := c.assigner.Assign(incorporatedResult.Result, incorporatedResult.IncorporatedBlockID) | ||
// Can we seal following the happy-path protocol? | ||
sealingStatus, err := c.hasEnoughApprovals(incorporatedResult, sealingTracker) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I expected was that:
- we will require multiple receipts to produce a seal
- in other words, we will never produce a seal if there is less two receipts.
Now the new behavior is that:
- we will require multiple receipts and either enough approvals or emergency sealing to produce a seal
- in other words, we won't produce a seal even if we've received multiple receipts, but haven't received enough approvals yet, and emergency sealing has not been triggered.
I'm not comfortable to have the sealing logic requiring enough approvals, because we still have one unsolved issue, which is verification node can't resume the verification on results which are produced when it's off. So I'm afraid the emergency sealing will often be triggered, and makes the sealing sometimes fast and sometimes slow, making it hard to debug.
I would suggest to keep the original plan to just seal with multiple receipts until we fix the sealing liveness issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right that the logic seals under the condition
we will require multiple receipts and either enough approvals or emergency sealing to produce a seal
This is also stated here in the code comment:
flow-go/engine/consensus/matching/core.go
Lines 605 to 612 in 34a14af
// Determine sealability: | |
// (i) the incorporatedResult must qualify for happy path sealing | |
// or qualify for emergency sealing | |
// AND | |
// (ii) there must be at least 2 receipts from _different_ ENs | |
// committing to the result | |
// comment: we evaluate condition (ii) only if (i) is true | |
if happySealable || emergencySealable { |
Essentially, the consensus node already implements the general case, where we could require some approvals to seal (or wait until emergency sealing kicks in). With the parametrization DefaultRequiredApprovalsForSealConstruction = 0
, the function hasEnoughApprovals
always returns true
(for any incorporatedResult
). Therefore, (i)
is always true and the sealing condition
(i) the incorporatedResult must qualify for happy path sealing
or qualify for emergency sealing
AND
(ii) there must be at least 2 receipts from _different_ ENs
committing to the result
reduces to
there must be at least 2 receipts from _different_ ENs committing to the result
which is exactly what you expected.
Note:
From my perspective, the current logic should be the same as what we are currently running on Mainnet (PR #444):
flow-go/engine/consensus/matching/engine.go
Lines 812 to 817 in c4ea872
if matched || emergencySealed { | |
if e.resultHasMultipleReceipts(incorporatedResult) { | |
// add the result to the results that should be sealed | |
results = append(results, incorporatedResult) | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the following explicit shortcut in hasEnoughApprovals
flow-go/engine/consensus/matching/core.go
Lines 646 to 649 in 4bdb072
// shortcut: if we don't require any approvals, any incorporatedResult has enough approvals | |
if c.requiredApprovalsForSealConstruction == 0 { | |
return sealingTracker.SufficientApprovals(incorporatedResult), nil | |
} |
- which hopefully makes the logic clearer
- it shortcuts some work of determining assignments if we don't require any approvals to seal
Co-authored-by: Yura <yuraolex@gmail.com>
…ow-go into alex/5341_two-receipts-to-seal
• removed nil as a valid instance of SealingRecord • moved String generation from SealingRecord to SealingTracker
…ex/5341_two-receipts-to-seal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, nice work!
engine/consensus/matching/core.go
Outdated
return false | ||
} | ||
|
||
l := c.log.Info(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to use debug
log.
The log I built only print logs for next unsealed result, whereas the added log here will print log for all unsealed results.
Since you've added extensive logging for the next unsealed result, I think this logging could be debug level
engine/consensus/matching/core.go
Outdated
// (ii) there must be at least 2 receipts from _different_ ENs | ||
// committing to the result | ||
// comment: we evaluate condition (ii) only if (i) is true | ||
if sealableWithEnoughApprovals || emergencySealable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[optional] there are quite a few nested if
and ||
conditions in this function, would be great to keep the logic flat, however I haven't found a cleaner way. Maybe something for later in the next phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flattened the logic a bit:
flow-go/engine/consensus/matching/core.go
Lines 598 to 613 in c72ae1e
// Determine sealability: | |
// (i) the incorporatedResult must qualify for happy path sealing | |
// or qualify for emergency sealing | |
// AND | |
// (ii) there must be at least 2 receipts from _different_ ENs | |
// committing to the result | |
// comment: we evaluate condition (ii) only if (i) is true | |
if !(sealableWithEnoughApprovals || emergencySealable) { // condition (i) is false | |
continue | |
} | |
hasMultipleReceipts := c.resultHasMultipleReceipts(incorporatedResult) | |
sealingStatus.SetHasMultipleReceipts(hasMultipleReceipts) | |
if !hasMultipleReceipts { // condition (ii) is false | |
continue | |
} | |
results = append(results, incorporatedResult) // add the result to the results that should be sealed |
not sure whether this is simpler or just as complex (only in a different way) 😅
# Conflicts: # engine/access/rpc/backend/backend_test.go
// (ii) there must be at least 2 receipts from _different_ ENs | ||
// committing to the result | ||
// comment: we evaluate condition (ii) only if (i) is true | ||
if !(sealableWithEnoughApprovals || emergencySealable) { // condition (i) is false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !(sealableWithEnoughApprovals || emergencySealable) { // condition (i) is false | |
if sealableWithEnoughApprovals == false && emergencySealable == false { // condition (i) is false |
production version of MainNet fix: #444 (Require multiple receipts to seal)
Overview of Changes
and
matching.Core
):sealableResults
we now have distinct methods for that implement the various sealing checks or conditions:flow-go/engine/consensus/matching/core.go
Lines 585 to 586 in a04e65d
flow-go/engine/consensus/matching/core.go
Lines 595 to 598 in a04e65d
flow-go/engine/consensus/matching/core.go
Line 618 in a04e65d
hasEnoughApprovals
and removed a lot of duplicated requests / queriesO(1)
instead of iterating over all elements to search for a matchO(n)
:flow-go/model/flow/incorporated_result.go
Lines 163 to 170 in a04e65d
Thanks @jordanschalm and @zhangchiqing for your comments on the previous version of this PR: #444. I have addressed all your comments here.
Ran this on localnet with disabled receipts broadcast for the Execution Nodes, to confirm sealing is live.