Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Algod: state-proof key deletion safety #4601

Merged

Conversation

algonathan
Copy link
Contributor

Summary

Due to StateProof keys' lifetime, the signer might delete the key while not all signatures are present in the DB.
This critical error might occur when the DB fails to record a signature and a new signature arrives from the same key.

@codecov
Copy link

codecov bot commented Sep 28, 2022

Codecov Report

Merging #4601 (ae5aa9b) into feature/stateproofs-recoverability (9ebdbce) will decrease coverage by 0.50%.
The diff coverage is 58.62%.

@@                          Coverage Diff                           @@
##           feature/stateproofs-recoverability    #4601      +/-   ##
======================================================================
- Coverage                               54.53%   54.03%   -0.51%     
======================================================================
  Files                                     408      408              
  Lines                                   52650    52658       +8     
======================================================================
- Hits                                    28714    28453     -261     
- Misses                                  21540    21791     +251     
- Partials                                 2396     2414      +18     
Impacted Files Coverage Δ
data/account/participationRegistry.go 78.01% <ø> (-0.49%) ⬇️
stateproof/signer.go 50.00% <ø> (-2.39%) ⬇️
stateproof/worker.go 89.28% <ø> (ø)
data/accountManager.go 72.61% <20.00%> (+1.36%) ⬆️
stateproof/builder.go 75.42% <66.66%> (-4.72%) ⬇️
stateproof/recovery.go 0.00% <0.00%> (-77.78%) ⬇️
crypto/onetimesig.go 0.00% <0.00%> (-76.07%) ⬇️
data/bookkeeping/txn_merkle.go 62.96% <0.00%> (-22.23%) ⬇️
data/account/participation.go 45.68% <0.00%> (-18.11%) ⬇️
crypto/rand.go 11.76% <0.00%> (-17.65%) ⬇️
... and 30 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@algonathan algonathan changed the base branch from master to feature/stateproofs-recoverability September 28, 2022 14:46
stateproof/signer.go Outdated Show resolved Hide resolved
@id-ms id-ms marked this pull request as ready for review October 24, 2022 06:52
@id-ms id-ms force-pushed the stateprf-keydelete branch 2 times, most recently from d99b8e3 to 65d8b21 Compare October 25, 2022 07:00
@id-ms id-ms requested a review from almog-t October 25, 2022 08:11
stateproof/builder.go Outdated Show resolved Hide resolved
stateproof/builder.go Outdated Show resolved Hide resolved
return
}

oldestRoundToRemove := stateProofNextRound.SubSaturate(basics.Round(proto.StateProofInterval))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, for every round (?) we're going to be going over all of the participation records, extracting the state proof keys from them and removing stale keys? It might be worth it to hold the previous state proof next round to decide if we have to do that, even if we put aside the question of changing state proof intervals.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a point. I've addressed that. take a second look

stateproof/worker_test.go Outdated Show resolved Hide resolved
stateproof/builder.go Outdated Show resolved Hide resolved
func (spw *Worker) deleteOldBuilders(currentHdr *bookkeeping.BlockHeader) {
oldestRoundToRemove := GetOldestExpectedStateProof(currentHdr)
func (spw *Worker) deleteStaleKeys(latestRoundToKeep basics.Round) {
keys := spw.accts.StateProofKeys(latestRoundToKeep)
Copy link
Contributor

@almog-t almog-t Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't we miss out on purging stale data from accounts that can't sign latestRoundToKeep but still have old keys in their DB?
It seems to me we should simply iterate over all accounts and invoke spw.accts.DeleteStateProofKey(participationID, latestRoundToKeep) for all of them, no?

spw.log.Errorf("deleteOldKeys: could not calculate keylifetime for account %v on round %s: %v", key.ParticipationID, roundToRemove, err)
continue
}
err = spw.accts.DeleteStateProofKey(key.ParticipationID, basics.Round(roundToRemove))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to use roundToRemove? Why not latestRoundToKeep instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we delete roundToRemove we might delete a key that should be used for a later state proof.

Copy link
Contributor

@almog-t almog-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's iron out the kinks in the deletion part, otherwise looks good.

data/accountManager.go Outdated Show resolved Hide resolved
@id-ms id-ms merged commit 7f11ab3 into algorand:feature/stateproofs-recoverability Oct 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants