Refactor miner WindowPoSt state into per-partition aggregates and queues #648
Conversation
Do we want to forbid sealing sectors, winning blocks, etc. as well? Or just forbid withdrawal.
It's looking like we're going to need to make this queue per-partition to deal with massive termination batches. If we do that, we're going to need:
The other question here is how we're going to deal with early terminations. We don't want miners withdrawing funds while they haven't paid per-deal early termination penalties.
(obviously leaning towards 1 here)
The other question is whether or not this actually needs to be a queue. Unless we expect it to fall behind regularly, I'm not sure it does. edit: It needs to be a queue, but only for early terminations (as far as I can tell?).
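The per-partition queue being discussed might look something like the following sketch. This is purely illustrative: `EarlyTerminationQueue`, `ChainEpoch`, and the bounded `PopUpTo` batching are stand-ins I've invented to show the idea of spreading a massive termination batch over multiple cron invocations, not the actual specs-actors types.

```go
package main

import "fmt"

// ChainEpoch is a simplified stand-in for abi.ChainEpoch.
type ChainEpoch int64

// EarlyTerminationQueue sketches a per-partition queue mapping the epoch at
// which sectors were terminated to the terminated sector numbers. Keeping the
// queue per-partition bounds the work done in any single cron call.
type EarlyTerminationQueue map[ChainEpoch][]uint64

// Add records sectors terminated at the given epoch.
func (q EarlyTerminationQueue) Add(epoch ChainEpoch, sectors ...uint64) {
	q[epoch] = append(q[epoch], sectors...)
}

// PopUpTo removes and returns at most max sectors from the queue, and reports
// whether entries remain. Processing in bounded batches is what lets a huge
// termination be worked off across multiple cron invocations.
func (q EarlyTerminationQueue) PopUpTo(max int) (popped []uint64, remaining bool) {
	for epoch, sectors := range q {
		for len(sectors) > 0 && len(popped) < max {
			popped = append(popped, sectors[0])
			sectors = sectors[1:]
		}
		if len(sectors) == 0 {
			delete(q, epoch) // deleting during range is safe in Go
		} else {
			q[epoch] = sectors
		}
		if len(popped) == max {
			break
		}
	}
	return popped, len(q) > 0
}

func main() {
	q := EarlyTerminationQueue{}
	q.Add(100, 1, 2, 3)
	popped, more := q.PopUpTo(2)
	fmt.Println(len(popped), more)
}
```

The point of the sketch is the shape, not the details: each partition carries its own queue, so falling behind on one partition never forces a scan of every sector in the deadline.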
Proposal:
Am I missing something here? edit: Yes, there's a per-deal client collateral.
@Stebalien and I talked about this.
Defrag of a partition will have to wait for any early terminations to be processed.
So, I've hit a bit of a snag. We garbage collect deals as soon as they expire. If we process early terminations late, we may try to access missing deals when we compute fines. Can we keep dead deals around for a bit? (and possibly bound how late we can process early termination fees?).
Answer: For manual terminations, always empty the queue before processing.
No. For 14-day fault terminations we'll just deal with this occasionally coming a few epochs late if we have a large queue to work through.
I can fix 2 without fixing 1. But in that case, I won't be able to clear the queue from 2 until I've processed all outstanding sector expirations. |
I think out-of-order is ok for the simplicity gains of doing it by partition. The state needed for termination fees must be written into the sector on-chain info. The BR(StartEpoch) can be computed from the initial pledge, and my impression was that that would suffice. We can store the BR explicitly if necessary.
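The idea of persisting fee inputs on the sector itself, so late-processed terminations never need the (possibly garbage-collected) deal state, could be sketched like this. All names here (`ExpectedDayReward`, `terminationFee`, the age-capped formula, `EpochsPerDay`) are illustrative assumptions, not the actual specs-actors fee policy.

```go
package main

import "fmt"

// ChainEpoch is a simplified stand-in for abi.ChainEpoch.
type ChainEpoch int64

// TokenAmount is a simplified stand-in for the big-integer token type.
type TokenAmount int64

// SectorOnChainInfo sketches the thread's suggestion: write whatever the
// termination fee needs (here, an expected reward per day captured at
// activation) into the sector's on-chain info, so the fee can be computed
// even if the termination is processed late, after deals have been GC'd.
type SectorOnChainInfo struct {
	Activation        ChainEpoch
	ExpectedDayReward TokenAmount
}

const EpochsPerDay = 2880 // illustrative: one day of 30s epochs

// terminationFee sketches a fee proportional to sector age, capped at some
// maximum number of days, using only state stored on the sector itself.
func terminationFee(s SectorOnChainInfo, now ChainEpoch, capDays int64) TokenAmount {
	ageDays := int64(now-s.Activation) / EpochsPerDay
	if ageDays > capDays {
		ageDays = capDays
	}
	return TokenAmount(ageDays) * s.ExpectedDayReward
}

func main() {
	s := SectorOnChainInfo{Activation: 0, ExpectedDayReward: 10}
	fmt.Println(terminationFee(s, 5*EpochsPerDay, 90))
}
```

Whatever the real formula is, the design property being argued for is the same: every input to the fee lives on the sector, so processing order and lateness don't matter.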
Partial review
// - Skipped faults that are not in the provided partition trigger an error.
// - Skipped faults that are already declared (but not declared recovered) are ignored.
func processSkippedFaults(rt Runtime, st *State, store adt.Store, faultExpiration abi.ChainEpoch, partition *Partition,
Can/should we push this down to the state? We'd lose the IllegalArgument/IllegalState distinction, but then the state can maintain relations between the fields.
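Pushing the check down into the state might look roughly like this. It's a sketch of the trade-off being discussed, not the real API: the `Partition` fields and `RecordSkippedFaults` method are invented stand-ins, and as noted, the caller gets a plain error instead of an IllegalArgument/IllegalState exit-code distinction.

```go
package main

import (
	"errors"
	"fmt"
)

// Partition sketches validating skipped faults inside the state object
// itself, so the partition maintains its own invariants. Maps stand in for
// the on-chain bitfields.
type Partition struct {
	Sectors map[uint64]bool // sector numbers in this partition
	Faults  map[uint64]bool // sectors already declared faulty
}

// RecordSkippedFaults mirrors the commented contract: sectors outside the
// partition are an error; sectors already declared faulty (but not declared
// recovered) are ignored. It returns the newly faulty sectors.
func (p *Partition) RecordSkippedFaults(skipped []uint64) ([]uint64, error) {
	var newFaults []uint64
	for _, sno := range skipped {
		if !p.Sectors[sno] {
			return nil, errors.New("skipped fault not in partition")
		}
		if p.Faults[sno] {
			continue // already declared: ignore
		}
		p.Faults[sno] = true
		newFaults = append(newFaults, sno)
	}
	return newFaults, nil
}

func main() {
	p := &Partition{
		Sectors: map[uint64]bool{1: true, 2: true, 3: true},
		Faults:  map[uint64]bool{2: true},
	}
	newFaults, err := p.RecordSkippedFaults([]uint64{2, 3})
	fmt.Println(newFaults, err)
}
```

The upside shown here is that the relation "Faults is a subset of Sectors" is enforced in one place; the downside is exactly the lost exit-code granularity mentioned above.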
f8c0f1d to 0d08972 (compare)
42e9e92 to 7bc00e6 (compare)
Codecov Report
```
@@           Coverage Diff            @@
##           master    #648     +/-  ##
=========================================
- Coverage    68.4%   51.0%   -17.5%
=========================================
  Files          44      50       +6
  Lines        4787    5736     +949
=========================================
- Hits         3279    2929     -350
- Misses       1115    2436    +1321
+ Partials      393     371      -22
```
This saves us some state updates.
95e2eb5 to 3e9fbd6 (compare)
I agree, checking more assumptions and preconditions in the partition would be worthwhile.
We'll need to handle downtime but that might be a question for another PR.
builtin.RequireNoErr(rt, err, exitcode.ErrIllegalState, "failed to load proven sector info")
// Skip verification if all sectors are faults.
// We still need to allow this call to succeed so the miner can declare a whole partition as skipped.
Is there any reason to do this?
newSector := *sector
newSector.Expiration = decl.NewExpiration
//qaPowerDelta := big.Sub(QAPowerForSector(info.SectorSize, &newSector), QAPowerForSector(info.SectorSize, sector))
Commented-out code.
// That way, don't re-schedule a cron callback if one is already scheduled.
hadEarlyTerminations = havePendingEarlyTerminations(rt, &st)
// Note: because the cron actor is not invoked on epochs with empty tipsets, the current epoch is not necessarily |
What if we're multiple deadlines ahead? I'm guessing we should skip the missed deadlines and give the miner a pass, but we should probably do something.
// Increment current deadline, and proving period if necessary.
if dlInfo.PeriodStarted() {
	st.CurrentDeadline = (st.CurrentDeadline + 1) % WPoStPeriodDeadlines
	if st.CurrentDeadline == 0 {
		st.ProvingPeriodStart = st.ProvingPeriodStart + WPoStProvingPeriod
Do we need to handle getting a full proving period behind?
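One way to be robust to falling multiple deadlines (or a whole proving period) behind is to derive the deadline index from the current epoch instead of incrementing it once per cron tick. This is a sketch of that alternative, assuming now >= periodStart; the constant values and function shape are illustrative, not the actual specs-actors definitions.

```go
package main

import "fmt"

// ChainEpoch is a simplified stand-in for abi.ChainEpoch.
type ChainEpoch int64

const (
	WPoStProvingPeriod   ChainEpoch = 2880 // illustrative: one day of 30s epochs
	WPoStPeriodDeadlines            = 48
	WPoStChallengeWindow            = WPoStProvingPeriod / WPoStPeriodDeadlines
)

// advanceDeadline derives the proving period start and deadline index
// directly from the current epoch. Unlike a +1-per-tick increment, this
// stays correct even if cron skips several deadlines (or more than a full
// proving period) because of null tipsets. Assumes now >= periodStart.
func advanceDeadline(periodStart, now ChainEpoch) (newPeriodStart ChainEpoch, deadlineIdx uint64) {
	elapsed := now - periodStart
	newPeriodStart = periodStart + (elapsed/WPoStProvingPeriod)*WPoStProvingPeriod
	deadlineIdx = uint64((now - newPeriodStart) / WPoStChallengeWindow)
	return
}

func main() {
	// Two full proving periods plus three challenge windows behind the start.
	start, idx := advanceDeadline(0, 2*WPoStProvingPeriod+3*WPoStChallengeWindow)
	fmt.Println(start, idx)
}
```

What the skipped deadlines should mean for the miner (a pass, or a fault) is the open policy question above; this only shows that the bookkeeping itself need not drift.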
// Set new proving period start.
if deadline.PeriodStarted() {
// Increment current deadline, and proving period if necessary.
if dlInfo.PeriodStarted() {
This can't be false, can it?
for i := uint64(0); i < partitions.Length(); i++ {
	key := PartitionKey{dlInfo.Index, i}
	proven, err := deadline.PostSubmissions.IsSet(i)
This is potentially slow. We should probably just expand this into a map (it's not too large). We can also do this with bitfield magic, but that's probably more complicated than it's worth.
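The "expand into a map" suggestion amounts to paying the bitfield decode cost once instead of per lookup. A minimal sketch of the idea, with a plain slice standing in for the decoded set bits of `PostSubmissions` (the real type is an RLE+ bitfield whose `IsSet` may re-walk runs on each call):

```go
package main

import "fmt"

// expandToMap pays the decode cost once: expand the set bits into a map and
// then do O(1) membership checks per partition, instead of repeated IsSet
// calls against a compressed bitfield. setBits stands in for the result of
// traversing the bitfield's set positions.
func expandToMap(setBits []uint64) map[uint64]bool {
	m := make(map[uint64]bool, len(setBits))
	for _, b := range setBits {
		m[b] = true
	}
	return m
}

func main() {
	proven := expandToMap([]uint64{0, 2, 5})
	for i := uint64(0); i < 6; i++ {
		if proven[i] {
			fmt.Println("partition", i, "already proven")
		}
	}
}
```

As noted, the same check could be done with bitfield set operations, but for a per-deadline partition count the map is simpler and small enough not to matter.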
// Accumulate sectors info for proof verification.
for _, post := range params.Partitions {
	key := PartitionKey{params.Deadline, post.Index}
	alreadyProven, err := deadline.PostSubmissions.IsSet(post.Index)
I'd expand PostSubmissions into a map.
penaltyTarget := PledgePenaltyForUndeclaredFault(epochReward, pwrTotal.QualityAdjPower, penalizePowerTotal)
// Subtract the "ongoing" fault fee from the amount charged now, since it will be added on just below.
penaltyTarget = big.Sub(penaltyTarget, PledgePenaltyForDeclaredFault(epochReward, pwrTotal.QualityAdjPower, penalizePowerTotal))
penalty, err := st.UnlockUnvestedFunds(store, currEpoch, penaltyTarget)
Can we do this once at the very end?
1. This needs to happen on compaction. 2. This can't happen here anyways.
It's kind of awkward to take slices of bitfields; this is something the caller should generally do.
I think all the important bits have either been implemented or recorded in new issues. My only remaining concern is dealing with too many null blocks, but we can extract that into a new issue as well.
And test termination result type.
For motivation, see #599.
This giant PR restructures the miner actor's state, representing partitions as first-class objects. Sectors, faults, recoveries, expirations and terminations are all tracked per-partition. A few totals of power and pledge are maintained in the partition so that power and penalty accounting for faults etc. need not load the `SectorOnChainInfo`s for all the sectors (which can be a lot). The heavy per-sector information is only loaded in miner-initiated messages, never from cron.

Significant:
TODO before merging:
TODO follow-up:
Closes #391
Closes #357
Closes #411
Closes #418
Closes #483
Closes #519
Closes #535
Closes #552
Closes #593
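The per-partition aggregates described in the PR summary might be sketched as follows. Every field and method name here is an illustrative stand-in, not the actual specs-actors schema; the point is the invariant that fault and penalty accounting touches only the cached totals, never the per-sector on-chain info.

```go
package main

import "fmt"

// Illustrative stand-ins for the chain's big-integer types.
type (
	TokenAmount  int64
	StoragePower int64
)

// Partition sketches the restructuring: sector membership, faults,
// recoveries and the expiration/termination queues are tracked per
// partition, alongside cached power and pledge totals maintained
// incrementally on every state change.
type Partition struct {
	Sectors    []uint64 // sector numbers assigned to this partition
	Faults     []uint64 // subset of Sectors currently faulty
	Recoveries []uint64 // subset of Faults declared recovered

	ExpirationEpochs map[int64][]uint64 // queue: expiration epoch -> sectors
	EarlyTerminated  map[int64][]uint64 // queue: termination epoch -> sectors

	// Cached aggregates, so cron never loads heavy per-sector info.
	TotalPower  StoragePower
	FaultyPower StoragePower
	TotalPledge TokenAmount
}

// RecordFaults charges faulty power using only the cached aggregate,
// without reading any SectorOnChainInfo.
func (p *Partition) RecordFaults(sectors []uint64, power StoragePower) {
	p.Faults = append(p.Faults, sectors...)
	p.FaultyPower += power
}

// ActivePower is the non-faulty power, derivable from aggregates alone.
func (p *Partition) ActivePower() StoragePower {
	return p.TotalPower - p.FaultyPower
}

func main() {
	p := &Partition{Sectors: []uint64{1, 2, 3}, TotalPower: 100}
	p.RecordFaults([]uint64{2}, 30)
	fmt.Println(p.ActivePower())
}
```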