Too much stake too fast #5443

Closed
mvines opened this issue Aug 6, 2019 · 17 comments

@mvines
Member

commented Aug 6, 2019

The TdS stage 0 dry run 2 cluster failure was caused by too much stake coming online between epoch 6 and epoch 7.

Epoch 6 Stake

2QMWpknvQT9GzQaAS2LZRRbi2kVo5Q1ruAqyoj25jnaq   | stake=8589934592  
5QRvZmVCC6FDsvQw7CxMijrX8NdJCGA49YQo3veE9y4B   | stake=8589934592  
8ohZ3JU9jGJQt9ftBNa2ScPpn5ui7j5F8aSn13hi8uWy   | stake=8589934592  
G3BHHnWZR7bvmfhqhRBF7V5sVmABZEeQaXodfguy2n1q   | stake=8589934592  

Epoch 7 Stake

2QMWpknvQT9GzQaAS2LZRRbi2kVo5Q1ruAqyoj25jnaq   | stake=8589934592  
2X5JSTLN9m2wm3ejCxfWRNMieuC2VMtaMWSoqLPbC4Pq   | stake=5726623060  
33LfdA2yKS6m7E8pSanrKTKYMhpYHEGaSWtNNB5s7xnm   | stake=5726623060  
47UuTGPAQZX2HnVcfxKk8b1BtA4rRTduVaHnvxzQe6AJ   | stake=5726623060  
4Bx5bzjmPrU1g74AHfYpTMXvspBt8GnvZVQW3ba9z4Af   | stake=5726623060  
55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U   | stake=5726623060  
5NH47Zk9NAzfbtqNpUtn8CQgNZeZE88aa2NRpfe7DyTD   | stake=5726623060  
5QRvZmVCC6FDsvQw7CxMijrX8NdJCGA49YQo3veE9y4B   | stake=8589934592  
6dMH3u76qZ7XG4bVboVRnBHR2FfrxEqTTTyj4xmyDMWo   | stake=5726623060
8FaFEcUFgvJns6RAU4dso3aTm2qfzZMt2xXtSgCh3kn9   | stake=5726623060  
8ohZ3JU9jGJQt9ftBNa2ScPpn5ui7j5F8aSn13hi8uWy   | stake=8589934592  
9J8WcnXxo3ArgEwktfk9tsrf4Rp8h5uPUgnQbQHLvtkd   | stake=5726623060  
G3BHHnWZR7bvmfhqhRBF7V5sVmABZEeQaXodfguy2n1q   | stake=8589934592  

cc:

(fork_stake.stake as f64 / self.epoch_stakes.total_staked as f64)
> self.threshold_size
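
A minimal sketch of how the check above reacts to the listed stakes, under one plausible reading of the failure: the four epoch 6 validators are measured against the epoch 7 total. The `stake_fraction` helper is hypothetical, and the `2/3` value for `threshold_size` is an assumption, since the issue does not state it.

```rust
/// Hypothetical helper mirroring the expression quoted in the issue:
/// the share of `total_staked` represented by `fork_stake`.
fn stake_fraction(fork_stake: u64, total_staked: u64) -> f64 {
    fork_stake as f64 / total_staked as f64
}

fn main() {
    // Stake values copied from the epoch 6 and epoch 7 listings.
    let established: u64 = 4 * 8_589_934_592; // the four epoch 6 validators
    let newcomers: u64 = 9 * 5_726_623_060; // the nine validators new in epoch 7
    let epoch6_total = established;
    let epoch7_total = established + newcomers;

    // ASSUMPTION: threshold_size is 2/3; the issue does not give its value.
    let threshold_size = 2.0 / 3.0;

    let before = stake_fraction(established, epoch6_total);
    let after = stake_fraction(established, epoch7_total);
    println!("epoch 6: {:.3}, passes = {}", before, before > threshold_size);
    println!("epoch 7: {:.3}, passes = {}", after, after > threshold_size);
}
```

With these numbers the established validators hold all of the epoch 6 stake but only about 40% of the epoch 7 stake, so under the assumed threshold they could no longer pass the check on their own.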

@mvines mvines added this to the Mavericks v0.18.0 milestone Aug 6, 2019

@mvines mvines added this to To do in TdS Stage 0 via automation Aug 6, 2019

@rob-solana

Contributor

commented Aug 7, 2019

this is ameliorated by using live stakes as in #5426, but making everyone slow their roll would be a good thing TM

this is also potentially hot-patch-able by removing the threshold check

@mvines

Member Author

commented Aug 7, 2019

@rob-solana - it looks like the stake went from 0 to 66%, not 33%

@carllin

Contributor

commented Aug 7, 2019

@rob-solana, I can also hotfix it by using the EpochStakes for the threshold bank rather than the new EpochStakes; it seems messy, though.

I hacked something together, going to see if it works :P

@rob-solana rob-solana self-assigned this Aug 7, 2019

@rob-solana

Contributor

commented Aug 7, 2019

@mvines the stake changing from 0 to 66% is an artifact of epoch stakes, which are a snapshot. This is basically the expected behavior.

Say we take a snapshot halfway through epoch 5 (for epoch 7). If a stake activates in the very next slot, the next time it ends up in epoch stakes will be halfway through epoch 6 (for epoch 8), by which point it has had 2 epochs of ramp (5 and 6).
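
The snapshot timing described in this comment can be sketched as follows. The midpoint snapshot and the two-epoch lead come from the example in the comment; the function names and the 100-slot epoch length are hypothetical and chosen only for illustration.

```rust
/// ASSUMPTION (from the example in the comment): the snapshot used for
/// epoch `e + 2` is taken at the midpoint of epoch `e`.
fn snapshot_slot(epoch: u64, slots_per_epoch: u64) -> u64 {
    epoch * slots_per_epoch + slots_per_epoch / 2
}

/// Returns (epoch whose snapshot first captures the stake,
///          epoch that snapshot is used for,
///          number of epochs of ramp counted by then).
fn first_appearance(activation_slot: u64, slots_per_epoch: u64) -> (u64, u64, u64) {
    let activation_epoch = activation_slot / slots_per_epoch;
    // The stake is captured by the first snapshot taken at or after activation.
    let mut snap_epoch = activation_epoch;
    if snapshot_slot(snap_epoch, slots_per_epoch) < activation_slot {
        snap_epoch += 1;
    }
    let ramp_epochs = snap_epoch - activation_epoch + 1;
    (snap_epoch, snap_epoch + 2, ramp_epochs)
}

fn main() {
    let slots_per_epoch = 100; // hypothetical epoch length
    // Activate one slot after the epoch 5 snapshot.
    let activation = snapshot_slot(5, slots_per_epoch) + 1;
    let (snap, used_for, ramp) = first_appearance(activation, slots_per_epoch);
    println!("captured in epoch {snap}, used for epoch {used_for}, ramp epochs: {ramp}");
}
```

In this model a stake that just misses the epoch 5 snapshot is first captured in epoch 6 for use in epoch 8, having already ramped during epochs 5 and 6, which is why it appears to jump in a single step.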

@mvines

Member Author

commented Aug 7, 2019

I don't think we need to hack together a super quick fix for this if it moves us away from a more complete solution. Is that #5426 or do we need more in addition? I think it's reasonable to continue to expect that in a TdS setup, we'll go from 0 delegated stake to a large N very quickly as participants jump on at the start.

@rob-solana

Contributor

commented Aug 7, 2019

#5426 will help with this, because the stakes will ramp 0->33->66 correctly, as long as the newly staked nodes are voting

looking at epoch stakes with the wallet will always show the bumpy ramp, though

@mvines

Member Author

commented Aug 7, 2019

But could we still run into the same problem if we go from 4 staked validators in epoch 6 to ~50 staked validators in epoch 7? Even with just 33% active, the amount of new stake would exceed that threshold check, right?
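
The arithmetic behind this question, as a rough sketch. It assumes every validator delegates the same amount and that "~50" means 46 newcomers joining the original 4; both are illustrative assumptions, and `established_share` is a hypothetical helper.

```rust
/// Share of total active stake held by the `established` validators when
/// `newcomers` join with only a `warmup` fraction of their stake active.
/// ASSUMPTION: every validator delegates the same amount.
fn established_share(established: u32, newcomers: u32, warmup: f64) -> f64 {
    let old = established as f64;
    let new = newcomers as f64 * warmup;
    old / (old + new)
}

fn main() {
    // 4 established validators, 46 newcomers at one third of their stake.
    let share = established_share(4, 46, 1.0 / 3.0);
    println!("established share: {:.1}%", share * 100.0);
}
```

Under these assumptions the established validators would hold roughly 21% of the active stake, so the new stake would dominate even at one third warmup.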

@rob-solana

Contributor

commented Aug 7, 2019

in #5426 the stakes used are for the bank under voting consideration, which means the stakes come on by slot, not epoch, and so more smoothly
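
A toy comparison of the two views, to illustrate why stake that comes on by slot changes more smoothly than stake that comes on by epoch. This is not the #5426 implementation: `max_step`, the activation data, and the 100-slot epoch are hypothetical, and warmup is ignored for simplicity.

```rust
use std::collections::BTreeMap;

/// Largest single jump in visible stake when activations are grouped into
/// buckets of `bucket_slots` slots (1 = per slot, slots_per_epoch = per epoch).
/// `activations` holds hypothetical (activation_slot, stake) pairs.
/// `bucket_slots` must be greater than zero.
fn max_step(activations: &[(u64, u64)], bucket_slots: u64) -> u64 {
    let mut per_bucket: BTreeMap<u64, u64> = BTreeMap::new();
    for &(slot, stake) in activations {
        *per_bucket.entry(slot / bucket_slots).or_insert(0) += stake;
    }
    per_bucket.values().copied().max().unwrap_or(0)
}

fn main() {
    let slots_per_epoch = 100; // hypothetical epoch length
    // Three delegations landing at different slots within the same epoch.
    let activations = [(610, 10), (620, 10), (655, 10)];
    println!("largest per-slot step:  {}", max_step(&activations, 1));
    println!("largest per-epoch step: {}", max_step(&activations, slots_per_epoch));
}
```

When several delegations land in the same epoch, the per-epoch view sees them as one combined jump, while the per-slot view sees each one separately.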

@rob-solana

Contributor

commented Aug 7, 2019

@mvines

Member Author

commented Aug 7, 2019

Cool so #5426 will fix this issue yes? :)

@rob-solana

Contributor

commented Aug 7, 2019

yes, I think it should

@carllin

Contributor

commented Aug 7, 2019

@rob-solana, I think it will need to be paired with the warmup limitations as well?

@rob-solana

Contributor

commented Aug 7, 2019

@carllin I don't think it matters for TdS dry runs

@rob-solana

Contributor

commented Aug 7, 2019

live stakes are smoother than epoch stakes, even with 33% per epoch warmup

@carllin

Contributor

commented Aug 7, 2019

@rob-solana is there a guarantee that the live stake won't shift by more than 33% between the threshold bank and the live bank you are considering?

@rob-solana

Contributor

commented Aug 7, 2019

no guarantees, just a lot less likely than if you're looking at epoch stakes

@mvines mvines moved this from To do to Blocking Dry Run 3 in TdS Stage 0 Aug 8, 2019

@mvines

Member Author

commented Aug 15, 2019

Calling this done based on what I observe on the edge testnet this morning: my node's stake is coming in nice and slow.

@mvines mvines closed this Aug 15, 2019

TdS Stage 0 automation moved this from Blocking Dry Run 3 to Done Aug 15, 2019
