btcstaking: improved algorithm for voting power update #486
Conversation
```go
	NewState: types.BTCDelegationStatus_PENDING,
}
if err := sdkCtx.EventManager().EmitTypedEvent(event); err != nil {
	panic(fmt.Errorf("failed to emit EventBTCDelegationStateUpdate for the new pending BTC delegation: %w", err))
```
IMO it is a little too harsh to panic if emitting the event fails; maybe just log or return the error.
Given that events are part of the consensus, if some validators successfully emit events while others do not, then there will be consensus failure. Panicking here allows the validators with event issues to stop immediately, resolve the issues, and restart. Wdyt?
Events do not affect the appHash. The error that can happen in `EmitTypedEvent` comes from marshaling the typed event, not from the `EmitEvent` part. So if one validator emits the event and another doesn't, it does not affect consensus.

My opinion is that panics should be avoided in almost all cases (apart from rare bad cases like consensus failures).
It's not part of the app hash, but it is part of Tendermint consensus (i.e., an input to some hash values in Tendermint headers). There was a decision that Tendermint makes events part of consensus to make them deterministic (here and here). So if many validators fail to emit events, the Cosmos SDK app hash won't diverge, but Tendermint consensus will stall.
I kinda agree with @RafilxTenfen in the sense that if the panic indeed happens, then it happens in all the validators and the chain will completely stall until a fix is deployed to >2/3 of the validators. But if we log an error/alert here, we won't have a stall.
As we have used panics in all event-emitting places, we can revisit this problem later.
> then it happens in all the validators and the chain will completely stall until a fix is deployed over >2/3 validators. But if we log error/alert here, we won't have a stall.
This is not necessarily the only possibility -- another possible scenario (which I think is more likely) is that some validators did something wrong in their configurations and failed to emit events, while other validators emitted events without problems. In this case, if validators with invalid configurations return an error while other validators go through the process, then the two sets of validators will end up with different app hashes. If we panic here, these validators still have a chance to fix the configuration and restart the node to catch up, without doing any rollback.
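To make the trade-off concrete, here is a minimal sketch of the panic-on-failure pattern being discussed. `EventManager`, `mustEmitTypedEvent`, and the emitter types are hypothetical stand-ins for illustration, not the Cosmos SDK API:

```go
package main

import "fmt"

// EventManager is a hypothetical stand-in for an SDK event manager whose
// EmitTypedEvent can fail (e.g., when marshaling the typed event fails).
type EventManager interface {
	EmitTypedEvent(event any) error
}

// mustEmitTypedEvent sketches the pattern under discussion: halt on emission
// failure so a misconfigured validator can fix its setup and restart,
// instead of silently diverging from its peers.
func mustEmitTypedEvent(em EventManager, event any) {
	if err := em.EmitTypedEvent(event); err != nil {
		panic(fmt.Errorf("failed to emit typed event: %w", err))
	}
}

type okEmitter struct{}

func (okEmitter) EmitTypedEvent(event any) error { return nil }

type failEmitter struct{}

func (failEmitter) EmitTypedEvent(event any) error {
	return fmt.Errorf("marshaling error")
}
```

The alternative position (log and continue) would replace the `panic` with a logger call, accepting the risk described above.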
A couple of suggestions, but no impediments
utACK
(force-pushed from 1716aae to 7d5b82b, then from 176697b to e1a40d2)
Sorry for the delay. Spent some time getting my head around it. In general, both the algorithm and implementation lgtm. Awesome work!
One question regarding the necessity of introducing the `ActiveFinalityProviders` field: if we always store `FinalityProviders` in the order of voting power, it would be easy to get the active set out of this. So do we really need `ActiveFinalityProviders`?
Some other comments:
```go
// for each finality provider the delegation restakes to, update its index
for _, fpBTCPK := range btcDel.FpBtcPkList {
	var btcDelIndex = types.NewBTCDelegatorDelegationIndex()
```
Suggested change:
```suggestion
var btcDelIndex *types.BTCDelegatorDelegationIndex
```
Is `NewBTCDelegatorDelegationIndex` needed?
I think so, since later on we invoke `btcDelIndex.Add`. If it's nil then the function invocation will panic.
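For illustration, here is a minimal, self-contained sketch of why the index must be non-nil before `Add` is called. The types below are simplified stand-ins for the real module types, not the actual implementation:

```go
package main

// BTCDelegatorDelegationIndex is a simplified stand-in for the real type:
// it keeps staking tx hashes in order of arrival.
type BTCDelegatorDelegationIndex struct {
	StakingTxHashList []string
}

func NewBTCDelegatorDelegationIndex() *BTCDelegatorDelegationIndex {
	return &BTCDelegatorDelegationIndex{StakingTxHashList: []string{}}
}

// Add appends a staking tx hash. Calling it on a nil receiver would panic
// when it writes the field, which is why the index must be initialized
// before the first use.
func (i *BTCDelegatorDelegationIndex) Add(stakingTxHash string) error {
	i.StakingTxHashList = append(i.StakingTxHashList, stakingTxHash)
	return nil
}

// getOrCreateIndex sketches the get-or-create pattern: reuse the stored
// index if present, otherwise start from a fresh, non-nil one.
func getOrCreateIndex(store map[string]*BTCDelegatorDelegationIndex, fpPK string) *BTCDelegatorDelegationIndex {
	if idx, ok := store[fpPK]; ok {
		return idx
	}
	idx := NewBTCDelegatorDelegationIndex()
	store[fpPK] = idx
	return idx
}
```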
```go
// if the previous height is not indexed (which might happen at genesis),
// use the base header
```
I think genesis is the only case where the BTC height is not indexed, right? Maybe we can check whether `babylonHeight == 0` and panic if not (actually Cosmos forbids access to height 0, so maybe we can panic either way).
I'm not sure if genesis is the only case where this might happen. Since the BTC light client is initialised earlier than the BTC staking module, I would say using the base header is the safest solution. Wdyt?
Good point, you are right! Will revise
```go
/*
	filter and classify all events into new/expired BTC delegations and slashed FPs
*/
for _, event := range events {
```
This could probably be a separate function called `filterAndClassifyAllEvents`.
On the other hand, why do we need this classification step? Wouldn't it be enough to:
- create a map `fpBtcPkHex -> FinalityProviderDistInfo` based on the old distribution cache
- iterate over all events and apply the changes to this map based on the events
- create the new distribution cache based on this new map

This flow maps a bit more nicely to the flow I had in mind for the whole thing, i.e. `newDistCache = applyEvents(oldDistCache, events)`, but maybe I am missing some detail here.
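A minimal sketch of the `applyEvents`-style flow suggested above, with a deliberately simplified cache model (a map from finality provider PK hex to voting power). All names here are illustrative, not the real module types:

```go
package main

// PowerEvent is a hypothetical, simplified voting power update event:
// a positive delta for activation, a negative one for unbonding.
type PowerEvent struct {
	FpBtcPkHex string
	PowerDelta int64
}

// applyEvents builds a new cache from the old one plus this block's events,
// touching only the providers mentioned by events. Providers whose power
// drops to zero or below are dropped from the new cache.
func applyEvents(oldCache map[string]int64, events []PowerEvent) map[string]int64 {
	newCache := make(map[string]int64, len(oldCache))
	for pk, power := range oldCache {
		newCache[pk] = power
	}
	for _, ev := range events {
		newCache[ev.FpBtcPkHex] += ev.PowerDelta
		if newCache[ev.FpBtcPkHex] <= 0 {
			delete(newCache, ev.FpBtcPkHex)
		}
	}
	return newCache
}
```

The point of the sketch is the shape of the API, not the data model: the new cache is a pure function of the old cache and the block's events.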
This is because the distribution cache uses lists for finality providers and BTC delegations, and thus cannot support efficient queries. Say a BTC delegation expired: we would need an O(n) operation to find that BTC delegation and remove it, and this would have to be done for every BTC delegation.
Right now we classify these events and assign them to maps, such that we only need to iterate over the distribution cache once.
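The classification step described above can be sketched as follows, using a simplified hypothetical event type (the real code classifies typed events from the event bus). Bucketing into maps turns the later membership checks into O(1) lookups:

```go
package main

// DelEvent is a hypothetical flattened event for illustration only.
type DelEvent struct {
	Kind          string // "active", "unbonded", or "slashed_fp"
	StakingTxHash string
	FpBtcPkHex    string
}

// classifyEvents buckets events into lookup maps so the later single pass
// over the distribution cache can check membership in O(1) instead of
// scanning delegation lists.
func classifyEvents(events []DelEvent) (slashedFPs map[string]struct{}, unbondedDels map[string]struct{}, activeDels map[string][]string) {
	slashedFPs = map[string]struct{}{}
	unbondedDels = map[string]struct{}{}
	activeDels = map[string][]string{}
	for _, ev := range events {
		switch ev.Kind {
		case "slashed_fp":
			slashedFPs[ev.FpBtcPkHex] = struct{}{}
		case "unbonded":
			unbondedDels[ev.StakingTxHash] = struct{}{}
		case "active":
			activeDels[ev.FpBtcPkHex] = append(activeDels[ev.FpBtcPkHex], ev.StakingTxHash)
		}
	}
	return slashedFPs, unbondedDels, activeDels
}
```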
Hmm, I think I understand.
On the other hand, with this map-based approach you need to iterate over all finality providers and all their delegations in this middle loop:
```go
// iterate over all finality providers and apply all events
for i := range dc.FinalityProviders {
	// create a copy of the finality provider
	fp := *dc.FinalityProviders[i]
	fp.TotalVotingPower = 0
	fp.BtcDels = []*types.BTCDelDistInfo{}
	fpBTCPKHex := fp.BtcPk.MarshalHex()
	// if this finality provider is slashed, continue to avoid recording it
	if _, ok := slashedFPs[fpBTCPKHex]; ok {
		continue
	}
	// add all BTC delegations that are not unbonded to the new finality provider
	for j := range dc.FinalityProviders[i].BtcDels {
		btcDel := *dc.FinalityProviders[i].BtcDels[j]
		if _, ok := unbondedBTCDels[btcDel.StakingTxHash]; !ok {
			fp.AddBTCDelDistInfo(&btcDel)
		}
	}
	// process all new BTC delegations under this finality provider
	if fpActiveBTCDels, ok := activeBTCDels[fpBTCPKHex]; ok {
		// handle new BTC delegations for this finality provider
		for _, d := range fpActiveBTCDels {
			fp.AddBTCDel(d)
		}
		// remove the finality provider entry in activeBTCDels map, so that
		// after the for loop the rest entries in activeBTCDels belongs to new
		// finality providers with new BTC delegations
		delete(activeBTCDels, fpBTCPKHex)
	}
	// add this finality provider to the new cache if it has voting power
	if fp.TotalVotingPower > 0 {
		newDc.AddFinalityProviderDistInfo(&fp)
	}
}
```
which seems wasteful. I think the main advantage of having the event-based model here is that we are able to calculate per-block reward and power distribution diffs.
In most cases the differences in rewards/power between Babylon blocks will be minimal, like a few new delegations and a few unbondings. Given that, it seems to me that we should avoid iterating over all data from the past distribution.
Not a blocker for me for now, we can improve it after adding more tests to this 👍
> In most cases the differences in rewards/power between Babylon blocks will be minimal like few new delegations and few unbondings. Given that it seems to me that it we should avoid iteration over all data from past distribution.
Yep, agreed. Roughly speaking, the current version has complexity O(n*x), while the optimal complexity is O(n log n), where `n` is the number of active finality providers and `x` is the number of active BTC delegations under each finality provider. To achieve the optimal value we need to use a map to store every BTC delegation's voting power, which is not straightforward given that we cannot use Go's `map`, whose iteration order is non-deterministic. Added a TODO for now.
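For reference, the usual way to keep a Go map for O(1) lookups while still iterating deterministically (a requirement for consensus code) is to sort the keys first. A small sketch:

```go
package main

import "sort"

// sortedKeys returns a map's keys in lexicographic order, so that iterating
// over the map via these keys is deterministic across nodes, which plain
// Go map iteration is not.
func sortedKeys(m map[string]uint64) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}
```

The extra sort is the source of the O(n log n) term mentioned above.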
Thanks all for the great comments! I have addressed or answered all comments.
Please feel free to take a look again.
Great work! Left only some potential suggestions for future improvements :)
```go
delEvent := typedEvent.BtcDelStateUpdate
if delEvent.NewState == types.BTCDelegationStatus_ACTIVE {
	// newly active BTC delegation
	btcDel, err := k.GetBTCDelegation(ctx, delEvent.StakingTxHash)
```
Hmm, if our activation event would contain all the necessary data to build:
```go
type BTCDelDistInfo struct {
	// btc_pk is the Bitcoin secp256k1 PK of this BTC delegation
	// the PK follows encoding in BIP-340 spec
	BtcPk *github_com_babylonchain_babylon_types.BIP340PubKey `protobuf:"bytes,1,opt,name=btc_pk,json=btcPk,proto3,customtype=github.com/babylonchain/babylon/types.BIP340PubKey" json:"btc_pk,omitempty"`
	// babylon_pk is the Babylon public key of the BTC delegation
	BabylonPk *secp256k1.PubKey `protobuf:"bytes,2,opt,name=babylon_pk,json=babylonPk,proto3" json:"babylon_pk,omitempty"`
	// staking_tx_hash is the staking tx hash of the BTC delegation
	StakingTxHash string `protobuf:"bytes,3,opt,name=staking_tx_hash,json=stakingTxHash,proto3" json:"staking_tx_hash,omitempty"`
	// voting_power is the voting power of the BTC delegation
	VotingPower uint64 `protobuf:"varint,4,opt,name=voting_power,json=votingPower,proto3" json:"voting_power,omitempty"`
}
```
we could avoid this random db read. Maybe a `todo`?
This is not straightforward and in some sense unavoidable. A BTC delegation can stake to multiple finality providers and thus has the field `FpBtcPkList` for a list of finality providers' PKs. Meanwhile, `BTCDelDistInfo` does not care about finality providers' PKs.
Given that we limit the number of `GetBTCDelegation` invocations (the hotspot) to O(# of newly active BTC delegations) (which is optimal), maybe this is fine. Wdyt?
Hmm, maybe we will see when we do load testing again :)
I'm wondering, as we will launch mainnet without restaking, maybe we should just have a 1-to-1 mapping between delegation and finality provider? I really do not like having features just in case :)
```go
}

// NOTE: we don't need to record events for pending BTC delegations since these
// do not affect voting power distribution
```
Just food for thought.
After thinking about it a bit longer, maybe it is useful to have pending delegations in our `VotingPowerDistCache`?
If we did that, then we could make the order of `BTCDelDistInfo` under each `FinalityProviderDistInfo` the same as the order of delegations in the finality provider's delegation index. This means that the whole `VotingPowerDistCache` would be a bit like a current snapshot of our whole db.
Having such a setup, we could add a `delegationIndex` under each finality provider to each activation/unbonding event. Then during the voting power update, we could use these indexes to avoid linear scans.
> If we would do that then we could make it that the order of BTCDelDistInfo under each FinalityProviderDistInfo will be the same as the order of delegations in finality provider delegations index. This mean that whole VotingPowerDistCache will be a bit like current snapshot of our whole db.
There is a problem: delegations in the finality provider delegation index in the KV store are out of order -- BTC delegations are ordered by their staking tx hash, and finality provider delegations are ordered by the finality provider's BTC PK. So their orders are different from those in `VotingPowerDistCache` here, and this does not help with avoiding linear scans, unfortunately.
Hmm, from what I see, delegations in the index are ordered by time of arrival, i.e., with each new delegation we just append it to the index:
```go
if err := btcDelIndex.Add(stakingTxHash); err != nil {
	return types.ErrInvalidStakingTx.Wrapf(err.Error())
}
```
So if we updated `BtcDels []*BTCDelDistInfo` just by processing events in order, then this field would have the same order as the delegation index of the validator.
What am I missing?
Ah you are right, yes in the BTC delegation index the BTC delegations are ordered by time. However I do not see how this helps with avoiding linear scan. The BTC delegation index is keyed by finality provider PK and BTC staker PK, while the events are only keyed by the Babylon height. It needs some thought if we want to use BTC delegation index to track state updates.
Very nice work!
Merged 32ef3f5 into base/index-optimise-btcstaking
This PR introduces the improved algorithm for updating the voting power table and distribution cache. Instead of iterating over all finality providers and BTC delegations, the algorithm reconciles the previous voting power distribution with all new voting power update events. This reduces the complexity of `BeginBlocker` from O(# of all BTC delegations) to O(# of BTC delegations whose states are updated at this height).

The algorithm works as follows:
- find the BTC tip height `h` at the current Babylon height and the BTC tip height `h'` at the prior Babylon height (i.e., current - 1)
- reconcile the previous voting power distribution with all voting power update events during `[h', h]`

Along the way, this PR also simplifies and fixes all existing fuzz tests, which have covered most of the scenarios.
TODO (that can be in subsequent PRs):
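The reconciliation described above can be sketched in a self-contained way as follows. All types and names here are hypothetical simplifications; the real implementation operates on the module's event store and distribution cache:

```go
package main

// heightEvents models events recorded per BTC height; each event is a
// mutation applied to a simplified distribution (FP PK hex -> voting power).
type heightEvents map[uint64][]func(dist map[string]int64)

// reconcile copies the previous distribution and applies only the events
// recorded for BTC heights in [hPrev, hCur], matching the described range,
// so the work done is proportional to the number of state updates rather
// than the total number of delegations.
func reconcile(dist map[string]int64, evs heightEvents, hPrev, hCur uint64) map[string]int64 {
	out := make(map[string]int64, len(dist))
	for k, v := range dist {
		out[k] = v
	}
	for h := hPrev; h <= hCur; h++ {
		for _, apply := range evs[h] {
			apply(out)
		}
	}
	return out
}
```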