Adds stake-tracker pallet and integrates with the staking pallet #1933

Open: wants to merge 143 commits into base: master


gpestana
Contributor

@gpestana gpestana commented Oct 18, 2023

This PR adds and integrates the stake-tracker pallet in the staking system.

Goals:

  • To keep a TargetList of validators strictly and always sorted by their approval votes. Approvals consist of the validator's self-vote plus the sum of all nominations backing it across the system.
  • The TargetList sorting must always be kept up to date, even after nomination updates, nominator/validator slashes and rewards. The stake-tracker pallet must ensure that target scores are always up to date and that targets are sorted by score at all times.
  • To keep a VoterList of voters that is either 1) strictly and always sorted by score (i.e. the bonded stake of an individual voter) or 2) loosely sorted. Choosing between modes 1) and 2) is done through the stake-tracker configuration.
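
To make the first goal concrete, here is a minimal sketch of how approval scores could be derived and sorted. It is not the pallet's code: the in-memory maps, the `AccountId`/`Balance` aliases and the function names are assumptions made for illustration, and it assumes each nominator's full stake counts towards every target it nominates.

```rust
use std::collections::BTreeMap;

/// Hypothetical account and balance types for this sketch.
type AccountId = u32;
type Balance = u128;

/// Computes each target's approval stake: its self-stake plus the stake of
/// every nominator that nominates it.
fn approval_stakes(
    self_stakes: &BTreeMap<AccountId, Balance>,
    nominations: &[(Balance, Vec<AccountId>)],
) -> BTreeMap<AccountId, Balance> {
    let mut approvals = self_stakes.clone();
    for (stake, targets) in nominations {
        for target in targets {
            let entry = approvals.entry(*target).or_insert(0);
            *entry = entry.saturating_add(*stake);
        }
    }
    approvals
}

/// Returns the targets sorted by approval stake, highest first.
fn sorted_targets(approvals: &BTreeMap<AccountId, Balance>) -> Vec<(AccountId, Balance)> {
    let mut sorted: Vec<_> = approvals.iter().map(|(a, s)| (*a, *s)).collect();
    // Descending by stake; ties are broken by account id for determinism.
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    sorted
}
```

The pallet itself keeps the list incrementally up to date on each staking event rather than recomputing it from scratch as this sketch does.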

TL;DR 2nd order changes

  • An idle or unbonded target account may have a node in T::TargetList, insofar as there are still nominators nominating it;
  • New nominations must target either a validator or an idle staker; otherwise, calling Staking::nominate will fail;
  • Staking::nominate will remove all the duplicate nominations implicitly, if any.
    • Note: the migration will remove all the duplicate nominations for all the nominators in the system and all the dangling nominations.
  • If a nominator's bond drops to 0 after a slash, the nominator will be chilled.

Why?

Currently, we select up to N registered validators to be part of the snapshot for the next election. When the number of registered validators in the system exceeds that number, we need an efficient way to select the validators with the most approval stake to construct the snapshot.

Thus, we need to keep a list of validators sorted by their approval stake at all times. This means that any update to nominations and their stake (due to slashing, bonding extra, rewards, etc.) needs to be reflected in the targets nominated by that stake. Enter pallet-stake-tracker: this pallet keeps track of relevant staking events (by implementing the OnStakingUpdate trait) and updates the target bags list with each target's approval stake accordingly.

In order to achieve this, the target list must keep track of all target stashes that have at least one nomination in the system (i.e. their approval_stake > 0), regardless of their state. This means that it needs to keep track of the stake of active validators, idle validators and even targets that are no longer validators but still have lingering nominations.

How?

The stake-tracker pallet implements the OnStakingUpdate trait to listen to staking events and multiplexes those events to one or more types (e.g. pallets). It acts as a layer of indirection that keeps the semi-sorted target and voter lists (implemented by the bags-list pallet) up to date.

The main goal is that all the updates to the targets and voters lists are performed at each relevant staking event through the stake-tracker pallet. However, the voter and target list reads are performed directly through the SortedListProvider set in the staking's config.
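
The multiplexing idea can be sketched as follows. This is an illustrative stand-in, not the actual trait: the single `on_stake_update` method, the use of instance methods (the real trait's methods take no `self`) and both listener types are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the staking event listener trait.
trait OnStakingUpdate {
    fn on_stake_update(&mut self, who: u32, new_stake: u128);
}

/// Hypothetical listener that records the latest stake per account.
#[derive(Default)]
struct ScoreTracker {
    scores: BTreeMap<u32, u128>,
}

impl OnStakingUpdate for ScoreTracker {
    fn on_stake_update(&mut self, who: u32, new_stake: u128) {
        self.scores.insert(who, new_stake);
    }
}

/// Hypothetical listener that only counts the events it sees.
#[derive(Default)]
struct EventCounter {
    count: u32,
}

impl OnStakingUpdate for EventCounter {
    fn on_stake_update(&mut self, _who: u32, _new_stake: u128) {
        self.count += 1;
    }
}

/// Multiplexer: a pair of listeners is itself a listener that forwards
/// every event to both of its members.
impl<A: OnStakingUpdate, B: OnStakingUpdate> OnStakingUpdate for (A, B) {
    fn on_stake_update(&mut self, who: u32, new_stake: u128) {
        self.0.on_stake_update(who, new_stake);
        self.1.on_stake_update(who, new_stake);
    }
}
```

With this shape, the emitter only knows about one listener, while `(ScoreTracker, EventCounter)` fans each event out to both.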


Changes to assumptions in chilled and removed stakers

This PR changes some assumptions behind chilled stakers: a chilled/idle validator is kept in the target list, and is only removed from it when:

  1. Its ledger is unbonded and
  2. Its approval voting score is zero (i.e. no other stakers are nominating it).

This allows the stake-tracker to keep track of a chilled validator and its score after the validator is chilled and completely unbonds. This way, when a validator sets the intention to re-validate, the target's score is restored with the correct sum of approvals in the system (i.e. self-stake + all current nominations that were set before the re-validation).
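
The removal rule above boils down to a two-condition check. The sketch below uses a hypothetical `TargetAction` enum and function name; it only illustrates the rule as stated, not how the pallet implements it.

```rust
/// Hypothetical outcome of evaluating a target after a staking event.
#[derive(Debug, PartialEq)]
enum TargetAction {
    Keep,
    Remove,
}

/// A target node is dropped only when its ledger is unbonded AND nobody
/// nominates it anymore (approvals are zero). In every other case,
/// including a chilled but still-nominated validator, the node stays.
fn evaluate_target(is_bonded: bool, approvals: u128) -> TargetAction {
    if !is_bonded && approvals == 0 {
        TargetAction::Remove
    } else {
        TargetAction::Keep
    }
}
```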

Changes to Call::nominate

New nominations can only target active or chilling validators. "Moot" nominations still exist (i.e. nominations that point at an inactive or non-existent validator), but only arise when a validator stops validating or is chilled (in which case it may remain in the target list if its approvals are higher than 0).

In addition, the runtime ensures that a nominator does not nominate the same target more than once at a time. This is achieved by deduplicating the nominations in the extrinsic Staking::nominate.
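
As a rough sketch of the two checks described here (target must be known, duplicates are dropped), assuming a hypothetical `known_targets` set that holds active and idle validators and a made-up `NominateError` type:

```rust
use std::collections::BTreeSet;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum NominateError {
    /// The target is neither an active nor an idle validator.
    BadTarget(u32),
}

/// Rejects nominations of unknown targets and silently removes duplicates,
/// keeping the first occurrence of each target.
fn validate_nominations(
    targets: Vec<u32>,
    known_targets: &BTreeSet<u32>,
) -> Result<Vec<u32>, NominateError> {
    let mut seen = BTreeSet::new();
    let mut deduped = Vec::new();
    for target in targets {
        if !known_targets.contains(&target) {
            return Err(NominateError::BadTarget(target));
        }
        // `insert` returns false when the value was already present.
        if seen.insert(target) {
            deduped.push(target);
        }
    }
    Ok(deduped)
}
```

Whether the extrinsic preserves the caller's ordering or sorts before deduplicating is not stated in this description; the sketch keeps the original order.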

Changes to OnStakingUpdate

1. New methods

Added a couple more methods to the OnStakingUpdate trait in order to differentiate removed stakers from idle (chilling) stakers. For the rationale on why this is needed, see this discussion: #1933 (comment).

pub trait OnStakingUpdate<AccountId, Balance> {
  // snip

  /// Fired when an existing nominator becomes idle.
  ///
  /// An idle nominator stops nominating but its stake state should not be removed.
  fn on_nominator_idle(_who: &AccountId, _prev_nominations: Vec<AccountId>) {}
  
  // snip

  /// Fired when an existing validator becomes idle.
  ///
  /// An idle validator stops validating but its stake state should not be removed.
  fn on_validator_idle(_who: &AccountId) {}
}
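
To show why idle and removed stakers need separate hooks, here is a sketch of a listener that treats them differently. The `Tracker` struct, its in-memory maps and the `on_nominator_removed` name are assumptions for illustration; the real implementor works against runtime storage and may update scores differently.

```rust
use std::collections::BTreeMap;

type AccountId = u32;
type Balance = u128;

/// Hypothetical in-memory tracker; the real pallet uses runtime storage.
#[derive(Default)]
struct Tracker {
    /// Bonded stake per voter.
    voter_stake: BTreeMap<AccountId, Balance>,
    /// Approval stake per target.
    approvals: BTreeMap<AccountId, Balance>,
}

impl Tracker {
    /// Subtracts the nominator's stake from every target it used to back.
    fn drop_approvals(&mut self, who: &AccountId, prev_nominations: &[AccountId]) {
        let stake = self.voter_stake.get(who).copied().unwrap_or(0);
        for target in prev_nominations {
            if let Some(score) = self.approvals.get_mut(target) {
                *score = score.saturating_sub(stake);
            }
        }
    }

    /// Idle: the nominations stop counting, but the stake state is kept.
    fn on_nominator_idle(&mut self, who: &AccountId, prev_nominations: Vec<AccountId>) {
        self.drop_approvals(who, &prev_nominations);
    }

    /// Removed (hypothetical counterpart): the stake state is dropped too.
    fn on_nominator_removed(&mut self, who: &AccountId, prev_nominations: Vec<AccountId>) {
        self.drop_approvals(who, &prev_nominations);
        self.voter_stake.remove(who);
    }
}
```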

2. Refactor existing methods for safety

With this refactor, the event emitter must explicitly pass more information about the event when calling it. The advantage is that this interface design prevents potential data-sync issues: the event emitter does not necessarily need to update the staking state before emitting the events, and the OnStakingUpdate implementor does not need to rely as much on the staking interface, which makes the interface less error-prone.

Changes to SortedListProvider

Added a new method to the trait, gated by try-runtime, which returns whether a given node is in the correct position in the list given its current score. This method helps with the try-state checks for the target list.

pub trait SortedListProvider<AccountId> {
  // snip

  /// Returns whether the `id` is in the correct bag, given its score.
  ///
  /// Returns a boolean and it is only available in the context of `try-runtime` checks.
  #[cfg(feature = "try-runtime")]
  fn in_position(id: &AccountId) -> Result<bool, Self::Error>;
}
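
For intuition, a position check against bag thresholds could look like the sketch below. It assumes that a node belongs to the first bag whose upper threshold is greater than or equal to its score, with one extra unbounded bag at the end; the free functions and their parameters are hypothetical and differ from the trait method's signature above.

```rust
/// Returns the index of the bag a score belongs to: the first bag whose
/// upper threshold is >= the score. An index equal to `thresholds.len()`
/// stands for the final, unbounded bag.
/// Assumes `thresholds` is sorted in ascending order.
fn bag_index(thresholds: &[u128], score: u128) -> usize {
    thresholds.partition_point(|&upper| upper < score)
}

/// Returns whether a node stored in `current_bag` is where its score says
/// it should be.
fn in_position(thresholds: &[u128], current_bag: usize, score: u128) -> bool {
    bag_index(thresholds, score) == current_bag
}
```

A node whose score changed without being rebagged would fail this check, which is the kind of drift the try-state checks are meant to catch.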

Migrations

The migration code has been validated against Polkadot using the externalities tests in polkadot/runtime/westend/src/lib.rs. Upon running the migrations, we ensure that:

  • All validators have been added to the target list with their correct approvals score (as per the try-state checks).
  • All nominations are "cleaned" (see def. of clean above)
  • Try-state checks related to stake-tracker and approvals pass.

Check #4673 for more info on migrations and related tests.

Note that the migrations will "clean" the current nominations in the system, namely:

  • The migration removes duplicate nominations from all nominators, if they exist (by calling fn do_add_nominator with the deduplicated nominations)
  • The migration removes all nominations of non-active validators to avoid adding dangling nominations (by calling fn do_add_nominator if necessary)
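
The cleaning performed per nominator can be sketched as a filter. This is an assumption-laden illustration: `clean_nominations` and the `validators` set are made-up names, and returning `None` stands in for the case where no nominations are left and the nominator gets chilled.

```rust
use std::collections::BTreeSet;

/// Drops dangling targets (not in `validators`) and duplicates, keeping the
/// first occurrence of each remaining target. Returns `None` when nothing
/// is left, signalling that the nominator should be chilled.
fn clean_nominations(nominations: Vec<u32>, validators: &BTreeSet<u32>) -> Option<Vec<u32>> {
    let mut seen = BTreeSet::new();
    let cleaned: Vec<u32> = nominations
        .into_iter()
        .filter(|target| validators.contains(target))
        .filter(|target| seen.insert(*target))
        .collect();
    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned)
    }
}
```

Unlike a fresh call to Staking::nominate, which fails on an invalid target, this migration-style cleanup filters invalid targets out silently.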

Weight complexity

Keeping both the target and voter lists sorted by score requires the scores to be up to date after single operations (add nominator/validator, update stake, etc.) and composite staking events (validator slashing and payouts). See https://hackmd.io/FH8Uhi2aQ5mD0aMm-BbqMQ for more details and back-of-the-envelope calculations.

PR #4686 shows how the target list affects the staking's MaxExposurePageSize, based on benchmarks with different modes. In sum:

 == Strict VoterList sorting mode
 - Max page_size: 1984
 - Weight { ref_time: 1470154955707, proof_size: 7505385 }

 == Lazy VoterList sorting mode
 - Max page_size: 2496
 - Weight { ref_time: 1469080809430, proof_size: 9437673 }

 == No stake-tracker
 - Max page_size: 3008
 - Weight { ref_time: 1474536186486, proof_size: 11362971 }
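
To put the page sizes in perspective, the strict and lazy modes reach roughly 66% and 83% of the page size measured without the stake-tracker. The helper below is only a convenience for reproducing that arithmetic from the numbers above; its name is made up.

```rust
/// Expresses `value` as a percentage of `baseline`.
fn percent_of(value: u32, baseline: u32) -> f64 {
    value as f64 * 100.0 / baseline as f64
}
```

For example, `percent_of(1984, 3008)` covers the strict mode and `percent_of(2496, 3008)` the lazy mode.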

To do later

A. Remove legacy CurrencyToVote

Remove the need for the CurrencyToVote converter type in pallet-staking. This type converts from BalanceOf<T> to a u64 "vote" type, and from a safe u128 (i.e. ExtendedBalance) back to BalanceOf<T>. In both conversion directions, the total issuance of the system must be provided.

The main reason for this conversion is that the current phragmen implementation does not correctly support u128 as the main type. Hence the conversion between the balance (u128) and the supported u64 "vote" type.

Relying on the current issuance will be a problem with the staking parachain (let's assume that the staking runtime is not deployed in AH). In addition, removing the need for this conversion will simplify the stake-tracker and the associated list updates and make them cheaper to run.
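
To illustrate why the conversion depends on the total issuance, here is a simplified sketch of an issuance-scaled conversion. It is not the pallet's converter: the function names are made up, and it assumes the scaling factor is chosen so that the whole issuance fits into a u64.

```rust
/// Scaling factor derived from the total issuance (never below 1).
fn factor(issuance: u128) -> u128 {
    (issuance / u64::MAX as u128).max(1)
}

/// Balance (u128) to "vote" (u64): divide by the factor, then saturate.
fn to_vote(balance: u128, issuance: u128) -> u64 {
    let scaled = balance / factor(issuance);
    scaled.min(u64::MAX as u128) as u64
}

/// "Vote" back to balance: multiply by the same factor.
fn to_currency(vote: u128, issuance: u128) -> u128 {
    vote.saturating_mul(factor(issuance))
}
```

Note that the round trip is lossy whenever the factor is larger than 1, and that both directions need the issuance at hand, which is the dependency this item proposes to remove.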


To finish

  • thorough testing in stake-tracker pallet and integration with the Staking pallet
  • bring back nominations from slashed and chilled validator after re-validate
  • thorough try-runtime checks in stake-tracker which also run after the staking pallet tests
  • figure out a way to switch between target list providers (to allow for a phased rollout of this feature)
  • benchmarks for Call::drop_dangling_nomination
  • migrations (requires MBMs)
  • test MBM migrations (see Stake tracker improvements (migration and try-state checks OK in Polkadot) #4673)
  • migrate + follow-chain in Polkadot

Closes #442

@gpestana gpestana added the T1-FRAME This PR/Issue is related to core FRAME, the framework. label Oct 18, 2023
@gpestana gpestana self-assigned this Oct 18, 2023
@gpestana gpestana requested review from a team October 18, 2023 18:29
@gpestana gpestana marked this pull request as draft October 18, 2023 18:29
@gpestana gpestana mentioned this pull request Oct 18, 2023
1 task
@gpestana gpestana marked this pull request as ready for review November 7, 2023 14:57
@command-bot

command-bot bot commented Jun 3, 2024

@gpestana https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6383364 was started for your command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=westend --target_dir=polkadot --pallet=pallet_staking. Check out https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/pipelines?page=1&scope=all&username=group_605_bot to know what else is being executed currently.

Comment bot cancel 3-a21c1cbc-499b-49d0-857d-5ac845653e0d to cancel this command or bot cancel to cancel all commands in this pull request.

@command-bot

command-bot bot commented Jun 3, 2024

@gpestana Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=westend --target_dir=polkadot --pallet=pallet_staking has finished. Result: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6383364 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6383364/artifacts/download.

@gpestana
Contributor Author

gpestana commented Jun 3, 2024

bot bench polkadot-pallet --runtime=westend --pallet=pallet_staking

@command-bot

command-bot bot commented Jun 3, 2024

@gpestana https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6387304 was started for your command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=westend --target_dir=polkadot --pallet=pallet_staking. Check out https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/pipelines?page=1&scope=all&username=group_605_bot to know what else is being executed currently.

Comment bot cancel 6-6ed494ea-75a8-4828-a8ae-cf3490eada80 to cancel this command or bot cancel to cancel all commands in this pull request.

…=westend --target_dir=polkadot --pallet=pallet_staking
@command-bot

command-bot bot commented Jun 3, 2024

@gpestana Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=westend --target_dir=polkadot --pallet=pallet_staking has finished. Result: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6387304 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6387304/artifacts/download.

@gpestana
Contributor Author

gpestana commented Jun 3, 2024

bot bench polkadot-pallet --pallet=pallet_staking

@command-bot

command-bot bot commented Jun 3, 2024

@gpestana https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6388231 was started for your command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=rococo --target_dir=polkadot --pallet=pallet_staking. Check out https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/pipelines?page=1&scope=all&username=group_605_bot to know what else is being executed currently.

Comment bot cancel 8-00657f8c-1b16-4ad3-a9ec-b4a06016b5e2 to cancel this command or bot cancel to cancel all commands in this pull request.

@command-bot

command-bot bot commented Jun 3, 2024

@gpestana Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=rococo --target_dir=polkadot --pallet=pallet_staking has finished. Result: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6388231 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6388231/artifacts/download.

Contributor

@kianenigma kianenigma left a comment

I have reviewed most of this, and it looks mainly good to me.

The only major issue I see is that the MBM step is way too small; I'd prefer not to pause the chain for a long, long time.

];

/// Upper thresholds delimiting the targets bag list.
pub const TARGET_THRESHOLDS: [u128; 200] = [
Contributor

Is the distribution of initial validators in these bags good? We could go for a different bag distribution for targets, given that approval stakes are generally much higher. Just an option.

substrate/frame/election-provider-support/src/lib.rs (outdated, resolved)

/// Fired when a portion of a staker's balance has been withdrawn.
fn on_withdraw(_stash: &AccountId, _amount: Balance) {}
/// Representation of the `OnStakingUpdate` events.
Contributor

This gives me a possible refactoring idea which I want to share but don't necessarily want you to do, unless you see clear benefits to it:

trait OnStakingUpdate {
    fn update(event: OnStakingUpdateEvent) {}
}
pub enum OnStakingUpdateEvent { .. }

And now we just have the events listed once :D
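
Fleshed out slightly, the idea might look like the sketch below. The variant names, their fields and the `EventLog` implementor are all hypothetical; only the overall shape (one `update` method taking an enum) comes from the suggestion above.

```rust
/// Hypothetical event enum; the variants shown are illustrative only.
enum OnStakingUpdateEvent {
    StakeUpdate { who: u32, new_stake: u128 },
    ValidatorIdle { who: u32 },
    NominatorIdle { who: u32, prev_nominations: Vec<u32> },
}

/// Single-method listener trait: every event goes through `update`.
trait OnStakingUpdate {
    fn update(&mut self, event: OnStakingUpdateEvent);
}

/// Hypothetical implementor that renders each event as a log line.
#[derive(Default)]
struct EventLog(Vec<String>);

impl OnStakingUpdate for EventLog {
    fn update(&mut self, event: OnStakingUpdateEvent) {
        let line = match event {
            OnStakingUpdateEvent::StakeUpdate { who, new_stake } => {
                format!("stake of {} is now {}", who, new_stake)
            }
            OnStakingUpdateEvent::ValidatorIdle { who } => {
                format!("validator {} is idle", who)
            }
            OnStakingUpdateEvent::NominatorIdle { who, prev_nominations } => {
                format!("nominator {} is idle, dropped {} nominations", who, prev_nominations.len())
            }
        };
        self.0.push(line);
    }
}
```

One trade-off of this shape is that the exhaustive `match` forces every implementor to consider each event, whereas default trait methods let implementors ignore events they do not care about.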

}

impl pallet_balances::Config for Test {
type Balance = Balance;
Contributor

Let's use pallet_balances::TestDefaultConfig please :)

I am generally going to shamelessly be more demanding on requesting people to use our latest features internally :D


/// V13 Multi-block migration to introduce the stake-tracker pallet.
///
/// A step of the migration consists of processing one nominator in the [`Nominators`] list or one
Contributor

This means the migration would take thousands of blocks, right?

not great, as all other extrinsics are blocked while this happens.

Contributor Author

For some reason, from reading the docs the first time, I got the impression that the migration logic would fit as many steps in a block as possible. After revisiting the code and docs, I see that's not the case and that a step is executed once per block, as you mentioned.

I will refactor this code accordingly and try to fit as many nominator migrations per block as possible, thanks!
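
A rough sketch of what "as many nominators per step as the weight allows" could look like is shown below. The `WeightMeter` here is a simplified stand-in with a single `u64` dimension, and `step`, `InsufficientWeight` and the fixed `weight_per_item` are assumptions for illustration rather than the FRAME migration API.

```rust
/// Simplified stand-in for a weight meter: tracks the weight left in the block.
struct WeightMeter {
    remaining: u64,
}

impl WeightMeter {
    /// Consumes `weight` if enough is left; returns whether it succeeded.
    fn try_consume(&mut self, weight: u64) -> bool {
        if self.remaining >= weight {
            self.remaining -= weight;
            true
        } else {
            false
        }
    }
}

/// Hypothetical error: not even one item fits in the remaining weight.
#[derive(Debug, PartialEq)]
struct InsufficientWeight {
    required: u64,
}

/// Migrates nominators starting at `cursor` until the weight runs out.
/// Returns `Ok(Some(next))` when more work remains and `Ok(None)` when done.
fn step(
    cursor: usize,
    nominators: &[u32],
    migrated: &mut Vec<u32>,
    meter: &mut WeightMeter,
    weight_per_item: u64,
) -> Result<Option<usize>, InsufficientWeight> {
    if meter.remaining < weight_per_item {
        return Err(InsufficientWeight { required: weight_per_item });
    }
    let mut next = cursor;
    // Keep going while there is work left and the meter can pay for it.
    while next < nominators.len() && meter.try_consume(weight_per_item) {
        migrated.push(nominators[next]);
        next += 1;
    }
    if next >= nominators.len() {
        Ok(None)
    } else {
        Ok(Some(next))
    }
}
```

With this shape, the number of blocks needed scales with the total weight of the migration rather than with the number of nominators.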

if meter.remaining().any_lt(required) {
return Err(SteppedMigrationError::InsufficientWeight { required });
}

Contributor

What is the actual outcome when we return this? Will the chain ditch this migration and move on, or just this step, and then retry?

// if no nominations are left, chill the nominator.
let _ = <Pallet<T> as StakingInterface>::chill(&who)
.map_err(|e| {
log!(error, "error when chilling {:?}", who);
Contributor

How many of these do we have in Polkadot now?

@kianenigma kianenigma requested a review from Ank4n June 10, 2024 03:12
@paritytech-cicd-pr

The CI pipeline was cancelled due to the failure of one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6459705

Labels
T1-FRAME This PR/Issue is related to core FRAME, the framework.
Projects
Status: ✂️ In progress.
Status: Audited
Development

Successfully merging this pull request may close these issues.

Consider automatic rebaging when rewards are received or slashes happen
8 participants