Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

babe: support online configuration upgrades #5514

Merged
merged 16 commits into from
Apr 24, 2020
Merged

Conversation

sorpaas
Copy link
Member

@sorpaas sorpaas commented Apr 3, 2020

This is an attempt to support online configuration upgrades for BABE engine. Still needs to write the migration scripts, but would love to get some reviews.

Currently, BABE's configuration is only read at genesis block. It's then fixed, and cannot be changed. As a result, it's not possible to perform online upgrades, for example, from primary-slots only to primary-and-secondary-slots. This PR tries to fix that, by doing the following:

  • Split the configuration of BABE into genesis configuration and epoch configuration.
  • Genesis configuration is fixed and only read at genesis. Epoch configuration is read at every epoch, and stored alongside the epoch.
  • Genesis configuration contains information that cannot be changed. This includes the slot duration, genesis authorities and genesis randomness.
  • Epoch configuration contains other information that can be changed. This includes the value of c, and whether the engine supports secondary-slots. The encoding of epoch configuration can also be flexibly extended in the future.
  • At every next epoch descriptor, the parent block's epoch_configuration is read, and used as the epoch configuration for the epoch described by the next epoch descriptor. This config is then used by authorship and verification rules.

@sorpaas sorpaas added A0-please_review Pull request needs code review. A3-in_progress Pull request is in progress. No review needed at this stage. labels Apr 3, 2020
Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a great idea and I like the approach taken here. Some questions though:

  • I think currently it is not safe to change epoch length. The runtime module only emits NextEpochDescriptor digests which don't contain any epoch length, and from the client-side we always calculate end slot based on duration from current epoch (so basically genesis epoch duration).
  • I think it should be possible to change epoch length dynamically but it will need some support from runtime module (we also need to think about implications for light client).
  • epoch_configuration is added as a runtime API method so should we add some support for this in the runtime module? Currently the runtime doesn't know about c so this could be handled just by reading from chainspec (a hardfork basically).

@rphmeier
Copy link
Contributor

rphmeier commented Apr 3, 2020

It will need light client support because any changes in configuration is technically a hard-fork and requires finality to process.

As Andre says it's probably not safe to change epoch length or slot time dynamically at the moment.

@sorpaas
Copy link
Member Author

sorpaas commented Apr 6, 2020

Discussed with Rob. I'm going to add the config struct also to NextEpochDescriptor, to fix the light client issue.

  • For full node, the engine reads BabeEpochConfiguration from runtime, and check that it matches the data provided in NextEpochDescriptor.
  • For light client, the engine just accepts whatever BabeEpochConfiguration provided in NextEpochDescriptor.

@sorpaas
Copy link
Member Author

sorpaas commented Apr 7, 2020

Changed this a little bit, and I think I've got a design that I'm satisfied with a) maintains backward compatibility, b) support light clients, and c) extensible:

  • The genesis config BabeGenesisConfiguration is unchanged (same as old BabeConfiguration), and the runtime interface is unchanged.
  • The epoch config now just contains two values that I think is safe to change. c and secondary_slots. Not including epoch length as I'm just not quite sure. It can be added in the future.
  • A new digest is added, NextConfigDescriptor. The digest is optional, and if it is there together with NextEpochDescriptor, then the epoch after the current epoch will use the configuration as described by the descriptor. This is backward compatible of current BABE because blocks without NextConfigDescriptor is valid.
  • In the future, when we need to add new values to epoch config, it can just be defined as an additional digest type, and enact separately with the same rule. Changing genesis config will still require runtime interface change, though (so to support VRF from genesis in babe: secondary blocks with VRF #5501 the runtime version there sill have to be changed).

@sorpaas sorpaas marked this pull request as ready for review April 7, 2020 15:24
@sorpaas sorpaas requested a review from Demi-Marie as a code owner April 7, 2020 15:24
@sorpaas sorpaas added B4-breaksauthoring and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Apr 8, 2020
Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes lgtm but there's a test failure on aux_schema. did you have a chance to test this on kusama? or maybe we can test after merging this and before #5501.

client/consensus/babe/src/lib.rs Outdated Show resolved Hide resolved
@andresilva
Copy link
Contributor

In a follow-up PR we should add a method to the babe pallet, callable by root, that allows signaling a config change. This should queue the change so that the event is sent together with the next epoch descriptor.

sorpaas and others added 2 commits April 21, 2020 00:02
Co-Authored-By: André Silva <123550+andresilva@users.noreply.github.com>
@sorpaas
Copy link
Member Author

sorpaas commented Apr 20, 2020

NOTE: This needs to be merged together with #5501 to avoid multiple migrations (merge this PR first, and then merge #5501). I'll do Kusama testing tomorrow.

@sorpaas
Copy link
Member Author

sorpaas commented Apr 20, 2020

Finished test syncing Kusama in #5501.

Copy link
Contributor

@andresilva andresilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants