Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BIP 8 - replace FAILING with MUST_SIGNAL #950

Closed
wants to merge 3 commits into from

Conversation

ajtowns
Copy link
Contributor

@ajtowns ajtowns commented Jul 26, 2020

Prior to this patch bip8 allows for states to go STARTED -> FAILING -> ACTIVE in which case the last block of FAILING state signalling or not signalling would change the consensus rules applied to the very next block. This potentially gives very little warning of an upgrade.

Additionally, if it is desirable to reduce the timeoutheight for a bip8 deployment (for example from X to Y, Y < X), then activation via lockinontimeout at the shorter height would not result in consistent consensus rules. In particular, implementations would see:

  • timeoutheight=Y, lockinontimeout=true: height Y-2016: STARTED, height Y: LOCKED_IN, height Y+2016: ACTIVE
  • timeoutheight=Y, lockinontimeout=false: height Y-2016: STARTED, height Y: FAILING, height Y+2016: ACTIVE
  • timeoutheight=X, lockinontimeout=*: height Y-2016: STARTED, height Y: STARTED, height Y+2016: LOCKED_IN, height Y+4032: ACTIVE

This patch also reduces the number of blocks that must signal during the new MUST_SIGNAL phase from the current 100% requirement to the same requirement as needed to trigger from STARTED to LOCKED_IN; this ensures that provided the same threshold is used, if the deployment transitions to LOCKED_IN at height X for any bip8 deployment, it will transition to LOCKED_IN at height X for every bip8 deployment, given the same startheight and timeoutheight > X.

It also uses the MUST_SIGNAL state to trigger validation rules checking blocks signal. It would be possible to use these rules for checking if reindexing is necessary -- if you added an "impossible" transition from MUST_SIGNAL to INVALID if the threshold isn't met, then reindexing necessary precisely when the current block is INVALID, or MUST_SIGNAL and fails validation, and you need to reindex back to the last STARTED/DEFINED block.

bip-0008.mediawiki Show resolved Hide resolved
Miners must continue setting the bit in LOCKED_IN phase so uptake is visible and acknowledged.
Blocks without the applicable bit set are invalid during this period.
For flexibility, this rule does NOT require the top 3 bits to be set any particular way.
Copy link
Member

@luke-jr luke-jr Aug 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This drops mandatory signalling in the MASF case. Maybe that's okay, but I'm not sure yet. Can you move it to a separate PR?

Copy link
Contributor Author

@ajtowns ajtowns Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it drops mandatory signalling to 95% rather than 100%, but MASF won't happen without 95% signalling, so "MASF drops mandatory signalling" is technically saying "miners did 95% signalling to trigger activation, but they didn't do 95% signalling" which doesn't make sense to me... I'm not seeing what a 100% signal period buys over a 95% signal period (and it'd cost either miners or users having to immediately react to seeing it, or and extra month between lockin and activation, with states going STARTED -> PAUSE1 -> ALL_SIGNAL -> PAUSE2 -> ACTIVE assuming users are meant to react based on ALL_SIGNAL rather than the 95% signalling in STARTED)

Copy link
Member

@luke-jr luke-jr Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 95% is before LOCKED_IN, so fundamentally separate from the mandatory signalling during LOCKED_IN.

I'm not saying it should definitely be rejected, just that it should be separate from this PR and evaluated independently.

Copy link
Contributor Author

@ajtowns ajtowns Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I'm looking at the fundamentals is that, at present:

  • miners need to react instantly when they see 95% signalling during STARTED, and do 100% signalling during LOCKED_IN
  • users need to react instantly when they see 100% signalling during FAILING/LOCKED_IN, and start enforcing soft fork rules
  • with lockinontimeout, miners need to signal at 100% at a particular time

And after this patch:

  • miners don't need to upgrade from 95% signalling to 100% signalling
  • users have a retarget period to react when they see 95% signalling during STARTED/MUST_SIGNAL, and then need to start enforcing the soft fork rules
  • with lockinontimeout, miners are required to signal at 95% at a particular time

The only reason I can see for "miners need to upgrade from 95% signalling to 100% signalling" is if users want to only react when they see 100% signalling (which would make it more plausible to reduce the activation threshold for a deployment I guess), but to preserve the "don't have to react immediately" that would look add another two retarget periods, something like:

 * STARTED -> THRESHOLD_MET (95% threshold)
 * THRESHOLD_MET -> SIGNAL_ALL (always)
 * SIGNAL_ALL -> LOCKED_IN (always)
 * LOCKED_IN -> ACTIVE (always)
 * STARTED -> LOCKED_IN (100% threshold)

 * all blocks must signal during SIGNAL_ALL
 * STARTED -> LOCKED_IN transition ensures compatibility with lock in via reduced threshold activation

But if you want to reduce the activation threshold you can just have another deployment like bip91 that signals on a different bit, so instead of STARTED -> THREDHOLD_MET -> SIGNAL_ALL -> LOCKED_IN -> ACTIVE you go (STARTED,STARTED) -> (STARTED,LOCKED_IN) -> (STARTED,ACTIVE) -> (LOCKED_IN,COMPLETE) -> (ACTIVE,COMPLETE) which takes the same amount of time, but doesn't complicate deployments that don't need their threshold reduced. And that approach works even if you change the period length as well as the threshold.

Copy link
Member

@luke-jr luke-jr Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'm not saying we shouldn't make this change, only that it is logically independent from the rest of the changes here, and should be a separate PR.

@@ -121,7 +121,8 @@ We remain in the initial state until we reach the start block height.
After a period in the STARTED state, we tally the bits set,
and transition to LOCKED_IN if a sufficient number of blocks in the past period set the deployment bit in their
version numbers. The threshold is ≥1916 blocks (95% of 2016), or ≥1512 for testnet (75% of 2016).
If the threshold hasn't been met, and we reach the timeout, then we either transition to LOCKED_IN state anyway (if lockinontimeout is true), or we transition to FAILING.
If the threshold hasn't been met, and lockinontimeout is true and we are at the last period before the timeout, then we transition to MUST_SIGNAL.
Copy link
Member

@luke-jr luke-jr Aug 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should keep FAILING as an alternative here, for UX only (no logic difference from STARTED).

Copy link
Contributor Author

@ajtowns ajtowns Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UX only should be in the code, not in the bip, I think?

Copy link
Member

@luke-jr luke-jr Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the UX side is simply conveying "deployment status". It might be cleaner to recognise a status of "others might be MUST_SIGNAL".

Buuut... the point of this change is to make it so that could be arbitrarily earlier. So maybe FAILING would be bad to reintroduce in the end.

Copy link
Contributor Author

@ajtowns ajtowns Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe ideally you'd somehow track "there's a lower work header chain where the fork has activated, and the tip is not terribly old" and alert based on that?

}
}

Implementations should be careful not to ban peers that send blocks that are invalid due to not signalling (or blocks that build on those blocks), as that would allow an incompatible chain that is only briefly longer than the compliant chain to cause a split of the p2p network. If that occurred, nodes that are not enforcing lockinontimeout may not see new blocks in the compliant chain, and thus not reorg to it at the point when it has more work, and would thus not be following the valid chain with the most work.
Copy link
Member

@luke-jr luke-jr Aug 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But even more importantly, nodes enforcing MUST_SIGNAL need to ensure they remain attached to other such nodes.

Copy link
Contributor Author

@ajtowns ajtowns Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the details of that probably need to be left to the bips that want to deploy via bip-8; they could ensure they remain connected to other nodes by making sure lockinontimeout only happens when everybody is happy with it, in which case no special code is needed; or they could allocate a temporary service flag bit for it and preferentially connect to other nodes that way for some period.

Still worth mentioning, but nothing's coming to mind as to what to say exactly.

Copy link
Contributor Author

@ajtowns ajtowns Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note about this in a separate commit

bip-0008.mediawiki Outdated Show resolved Hide resolved
@ajtowns ajtowns force-pushed the 202007-bip8 branch 4 times, most recently from 16728c3 to d1c8642 Compare Aug 6, 2020
@ajtowns
Copy link
Contributor Author

ajtowns commented Aug 6, 2020

Updated with some of the comments addressed, and a little restructuring. (EDIT: and rebased on master)

Corresponding patch at https://github.com/ajtowns/bitcoin/commits/202008-bip8-pr950 -- it's on top of @luke-jr's bip8 branch rebased onto master and with some warnings/errors fixed.

ajtowns added 3 commits Aug 6, 2020
This removes the FAILING state and replaces compulsory signalling during
LOCKED_IN with compulsory signalling during a new MUST_SIGNAL phase during
the last retarget period prior to the timeout height.  Also reduces the
amount of blocks that are required to signal to match the threshold.

This ensures that if a deployment occurs using bip8 with
lockinontimeout=false and timeoutheight=N, that a later deployment using
bip8 with lockinontimeout=true and timeoutheight=K, where K<N that any
chain where LOCKED_IN is reached prior to height K, will be accepted as
valid by nodes using either set of deployment parameters.

It also ensures that the soft-fork's changed rules are only enforced
on chain a retarget period after signalling indicates enforcement is
expected (which was not previously the case if the FAILING to ACTIVE
transition took place).
@@ -59,10 +59,10 @@ With each block and soft fork, we associate a deployment state. The possible sta

# '''DEFINED''' is the first state that each soft fork starts out as. The genesis block is by definition in this state for each deployment.
# '''STARTED''' for blocks at or beyond the startheight.
# '''LOCKED_IN''' for one retarget period after the first retarget period with STARTED blocks of which at least threshold have the associated bit set in nVersion, or for one retarget period after the timeout when '''lockinontimeout''' is true.
# '''MUST_SIGNAL''' for one retarget period prior to the timeout, if LOCKED_IN was not reached and '''lockinontimeout''' is true.
# '''LOCKED_IN''' for one retarget period after the first retarget period with STARTED blocks of which at least threshold have the associated bit set in nVersion.
Copy link
Member

@luke-jr luke-jr Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...with STARTED (or MUST_SIGNAL) blocks...

@luke-jr
Copy link
Member

luke-jr commented Oct 15, 2020

Partially merged this in 3c63846, da9cdd6, and 0f683f7

@ajtowns
Copy link
Contributor Author

ajtowns commented Oct 19, 2020

Follow ups in #1019 (wording tweaks) #1020 (signalling non-mandatory during locked in) and #1021 (must signal at 95% threshold instead of 100%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants