Partition nodes with old protocol version from network in advance of hard fork #2775

Closed
daira opened this issue Nov 27, 2017 · 12 comments

@daira daira commented Nov 27, 2017

This works in conjunction with #2738.

@daira daira added this to Work Queue in Network Upgrade 0 via automation Nov 27, 2017

@bitcartel bitcartel commented Nov 27, 2017

Nodes running the new protocol version can start dropping peers running the old protocol version at a given block height, in preparation for NU0.

Upstream Core implemented something similar by looking at service bits rather than the protocol version, but the concept is the same:
bitcoin/bitcoin#10982

@bitcartel bitcartel commented Nov 28, 2017

Making a note that MIN_PEER_PROTO_VERSION might be useful.

@bitcartel bitcartel commented Nov 28, 2017

The goal is to:

  • refuse new connections with the old protocol version
  • drop any existing connections with the old protocol version
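The two goals above can be sketched as a single acceptance check applied both to incoming connections and to the existing peer list. This is a minimal illustration, not the zcashd implementation: `Peer`, `is_acceptable`, `accept_connection`, `drop_old_peers`, and the constants `MIN_UPGRADED_PROTO_VERSION` and `PARTITION_HEIGHT` are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical values, for illustration only.
MIN_UPGRADED_PROTO_VERSION = 2   # lowest protocol version that supports the upgrade
PARTITION_HEIGHT = 1_000         # height at which old-version peers stop being accepted


@dataclass
class Peer:
    name: str
    proto_version: int


def is_acceptable(peer, chain_height):
    """Before the partition height every peer is fine; afterwards only upgraded ones."""
    if chain_height < PARTITION_HEIGHT:
        return True
    return peer.proto_version >= MIN_UPGRADED_PROTO_VERSION


def accept_connection(peers, candidate, chain_height):
    """Refuse a new connection from an old-version peer; otherwise add it."""
    if not is_acceptable(candidate, chain_height):
        return False
    peers.append(candidate)
    return True


def drop_old_peers(peers, chain_height):
    """Split existing peers into (kept, dropped) according to the same check."""
    kept = [p for p in peers if is_acceptable(p, chain_height)]
    dropped = [p for p in peers if not is_acceptable(p, chain_height)]
    return kept, dropped
```

Using one predicate for both paths keeps "refuse" and "drop" consistent, so a peer that would be dropped can never be re-accepted at the same height.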

@daira daira commented Dec 12, 2017

Note to self: expand on rationale before next NU0 meeting.

@bitcartel bitcartel commented Dec 18, 2017

Based on discussion:

@daira would like to achieve:

  • (a) separation: separate, over time, the things that could go wrong in a hard-fork upgrade;
  • (b) network topology: ensure that each upgrade-supporting node has time to form a connected network with only other upgrade-supporting nodes (avoiding some of its connection slots being uselessly taken up by non-upgrade-supporting nodes);
  • (c) simplification: simplify reasoning about what will happen at the consensus rule activation point, by ensuring that upgrade-supporting nodes are only connected to each other at that point [this is potentially more significant if the upgrade requires both consensus and network protocol changes].

Devil's advocate:

  • On (a), there is a history of hard forks where a flag day works out fine, i.e. consensus rules activate at block H, which results in a clean fork and network partitioning, e.g. Monero scheduled hard forks, Bitcoin->Bitcoin Cash.

  • On (b) if the network is partitioned at height H1 and consensus rules activate at H2, during the period between H1 and H2, transactions can be replayed on both sides of the split.

    • With a non-controversial fork, auto-senescence means that nodes running the old protocol version may already be deprecated.
    • With a controversial fork, attackers on one side can deliberately replay transactions on the other side’s fork.
      • Also, as seen with the proposed Segwit2x hard fork, one side trying to partition the network early by dropping nodes based on service bits simply resulted in the other side faking their service bits.
  • On (c), I think this is covered by the points about (b). Externally, I think it will be confusing to the ecosystem if there are two block heights to worry about. That is, "partition and then activation" might be confusing, whereas "upgrade/fork and then activate" might be better but still confusing.

A few tidbits:

  • The time between partitioning and activating is completely arbitrary. In discussion it seemed that the appropriate time should be long enough for developers to release a hot fix in case of emergency. Suggestions have varied from hours to days. It is hard to know what is sufficient and how long the ecosystem should suspend economic activity, e.g. by not sending transactions.

  • In a world of multiple clients, trying to split the network early will most likely fail because some clients do not implement the partitioning logic (it's not a consensus rule, so they don't have to), maliciously game the protocol version, or implement the partition logic with bugs. The only guarantee of a network split is when the consensus rules change.

Proposal

During discussion, the following mechanism came up; it may satisfy (b) and (c) but not (a):

  • Activation occurs at block H, and the first N blocks must be empty.

What value should N be? With an expected 24 blocks per hour, setting N to 288 would give approximately 12 hours. A smaller value for N might not leave enough time for the engineering team, which is distributed around the world, to work together on a response.
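The arithmetic behind N can be checked with a small helper. This is only a worked example of the figures quoted above (24 blocks per hour, N = 288); the function names are made up for illustration.

```python
import math

BLOCKS_PER_HOUR = 24  # expected block rate quoted in the comment above


def empty_period_hours(n_blocks, blocks_per_hour=BLOCKS_PER_HOUR):
    """Expected duration, in hours, of a run of n_blocks empty blocks."""
    return n_blocks / blocks_per_hour


def blocks_for_hours(hours, blocks_per_hour=BLOCKS_PER_HOUR):
    """Smallest N whose expected duration is at least the requested hours."""
    return math.ceil(hours * blocks_per_hour)


print(empty_period_hours(288))  # 12.0
print(blocks_for_hours(12))     # 288
```

These are expected durations only; actual block times vary, so the real length of the empty period would differ from run to run.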

This mechanism:

  • is simple to understand.
  • provides a period of time for the network to coalesce

It also provides guidance to the ecosystem, which will want to know when it's safe to send transactions. Here we are saying it's okay to send transactions in the new format right away (they will sit in the mempool), or you can wait until block H+N is reached.

Related, but should be in another ticket or in spec:

Regarding the mempool:

  • arriving transactions which do not satisfy the new consensus rules will be rejected
  • we should have a mechanism whereby, when block H arrives, the mempool evicts any old-format transactions
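The two mempool rules above might look roughly like the following. It is a sketch under stated assumptions: `Tx`, `Mempool`, `accept`, and `on_block_connected` are hypothetical names, "satisfies the new consensus rules" is reduced to a single `new_format` flag, and rejection of old-format transactions is assumed to start once the chain tip has reached H.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    new_format: bool  # stand-in for "satisfies the new consensus rules"


class Mempool:
    def __init__(self, activation_height):
        self.activation_height = activation_height  # block H
        self.txs = {}

    def accept(self, tx, tip_height):
        """Reject old-format transactions once the tip has reached H (assumption)."""
        if tip_height >= self.activation_height and not tx.new_format:
            return False
        self.txs[tx.txid] = tx
        return True

    def on_block_connected(self, height):
        """When block H (or later) arrives, evict old-format transactions."""
        if height < self.activation_height:
            return []
        evicted = [txid for txid, tx in self.txs.items() if not tx.new_format]
        for txid in evicted:
            del self.txs[txid]
        return evicted
```

For example, with H = 100, an old-format transaction accepted at height 98 is returned by `on_block_connected(100)` and removed, while a new-format one stays in the pool.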

Regarding peers:

  • we should have a mechanism whereby, when block H arrives, we drop nodes running an older protocol version
  • this could be proactive (done when the new block arrives) or passive (when a peer sends new data, we check its protocol version).

@bitcartel bitcartel commented Dec 18, 2017

We also discussed that if there is a partition block height, there should be empty blocks until the activation block height, to counter replaying of transactions in the old format.

@bitcartel bitcartel commented Dec 18, 2017

P = network partition
Q = new consensus rules
R = new format tx get mined

On upgraded branch:

 P  ---- empty blocks ---> Q ---- empty blocks ---> R

vs

P=Q ---- empty blocks ---> R
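
One way to read the two timelines above is as a function from block height to phase on the upgraded branch. The sketch below assumes half-open ranges (a block at height P already counts as post-partition, and a block at height R may contain transactions); the thread does not specify the boundaries, and `phase` and `must_be_empty` are hypothetical names.

```python
def phase(height, p, q, r):
    """Classify a height on the upgraded branch, given P <= Q <= R."""
    if not p <= q <= r:
        raise ValueError("expected P <= Q <= R")
    if height < p:
        return "pre-partition"
    if height < q:
        return "partitioned, old rules, empty blocks"
    if height < r:
        return "new rules, empty blocks"
    return "new rules, transactions mined"


def must_be_empty(height, p, r):
    """Blocks from P up to (but excluding) R are expected to be empty."""
    return p <= height < r


# First timeline: P < Q < R
print(phase(15, 10, 20, 30))  # partitioned, old rules, empty blocks
# Second timeline: P = Q, so the middle phase never occurs
print(phase(15, 10, 10, 30))  # new rules, empty blocks
```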

@bitcartel bitcartel commented Dec 18, 2017

Consider a node with the new protocol version whose only connections, between P and Q, are to nodes with the old protocol version: how do you drop those nodes? The behaviour will be the same as a hard fork with P=Q unless the dropping logic is non-deterministic or throttled.

@daira daira commented Dec 18, 2017

The behaviour is not the same. If P < Q, then the new node has time to reconnect to other new nodes before Q.

@str4d str4d added this to 1.0.15: P2P / Net in Release planning Dec 20, 2017
@str4d str4d added this to the 1.0.15 milestone Jan 2, 2018

@daira daira commented Jan 11, 2018

Consider making pchMessageStart equal to the branch id. This may or may not do what we want.

@bitcartel bitcartel commented Jan 15, 2018

Could we make use of INIT_PROTO_VERSION?

@str4d str4d modified the milestones: v1.0.15, v1.1.0 Feb 26, 2018

@str4d str4d commented Feb 26, 2018

We decided to instead implement preferred eviction of pre-Overwinter nodes, but we should keep an eye on the network during the Overwinter activations on testnet and mainnet, and potentially re-open this then if we feel it would be a better strategy.
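
As a rough illustration of what "preferred eviction" could mean: when a connection slot has to be freed, pre-upgrade peers are considered first, and upgraded peers are only evicted if no pre-upgrade peer is connected. This is a simplified sketch, not the zcashd eviction logic. `Peer`, `pick_eviction_candidate`, the version threshold, and the tie-break (evict the most recently connected peer) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    proto_version: int
    connected_at: float  # seconds since some epoch; larger means more recent


def pick_eviction_candidate(peers, min_upgraded_version):
    """Prefer evicting a pre-upgrade peer; fall back to any peer if none exist."""
    if not peers:
        return None
    old = [p for p in peers if p.proto_version < min_upgraded_version]
    pool = old if old else peers
    # Assumed tie-break: evict the most recently connected peer in the pool.
    return max(pool, key=lambda p: p.connected_at)
```

Unlike a hard partition, this keeps old-version peers connected while slots are free and only displaces them under pressure.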

@str4d str4d closed this Feb 26, 2018
Network Upgrade 0 automation moved this from Work Queue to Complete Feb 26, 2018