New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core, eth: enforce network split post DAO hard-fork #2795

Closed
wants to merge 6 commits into
base: develop
from

Conversation

Projects
None yet
7 participants
@karalabe
Member

karalabe commented Jul 8, 2016

Prerequisite PRs, merge these first to get rid of duplicate commits:

  • cmd, core, miner: add extradata validation to consensus rules #2794

This PR is meant to enforce the network split after passing the DAO hard-fork block number. It is done by requesting the DAO fork-block's header immediately after an eth handshake completes. If the DAO challenge isn't completed within 15 seconds, the connection is dropped.

To complete the DAO challenge, the remote peer must reply with the requested header (every peer will do so, it's how the eth protocol works). A few things can happen:

  • The returned header's extra-data conforms to the side we're on. Challenge completed.
  • The returned header's extra-data doesn't conform to our side. Challenge failed, peer dropped.
  • No header is returned, this complicates things:
    • We ourselves haven't reached the fork point, assume friendly (no way to challenge)
    • We've already reached the fork point, so we already have our fork block:
      • If peer's TD is smaller than this, we cannot check it's side, assume friendly
      • If peer's TD is higher than this, assume the packet is a reply to something else, wait further
@robotally

This comment has been minimized.

Show comment
Hide comment
@robotally

robotally Jul 8, 2016

Vote Count Reviewers
👍 0
👎 0

Updated: Wed Jul 20 02:22:26 UTC 2016

robotally commented Jul 8, 2016

Vote Count Reviewers
👍 0
👎 0

Updated: Wed Jul 20 02:22:26 UTC 2016

@julian1

This comment has been minimized.

Show comment
Hide comment
@julian1

julian1 Jul 10, 2016

Would it be better if p2p network selectivity was a general capability available to support any main-chain fork?

The code is written in a way that makes it appear very DAO specific.

Also if a node administrator wanted to support DAO protocol rules, but relay transactions for both networks, would they be able to do that? What happens if a client lies?

julian1 commented Jul 10, 2016

Would it be better if p2p network selectivity was a general capability available to support any main-chain fork?

The code is written in a way that makes it appear very DAO specific.

Also if a node administrator wanted to support DAO protocol rules, but relay transactions for both networks, would they be able to do that? What happens if a client lies?

@whatisgravity

This comment has been minimized.

Show comment
Hide comment
@whatisgravity

whatisgravity Jul 11, 2016

Why is so much code being added specifically for a bug within a contract? Do you not see how poor this solution is if it requires this many fundamental changes specific to this single event and not more general tools? It would be easier to swallow of these changes were general and not DAO specific.

The geth is the primary client, you should not be making it so specific, you should be making it generic as possible. Making it easier to work with private networks so proper testing is easier to accomplish.

What happens if a client lies?

This is the basic question anyone should ask whenever building anything. If this is not always on your mind you are making insecure software.

whatisgravity commented Jul 11, 2016

Why is so much code being added specifically for a bug within a contract? Do you not see how poor this solution is if it requires this many fundamental changes specific to this single event and not more general tools? It would be easier to swallow of these changes were general and not DAO specific.

The geth is the primary client, you should not be making it so specific, you should be making it generic as possible. Making it easier to work with private networks so proper testing is easier to accomplish.

What happens if a client lies?

This is the basic question anyone should ask whenever building anything. If this is not always on your mind you are making insecure software.

@karalabe

This comment has been minimized.

Show comment
Hide comment
@karalabe

karalabe Jul 11, 2016

Member

@julian1 Yes, it would be better to have generic selectivity. There are plans already to extend the discovery protocol (these are very old plans, no idea when they'll get implemented) + a nicer solutions would be to modify the eth protocol's handshake to include these sanity checks. The reason for this quick hack is because we don't have time to do a P2P upgrade. In its current form it is fully eth compliant to a later protocol extension would still play nicely with older peers not updating.

No, a node operator shouldn't be able to relay transactions for both networks. Sooner or later those will not be compatible (i.e. non-executable transactions), which would just burden each others' networks. Is there a particular reason it's desirable?

@julian1 @whatisgravity We always have in mind that clients might lie. In this particular case, a client may lie, will get connected, but at a consensus level will not be able to feed junk because of the header extra-data section validations. Think of this check similar to CORS: the idea is to prevent nodes en-mass to make cross-network connections, causing strange anomalies. You can of course do that with your particular node(s), but as long as the majority plays nice, a few bad apples can't cause too big issues.

The geth is the primary client, you should not be making it so specific, you should be making it generic as possible. Making it easier to work with private networks so proper testing is easier to accomplish.

That's exactly what we've been trying to do with this PR collection. I can't program a generic forker because there's no telling what a fork will include, so there's no point in programming it as it will most probably be wrong (also time is of essence here). However I did try to make this DAO specific fork as generic as possible (configurable via genesis.json, override flags, etc), specifically to make them properly testable in private networks.

Further, I've also written a client tester/simulator specifically for this case, to simulate networks of (arbitrary) clients and test various aspects of them in a client-independent and even tester-independent way. The tester is called hive and already has two DAO hard-fork tests covering extra-data rules and mid-fork P2P network splits.

Member

karalabe commented Jul 11, 2016

@julian1 Yes, it would be better to have generic selectivity. There are plans already to extend the discovery protocol (these are very old plans, no idea when they'll get implemented) + a nicer solutions would be to modify the eth protocol's handshake to include these sanity checks. The reason for this quick hack is because we don't have time to do a P2P upgrade. In its current form it is fully eth compliant to a later protocol extension would still play nicely with older peers not updating.

No, a node operator shouldn't be able to relay transactions for both networks. Sooner or later those will not be compatible (i.e. non-executable transactions), which would just burden each others' networks. Is there a particular reason it's desirable?

@julian1 @whatisgravity We always have in mind that clients might lie. In this particular case, a client may lie, will get connected, but at a consensus level will not be able to feed junk because of the header extra-data section validations. Think of this check similar to CORS: the idea is to prevent nodes en-mass to make cross-network connections, causing strange anomalies. You can of course do that with your particular node(s), but as long as the majority plays nice, a few bad apples can't cause too big issues.

The geth is the primary client, you should not be making it so specific, you should be making it generic as possible. Making it easier to work with private networks so proper testing is easier to accomplish.

That's exactly what we've been trying to do with this PR collection. I can't program a generic forker because there's no telling what a fork will include, so there's no point in programming it as it will most probably be wrong (also time is of essence here). However I did try to make this DAO specific fork as generic as possible (configurable via genesis.json, override flags, etc), specifically to make them properly testable in private networks.

Further, I've also written a client tester/simulator specifically for this case, to simulate networks of (arbitrary) clients and test various aspects of them in a client-independent and even tester-independent way. The tester is called hive and already has two DAO hard-fork tests covering extra-data rules and mid-fork P2P network splits.

@karalabe karalabe referenced this pull request Jul 11, 2016

Closed

core, params, tests: add DAO hard-fork balance moves #2800

0 of 1 task complete
@julian1

This comment has been minimized.

Show comment
Hide comment
@julian1

julian1 Jul 11, 2016

@karalabe Regarding p2p network partitioning.

If a node receives a block with an invalid transaction (considered from the perspective of whatever protocol rules it follows) then it will not relay the block to its immediately connected p2p peers.

Since block propagation gets interrupted at the first-step - won't the network partition itself quickly and automatically - simply by having peers disconnect from other peers that relay bad-blocks?

Was this able to be tested in simulation?

Also, would convergence on the partitioned state be aided by being able to tune disconnection policy to be more aggressive. (For example disconnect from a peer if it sends 1 bad-block in the last 3)?

I wonder if that would alleviate the need to hard-code/duplicate flags indicating protocol choices in block headers?

I worry that if Ethereum ends up being forked many times in future - then we will end up with a whole matrix of little header flags creating a lot of complicated logic as well as maintenance debt.

julian1 commented Jul 11, 2016

@karalabe Regarding p2p network partitioning.

If a node receives a block with an invalid transaction (considered from the perspective of whatever protocol rules it follows) then it will not relay the block to its immediately connected p2p peers.

Since block propagation gets interrupted at the first-step - won't the network partition itself quickly and automatically - simply by having peers disconnect from other peers that relay bad-blocks?

Was this able to be tested in simulation?

Also, would convergence on the partitioned state be aided by being able to tune disconnection policy to be more aggressive. (For example disconnect from a peer if it sends 1 bad-block in the last 3)?

I wonder if that would alleviate the need to hard-code/duplicate flags indicating protocol choices in block headers?

I worry that if Ethereum ends up being forked many times in future - then we will end up with a whole matrix of little header flags creating a lot of complicated logic as well as maintenance debt.

@karalabe

This comment has been minimized.

Show comment
Hide comment
@karalabe

karalabe Jul 12, 2016

Member

Currently block propagation is done in two phases: a new block's header is verified (PoW included) and if everything checks out, the block is already propagated to log(n) nodes. It is then tried to be imported, and if import succeeds the block is announced to everyone else. The idea was that blocks seeming good should propagate as fast as possible through the network.

This also means however that unless we remove the quick propagation mechanism, the network will fall apart if we start dropping nodes since from the quick-check's perspective states cannot be verified, so by the time it turns out to be rotten, everyone has it.

Regarding header content, that is mostly needed for the transition period when there's no "pro-fork" and/or "no-fork" headers to checkpoint with. Given that these hard forks are mostly protocol upgrades and there's little reason to support the minority (nor is there a reason to support these transitions in private networks), it is possible to switch out the mechanism to checkpointing after the fork passed and everything stabilized. (If however for whatever reason we need to support both camps longer, that can be ugly fast).

Member

karalabe commented Jul 12, 2016

Currently block propagation is done in two phases: a new block's header is verified (PoW included) and if everything checks out, the block is already propagated to log(n) nodes. It is then tried to be imported, and if import succeeds the block is announced to everyone else. The idea was that blocks seeming good should propagate as fast as possible through the network.

This also means however that unless we remove the quick propagation mechanism, the network will fall apart if we start dropping nodes since from the quick-check's perspective states cannot be verified, so by the time it turns out to be rotten, everyone has it.

Regarding header content, that is mostly needed for the transition period when there's no "pro-fork" and/or "no-fork" headers to checkpoint with. Given that these hard forks are mostly protocol upgrades and there's little reason to support the minority (nor is there a reason to support these transitions in private networks), it is possible to switch out the mechanism to checkpointing after the fork passed and everything stabilized. (If however for whatever reason we need to support both camps longer, that can be ugly fast).

@whatisgravity

This comment has been minimized.

Show comment
Hide comment
@whatisgravity

whatisgravity Jul 12, 2016

I'm still not sure how this solution really prevents either network from creating modified clients to relay valid transactions to the other network complicating the split.

whatisgravity commented Jul 12, 2016

I'm still not sure how this solution really prevents either network from creating modified clients to relay valid transactions to the other network complicating the split.

@aakilfernandes

This comment has been minimized.

Show comment
Hide comment
@aakilfernandes

aakilfernandes Jul 13, 2016

Isn't there already a networkid option? Can't we repurpose that for this fork?

Ps thanks for going to these great lengths to keep both networks operating, its truly appreciated.

aakilfernandes commented Jul 13, 2016

Isn't there already a networkid option? Can't we repurpose that for this fork?

Ps thanks for going to these great lengths to keep both networks operating, its truly appreciated.

@karalabe

This comment has been minimized.

Show comment
Hide comment
@karalabe

karalabe Jul 16, 2016

Member

Superseeded and merged in #2814.

Member

karalabe commented Jul 16, 2016

Superseeded and merged in #2814.

@karalabe karalabe closed this Jul 16, 2016

@obscuren obscuren removed the please review label Jul 16, 2016

@yangliw3

This comment has been minimized.

Show comment
Hide comment
@yangliw3

yangliw3 commented Jul 20, 2016

support

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment