
p2p: Ensure peers are on the same side of chain splits as us #869

Merged
merged 1 commit into from Jun 22, 2018

Conversation

@gsalgado (Collaborator) commented Jun 5, 2018

Closes: #865

@carver (Contributor) commented Jun 5, 2018

Rather than checking fork rules explicitly in p2p, maybe it's time to bring back the HeaderChain idea: start using HeaderChain in the LightPeerChain, and have it check fork rules.

Something like replacing this: https://github.com/gsalgado/py-evm/blob/25a2a4dca3d25fce68c5ba1e0b4ea35d2f88702c/p2p/lightchain.py#L222

with:

try:
    await self.header_chain.coro_import_header(header)
except ValidationError:
    # A fork choice rule was violated. Maybe we want a specific exception for this
    # TODO:
    # blacklist peer
    # drop peer
    break

and have header chain's import_header() actually check fork rules :)
https://github.com/ethereum/py-evm/blob/master/evm/chains/header.py#L154
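As a rough illustration of that suggestion, an `import_header()` that consults the VM in effect at the header's height might look like the sketch below. The `(start_block, vm_class)` pairs mirror py-evm's `vm_configuration`, but the dict-backed store and the `validate_header` hook are hypothetical stand-ins, not the actual API:

```python
# Hypothetical sketch: a HeaderChain whose import_header() defers fork-rule
# checks to the VM class active at the header's height. The validate_header
# hook and dict-backed store are illustrative, not py-evm's real API.

class ValidationError(Exception):
    pass


class HeaderChain:
    def __init__(self, vm_configuration):
        # vm_configuration: ((start_block, vm_class), ...) in ascending order
        self.vm_configuration = vm_configuration
        self.headers = {}  # block_number -> header

    def get_vm_class_for_block_number(self, block_number):
        matched = None
        for start_block, vm_class in self.vm_configuration:
            if block_number >= start_block:
                matched = vm_class
        if matched is None:
            raise ValidationError("No VM configured for block %d" % block_number)
        return matched

    def import_header(self, header):
        # Each VM enforces its own fork rules (e.g. DAO extra_data checks),
        # so a violation surfaces here as a ValidationError.
        vm_class = self.get_vm_class_for_block_number(header["block_number"])
        vm_class.validate_header(header)
        self.headers[header["block_number"]] = header
```

With this shape, the p2p code only needs the `try/except ValidationError` shown above; all fork knowledge stays in the VM classes.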

@pipermerriam (Member) left a comment

I'm feeling a bit grumbly about this but I don't have a practical alternative to suggest.

  • I'd really like to actually perform the check against the Homestead VM to determine whether the fork is supported.
  • I'd like to de-duplicate the very similar do_extra_handshake_checks and _dao_fork_check methods.

I see why both of these things are the way they are. Can you go ahead and open an issue to track this FIXME so that we are slightly less likely to forget to ever do it?

p2p/peer.py Outdated
# for users who want to follow the no-fork chain.
request_id = gen_request_id()
self.sub_proto.send_get_block_headers(
    DAO_FORK_MAINNET_BLOCK, max_headers=1, request_id=request_id)
Member

Is the reason we only request 1 header (instead of the full 10) just for optimization?

Collaborator Author

It is my understanding that it'd be redundant to check the extra_data on all 10 headers, so I wouldn't call it an optimization.

p2p/peer.py Outdated
raise

async def _dao_fork_check(self) -> None:
    # FIXME: For now we only support the pro-fork chain, but we should provide a config option
Member

This makes me grumbly because I know we have very little intrinsic motivation to fix this later... but also the PeerPool doesn't have access to the Chain, so giving it the data it needs to do these checks might be messy.

@gsalgado (Collaborator Author) commented Jun 6, 2018

@carver any thoughts on the best way to have HeaderChain.import_header() check fork rules? ISTM we'd need to move (or duplicate?) vm_configuration and a bunch of other things from Chain to HeaderChain, and then implement the validation in the fork-specific create_header_from_parent() or something like that. Just trying to figure out whether we should go with this as an interim solution or go straight to the solution you suggest.

@pipermerriam (Member)

Not a complete answer but I think that Chain should define a header_chain_class and that the vm_configuration from the Chain class should be passed down to the HeaderChain. This gets us VM awareness in the HeaderChain to be able to do this type of validation.
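The shape of that proposal might be sketched as follows; the `header_chain_class` attribute name comes from the comment above, while the constructor signature is an assumption:

```python
# Sketch of the proposal above: Chain declares a header_chain_class and
# passes its vm_configuration down, making the HeaderChain VM-aware.
# Constructor signatures here are assumptions, not py-evm's actual API.

class HeaderChain:
    def __init__(self, vm_configuration):
        self.vm_configuration = vm_configuration


class Chain:
    header_chain_class = HeaderChain
    vm_configuration = ()  # ((start_block, vm_class), ...)

    def __init__(self):
        # The header chain gets the same VM schedule the full chain uses,
        # so it can apply fork-specific header validation on its own.
        self.header_chain = self.header_chain_class(self.vm_configuration)
```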

@carver (Contributor) commented Jun 6, 2018

@gsalgado I think it might be worth jumping straight into the new HeaderChain changes. I'll take a look now.

@carver (Contributor) commented Jun 7, 2018

So it's a very rough work in progress, but #883 is the gist of what I'm thinking as an approach.

@carver (Contributor) commented Jun 9, 2018

Although #883 is not complete, I still think it proves out the idea: that we can keep all the forking logic in the VM, separate from the p2p module.

@gsalgado (Collaborator Author) commented Jun 9, 2018

Sounds good to me! Do you plan to finish up #883?

@pipermerriam (Member)

Should this now be closed since #883 is merged?

@gsalgado (Collaborator Author)

This PR is doing what you suggest in #865 (comment), but in a far from ideal way. Once we agree on the best way to tackle #865, I can update this accordingly.

@carver (Contributor) commented Jun 12, 2018

Once we agree on the best way to tackle #865 I can update this accordingly

Seems like a good sign that we both came up with the same approach. I'd say that's good enough to take a swing at it now.

@gsalgado gsalgado force-pushed the dao-fork-peer-check branch 2 times, most recently from eec433b to 6243005 Compare June 13, 2018 14:12
@gsalgado gsalgado mentioned this pull request Jun 13, 2018
@gsalgado gsalgado changed the title p2p: Ensure mainnet peers are pro DAO fork when connecting p2p: Ensure peers are on the same side of chain splits as us Jun 13, 2018
@gsalgado gsalgado force-pushed the dao-fork-peer-check branch 4 times, most recently from 68e4232 to 960a742 Compare June 14, 2018 11:57
@pipermerriam (Member) left a comment

Evidently I forgot to post this.

class MainnetHomesteadVM(MainnetDAOValidatorVM, HomesteadVM):
    pass

class MainnetPostDAOHomesteadVM(MainnetDAOValidatorVM, HomesteadVM):
    caused_chain_split = True
Member

This mechanism doesn't sit well with me. I'd much rather we special case things in trinity with respect to the HomesteadVM than create a generic mechanism like this. Do you think it's reasonable to just do the following in the peer checks?

for start_block, VM in vm_configuration:
    if not issubclass(VM, HomesteadVM):
        continue  # not using a VM that contains the dao fork rules in its history
    elif start_block > VM.dao_fork_block_number:
        continue  # VM's era starts fully past the dao fork

    ...  # do the checks we need to do

We'd retain a single VM class for homestead like before.

Collaborator Author

I like that, but if we're going to special case the HomesteadVM, we might as well do it like this?

    if self.network_id != MAINNET_NETWORK_ID:
        continue
    dao_fork_block = HomesteadVM.dao_fork_block_number
    # request and validate dao_fork_block...

Then we don't have to worry about passing chain_class around as well.

Member

Except that we don't know whether we're using the HomesteadVM or a subclass which starts the fork block at a different height. I think we have to pass in the vm_configuration in some form; otherwise this won't work for non-mainnet and similar VMs.
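One way to read that: scan the vm_configuration for any HomesteadVM subclass whose era still contains its DAO fork block, rather than hardcoding mainnet's HomesteadVM. A sketch (the helper name is made up for illustration; the loop mirrors the snippet above):

```python
# Sketch: derive the DAO fork block from a vm_configuration instead of
# hardcoding HomesteadVM, so subclasses that move the fork height still
# work. The helper name is hypothetical; the loop mirrors the review comment.

class HomesteadVM:
    dao_fork_block_number = 1_920_000  # mainnet default


def find_dao_fork_block(vm_configuration):
    for start_block, vm_class in vm_configuration:
        if not issubclass(vm_class, HomesteadVM):
            continue  # no DAO fork rules in this VM's history
        if start_block > vm_class.dao_fork_block_number:
            continue  # this VM's era starts past the DAO fork
        return vm_class.dao_fork_block_number
    return None  # no DAO-style split to check peers against
```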

p2p/peer.py Outdated
@@ -99,6 +103,16 @@
from trinity.db.header import BaseAsyncHeaderDB # noqa: F401


# XXX: Where should this live?
Member

Shouldn't this come from whatever Chain class is being used? Unfortunate since that means passing the chain_class down through the various APIs but it seems closer to the right thing here.

Collaborator Author

Yeah, that's what I wanted to avoid. Do you think it's ok to pass only the vm_configuration into PeerPool and handshake()? Or can you think of any reason why we should pass a Chain class?

Member

No reason to pass the whole Chain class. 👍 to just the vm_configuration property.

p2p/peer.py Outdated
headers = msg['headers']
if len(headers) != 2:
    raise HeaderNotFound()
parent, header = headers
Member

Should there be validation here that they actually sent us the headers we requested? Does this allow them to send some pair of other arbitrary headers that would pass validation but aren't the headers we requested?

Which raises an additional point of whether we validate the returned headers in any situation? Shouldn't we be validating that the headers returned are indeed at minimum a subset of the requested headers that we expected? Similarly, that they conform to the requested ordering based on the reverse keyword.

Collaborator Author

We could perform all those extra checks you describe, but it'd make the code a lot more complex for no real benefit, because a malicious peer that passes this check by providing fake data would be caught later on, when we actually try to import headers. That's what I had in mind when I wrote the comment in the except block that handles HeaderNotFound.

Unless I'm missing something, I don't think this could be an attack vector, so I'd rather avoid the extra complexity.

Member

I think the added complexity could be as simple as a single function call to a validation utility.

def validate_header_response(request_params, response_headers):
    ...

Should be small and easy to insert into any function body which requests and receives headers (though it does get complicated if we have call sites which don't have access to the request_params).

If it is as simple as this, then I feel strongly enough to push for it. If there are hidden complexities that I don't see, then I'll happily defer to your judgement.

My reasoning for still wanting this, even with the knowledge that the import validation should catch things, is that unless I know with near certainty that something isn't an attack vector, I try to assume that something is hiding in there, and skipping validation seems like a good avenue for security holes to find their way in.
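A minimal sketch of such a utility, using the signature the PR later settled on (start_number, count, reverse, headers); the error messages and the assumption that headers expose a block_number attribute are illustrative, and the skip parameter is deliberately ignored:

```python
# Sketch of the proposed validation utility: check that a peer's header
# response actually matches the request parameters. Error messages and the
# block_number attribute access are assumptions for illustration.

class ValidationError(Exception):
    pass


def validate_header_response(start_number, count, reverse, headers):
    # A peer may legitimately return fewer headers than requested,
    # but never more.
    if len(headers) > count:
        raise ValidationError(
            "Expected at most %d headers, got %d" % (count, len(headers)))
    step = -1 if reverse else 1
    expected = [start_number + step * i for i in range(len(headers))]
    actual = [header.block_number for header in headers]
    if actual != expected:
        raise ValidationError(
            "Expected headers at %r, got %r" % (expected, actual))
```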

Collaborator Author

Indeed, that's straightforward enough. Pushed a new commit with the validation function you suggested.

p2p/peer.py Outdated
self, block_number: BlockNumber) -> Tuple[BlockHeader, BlockHeader]:
request_id = gen_request_id()
self.sub_proto.send_get_block_headers(
    block_number - 1, max_headers=2, request_id=request_id, reverse=False)
Member

Why does this one specify a request_id but the one on line 605 doesn't?

Collaborator Author

Different sub-protocols: LES requests include a request_id but ETH ones don't.

p2p/peer.py Outdated
@@ -528,6 +600,22 @@ class ETHPeer(BasePeer):
self.head_td = msg['td']
self.head_hash = msg['best_hash']

async def _get_headers_at_chain_split(
        self, block_number: BlockNumber) -> Tuple[BlockHeader, BlockHeader]:
    self.sub_proto.send_get_block_headers(block_number - 1, max_headers=2, reverse=False)
Member

Sister comment: why doesn't this have a request_id?

@gsalgado gsalgado force-pushed the dao-fork-peer-check branch 7 times, most recently from 7f1e6c8 to 2348f8a Compare June 21, 2018 17:21
@pipermerriam (Member) left a comment

Am I correct that there should be a follow-up PR that applies this validation to the other places where header retrieval happens?



def validate_header_response(
        start_number: int, count: int, reverse: bool, headers: List[BlockHeader]) -> None:
Member

Are we just not using the skip parameter yet?

Collaborator Author

Yep, we currently don't use it anywhere, IIRC.

@gsalgado (Collaborator Author)

We could try to apply this validation to other places that receive headers, but it won't be as straightforward since they don't have the request params. And yeah, it should be in a separate PR, because this one is only about ensuring peers are on the same side of the DAO fork as us.

@pipermerriam (Member)

Tracked here: #944

@chfast commented Jul 24, 2018

I cannot find it in the code. Do you check only the header hash of the DAO HF header?

@pipermerriam (Member)

Check happens here:

https://github.com/ethereum/py-evm/blob/master/p2p/peer.py#L819-L865

Specifically here: https://github.com/ethereum/py-evm/blob/master/p2p/peer.py#L861

I think we may be missing validation enforcement on the extra_data field for the first 10 headers.
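For context, the rule being referenced: the DAO fork spec requires pro-fork clients to set extra_data to b'dao-hard-fork' on the fork block and the nine blocks that follow it, so validating only the fork header's hash skips the per-header check on the remaining nine. A sketch of that check (the function name and error type are made up for illustration):

```python
# Sketch of the DAO extra_data rule: the fork spec requires extra_data to be
# b'dao-hard-fork' for the fork block and the following nine blocks. The
# function name and error type here are illustrative.

DAO_EXTRA_DATA = b'dao-hard-fork'
DAO_CHECK_RANGE = 10  # fork block plus the nine blocks after it


def validate_dao_extra_data(header_number, extra_data, dao_fork_block):
    in_range = dao_fork_block <= header_number < dao_fork_block + DAO_CHECK_RANGE
    if in_range and extra_data != DAO_EXTRA_DATA:
        raise ValueError(
            "Header %d is in the DAO fork range but has extra_data %r"
            % (header_number, extra_data))
```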

@pipermerriam (Member)

link #1086

@pipermerriam (Member)

@chfast commented Jul 24, 2018

Is the extra data check needed? Isn't the header hash enough?

@pipermerriam (Member)

It may not be needed, but it is part of the spec, so we do it.
