New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
p2p: Ensure peers are on the same side of chain splits as us #869
Conversation
Rather than checking fork rules explicitly in Something like replacing this: https://github.com/gsalgado/py-evm/blob/25a2a4dca3d25fce68c5ba1e0b4ea35d2f88702c/p2p/lightchain.py#L222 with: try:
await self.header_chain.coro_import_header(header)
except ValidationError:
# A fork choice rule was violated. Maybe we want a specific exception for this
# TODO:
# blacklist peer
# drop peer
break and have header chain's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm feeling a bit grumbly about this but I don't have a practical alternative to suggest.
- would really like to actually perform the check against the homestead VM for whether the fork is supported.
- would like to de-duplicate the very similar
do_extra_handshake_checks
and_day_fork_check
methods.
I see why both of these things are the way they are. Can you go ahead and open an issue to track this FIXME
so that we are slightly less likely to forget to ever do it.
p2p/peer.py
Outdated
# for users who want to follow the no-fork chain. | ||
request_id = gen_request_id() | ||
self.sub_proto.send_get_block_headers( | ||
DAO_FORK_MAINNET_BLOCK, max_headers=1, request_id=request_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the reason we only request 1 header (instead of the full 10) just for optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is my understanding that it'd be redundant to check the extra_data on all 10 headers, so I wouldn't call it an optimization
p2p/peer.py
Outdated
raise | ||
|
||
async def _dao_fork_check(self) -> None: | ||
# FIXME: For now we only support the pro-fork chain, but we should provide a config option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me grumbly because I know we have very little intrinsic motivation to fix this later..... but also the PeerPool
doesn't have access to the Chain
so giving it the data it needs to do these checks might be messy.
@carver any thoughts on the best way to have |
Not a complete answer but I think that |
@gsalgado I think it might be worth jumping straight into the new |
So it's a very rough work in progress, but #883 is the gist of what I'm thinking as an approach. |
Although 883 is not complete, I still think it is proving out the idea: that we can keep all the forking logic in the VM, separate from the |
Sounds good to me! Do you plan to finish up #883? |
25a2a4d
to
2c34e4b
Compare
Should this now be closed since #883 is merged? |
This PR is doing what you suggest on #865 (comment), but in a far from ideal way. Once we agree on the best way to tackle #865 I can update this accordingly |
Seems like a good sign that we both came up with the same approach. I'd say that's good enough to take a swing at it now |
eec433b
to
6243005
Compare
68e4232
to
960a742
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Evidently I forgot to post this.
evm/chains/mainnet/__init__.py
Outdated
class MainnetHomesteadVM(MainnetDAOValidatorVM, HomesteadVM): | ||
pass | ||
class MainnetPostDAOHomesteadVM(MainnetDAOValidatorVM, HomesteadVM): | ||
caused_chain_split = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mechanism doesn't sit well with me. I'd much rather we special case things in trinity with respect to the HomesteadVM
than create a generic mechanism like this. Do you think it's reasonable to just do the following in the peer checks.
for start_block, VM in vm_configuration:
if not issubclass(VM, HomesteadVM):
continue # not using a VM that contains the dao fork rules in its history
elif start_block > VM.dao_fork_block_number:
continue # VM is fully passed the dao fork
... # do the checks we need to do
We'd retain a single VM
class for homestead like before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that, but if we're going to special case the HomesteadVM, we might as well do it like this?
if self.network_id != MAINNET_NETWORK_ID:
continue
dao_fork_block = HomesteadVM.dao_fork_block_number
# request and validate dao_fork_block...
Then we don't have to worry about passing chain_class
around as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except that we don't know that we're using the HomesteadVM
or whether it's a subclass which starts the fork block at a different height.... I think we have to pass in the vm_configuration
in some form, otherwise this won't work for non-mainnet and similar VMs.
p2p/peer.py
Outdated
@@ -99,6 +103,16 @@ | |||
from trinity.db.header import BaseAsyncHeaderDB # noqa: F401 | |||
|
|||
|
|||
# XXX: Where should this live? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this come from whatever Chain
class is being used? Unfortunate since that means passing the chain_class
down through the various APIs but it seems closer to the right thing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's what I wanted to avoid. Do you think it's ok to pass only the vm_configuration into PeerPool
and handshake()
? Or can you think of any reason why we should pass a Chain
class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason to pass the whole Chain
class. 👍 to just the vm_configuration
property.
p2p/peer.py
Outdated
headers = msg['headers'] | ||
if len(headers) != 2: | ||
raise HeaderNotFound() | ||
parent, header = headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be validation here that they actually sent us the headers we requested? Does this allow them to send some pair of other arbitrary headers that would pass validation but aren't the headers we requested?
Which raises an additional point of whether we validate the returned headers in any situation? Shouldn't we be validating that the headers returned are indeed at minimum a subset of the requested headers that we expected? Similarly, that they conform to the requested ordering based on the reverse
keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could perform all those extra checks you describe, but it'd make the code a lot more complex for no real benefit because a malicious peer that passes this check by providing fake data would be caught later on when we actually try to import headers. That's what I had in mind when I wrote the comment in the except block that handles HeaderNotFound
.
Unless I'm missing something, I don't think this could be an attack vector so I'd rather avoid the extra complexity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the added complexity could be as simple as a single function call to a validation utility.
def validate_header_response(request_params, response_headers):
...
Should be small and easy to insert into any function body which requests and receives headers (though it does get complicated if we have call sites which don't have access to the request_params
).
IF it is as simple as this then I feel strongly enough to push for it. If there are hidden complexities that I don't see, then I'll happily defer to your judgement.
My reasoning for still wanting this even with the knowledge that the import validation should catch things is that unless I know with near certaintly that something isn't an attack vector, I try to assume that something is hiding in there, and skipping validation seems like a good avenue for security holes to find their way in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, that's straightforward enough. Pushed a new commit with the validation function you suggested
p2p/peer.py
Outdated
self, block_number: BlockNumber) -> Tuple[BlockHeader, BlockHeader]: | ||
request_id = gen_request_id() | ||
self.sub_proto.send_get_block_headers( | ||
block_number - 1, max_headers=2, request_id=request_id, reverse=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this one specify a request_id
but the one on line 605 doesn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different sub-protocols. LES requests include a request_id but ETH ones don't
p2p/peer.py
Outdated
@@ -528,6 +600,22 @@ class ETHPeer(BasePeer): | |||
self.head_td = msg['td'] | |||
self.head_hash = msg['best_hash'] | |||
|
|||
async def _get_headers_at_chain_split( | |||
self, block_number: BlockNumber) -> Tuple[BlockHeader, BlockHeader]: | |||
self.sub_proto.send_get_block_headers(block_number - 1, max_headers=2, reverse=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sister comment: why doesn't this have a request_id
?
7f1e6c8
to
2348f8a
Compare
2348f8a
to
d678492
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am i correct that there should be a follow up PR that applies this validation to the other places where header retrieval happens?
|
||
|
||
def validate_header_response( | ||
start_number: int, count: int, reverse: bool, headers: List[BlockHeader]) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we just not using the skip
parameter yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, we currently don't use it anywhere IIRC
We could try and apply this validation to other places that receive headers, but it won't be so straightforward as they don't have the request params, and yeah, it should be in a separate PR because this one is only about ensuring peers are on the same side of the DAO fork as us |
Tracked here: #944 |
d678492
to
f193d96
Compare
I cannot find it in the code. Do you check only the header hash of the dao hf header? |
Check happens here: https://github.com/ethereum/py-evm/blob/master/p2p/peer.py#L819-L865 Specifically here: https://github.com/ethereum/py-evm/blob/master/p2p/peer.py#L861 I think we may be missing validation enforcement on the |
link #1086 |
Nevermind, here it is: https://github.com/ethereum/py-evm/blob/master/eth/chains/mainnet/__init__.py#L36 |
Is the extra data check needed? Header hash is not enough? |
It may not be needed but it is part of the spec so we do it. |
Closes: #865