p2p/peer_pool.py (Outdated)
@@ -79,11 +83,16 @@ def __init__(self,
        privkey: datatypes.PrivateKey,
        context: BasePeerContext,
        max_peers: int = DEFAULT_MAX_PEERS,
        peer_info: BasePeerInfoPersistance = None,
Wouldn't it make sense to put NoopPeerInfoPersistance as a default instead of None? And then remove lines 92-93.
Yeah, good point. My instinct is to avoid passing objects as parameter defaults, but NoopPeerInfoPersistance has no state so it's safe here.
How about just not having a default at this level and taking care of it at the level above this where it is instantiated.
> How about just not having a default at this level and taking care of it at the level above this where it is instantiated.

My idea was that BasePeerPool would always have a peer_info without needing to put null checks everywhere. This also makes the interface a little simpler: if the caller doesn't care about persistence then it can implicitly not-care, vs explicitly passing in a Noop object.
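For illustration, the trade-off being discussed can be sketched with a null-object default; the names below (NoopPeerInfo, PeerPool) are simplified stand-ins for the PR's classes, not its actual code:

```python
from typing import Optional


class NoopPeerInfo:
    """Null object: satisfies the peer-info interface but never persists anything."""

    def record_failure(self, remote: str, timeout: int, reason: str) -> None:
        pass  # deliberately a no-op

    def should_connect_to(self, remote: str) -> bool:
        return True  # with no history, every peer looks acceptable


class PeerPool:
    def __init__(self, peer_info: Optional[NoopPeerInfo] = None) -> None:
        # Defaulting to None and normalizing here avoids an object as a
        # parameter default, while callers who don't care about persistence
        # can still implicitly not-care.
        self.peer_info = peer_info if peer_info is not None else NoopPeerInfo()
```

Because the noop object is stateless, using an instance directly as the parameter default would also be safe, which is the point made above.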
p2p/persistance.py (Outdated)
        return True


class ClosedException(Exception):
I think this should derive from BaseP2PError and probably even move into p2p/exceptions.py.
Huh, I'm willing to do it but why do you think that error is more appropriate? This exception doesn't have anything to do with peers; it signals an invalid state which should never be reached. I thought about making it an assert False but I think this stacktrace is a little easier to read.
If seeing this is truly something a user should never encounter then downgrading to Exception is acceptable. If it's helpful to have a custom exception but we don't want to expose it to users, then maybe add that explanation to the docstring. If a user might encounter this in some situation (which it sounds like they shouldn't) then it should be in p2p.exceptions and inherit from the base exception.
Do as you see fit 😸
looks nice :) this is minor but shall i suggest s/persistance/persistence/?
p2p/persistance.py (Outdated)
T = TypeVar('T', bound=Callable[..., Any])


def must_be_open(func: T) -> T:
This looks like we are throwing away valuable type information here (T has a bound of Callable[..., Any]).
Did you check if mypy still thinks can_connect_to will return bool after this decorator is applied? You can test it if you put in reveal_type(can_connect_to) and run mypy.
https://mypy.readthedocs.io/en/latest/generics.html#declaring-decorators
According to the docs, preserving function signatures is done as follows:
FuncType = Callable[..., Any]
F = TypeVar('F', bound=FuncType)

# A decorator that preserves the signature.
def my_decorator(func: F) -> F:
    def wrapper(*args, **kwds):
        ...
    return cast(F, wrapper)

And the current code looks roughly equivalent to this.
> Did you check if mypy still thinks can_connect_to will return bool after this decorator is applied?

I didn't, but this decorator is called on more methods than can_connect_to, and I wanted it to work no matter what the signature of the wrapped method is. You're right that it's not very pretty though 😢 I found a ticket which tracks mypy support for this: python/mypy#3157
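A minimal sketch of the pattern under discussion, using the cast-based idiom from the mypy docs so the decorator preserves any signature (ClosedError and Store here are illustrative names, not the PR's code):

```python
from typing import Any, Callable, TypeVar, cast

F = TypeVar('F', bound=Callable[..., Any])


class ClosedError(Exception):
    pass


def must_be_open(func: F) -> F:
    # The F -> F annotation tells mypy the wrapped callable keeps its
    # original signature; cast() is needed because wrapper itself is untyped.
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        if self.closed:
            raise ClosedError("method called on a closed object")
        return func(self, *args, **kwargs)
    return cast(F, wrapper)


class Store:
    def __init__(self) -> None:
        self.closed = False

    @must_be_open
    def can_connect_to(self, enode: str) -> bool:
        return True
```

With this shape, reveal_type(Store.can_connect_to) still reports the original signature rather than a bare Callable.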
p2p/persistance.py (Outdated)
    def _fetch_node(self, remote: Node) -> sqlite3.Row:
        enode = remote.uri()
        cursor = self.db.execute('SELECT * from bad_nodes WHERE enode = ?', (enode,))
I haven't worked with SQL for ages and never did so in Python. I think @pipermerriam already commented on SQL injections in the previous PR so I assume it is double checked that this is the right way to do it and we are on the safe side?
Yes, using the ? in the query string and placing the values as positional arguments does keep us safe from SQL injection.
It makes me sad that this is the correct way, but it's indeed the correct way. Named params are nicer but this sqlite3 library doesn't support them.
Actually, this is undocumented but after looking at the sqlite3 library, params are passed directly to sqlite, which means named parameters are possible.
That means the code could look like this:
self.db.execute(
    '''
    INSERT INTO bad_nodes (enode, until, reason, error_count)
    VALUES (:enode, :until, :reason, :error_count)
    ''',
    {
        'enode': enode,
        'until': time_to_str(until),
        'reason': reason,
        'error_count': error_count,
    }
)

But that might be a level of boilerplate which the queries aren't yet big enough to justify.
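For what it's worth, named (:name) placeholders are supported by the standard-library sqlite3 module, and they are bind parameters just like ?, so they carry the same injection safety. A self-contained check (the bad_nodes schema below is a simplified guess at the PR's table, not its actual definition):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE bad_nodes (enode TEXT, until TEXT, reason TEXT, error_count INT)"
)
# Named placeholders bind values from a dict instead of a positional tuple.
db.execute(
    "INSERT INTO bad_nodes (enode, until, reason, error_count)"
    " VALUES (:enode, :until, :reason, :error_count)",
    {
        "enode": "enode://abc",
        "until": "2030-01-01T00:00:00",
        "reason": "WrongNetworkFailure",
        "error_count": 1,
    },
)
row = db.execute(
    "SELECT reason, error_count FROM bad_nodes WHERE enode = :enode",
    {"enode": "enode://abc"},
).fetchone()
```
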
p2p/persistance.py (Outdated)
class MemoryPeerInfoPersistance(SQLPeerInfoPersistance):
    def __init__(self, logger: ExtendedDebugLogger) -> None:
        super().__init__(Path(":memory:"), logger)
Sweet ❤️
Yeah, in memory SQLite is pretty sexy. I recall loving this for testing Django apps.
p2p/persistance.py (Outdated)
class BasePeerInfoPersistance(abc.ABC):
    def __init__(self, logger: ExtendedDebugLogger) -> None:
Rather than monkeypatching current_time in the tests I personally would prefer the __init__ to take a parameter time_provider: Callable[[], datetime.datetime] = datetime.datetime.utcnow. So, a sensible default value, yet over-writable for testing without reaching for monkeypatch.
I'm minorly 👎 on this. Time is one of the few things that I'm ok with mocking, as I haven't come across a better mechanism. I would advocate removing the current_time function in favor of just using datetime.datetime.utcnow(), and then in our tests using the monkeypatch fixture to swap that function out for one that returns the desired value.
Fine either way 👍
My idea on this is that there are two big categories of references: the "immutable" world of static references (immutable globals such as + or, in this file, sqlite3), and the mutable world of objects which programs typically concern themselves with.
Dependency injection works well for the latter. There are legitimately different implementations of BasePeerInfoPersistance floating around, and almost all classes won't care exactly which one they use, so by taking responsibility for constructing BasePeerInfoPersistance away from those classes we can make the code more flexible and much easier to test.
I think DI makes less sense for the former. Imports are fixed at the beginning of runtime; there aren't going to be multiple implementations of datetime.datetime.utcnow() floating around, so you lose no flexibility by explicitly referencing it. And if you squint, python is already doing DI! This module is asking for an implementation of datetime.datetime and python is providing one. test_persistence isn't monkeypatching so much as changing which implementation of utcnow() this module receives.
> test_persistence isn't monkeypatching so much as changing which implementation of utcnow() this module receives.

Yes, which is why I lean on the side: hey, if we do encounter a case where we want to swap the implementation of utcnow() against something else, e.g. a controllable fake utcnow(), then DI is my tool of choice.
And for a class where time plays an important role I wouldn't see it as uncommon to just depend on a time_provider: Callable[[], int]. But anyway, this can become quite a bit philosophical, and I just happen to have built DI-heavy systems for many years so I'm biased in that direction.
It's actually refreshing to look at other solutions so I'm not opposed to the existing code 👍
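For reference, the injectable-clock alternative would look roughly like this sketch (PeerInfoStore and is_banned are illustrative names, not the PR's code):

```python
import datetime
from typing import Callable


class PeerInfoStore:
    def __init__(
        self,
        # Real clock by default; tests pass a stub instead of monkeypatching.
        time_provider: Callable[[], datetime.datetime] = datetime.datetime.utcnow,
    ) -> None:
        self._now = time_provider

    def is_banned(self, until: datetime.datetime) -> bool:
        # A node stays banned while the current time is before `until`.
        return self._now() < until
```

A test can then freeze time with PeerInfoStore(time_provider=lambda: datetime.datetime(2020, 1, 1)) and assert ban expiry deterministically, without touching any module-level state.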
@@ -23,6 +24,9 @@ def __init__(self, event_bus: Endpoint, trinity_config: TrinityConfig) -> None:
        self._node_port = trinity_config.port
        self._max_peers = trinity_config.max_peers

        app_config = trinity_config.get_app_config(Eth1AppConfig)
Just noting that without further checking this would crash if the Node class gets reused for the beacon chain, which however I don't think will happen (as described in #240).
Some idle comments, not a full review.
p2p/exceptions.py (Outdated)
class WrongNetworkFailure(HandshakeFailure):
    timeout = 60 * 60 * 24  # one day
Maybe instead of storing these timeouts on the exceptions we should keep this information separate, in a mapping from Exception -> timeout.
I was unsure on this. On the one hand it doesn't seem to be BasePeerInfoPersistence's responsibility, but on the other hand you're right, this isn't really a property of the failure itself.
I could make a map in p2p/persistence.py; I didn't know if that was adding too much boilerplate to this PR.
Okay, I've moved them into a map, that's probably a big enough change to be worth looking at again: 129b8d6
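The map-based approach might look roughly like this (HandshakeFailure and WrongNetworkFailure follow the PR; the timeout values and the MRO lookup are illustrative, not the committed code):

```python
class HandshakeFailure(Exception):
    pass


class WrongNetworkFailure(HandshakeFailure):
    pass


# Timeouts live in one table instead of as attributes on each exception class.
FAILURE_TIMEOUTS = {
    WrongNetworkFailure: 60 * 60 * 24,  # one day
    HandshakeFailure: 10,               # generic handshake failures retry quickly
}


def timeout_for(exc: Exception) -> int:
    # Walk the MRO so a subclass without its own entry inherits its parent's.
    for cls in type(exc).__mro__:
        if cls in FAILURE_TIMEOUTS:
            return FAILURE_TIMEOUTS[cls]
    raise KeyError(f"no timeout registered for {type(exc).__name__}")
```

This keeps the ban policy in the persistence layer, so exceptions stay plain descriptions of what went wrong.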
It doesn't have to be addressed in this PR but I want to give consideration to how this interacts with preferred peers and the concept of reserved connection slots. We'll eventually need to ensure this mechanism doesn't blacklist a peer in a vipnode type setup where peers have paid for reserved slots.
Note that this is probably best solved via a multi-peer-pool setup, since it's probably not sane to try to support a single peer pool having multiple sets of rules for peer connections; those rules would very likely end up conflicting with each other.
p2p/peer_pool.py (Outdated)
    except HandshakeFailure as e:
        self.logger.debug("Could not complete handshake with %r: %s", remote, repr(e))
        self.peer_info.record_failure(
            remote, timeout=e.timeout, reason=type(e).__name__
It'd be nice to have a fixed set of values for the reason string. Maybe instead of using __name__ we could maintain a mapping from exception classes to a canonical "reason" string/value, or potentially something like an Enum with error codes?
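An Enum of canonical reason codes could look like this sketch (the member names and string values are invented for illustration):

```python
import enum


class BanReason(enum.Enum):
    """Stable reason codes: renaming an exception class no longer
    changes what gets written to the database."""
    WRONG_NETWORK = "wrong-network"
    WRONG_GENESIS = "wrong-genesis"
    HANDSHAKE_TIMEOUT = "handshake-timeout"


class HandshakeFailure(Exception):
    pass


class WrongNetworkFailure(HandshakeFailure):
    pass


# Exceptions map to canonical codes instead of relying on __name__.
BAN_REASONS = {
    WrongNetworkFailure: BanReason.WRONG_NETWORK,
}
```

The stored string then stays readable in the database while the set of legal values is fixed in one place.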
p2p/persistance.py (Outdated)
        enode = remote.uri()
        row = self._fetch_node(remote)
        now = current_time()
        if row:
If you want, it looks like it's possible to emulate an UPSERT according to this Stack Overflow answer: https://stackoverflow.com/questions/418898/sqlite-upsert-not-insert-or-replace
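One way to emulate an upsert that works on any SQLite version is INSERT OR REPLACE with a subquery, along the lines of that answer (table and column names below are simplified from the PR):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bad_nodes (enode TEXT PRIMARY KEY, error_count INT)")


def record_failure(enode: str) -> None:
    # INSERT OR REPLACE deletes any conflicting row and inserts the new one;
    # the subquery carries the old error_count forward (0 if no row existed).
    db.execute(
        "INSERT OR REPLACE INTO bad_nodes (enode, error_count) VALUES ("
        " :enode,"
        " COALESCE((SELECT error_count FROM bad_nodes WHERE enode = :enode), 0) + 1"
        ")",
        {"enode": enode},
    )


record_failure("enode://abc")
record_failure("enode://abc")
count = db.execute(
    "SELECT error_count FROM bad_nodes WHERE enode = ?", ("enode://abc",)
).fetchone()[0]
```

Newer SQLite (3.24+) also supports a native INSERT ... ON CONFLICT ... DO UPDATE, which avoids the delete-and-reinsert behavior of OR REPLACE.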
tests/p2p/test_persistance.py (Outdated)
def random_node():
    address = kademlia.Address('127.0.0.1', 30303)
    return kademlia.Node(generate_privkey(), address)
Just noting this inspired me to create #279.
heh, oops, good point @ralexstokes
Temporarily blacklist peers when specific errors occur, e.g. when a peer has the wrong genesis, don't attempt to reconnect to it for at least a day. This improves the speed at which peers are found because less time is wasted reconnecting to peers we already know don't work.
- Create SQLite-backed SQLitePeerInfo
- Create memory-backed MemoryPeerInfo
- FullNode stores peer info in a file called 'nodedb' in the data dir
I believe that they could be orthogonal notions. Even if you don't go all the way and have separate The alternative might be telling
Yeah, multiple peer pools with different rules seems like the simplest change. I'm trying to imagine alternatives: maybe peers each have a
Okay, I believe that I've addressed all the feedback. I finished my TODOs and squashed; if one of you could please give it a final lookover I'll address anything you notice and merge!
Looks good,
A cleaned up version of #261
What was wrong?
Trinity wastes time trying to connect to nodes which it already knows won't work (they're on the wrong network or using the wrong genesis).
How was it fixed?
Trinity now keeps a sqlite database with information on peers. It records failures and doesn't attempt to reconnect to useless peers until a timeout has passed.
This PR doesn't do things like remember good nodes or exponential backoff, but it hopefully lays a foundation which will make those changes easy in the future.
Todo: