Split peer selection from header sync #1317
Conversation
trinity/sync/common/chain.py
```python
@contextmanager
def _get_peer_header_syncer(
        self, peer: HeaderRequestingPeer) -> Generator['PeerHeaderSyncer', None, None]:
```
Type-wise, I think we figured out that we can do Iterator[Thing] here, which is a bit cleaner, leaving off the dangly None bits.
trinity/sync/common/chain.py
```python
self.chain,
self.db,
peer,
self.header_queue,
```
Just noting that, in my opinion, much of the confusing complexity we've gotten rid of involved passing around queues. If you're on board with avoiding this pattern, it doesn't need to be part of this PR, but my thoughts on what it might look like are: instead of the service being responsible for both generating headers and feeding them into the queue, potentially we should move that responsibility into the service where the header_queue lives. This could be a background task/process/daemon which does roughly:
```python
async def consume_headers(self, peer_header_syncer):
    async for header in peer_header_syncer.stream():
        self.header_queue.put_nowait(header)
```
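For concreteness, here is a self-contained sketch of that shape, with a fake syncer and made-up names (`FakePeerHeaderSyncer`, `HeaderConsumer` are illustrative, not trinity classes), using a plain `asyncio.Queue`:

```python
import asyncio


class FakePeerHeaderSyncer:
    """Hypothetical syncer that streams a fixed list of headers."""

    def __init__(self, headers):
        self._headers = headers

    async def stream(self):
        for header in self._headers:
            yield header


class HeaderConsumer:
    """Owns header_queue; pulls headers in from whichever syncer is active."""

    def __init__(self):
        self.header_queue = asyncio.Queue()

    async def consume_headers(self, peer_header_syncer):
        # The queue owner drains the syncer's stream, so the syncer
        # itself never touches the queue.
        async for header in peer_header_syncer.stream():
            self.header_queue.put_nowait(header)


async def main():
    consumer = HeaderConsumer()
    await consumer.consume_headers(FakePeerHeaderSyncer(["h1", "h2"]))
    return [consumer.header_queue.get_nowait() for _ in range(2)]
```

The point of the design is that swapping which peer feeds the queue only requires handing a different syncer to `consume_headers`.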
Note that this is a TaskQueue, not an asyncio.Queue. A nice thing about using a TaskQueue here is that there is back-pressure when downstream processes are slowing down. (So we can await self.header_queue.add(headers) and it will pause if downstream hasn't finished. Nevermind, asyncio.Queue does that too, of course.)
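The back-pressure point can be demonstrated with a bounded stdlib `asyncio.Queue` (trinity's TaskQueue is its own class; this sketch only shows the shared behaviour that an awaited put pauses the producer until the consumer catches up):

```python
import asyncio


async def produce(queue, items, log):
    for item in items:
        # With maxsize=1 this awaits whenever the consumer falls
        # behind: that pause is the back-pressure being discussed.
        await queue.put(item)
        log.append(f"put {item}")


async def consume(queue, count, log):
    for _ in range(count):
        item = await queue.get()
        log.append(f"got {item}")


async def main():
    queue = asyncio.Queue(maxsize=1)  # bounded, so put() can block
    log = []
    await asyncio.gather(
        produce(queue, [1, 2, 3], log),
        consume(queue, 3, log),
    )
    return log
```

In the resulting log, "put 3" can only appear after "got 1": the producer cannot run ahead of a full queue.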
I'll take a crack at removing this particular coupling, but I still think it could be a good solution for communicating between the BaseHeaderChainSyncer and, say, the FastChainSyncer. (Which is how it works now, but they are implemented as superclass/subclass, so maybe it's less noticeable.)

I'm also not remembering the queue complexity you're talking about, so if you have a link/story about it, that would help me understand what to avoid.
Force-pushed from 56114bd to 82e4b85
What was wrong?

BaseHeaderChainSyncer did a lot of things all together, like pick the peer to sync with, and then do all the header syncing with that peer. This structure was not friendly to changes like swapping in a new peer mid-stream to do the header sync, or switching to a skeleton header sync.

How was it fixed?
Added a new service that only runs for the lifetime of a single peer's header syncing. This splits the code that does peer selection out of the code that actually syncs with the given peer. I have some code for the mid-sync-swap drafted, but I suspect that will require some more conversation. I'm hoping that we can get this merged in the meantime.
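The split described above can be sketched roughly as follows. Every name here (`PeerHeaderSyncer`, `sync_headers`, the `sync` method) is illustrative, not the PR's actual API:

```python
import asyncio


class PeerHeaderSyncer:
    """Hypothetical per-peer service: syncs headers from exactly one peer."""

    def __init__(self, peer):
        self.peer = peer

    async def sync(self):
        # The real service would request, validate, and queue header
        # batches until this peer disconnects or is replaced.
        return f"synced from {self.peer}"


async def sync_headers(peers):
    """Outer loop: only selects peers; per-peer work is delegated."""
    results = []
    for peer in peers:
        # One syncer per peer lifetime: swapping peers mid-stream just
        # means letting this syncer finish and starting a fresh one.
        syncer = PeerHeaderSyncer(peer)
        results.append(await syncer.sync())
    return results
```

Because peer selection lives only in the outer loop, strategies like mid-sync swapping or skeleton sync change the syncer class, not the selection logic.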
Marked wip because I'm getting type errors in code that wasn't touched. cc cburgdorf for ideas? (Nevermind, the type errors are only showing up on my local install. That's also odd, but at least I have a lead now.)

Cute Animal Picture