
spv: Move FetchMissingCFilter step to syncer startup #2298

Merged
merged 3 commits into decred:master from fetchcf-startup-sync on Nov 15, 2023

Conversation

matheusd (Member):

This moves the "fetching missing cfilters" stage of syncing from individual peer startup to the syncer startup.

This reduces the overall load by ensuring this stage is only done once instead of being done for each new peer.

Future PRs will continue the process of moving syncing stages from the per-peer startupSync() function to the global syncer Run() function, removing duplicate work that is currently done by the wallet.

This will make it easier to perform future refactors that affect the
peer startup procedure.

The peer running loop in the syncer is slightly refactored to remove the need for additional contexts and wait channels.
(Resolved review threads on spv/sync.go and spv/backend.go.)
This switches the CFilters() call which fulfills the wallet's Peer
interface to use waitForRemote instead of pickRemote.

waitForRemote blocks until a peer is available, and thus only fails when
the context is closed or if the passed isEligible function indicates
none of the peers are eligible.
// If waitEligible is true, then this function will wait until at least one
// remote is eligible. Otherwise, this function returns a nil peer with nil
// error when there are peers but none of them are eligible.
func (s *Syncer) waitForRemote(ctx context.Context, isEligible func(rp *p2p.RemotePeer) bool, waitEligible bool) (*p2p.RemotePeer, error) {
Member:
is there a reason to return nil, nil at all? (are there going to be future calls where waitEligible is false?)

this just seems super awkward to me, even more so because of the naming (waitForRemote might not wait at all for such a remote peer).

Member:

if such a case is needed, i think using a different function (with a name to describe that non-waiting behavior) would be preferred.

matheusd (Member, Author):

Yes, a future commit will rely on waitEligible == false and the (nil, nil) return. Specifically, I'll use a picker to determine that no peer has announced headers beyond the height we are currently synced to, which flags that the startup sync has finished.

I could also return nil, ErrNoPeer in this case and select on the error if that's preferable.

@@ -49,6 +49,7 @@ type Syncer struct {

connectingRemotes map[string]struct{}
remotes map[string]*p2p.RemotePeer
remoteAvailable chan struct{}
Member:

naming this waitingOnRemote would be clearer, I think. Remotes might always be available, but this channel is only non-nil while there are waiters currently blocked in waitForRemote.

jrick (Member), Nov 14, 2023:

or perhaps that is not fully accurate, upon closer inspection it might still be non-nil but closed, as it is only ever set to nil when a peer is removed.

this code is giving me a rough time :) but trying to make sure that we don't end up causing a busy loop in waitForRemote

matheusd (Member, Author), Nov 15, 2023:

Renamed.

> as it is only ever set to nil when a peer is removed.

It is set to nil when a peer is connected.

So, the way I reason about this code is the following:

  1. Everything is done under the remotesMu lock
  2. The first caller to waitForRemote that finds no peers, creates the channel. Everyone else uses the same channel.
  3. Once the first peer connects, it closes the channel, sets the field to nil and adds itself to the remotes map.

Given everything is under the remotesMu lock, a peer entering waitForRemote() is either going to find a peer in the remotes map or it's going to use or create the waitingOnRemote channel.

Inside waitForRemote, the waiter captures the channel to use after unlocking remotesMu, so even if an arbitrary number of events (connections/disconnections/waits) happen between the unlocking and checking the captured channel c, the syncer still maintains the invariant that either there are peers or waitingOnRemote will be used.

I guess I could, instead of relying on the first caller to waitForRemote() to create the channel, always create it when the last peer disconnects (thus the invariant would be (len(remotes) > 0) != (waitingOnRemote != nil) - either there are remotes or waitingOnRemote is filled). This might make it easier to reason about the code (at the expense of always creating the channel even if unneeded).

I could also change this to a sync.Cond, but they don't offer a way to wait with a context.

jrick merged commit 1812006 into decred:master on Nov 15, 2023
2 checks passed
matheusd deleted the fetchcf-startup-sync branch on November 15, 2023, 16:53
2 participants