-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
autorelay: not discarding static relays on connectivity failures #1790
Conversation
In the case of the client loosing network connection it needs to reconnect back to relays when network is online
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of, thanks a ton for the contribution and apologies for the delay in the review. Follow ups will be much faster here on out.
I don't follow why we need both strategies of trying to connect to all static relays but also keeping them in our candidates. Wouldn't keeping them in our candidates suffice for both scenarios?
Either way, I think we can do something much simpler here. What if we added a new channel similar to peerChan
in func findNodes
and call it staticRelayChan
. We do the same logic (with backoffs and checking if we're at our max) and have a separate similar timer for this. This way we really only need changes in findNode
. After the new timer fires we can fill staticRelayChan with the static relay nodes that we aren't currently connected to. Backoffs will work similar to how normal candidates are handled, and we don't risk polluting our candidates list with nodes that are unresponsive (if we hit maxCandidates of unresponsive state relay nodes we're stuck). It also keeps all the other logic the same. Candidates will keep getting removed and we don't keep unresponsive relays in the candidate list.
This is a bit like giving static relays priority when sourcing new relays.
What do you think?
p2p/host/autorelay/relay_finder.go
Outdated
} else if evt.Connectedness == network.Connected { | ||
for _, r := range rf.conf.staticRelays { | ||
if r.ID == evt.Peer { | ||
rf.notifyMaybeConnectToRelay() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow why we need this logic now. Aren't we already connected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was the attempt to re-reserve slots when we are back online (when the client had intermitent network connection). In that case, the relay would have dropped all our reservations and we need to re-reserve
The reason for the re-scaning is that if the client is down then the relay will drop all the reservations of that client and the next time the client is up it needs to re-reserve slots. Even if we did not drop any candidates in the time we were offline, they need to reserve slots again
Yes, that could work. I could implement something along those lines. |
Great! Very much appreciated if you do, but no expectation here. If you don't get around to it, I'll probably tackle this Thursday or Friday of this week. |
Didn't have time at all to have it done. But I was in the process of doing the required changes. I have pushed them so you can work on top of them if you want. |
p2p/host/autorelay/options.go
Outdated
c.staticRelayMinInterval = time.Hour | ||
rand.Seed(time.Now().UnixNano()) | ||
var static_candidates = static | ||
rand.Shuffle(len(static_candidates), func(i, j int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the shuffle? Users may have a preferred order, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rationale here was: if asked only for 1 or 2 candidates (<len(static_candidates)
), and those candidates happened to be always down, then one wouldn't be able to connect to any. Randomizing the process guarantees than you will pull different candidates as longs as len(static_candidates)>num
. But yes, if order is important, the user can just increase num
and get more candidates if previous ones haven't succeeded.
p2p/host/autorelay/options.go
Outdated
rand.Shuffle(len(static_candidates), func(i, j int) { | ||
static_candidates[i], static_candidates[j] = static_candidates[j], static_candidates[i] | ||
}) | ||
for i := 0; i < num%c.minCandidates; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm so if num == c.minCandidates we do nothing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right it should have been for i := 0; i <= (num-1)%c.minCandidates-1; i++
given that num >0
. However, maybe its simpler returning just num
candidates with a max of len(static_candidates)
. Although in WithPeerSource it says that MUST return at least numPeers
and here should be pretty much the same, right? if num>len(static_candidates)
we can't do much about it.
I got a chance to look at this. Possibly dumb thought, but why not simply use |
Yes That could be done, however in order for to a relay to be retried when closed the peersource must be called again and serve the same static relays. This is now done in the refactor of witStaticRelays. I could have deleted all the static relays functions but then the burden of handling the peersource for static relays will go to the user. But that can be changed. Let me know what you think @MarcoPolo |
I merged 1875. It includes a helper for creating a peersource from static relays. It should retry static candidates automatically when we loose a candidate. |
Static relays are a special kind of relays, since the client has spent some time bootstrappin those relays. Thats why if the client (autorelay) looses connection with such a relays, it should not be treated like regular relays (e.g. discarding them) Two cases come to mind:
This can be solved in two ways, either not discarding a static relay on connection errors (and send it back to candidates) or implementing a timer to periodically call handleStaticRelay. I have implemented both, since the former responses quicker in case no 1 and the latter periodically rescues forgotten static relays in case no 2
This PR solves #1782