Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discovery+rpc: expose graph synced status within GetInfo #3355

Merged
merged 3 commits into from Aug 13, 2019

Conversation

@wpaulino
Copy link
Collaborator

commented Jul 29, 2019

Fixes #3289.

One thing to note is that the current implementation will lead to GetNetworkInfo blocking if it's called while lnd is waiting for the backend to be considered synced. This is because we don't initialize lnd until that's done first. This can be addressed, but I've decided to leave it out unless we think it's something definitely worth addressing.

@wpaulino wpaulino requested a review from halseth as a code owner Jul 29, 2019

@wpaulino wpaulino added this to the 0.8.0 milestone Jul 29, 2019

@Roasbeef

This comment has been minimized.

Copy link
Member

commented Jul 30, 2019

One thing to note is that the current implementation will lead to GetNetworkInfo blocking if it's called while lnd is waiting for the backend to be considered synced.

This'll have an affect on tools such as lndmon, as they won't be able to plot the increase in chans known over time. Still need to dive into the diff, but can't we just return false in the scenario that would cause us to block?

discovery/sync_manager.go Outdated Show resolved Hide resolved
@@ -4029,6 +4029,7 @@ func (r *rpcServer) GetNetworkInfo(ctx context.Context,
MaxChannelSize: int64(maxChannelSize),
MedianChannelSizeSat: int64(medianChanSize),
NumZombieChans: numZombies,
SyncedToGraph: r.server.authGossiper.SyncManager().IsGraphSynced(),

This comment has been minimized.

Copy link
@Roasbeef

Roasbeef Jul 30, 2019

Member

If you make the return value channel, then we don't need to block here and can fall through to false in that case.

This comment has been minimized.

Copy link
@wpaulino

wpaulino Jul 30, 2019

Author Collaborator

Went with a different approach instead.

@wpaulino wpaulino force-pushed the wpaulino:is-graph-synced branch from 31661b2 to fec422b Jul 30, 2019

@wpaulino wpaulino requested a review from Roasbeef Jul 30, 2019

@cfromknecht

This comment has been minimized.

Copy link
Collaborator

commented Jul 30, 2019

for some reason i was under the impression we wanted to expose a percentage rather than just a boolean, was that not the case? other the code changes look pretty good

@wpaulino

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 30, 2019

@cfromknecht I went for the simplest approach possible as exposing a percentage seemed a bit more involved, though I'm not opposed to doing that.

@cfromknecht
Copy link
Collaborator

left a comment

LGTM 👨‍🚀 We can defer on exposing a percentage-based indicator if it proves useful

@wpaulino wpaulino force-pushed the wpaulino:is-graph-synced branch from fec422b to 582d44b Jul 31, 2019

@sergioabril

This comment has been minimized.

Copy link

commented Jul 31, 2019

@cfromknecht IMO, a percentage would be much more useful. The boolean is ok, but feels incomplete to me.

@wpaulino

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 1, 2019

@sergioabril could you provide more insight for your use case of the percentage? Would it just be to expose the percentage to users in a higher level application? With the boolean, it should quickly flip to true unless it's your first time syncing the graph or you've been offline for a while.

@sergioabril

This comment has been minimized.

Copy link

commented Aug 1, 2019

@cfromknecht

This comment has been minimized.

Copy link
Collaborator

commented Aug 1, 2019

@wpaulino

With the boolean, it should quickly flip to true unless it's your first time syncing the graph or you've been offline for a while.

If we connect to a peer that doesn't respond, will this continue to show false indefinitely?

@wpaulino wpaulino force-pushed the wpaulino:is-graph-synced branch from 582d44b to 98348a4 Aug 1, 2019

@wpaulino

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 1, 2019

If we connect to a peer that doesn't respond, will this continue to show false indefinitely?

@cfromknecht wasn't the case previously, but should be fixed in the latest version!

@cfromknecht

This comment has been minimized.

Copy link
Collaborator

commented Aug 2, 2019

@cfromknecht wasn't the case previously, but should be fixed in the latest version!

just looked at the latest diff, how is this solved? if we send a query_channel_range and the initial peer we connect to is slow to reply, then is_synced will continue to show false. in the case of the app, that means a peer not responding can prevent you from making payments

@wpaulino

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 2, 2019

See fcb6a8b. We’ll pick up on the status of the next historical sync if our initial one hasn’t completed by then. Did you have another solution in mind?

@cfromknecht

This comment has been minimized.

Copy link
Collaborator

commented Aug 2, 2019

See fcb6a8b. We’ll pick up on the status of the next historical sync if our initial one hasn’t completed by then.

i don't think we're talking about the same issue. the is_synced field can only become true if a peer responds to our query_channel_range message. if none of our peers respond, then it will always be false. if we gate payments in the app based on this field, then you will never be able to make a payment if you connect to poor/malicious peers.

Did you have another solution in mind?

one option is to add a deadline after which it will flip to true regardless, that way if a peer doesn't respond we're still covered.

in my mind, there's really three states:

  1. UNKNOWN
  2. SYNCING
  3. SYNCED

when we come up, we start at UNKNOWN. after receiving any reply from our historical syncer[s] we then transition to:

  • SYNCING, if they have unknown channels, transitioning to SYNCED once a syncer completes
  • SYNCED, if they don't

my proposal is to add a timeout where if we stay in UNKNOWN for too long, we'll transition to SYNCED anyway and is_synced becomes true.

alternatively, these enums could be exposed over rpc to allow higher-level applications to act on. for example, if an app sees that we're still in UNKNOWN after some timeout, it can make the judgement call as to whether to allow payments or not

@wpaulino

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 2, 2019

if we gate payments in the app based on this field, then you will never be able to make a payment if you connect to poor/malicious peers.

Why not just alert users rather than preventing them from completely making payments? Higher level applications can choose to interpret this field as they wish. They can decide to only gate payments up to some timeout as you suggested, gate payments regardless of how long it takes, alert users, or something else entirely.

when we come up, we start at UNKNOWN. after receiving any reply from our historical syncer[s] we then transition to:

  • SYNCING, if they have unknown channels, transitioning to SYNCED once a syncer completes
  • SYNCED, if they don't

my proposal is to add a timeout where if we stay in UNKNOWN for too long, we'll transition to > SYNCED anyway and is_synced becomes true.

Transitioning to true even though we haven't synced the graph seems to defeat the purpose of the flag, which is just to give an indication that we have done our part to sync the graph with at least one peer fully and will continue to do so through the consequent historical syncs. Just because they replied once doesn't mean we can guarantee they'll continue to do so, so we can still be in the SYNCING state. If they don't reply, do we flip back to UNKNOWN? Do we try another peer after a timeout? The additional complexity doesn't seem worth it.

alternatively, these enums could be exposed over rpc to allow higher-level applications to act on. for example, if an app sees that we're still in UNKNOWN after some timeout, it can make the judgement call as to whether to allow payments or not

They can already do this with how things are at the moment, i.e. synced_to_chain is false after some timeout, though I guess they wouldn't be able to differentiate between being stalled and actually syncing.

continue
}

// Otherwise, we'll track the peer we've performed a

This comment has been minimized.

Copy link
@halseth

halseth Aug 2, 2019

Collaborator

interesting. So this is really an existing bug, but it wouldn't really matter that the initialHistoricalSyncCompleted was never set?

This comment has been minimized.

Copy link
@wpaulino

wpaulino Aug 6, 2019

Author Collaborator

Exactly. It would matter as it could prevent us from transitioning to active syncers.

@wpaulino wpaulino requested a review from halseth Aug 6, 2019

discovery/sync_manager.go Show resolved Hide resolved
lnrpc/rpc.proto Outdated Show resolved Hide resolved
wpaulino added 2 commits Jul 27, 2019
discovery: handle graph synced status after stalled initial historica…
…l sync

This ensures that the graph synced status is marked true at some point
once a historical sync has completed. Before this commit, a stalled
historical sync could cause us to never mark the graph as synced.

@wpaulino wpaulino force-pushed the wpaulino:is-graph-synced branch from 98348a4 to 8ca2c79 Aug 7, 2019

@wpaulino wpaulino requested a review from cfromknecht Aug 7, 2019

@Roasbeef

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

On the question of if we need to care about a peer stalling: that seems to be more of an issue for our peer selection, and eclipse attack mitigations, no? If a peer isn't responding, then we can move to time them out, and rotate to another peer.

Our initial use case for this is also pretty simple, in just greying things out, or displaying this as an active indicator. Also practically, in the case of the application, they're connected to mostly known "golden" peers as well.

@wpaulino wpaulino changed the title discovery+rpc: expose graph synced status within GetNetworkInfo discovery+rpc: expose graph synced status within GetInfo Aug 7, 2019

@halseth halseth removed their request for review Aug 7, 2019

@cfromknecht
Copy link
Collaborator

left a comment

LGTB 💰

@Roasbeef
Copy link
Member

left a comment

LGTM 🌋

@Roasbeef Roasbeef merged commit 4e62e8a into lightningnetwork:master Aug 13, 2019

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.02%) to 60.782%
Details

@wpaulino wpaulino deleted the wpaulino:is-graph-synced branch Aug 13, 2019

@githorray

This comment has been minimized.

Copy link

commented Aug 13, 2019

I get "synced_to_graph": true via the getinfo rpc call, but not via lncli getinfo.

@wpaulino

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 14, 2019

Thanks @githorray, fixed with #3400.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.