Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

watchtower/wtclient: reliable, asynchronous pipeline for revoked state backups #2618

Merged
merged 21 commits into from Mar 16, 2019

Conversation

Projects
None yet
3 participants
@cfromknecht
Copy link
Collaborator

commented Feb 8, 2019

This PR presents the bulk of the wtclient package, a reliable subsystem that backs up revoked states to a configured watchtower. #2516 introduced the logic for constructing justice transactions and encrypting them. Here we add the remainder of the capabilities that aggregates all revoked states, assigns them to a particular session, then signs justice transactions and uploads them to the proper tower.

Great care is taken to ensure that the wtclient.TowerClient can do all of this in a non-blocking fashion, ensuring that we maintain LND's high throughput and performance in HTLC forwarding. However, this too creates difficulties of its own, as we must ensure that the whole process is reliable, and will continue until completion if the daemon is shutdown or killed intermittently. Thus, the client is implemented as an asynchronous pipeline to which breached states are added, and the client makes its best effort to upload them before allowing the daemon to exit.

Additional considerations are also given to mobile, which is intended to be the predominant platform in which the client is run and where it adds the most benefit. Since transient factors such as network connectivity may impede a successful upload during shutdown, the entire pipline can be ForceQuit to exit promptly. On OS's like iOS, where applications only have a short time to live once a shutdown is requested, this ensures that we terminate gracefully, and can pick back up where we left off once the daemon has been restarted.

The initial rollout will only be targeted at supporting a single, private tower. However, much of the logic in this package is included for more advanced features such as rotating between several target towers during session negotiation, or allowing more asynchrony between session negotiation which can be useful for added privacy. These can be enabled down the road once more testing is in place as we continue to improve on minor reliability edge cases in the single tower case.

The unit tests in this area in the process of being refactored, to enable them to be more generalized and table-driven. These will be posted later this week.

There's a lot more detail I can go into about the design, but it's getting late here in SF so I'll pick this up tomorrow 馃殌

Depends on:

@cfromknecht cfromknecht force-pushed the cfromknecht:wtclient branch from 1ae1456 to eb519b5 Feb 8, 2019

@Roasbeef
Copy link
Member

left a comment

Things are really starting to come together now! I found it very illuminating to see all the components wired up together in this PR end to end. At times, it was a bit hard to see the bigger picture when reviewing an isolated package or interface outside of the main server/client loops. As there's a lot contained in this PR (draw the rest of the fkn owl), I'll be making a few more passes, paying particular attention to the tests, and also the interaction (edge cases, persistence, etc) in the main client.go file. The persistence and reliable handoffs aren't super clear in my mental model yet, but I think after another pass or two (and possibly clarifying questions), things'll become more clear on my end.

Show resolved Hide resolved watchtower/wtserver/server.go Outdated
Show resolved Hide resolved watchtower/wtserver/server.go Outdated
Show resolved Hide resolved watchtower/wtdb/client_state_update.go Outdated
Show resolved Hide resolved watchtower/wtclient/interface.go Outdated
Show resolved Hide resolved watchtower/wtclient/interface.go
Show resolved Hide resolved watchtower/wtclient/session_queue.go
Show resolved Hide resolved watchtower/wtclient/session_queue.go
Show resolved Hide resolved watchtower/wtclient/client.go
Show resolved Hide resolved watchtower/wtclient/client.go
Show resolved Hide resolved watchtower/wtclient/client.go Outdated
@wpaulino

This comment has been minimized.

Copy link
Collaborator

commented Mar 5, 2019

Needs a rebase.

@Roasbeef Roasbeef added this to the 0.6 milestone Mar 6, 2019

Show resolved Hide resolved watchtower/wtserver/server.go Outdated
Show resolved Hide resolved watchtower/wtdb/client_session.go
Show resolved Hide resolved watchtower/wtclient/interface.go

import "errors"

var (

This comment has been minimized.

Copy link
@Roasbeef

Roasbeef Mar 6, 2019

Member

Still here in latest diff.

Show resolved Hide resolved watchtower/wtclient/candidate_iterator.go
Show resolved Hide resolved watchtower/wtclient/session_negotiator.go
Show resolved Hide resolved watchtower/wtclient/task_pipeline.go
case wtwire.CodeOK:
lastApplied := stateUpdateReply.LastApplied
if lastApplied > stateUpdate.SeqNum {
// TODO(conner): borked watchtower?

This comment has been minimized.

Copy link
@Roasbeef

Roasbeef Mar 6, 2019

Member

At the very least, in the short term, should log this case and the one below.

// TODO(conner): store last applied for session
// TODO(conner): delete blob locally
q.queueCond.L.Lock()
q.cfg.ClientSession.SeqNum++

This comment has been minimized.

Copy link
@Roasbeef

Roasbeef Mar 6, 2019

Member

When does this state get updated on disk?

This comment has been minimized.

Copy link
@cfromknecht

cfromknecht Mar 6, 2019

Author Collaborator

latest version properly updates the db, you're correct that the first version did this assignment in memory ;)

Show resolved Hide resolved watchtower/wtclient/session_queue.go Outdated

@cfromknecht cfromknecht force-pushed the cfromknecht:wtclient branch 7 times, most recently from f2d6b21 to 9fe6f3b Mar 6, 2019

@wpaulino
Copy link
Collaborator

left a comment

Looking pretty good on first pass -- mostly have minor comments.

Show resolved Hide resolved watchtower/wtdb/client_session.go Outdated
Show resolved Hide resolved watchtower/wtdb/client_session.go Outdated
Show resolved Hide resolved watchtower/wtdb/client_state_update.go Outdated
Show resolved Hide resolved watchtower/wtclient/interface.go Outdated
Show resolved Hide resolved watchtower/wtclient/interface.go Outdated
Show resolved Hide resolved watchtower/wtclient/client.go
Show resolved Hide resolved watchtower/wtmock/client_db.go
Show resolved Hide resolved watchtower/wtmock/peer.go
Show resolved Hide resolved watchtower/wtserver/server_test.go Outdated
Show resolved Hide resolved watchtower/wtserver/server.go
@Roasbeef
Copy link
Member

left a comment

LGTM 馃椏

Spotted a panic blocker in the recent review, once that's addressed, it's ready to land on my end.

// The tower rejected the session because of the reward rate. If
// we didn't request a reward session, we'll treat this as a
// permanent tower failure.
if !policy.BlobType.Has(blob.FlagReward) {

This comment has been minimized.

Copy link
@Roasbeef

Roasbeef Mar 15, 2019

Member

Possible that it could, but atm a config change like that would require a restart. In CheckRemoteInit we'd ensure that the server supported w/e features we deem necessary. If they say we have a bad reward rate, and the we didn't specify one at all, then it would likely point to a bug in the tower server.

Show resolved Hide resolved watchtower/wtclient/client_test.go Outdated
Show resolved Hide resolved watchtower/wtserver/server.go

cfromknecht added some commits Mar 15, 2019

cfromknecht added some commits Mar 15, 2019

@cfromknecht cfromknecht force-pushed the cfromknecht:wtclient branch from 9fe6f3b to 05e3a7f Mar 15, 2019

@cfromknecht

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 15, 2019

@Roasbeef @wpaulino comments addressed, ptal

@wpaulino
Copy link
Collaborator

left a comment

LGTM 鈱氾笍

@Roasbeef Roasbeef merged commit ec62104 into lightningnetwork:master Mar 16, 2019

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.5%) to 60.671%
Details

@cfromknecht cfromknecht deleted the cfromknecht:wtclient branch Mar 19, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can鈥檛 perform that action at this time.