Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[watchtower/server] Server-Side Wire Protocol #1535

Merged
merged 10 commits into from Oct 25, 2018

Conversation

cfromknecht
Copy link
Collaborator

@cfromknecht cfromknecht commented Jul 11, 2018

This PR builds on #1512, by adding the server-side handling of watchtower clients. This allows clients to negotiate sessions and send state updates. The wtwire package has been extended with more concrete fail codes, and the SessionInitReply has been altered to return an arbitrary Data field (instead of RewardPubKey), which seems more useful for negotiating session parameters.

The server's db and clients have been mocked out and protected using a build flag, so that we can have adequate black-box testing of the server's behavior. A fairly comprehensive set of test vectors have been added for state update sequences and checking that the errors from the database are being converted into the appropriate wire fail codes.

Things not in the this PR:

  • watchtower acceptance policy, current accepts any new session id
  • cleaning up old sessions, have some ideas on how to do this

@cfromknecht cfromknecht added watchtower wtserver wtwire needs review labels Jul 11, 2018
@cfromknecht cfromknecht force-pushed the wtwire-server branch 4 times, most recently from 452ed56 to 793cbf4 Compare Jul 11, 2018
@Roasbeef Roasbeef added the P2 label Jul 12, 2018
@Roasbeef Roasbeef added this to the 0.5 milestone Jul 17, 2018
@cfromknecht
Copy link
Collaborator Author

@cfromknecht cfromknecht commented Jul 19, 2018

rebased on latest #1512

Copy link
Member

@Roasbeef Roasbeef left a comment

So stoked to finally get the initial MVP watch tower code up! I've completed an initial review and dint' have any substantial comments. Most of the comments can be converted into TODO's as they aren't considered blocking critical for this PR.

I really dig the interface-by-first-principles approach that was carried out in the PR as it allows one to write completely "pure" unit tests, so no need to worry about creating a test database or anything like that. This is generally one item we can improve on within lnd as a hole. I think the major culprits are: the db, the peer struct, and at times the channel state machine itself.


// NewAddress is used to generate reward addresses, where a cut of
// successfully sent funds can be received.
NewAddress func() (btcutil.Address, error)
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

func New(cfg *Config) (*server, error) {
// Create a brontide listener for each of the configured addrs.
listeners := make([]net.Listener, len(cfg.ListenAddrs))
for _, addr := range cfg.ListenAddrs {
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a blocker, but the config should also eventually integrate the tor package to allow towers to listen of onion services. I don't think it's a blocker for MVP towers, but is worthy of a TOOD/issue now.

Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can actually just pass in the listeners directly even, this would remove any internal details from the watch tower server itself.

Copy link
Collaborator Author

@cfromknecht cfromknecht Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah passing in the listeners sounds like a good approach. i have the tor configuration done in the top level watchtower package, so server logic should be agnostic to network :)

)
}

rewardAddrBytes := []byte(rewardAddress.EncodeAddress())
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can save a bit of space where by using a signalling protocol to encode the addresses (so byte version, then raw bytes) rather than sending the full strings.

Copy link
Collaborator Author

@cfromknecht cfromknecht Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what version byte would we use? can we use ScriptAddress() directly?

)

// BreachHintSize parses 16-bytes from a txid.
const BreachHintSize = 16
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

godoc comment is off.

// AcceptUpdateSequence validates that a state update's sequence number and last
// applied are valid given our past history with the client. These checks ensure
// that clients are properly in sync and following the update protocol properly.
// If t
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fragment is cut off here.

Copy link
Collaborator Author

@cfromknecht cfromknecht Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

case *wtwire.SessionInit:
// Ensure SessionInit can only be sent as the first
// message.
if stateUpdateOnlyMode {
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the session for a client expires, then we force it to reconnect and initiate a new session? Why not just allow them to continue using the same connection?

Copy link
Collaborator Author

@cfromknecht cfromknecht Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition enforces that a SessionInit is only sent as the first message, and cannot follow a StateUpdate. For a single conn, we permit:

  • 1 SessionInit
  • n StateUpdates

When parsing the first message, we don't know (without querying the db) which case we're dealing with. After finishing the first loop, this becomes true to prevent an ordering of StateUpdate+, SessionInit.

If the connection expires, the client can resume an existing session using the same session keypair. Rationale for not having long-lived connections is it makes it more difficult for the tower to track when clients are on/offline, and also prevents clients from consuming tower resources endlessly.

// A StateUpdate indicates an existing client attempting to
// back-up a revoked commitment state.
case *wtwire.StateUpdate:
log.Infof("Received SessionUpdate from %s, seqnum=%d "+
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll likely eventually want to demote this to a debug, and later trace, but for now it'll likely be helpful in finding bugs in the initial implementation.

// addPeer stores a client in the server's client map. An error is returned if a
// client with the same session id already exists.
func (s *server) addPeer(id *wtdb.SessionID, peer Peer) error {
s.clientMtx.Lock()
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could just use a defer here to make things a bit easier ;)

Copy link
Collaborator Author

@cfromknecht cfromknecht Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

// Assert that the server closes the connection after processing
// the SessionInit. We sleep to ensure we don't check too soon,
// as the check also attempts to close the connection.
time.Sleep(2 * timeoutDuration)
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideaaally we remove this sleep in the future, but don't think it's a strong blocker on the current PR.

Copy link
Collaborator Author

@cfromknecht cfromknecht Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed all the sleeps, also made the tests faster


// Check that the server closed the connection used to register
// the session.
time.Sleep(2 * timeoutDuration)
Copy link
Member

@Roasbeef Roasbeef Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment here, ideally this is event driven rather than relying on a sleep.

@Roasbeef Roasbeef added first pass review done and removed needs review labels Jul 24, 2018
Copy link
Collaborator

@halseth halseth left a comment

woops, old lingering review. Posting and checking newest changes.


// ReadTimeout specifies how long a client may go without sending a
// message.
ReadTimeout time.Duration
Copy link
Collaborator

@halseth halseth Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are ballpark values for these timeouts?

Copy link
Collaborator Author

@cfromknecht cfromknecht Oct 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was thinking about 15-30s, to allow the watchtower to clean up resources if the client disappears


log.Infof("Accepted incoming peer %s@%s",
id, peer.RemoteAddr())
defer log.Infof("Releasing incoming peer %s@%s",
Copy link
Collaborator

@halseth halseth Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this together withe the defer removePeer?

// A StateUpdate indicates an existing client attempting to
// back-up a revoked commitment state.
case *wtwire.StateUpdate:
log.Infof("Received SessionUpdate from %s, seqnum=%d "+
Copy link
Collaborator

@halseth halseth Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make Debug or Trace?

import (
"encoding/hex"

"github.com/roasbeef/btcd/chaincfg/chainhash"
Copy link
Collaborator

@halseth halseth Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be btcsuite now. Maybe just needs a rebase? :)

Copy link
Collaborator Author

@cfromknecht cfromknecht Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@cfromknecht cfromknecht force-pushed the wtwire-server branch 2 times, most recently from 027f495 to c9aff17 Compare Jul 24, 2018
@cfromknecht
Copy link
Collaborator Author

@cfromknecht cfromknecht commented Jul 24, 2018

Thank you @Roasbeef @halseth almost all the feedback has been addressed

Last major thing seem to be fleshing out the reward addr encoding

@Roasbeef Roasbeef removed this from the 0.5 milestone Aug 1, 2018
@Roasbeef Roasbeef added this to the 0.6 milestone Aug 1, 2018
@cfromknecht cfromknecht force-pushed the wtwire-server branch 3 times, most recently from 688dea1 to 43fa563 Compare Sep 7, 2018
@cfromknecht cfromknecht force-pushed the wtwire-server branch 3 times, most recently from 9982ee4 to b9c30c2 Compare Oct 24, 2018
Copy link
Member

@Roasbeef Roasbeef left a comment

LGTM 🎸

Left a few comments during this final pass over. However, I don't consider any of the comments to be blocking.

"github.com/btcsuite/btcd/btcec"
)

// SessionIDSize is 33-bytes; it is a serialized, compressed public key.
Copy link
Member

@Roasbeef Roasbeef Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in commit:

adds SessoinID

Should be SessionID

return nil
}

func (s *Server) readMessage(peer Peer) (wtwire.Message, error) {
Copy link
Member

@Roasbeef Roasbeef Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing godoc comment here and above.

// log is a logger that is initialized with no output filters. This
// means the package will not perform any logging by default until the caller
// requests it.
var log btclog.Logger
Copy link
Member

@Roasbeef Roasbeef Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new logging format, I think this still needs to register itself as a logger in log.go within the main lnd directory.

Copy link
Collaborator Author

@cfromknecht cfromknecht Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was thinking about doing that when this is no longer dead code, but happy to do it now


// Interface represents a simple, listen-only service that accepts watchtower
// clients, and provides responses to their requests.
type Interface interface {
Copy link
Member

@Roasbeef Roasbeef Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a naming collision here ;)

Although it's not so bad when using the full import: server.Interface. Although in isolation, the server package name may conflict a bit with local variables we have in lnd or w/e else since it's so plain.

Copy link
Collaborator Author

@cfromknecht cfromknecht Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah was thinking about renaming it to something better, any suggetions??

Copy link
Member

@Roasbeef Roasbeef Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TowerInterface? Lol, not blocking atm, so we can pick a better name in the future.


// Peer is the primary interface used to abstract watchtower clients.
type Peer interface {
io.WriteCloser
Copy link
Member

@Roasbeef Roasbeef Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -0,0 +1,634 @@
// +build dev
Copy link
Member

@Roasbeef Roasbeef Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the build tags necessary here? As is, a simple go test -v in this directory won't execute these tests. We can't always assume that the user is using our set of makefile tools. Seems this is just to have access to the mocked variants of the major interfaces?

Copy link
Collaborator Author

@cfromknecht cfromknecht Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the linter (and other go tools) complain because the mock classes aren't accessible w/o the dev tag, it is nice that putting them behind a flag keeps mock structs from being rendered on godoc.org

can't help those that do not follow the install docs :P a sizable portion of the tests won't pass/compile without the dev tag anyway, seems they're already out of luck

@Roasbeef Roasbeef merged commit 5c6c966 into lightningnetwork:master Oct 25, 2018
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
first pass review done P2 watchtower wtserver wtwire
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants