Data loss protection #1364

halseth · 2018-06-11T13:18:14Z

This PR implements the logic required for "data loss protection", as described in BOLT#2: https://github.com/lightningnetwork/lightning-rfc/blob/master/02-peer-protocol.md#message-retransmission

When syncing channel states, we now detect the cases where we have probably lost our local state, which means broadcasting our commitment would be unsafe because it could be treated as a breach. Instead, we store the my_current_per_commitment_point sent to us by the remote in the database. This commitment point can later be used to reclaim our funds if the remote party unilaterally closes the channel using the corresponding state.

If we detect that the remote party has probably lost state, we'll be a good citizen and force close the channel using our latest commitment.

@Roasbeef: what is meant by:

> In order to ensure we can carry out this process reliably we may need to ensure that the remote party first sends their re-sync message before we do. We can enforce this by requiring the initiator to send their message first.

?

Fixes #1131

lcasassa · 2018-06-17T01:25:07Z

After doing a rebase to master, the docker image does not compile. Can you take a look, please? 👍

irekzielinski · 2018-06-18T20:22:04Z

Can't wait for this to be merged in - this will allow BLW wallet to use LND nodes!
Thank you for your effort guys!

lcasassa · 2018-07-04T19:07:28Z

Any updates on this?

Roasbeef · 2018-07-04T19:27:59Z

You're free to help test and review Linus

Roasbeef · 2018-07-04T19:35:43Z

I have a pending review which should be posted soon.

lcasassa · 2018-07-05T00:43:46Z

I'm happy to test. But I get an error when compiling the docker image after doing a rebase from master. :/ If I don't do the rebase I get the offline channel error.

Roasbeef

Excellent PR! This is one of the last few things we're missing in lnd in terms of additional safety measures to ensure that our users' funds ah safu!

The architecture and the structure of the PR, along with the set of tests look pretty good to me at first pass. Many of my comments are style related, and pointing out some measures in the PR that should break after a rebase to the current master (as this PR is a month out of date at present).

Roasbeef · 2018-06-28T03:09:04Z

lnd_test.go

@@ -2236,6 +2236,11 @@ func testChannelForceClosure(net *lntest.NetworkHarness, t *harnessTest) {
 			carolExpectedBalance,
 			carolBalResp.ConfirmedBalance)
 	}
+


Should be the case that this commit is no longer needed after rebase.

Roasbeef · 2018-06-28T03:10:17Z

contractcourt/chain_watcher.go

@@ -588,7 +597,7 @@ func (c *chainWatcher) dispatchRemoteForceClose(commitSpend *chainntnfs.SpendDet
 	// channel on-chain.
 	uniClose, err := lnwallet.NewUnilateralCloseSummary(
 		c.cfg.chanState, c.cfg.signer, c.cfg.pCache, commitSpend,
-		remoteCommit, isRemotePendingCommit,
+		remoteCommit, commitPoint,


If we're already passing in the channel state, then why do we need to pass in this point as well?

Roasbeef · 2018-06-28T03:12:25Z

lnwallet/channel.go

 func NewUnilateralCloseSummary(chanState *channeldb.OpenChannel, signer Signer,
 	pCache PreimageCache, commitSpend *chainntnfs.SpendDetail,
 	remoteCommit channeldb.ChannelCommitment,
-	remotePendingCommit bool) (*UnilateralCloseSummary, error) {
+	commitPoint *btcec.PublicKey) (*UnilateralCloseSummary, error) {


Similar comment here, we're already passing in the entire state.

Roasbeef · 2018-06-28T03:12:46Z

lnd_test.go

+	// Carol will be the breached party. We set --nolisten to ensure Bob
+	// won't be able to connect to her and trigger the channel data
+	// protection logic automatically.
+	carol, err := net.NewNode("Carol", []string{"--debughtlc",


Style nit w.r.t line wrapping here.

Roasbeef · 2018-06-28T03:12:54Z

lnd_test.go

@@ -4981,27 +5007,44 @@ func testRevokedCloseRetributionZeroValueRemoteOutput(net *lntest.NetworkHarness

 	// Since we'd like to test some multi-hop failure scenarios, we'll
 	// introduce another node into our test network: Carol.
-	carol, err := net.NewNode("Carol", []string{"--debughtlc", "--hodl.exit-settle"})
+	carol, err := net.NewNode("Carol", []string{"--debughtlc",


Style nit w.r.t line wrapping here.

Roasbeef · 2018-07-07T01:07:56Z

contractcourt/chain_watcher.go

+			// state, we'll just pass an empty commitment. Note
+			// that this means we won't be able to recover any HTLC
+			// funds.
+			// TODO(halseth): can we try to recover some HTLCs?


Maybe...only if we have partial state, and this isn't actually the result of us restoring w/ seed+static-backups.

Roasbeef · 2018-07-07T01:08:19Z

contractcourt/chain_watcher.go

+			// funds.
+			// TODO(halseth): can we try to recover some HTLCs?
+			err = c.dispatchRemoteForceClose(
+				commitSpend, channeldb.ChannelCommitment{},


Ah now I see why the commitPoint and chan state are passed distinctly.

Roasbeef · 2018-07-07T01:08:36Z

lnd_test.go

+	}()
+
+	// Dave will be the party losing his state.
+	dave, err := net.NewNode("Dave", []string{})


Can just pass a nil here as the second param.

Roasbeef · 2018-07-07T01:08:50Z

lnd_test.go

+
+	// We must let Dave communicate with Carol before they are able to open
+	// channel, so we connect Dave and Carol,
+	if err := net.ConnectNodes(ctxb, carol, dave); err != nil {


This'll fail after rebasing to master as carol was started in --nolisten mode.

Roasbeef · 2018-07-07T01:09:44Z

lnd_test.go

+	block = mineBlocks(t, net, 1)[0]
+	assertTxInBlock(t, block, daveSweep)
+
+	// Now Dave should considere the channel fully closed.


considere -> consider?

halseth · 2018-07-12T11:03:02Z

Rebased and addressed comments. Also increased the cases we force close to include the remote giving us invalid data.

I will create a few issues to handle the remaining follow-ups, most notably prohibiting the user from force closing a desynced channel, and adding chanSync message resend.

PTAL

cfromknecht · 2018-07-31T01:08:03Z

channeldb/channel.go

+	c.Lock()
+	defer c.Unlock()
+
+	if err := c.Db.Update(func(tx *bolt.Tx) error {


would be great to use putChanStatus here to reduce code duplication, signature might be more like:

func (c *OpenChannel) putChanStatus(tx *bolt.Tx, status ChannelStatus) (ChannelStatus, error)

for easier composition

Tried doing this, but this turned out to not reduce duplication, since then the DB must be Viewed from the caller, and the bucket must be retrieved twice in MarkDataLoss. The other suggestions were added, PTAL :)

gotcha, thanks for trying anyway 😂

In this commit we modify the integration tests slightly, by setting the parties that get breached during the breach tests to --nolisten. We do this to ensure that once the data protection logic is in place, the nodes won't automatically connect, detect the state desync and recover before we are able to trigger the breach.

cfromknecht

LGTM! 💾🔒 just needs a squash before merging

Since the ChanStatus field can be changed from concurrent callers, we make it unexported and add the method ChanStatus() for safe retrieval.
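A minimal sketch of the pattern this commit message describes: the status field is unexported and only read through an accessor that takes a lock. The struct layout, the embedded sync.RWMutex, the status values and the setStatus helper are assumptions made for illustration; they are not copied from channeldb.

```go
package main

import (
	"fmt"
	"sync"
)

// ChannelStatus stands in for the status type in channeldb. The values
// below are illustrative only.
type ChannelStatus uint8

const (
	Default ChannelStatus = iota
	Borked
	LocalDataLoss
)

// OpenChannel is a pared-down, hypothetical version of the real struct.
type OpenChannel struct {
	sync.RWMutex

	// chanStatus is unexported, so callers outside the package can only
	// reach it through methods that hold the lock.
	chanStatus ChannelStatus
}

// ChanStatus returns the current status while holding the read lock.
func (c *OpenChannel) ChanStatus() ChannelStatus {
	c.RLock()
	defer c.RUnlock()

	return c.chanStatus
}

// setStatus is a hypothetical writer used only in this sketch. It takes
// the write lock before updating the field.
func (c *OpenChannel) setStatus(s ChannelStatus) {
	c.Lock()
	defer c.Unlock()

	c.chanStatus = s
}

func main() {
	c := &OpenChannel{}

	// Several goroutines write and read concurrently without racing,
	// because every access goes through the locked methods.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.setStatus(Borked)
			_ = c.ChanStatus()
		}()
	}
	wg.Wait()

	fmt.Println(c.ChanStatus() == Borked)
}
```

Running this under `go run -race` is one way to check that no unsynchronized access remains.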

This commit defines a few new errors that we can potentially encounter during channel reestablishment:

* ErrInvalidLocalUnrevokedCommitPoint
* ErrCommitSyncLocalDataLoss
* ErrCommitSyncRemoteDataLoss

in addition to the already defined errors

* ErrInvalidLastCommitSecret
* ErrCannotSyncCommitChains

This commit enumerates the various error cases we can encounter when we compare our local commit chain to the view the remote communicates to us via msg.RemoteCommitTailHeight. We now compare this height to our local tail height (note that there's never a local "tip" at this point), returning the relevant error in case of an unrecoverable desync, and re-sending a revocation in case we owe one.
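The comparison described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the code from lnwallet/channel.go: the syncAction type, its values and checkLocalChain are hypothetical names, and the real code also verifies commit secrets, which is left out here.

```go
package main

import "fmt"

// syncAction is a hypothetical label for the outcome of the comparison.
type syncAction string

const (
	inSync           syncAction = "in sync"
	resendRevocation syncAction = "resend revocation"
	localDataLoss    syncAction = "local data loss"
	remoteDataLoss   syncAction = "remote data loss"
)

// checkLocalChain compares the remote's view of our commitment chain
// (remoteCommitTailHeight, taken from their reestablish message) with our
// own local tail height.
func checkLocalChain(remoteCommitTailHeight, localTailHeight uint64) syncAction {
	switch {
	// The remote believes our chain is further along than we do, so we
	// have probably lost state.
	case remoteCommitTailHeight > localTailHeight:
		return localDataLoss

	// The remote is more than one state behind our view, so they have
	// probably lost state.
	case remoteCommitTailHeight+1 < localTailHeight:
		return remoteDataLoss

	// The remote is exactly one state behind. Our last revocation most
	// likely never arrived, so we owe them a retransmission.
	case remoteCommitTailHeight+1 == localTailHeight:
		return resendRevocation

	// The only remaining possibility is that both heights are equal.
	default:
		return inSync
	}
}

func main() {
	fmt.Println(checkLocalChain(5, 5))
	fmt.Println(checkLocalChain(4, 5))
	fmt.Println(checkLocalChain(9, 5))
	fmt.Println(checkLocalChain(2, 5))
}
```

Writing the "behind" checks as `remote+1 < local` rather than `remote < local-1` avoids an unsigned underflow when the local height is zero.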

This commit enumerates the various error cases we can encounter when we compare our remote commit chain to the view the remote communicates to us via msg.NextLocalCommitHeight. We now compare this height to our remote tail and tip height, returning the relevant error in case of an unrecoverable desync, and re-sending a commitment signature (including log updates) in case we owe one.
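A rough sketch of the remote-chain comparison, under the assumption that the remote tip is either equal to the remote tail or exactly one state ahead of it (a pending, unrevoked commitment). The names syncAction, checkRemoteChain and the action values are hypothetical, and the exact boundary conditions in the PR may differ from this illustration.

```go
package main

import "fmt"

// syncAction is a hypothetical label for the outcome of the comparison.
type syncAction string

const (
	inSync          syncAction = "in sync"
	resendCommitSig syncAction = "resend commit sig"
	localDataLoss   syncAction = "local data loss"
	remoteDataLoss  syncAction = "remote data loss"
	cannotSync      syncAction = "cannot sync"
)

// checkRemoteChain compares the height of the next commitment the remote
// expects to receive (nextLocalCommitHeight, from their reestablish
// message) with our view of their chain's tail and tip.
func checkRemoteChain(nextLocalCommitHeight, remoteTailHeight,
	remoteTipHeight uint64) syncAction {

	switch {
	// They expect a commitment beyond anything we have created, so we
	// have probably lost state.
	case nextLocalCommitHeight > remoteTipHeight+1:
		return localDataLoss

	// They expect a commitment we know they already revoked, so they
	// have probably lost state.
	case nextLocalCommitHeight < remoteTailHeight+1:
		return remoteDataLoss

	// They expect exactly the next commitment we would create.
	case nextLocalCommitHeight == remoteTipHeight+1:
		return inSync

	// They expect the pending commitment we already signed, so the
	// signature (and its log updates) must be sent again.
	case nextLocalCommitHeight == remoteTipHeight &&
		remoteTipHeight != remoteTailHeight:
		return resendCommitSig

	// Any other combination means our own view is inconsistent.
	default:
		return cannotSync
	}
}

func main() {
	fmt.Println(checkRemoteChain(6, 5, 5))
	fmt.Println(checkRemoteChain(6, 5, 6))
	fmt.Println(checkRemoteChain(9, 5, 5))
	fmt.Println(checkRemoteChain(3, 5, 5))
}
```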

This commit adds a check for the LocalUnrevokedCommitPoint sent to us by the remote during channel reestablishment, ensuring it is the same point as they have previously sent us.
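As an illustration of this check, the sketch below compares a stored commitment point with the one received during reestablishment. It works on serialized byte slices to stay self-contained, whereas the PR's code passes *btcec.PublicKey values; verifyCommitPoint and the error text are assumptions, only the error name comes from the PR.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// ErrInvalidLocalUnrevokedCommitPoint is named after the error added in
// this PR. The message text here is made up for the sketch.
var ErrInvalidLocalUnrevokedCommitPoint = errors.New(
	"unrevoked commit point is invalid")

// verifyCommitPoint is a hypothetical helper. It checks that the point
// the remote sent during reestablishment matches the serialized point we
// stored when they first sent it to us.
func verifyCommitPoint(stored, received []byte) error {
	if !bytes.Equal(stored, received) {
		return ErrInvalidLocalUnrevokedCommitPoint
	}

	return nil
}

func main() {
	// Placeholder bytes, not real curve points.
	stored := []byte{0x02, 0xaa, 0xbb}

	fmt.Println(verifyCommitPoint(stored, []byte{0x02, 0xaa, 0xbb}))
	fmt.Println(verifyCommitPoint(stored, []byte{0x03, 0xcc, 0xdd}))
}
```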

This commit makes the link inspect the error encountered during channel sync, force closing the channel if we detect a remote data loss.
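The error handling in the link might look roughly like the sketch below. The error names come from this PR, but the messages, the linkAction type and handleSyncError are hypothetical. Treating the two "invalid" errors as force-close cases is an assumption based on the later comment that force closing was extended to the remote giving us invalid data.

```go
package main

import (
	"errors"
	"fmt"
)

// Error names follow the PR. The message strings are assumptions.
var (
	ErrCommitSyncLocalDataLoss          = errors.New("possible local data loss")
	ErrCommitSyncRemoteDataLoss         = errors.New("possible remote data loss")
	ErrInvalidLastCommitSecret          = errors.New("invalid last commit secret")
	ErrInvalidLocalUnrevokedCommitPoint = errors.New("invalid commit point")
	ErrCannotSyncCommitChains           = errors.New("unable to sync chains")
)

// linkAction is a hypothetical description of what the link does next.
type linkAction string

const (
	continueNormally   linkAction = "continue"
	forceClose         linkAction = "force close"
	waitForRemoteClose linkAction = "wait for remote close"
	failLink           linkAction = "fail link"
)

// handleSyncError maps an error from channel sync to an action.
func handleSyncError(err error) linkAction {
	switch {
	case err == nil:
		return continueNormally

	// The remote lost state or sent invalid data. Our own state is
	// intact, so broadcasting our latest commitment is safe.
	case errors.Is(err, ErrCommitSyncRemoteDataLoss),
		errors.Is(err, ErrInvalidLastCommitSecret),
		errors.Is(err, ErrInvalidLocalUnrevokedCommitPoint):
		return forceClose

	// We lost state. Broadcasting our commitment could be treated as a
	// breach, so we rely on the remote closing the channel instead.
	case errors.Is(err, ErrCommitSyncLocalDataLoss):
		return waitForRemoteClose

	// Anything else is an unrecoverable sync failure.
	default:
		return failLink
	}
}

func main() {
	wrapped := fmt.Errorf("chan sync: %w", ErrCommitSyncRemoteDataLoss)

	fmt.Println(handleSyncError(nil))
	fmt.Println(handleSyncError(wrapped))
	fmt.Println(handleSyncError(ErrCommitSyncLocalDataLoss))
	fmt.Println(handleSyncError(ErrCannotSyncCommitChains))
}
```

The sketch uses errors.Is so wrapped errors still match; code from 2018 would have compared the sentinel values directly.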

…oss commitPoint

This commit makes the chainwatcher attempt to dispatch a remote close when it detects a remote state with a state number higher than our known remote state. This can mean that we lost some state, and we check the database for (hopefully) a data loss commit point retrieved during channel sync with the remote peer. If this commit point is found in the database we use it to try to recover our funds from the commitment.
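The decision the chain watcher makes here can be sketched as below. This is only an outline under stated assumptions: closeHandling, classifySpend and the use of a byte slice for the stored point are hypothetical, and the database lookup is replaced by a plain argument.

```go
package main

import "fmt"

// closeHandling is a hypothetical label for how a spend of the funding
// output by a remote commitment is treated in this sketch.
type closeHandling string

const (
	// notAhead: the spending state is not newer than what we know, so
	// the existing close and breach handling applies.
	notAhead closeHandling = "not ahead"

	// recoverWithStoredPoint: the state is newer than anything we know,
	// but a commit point was stored during channel sync.
	recoverWithStoredPoint closeHandling = "recover with stored point"

	// cannotRecover: the state is newer and no commit point is stored.
	cannotRecover closeHandling = "cannot recover"
)

// classifySpend decides what to do with a broadcast remote commitment.
// dataLossPoint stands in for the result of the database lookup and is
// empty when nothing was stored.
func classifySpend(spendStateNum, highestKnownRemoteHeight uint64,
	dataLossPoint []byte) closeHandling {

	if spendStateNum <= highestKnownRemoteHeight {
		return notAhead
	}

	// The remote broadcast a state we have no record of, so we have
	// probably lost state. Without the stored commitments we cannot
	// reconstruct HTLC outputs, only the output paying directly to us.
	if len(dataLossPoint) == 0 {
		return cannotRecover
	}

	return recoverWithStoredPoint
}

func main() {
	point := []byte{0x02, 0xaa} // placeholder, not a real curve point

	fmt.Println(classifySpend(7, 7, nil))
	fmt.Println(classifySpend(12, 7, point))
	fmt.Println(classifySpend(12, 7, nil))
}
```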

This commit adds the integration test testDataLossProtection, that ensures that when a node loses state, the channel counterparty will force close the channel, and they both can recover their funds.

cfromknecht

Ready to merge 🎉

lcasassa · 2018-08-01T21:32:20Z

Nice work here!! Thanks!!

mcduck76 · 2019-02-06T11:17:42Z

Sorry for being late to the party, but just today bitcoin lightning wallet refuses to connect with "data loss protection not supported by this peer".
I am running current code base of lnd.

Is there an option to enable the feature? From what I can tell it is on by default since this merge.

Roasbeef · 2019-02-06T21:11:38Z

@mcduck76 I'd report that to BLW, looks like they aren't interpreting feature bits correctly.

mcduck76 · 2019-02-14T11:57:08Z

> @mcduck76 I'd report that to BLW, looks like they aren't interpreting feature bits correctly.

I can connect from BLW to other LND nodes. I really can't think of what is wrong with my installation, other nodes can connect fine.
Are there any configuration or make options that I have omitted so there is no support for the feature?

irekzielinski mentioned this pull request Jun 11, 2018

LND and Eclair Wallet - infinity blinking status in Eclair as NORMAL->OFFLINE->NORMAL #1360

Closed

meshcollider added the safety and recovery labels Jun 15, 2018

Roasbeef requested changes Jul 7, 2018

Roasbeef mentioned this pull request Jul 10, 2018

Recovering bitcoin from missing channel #664

Closed

Roasbeef added the P2, needs review, and needs testing labels Jul 11, 2018

halseth force-pushed the data-loss-protect branch 4 times, most recently from 116e18d to 45d0661 Compare July 12, 2018 10:48

halseth force-pushed the data-loss-protect branch from 45d0661 to 04bf19a Compare July 12, 2018 11:08

halseth closed this Jul 12, 2018

halseth deleted the data-loss-protect branch July 12, 2018 13:27

halseth restored the data-loss-protect branch July 12, 2018 13:35

halseth reopened this Jul 12, 2018

Roasbeef added the first pass review done and needs rebase labels and removed the needs review label Jul 12, 2018

Roasbeef requested a review from wpaulino July 17, 2018 06:20

halseth force-pushed the data-loss-protect branch from 04bf19a to ac5f36e Compare July 17, 2018 08:49

halseth removed the needs rebase label Jul 17, 2018

cfromknecht reviewed Jul 31, 2018

halseth added 6 commits July 31, 2018 08:27

htlcswitch tests: add missing OnChannelFailure to test link configs

22e21da

lnwallet/channel: make NewUnilateralCloseSummary take commitPoint

06ceba4

lnwallet/channel test: take commitPoint in NewUnilateralCloseSummary

d9e9b61

contractcourt/chain_watcher: use commitPoint directly instead of isPendingCommit

2626bba

lnwallet/channel: extract local balance from spend instead of stored commit

eed052e

halseth force-pushed the data-loss-protect branch from 8a24196 to 79197ee Compare July 31, 2018 06:27

cfromknecht previously approved these changes Jul 31, 2018

halseth dismissed cfromknecht’s stale review via 717c38c July 31, 2018 12:59

halseth force-pushed the data-loss-protect branch from 75b4804 to 717c38c Compare July 31, 2018 12:59

halseth added 12 commits July 31, 2018 15:07

channeldb: make chanStatus unexported

ea6aca2

channeldb/channel: methods for marking borked+dataloss commitPoint in db

6cdf0e2

lnwallet/channel: reduce scope of commitSecretCorrect

3825ca7

lnwallet/channel test: rename ErrCommitSyncDataLoss->ErrCommitSyncLocalDataLoss

7fb3be8

lnwallet/channel: check validity of received commitPoint

78a4a15

lnwallet/channel test: add TestChanSyncFailure

410b730

htlcswitch/link: inspect sync errors, force close channel

ebed786

lnd_test: add testDataLossProtection

afccca5

halseth force-pushed the data-loss-protect branch from 717c38c to afccca5 Compare July 31, 2018 13:16

cfromknecht approved these changes Jul 31, 2018

Roasbeef merged commit 1e39cfc into lightningnetwork:master Aug 1, 2018
