-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data loss protection #1364
Data loss protection #1364
Conversation
After doing a rebase to master, the docker image does not compile. Can you take a look please 👍 |
Can't wait for this to be merged in - this will allow BLW wallet to use LND nodes! |
Any updates on this? |
You're free to help test and review Linus
…On Wed, Jul 4, 2018, 2:07 PM Linus Casassa ***@***.***> wrote:
Any updates on this?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1364 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AA87Lh_ILbFtX8Xd9yJ8-xvYDr1C_xO2ks5uDRJygaJpZM4Uira9>
.
|
I have a pending review which should be posted soon.
…On Wed, Jul 4, 2018, 2:27 PM Olaoluwa Osuntokun ***@***.***> wrote:
You're free to help test and review Linus
On Wed, Jul 4, 2018, 2:07 PM Linus Casassa ***@***.***>
wrote:
> Any updates on this?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#1364 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AA87Lh_ILbFtX8Xd9yJ8-xvYDr1C_xO2ks5uDRJygaJpZM4Uira9>
> .
>
|
I'm happy to test. But I get an error when compiling the docker image after doing a rebase from master. :/ If I don't do the rebase I get the offline channel error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent PR! This is one of the last few things we're missing in lnd
in terms of additional safety measures to ensure that our user's funds ah safu!
The architecture and the structure of the PR, along with the set of tests look pretty good to me at first pass. Many of my comments are style related, and pointing out some measures in the PR that should break after a rebase to the current master (as this PR is a month out of date at present).
lnd_test.go
Outdated
@@ -2236,6 +2236,11 @@ func testChannelForceClosure(net *lntest.NetworkHarness, t *harnessTest) { | |||
carolExpectedBalance, | |||
carolBalResp.ConfirmedBalance) | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be the case that this commit is no longer needed after rebase.
@@ -588,7 +597,7 @@ func (c *chainWatcher) dispatchRemoteForceClose(commitSpend *chainntnfs.SpendDet | |||
// channel on-chain. | |||
uniClose, err := lnwallet.NewUnilateralCloseSummary( | |||
c.cfg.chanState, c.cfg.signer, c.cfg.pCache, commitSpend, | |||
remoteCommit, isRemotePendingCommit, | |||
remoteCommit, commitPoint, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're already passing in the channel state, then why do we need to pass in this point as well?
func NewUnilateralCloseSummary(chanState *channeldb.OpenChannel, signer Signer, | ||
pCache PreimageCache, commitSpend *chainntnfs.SpendDetail, | ||
remoteCommit channeldb.ChannelCommitment, | ||
remotePendingCommit bool) (*UnilateralCloseSummary, error) { | ||
commitPoint *btcec.PublicKey) (*UnilateralCloseSummary, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment here, we're already passing in the entire state.
lnd_test.go
Outdated
// Carol will be the breached party. We set --nolisten to ensure Bob | ||
// won't be able to connect to her and trigger the channel data | ||
// protection logic automatically. | ||
carol, err := net.NewNode("Carol", []string{"--debughtlc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit w.r.t line wrapping here.
lnd_test.go
Outdated
@@ -4981,27 +5007,44 @@ func testRevokedCloseRetributionZeroValueRemoteOutput(net *lntest.NetworkHarness | |||
|
|||
// Since we'd like to test some multi-hop failure scenarios, we'll | |||
// introduce another node into our test network: Carol. | |||
carol, err := net.NewNode("Carol", []string{"--debughtlc", "--hodl.exit-settle"}) | |||
carol, err := net.NewNode("Carol", []string{"--debughtlc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit w.r.t line wrapping here.
// state, we'll just pass an empty commitment. Note | ||
// that this means we won't be able to recover any HTLC | ||
// funds. | ||
// TODO(halseth): can we try to recover some HTLCs? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe...only if we have partial state, and this isn't actually the result of us restoring w/ seed+static-backups.
// funds. | ||
// TODO(halseth): can we try to recover some HTLCs? | ||
err = c.dispatchRemoteForceClose( | ||
commitSpend, channeldb.ChannelCommitment{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah now I see why the commitPoint
and chan state are passed distinctly.
lnd_test.go
Outdated
}() | ||
|
||
// Dave will be the party losing his state. | ||
dave, err := net.NewNode("Dave", []string{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can just pass a nil
here as the second param.
lnd_test.go
Outdated
|
||
// We must let Dave communicate with Carol before they are able to open | ||
// channel, so we connect Dave and Carol, | ||
if err := net.ConnectNodes(ctxb, carol, dave); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This'll fail after rebasing to master as carol
was started in --nolisten
mode.
lnd_test.go
Outdated
block = mineBlocks(t, net, 1)[0] | ||
assertTxInBlock(t, block, daveSweep) | ||
|
||
// Now Dave should considere the channel fully closed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
considere -> consider?
116e18d
to
45d0661
Compare
Rebased and addressed comments. Also increased the cases we force close to include the remote giving us invalid data. I will create a few issues to handle the remaining follow-ups, most notably prohibiting the user from force closing a desynced channel, and adding chanSync message resend. PTAL |
45d0661
to
04bf19a
Compare
04bf19a
to
ac5f36e
Compare
channeldb/channel.go
Outdated
c.Lock() | ||
defer c.Unlock() | ||
|
||
if err := c.Db.Update(func(tx *bolt.Tx) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be great to use putChanStatus
here to reduce code duplication, signature might be more like:
func (c *OpenChannel) putChanStatus(tx *bolt.Tx, status ChannelStatus) (ChannelStatus, error)
for easier composition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried doing this, but this turned out to now reduce duplication, since then the DB must be View
ed from the caller, and the bucket must be retrieved twice in MarkDataLoss
. The other suggestion were added, PTAL :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha, thanks for trying anyway 😂
In this commit we modify the integration tests slightly, by setting the parties that gets breached during the breach tests to --nolisten. We do this to ensure that once the data protection logic is in place, they nodes won't automatically connect, detect the state desync and recover before we are able to trigger the breach.
8a24196
to
79197ee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 💾🔒 just needs a squash before merging
75b4804
to
717c38c
Compare
Since the ChanStatus field can be changed from concurrent callers, we make it unexported and add the method ChanStatus() for safe retrieval.
This commit defines a few new errors that we can potentially encounter during channel reestablishment: * ErrInvalidLocalUnrevokedCommitPoint * ErrCommitSyncLocalDataLoss * ErrCommitSyncRemoteDataLoss in addition to the already defined errors * ErrInvalidLastCommitSecret * ErrCannotSyncCommitChains
This commit enumerates the various error cases we can encounter when we compare our local commit chain to the view the remote communicates to us via msg.RemoteCommitTailHeight. We now compare this height to our local tail height (note that there's never a local "tip" at this point), returning relevant error in case of a unrecoverable desync, and re-send a revocation in case we owe one.
This commit enumerates the various error cases we can encounter when we compare our remote commit chain to the view the remote communicates to us via msg.NextLocalCommitHeight. We now compare this height to our remote tail and tip height, returning relevant error in case of a unrecoverable desync, and re-send a commitment signature (including log updates) in case we owe one.
This commit adds a check for the LocalUnrevokedCommitPoint sent to us by the remote during channel reestablishment, ensuring it is the same point as they have previously sent us.
This commit makes the link inspect the error encountered during channel sync, force closing the channel if we detect a remote data loss.
…oss commitPoint This commit makes the chainwatcher attempt to dispatch a remote close when it detects a remote state with a state number higher than our known remote state. This can mean that we lost some state, and we check the database for (hopefully) a data loss commit point retrieved during channel sync with the remote peer. If this commit point is found in the database we use it to try to recover our funds from the commitment.
This commit adds the integration test testDataLossProtection, that ensures that when a node loses state, the channel counterparty will force close the channel, and they both can recover their funds.
717c38c
to
afccca5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ready to merge 🎉
Nice work here!! Thanks!! |
Sorry for being late to the party, but just today bitcoin lightning wallet refuses to connect with "data loss protection not supported by this peer". Is there an option to enable the feature? From what I can tell it is on by default since this merge. |
@mcduck76 I'd report that to BLW, looks like they aren't interpreting feature bits correctly. |
I can connect from BLW to outher LND nodes. I really can't think of what is wrong with my installation, other nodes can connect fine. |
This PR implements the logic required for "data loss protection", as described in BOLT#2: https://github.com/lightningnetwork/lightning-rfc/blob/master/02-peer-protocol.md#message-retransmission
When syncing channel states we will now detect the cases where we probably have lost our local state, meaning broadcasting our commitment would be unsafe as it could be considered a breach. Instead we store the
my_current_per_commitment_point
sent to us by the remote in the database. This commitment point can later be used to reclaim our funds if the remote party decides to unilaterally close the channel using the corresponding state.If we detect that the remote party probably has lost state, we'll be a good citizen and force close the channel using our latest commitment.
@Roasbeef: what is meant by:
?
Fixes #1131