Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi: resend shutdown on reestablish #8464

Merged
merged 8 commits into from Feb 21, 2024

Conversation

ellemouton
Copy link
Collaborator

@ellemouton ellemouton commented Feb 6, 2024

~ Replaces #8447 -> opening new PR since this approach is different & so comments wont all transfer. ~

Summary

In this PR we ensure that if a re-establish happens after shutdown is
sent but before fee negotiation starts, then shutdown is correctly
re-sent and the coop closure continues. This includes ensuring that
the delivery address sent in the first shutdown is the one that is
used in the final co-op close tx.

The issue

The issue is that we only mark a channel as ChanStatusCoopBroadcasted
at the time of actually doing the broadcast. This means that if there is a
re-connect between the shutdown message being sent and the coop
close tx being finalised then we will forget that we were in the middle of a
shutdown and will continue with normal operation on restart. This is not
compliant with the spec which says that on re-connect, if we previously
sent a shutdown then we MUST resend it again.

Fix overview

This adds a ShutdownInfo type which we persist when we send a shutdown message.
The existence of this on re-start means we should start the link in "shutdown mode" meaning
that we immediately disable HTLC adds in the outgoing direction & queue a shutdown message
once any pending CommitSig has been sent.

PR flow

  1. some type fixes, re-formatting & refactoring.
  2. Call DisableAdds from outside link.OnCommitOnce
  3. Demonstrate the bug by recreating it in an itest.
  4. Then, we add the new ShutdownInfo type along with encode/decode methods for it. Then
    methods to persist & fetch
  5. Provide the link with an existing Shutdown message on startup so that it knows to immediately block outgoing
    htlcs & to queue the shutdown message.
  6. Finally, we make use of the new methods & fix the itest to show the correct
    behaviour

Fixes #8397

Copy link

coderabbitai bot commented Feb 6, 2024

Important

Auto Review Skipped

Auto reviews are limited to the following labels: llm-review. Please add one of these labels to enable auto reviews.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository.

To trigger a single review, invoke the @coderabbitai review command.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

@ellemouton ellemouton marked this pull request as draft February 6, 2024 17:58
@ellemouton
Copy link
Collaborator Author

@Crypt-iQ & @ProofOfKeags - pinging here for an Approach ACK.
I think there is still a flake lurking that I need to find. So no full review needed just yet.

@Crypt-iQ
Copy link
Collaborator

Crypt-iQ commented Feb 6, 2024

Approach ACK 👍

@saubyk saubyk added this to the v0.18.0 milestone Feb 6, 2024
@ellemouton ellemouton force-pushed the resend-shutdown-2 branch 3 times, most recently from 0b404d0 to 6f3c0b0 Compare February 7, 2024 11:10
@ellemouton ellemouton marked this pull request as ready for review February 7, 2024 12:46
@ellemouton
Copy link
Collaborator Author

PTAL @Crypt-iQ & @ProofOfKeags 🙏

Copy link
Collaborator

@ProofOfKeags ProofOfKeags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think this is good. My main concern is how we square some of the choices made here with the v2 close so I've summoned @Roasbeef for that. I personally think this has priority even if we can't square it completely so I don't know that even if there are design conflicts that we would change how this works.

A lot of the problems here are related to the fact that we use the brontide to manage the integration of channel subcomponents which we shouldn't really do but that's way way way beyond the scope of what we need to address in this PR.

channeldb/channel.go Outdated Show resolved Hide resolved
channeldb/channel.go Show resolved Hide resolved
Comment on lines +1205 to +1208
script := []byte{1, 3, 4, 5}

// Create a ShutdownInfo struct.
shutdownInfo := NewShutdownInfo(script, locallyInitiated)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this works implies we need to fix our type system. I figured we'd have to newtype wrap the byte array or use a constructor that lifts it into the DeliveryAddress. The test overall seems fine but I don't think you should be able to construct a DeliveryAddress that looks like this.

Fixing it is better done in a diff PR though.

itest/lnd_coop_close_with_htlcs_test.go Show resolved Hide resolved
Comment on lines 276 to 282
// ShutdownMsg is an optional value that is set if, at the time of
// the link being started, persisted shutdown info was found for the
// channel. This value being set means that we previously sent a
// Shutdown message to our peer, and so we should do so again on
// re-establish and should not allow anymore HTLC adds on the outgoing
// direction of the link.
ShutdownMsg fn.Option[lnwire.Shutdown]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go this route, where we're willing to let the link be shutdown aware, we probably ought to have a Shutdown method on the link too and then we can ditch all of the lifecycle hooks and stuff. When I was working on #8167, @Roasbeef was averse to doing this type of thing. I happen to side with you here that shutdown is ultimately part of the link's lifecycle so this should be OK but I do think we need an explicit ACK from @Roasbeef

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, the choice to make this a static field does have implications in regards to the coop close v2, since Shutdown may be transmitted multiple times in an attempt to reset protocol state back to the beginning of shutdown negotiations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is ultimately part of the link's lifecycle

Yeah. I think the Shutdown message is very much apart of the link lifecycle. The co-op close messages are not & so the link does not need to be aware of those at all but the Shutdown message gives the link an indication that it should wind down. Especially since the link is expected to perform a certain set of actions once the shutdown message is sent .

does have implications in regards to the coop close v2,

shouldnt be a problem for 2 reasons: 1) even if we do transmit Shutdown multiple times - it MUST be the exact same message (with identical contents) and 2) I think any co-op close stuff (and reset of negotiations etc) wont be handled in the link.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think you're right here. As a side note this lends some validity to the idea that some messages may be processed by multiple subsystems too in #8434

it MUST be the exact same message

I wonder what happens if it's not.. 🤔

I think any co-op close stuff (and reset of negotiations etc) wont be handled in the link.

Definitely.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what happens if it's not..

It might give the other side reason to just force close since to them we are doing something fishy. Not sure if any impl actually does bork out here though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - I just realised what could happen:

  1. we send Shutdown to the peer with delivery address A
  2. they persist that delivery address
  3. we re-send shutdown with a new delivery address B
  4. remote side swallows & ignores the message since they have already received a shutdown from us which was meant to be identical to the first one so they dont even bother parsing it fully.
  5. The 2 nodes start fee negotiation. But this will fail as soon as the remote side gets a signature from us that doesnt make sense given the transaction they have constructed based on delivery address A
  6. remote side does force close instead

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Since this is in the config, I might also suggest we call this something else besides ShutdownMsg. Though the comment on the struct row is sufficiently descriptive I wonder if we would benefit from referring to this as "QueuedShutdown", "PendingShutdown", or "PreexistingShutdown"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to "PreviouslySentShutdown" 👍

peer/brontide.go Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this all works but I have absolutely no idea how we are going to patch @Roasbeef's coop close v2 PR into this.

Copy link
Collaborator Author

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the initial review @ProofOfKeags 🙏

Will contact @Roasbeef to see what he says re the simple co-op close stuff

channeldb/channel.go Outdated Show resolved Hide resolved
Comment on lines 276 to 282
// ShutdownMsg is an optional value that is set if, at the time of
// the link being started, persisted shutdown info was found for the
// channel. This value being set means that we previously sent a
// Shutdown message to our peer, and so we should do so again on
// re-establish and should not allow anymore HTLC adds on the outgoing
// direction of the link.
ShutdownMsg fn.Option[lnwire.Shutdown]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what happens if it's not..

It might give the other side reason to just force close since to them we are doing something fishy. Not sure if any impl actually does bork out here though

Copy link
Collaborator

@Crypt-iQ Crypt-iQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one nit

htlcswitch/link.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@Crypt-iQ Crypt-iQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎣

@saubyk saubyk removed the request for review from morehouse February 15, 2024 15:20
@lightninglabs-deploy
Copy link

@ProofOfKeags: review reminder

Copy link
Collaborator

@ProofOfKeags ProofOfKeags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple nits wrt naming, but you can take them or leave them at your discretion. Otherwise, ship it.

channeldb/channel.go Outdated Show resolved Hide resolved

// ShutdownInfo contains various info about the shutdown initiation of a
// channel.
type ShutdownInfo struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm generally not a fan of things that end in "Info". Looking at this structure it seems to contain the Shutdown parameters, state, or metadata, but I'm not sure which of these is most appropriate. Ultimately if you think we should leave it as info, that's fine but just figured I'd mention it since in my experience "Info" is usually a stand-in for a more appropriate, more descriptive name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool yeah gonna leave this one I think just cause in this case I cant really think of a better term & feel that Info is pretty appropriate because it contains a mixture of things we need to put in the shutdown message as well as other things about the shutdown that we should remember.

Comment on lines 276 to 282
// ShutdownMsg is an optional value that is set if, at the time of
// the link being started, persisted shutdown info was found for the
// channel. This value being set means that we previously sent a
// Shutdown message to our peer, and so we should do so again on
// re-establish and should not allow anymore HTLC adds on the outgoing
// direction of the link.
ShutdownMsg fn.Option[lnwire.Shutdown]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Since this is in the config, I might also suggest we call this something else besides ShutdownMsg. Though the comment on the struct row is sufficiently descriptive I wonder if we would benefit from referring to this as "QueuedShutdown", "PendingShutdown", or "PreexistingShutdown"

In this commit, the `ChannelUpdateHandler`'s `EnableAdds` and
`DisableAdds` methods are adjusted to return booleans instead of errors.
This is done becuase currently, any error returned by these methods is
treated by just logging the error since today all it means is that the
proposed update has already been done. And so all we do today is log the
error. But in future, if these methods are updated to return actual
errors that need to be handled, then we might forget to handle them
correctly at the various call sights. So we instead change the signature
of the function to just return a boolean. In future, if we do need to
return any error, we will have to go inspect every call sight in any
case to fix compliation & then we can be sure we are handling the errors
correctly.
This commit moves calls to link.DisableAdds to outside link.OnCommitOnce
call backs. This is done so that if a user requests shutdown, then we
can immediately block any new outgoing HTLCs. It's only the sending of
the Shutdown message that needs to wait until after any pending
CommitSig message is sent.
This commit adds an itest to demonstrate that the following bug exists:
If channel Shutdown is initiated but then a re-connect is done before
the shutdown is complete, then the initiating node currently does not
properly resend the Shutdown message as required by the spec. This will
be fixed in an upcoming commit.
ShutdownInfo contains any info about a previous Shutdown message that we
have sent. This commit adds this type along with read and write methods
for it in the channel db. The existence of the ShutdownInfo on disk
represents the fact that we have previously sent the Shutdown message
and hence that we should resend it on re-establish.
In this commit, we add the ability to start a link in shutdown mode.
This means that we immediately disable any new HTLC adds in the outgoing
direction and that we queue up a Shutdown message after the next
CommitSig message is sent (or immediately if no CommitSig message is
owed).
In this commit, we start persisting shutdown info when we send the
Shutdown message. When starting up a link, we also first check if we
have previously persisted Shutdown info and if we have, we start the
link in shutdown mode meaning that it will not accept any new outgoing
HTLC additions and it will queue the shutdown message after any pending
CommitSig has been sent.
@ellemouton ellemouton merged commit 1627976 into lightningnetwork:master Feb 21, 2024
23 of 25 checks passed
@ellemouton ellemouton deleted the resend-shutdown-2 branch February 21, 2024 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[spec]: retransmit shutdown on reestablish
5 participants