Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Splice: Complete tx_abort implementation #6940

Merged
merged 9 commits into from Feb 11, 2024

Conversation

ddustin
Copy link
Collaborator

@ddustin ddustin commented Dec 13, 2023

Adds support for tx_abort including sending, receiving, and ACK'ing.

When tx_abort is confirmed in channeld we notify lightningd and shutdown. lightningd then restarts the channeld daemon.

This is all done without breaking the connection and special care is made to connect the daemon to the existing connection through a connectd fd.

@ddustin ddustin force-pushed the ddustin/tx_abort branch 15 times, most recently from e3cd1c9 to b21da4e Compare December 14, 2023 19:32
@ddustin ddustin modified the milestones: v24.02, v23.11.1 Dec 14, 2023
@ddustin ddustin force-pushed the ddustin/tx_abort branch 2 times, most recently from 0ce23d6 to ce5dce1 Compare December 14, 2023 20:15
@ksedgwic
Copy link
Collaborator

@ddustin should we try this out (integration tests) w/ our splicing branch? Our branch is not quite released yet, but is close ...

@ddustin
Copy link
Collaborator Author

ddustin commented Dec 15, 2023

@ddustin should we try this out (integration tests) w/ our splicing branch? Our branch is not quite released yet, but is close ...

It's not quite ready for testing yet

@ksedgwic
Copy link
Collaborator

@devrandom review beg

@cdecker
Copy link
Member

cdecker commented Jan 31, 2024

Hi @ddustin, just going through all the open PRs that may make it into v24.02, and this one looks good for inclusion as far as I can see. Can you share your view on the current status, and whether we can include it or not? If included, we may do 1-2 rounds of final reviews and merge it, otherwise I'll untag it for the milestone, until it is ready for inclusion ^^

@ddustin
Copy link
Collaborator Author

ddustin commented Jan 31, 2024

Hi @ddustin, just going through all the open PRs that may make it into v24.02, and this one looks good for inclusion as far as I can see. Can you share your view on the current status, and whether we can include it or not? If included, we may do 1-2 rounds of final reviews and merge it, otherwise I'll untag it for the milestone, until it is ready for inclusion ^^

Hi @cdecker, it would be great to get this included in the next release! I believe it is definitely ready for inclusion.

This should all only apply under the experimental-splicing flag making it a safer change.

In some manual tests aggressively restarting channeld (which tx_abort does) reveals some passed file descriptor issues. I believe this is revealing an existing issue and not related to this PR.

If correct this should enable us to add flaky fd passing to the CI. To build the test however I need PR #6980 merged too as the test is dependent on both.

@cdecker
Copy link
Member

cdecker commented Feb 1, 2024

Ok, got it. I quite like #6980, but the discussion doesn't look finished yet. Am I good to take on this PR and shepherd it through CI, even if #6980 may not make it through in this release? The next release would be 24.05, so not a huge delay, but I prefer pulling in as much as possible to keep a clean worktable.

@ddustin
Copy link
Collaborator Author

ddustin commented Feb 1, 2024

Ok, got it. I quite like #6980, but the discussion doesn't look finished yet. Am I good to take on this PR and shepherd it through CI, even if #6980 may not make it through in this release? The next release would be 24.05, so not a huge delay, but I prefer pulling in as much as possible to keep a clean worktable.

Yeah that works 👍, thanks!

Copy link
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial changes mainly.

But last commit should be the first: that means tests always pass!

wallet/wallet.c Outdated Show resolved Hide resolved
wallet/wallet.c Outdated Show resolved Hide resolved
common/wire_error.c Outdated Show resolved Hide resolved
lightningd/channel_control.c Outdated Show resolved Hide resolved
channeld/channeld.c Outdated Show resolved Hide resolved
channeld/channeld.c Outdated Show resolved Hide resolved
channeld/channeld.c Outdated Show resolved Hide resolved
channeld/channeld.c Outdated Show resolved Hide resolved
channeld/channeld.c Outdated Show resolved Hide resolved
@cdecker
Copy link
Member

cdecker commented Feb 9, 2024

I tried rebasing this PR, but managed to somehow break it. It now has an undefined symbol towire_connectd_peer_final_msg and I can't find a trace of that message anywhere.

In he case of initiating an RBF, ABORT is used instead of ERROR.
We add checks for tx_abort and pass them back up to be handled.
Lightningd is responsible to restart channeld when it gets this message.
Add checking for and sending tx_abort to channeld.

When receiving it we first ACK it back, send a request to restart to lightningd, and then shutdown channeld.

We also must update the splice tests that relied on reconnect checks for splice warnings (as some are now tx_aborts instead).
If the peer isn’t required to send signatures first but does while we are awaiting the next user RPC action — we should be caching the message and using it later.

Before we would leave the message cached in the socket itself, but tx_abort semantics require us to check the socket more often.
@ddustin
Copy link
Collaborator Author

ddustin commented Feb 10, 2024

I tried rebasing this PR, but managed to somehow break it. It now has an undefined symbol towire_connectd_peer_final_msg and I can't find a trace of that message anywhere.

Ah looks like commit db6f0da removed that routine. Should be a simple update to use the new method.

@ddustin
Copy link
Collaborator Author

ddustin commented Feb 10, 2024

Trivial changes mainly.

But last commit should be the first: that means tests always pass!

Thanks for the review!

I addressed these changes, rebased master, and updated it to work with the latest master.

@cdecker once it passes CI it should be good to go

@ddustin ddustin added state::fixed These issues should have been addressed. Pending confirmation by OP, will close soon otherwise state::answered and removed state::needinfo state::fixed These issues should have been addressed. Pending confirmation by OP, will close soon otherwise labels Feb 10, 2024
@cdecker cdecker merged commit e72be90 into ElementsProject:master Feb 11, 2024
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement protocol These issues are protocol level issues that should be discussed on the protocol spec repo state::answered
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants