Handle retrying sign_counterparty_commitment failures #2558
Codecov Report

Coverage Diff

| | main | #2558 | +/- |
|---|---|---|---|
| Coverage | 88.71% | 88.74% | +0.02% |
| Files | 112 | 113 | +1 |
| Lines | 88502 | 88876 | +374 |
| Branches | 88502 | 88876 | +374 |
| Hits | 78517 | 78873 | +356 |
| Misses | 7752 | 7757 | +5 |
| Partials | 2233 | 2246 | +13 |
LG mod one comment - rather than pushing a new commit that makes the code compile against the latest upstream, can you rebase on upstream and interleave the changes into each commit such that each individual commit in the PR builds and passes tests on its own?
Would you be okay with me squashing any of these first?
In general I'd prefer to end up with as many commits as possible (as long as they're all free-standing and pass tests by themselves) rather than one big commit that changes several things.
Ok! Changes interleaved... ptal when you get a chance! Thank you!
I've been experimenting with this change in our stack and I realize that all is not well. I just pushed 72dbe94 to fix zero-conf inbound channel acceptance. The issue here is that we need to defer sending the There is a similar issue that I've observed with handling @wpaulino @TheBlueMatt as we get into some of the other signatures/secrets, do you think it's going to be necessary to recreate parallel machinery like that which exists for paused channel monitors? That's fairly complex in itself, not to mention how the two might have to interact with each other. 😬
Personally I'd rather push the last commit onto another PR. This PR is nice and simple, and implements async signing for just commitment signing. We should do a separate PR that adds async signing for releasing the revocation secret, rather than trying to do this all at once.
Also, let's get this rebased so we can land it now that 117 is out 🎉
Oookayyy, with 118 out the door I think it's time to land this! Sadly it needs a small rebase now.
LGTM, can you squash all of the fixup commits?
If sign_counterparty_commitment fails (i.e. because the signer is temporarily disconnected), this really indicates that we should retry the message sending later, rather than force-closing the channel (which probably won't even work if the signer is missing). Here we add initial handling of sign_counterparty_commitment failing during normal channel operation, setting a new flag in `ChannelContext` which indicates we should retry sending the commitment update later. We don't yet add any ability to do that retry.
If sign_counterparty_commitment fails (i.e. because the signer is temporarily disconnected), this really indicates that we should retry the message sending which required the signature later, rather than force-closing the channel (which probably won't even work if the signer is missing). Here we add initial handling of sign_counterparty_commitment failing during outbound channel funding, setting a new flag in `ChannelContext` which indicates we should retry sending the `funding_created` later. We don't yet add any ability to do that retry.
If sign_counterparty_commitment fails (i.e. because the signer is temporarily disconnected), this really indicates that we should retry the message sending which required the signature later, rather than force-closing the channel (which probably won't even work if the signer is missing). Here we add initial handling of sign_counterparty_commitment failing during inbound channel funding, setting a flag in `ChannelContext` which indicates we should retry sending the `funding_signed` later. We don't yet add any ability to do that retry.
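The three commit messages above share one pattern: a failed signature no longer closes the channel, it only records that a message is still owed. A minimal sketch of that pattern follows. Every type, field, and method name here (`SketchSigner`, `SketchContext`, `Waiting`, `try_sign`, the `pending_*` flags) is hypothetical and chosen for illustration; only `sign_counterparty_commitment` is a name taken from the PR, and its signature here is simplified.

```rust
/// Hypothetical stand-in for a signer that may be temporarily unreachable.
struct SketchSigner {
    available: bool,
}

impl SketchSigner {
    /// Returns a placeholder "signature", or `Err(())` while the signer is unavailable.
    fn sign_counterparty_commitment(&self, commitment_number: u64) -> Result<[u8; 8], ()> {
        if self.available {
            Ok(commitment_number.to_be_bytes())
        } else {
            Err(())
        }
    }
}

/// Which outbound message was waiting on the signature (illustrative names).
#[derive(Clone, Copy)]
enum Waiting {
    CommitmentUpdate,
    FundingCreated,
    FundingSigned,
}

/// Heavily reduced, hypothetical version of a per-channel context.
#[derive(Default)]
struct SketchContext {
    pending_commitment_update: bool,
    pending_funding_created: bool,
    pending_funding_signed: bool,
}

impl SketchContext {
    /// Maps a message kind to the flag that tracks it.
    fn flag_mut(&mut self, waiting: Waiting) -> &mut bool {
        match waiting {
            Waiting::CommitmentUpdate => &mut self.pending_commitment_update,
            Waiting::FundingCreated => &mut self.pending_funding_created,
            Waiting::FundingSigned => &mut self.pending_funding_signed,
        }
    }

    /// Tries to sign. On failure, sets the matching flag instead of
    /// force-closing; on success, clears it and returns the signature.
    fn try_sign(
        &mut self,
        signer: &SketchSigner,
        waiting: Waiting,
        commitment_number: u64,
    ) -> Option<[u8; 8]> {
        match signer.sign_counterparty_commitment(commitment_number) {
            Ok(sig) => {
                *self.flag_mut(waiting) = false;
                Some(sig)
            }
            Err(()) => {
                *self.flag_mut(waiting) = true;
                None
            }
        }
    }
}
```

In this sketch the caller simply skips sending the message when `try_sign` returns `None`; nothing retries yet, which matches the scope of these three commits.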
If sign_counterparty_commitment fails (i.e. because the signer is temporarily disconnected), this really indicates that we should retry the message sending which required the signature later, rather than force-closing the channel (which probably won't even work if the signer is missing). This commit adds initial retrying of failures, specifically regenerating commitment updates, attempting to re-sign the `CommitmentSigned` message, and sending it to our peers if we succeed.
If sign_counterparty_commitment fails (i.e. because the signer is temporarily disconnected), this really indicates that we should retry the message sending which required the signature later, rather than force-closing the channel (which probably won't even work if the signer is missing). This commit adds retrying of outbound funding_created signing failures, regenerating the `FundingCreated` message, attempting to re-sign, and sending it to our peers if we succeed.
If sign_counterparty_commitment fails (i.e. because the signer is temporarily disconnected), this really indicates that we should retry the message sending which required the signature later, rather than force-closing the channel (which probably won't even work if the signer is missing). This commit adds retrying of inbound funding_created signing failures, regenerating the `FundingSigned` message, attempting to re-sign, and sending it to our peers if we succeed.
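The retry commits above can be pictured as a single pass over the pending flags: for each message still owed, try to sign again, and release the message only if the signer now succeeds. The sketch below assumes hypothetical names throughout (`HeldMessage`, `RetrySigner`, `RetryContext`, `retry_pending_signatures`) and uses a `u64` as a placeholder signature; it is not the PR's actual code.

```rust
/// Messages that were held back while waiting for a signature (illustrative only).
#[derive(Debug, PartialEq)]
enum HeldMessage {
    FundingCreated { signature: u64 },
    FundingSigned { signature: u64 },
    CommitmentSigned { signature: u64 },
}

/// Hypothetical signer that can be offline.
struct RetrySigner {
    available: bool,
}

impl RetrySigner {
    /// Placeholder signing call: echoes the commitment number as the "signature".
    fn sign_counterparty_commitment(&self, commitment_number: u64) -> Result<u64, ()> {
        if self.available {
            Ok(commitment_number)
        } else {
            Err(())
        }
    }
}

/// Hypothetical reduced channel context holding the retry flags.
#[derive(Default)]
struct RetryContext {
    commitment_number: u64,
    pending_funding_created: bool,
    pending_funding_signed: bool,
    pending_commitment_update: bool,
}

/// Re-attempts every pending signature and returns the messages that can now
/// be sent. Flags stay set for anything that still fails, so a later call can
/// try again.
fn retry_pending_signatures(ctx: &mut RetryContext, signer: &RetrySigner) -> Vec<HeldMessage> {
    let mut ready = Vec::new();
    if ctx.pending_funding_created {
        if let Ok(signature) = signer.sign_counterparty_commitment(ctx.commitment_number) {
            ctx.pending_funding_created = false;
            ready.push(HeldMessage::FundingCreated { signature });
        }
    }
    if ctx.pending_funding_signed {
        if let Ok(signature) = signer.sign_counterparty_commitment(ctx.commitment_number) {
            ctx.pending_funding_signed = false;
            ready.push(HeldMessage::FundingSigned { signature });
        }
    }
    if ctx.pending_commitment_update {
        if let Ok(signature) = signer.sign_counterparty_commitment(ctx.commitment_number) {
            ctx.pending_commitment_update = false;
            ready.push(HeldMessage::CommitmentSigned { signature });
        }
    }
    ready
}
```

Calling the function while the signer is still offline is harmless in this sketch: it returns an empty list and leaves the flags untouched.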
Adds a `get_signer` method to the context so that a test can get ahold of the channel signer. Adds a `set_available` method on the `TestChannelSigner` to allow a test to enable and disable the signer: when disabled some of the signer's methods will return `Err` which will typically activate the error handling case. Adds a `set_channel_signer_available` function on the test `Node` class to make it easy to enable and disable a specific signer.

Adds a new `async_signer_tests` module:

* Check for asynchronous handling of `funding_created` and `funding_signed`.
* Check that we correctly resume processing after awaiting an asynchronous signature for a `commitment_signed` event.
* Verify correct handling during peer disconnect.
* Verify correct handling for inbound zero-conf.
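For readers who want a feel for the test hook described above, here is a rough sketch of a signer whose availability can be toggled from a test. `SketchTestSigner` and its internals are assumptions made for illustration; only the method name `set_available` comes from the commit message, and the real `TestChannelSigner` may be structured differently.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical test signer. Clones share one availability switch, so a test
/// can keep a handle and flip it while the channel owns another copy.
#[derive(Clone)]
struct SketchTestSigner {
    available: Arc<Mutex<bool>>,
}

impl SketchTestSigner {
    /// Starts out available, like a normally connected signer.
    fn new() -> Self {
        SketchTestSigner { available: Arc::new(Mutex::new(true)) }
    }

    /// Enables or disables the signer for all clones.
    fn set_available(&self, available: bool) {
        *self.available.lock().unwrap() = available;
    }

    /// Returns `Err(())` while disabled, which exercises the caller's
    /// error-handling path; otherwise returns a placeholder signature.
    fn sign_counterparty_commitment(&self) -> Result<&'static str, ()> {
        if !*self.available.lock().unwrap() {
            return Err(());
        }
        Ok("placeholder-signature")
    }
}
```

A test built on this would disable the signer, drive the channel to the point where a signature is needed, assert that no message went out, then re-enable the signer and check that the held message is released.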
We are intending to release without having completed our async signing logic, which sadly means we need to cfg-gate it to ensure we restore the previous state of panicking on signer errors, rather than putting us in a stuck state with no way to recover. Here we add a new `async_signing` cfg flag and use it to gate all the new logic from lightningdevkit#2558, effectively reverting commits 1da2929 through 014a336.
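As a rough illustration of what cfg-gating looks like, the sketch below compiles one of two versions of an error handler depending on whether the `async_signing` cfg is set. The function names `on_signer_error` and `handle_sign_result` are hypothetical and the real gating in the PR touches far more code; the flag name is the one given in the commit message. A custom cfg like this is normally supplied to the compiler with rustc's `--cfg` option, and recent toolchains may warn about an unexpected cfg name unless it is declared.

```rust
/// With the flag set, a signer error is tolerated and the caller retries later.
#[cfg(async_signing)]
fn on_signer_error() -> Option<u64> {
    None
}

/// Without the flag, keep the previous behavior: panic on signer errors.
#[cfg(not(async_signing))]
fn on_signer_error() -> Option<u64> {
    panic!("signer returned an error and async signing is not enabled");
}

/// Hypothetical call site: passes a successful signature through and defers
/// to whichever `on_signer_error` was compiled in.
fn handle_sign_result(result: Result<u64, ()>) -> Option<u64> {
    match result {
        Ok(signature) => Some(signature),
        Err(()) => on_signer_error(),
    }
}
```

The point of gating at compile time rather than at runtime is that a default build cannot end up in the half-finished "stuck" state at all; the new paths simply are not compiled.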
Follow on to #2554. Adopts that PR and adds tests.