Conversation

@wpaulino (Contributor)

`HolderCommitmentPoint` currently tracks the current and next points used on counterparty commitments, which are unrevoked. When we reestablish a channel, the counterparty sends us the commitment height, along with the corresponding secret, for the state they believe to be the latest. We compare that secret against the point we derive via the signer to check whether the peer is being honest.

Since the protocol does not allow peers (assuming no data loss) to fall behind the current state by more than one update, we can cache the two latest revoked commitment points alongside `HolderCommitmentPoint`, so that we no longer need to reach out to the signer asynchronously when handling `channel_reestablish` messages along the happy path. This avoids the complexity of pausing the state machine (which may also require stashing any update messages from the counterparty) while the signer response is pending.

The one remaining case is when the counterparty presents a `channel_reestablish` with a state later than what we know. This can only end in one of two terminal outcomes: either they provided a valid commitment secret proving we are behind, and we need to panic, or they lied, and we force close the channel. This is the only case we handle asynchronously, as doing so is relatively trivial.
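To make the caching idea concrete, here is a minimal sketch; the type and field names below are hypothetical illustrations, not the PR's actual definitions:

```rust
use bitcoin::secp256k1::PublicKey;

/// Hypothetical cache of the two most recently revoked holder commitment
/// points, kept alongside `HolderCommitmentPoint`. With these in hand, the
/// happy-path `channel_reestablish` checks can be answered synchronously,
/// without pausing the channel state machine on an async signer call.
struct RevokedCommitmentPointCache {
	/// The point for the most recently revoked state, i.e. the state a
	/// peer without data loss may lawfully still claim to be on.
	previous: Option<PublicKey>,
	/// The point for the state revoked just before that.
	second_previous: Option<PublicKey>,
}

impl RevokedCommitmentPointCache {
	/// Called when we revoke a commitment: the freshly revoked point shifts
	/// into `previous`, and the old `previous` slides back one slot.
	fn on_revocation(&mut self, revoked_point: PublicKey) {
		self.second_previous = self.previous.take();
		self.previous = Some(revoked_point);
	}
}
```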

@wpaulino wpaulino added this to the 0.3 milestone Oct 30, 2025
@wpaulino wpaulino requested a review from TheBlueMatt October 30, 2025 23:02
@wpaulino wpaulino self-assigned this Oct 30, 2025
@ldk-reviews-bot

ldk-reviews-bot commented Oct 30, 2025

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@codecov

codecov bot commented Oct 30, 2025

Codecov Report

❌ Patch coverage is 86.39053% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.34%. Comparing base (6749bc6) to head (1f7b249).
⚠️ Report is 2 commits behind head on main.

Files with missing lines | Patch % | Lines
lightning/src/ln/channel.rs | 80.61% | 19 Missing ⚠️
lightning/src/ln/channelmanager.rs | 75.00% | 1 Missing and 2 partials ⚠️
lightning/src/ln/async_signer_tests.rs | 98.30% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4197      +/-   ##
==========================================
- Coverage   89.34%   89.34%   -0.01%     
==========================================
  Files         180      180              
  Lines      138480   138620     +140     
  Branches   138480   138620     +140     
==========================================
+ Hits       123730   123846     +116     
- Misses      12129    12149      +20     
- Partials     2621     2625       +4     
Flag | Coverage | Δ
fuzzing | 35.96% <31.81%> | (-0.01%) ⬇️
tests | 88.70% <86.39%> | (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@ldk-reviews-bot

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

.ok();
if expected_point.is_none() {
	self.context.signer_pending_stale_state_verification = Some((commitment_number, given_secret));
	return Err(ChannelError::Ignore("Waiting on async signer to verify stale state proof".to_owned()));
Collaborator

In practice I think this means we'll often never panic - the peer will reconnect, we'll ignore the message, then they'll send some other message which will cause us to, for example, `ChannelError::close("Got commitment signed message when channel was not in an operational state")`. We'll either have to add logic in ~every message handler to ignore the message if `signer_pending_stale_state_verification` is set, or we can just disconnect them here and let them be in a reconnect loop until the signer resolves (which I think is fine?).

Contributor Author

Good point, ended up disconnecting. Is there any reason for us to close in those cases though? We could just make those `ChannelError::close`s a `WarnAndDisconnect` instead.

Collaborator

No, those cases could definitely move to a warn-and-disconnect. Historically we've been pretty happy to just close if the peer does something dumb, and in 95% of the cases we've never seen peers do anything so dumb, so we've never really had a motivation to change it. Not crazy to do though.
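A minimal sketch of the "disconnect rather than close" approach settled on above. `WarnAndDisconnect` mirrors the variant named in this thread, but the mini-types here are stand-ins, not LDK's definitions:

```rust
// While the async signer is still verifying the peer's stale-state proof,
// drop the connection instead of ignoring the message, so the peer retries
// until the signer resolves. This avoids adding "is verification pending?"
// checks to every message handler.
enum ChannelError {
	WarnAndDisconnect(String),
}

struct ChannelContext {
	// (commitment number, claimed per-commitment secret) awaiting the signer.
	signer_pending_stale_state_verification: Option<(u64, [u8; 32])>,
}

fn check_reestablish_gate(context: &ChannelContext) -> Result<(), ChannelError> {
	if context.signer_pending_stale_state_verification.is_some() {
		return Err(ChannelError::WarnAndDisconnect(
			"Waiting on async signer to verify stale state proof".to_owned(),
		));
	}
	Ok(())
}
```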

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, assign a second reviewer.

@TheBlueMatt (Collaborator)

fwiw clippy is unhappy.

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from 7c25f35 to 6b8123d on November 3, 2025 22:04
@TheBlueMatt (Collaborator)

`ln::async_signer_tests::test_async_force_close_on_invalid_secret_for_stale_state` is failing in CI.

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from 6b8123d to fa13381 on November 4, 2025 17:50
@wpaulino wpaulino requested a review from TheBlueMatt November 4, 2025 17:50
@ldk-reviews-bot

🔔 1st Reminder

Hey @TheBlueMatt! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt previously approved these changes Nov 13, 2025
@ldk-reviews-bot

✅ Added second reviewer: @valentinewallace

@ldk-reviews-bot

🔔 1st Reminder

Hey @valentinewallace! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from fa13381 to 201478e on November 17, 2025 18:10
@TheBlueMatt (Collaborator)

CI is quite sad

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from 201478e to 82e2371 on November 18, 2025 18:26
@wpaulino (Contributor Author)

Had to rebase to account for the changes to `check_channel_closed`

@valentinewallace (Contributor) left a comment

Nothing blocking!

if expected_point != Some(PublicKey::from_secret_key(&self.context.secp_ctx, &given_secret)) {
	return Err(ChannelError::close("Peer sent a channel_reestablish indicating we're stale with an invalid commitment secret".to_owned()));
}
Self::panic_on_stale_state(logger);
Contributor

We don't have test coverage for hitting this, though that may be pre-existing.

if expected_point != PublicKey::from_secret_key(&self.context.secp_ctx, &given_secret) {
	return Err(ChannelError::close("Peer sent a garbage channel_reestablish with secret key not matching the commitment height provided".to_owned()));
}
} else if msg.next_remote_commitment_number + 1 == our_commitment_transaction {
Contributor

It would be nice to rename `our_commitment_transaction` to `our_current_commit_tx_number` or something like that, but it is preexisting

	holder_commitment_next_transaction_number + 3,
	&secp_ctx,
)
.expect("Must be able to derive the previous revoked commitment point upon channel restoration"))
Contributor

Just wondering -- are there plans to get rid of this and go fully async with the method? I guess in a release or three?

Contributor Author

This is only for the upgrade case; we assume signer liveness prior to switching over to an async signer.
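For context, a hedged sketch of what this upgrade-only restoration path is doing. The trait and function names below are stand-ins rather than LDK's actual items; the derivation call mirrors the quoted snippet:

```rust
use bitcoin::secp256k1::{All, PublicKey, Secp256k1};

// Stand-in for the relevant piece of LDK's `ChannelSigner` trait.
trait PerCommitmentPointSource {
	fn get_per_commitment_point(
		&self, idx: u64, secp_ctx: &Secp256k1<All>,
	) -> Result<PublicKey, ()>;
}

// LDK commitment transaction numbers count *down* from 2^48 - 1, so with
// `next` as the next holder commitment number, `next + 2` and `next + 3`
// are the two most recently revoked states. On restoration we assume the
// (previously synchronous) signer is still available, hence `expect`.
fn derive_cached_revoked_points<S: PerCommitmentPointSource>(
	signer: &S, next: u64, secp_ctx: &Secp256k1<All>,
) -> (PublicKey, PublicKey) {
	let msg = "Must be able to derive the previous revoked commitment point upon channel restoration";
	(
		signer.get_per_commitment_point(next + 2, secp_ctx).expect(msg),
		signer.get_per_commitment_point(next + 3, secp_ctx).expect(msg),
	)
}
```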

	return Err(ChannelError::close("Peer sent a channel_reestablish indicating we're stale with an invalid commitment secret".to_owned()));
}
Self::panic_on_stale_state(logger);
} else if msg.next_remote_commitment_number == our_commitment_transaction {
Contributor

Probably a dumb question -- in the spec the `next_remote_commitment_number` is described as the next commitment number they expect to receive, but above we seem to be setting `current_transaction_number` to the current commitment number, which is a bit confusing. Just want to double check there's no off-by-one there

Contributor Author

It's actually the `next_revocation_number` in the spec. If you look at `get_channel_reestablish`, you'll find this comment where we set the field:

			// We have to set next_remote_commitment_number to the next revoke_and_ack we expect to
			// receive, however we track it by the next commitment number for a remote transaction
			// (which is one further, as they always revoke previous commitment transaction, not
			// the one we send) so we have to decrement by 1. Note that if

Contributor

Hm, that's confusing. Comment is a bit buried.
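A small worked example of the relationship, using spec-style upward-counting numbers for readability (LDK's internal commitment numbers actually count down from 2^48 - 1, hence the "decrement" in the quoted comment); the variable names are illustrative:

```rust
fn main() {
	// Say the counterparty has revoked commitments 0..=k, so their current
	// (unrevoked) commitment is k + 1, and the next remote commitment we
	// would build for them is k + 2 -- the latter is what LDK tracks.
	let k: u64 = 5;
	let next_remote_commitment_to_build = k + 2;
	// The spec's `next_revocation_number` names the commitment whose
	// revoke_and_ack we expect next, i.e. k + 1, so the field is our
	// internal counter adjusted by one.
	let next_revocation_number = next_remote_commitment_to_build - 1;
	assert_eq!(next_revocation_number, k + 1);
}
```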

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from 82e2371 to 905f990 on November 18, 2025 20:08
@valentinewallace (Contributor) left a comment

One comment but I'm otherwise good to ack after CI is fixed

@wpaulino wpaulino force-pushed the async-get-per-commitment-point-channel-reestablish branch from 905f990 to 1f7b249 on November 18, 2025 22:09
@TheBlueMatt (Collaborator) left a comment

Landing, given @valentinewallace indicated she was happy with it.

/// Similar to [`Self::signer_pending_commitment_update`] but we're waiting to send a
/// [`msgs::ChannelReady`].
signer_pending_channel_ready: bool,
// Upon receiving a [`msgs::ChannelReestablish`] message with a `next_remote_commitment_number`
Collaborator

nit: even for internal stuff it's nice to make it a doc comment because then `cargo doc --document-private-items` will generate docs for them and presumably some people's RLS will see it. Not sure if it actually impacts anyone on the team currently but I imagine in the future LLMs might care or maybe better IDEs might.
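Concretely, the nit amounts to the difference below; the struct and field names are placeholders:

```rust
struct Example {
	// A plain `//` comment: invisible to rustdoc and to most IDE hovers.
	plain_commented_field: bool,
	/// A `///` doc comment: rendered by `cargo doc --document-private-items`
	/// and surfaced by rust-analyzer on hover, even for private items.
	doc_commented_field: bool,
}
```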

@TheBlueMatt TheBlueMatt merged commit 97204d6 into lightningdevkit:main Nov 19, 2025
26 checks passed
@wpaulino wpaulino deleted the async-get-per-commitment-point-channel-reestablish branch November 19, 2025 17:16