Skip to content

Conversation

TheBlueMatt
Copy link
Collaborator

Prior to LDK 0.1, in rare cases we could replay payment claims to ChannelMonitors on startup, which we then expected to be persisted prior to normal node operation. This required re-persisting ChannelMonitors after deserializing the ChannelManager, delaying startup in some cases substantially.

In 0.1 we fixed this, moving claim replays to the background to run after the ChannelManager starts operating (and only updating/persisting changes to the ChannelMonitors which need it). However, we didn't actually enable this meaningfully in our API - nearly all users use our ChainMonitor and the only way to get a chanel into ChainMonitor is through the normal flow which expects to persist.

Here we add a simple method to load ChannelMonitors into the ChainMonitor without persisting them.

I'm kinda on the fence about backporting this. On the one hand its a pretty nontrivial speedup for users who switch to the new interface. On the other hand, it is a "feature" and its kinda weird to backport a feature.

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Aug 7, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-faster-0.1-load branch from f968fa1 to e45bfed Compare August 7, 2025 22:38
@tnull tnull requested review from tnull and removed request for joostjager August 8, 2025 06:48
@tnull
Copy link
Contributor

tnull commented Aug 8, 2025

On the other hand, it is a "feature" and its kinda weird to backport a feature.

Yeah, I'd also lean towards not backporting this, also as it changes API (even though just extending).

@ldk-reviews-bot
Copy link

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

Copy link

codecov bot commented Aug 8, 2025

Codecov Report

❌ Patch coverage is 87.87879% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.76%. Comparing base (ed1f304) to head (d377958).
⚠️ Report is 211 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/util/test_utils.rs 85.71% 1 Missing and 2 partials ⚠️
lightning/src/chain/channelmonitor.rs 90.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3996      +/-   ##
==========================================
- Coverage   88.93%   88.76%   -0.17%     
==========================================
  Files         174      176       +2     
  Lines      124575   129396    +4821     
  Branches   124575   129396    +4821     
==========================================
+ Hits       110790   114858    +4068     
- Misses      11287    11929     +642     
- Partials     2498     2609     +111     
Flag Coverage Δ
fuzzing 21.99% <20.68%> (-0.20%) ⬇️
tests 88.59% <87.87%> (-0.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@TheBlueMatt TheBlueMatt added this to the 0.2 milestone Aug 12, 2025
@tnull tnull self-requested a review September 1, 2025 17:54
@ldk-reviews-bot
Copy link

✅ Added second reviewer: @joostjager

Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to squash!

/// Using this over [`chain::Watch::watch_channel`] avoids re-persisting a [`ChannelMonitor`]
/// that hasn't changed, slowing down startup.
///
/// Note that this method *can* be used if additional blocks were replayed against the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understanding the assumptions and restrictions of this API is still relatively hard. In particular, how can users be sure which of the APIs on ChannelMonitor are safe to use and which ones change(d) its state in some way that will not be replayed again on a restart.?

And, say users currently just use safe APIs or don't access ChannelMonitor, start using load_existing_monitor, and then some time down the road decide they need to mess with ChannelMonitors for some reason. Do we expect them to remember the fineprint of this API?

IMO, it would be still very much preferable to end up with a single API method for loading monitors that does the optimal thing under the hood, depending on whether the monitor state is 'dirty' or not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I agree in principle, but we also can't perfectly know - someone can in theory apply a ChannelMonitorUpdate that won't be replayed on startup. But, ultimately, I think this will work in 99.99% of cases - a user deciding to do a Thing during their startup routing and then deciding not to it again if they hit a crash and restart seems kinda wild, unless they have a bug and don't persist but do delete the instructions to do the Thing.

Im open to ideas for how to better phrase this, I added a quick "basically this should never happen" when squasing tho:

$ git diff-tree -U5 d79e08f86 cfca2dc97
diff --git a/lightning/src/chain/chainmonitor.rs b/lightning/src/chain/chainmonitor.rs
index 951aad7f78..58a2dbcdb3 100644
--- a/lightning/src/chain/chainmonitor.rs
+++ b/lightning/src/chain/chainmonitor.rs
@@ -833,11 +833,11 @@ where
 	///
 	/// Note that this method *can* be used if additional blocks were replayed against the
 	/// [`ChannelMonitor`] or if a [`ChannelMonitorUpdate`] loaded from disk was replayed such that
 	/// it will replayed on startup, and in general can only *not* be used if you directly accessed
 	/// the [`ChannelMonitor`] and changed its state in some way that will not be replayed again on
-	/// a restart.
+	/// a restart. Such direct access should generally never occur for most LDK-based nodes.
 	///
 	/// For [`ChannelMonitor`]s which were last serialized by an LDK version prior to 0.1 this will
 	/// fall back to calling [`chain::Watch::watch_channel`] and persisting the [`ChannelMonitor`].
 	/// See the release notes for LDK 0.1 for more information on this requirement.
 	///

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With ldk-node more and more being the predominant way to run this code, we can still ensure that it is safe in most cases.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-faster-0.1-load branch from d79e08f to cfca2dc Compare September 2, 2025 13:33
/// [`ChannelMonitor`]s which do not need to be persisted (i.e. were last written by LDK 0.1 or
/// later) will be loaded without persistence and this method will return
/// [`ChannelMonitorUpdateStatus::Completed`].
pub fn load_existing_monitor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure: for ldk-node, the idea is that it will call into this function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I anticipate nearly every implementation will call this instead.

(32, pending_funding, optional_vec),
(34, alternative_funding_confirmed, option),
});
let written_by_0_1_or_later = payment_preimages_with_info.is_some();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment why this determines the version?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was pretty self-explanitory given the code, but, sure :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't to me. payment_preimage_with_info was also added in 0.1 I assume?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, answered in diff. Ty.

Prior to LDK 0.1, in rare cases we could replay payment claims to
`ChannelMonitor`s on startup, which we then expected to be
persisted prior to normal node operation. This required
re-persisting `ChannelMonitor`s after deserializing the
`ChannelManager`, delaying startup in some cases substantially.

In 0.1 we fixed this, moving claim replays to the background to run
after the `ChannelManager` starts operating (and only
updating/persisting changes to the `ChannelMonitor`s which need
it). However, we didn't actually enable this meaningfully in our
API - nearly all users use our `ChainMonitor` and the only way to
get a chanel into `ChainMonitor` is through the normal flow which
expects to persist.

Here we add a simple method to load `ChannelMonitor`s into the
`ChainMonitor` without persisting them.
@TheBlueMatt TheBlueMatt force-pushed the 2025-08-faster-0.1-load branch from cfca2dc to d377958 Compare September 3, 2025 12:24
@TheBlueMatt
Copy link
Collaborator Author

Added a comment:

$ git diff-tree -U1 cfca2dc97 d37795800c
diff --git a/lightning/src/chain/channelmonitor.rs b/lightning/src/chain/channelmonitor.rs
index e22b04570c..c0941480af 100644
--- a/lightning/src/chain/channelmonitor.rs
+++ b/lightning/src/chain/channelmonitor.rs
@@ -6093,2 +6093,4 @@ impl<'a, 'b, ES: EntropySource, SP: SignerProvider> ReadableArgs<(&'a ES, &'b SP
 		});
+		// Note that `payment_preimages_with_info` was added (and is always written) in LDK 0.1, so
+		// we can use it to determine if this monitor was last written by LDK 0.1 or later.
 		let written_by_0_1_or_later = payment_preimages_with_info.is_some();

}

// At every point where we get a monitor update, we should be able to send a useful monitor
// to a watchtower and disk...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems out of place? Or maybe I don't understand the purpose of this fn. What I see is serialize the monitor, deserialize it and then load.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.1;
// Note that a ChannelMonitor might not round-trip exactly here as we have tests that were
// serialized prior to LDK 0.1 and re-serializing them will flip the "written after LDK
// 0.1" flag.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the consequence of this, or the intention of this message?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheBlueMatt TheBlueMatt merged commit 0670550 into lightningdevkit:main Sep 4, 2025
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants