
Conversation

@TheBlueMatt TheBlueMatt commented Oct 29, 2025

Backport of #4122, #4109, #4182, #4129, #4167, #4172, #4173, #4168, #4178 to 0.2

Also now #4184

wpaulino and others added 19 commits October 29, 2025 17:10
Splices negotiated with 0 confirmations require that we lock them
immediately after exchanging `tx_signatures`.

Backport of 1802b6e
This is crucial for peers that serve liquidity for low-availability
(i.e., mobile) nodes. We should allow users to queue a splice request
while the peer is offline, such that it is negotiated once reconnected.
Note that there currently isn't a way to time out or cancel these
requests; this is planned for the near future.

Backport of ab9769f
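
A minimal sketch of the deferral described above, with illustrative names (`SpliceParams`, `pending_splice_requests`, and `ChannelSketch` are not LDK's actual types):

```rust
// Illustrative sketch only; LDK's real splice API lives on ChannelManager
// and uses different types and names.
struct SpliceParams {
    contribution_sats: i64,
    feerate_sat_per_kw: u32,
}

#[derive(Default)]
struct ChannelSketch {
    peer_connected: bool,
    pending_splice_requests: Vec<SpliceParams>,
}

impl ChannelSketch {
    fn request_splice(&mut self, params: SpliceParams) {
        if self.peer_connected {
            // Begin quiescence + splice negotiation immediately (elided).
        } else {
            // Held until the peer reconnects; note there is currently no
            // way to time out or cancel a queued request.
            self.pending_splice_requests.push(params);
        }
    }
}
```
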
Since we don't yet support contributing to an incoming splice, we need
to make sure we attempt our splice negotiation eventually if the
counterparty was also attempting a splice at the same time but they won
the quiescence tie-breaker. Since only one pending splice (without RBF)
is allowed at a time, we do this after the existing splice becomes
locked.

Backport of d91e14b
We'll use this in the next commit to test that we'll send an `stfu`
message for a splice we intend to initiate upon reconnecting.

Backport of 3c4e70c
Adds `is_manual_broadcast` and `funding_seen_onchain` flags to track
whether the channel uses manual funding broadcasts and whether we've
seen the funding tx confirm.

This enables deferring holder commitment broadcasts until after the
funding tx is actually broadcast. For example, in LSPS2 with
client_trusts_lsp=true, the LSP may defer broadcasting the funding tx
until the client claims an HTLC, so we need to avoid broadcasting
commitments that reference outputs that don't exist yet.

Backport of b9158c5
Marks `funding_seen_onchain` when we see the funding tx confirm.

Backport of 04a2776
Don't queue holder commitment broadcasts until funding is confirmed,
unless explicitly overridden via broadcast_latest_holder_commitment_txn.

Attempting to broadcast commitments before funding confirms would fail
mempool validation since the funding output doesn't exist yet.

Backport of 6c5ef04
For manually-broadcast funding, we can't track claimable outputs until
the funding tx is actually onchain. Otherwise we'd try to claim outputs
that don't exist yet.

Backport of 4131680
Sets `should_broadcast_commitment=true` when funding confirms.

Since we skip the initial broadcast when funding_seen_onchain is false,
we need to queue it once funding actually hits the chain.

Backport of be1955a
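
Taken together, the manual-broadcast commits above describe a small state machine. A hedged composite sketch, with illustrative names rather than LDK's actual `ChannelMonitor` fields:

```rust
// Illustrative only; the real logic lives in ChannelMonitor and is
// considerably more involved.
struct MonitorSketch {
    is_manual_broadcast: bool,
    funding_seen_onchain: bool,
    should_broadcast_commitment: bool,
}

impl MonitorSketch {
    /// Broadcasting a commitment before the funding output exists would fail
    /// mempool validation, so defer unless funding was seen onchain or the
    /// user explicitly overrides (broadcast_latest_holder_commitment_txn).
    fn may_broadcast_holder_commitment(&self, force: bool) -> bool {
        force || !self.is_manual_broadcast || self.funding_seen_onchain
    }

    /// Once the funding tx actually confirms, queue the broadcast that was
    /// skipped earlier.
    fn on_funding_confirmed(&mut self) {
        self.funding_seen_onchain = true;
        self.should_broadcast_commitment = true;
    }
}
```
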
Tests that holder commitment broadcasts are properly deferred until
funding confirms, and that the full manual-funding flow works correctly.

Backport of ea95a15
In 6c5ef04 we prevented broadcast
of the commitment transactions if the funding transaction has not
yet appeared on-chain for manual-broadcast channels, to avoid
spurious bumps or unbroadcastable transactions. It also updated
the documentation on
`ChannelMonitor::broadcast_latest_holder_commitment_txn` to
explicitly state that it will override the manual-broadcast state
and broadcast the latest commitment anyway.

However, 4131680 accidentally
reverted this behavior by updating
`generate_claimable_outpoints_and_watch_outputs`, which is called by
`broadcast_latest_holder_commitment_txn`, to also refuse to
broadcast if funding has not been seen on chain.

Here we fix this, passing through the `require_funding_seen` bool
to allow `broadcast_latest_holder_commitment_txn` to broadcast
immediately.

Backport of 64bb44c
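
A minimal sketch of the fix, with illustrative names: the caller now decides whether the funding-seen gate applies, so the manual override path can broadcast unconditionally.

```rust
struct Outpoint; // placeholder for the real watch-output type

// Illustrative stand-in for generate_claimable_outpoints_and_watch_outputs.
fn generate_claimable_outpoints(
    require_funding_seen: bool,
    is_manual_broadcast: bool,
    funding_seen_onchain: bool,
) -> Vec<Outpoint> {
    if require_funding_seen && is_manual_broadcast && !funding_seen_onchain {
        return Vec::new(); // defer: the funding output doesn't exist yet
    }
    // ... prepare broadcast tracking and claimable outpoints (elided) ...
    Vec::new()
}

// broadcast_latest_holder_commitment_txn passes `false`, bypassing the gate:
// let _ = generate_claimable_outpoints(false, true, false);
```
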
In 6c5ef04 we prevented broadcast
of the commitment transactions if the funding transaction has not
yet appeared on-chain for manual-broadcast channels to avoid
spurious bumps or unbroadcastable transactions. However, we missed
the case where a channel is closed due to HTLCs timing out.

Here we fix that by doing the same broadcast-gating when
automatically force-closing a channel due to HTLC timeouts.

Backport of a979c33
This rewrites
`test_manual_broadcast_skips_commitment_until_funding` to test
manual-broadcast channel closing for both forced-broadcast and
close-due-to-HTLC-expiry closes as well as 0-conf and non-0-conf
channels.

Backport of e2831cf
The remaining two manual-broadcast tests were made redundant by the
previous commit which expanded the coverage of the first test. Thus
we remove the two other tests.

Backport of 2ce69f6
In `generate_claimable_outpoints_and_watch_outputs` if we're going
to decide to return nothing and defer broadcasting the commitment
transaction, there's no need to prepare the broadcast tracking
objects, so skip it.

Backport of 91aeac1
@TheBlueMatt TheBlueMatt added this to the 0.2 milestone Oct 29, 2025

ldk-reviews-bot commented Oct 29, 2025

I've assigned @wpaulino as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.


codecov bot commented Oct 29, 2025

Codecov Report

❌ Patch coverage is 94.45178% with 84 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.87%. Comparing base (244fcf3) to head (206fb06).
⚠️ Report is 36 commits behind head on 0.2.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| lightning/src/ln/splicing_tests.rs | 95.12% | 10 Missing and 11 partials ⚠️ |
| lightning/src/ln/channel.rs | 87.50% | 11 Missing and 4 partials ⚠️ |
| lightning/src/ln/channelmanager.rs | 80.35% | 9 Missing and 2 partials ⚠️ |
| lightning/src/events/bump_transaction/mod.rs | 95.31% | 5 Missing and 4 partials ⚠️ |
| lightning-background-processor/src/lib.rs | 40.00% | 6 Missing ⚠️ |
| lightning/src/ln/zero_fee_commitment_tests.rs | 97.92% | 5 Missing ⚠️ |
| lightning/src/chain/channelmonitor.rs | 95.65% | 2 Missing and 1 partial ⚠️ |
| lightning/src/ln/functional_test_utils.rs | 97.74% | 3 Missing ⚠️ |
| lightning/src/ln/peer_handler.rs | 92.50% | 3 Missing ⚠️ |
| lightning/src/util/test_utils.rs | 86.36% | 3 Missing ⚠️ |
| ... and 4 more | | |
Additional details and impacted files
@@            Coverage Diff             @@
##              0.2    #4185      +/-   ##
==========================================
+ Coverage   88.76%   88.87%   +0.10%     
==========================================
  Files         180      180              
  Lines      136626   137770    +1144     
  Branches   136626   137770    +1144     
==========================================
+ Hits       121282   122447    +1165     
+ Misses      12540    12523      -17     
+ Partials     2804     2800       -4     
| Flag | Coverage Δ | |
| --- | --- | --- |
| fuzzing | 20.86% <7.24%> (-0.79%) | ⬇️ |
| tests | 88.72% <94.45%> (+0.10%) | ⬆️ |

Flags with carried forward coverage won't be shown.


tankyleo and others added 15 commits October 29, 2025 19:11
Otherwise, we could hit the max 10_000vB size limit on V3 transactions
(BIP 431 rule 4).

Also introduce a `max_tx_weight` parameter to `select_confirmed_utxos`.
This constraint makes sure anchor and HTLC transactions in 0FC channels
satisfy the `TRUC_MAX_WEIGHT` and the `TRUC_CHILD_MAX_WEIGHT`
maximums.

Expand the coin-selection algorithm provided for any `T: WalletSource`
to satisfy this new constraint.

Backport of 4e4a494

Silent MSRV conflict resolved in:
 * lightning/src/events/bump_transaction/mod.rs
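
A hedged sketch of weight-capped coin selection; `Utxo` and the weight accounting are simplified stand-ins for the real `CoinSelectionSource` logic:

```rust
struct Utxo {
    value_sats: u64,
    satisfaction_weight: u64, // weight units this input adds when spent
}

// Select largest-first until the target is met, skipping any input that
// would push the transaction past `max_tx_weight` (e.g. TRUC's 10_000 vB
// == 40_000 WU cap from BIP 431).
fn select_confirmed_utxos(
    mut candidates: Vec<Utxo>,
    target_sats: u64,
    base_tx_weight: u64,
    max_tx_weight: u64,
) -> Option<Vec<Utxo>> {
    candidates.sort_by(|a, b| b.value_sats.cmp(&a.value_sats));
    let mut selected = Vec::new();
    let (mut total, mut weight) = (0u64, base_tx_weight);
    for utxo in candidates {
        if total >= target_sats {
            break;
        }
        if weight + utxo.satisfaction_weight > max_tx_weight {
            continue;
        }
        weight += utxo.satisfaction_weight;
        total += utxo.value_sats;
        selected.push(utxo);
    }
    (total >= target_sats).then_some(selected)
}
```
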
If, while a splice is pending, the channel happens to not have any
commitment updates but did prior to the splice being negotiated, it's
possible that we end up with bogus holder HTLC data for the previous
commitment. After the splice becomes locked, we've successfully
transitioned to the new funding transaction, but that funding
transaction never had a commitment transaction negotiated for the
previous state.

Backport of c418034
We relied on `position` giving us the last index we need to prune, but
this may return `None` when all known legacy SCIDs need to be pruned. In
such cases, we ended up not pruning any of the legacy SCIDs at all.

Rewritten by: Matt Corallo <git@bluematt.me>

Backport of f2ada1a
Test tweaked by: Matt Corallo <git@bluematt.me>

Backport of b84ad65
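
A minimal reproduction of the bug class described above (illustrative code, not LDK's):

```rust
// `position` returns None when *every* SCID is prunable, so a naive
// `if let Some(idx) = ...` guard silently prunes nothing in that case.
// The fix is to treat None as "prune them all".
fn scids_to_keep(scids: &[u64], first_valid_scid: u64) -> &[u64] {
    let idx = scids
        .iter()
        .position(|scid| *scid >= first_valid_scid)
        .unwrap_or(scids.len());
    &scids[idx..] // everything before `idx` is pruned
}

fn main() {
    // All three legacy SCIDs are below the cutoff and must all be pruned.
    assert!(scids_to_keep(&[1, 2, 3], 100).is_empty());
}
```
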
When splicing, we're required by protocol to retain all the
existing key material except the funding key, which we're allowed
to rotate. In the original implementation we acknowledged that but
figured we'd stick with a single `pubkeys` method in the
`ChannelSigner` anyway, because adding a specific method for it is
annoying.

Sadly, this was ultimately broken - in `FundingScope::for_splice`,
we called the signer's `new_pubkeys` method (renamed from `pubkeys`
after splicing initially landed), replacing all of the public keys
the `Channel` would use rather than just the funding key. This can
result in commitment signature mismatches if the signer changes any
keys aside from the funding one.

`InMemorySigner` did not do so, however, so we didn't notice the
bug. Luckily-ish, in 189b8ac we
started generating a fresh `remote_key` when splicing (at least
when upgrading from 0.1 to 0.2 or when setting `KeysManager` to use
v1 `remote_key` derivation). This breaks splicing because we can't
communicate the new `remote_key` to the counterparty during the
splicing handshake.

Ultimately this bug arose because the API we had didn't communicate to
the signer that we weren't allowed to change anything except the
funding key, and allowed returning a `ChannelPublicKeys` which
would break the channel.

Here we fix this by renaming `new_pubkeys` to `pubkeys` again
(partially reverting 9d291e0 but
keeping the changed requirement that `pubkeys` only be called
once) and adding a new `ChannelSigner::new_funding_pubkey` method
specifically for splicing.

We also update `channel.rs` to correctly fetch the new funding
pubkey before sending `splice_init`, storing it in the
`PendingFunding` until we build a `FundingScope`.

Backport of e95ebf8
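
A hedged sketch of the resulting API split; the signatures are illustrative, and LDK's real `ChannelSigner` methods take more context than shown:

```rust
struct ChannelPublicKeys; // stand-in for the channel's full key set
struct FundingPubkey;     // stand-in for a single funding public key

trait ChannelSignerSketch {
    /// Returns the full key set. Per the retained requirement, this may
    /// only be called once: the non-funding keys can never change for the
    /// lifetime of the channel.
    fn pubkeys(&mut self) -> ChannelPublicKeys;

    /// Splicing may rotate *only* the funding key, so the splice path asks
    /// for exactly that and nothing else.
    fn new_funding_pubkey(&self) -> FundingPubkey;
}
```
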
In the previous commit we partially reverted
9d291e0, renaming
`ChannelSigner::new_pubkeys` to `pubkeys` again, but we still don't
want to go back to requiring that `pubkeys` return the same
contents on each call. Thus, here, we add test logic to check that
`pubkeys` isn't called more than once.

Backport of 491b694
`build_commitment_transaction`'s fifth argument is supposed to be
whether we're the one generating the commitment (i.e., because we're
signing rather than validating the commitment). During splicing,
this doesn't matter because there should be no async HTLC
addition/removal happening, so the commitment generated will be the
same in either case, but it's still good to pass the correct bool.

Backport of 0f4e6c2
If an HTLC was forwarded in 0.1, but waiting to be failed back, it
will ultimately be failed by adding it to the
`ChannelManager::pending_forwards` map with the channel's original
SCID. If that channel is spliced between when the HTLC was
forwarded (on 0.1) and when the HTLC is failed back (on 0.2), that
SCID may no longer exist, causing the HTLC fail-back to be lost.

Luckily, delaying when an SCID is expired is cheap - it's just
storing an extra `u64` or two, and generating a new one requires an
on-chain splice - so we simply delay removal of SCIDs for two months,
at which point any incoming HTLCs should have been expired for six
weeks and the counterparty should have force-closed anyway.

Backport of d12c6a3
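
A minimal sketch of the delayed pruning, assuming we record when each legacy SCID was replaced (names illustrative):

```rust
const SCID_RETENTION_SECS: u64 = 60 * 60 * 24 * 60; // roughly two months

// By the time retention expires, any HTLC forwarded over the old SCID has
// been expired for ~six weeks and the peer should have force-closed anyway.
fn should_prune_scid(replaced_at_unix: u64, now_unix: u64) -> bool {
    now_unix.saturating_sub(replaced_at_unix) >= SCID_RETENTION_SECS
}
```
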
`Duration::new` adds any nanoseconds in excess of a second to the
seconds part. This addition can overflow, however, causing a panic.
In 0.2 we introduced a few further cases where we store `Duration`s,
specifically some when handling network messages.

Sadly, that introduced a remotely-triggerable crash where someone
can send us, for example, a malicious blinded path context which
can cause us to panic.

Found by the `onion_message` fuzzer.

Backport of 7b9bde1
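
A sketch of the failure mode and an overflow-safe alternative for untrusted input:

```rust
use std::time::Duration;

// `Duration::new(secs, nanos)` carries whole seconds out of `nanos` and
// panics if adding that carry to `secs` overflows, e.g.:
//     Duration::new(u64::MAX, 1_000_000_000); // panics
fn duration_from_untrusted(secs: u64, nanos: u32) -> Option<Duration> {
    let carry = u64::from(nanos) / 1_000_000_000;
    secs.checked_add(carry)
        .map(|secs| Duration::new(secs, nanos % 1_000_000_000))
}
```
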
Implementations MUST NOT assume any topological order on the
transactions.

While the Bitcoin Core v29+ `submitpackage` RPC allows packages of
length 1 to be submitted, it still requires any package submitted
there to be a `child-with-parents` package.

So we remove the possibility that a batch of transactions passed to
a `BroadcasterInterface` implementation contains unrelated transactions,
or multiple children.

Backport of 7c9b21f
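
A hedged sketch of the resulting invariant over simplified types (a real check would operate on `bitcoin::Transaction`): the batch is either a single transaction or a child preceded only by parents it directly spends.

```rust
struct Tx {
    txid: [u8; 32],
    spends: Vec<[u8; 32]>, // txids of the outputs this tx spends
}

fn is_child_with_parents(batch: &[Tx]) -> bool {
    match batch.split_last() {
        None => false,
        Some((_, [])) => true, // a single transaction is always acceptable
        Some((child, parents)) => {
            // Every earlier transaction must be a direct parent of the last.
            parents.iter().all(|p| child.spends.contains(&p.txid))
        }
    }
}
```
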
We recently ran into a race condition on macOS where `read_event`
would return `Ok(true)` (implying reads should be paused) due to
many queued outbound messages but before the caller was able to
set the read-pause flag, the `send_data` calls to flush the
buffered messages completed. Thus, when the `read_event` caller got
scheduled again, the buffer was empty and we should be reading, but
it is finally processing the read-pause flag and we end up hanging,
unwilling to read messages and unable to learn that we should start
reading again as there are no messages to `send_data` for.

This should be fairly rare, but not unheard of - the `pause_read`
flag in `read_event` is calculated before handling the last
message, so there's some time between when it's calculated and when
it's returned. However, that has to race with multiple calls to
`send_data` to send all the pending messages, which all have to
complete before the `read_event` return happens. We've (as far as I
recall) never hit this in prod, but a benchmark HTLC-flood test
managed to hit it somewhat reliably within a few minutes on macOS
and when a synthetic few-ms sleep was added to each message
handling call.

Ultimately this is an issue with the API - we pause reads via a
returned flag but unpause them via a called method, creating two
independent "stream"s of pause/unpauses which can get out of sync.
Thus, here, we stick to a single "stream" of pause-read events from
`PeerManager` to user code via `send_data` calls, dropping the
read-pause flag return from `read_event` entirely.

Technically this adds risk that someone can flood us with enough
messages fast enough to bloat our outbound buffer for a peer before
`PeerManager::process_events` gets called and can flush the pause
flag via `read_event` calls to all descriptors. This isn't ideal
but it should still be relatively hard to do as `process_events`
calls are pretty quick and should be triggered immediately after
each `read_event` call completes.

Backport of c0855d8
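
A hedged sketch of the single-stream model; the trait below is illustrative, not LDK's actual `SocketDescriptor`:

```rust
trait SocketDescriptorSketch {
    /// `continue_read == false` pauses reads; `true` resumes them. With
    /// every pause/unpause decision flowing through this single call, the
    /// pause state can no longer race against a flag returned separately
    /// from `read_event`.
    fn send_data(&mut self, data: &[u8], continue_read: bool) -> usize;
}
```
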
In the previous commit, we moved the `send_data` `resume_read` flag
to also indicate that we should pause if it's unset. This should
work, as we mostly only set the flag when we're sending, but it may
cause us to fail to pause if we are blocked on gossip validation
but `awaiting_write_event` wasn't set because we had previously failed
to fully flush a buffer (which no longer implies read-pause).

Here we make this logic much more robust by ensuring we always make
at least one `send_data` call in `do_attempt_write_data` if we
need to pause read (or unpause read).

Backport of 77cbb61
This fixes incorrect docs and comments introduced by
e95ebf8 and
491b694

Backport of ab21871
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-0.2-backports branch from 92e0954 to 860cff9 on October 29, 2025 19:11
@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.


@tankyleo tankyleo left a comment

Reproduced the backport on my machine.

`./ci/ci-tests.sh` passes locally with rustc 1.90.0.

@TheBlueMatt TheBlueMatt merged commit 1b4d8e4 into lightningdevkit:0.2 Oct 30, 2025
25 checks passed