Defer LSPS4 HTLC forwarding until channel usable #6

Merged: amackillop merged 1 commit into lsp-0.2.0 from austin_mdk-608_reprocess-pending-htlcs on Mar 11, 2026

@amackillop commented Mar 11, 2026

LDK fires peer_connected after the TCP+Init handshake but before
channel_reestablish completes. During this window, channels exist
but are not yet usable (is_usable=false). The previous code
forwarded HTLCs unconditionally in peer_connected and
htlc_intercepted, causing ~10% of payments to fail on reconnect.

Distinguish two connected-peer cases in htlc_intercepted and
peer_connected:

  • Channels exist but none usable (reestablish in progress): defer
    the HTLC for timer-based retry via process_pending_htlcs().
  • No channels exist (first payment, JIT open needed): proceed
    immediately so calculate_htlc_actions_for_peer can emit the
    OpenChannel event.
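
The three-way decision for a connected peer can be sketched as follows. This is a minimal illustration, not the code from this PR: `ChannelView`, `ConnectedPeerAction`, and `action_for_connected_peer` are hypothetical names, and the only detail taken from LDK is that a channel carries an `is_usable` flag.

```rust
/// Hypothetical, simplified view of a channel. Only the `is_usable` flag
/// matters for this decision; all other channel details are omitted.
#[derive(Debug, Clone, Copy)]
pub struct ChannelView {
    pub is_usable: bool,
}

/// What to do with an intercepted HTLC for a peer that is connected.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ConnectedPeerAction {
    /// At least one channel is usable: forward now.
    Forward,
    /// Channels exist but none is usable yet (reestablish in progress):
    /// keep the HTLC and let the timer retry later.
    Defer,
    /// No channels at all: proceed immediately so a JIT open can be requested.
    OpenChannel,
}

/// Classify the peer's channel list into one of the three cases.
pub fn action_for_connected_peer(channels: &[ChannelView]) -> ConnectedPeerAction {
    if channels.is_empty() {
        ConnectedPeerAction::OpenChannel
    } else if channels.iter().any(|c| c.is_usable) {
        ConnectedPeerAction::Forward
    } else {
        ConnectedPeerAction::Defer
    }
}
```

Checking the empty case first keeps the "no channels" and "channels exist but none usable" cases from collapsing into a single "nothing usable" branch, which is the distinction the change relies on.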

process_pending_htlcs (5s timer) retries only the reestablish
case (channels exist, waiting to become usable). It must not
handle the no-channel case; doing so would emit duplicate
OpenChannel events while a JIT open is already in flight.
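
One pass of such a timer could look roughly like the sketch below. It assumes a hypothetical in-memory layout (peer id mapped to deferred HTLC ids, and peer id mapped to one usability flag per channel); `retry_deferred_htlcs` is an illustrative name, not the function in this PR.

```rust
use std::collections::BTreeMap;

/// One hypothetical timer tick.
///
/// `pending` maps a peer id to its deferred HTLC ids. `channel_usable` maps a
/// peer id to one `is_usable` flag per channel with that peer. Returns the
/// HTLC ids released for forwarding on this tick.
pub fn retry_deferred_htlcs(
    pending: &mut BTreeMap<String, Vec<u64>>,
    channel_usable: &BTreeMap<String, Vec<bool>>,
) -> Vec<u64> {
    let mut forwarded = Vec::new();

    for (peer, htlcs) in pending.iter_mut() {
        let flags = channel_usable
            .get(peer)
            .map(Vec::as_slice)
            .unwrap_or(&[]);

        if flags.is_empty() {
            // No channels: a JIT open may already be in flight, so the timer
            // leaves these HTLCs alone to avoid a duplicate open request.
            continue;
        }

        if flags.iter().any(|usable| *usable) {
            // A channel became usable: release everything deferred for this peer.
            forwarded.append(htlcs);
        }
        // Otherwise the channels are still reestablishing; wait for the next tick.
    }

    // Drop peers whose deferred HTLCs were all released.
    pending.retain(|_, htlcs| !htlcs.is_empty());
    forwarded
}
```

In this sketch the no-channel entries simply stay in `pending`; whichever path owns the JIT open (here assumed to be a separate handler) is responsible for releasing them.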

Remove the re-check race in htlc_intercepted that could fire
concurrently with peer_connected. The webhook + peer_connected
path is the single owner of the offline-peer reconnect flow.

Blocking inside peer_connected was also considered but rejected:
there is no LDK event for "channel usable after reconnect" to
wake on, so it would require a spin-wait with an arbitrary timeout.
A timer-based retry is cleaner and avoids holding the lock.

Summary:

| Path | Usable channels | Channels reestablishing | No channels (JIT) |
| --- | --- | --- | --- |
| htlc_intercepted (connected) | Forward | Defer | Trigger OpenChannel |
| htlc_intercepted (offline) | Store + webhook | | |
| peer_connected | Forward | Defer | Trigger OpenChannel |
| process_pending_htlcs (timer) | Forward | Wait | Skip (no duplicate) |
| channel_ready | Forward | | |

@amackillop force-pushed the austin_mdk-608_reprocess-pending-htlcs branch 6 times, most recently from 1728656 to e737493 (March 11, 2026 15:32)
@amackillop force-pushed the austin_mdk-608_reprocess-pending-htlcs branch from e737493 to e81d9dd (March 11, 2026 15:38)
@amackillop merged commit c14b445 into lsp-0.2.0 on Mar 11, 2026
12 of 43 checks passed