
multi: make sure HTLCs are locked in the itest #9602

Merged
Roasbeef merged 7 commits into master from yy-pending-remote-commit on Mar 20, 2025
Conversation

@yyforyongyu

Member

Depends on,

Fixes #9486. The following failure has been seen multiple times in our CI and is believed to be a flake:

--- FAIL: TestLightningNetworkDaemon/tranche03/116-of-272/btcd/multihop-local_preimage_claim_simple_taproot (61.10s)
        harness_node.go:403: Starting node (name=Alice) with PID=9484
        harness_node.go:403: Starting node (name=Bob) with PID=9487
        harness_node.go:403: Starting node (name=Carol) with PID=9496
        lnd_multi-hop_force_close_test.go:2316: Invoice expire height: 637, current: 624
        harness_assertion.go:2767: 
                Error Trace:    /Users/runner/work/lnd/lnd/lntest/harness_assertion.go:2767
                                                        /Users/runner/work/lnd/lnd/itest/lnd_multi-hop_force_close_test.go:2335
                                                        /Users/runner/work/lnd/lnd/itest/lnd_multi-hop_force_close_test.go:2196
                                                        /Users/runner/work/lnd/lnd/lntest/harness.go:304
                                                        /Users/runner/work/lnd/lnd/itest/lnd_test.go:130
                Error:          Received unexpected error:
                                want 1 , got 2, sweeps: 
                                op=c8fdc20924a2b0b58b7ae7f7436f8f6a07fb17dc9ebcc5d9f292560034ecec80:1, amt=330, type=TAPROOT_ANCHOR_SWEEP_SPEND, deadline=1637
                                op=e28f21097e2edecbd44be1c2e1e8ef7cd7c937753cdb7439600e967bcc3f7e14:1, amt=330, type=TAPROOT_ANCHOR_SWEEP_SPEND, deadline=635
                Test:           TestLightningNetworkDaemon/tranche03/116-of-272/btcd/multihop-local_preimage_claim_simple_taproot
                Messages:       Carol: check pending sweeps timeout
        harness.go:382: finished test: multihop-local_preimage_claim_simple_taproot, start height=610, end height=630, mined blocks=20
        harness.go:341: test failed, skipped cleanup
    lnd_test.go:138: Failure time: 2025-03-10 09:15:23.977

To make sure we can properly assert HTLC states in our tests, we now add a new field LockedIn in lnrpc.HTLC. When an HTLC is added to the local commit, we need to wait until the remote has sent us the revoke_and_ack, so that we know for sure the HTLC is locked in before proceeding in our tests.

In addition, when processing outside resolutions in our link, we now catch the link's quit signal so we don't create an unexpected pending commit.

@yyforyongyu yyforyongyu added the rpc (Related to the RPC interface), htlcswitch, flake fix, and size/micro (small bug fix or feature, less than 15 mins of review, less than 250) labels Mar 11, 2025
@yyforyongyu yyforyongyu added this to the v0.19.0 milestone Mar 11, 2025
@guggero guggero self-requested a review March 11, 2025 14:34

@guggero guggero left a comment

Collaborator

Thanks for squashing more flake bugs!

Comment thread contractcourt/channel_arbitrator.go Outdated
Comment thread htlcswitch/link.go
Comment thread lnrpc/lightning.proto Outdated
Comment thread lntest/harness_assertion.go Outdated
}

flakeSkipPendingSweepsCheckDarwin(ht, bob, numSweeps)
ht.AssertNumPendingSweeps(bob, numSweeps)

Collaborator

Hmm, sorry if this is a misunderstanding on my end. But how does AssertNumPendingSweeps relate to the other changes in AssertNumActiveHtlcs in this commit?

@yyforyongyu yyforyongyu Mar 12, 2025

Member Author

Good q - it's indeed deeply hidden, and I think I missed adding extra AssertNumActiveHtlcs checks, so I added them. I also reordered the last two commits to make it clearer. Long story short, they share a similar cause, though they differ in the details. I commented on each of the cases independently to make them clearer.

}

flakeSkipPendingSweepsCheckDarwin(ht, bob, numSweeps)
ht.AssertNumPendingSweeps(bob, numSweeps)

@yyforyongyu yyforyongyu Mar 12, 2025

Member Author

Case one: Prior to the LockedIn check, Bob could still have an HTLC on his pending remote commit here if the test ran quickly, causing him to create another request that sweeps his anchor output from the pending remote commit, which fails the test.

}

flakeSkipPendingSweepsCheckDarwin(ht, carol, numSweeps)
ht.AssertNumPendingSweeps(carol, numSweeps)

@yyforyongyu yyforyongyu Mar 12, 2025

Member Author

Case two: Similar to the above case, Carol needs to make sure the HTLC has locked in at L737 so there's no pending remote commit. In addition, we assert the HTLC is still there when she settles it with the preimage. However, this AssertNumActiveHtlcs only asserts that the HTLC is still there; it doesn't assert that the HTLC is now settled with a preimage, which means the new commitment dance may still be in progress when we check her sweep requests. This happened because we would previously create another pending remote commit for this settlement even though the link was already broken. This behavior is now fixed in the previous commit: we skip it when the link shuts down.

}

flakeSkipPendingSweepsCheckDarwin(ht, carol, numSweeps)
ht.AssertNumPendingSweeps(carol, numSweeps)

Member Author

This is the same case as case two.

}

flakeSkipPendingSweepsCheckDarwin(ht, bob, numSweeps)
ht.AssertNumPendingSweeps(bob, numSweeps)

Member Author

This is the same as case one.

@guggero guggero left a comment

Collaborator

Thank you, LGTM 🎉

@saubyk saubyk requested a review from Crypt-iQ March 17, 2025 20:40
@yyforyongyu yyforyongyu force-pushed the yy-pending-remote-commit branch from c26e0eb to 36d0ba6 Compare March 18, 2025 12:22
@yyforyongyu yyforyongyu force-pushed the yy-pending-remote-commit branch from 36d0ba6 to bcc80e7 Compare March 18, 2025 12:24
@saubyk saubyk requested a review from Roasbeef March 18, 2025 16:36
@yyforyongyu yyforyongyu changed the base branch from yy-more-flakes to master March 20, 2025 19:23
In this commit, we add a new field `LockedIn` on HTLCs so it can be used
to decide whether an HTLC found on the local commitment has been
committed on the remote commitment.

This commit makes sure when processing resolutions, e.g., settling
invoices, when the link is already broken, the process would exit with
an error. This fixes the issue we found in the itest, where an
unexpected empty remote pending commitment was created although the
remote peer is already offline.

Now that we have the new RPC to assert the HTLC state, this flake should
be fixed.
@yyforyongyu yyforyongyu force-pushed the yy-pending-remote-commit branch from bcc80e7 to 3b7f9e1 Compare March 20, 2025 19:25

@ziggie1984 ziggie1984 left a comment

Collaborator

I think we need to extend the key for the HTLC map with the HTLC index; otherwise good to go.

Comment thread lnrpc/lightning.proto
uint64 forwarding_htlc_index = 7;

/*
Whether the HTLC is locked in. An HTLC is considered locked in when the

Collaborator

Q: So are we talking about locked_in for outgoing HTLCs here, and would that mean an HTLC is considered locked_in even if it's only on one commitment transaction, e.g. when sending the HTLCs out? My understanding of locked-in was always that the HTLC has to be committed on both sides' commitment transactions?

Comment thread rpcserver.go
// used to decide whether the HTLCs from the local commitment has been
// locked in or not.
remoteHTLCs := fn.NewSet[[32]byte]()
for _, htlc := range dbChannel.RemoteCommitment.Htlcs {

Collaborator

I think we can have multiple HTLCs with the same hash (MPP), in which case we cannot rely on this information? We probably also need to add the htlcIndex to the key here.

Member

IIUC this is used mainly in the itests right now, and in all cases we end up using unique payment hashes.

Collaborator

Created a short test; it seems like they end up with the same hash:

{
    "channels": [
        {
            "active": true,
            "remote_pubkey": "02373b8ebde6e9c5c0398a3bdababc2182530af63679fd166d9adf18fa6184379c",
            "channel_point": "ac23a5759face3e0379a1895b8b08ff4338a9830b89d9d4b19f79b6a4fe1ad74:0",
            "chan_id": "74ade14f6a9bf7194b9d9db830988a33f48fb0b895189a37e0e3ac9f75a523ac",
            "scid": "359540302348288",
            "scid_str": "327x1x0",
            "capacity": "10000000",
            "local_balance": "9495530",
            "remote_balance": "500000",
            "commit_fee": "3810",
            "commit_weight": "1116",
            "fee_per_kw": "2500",
            "unsettled_balance": "1000",
            "total_satoshis_sent": "0",
            "total_satoshis_received": "0",
            "num_updates": "32",
            "pending_htlcs": [
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "100",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "101",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "102",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "103",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "104",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "105",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "106",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "107",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "108",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                },
                {
                    "incoming": false,
                    "amount": "100",
                    "hash_lock": "4ee4f8273264c66d5065cec2098906e826cb6aa6ff5381c78cf886dea1813e6f",
                    "expiration_height": 427,
                    "htlc_index": "109",
                    "forwarding_channel": "0",
                    "forwarding_htlc_index": "0"
                }
            ],
            "csv_delay": 1201,
            "private": false,
            "initiator": true,
            "chan_status_flags": "ChanStatusDefault",
            "local_chan_reserve_sat": "100000",
            "remote_chan_reserve_sat": "100000",
            "static_remote_key": false,
            "commitment_type": "ANCHORS",
            "lifetime": "1058",
            "uptime": "1058",
            "close_address": "",
            "push_amount_sat": "500000",
            "thaw_height": 0,
            "local_constraints": {
                "csv_delay": 1201,
                "chan_reserve_sat": "100000",
                "dust_limit_sat": "354",
                "max_pending_amt_msat": "9900000000",
                "min_htlc_msat": "1",
                "max_accepted_htlcs": 483
            },
            "remote_constraints": {
                "csv_delay": 1201,
                "chan_reserve_sat": "100000",
                "dust_limit_sat": "354",
                "max_pending_amt_msat": "9900000000",
                "min_htlc_msat": "1",
                "max_accepted_htlcs": 483
            },
            "alias_scids": [],
            "zero_conf": false,
            "zero_conf_confirmed_scid": "0",
            "peer_alias": "My Lightning ☇",
            "peer_scid_alias": "0",
            "memo": "",
            "custom_channel_data": ""
        }
    ]
}

Collaborator

just add the htlc_index to the key and we are gtg
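
To make the concern in this thread concrete, here is a small Go sketch comparing a hash-only lookup with a lookup keyed by hash plus HTLC index. All types and function names are hypothetical and are not the rpcserver.go code; the sketch also assumes the same index identifies an HTLC on both the local and the remote commitment.

```go
package main

import "fmt"

// htlc is a hypothetical, minimal view of an HTLC on a commitment.
type htlc struct {
	hash  [32]byte
	index uint64
}

// htlcKey is a hypothetical composite key: payment hash plus HTLC index.
type htlcKey struct {
	hash  [32]byte
	index uint64
}

// lockedInByHash counts local HTLCs whose payment hash appears on the
// remote commitment. HTLCs sharing a hash (e.g. MPP shards) collide.
func lockedInByHash(local, remote []htlc) int {
	seen := make(map[[32]byte]struct{}, len(remote))
	for _, h := range remote {
		seen[h.hash] = struct{}{}
	}

	n := 0
	for _, h := range local {
		if _, ok := seen[h.hash]; ok {
			n++
		}
	}

	return n
}

// lockedInByKey counts local HTLCs whose (hash, index) pair appears on the
// remote commitment, so shards with the same hash stay distinct.
func lockedInByKey(local, remote []htlc) int {
	seen := make(map[htlcKey]struct{}, len(remote))
	for _, h := range remote {
		seen[htlcKey{hash: h.hash, index: h.index}] = struct{}{}
	}

	n := 0
	for _, h := range local {
		if _, ok := seen[htlcKey{hash: h.hash, index: h.index}]; ok {
			n++
		}
	}

	return n
}

func main() {
	var hash [32]byte
	hash[0] = 0x4e

	// Two shards share one payment hash, but only the first has reached
	// the remote commitment.
	local := []htlc{{hash: hash, index: 100}, {hash: hash, index: 101}}
	remote := []htlc{{hash: hash, index: 100}}

	fmt.Println("by hash:", lockedInByHash(local, remote))
	fmt.Println("by hash+index:", lockedInByKey(local, remote))
}
```

With the hash-only set, the second shard is reported as locked in even though it is missing from the remote commitment; the composite key avoids that.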

@@ -2155,6 +2155,20 @@ func (c *ChannelArbitrator) checkRemoteDanglingActions(
continue
}

Collaborator

Commit msg: what do you mean by an empty remote pending commitment? A commitment without HTLCs? Why was it sent in the first place, was it a feeUpdate?

@Roasbeef Roasbeef left a comment

Member

LGTM 🐦


@Roasbeef Roasbeef merged commit e8875e0 into master Mar 20, 2025
@yyforyongyu yyforyongyu deleted the yy-pending-remote-commit branch March 21, 2025 00:17

Development

Successfully merging this pull request may close these issues.

[bug]: force close from settled HTLC being falsely considered as dangling

4 participants