
Drop the need for fork headers when calling Listen's disconnect #3876


Open

TheBlueMatt wants to merge 7 commits into main from 2025-06-disconnect-less-info

Conversation

TheBlueMatt
Collaborator

The Listen::block_disconnected method is nice in that listeners
learn about each block disconnected in series. Further, it includes
the header of the block that is being disconnected to allow the
listeners to do some checking that the interface is being used
correctly (namely, asserting that the header's block hash matches
their current understanding of the best chain).

However, this interface has some substantial drawbacks. Namely, the
requirement that fork headers be passed in means that restarting
with a new node that has no idea about a previous fork leaves us
unable to replay the chain at all. Further, while learning about
each block disconnected in series seemed useful when the various
listeners were initially written, we no longer rely on that anyway
because the Confirm interface does not allow for it.

Thus, here, we replace Listen::block_disconnected with a new
Listen::blocks_disconnected, taking only information about the
fork point/new best chain tip (in the form of its block hash and
height) rather than information about previous fork blocks and only
requiring a single call to complete multiple block disconnections
during a reorg.

This requires removing some assertions on block disconnection
ordering, but because we now provide lightning-block-sync and
expect users to use it when using the Listen interface, these
assertions are much less critical.

@ldk-reviews-bot

ldk-reviews-bot commented Jun 18, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz June 18, 2025 21:12
@TheBlueMatt TheBlueMatt force-pushed the 2025-06-disconnect-less-info branch from b045453 to 5942c8b Compare June 18, 2025 21:25
The `Listen::block_disconnected` method is nice in that listeners
learn about each block disconnected in series. Further, it includes
the header of the block that is being disconnected to allow the
listeners to do some checking that the interface is being used
correctly (namely, asserting that the header's block hash matches
their current understanding of the best chain).

However, this interface has some substantial drawbacks. Namely, the
requirement that fork headers be passed in means that restarting
with a new node that has no idea about a previous fork leaves us
unable to replay the chain at all. Further, while learning about
each block disconnected in series seemed useful when the various
listeners were initially written, we no longer rely on that anyway
because the `Confirm` interface does not allow for it.

Thus, here, we replace `Listen::block_disconnected` with a new
`Listen::blocks_disconnected`, taking only information about the
fork point/new best chain tip (in the form of its block hash and
height) rather than information about previous fork blocks and only
requiring a single call to complete multiple block disconnections
during a reorg.

We also swap to using a single `BestBlock` to describe the new
chain tip, in anticipation of future extensions to `BestBlock`.

This requires removing some assertions on block disconnection
ordering, but because we now provide `lightning-block-sync` and
expect users to use it when using the `Listen` interface, these
assertions are much less critical.
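
For illustration, a rough sketch of what the described trait change might look like. The parameter name and the exact `BestBlock` fields shown here are assumptions for the sketch, not the PR's final API:

```rust
use bitcoin::BlockHash;

/// Assumed minimal shape of `BestBlock` for this sketch: a block hash plus height.
pub struct BestBlock {
    pub block_hash: BlockHash,
    pub height: u32,
}

pub trait Listen {
    // ...block connection methods elided...

    /// Old shape (removed by this PR): one call per disconnected block, passing
    /// the disconnected block's own header:
    ///
    /// `fn block_disconnected(&self, header: &bitcoin::block::Header, height: u32);`
    ///
    /// New shape: one call per reorg, identifying only the fork point / new best
    /// chain tip that listeners should roll back to.
    fn blocks_disconnected(&self, new_best_block: BestBlock);
}
```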
@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@ldk-reviews-bot

✅ Added second reviewer: @joostjager

@tnull tnull requested review from tnull and removed request for joostjager June 19, 2025 16:11

codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 85.34483% with 17 lines in your changes missing coverage. Please review.

Project coverage is 90.14%. Comparing base (9db0fb2) to head (a6985eb).
Report is 86 commits behind head on main.

Files with missing lines | Patch % | Lines
lightning/src/chain/mod.rs | 0.00% | 5 Missing ⚠️
lightning-block-sync/src/test_utils.rs | 57.14% | 3 Missing ⚠️
lightning/src/util/sweep.rs | 0.00% | 3 Missing ⚠️
lightning-liquidity/src/manager.rs | 0.00% | 2 Missing ⚠️
lightning/src/chain/channelmonitor.rs | 88.23% | 2 Missing ⚠️
lightning-block-sync/src/init.rs | 90.00% | 1 Missing ⚠️
lightning/src/ln/functional_tests.rs | 66.66% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3876      +/-   ##
==========================================
+ Coverage   89.76%   90.14%   +0.37%     
==========================================
  Files         164      165       +1     
  Lines      133048   133580     +532     
  Branches   133048   133580     +532     
==========================================
+ Hits       119434   120413     +979     
+ Misses      10937    10700     -237     
+ Partials     2677     2467     -210     


@TheBlueMatt TheBlueMatt force-pushed the 2025-06-disconnect-less-info branch from b06cbd4 to a54159c Compare June 19, 2025 18:47
jkczyz previously approved these changes Jun 20, 2025
@ldk-reviews-bot

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Contributor

@tnull tnull left a comment


Changes LGTM, but while we're here it would be good to add some details regarding the implicit assumptions on the Listen docs, and adding (back) some asserts/debug_asserts would be appreciated.

@TheBlueMatt TheBlueMatt requested review from jkczyz and tnull June 27, 2025 21:01
@ldk-reviews-bot

🔔 1st Reminder

Hey @tnull @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

1 similar comment

///
/// Each block must be connected in chain order with one call to either
/// [`Listen::block_connected`] or [`Listen::filtered_block_connected`]. If a call to the
/// [`Filter`] interface was made during block processing and further transaction(s) from the same
Contributor

Hmm, is this a new requirement? I imagine it's very hard to reliably implement this: keep track of all registered Txids/outputs, then match them on a block, apply the block, then check again if something new was registered, filter the block again, then potentially call once more. Also, why do we require it only here, but not on Confirm?

Collaborator Author

This has always been the case, just poorly documented. It is, indeed, not the case on Confirm (which doesn't require "blocks" to be connected in-order at all, though it does require all transactions to be topologically sorted).
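
As a rough illustration of the re-filtering behavior discussed above, here is a sketch of one way a block source could drive a listener with filtered blocks. The types are stand-ins, not LDK's actual `Listen`/`Filter` traits:

```rust
// A minimal sketch of the re-filtering loop, using stand-in types; the real
// `Listen` and `Filter` traits live in the `lightning` crate and have richer
// signatures than these.
struct Block {
    header: [u8; 80],
    txs: Vec<Vec<u8>>,
}

trait Listener {
    /// Stand-in for `Listen::filtered_block_connected(&self, header, txdata, height)`.
    fn filtered_block_connected(&self, header: &[u8; 80], txdata: &[Vec<u8>], height: u32);
}

trait FilterTracker {
    /// Returns the transactions in `block` that match registrations made since the
    /// last call, remembering which registrations it has already matched against.
    fn match_new_interest(&mut self, block: &Block) -> Vec<Vec<u8>>;
}

fn connect_filtered_block<L: Listener, F: FilterTracker>(
    listener: &L, filter: &mut F, block: &Block, height: u32,
) {
    // Deliver whatever matches the registrations known before this block. The call
    // is made exactly once per block even if nothing matched.
    let mut matched = filter.match_new_interest(block);
    loop {
        listener.filtered_block_connected(&block.header, &matched, height);
        // Processing may have registered new txids/outputs via the filter; if so,
        // filter the same block again and deliver the extra matches with the same
        // header, repeating until nothing new turns up.
        matched = filter.match_new_interest(block);
        if matched.is_empty() {
            break;
        }
    }
}
```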

}
self.chain_listener.block_disconnected(&header.header, header.height);
}
if let Some(block) = disconnected_blocks.last() {
Contributor

If we're now happy to use the fork point, why are we keeping a Vec of all disconnected blocks around at all? Shouldn't we now be able to convert this to the new behavior entirely? Seems ChainDifference::common_ancestor would already provide what we need, no?

Collaborator Author

It's needed by the cache, which I didn't want to change in this PR - #3876 (comment)

fn disconnect_blocks(&mut self, mut disconnected_blocks: Vec<ValidatedBlockHeader>) {
for header in disconnected_blocks.drain(..) {
fn disconnect_blocks(&mut self, disconnected_blocks: Vec<ValidatedBlockHeader>) {
for header in disconnected_blocks.iter() {
if let Some(cached_header) = self.header_cache.block_disconnected(&header.block_hash) {
Contributor

Should we change the Cache API accordingly?

Collaborator Author

We could, but it seems somewhat disjoint from this PR, given there's further work to do in lightning-block-sync anyway, so I kinda wanted to keep it as small as can be.

Contributor

Mhh, might be good to document somewhere what steps you deem left before you'd consider the transition to this new approach complete. I guess all of them should land before the next release then, to not end up with a half-migrated codebase in the release?

Collaborator Author

I mean the API/stuff in lightning is in a perfectly fine state after this PR, and lightning-block-sync just deserves a bit of efficiency in dropping a vec and improving the cache afterwards. This PR obviously doesn't accomplish the goals that #3600 set out to, but is just a step towards it (the "complicated" step that requires making sure all the lightning-internal stuff isn't broken by changing the API).

AFAIU, to get to the point that bitcoind (and all) syncs support swapping to a new chain during a reorg and don't fetch a pile of redundant blocks we need to:

  • make BestBlock contain a list of 6(ish?) block hashes, not one (see the sketch after this list)
  • use that list in lightning-block-sync (and eventually lightning-transaction-sync) to do the sync without leaning on the cache
  • expand the runtime caching to cache partial chains across clients in the init sync logic
  • (eventually) ensure lightning-block-sync doesn't have any spare vecs or whatever lying around.
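
For concreteness, a hypothetical sketch of that first bullet; the field name and the depth of the list here are made up for illustration, not settled API:

```rust
use bitcoin::BlockHash;

/// Hypothetical extension of `BestBlock`: carry the last handful of block hashes
/// instead of just the tip, so a syncer can locate a common ancestor after a
/// reorg without leaning on a header cache.
pub struct BestBlock {
    /// Hashes of the most recent blocks, newest first; the first entry is the tip.
    pub recent_block_hashes: Vec<BlockHash>,
    /// Height of the tip, i.e. of the first entry in `recent_block_hashes`.
    pub height: u32,
}
```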

Contributor

@tnull tnull Jul 3, 2025

I still think it would make sense to open an issue for this larger refactoring you're planning, as there are connected aspects to discuss that don't really belong in this PR.

make BestBlock contain a list of 6(ish?) block hashes, not one

For example, how do we imagine this to work exactly, especially for the Confirm/lightning-transaction-sync case?

For Esplora, we currently retrieve the tip hash, then retrieve the header and the block status (including the height), before we call best_block_updated, which is already 3 costly RTTs. If we now required tracking the 6 most-recent hashes, we'd need to make at least 5 more subsequent calls (bringing it to at least 8 RTTs per sync round) to retrieve the hashes at [(height-5)..(height-1)]. But, given that these are individual calls, there is no way to tell if any reorg happened during some of these calls, so they are inherently race-y.

For Electrum, the results would be very similar for the polling client in its current form. Note we eventually want to switch the client to a streaming version making use of Electrum's subscription model though, and requiring 'last 6 blocks' would probably require us to resort to costly polling again.

If we'd otherwise extend the client to start tracking more of the chain state (i.e., actually tracking a local chain 6-suffix) across rounds this would complicate the client logic quite a bit and make it even more susceptible to race conditions.

TLDR: I still maintain all of this would be/is resulting in a major refactor across our chain syncing crates and logic, and it would be great to discuss what we imagine to be involved and the corresponding trade-offs before we just jump into it head first.

`OnchainTxHandler` is an internal struct and doesn't implement
`Listen`, but it's still nice to have its API mirror the `Listen`
API so that internal code all looks similar.
Now that the `Listen` interface allows blocks to be disconnected in
batches rather than one at a time, we should test this. Here we add
a new `ConnectStyle` for the functional test framework which tests
doing so.
When calling `Channel::best_block_updated` we pass it the timestamp
of the block we're connecting so that it can track the highest
timestamp it has seen.

However, in some cases, we don't actually have a timestamp to pass,
which `Channel::best_block_updated` will happily ignore as it
always takes the `max` with its existing value. Thus, we really
should pass `None` to keep the API understandable, which we
do here.
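
A tiny stand-in for the tracking behavior described above; the function and parameter below are illustrative only, not `Channel`'s actual internals:

```rust
// The tracked timestamp only ever moves forward; callers that have no real block
// timestamp can pass `None` instead of a dummy value that the `max` would discard
// anyway.
fn track_highest_seen_timestamp(highest_seen: &mut u32, block_time: Option<u32>) {
    if let Some(time) = block_time {
        *highest_seen = core::cmp::max(*highest_seen, time);
    }
}
```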
`Listen` is somewhat quiet on high-level use and even requirements,
which we document further here.
@ldk-reviews-bot

🔔 2nd Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Contributor

@tnull tnull left a comment


rustfmt is unhappy right now

@TheBlueMatt TheBlueMatt force-pushed the 2025-06-disconnect-less-info branch from ef73b0a to a6985eb Compare July 2, 2025 22:37
4 participants