feat: Make MonitorUpdatingPersister change persist type based on size #3834

Open
wants to merge 3 commits into
base: main
from

Conversation

Prabhat1308
Contributor

fixes #3770

  • Skips incremental update persistence when the serialized ChannelMonitor is smaller than a predetermined size; the full monitor is written instead.
  • Adds a field minimum_monitor_size_for_updates that specifies the minimum serialized monitor size at which update-based persistence is used.
  • Adds a new_with_default_threshold function to set up MonitorUpdatingPersister with a default minimum_monitor_size_for_updates value.
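
The size check described in the list above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `SizePolicy`, `PersistKind`, `choose`, and the constant name are hypothetical, and only `minimum_monitor_size_for_updates` and `new_with_default_threshold` are names taken from the description. The 8192-byte default is an assumption based on the value suggested later in this thread.

```rust
/// Hypothetical outcome of the size check; not an LDK type.
#[derive(Debug, PartialEq)]
enum PersistKind {
    /// Rewrite the whole serialized monitor.
    FullMonitor,
    /// Write only the incremental update.
    Update,
}

/// Assumed default, matching the 8192 bytes suggested later in this thread.
const DEFAULT_MINIMUM_MONITOR_SIZE_FOR_UPDATES: usize = 8192;

/// Hypothetical stand-in for the threshold-related part of the persister.
struct SizePolicy {
    minimum_monitor_size_for_updates: usize,
}

impl SizePolicy {
    /// Builds a policy with an explicit threshold in bytes.
    fn new(minimum_monitor_size_for_updates: usize) -> Self {
        Self { minimum_monitor_size_for_updates }
    }

    /// Builds a policy with the assumed default threshold.
    fn new_with_default_threshold() -> Self {
        Self::new(DEFAULT_MINIMUM_MONITOR_SIZE_FOR_UPDATES)
    }

    /// Monitors strictly below the threshold are always written in full;
    /// larger monitors use incremental updates.
    fn choose(&self, serialized_monitor_len: usize) -> PersistKind {
        if serialized_monitor_len < self.minimum_monitor_size_for_updates {
            PersistKind::FullMonitor
        } else {
            PersistKind::Update
        }
    }
}
```

With a threshold of 0 this sketch never forces a full write, which would correspond to the behaviour before this change.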

@ldk-reviews-bot

ldk-reviews-bot commented Jun 7, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@tnull tnull requested review from tnull and removed request for joostjager June 9, 2025 07:24
/// For small channel monitors (below `minimum_monitor_size_for_updates` bytes when serialized),
/// this persister will always write the full monitor instead of individual updates. This avoids
/// the overhead of managing update files and later compaction for tiny monitors that don't benefit
/// from differential updates.
Contributor

It is still not clear to me how much this gains in practice. I'm also worried that disabling the incremental path initially allows certain bugs to linger for longer, simply because that path is hit less often, or only rarely.

Contributor

@tnull tnull left a comment

Mostly looks good, but can we add some test coverage for the new behavior?

Additionally, benchmarks would indeed be very helpful to evaluate what a reasonable threshold value would be.

@Prabhat1308 Prabhat1308 force-pushed the probot/change_persist_type branch from ac6966e to c6ad41b Compare June 11, 2025 17:52
Contributor

@tnull tnull left a comment

Please rewrite your commit history so that it's clear what are feature and fixup commits. Each commit message should have a clear headline followed by some paragraph(s) describing the change, where it happened, and why it was necessary. For guidance, please take a look at https://cbea.ms/git-commit/

Introduces an optimization to the MonitorUpdatingPersister to
avoid writing differential updates for small channel monitors.

When a channel monitor is smaller than a configurable threshold,
the persister will now write the full monitor instead of an update.
This avoids the I/O overhead of creating and managing many small update
files for monitors that don't benefit significantly from differential updates.
Adds a unit test for the newly introduced size-based optimisation. Also
updates the old unit test to use a threshold value and to use the
constructors, which increases test coverage.
@Prabhat1308 Prabhat1308 force-pushed the probot/change_persist_type branch from 357ce4b to b313e39 Compare June 12, 2025 09:54
@Prabhat1308
Contributor Author

I have created a minimal benchmarking setup here for this.
https://github.com/Prabhat1308/rust-lightning/tree/probot/benchmark

From this I find that update-based persistence reduces performance, which is the complete opposite of what is claimed in the issue. [The benchmark makes a lot of assumptions about update sizes and monitor sizes.]

@TheBlueMatt
Collaborator

I'm not sure we can conclude all that much from a local-filesystem benchmark, sadly. We have several users (and hopefully more soon with ldk-node) who use the MonitorUpdatingPersister with remote storage, where costs can be very different from local (eg IP packet size bound). I'd say we use a threshold of 8192 for now and call it a day.

@tnull
Contributor

tnull commented Jun 16, 2025

I have created a minimal benchmarking setup here for this. https://github.com/Prabhat1308/rust-lightning/tree/probot/benchmark

From this I find that update-based persistence reduces performance, which is the complete opposite of what is claimed in the issue. [The benchmark makes a lot of assumptions about update sizes and monitor sizes.]

Thanks for taking a first look at these benchmarks. How did you arrive at some of these assumptions? E.g., how did you decide that 8KB is the threshold for 'large' monitors? I believe in reality monitors could end up being much larger. Maybe @TheBlueMatt can provide some realistic monitor sizes here?

I'm asking because I suspect the results may stem from a latency vs. throughput trade-off, and under certain circumstances one might dominate over the other. The filesystem store, for example, likely (especially assuming that most file systems by now have a 4KB block size) does not incur that much more latency when reading/writing 'larger' monitors (as 8KB is still tiny and the syscall / IO latency is likely the dominant factor), while persistence to a remote server would see much slower write speeds and hence higher latency when (re-)persisting full monitors.
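
The latency vs. throughput trade-off can be illustrated with a toy cost model. This is only a sketch under stated assumptions: `StoreModel`, its fields, and the single-write cost formula (fixed latency plus size divided by throughput) are hypothetical, and it ignores update-file management and compaction entirely. Any numbers plugged into it are illustrative, not measurements.

```rust
/// Hypothetical description of a storage backend; not an LDK type.
struct StoreModel {
    /// Fixed cost per write (syscall or network round trip), in microseconds.
    per_write_latency_us: f64,
    /// Sustained write throughput, in bytes per microsecond.
    bytes_per_us: f64,
}

impl StoreModel {
    /// Estimated time to write `len` bytes in a single operation,
    /// assuming cost = fixed latency + len / throughput.
    fn write_cost_us(&self, len: usize) -> f64 {
        self.per_write_latency_us + len as f64 / self.bytes_per_us
    }

    /// How many times more expensive a full-monitor write is than
    /// writing a single update of `update_len` bytes.
    fn full_to_update_ratio(&self, monitor_len: usize, update_len: usize) -> f64 {
        self.write_cost_us(monitor_len) / self.write_cost_us(update_len)
    }
}
```

In this model, a store whose fixed per-write latency dwarfs the transfer time gives a ratio close to 1 for an 8KB monitor, so a full write costs about the same as an update. A throughput-bound store gives a ratio that grows with monitor size, which is the case where re-persisting full monitors hurts.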

@TheBlueMatt
Collaborator

Maybe @TheBlueMatt can provide some realistic monitor sizes here?

Yea, so 8KiB is a reasonable "channel that got opened and has only done a handful of HTLC operations in its history" threshold (tho is maybe even a bit too small for a more active mobile wallet). On my routing node the largest monitor is ~73MiB, I imagine c= has some that get into the hundreds of MiB.

@Prabhat1308
Contributor Author

Thanks for taking a first look at these benchmarks. How did you arrive at some of these assumptions? E.g., how did you decide that 8KB is the threshold for 'large' monitors? I believe in reality monitors could end up being much larger.

My goal was to find the crossover point in order to pick the optimal threshold value. Since the discussion in the issue put the threshold in the kB range, I assumed monitor sizes in the same range and did not test monitors in the MB range. The results also got worse as I moved towards bigger sizes, which is another reason I did not go higher. So the comments about monitor sizes in the benchmark are not very significant, as they are relative to a 4kB monitor size.

Changes the threshold value to 8192 bytes, as suggested in the PR comments.
@Prabhat1308 Prabhat1308 force-pushed the probot/change_persist_type branch from 41c5253 to 9b4508b Compare June 16, 2025 17:09
@domZippilli
Contributor

Left some thoughts on the issue, since my thoughts aren't about this implementation but more the feature itself. But posting here since this is where the recent action is. 🙇

@yuvicc

yuvicc commented Jun 20, 2025

Concept ACK

Development

Successfully merging this pull request may close these issues.

Make MonitorUpdatingPersister change persist type based on size
7 participants