Skip to content

Rework GraduatedRebalancer to work across restarts#46

Open
benthecarman wants to merge 2 commits intomasterfrom
transfer-rework
Open

Rework GraduatedRebalancer to work across restarts#46
benthecarman wants to merge 2 commits intomasterfrom
transfer-rework

Conversation

@benthecarman
Copy link
Collaborator

@benthecarman benthecarman commented Jan 5, 2026

Previously, we would await payment successes before doing a rebalance successful event. This would cause issues if the app was closed during a rebalance. This adds handling so the GraduatedRebalancer is now event based and persists its state. We also now have better handling for failed rebalances to go along with it.

} => {
println!(
"{} Rebalance failed: {} msat, trigger_payment_id: {}, trusted_rebalance_payment_id: {}, reason: {}",
"❌".bright_red(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol claude says hi

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was actually github copilot :)

&self, payment_hash: [u8; 32],
) -> Pin<Box<dyn Future<Output = Option<ReceivedLightningPayment>> + Send + '_>>;
/// Get the rebalance monitor holder for this wallet
fn rebalance_monitor_holder(&self) -> RebalanceMonitorHolder;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to leak into the trusted wallet API? Can't we just have the higher-level logic see the notifications of incoming payments and match them with a rebalance?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a circular dep because the ln and trusted wallet need to be created before we create the rebalancer because the rebalancer uses them. I moved this into the rebalance event handler and made it a lot cleaner than before

@benthecarman benthecarman marked this pull request as ready for review January 7, 2026 01:56
@benthecarman benthecarman self-assigned this Jan 8, 2026
state.trusted_payment_id.map(|id| id.to_lower_hex_string())
);

// If trusted_payment_id is None, we crashed after persisting state but before
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No? We call send and only persist after send returns, so we could start sending, the payment actually goes out, and then crash without writing the trusted_payment_id to disk. We need to be able to recover the trusted payment id from the list of trusted payments pending (or otherwise figure out if its actually pending/went out before we crashed/shut down).

Previously, we would await payment successes before doing a rebalance
successful event. This would cause issues if the app was closed during a
rebalance. This adds handling so the GraduatedRebalancer is now event
based and persists its state. We also now have better handling for
failed rebalances to go along with it.
"Failed to query trusted wallet ({e:?}) for rebalance {}, assuming payment was not sent",
state.expected_payment_hash.as_hex()
);
self.persistence.remove_trusted_rebalance_state().await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should wipe here? We can try again on the next restart or maybe queue it up to try again later?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is what wiping does. Since we won't have an active rebalance, then on the next check if rebalance needed we will init a new one.

Some(user_channel_id) => {
log_info!(
self.logger,
"Found pending channel {user_channel_id} for on-chain rebalance, recovering"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we have no idea if the splice-balance worked? Generating a spurious event doesn't feel that critical but AFAICT this may cause us to not try another rebalance if we fail to balance once and don't receive another on-chain tx.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we are actually able to tell, we don't get something like a splice id so we can't link the rebalance event to a given splice pending event, beside waiting in real time. If we could get a splice history from ldk we potentially could just do it by index or something.

@TheBlueMatt
Copy link
Contributor

I kinda wonder if a rebalance failure shouldn't automatically spawn a rebalance retry a minute later?

@benthecarman
Copy link
Collaborator Author

I kinda wonder if a rebalance failure shouldn't automatically spawn a rebalance retry a minute later?

it kinda already does, at least we always check on startup. But yeah, we could reschedule one in a minute, only really matters if we expect the wallet to be long running or not. Probably better to do just in case though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants