Implement concurrent broadcast tolerance for distributed watchtowers #679

Merged
47 changes: 26 additions & 21 deletions lightning/src/ln/channelmonitor.rs
@@ -133,11 +133,19 @@ pub enum ChannelMonitorUpdateErr {
TemporaryFailure,
/// Used to indicate no further channel monitor updates will be allowed (eg we've moved on to a
/// different watchtower and cannot update with all watchtowers that were previously informed
/// of this channel). This will force-close the channel in question (which will generate one
/// final ChannelMonitorUpdate which must be delivered to at least one ChannelMonitor copy).
/// of this channel).
///
/// Should also be used to indicate a failure to update the local persisted copy of the channel
/// monitor.
/// Upon receiving this error, ChannelManager will force-close the channel and return at
/// least one final ChannelMonitorUpdate::ChannelForceClosed which must be delivered to at
/// least one ChannelMonitor copy. The revocation secret MUST NOT be released and any further
/// offchain channel update must be rejected.
///
/// This failure may also signal a failure to update the local persisted copy of a
/// channel monitor instance.
///
/// Note that even when you fail a holder commitment transaction update, you must store the
/// update to ensure you can claim from it in case a duplicate copy of this ChannelMonitor
/// broadcasts it (e.g. in a distributed channel-monitor deployment).
PermanentFailure,
}
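The failure semantics documented above can be illustrated with a small self-contained sketch (a toy enum and hypothetical handler, not LDK's actual types): a TemporaryFailure pauses the channel until the update is retried, while a PermanentFailure force-closes it, with the final ChannelForceClosed update still required to reach at least one ChannelMonitor copy.

```rust
// Toy model of the documented semantics; not LDK code.
enum MonitorUpdateErr { TemporaryFailure, PermanentFailure }

fn handle_update_result(res: Result<(), MonitorUpdateErr>) {
    match res {
        Ok(()) => { /* update persisted everywhere, normal operation continues */ }
        Err(MonitorUpdateErr::TemporaryFailure) => {
            // Channel operation is paused; retry the same update until it is durably
            // stored, then resume the channel.
        }
        Err(MonitorUpdateErr::PermanentFailure) => {
            // Force-close the channel. The final ChannelForceClosed update must still
            // reach at least one ChannelMonitor copy, and the revocation secret for the
            // rejected state must not be released to the counterparty.
        }
    }
}

fn main() {
    handle_update_result(Err(MonitorUpdateErr::PermanentFailure));
}
```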

@@ -824,6 +832,10 @@ pub struct ChannelMonitor<ChanSigner: ChannelKeys> {
// Set once we've signed a holder commitment transaction and handed it over to our
// OnchainTxHandler. After this is set, no future updates to our holder commitment transactions
// may occur, and we fail any such monitor updates.
//
// In case of update rejection due to a locally already signed commitment transaction, we
Collaborator:

Maybe:
Note that even when we fail a local commitment transaction update, we still store the update to ensure we can claim from it in case a duplicate copy of this ChannelMonitor broadcasts it.

// nevertheless store the update's content so it is tracked, in case another remote monitor
// copy broadcasts it concurrently, out-of-order with regard to our block view.
holder_tx_signed: bool,

// We simply modify last_block_hash in Channel's block_connected so that serialization is
@@ -888,6 +900,11 @@ pub trait ManyChannelMonitor: Send + Sync {
///
/// Any spends of outputs which should have been registered which aren't passed to
/// ChannelMonitors via block_connected may result in FUNDS LOSS.
///
/// In case of a distributed watchtower deployment, even if an Err is returned, the new version
Collaborator:

This needs more detail about the exact setup of a distributed ChannelMonitor, as several are possible (I think it's a misnomer to call it a watchtower as that implies something specific about availability of keys which isn't true today), and probably this comment should be on ChannelMonitor::update_monitor, not the trait (though it could be referenced in the trait), because the trait's return value is only relevant for ChannelManager.

Author (@ariard, Aug 31, 2020):

I know writing documentation for a distributed ChannelMonitor is tracked by #604. I intend to do it post-anchor as there are implications for bumping UTXO management, namely you should also duplicate those keys across distributed monitors.

/// must be written to disk, as the state may have been stored but rejected because a block
/// forced a commitment broadcast. This storage is used to claim outputs of a rejected state
/// confirmed onchain by another watchtower that is lagging behind on block processing.
fn update_monitor(&self, funding_txo: OutPoint, monitor: ChannelMonitorUpdate) -> Result<(), ChannelMonitorUpdateErr>;
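As a rough illustration of this rule, a distributed-watchtower persister might write the serialized update to disk before applying it, so the data survives even when update_monitor rejects the update. This is a minimal sketch with an assumed helper (persist_then_apply is hypothetical, not LDK's API):

```rust
use std::fs;

// Hypothetical helper, not LDK API: persist the serialized update first, then apply it.
// An Err from `apply` (standing in for update_monitor) does not undo the persisted copy,
// which is what allows claiming outputs of a rejected state later confirmed onchain.
fn persist_then_apply(
    update_bytes: &[u8],
    path: &std::path::Path,
    apply: impl Fn(&[u8]) -> Result<(), ()>,
) -> Result<(), ()> {
    fs::write(path, update_bytes).map_err(|_| ())?;
    apply(update_bytes)
}

fn main() {
    let path = std::env::temp_dir().join("channel_update.bin");
    // Simulate a monitor that rejects the update (e.g. a block already forced a broadcast).
    let res = persist_then_apply(b"serialized-update", &path, |_| Err(()));
    // The update was rejected, but its bytes are on disk for later onchain claiming.
    assert!(res.is_err());
}
```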

/// Used by ChannelManager to get list of HTLC resolved onchain and which needed to be updated
@@ -1167,12 +1184,7 @@ impl<ChanSigner: ChannelKeys> ChannelMonitor<ChanSigner> {
feerate_per_kw: initial_holder_commitment_tx.feerate_per_kw,
htlc_outputs: Vec::new(), // There are never any HTLCs in the initial commitment transactions
};
// Returning a monitor error before updating tracking points means in case of using
// a concurrent watchtower implementation for same channel, if this one doesn't
// reject update as we do, you MAY have the latest holder valid commitment tx onchain
// for which you want to spend outputs. We're NOT robust again this scenario right
// now but we should consider it later.
onchain_tx_handler.provide_latest_holder_tx(initial_holder_commitment_tx).unwrap();
onchain_tx_handler.provide_latest_holder_tx(initial_holder_commitment_tx);

ChannelMonitor {
latest_update_id: 0,
@@ -1327,9 +1339,6 @@ impl<ChanSigner: ChannelKeys> ChannelMonitor<ChanSigner> {
/// up-to-date as our holder commitment transaction is updated.
/// Panics if set_on_holder_tx_csv has never been called.
pub(super) fn provide_latest_holder_commitment_tx_info(&mut self, commitment_tx: HolderCommitmentTransaction, htlc_outputs: Vec<(HTLCOutputInCommitment, Option<Signature>, Option<HTLCSource>)>) -> Result<(), MonitorUpdateError> {
if self.holder_tx_signed {
return Err(MonitorUpdateError("A holder commitment tx has already been signed, no new holder commitment txn can be sent to our counterparty"));
}
let txid = commitment_tx.txid();
let sequence = commitment_tx.unsigned_tx.input[0].sequence as u64;
let locktime = commitment_tx.unsigned_tx.lock_time as u64;
@@ -1343,17 +1352,13 @@ impl<ChanSigner: ChannelKeys> ChannelMonitor<ChanSigner> {
feerate_per_kw: commitment_tx.feerate_per_kw,
htlc_outputs: htlc_outputs,
};
// Returning a monitor error before updating tracking points means in case of using
// a concurrent watchtower implementation for same channel, if this one doesn't
// reject update as we do, you MAY have the latest holder valid commitment tx onchain
// for which you want to spend outputs. We're NOT robust again this scenario right
// now but we should consider it later.
if let Err(_) = self.onchain_tx_handler.provide_latest_holder_tx(commitment_tx) {
return Err(MonitorUpdateError("Holder commitment signed has already been signed, no further update of LOCAL commitment transaction is allowed"));
}
self.onchain_tx_handler.provide_latest_holder_tx(commitment_tx);
self.current_holder_commitment_number = 0xffff_ffff_ffff - ((((sequence & 0xffffff) << 3*8) | (locktime as u64 & 0xffffff)) ^ self.commitment_transaction_number_obscure_factor);
mem::swap(&mut new_holder_commitment_tx, &mut self.current_holder_commitment_tx);
self.prev_holder_signed_commitment_tx = Some(new_holder_commitment_tx);
if self.holder_tx_signed {
return Err(MonitorUpdateError("Latest holder commitment signed has already been signed, update is rejected"));
}
Ok(())
}
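The ordering chosen here, recording the new holder commitment before the holder_tx_signed check, is what lets a rejected state still be claimed later. A toy model of the store-then-reject pattern (hypothetical types, not LDK code):

```rust
// Toy model of the store-then-reject ordering used above; not LDK code.
struct ToyMonitor {
    holder_tx_signed: bool,
    current_state: u64,
    prev_state: Option<u64>,
}

impl ToyMonitor {
    fn provide_latest_holder_state(&mut self, state: u64) -> Result<(), &'static str> {
        // Always record the update first so the new state is tracked...
        self.prev_state = Some(self.current_state);
        self.current_state = state;
        // ...and only then reject it if a holder commitment was already signed.
        if self.holder_tx_signed {
            return Err("holder commitment already signed, update rejected");
        }
        Ok(())
    }
}

fn main() {
    let mut m = ToyMonitor { holder_tx_signed: true, current_state: 7, prev_state: None };
    // The update is rejected, but state 8 is still tracked for onchain claiming.
    assert!(m.provide_latest_holder_state(8).is_err());
    assert_eq!(m.current_state, 8);
}
```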

112 changes: 112 additions & 0 deletions lightning/src/ln/functional_tests.rs
@@ -8735,3 +8735,115 @@ fn test_update_err_monitor_lockdown() {
let events = nodes[0].node.get_and_clear_pending_events();
assert_eq!(events.len(), 1);
}

#[test]
fn test_concurrent_monitor_claim() {
// Watchtower Alice receives a block and broadcasts state N, then the channel generates a
// new state N+1 and sends it to both watchtowers. Bob accepts N+1, then receives a block
// and broadcasts the latest state N+1. Alice rejects state N+1, but Bob has already
// broadcast it and state N+1 confirms. Alice then claims the output from state N+1.

let chanmon_cfgs = create_chanmon_cfgs(2);
let node_cfgs = create_node_cfgs(2, &chanmon_cfgs);
let node_chanmgrs = create_node_chanmgrs(2, &node_cfgs, &[None, None]);
let mut nodes = create_network(2, &node_cfgs, &node_chanmgrs);

// Create some initial channel
let chan_1 = create_announced_chan_between_nodes(&nodes, 0, 1, InitFeatures::known(), InitFeatures::known());
let outpoint = OutPoint { txid: chan_1.3.txid(), index: 0 };

// Rebalance the network to generate htlc in the two directions
send_payment(&nodes[0], &vec!(&nodes[1])[..], 10_000_000, 10_000_000);

// Route a HTLC from node 0 to node 1 (but don't settle)
route_payment(&nodes[0], &vec!(&nodes[1])[..], 9_000_000).0;

// Copy SimpleManyChannelMonitor to simulate watchtower Alice and advance the block height
// so her ChannelMonitor times out the HTLC onchain
let logger = test_utils::TestLogger::with_id(format!("node {}", "Alice"));
let chain_monitor = chaininterface::ChainWatchInterfaceUtil::new(Network::Testnet);
let watchtower_alice = {
let monitors = nodes[0].chan_monitor.simple_monitor.monitors.lock().unwrap();
let monitor = monitors.get(&outpoint).unwrap();
let mut w = test_utils::TestVecWriter(Vec::new());
monitor.write_for_disk(&mut w).unwrap();
let new_monitor = <(BlockHash, channelmonitor::ChannelMonitor<EnforcingChannelKeys>)>::read(
&mut ::std::io::Cursor::new(&w.0)).unwrap().1;
assert!(new_monitor == *monitor);
let watchtower = test_utils::TestChannelMonitor::new(&chain_monitor, &chanmon_cfgs[0].tx_broadcaster, &logger, &chanmon_cfgs[0].fee_estimator);
assert!(watchtower.add_monitor(outpoint, new_monitor).is_ok());
watchtower
};
let header = BlockHeader { version: 0x20000000, prev_blockhash: Default::default(), merkle_root: Default::default(), time: 42, bits: 42, nonce: 42 };
watchtower_alice.simple_monitor.block_connected(&header, 135, &vec![], &vec![]);

// Watchtower Alice should have broadcast a commitment/HTLC-timeout
{
let mut txn = chanmon_cfgs[0].tx_broadcaster.txn_broadcasted.lock().unwrap();
assert_eq!(txn.len(), 2);
txn.clear();
}

// Copy SimpleManyChannelMonitor to simulate watchtower Bob and make it receive a commitment update first.
let logger = test_utils::TestLogger::with_id(format!("node {}", "Bob"));
let chain_monitor = chaininterface::ChainWatchInterfaceUtil::new(Network::Testnet);
let watchtower_bob = {
let monitors = nodes[0].chan_monitor.simple_monitor.monitors.lock().unwrap();
let monitor = monitors.get(&outpoint).unwrap();
let mut w = test_utils::TestVecWriter(Vec::new());
monitor.write_for_disk(&mut w).unwrap();
let new_monitor = <(BlockHash, channelmonitor::ChannelMonitor<EnforcingChannelKeys>)>::read(
&mut ::std::io::Cursor::new(&w.0)).unwrap().1;
assert!(new_monitor == *monitor);
let watchtower = test_utils::TestChannelMonitor::new(&chain_monitor, &chanmon_cfgs[0].tx_broadcaster, &logger, &chanmon_cfgs[0].fee_estimator);
assert!(watchtower.add_monitor(outpoint, new_monitor).is_ok());
watchtower
};
let header = BlockHeader { version: 0x20000000, prev_blockhash: Default::default(), merkle_root: Default::default(), time: 42, bits: 42, nonce: 42 };
watchtower_bob.simple_monitor.block_connected(&header, 134, &vec![], &vec![]);

// Route another payment to generate another update with still previous HTLC pending
let (_, payment_hash) = get_payment_preimage_hash!(nodes[0]);
{
let net_graph_msg_handler = &nodes[1].net_graph_msg_handler;
let route = get_route(&nodes[1].node.get_our_node_id(), &net_graph_msg_handler.network_graph.read().unwrap(), &nodes[0].node.get_our_node_id(), None, &Vec::new(), 3000000 , TEST_FINAL_CLTV, &logger).unwrap();
nodes[1].node.send_payment(&route, payment_hash, &None).unwrap();
}
check_added_monitors!(nodes[1], 1);

let updates = get_htlc_update_msgs!(nodes[1], nodes[0].node.get_our_node_id());
assert_eq!(updates.update_add_htlcs.len(), 1);
nodes[0].node.handle_update_add_htlc(&nodes[1].node.get_our_node_id(), &updates.update_add_htlcs[0]);
if let Some(ref mut channel) = nodes[0].node.channel_state.lock().unwrap().by_id.get_mut(&chan_1.2) {
if let Ok((_, _, _, update)) = channel.commitment_signed(&updates.commitment_signed, &node_cfgs[0].fee_estimator, &node_cfgs[0].logger) {
// Watchtower Alice should already have seen the block and reject the update
if let Err(_) = watchtower_alice.simple_monitor.update_monitor(outpoint, update.clone()) {} else { assert!(false); }
if let Ok(_) = watchtower_bob.simple_monitor.update_monitor(outpoint, update.clone()) {} else { assert!(false); }
if let Ok(_) = nodes[0].chan_monitor.update_monitor(outpoint, update) {} else { assert!(false); }
} else { assert!(false); }
} else { assert!(false); };
// Our local monitor is in-sync and hasn't yet processed the HTLC timeout
check_added_monitors!(nodes[0], 1);

// Provide one more block to watchtower Bob, expect broadcast of commitment and HTLC-Timeout
watchtower_bob.simple_monitor.block_connected(&header, 135, &vec![], &vec![]);

// Watchtower Bob should have broadcast a commitment/HTLC-timeout
let bob_state_y;
{
let mut txn = chanmon_cfgs[0].tx_broadcaster.txn_broadcasted.lock().unwrap();
assert_eq!(txn.len(), 2);
bob_state_y = txn[0].clone();
txn.clear();
};

// We confirm Bob's state Y on Alice, she should broadcast an HTLC-timeout
watchtower_alice.simple_monitor.block_connected(&header, 136, &vec![&bob_state_y][..], &vec![]);
{
let htlc_txn = chanmon_cfgs[0].tx_broadcaster.txn_broadcasted.lock().unwrap();
// We broadcast the transaction twice, once due to the HTLC-timeout, once due to
// the onchain detection of the HTLC output
assert_eq!(htlc_txn.len(), 2);
check_spends!(htlc_txn[0], bob_state_y);
check_spends!(htlc_txn[1], bob_state_y);
}
}
11 changes: 1 addition & 10 deletions lightning/src/ln/onchaintx.rs
@@ -877,18 +877,9 @@ impl<ChanSigner: ChannelKeys> OnchainTxHandler<ChanSigner> {
}
}

pub(super) fn provide_latest_holder_tx(&mut self, tx: HolderCommitmentTransaction) -> Result<(), ()> {
// To prevent any unsafe state discrepancy between offchain and onchain, once holder
// commitment transaction has been signed due to an event (either block height for
// HTLC-timeout or channel force-closure), don't allow any further update of holder
// commitment transaction view to avoid delivery of revocation secret to counterparty
// for the aformentionned signed transaction.
if self.holder_htlc_sigs.is_some() || self.prev_holder_htlc_sigs.is_some() {
return Err(());
}
pub(super) fn provide_latest_holder_tx(&mut self, tx: HolderCommitmentTransaction) {
self.prev_holder_commitment = self.holder_commitment.take();
self.holder_commitment = Some(tx);
Ok(())
}

fn sign_latest_holder_htlcs(&mut self) {
Expand Down