Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Still cannot force close inactive channels #2468

Closed
kiaraho opened this issue Jan 13, 2019 · 91 comments
Closed

Still cannot force close inactive channels #2468

kiaraho opened this issue Jan 13, 2019 · 91 comments
Assignees

Comments

@kiaraho
Copy link

@kiaraho kiaraho commented Jan 13, 2019

Background

Follow up to issue #2391. Now running 0.5.1. Most of the channels have closed and I can see the funds on-chain when I run lncli walletbalance. However, there are still some funds off-chain in one pending_force_closing_channels and two waiting_close_channels. Those channels have been in limbo for several days now.

Your environment

lncli version 0.5.1-beta commit=
Linux hostname 4.15.0-43-generic #46~16.04.1-Ubuntu SMP Fri Dec 7 13:31:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Bitcoin Core Daemon version v0.17.0.0-ge1ed37edaedc85b8c3468bd9a726046344036243

Steps to reproduce

Ran lncli closeallchannels --force to close inactive channels. Got the following message:

{
"remote_pub_key": "02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b",
"channel_point": "825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1",
"closing_txid": "cb118298c5526ca09576dd54f8be48b06994fafb790738879069ef50d8e156e6",
"error": ""
}
{
"remote_pub_key": "032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada",
"channel_point": "fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1",
"closing_txid": "e2e7c83691ad21deb37a42bcbb22886225a88c0b67a57be94e0060992d8e0b3c",
"error": ""
}

Expected behaviour

Inactive channels should close.

Actual behaviour

Three channels still stuck in limbo. The two closing_txid above do not seem to have been broadcast. Running lncli closeallchannels --force again returns a "[lncli] no open channels to close" message. Output from lncli pendingchannels command:

{
"total_limbo_balance": "11783124",
"pending_open_channels": [
],
"pending_closing_channels": [
],
"pending_force_closing_channels": [
{
"channel": {
"remote_node_pub": "029afc726a18abc8dc75ef6c9ed34354c275261086597d98067dd972bd72965943",
"channel_point": "cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0",
"capacity": "2062186",
"local_balance": "2054765",
"remote_balance": "0"
},
"closing_txid": "297735f112539d8c1f7325863468b8a67c20f3584be06bb4f9634c10c458df36",
"limbo_balance": "0",
"maturity_height": 0,
"blocks_til_maturity": 0,
"recovered_balance": "0",
"pending_htlcs": [
]
}
],
"waiting_close_channels": [
{
"channel": {
"remote_node_pub": "02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b",
"channel_point": "825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1",
"capacity": "7119858",
"local_balance": "6947834",
"remote_balance": "0"
},
"limbo_balance": "6947834"
},
{
"channel": {
"remote_node_pub": "032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada",
"channel_point": "fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1",
"capacity": "4842711",
"local_balance": "4835290",
"remote_balance": "0"
},
"limbo_balance": "4835290"
}
]
}

@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Jan 13, 2019

That first channel has been fully swept on chain. Do you have start up logs for that? Similarly, do you have logs for the latter two?

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 14, 2019

By start up logs, are you referring to the entries in lnd.log after running lncli unlock? If so, which log entries in particular?

@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Jan 15, 2019

In lnd.log grep for either of those channel point or txid's.

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 15, 2019

Here are the log entries I have immediately after running lncli pendingchannels.

grep cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666 ~/.lnd/logs/bitcoin/mainnet/lnd.log
2019-01-15 09:44:04.884 [INF] UTXN: NurseryReport: building nursery report for channel cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0

grep 825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e ~/.lnd/logs/bitcoin/mainnet/lnd.log
2019-01-15 09:43:37.082 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 10m0s.

grep fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4 ~/.lnd/logs/bitcoin/mainnet/lnd.log
2019-01-15 09:43:37.117 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 10m0s.

@halseth

This comment has been minimized.

Copy link
Collaborator

@halseth halseth commented Jan 15, 2019

Aah... looks like you restored from an outdated backup?

If you're unlucky, you force-closed the channels (which is a breach when your backup is outdated), and the remote took the funds. If you're lucky, the remote force closed and has data-loss protection enabled. If that's the case, you can try to connect to the peer (02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b and 032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada) and see what happens. Please post debug logs!

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 15, 2019

I was already connected to both peers.

///

lncli listpeers | grep -A 7 02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b
"pub_key": "02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b",
"address": "167.99.50.31:9735",
"bytes_sent": "6728289",
"bytes_recv": "10243259",
"sat_sent": "0",
"sat_recv": "0",
"inbound": false,
"ping_time": "2025706"

lncli listpeers | grep -A 7 032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada
"pub_key": "032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada",
"address": "5.189.141.242:9735",
"bytes_sent": "6171020",
"bytes_recv": "10269721",
"sat_sent": "0",
"sat_recv": "0",
"inbound": false,
"ping_time": "234567"

///

Should I disconnect then reconnect? If so, what debuglevel should I set prior to doing so and what entries should I grep for?

@halseth

This comment has been minimized.

Copy link
Collaborator

@halseth halseth commented Jan 15, 2019

Set debuglevel debug and reconnect. Interesting logs you can grep for are the channel points (825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e and fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4).

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 15, 2019

So now I seem to be stuck in an LND roach motel, as lncli disconnect won't let me check out! Output from lncli pendingchannels is the same as before, with one pending_force_closing_channels and two waiting_close_channels. Limbo balance is unchanged.

///

lncli disconnect --node_key 02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b
[lncli] rpc error: code = Unknown desc = cannot disconnect from peer(02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b), all active channels with the peer need to be closed first

lncli disconnect --node_key 032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada
[lncli] rpc error: code = Unknown desc = cannot disconnect from peer(032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada), all active channels with the peer need to be closed first

lncli listchannels
{
"channels": [
]
}

grep 02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b ~/.lnd/logs/bitcoin/mainnet/lnd.log | tail -n 5
2019-01-15 19:15:13.309 [INF] CRTR: New channel discovered! Link connects 02282cfc83d4e4ac088a654184be62d7de0be6298e83af39431f9598c8c5631c09 and 02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b with ChannelPoint(c8fdd4f1e9e57f68ada64b8d00a0c4bb3b8642a4d61161efafdfd010ec0865a2:0): chan_id=612600600140906496, capacity=0.00317161 BTC
2019-01-15 19:15:13.315 [INF] CRTR: New channel discovered! Link connects 02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b and 02c5371591b640da03f9a645e2ddfa5620d8537388b03ddd524d36766c0f550db0 with ChannelPoint(9b7d587ef135daabbc1039e401b86ff9a063252427c9c5bd035fb0589635e84a:1): chan_id=612440071376011265, capacity=0.003 BTC
2019-01-15 19:15:13.318 [INF] CRTR: New channel discovered! Link connects 02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b and 03cfb64a81ad9b94e6f3fa2e34218c9242606890bbfb65a0bb57d603158e6c590c with ChannelPoint(cc88267b03a05a0d6059550f6ad8aaec1c196b9a7f72dc68fb7012930ebc47ce:0): chan_id=612615993258999808, capacity=0.00353752 BTC
2019-01-15 19:18:15.906 [DBG] RPCS: [disconnectpeer] from peer(02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b)
2019-01-15 19:30:50.100 [DBG] RPCS: [disconnectpeer] from peer(02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b)

grep 032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada ~/.lnd/logs/bitcoin/mainnet/lnd.log | tail -n 5
2019-01-15 19:01:03.497 [INF] CRTR: New channel discovered! Link connects 024b8019795ec753fcdb1eb21ecd319bfab6a8b05e017f357dc5f052a181fb8c91 and 032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada with ChannelPoint(18871533aec34e712053d661f7266c1f13f35f0e4b3bc8eda66cfe62ee747011:0): chan_id=582232088920064000, capacity=0.0002 BTC
2019-01-15 19:28:23.864 [DBG] RPCS: [disconnectpeer] from peer(032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada)
2019-01-15 19:30:51.862 [DBG] RPCS: [disconnectpeer] from peer(032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada)
2019-01-15 19:31:03.732 [INF] CRTR: New channel discovered! Link connects 02f3069a342ae2883a6f29e275f06f28a56a6ea2e2d96f5888a3266444dcf542b6 and 032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada with ChannelPoint(2eb868dc242ce74f072abe9fab06c3a1823ce99b5b4ef156a3abf0dfc5109414:1): chan_id=592900650213703681, capacity=0.0002 BTC
2019-01-15 19:31:03.758 [INF] CRTR: New channel discovered! Link connects 02f3069a342ae2883a6f29e275f06f28a56a6ea2e2d96f5888a3266444dcf542b6 and 032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada with ChannelPoint(fcbf2a617b8165fb9d9c2d885bb0db24286f884acde2171bff05ec690f8e83e5:1): chan_id=605291046758973441, capacity=0.0002 BTC

grep 825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e ~/.lnd/logs/bitcoin/mainnet/lnd.log | tail -n 1
2019-01-15 19:33:37.095 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 10m0s.

grep fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4 ~/.lnd/logs/bitcoin/mainnet/lnd.log | tail -n 1
2019-01-15 19:33:37.129 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 10m0s.

@halseth

This comment has been minimized.

Copy link
Collaborator

@halseth halseth commented Jan 15, 2019

Try restarting lnd, and post the grepped logs after restart (up until the "no commit point").

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 15, 2019

Stopped lnd, moved the existing lnd.log to a backup location, then restarted lnd so it creates a fresh lnd.log. Looks like older entries in the current lnd.log got overwritten, as I do not see what is usually the first line, Active chain: Bitcoin (network=mainnet). I've uploaded the contents of the current lnd.log up until the "no commit point", though I'm guessing lots of crucial debug messages got overwritten.

lnd-log_before_no-commit-point-found.txt

@halseth

This comment has been minimized.

Copy link
Collaborator

@halseth halseth commented Jan 16, 2019

Yeah, basically what I want to look for is after you connect to the peers, the channels will attempt to sync up. Here I expect the peer to not respond since they have closed the channel. If they respond with the commitment point then we might be able to recover the funds.

You can try increasing the saved logs size (take a look at lnd -h) so they don't get overwritten!

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 18, 2019

Used the --maxlogfilesize=100 option to increase the log size so it won't get overwritten so quickly. Here is the grepped output for the three channel points (last 30 lines).

grep cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666 lnd.log | tail -n 30

2019-01-18 20:13:19.957 [DBG] CNCT: Starting ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0), htlc_set=(contractcourt.htlcSet) {
2019-01-18 20:13:19.957 [INF] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): starting state=StateWaitingFullResolution
2019-01-18 20:13:19.960 [INF] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): still awaiting contract resolution
2019-01-18 20:13:19.960 [INF] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): relaunching 1 contract resolvers
2019-01-18 20:13:19.961 [DBG] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): attempting to resolve *contractcourt.commitSweepResolver
2019-01-18 20:13:19.961 [DBG] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): contract *contractcourt.commitSweepResolver not yet resolved
2019-01-18 20:13:19.961 [DBG] CNCT: *contractcourt.commitSweepResolver(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): waiting for commit tx to confirm
2019-01-18 20:13:20.150 [INF] CNCT: *contractcourt.commitSweepResolver(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): waiting for commit sweep txid=3b6abf124da45c1a11f4811b21444586c2a3d720d1edcfdb1d8d03867da0788e conf
2019-01-18 20:15:04.452 [INF] UTXN: NurseryReport: building nursery report for channel cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0
2019-01-18 20:26:48.613 [INF] UTXN: NurseryReport: building nursery report for channel cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0

grep 825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e lnd.log | tail -n 30

2019-01-18 20:13:19.940 [DBG] CNCT: New ChainEventSubscription(id=0) for ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1)
2019-01-18 20:13:19.951 [DBG] CNCT: Starting chain watcher for ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1)
2019-01-18 20:13:19.951 [DBG] NTFN: Using height hint 537372 retrieved from cache for 825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1
2019-01-18 20:13:19.951 [INF] NTFN: New spend subscription: spend_id=2, outpoint=825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1, height_hint=537372
2019-01-18 20:13:19.957 [INF] CNCT: Close observer for ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) active
2019-01-18 20:13:19.960 [DBG] CNCT: Starting ChannelArbitrator(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1), htlc_set=(contractcourt.htlcSet) {
2019-01-18 20:13:19.961 [INF] CNCT: ChannelArbitrator(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1): starting state=StateCommitmentBroadcasted
2019-01-18 20:13:20.088 [INF] NTFN: Dispatching spend notification for outpoint=825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1 at height=559059
2019-01-18 20:13:20.088 [WRN] CNCT: Unprompted commitment broadcast for ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) 
2019-01-18 20:13:20.088 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 1s.
2019-01-18 20:13:20.107 [INF] CNCT: ChannelArbitrator(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1): noop trigger chainTrigger
2019-01-18 20:13:21.064 [DBG] LNWL: ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1), starting local commitment: (*lnwallet.commitment)(0xc0114b6000)({
    PreviousOutPoint: (wire.OutPoint) 825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1,
2019-01-18 20:13:21.066 [DBG] LNWL: ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1), starting remote commitment: (*lnwallet.commitment)(0xc010f3c000)({
    PreviousOutPoint: (wire.OutPoint) 825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1,
2019-01-18 20:13:21.066 [INF] PEER: NodeKey(02ad6fb8d693dc1e4569bcedefadf5f72a931ae027dc0f0c544b34c1c6f3b9a02b) loading ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1)
2019-01-18 20:13:21.066 [WRN] PEER: ChannelPoint(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) has status CommitmentBroadcasted, won't start.
2019-01-18 20:13:21.090 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 2s.
2019-01-18 20:13:23.090 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 4s.
2019-01-18 20:13:27.090 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 8s.
2019-01-18 20:13:35.093 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 16s.
2019-01-18 20:13:51.094 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 32s.
2019-01-18 20:14:23.097 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 1m4s.
2019-01-18 20:15:27.103 [ERR] CNCT: Unable to retrieve commitment point for channel(825f7a52a6ac12b40f5ae110baf00441a080bea290abcf403d2172eebe90126e:1) with lost state: no commit point found. Retrying in 2m8s.

grep fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4 lnd.log | tail -n 30

2019-01-18 20:13:19.940 [DBG] CNCT: New ChainEventSubscription(id=0) for ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1)
2019-01-18 20:13:19.951 [DBG] CNCT: Starting chain watcher for ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1)
2019-01-18 20:13:19.951 [DBG] NTFN: Using height hint 538170 retrieved from cache for fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1
2019-01-18 20:13:19.951 [INF] NTFN: New spend subscription: spend_id=1, outpoint=fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1, height_hint=538170
2019-01-18 20:13:19.957 [INF] CNCT: Close observer for ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) active
2019-01-18 20:13:20.107 [DBG] CNCT: Starting ChannelArbitrator(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1), htlc_set=(contractcourt.htlcSet) {
2019-01-18 20:13:20.107 [INF] CNCT: ChannelArbitrator(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1): starting state=StateCommitmentBroadcasted
2019-01-18 20:13:20.110 [INF] CNCT: ChannelArbitrator(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1): noop trigger chainTrigger
2019-01-18 20:13:20.160 [INF] NTFN: Dispatching spend notification for outpoint=fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1 at height=559059
2019-01-18 20:13:20.161 [WRN] CNCT: Unprompted commitment broadcast for ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) 
2019-01-18 20:13:20.162 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 1s.
2019-01-18 20:13:20.873 [DBG] LNWL: ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1), starting local commitment: (*lnwallet.commitment)(0xc010a44180)({
    PreviousOutPoint: (wire.OutPoint) fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1,
2019-01-18 20:13:20.875 [DBG] LNWL: ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1), starting remote commitment: (*lnwallet.commitment)(0xc010f22000)({
    PreviousOutPoint: (wire.OutPoint) fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1,
2019-01-18 20:13:20.875 [INF] PEER: NodeKey(032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada) loading ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1)
2019-01-18 20:13:20.876 [WRN] PEER: ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) has status CommitmentBroadcasted, won't start.
2019-01-18 20:13:21.165 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 2s.
2019-01-18 20:13:23.165 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 4s.
2019-01-18 20:13:27.169 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 8s.
2019-01-18 20:13:35.170 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 16s.
2019-01-18 20:13:51.169 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 32s.
2019-01-18 20:14:23.171 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 1m4s.
2019-01-18 20:15:27.174 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 2m8s.
2019-01-18 20:17:35.172 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 4m16s.
@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Jan 18, 2019

If you restored from an old backup (which you should never do!), and the peers you were connected to either didn't support DLP (data loss protection), or the channel was closed before they were running 0.5.1, then they won't re-send the commitment point you need to sweep those funds. This was fixed in #1937. It's also the case that if you started with a stale backup, and broadcasted a revoked commitment, then the other party swept the funds. The channel (https://www.smartbit.com.au/tx/fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4) has been swept into the wallet, if you upgrade to master then if this hasn't confirmed yet we'll rebroadcast it with up to date fees.

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 18, 2019

I had no choice but to restore from a backup due to disk failure on the affected node. I have two nodes running in a part of the world where power outages are not uncommon. Is #1937 merged in 0.5.2 because both my nodes are running 0.5.1? I'm assuming that is what you meant by upgrading to master.

@molxyz

This comment has been minimized.

Copy link

@molxyz molxyz commented Jan 23, 2019

@kiaraho

Is #1937 merged in 0.5.2 because both my nodes are running 0.5.1? I'm assuming that is what you meant by upgrading to master.

No, that's not what he meant by upgrading to master. We don't have v0.5.2 and master is a different branch where the code is most uptodate and it might be able to fix your issue. This is how you do to get master on your terminal screen:

cd $GOPATH/src/github.com/lightningnetwork/lnd
git fetch
git checkout origin/master
make && make install

Restart LND.

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Jan 23, 2019

@molxyz, thanks for the clarification. I downloaded the LND binaries from https://github.com/lightningnetwork/lnd/releases/ instead of using Git. I'm assuming the most up to date code will eventually make its way into the next official release. If that is true, I can wait until those binaries are released.

@halseth

This comment has been minimized.

Copy link
Collaborator

@halseth halseth commented Feb 7, 2019

New version of LND is now released.

@halseth

This comment has been minimized.

Copy link
Collaborator

@halseth halseth commented Feb 7, 2019

Note that it is your channel peer that needs to update, and it probably won't work for old channels.

@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Feb 8, 2019

I think this can be closed now as it's the role of DLP now (if possible) to remedy the situation?

@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Feb 8, 2019

Updated to v0.5.2 and waited for the node to get synched to chain. Limbo balance remains unchanged. The two channels below still show up under waiting_close_channels. Will wait to see what happens.

grep cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666 lnd.log | tail -n 30

2019-02-08 14:41:36.444 [INF] UTXN: NurseryReport: building nursery report for channel cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0
2019-02-08 14:45:46.430 [DBG] CNCT: Starting ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0), htlc_set=(contractcourt.htlcSet) {
2019-02-08 14:45:46.430 [INF] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): starting state=StateWaitingFullResolution
2019-02-08 14:45:46.432 [INF] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): still awaiting contract resolution
2019-02-08 14:45:46.432 [INF] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): relaunching 1 contract resolvers
2019-02-08 14:45:46.433 [DBG] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): attempting to resolve *contractcourt.commitSweepResolver
2019-02-08 14:45:46.433 [DBG] CNCT: ChannelArbitrator(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): contract *contractcourt.commitSweepResolver not yet resolved
2019-02-08 14:45:46.433 [DBG] CNCT: *contractcourt.commitSweepResolver(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): waiting for commit tx to confirm
2019-02-08 14:45:46.570 [INF] CNCT: *contractcourt.commitSweepResolver(cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0): waiting for commit sweep txid=3b6abf124da45c1a11f4811b21444586c2a3d720d1edcfdb1d8d03867da0788e conf
2019-02-08 16:21:50.202 [INF] UTXN: NurseryReport: building nursery report for channel cbb1d6bdfa2d1197b13167e83c3b7e18feb410e8cbff02d0b5488e14cd1be666:0

grep fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4 lnd.log | tail -n 30

2019-02-08 14:45:46.240 [DBG] CNCT: Starting ChannelArbitrator(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1), htlc_set=(contractcourt.htlcSet) {
2019-02-08 14:45:46.241 [INF] CNCT: ChannelArbitrator(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1): starting state=StateCommitmentBroadcasted
2019-02-08 14:45:46.430 [INF] CNCT: ChannelArbitrator(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1): noop trigger chainTrigger
2019-02-08 14:45:46.504 [INF] NTFN: Dispatching spend notification for outpoint=fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1 at height=562158
2019-02-08 14:45:46.504 [WRN] CNCT: Unprompted commitment broadcast for ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) 
2019-02-08 14:45:46.504 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 1s.
2019-02-08 14:45:47.504 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 2s.
2019-02-08 14:45:49.505 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 4s.
2019-02-08 14:45:53.505 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 8s.
2019-02-08 14:46:01.505 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 16s.
2019-02-08 14:46:17.505 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 32s.
2019-02-08 14:46:49.506 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 1m4s.
2019-02-08 14:47:53.506 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 2m8s.
2019-02-08 14:49:37.293 [DBG] LNWL: ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1), starting local commitment: (*lnwallet.commitment)(0xc000da9ec0)({
    PreviousOutPoint: (wire.OutPoint) fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1,
2019-02-08 14:49:37.298 [DBG] LNWL: ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1), starting remote commitment: (*lnwallet.commitment)(0xc000b06840)({
    PreviousOutPoint: (wire.OutPoint) fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1,
2019-02-08 14:49:37.299 [INF] PEER: NodeKey(032b2b3f4abda9677bb9563e226c068d3a2456fb8b036635a81c9bcaa1671d1ada) loading ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1)
2019-02-08 14:49:37.299 [WRN] PEER: ChannelPoint(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) has status CommitmentBroadcasted, won't start.
2019-02-08 14:50:01.506 [ERR] CNCT: Unable to retrieve commitment point for channel(fe49e1cc7f146578a73a42bb371355e087d1b4e14764ca1cd485085d3e3665b4:1) with lost state: no commit point found. Retrying in 4m16s.
@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Feb 8, 2019

@kiaraho you lost data. Therefore your node is now waiting on the remote peer to connnect and give you data you need to recover the channel state. Depending on which version they're running (or even implementation), they may never give you this special data. The latest version of lnd will always give you this data.

@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Feb 8, 2019

See this comment: #2468 (comment)

@Roasbeef Roasbeef closed this Feb 8, 2019
@kiaraho

This comment has been minimized.

Copy link
Author

@kiaraho kiaraho commented Feb 8, 2019

image

To all the channels we lost...

@MykelSIlver

This comment has been minimized.

Copy link

@MykelSIlver MykelSIlver commented Mar 12, 2019

So it makes no sense to make backups. When something happens, you always restore an old state (backups are at least some hours old). When nodes close channels when your node is offline, and this is the case when you restore the backup, then those funds are lost.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Oct 9, 2019

I have the same problem with more than 3 BTC missing and "If you restored from an old backup (which you should never do!)" sounds like the most irrational thing, because:

  1. Every backup is old and that's the essence of it.
  2. LND have an option to create static channel backups...

Then why the f*ck I'm making those backups if I should never use them?!

@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Oct 10, 2019

Static Channel Backups are static in that they never need to be updated. They're a snapshot of the static data in a channel that never changes. We've provided a safe way to do this for several releases now.

If you're attempting to backup the dynamic per state update data, then you'll need a proper strongly consistent backup system. Backing up state very N seconds and hoping things are consistent is not a valid way to do this.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Oct 10, 2019

Static Channel Backups are static in that they never need to be updated. They're a snapshot of the static data in a channel that never changes. We've provided a safe way to do this for several releases now.

If you're attempting to backup the dynamic per state update data, then you'll need a proper strongly consistent backup system. Backing up state very N seconds and hoping things are consistent is not a valid way to do this.

I'm confused, because there are two types of SCBs and both of them are files.

JSON ones which looks like that (have multi channel backup at the bottom):
https://i.postimg.cc/jjJ1CFBM/Screenshot-2019-10-10-at-4-04-17.png

And the other type have the following structure:
https://i.postimg.cc/bw5MyhVG/Screenshot-2019-10-10-at-4-04-06.png

The command will accept backups in one of three forms:

 * A single channel packed SCB, which can be obtained from
   exportchanbackup. This should be passed in hex encoded format.

 * A packed multi-channel SCB, which couples several individual
   static channel backups in single blob.

 * A file path which points to a packed multi-channel backup within a
   file, using the same format that lnd does in its channels.backup
   file.

OPTIONS:
--single_backup value a hex encoded single channel backup obtained from exportchanbackup
--multi_backup value a hex encoded multi-channel backup obtained from exportchanbackup
--multi_file value the path to a multi-channel back up file

How to import the JSON one into 0.7.1-beta commit=v0.7.1-beta-231-g1f92b7587ce031628bfb5efe2b60e0edf6648968-dirty ?

Tried to verify the json one and unable to do it:
https://i.postimg.cc/cL5HdvNQ/Screenshot-2019-10-10-at-4-16-34.png

Also tried to verify the other type of backup and the response is unknown?
https://i.postimg.cc/3wMsjhXs/Screenshot-2019-10-10-at-4-22-35.png

Then tried this command after deleting the channel.db file from the LND wallet:
https://i.postimg.cc/66RYRZjp/Screenshot-2019-10-10-at-4-31-07.png

Logs are showing that something is happening:
https://i.postimg.cc/wByhzKnj/Screenshot-2019-10-10-at-4-33-30.png

Will keep you updated.

@Roasbeef

This comment has been minimized.

Copy link
Member

@Roasbeef Roasbeef commented Oct 10, 2019

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Oct 31, 2019

@MasterNeuron - I am a bit curious about the progress? Are you able to recover some money, by now?

Hi @vegardengen, unfortunately not.

I've just woke up and checked my node, after ~2 days of loading it's over. That's the current node state after importing the channel.backup file from 30.09.2019 on LND 0.7.1, recovered from fresh paper wallet seed of my node:

Screenshot 2019-10-31 at 10 44 12

pendingchannels response is that:
Screenshot 2019-10-31 at 10 52 29

pendingchannels.txt

listchannels:
Screenshot 2019-10-31 at 11 03 26

I'm stuck... in limbo.

Regards,
INWHY

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Oct 31, 2019

Hi,

A restart is known to kick off something, sometimes. Can not hurt.

What you need now, is for your node to find the already submitted force-close transactions.

I can see that at least in the channel I have with you, your share of my channel with you is waiting for you to claim them and move them to your onchain balance.

Btw: Was there no onchain balance in your node? If there was, you might need to increase the lookahead count when rescanning. But for now, I think that can wait.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Oct 31, 2019

Hi,

A restart is known to kick off something, sometimes. Can not hurt.

What you need now, is for your node to find the already submitted force-close transactions.

Will restart it now. Also when the node is turned off will copy the .lnd folder and will switch to LND 0.8.0 to give it a try.

I can see that at least in the channel I have with you, your share of my channel with you is waiting for you to claim them and move them to your onchain balance.

That sounds promising : )

Btw: Was there no onchain balance in your node? If there was, you might need to increase the lookahead count when rescanning. But for now, I think that can wait.

384UPF8UVTV8b8rA8Binvx6WmHxTQkYDPY - That's my wallet address and before starting the channels import process I've sent on-chain transaction, probably that's why the current balance is zero, every time when there are some bitcoins I'm sending them no-chain.

Thank you!

Update: I've tried with LND 0.8.0 and unable to turn it on:
"Unable to start server: a height hint greater than 0 must be provided"

Screenshot 2019-10-31 at 14 50 19

With LND 0.7.1 it starts well, but still showing zero balances, also there is a strange error:

2019/10/31 13:00:05 http: TLS handshake error from 127.0.0.1:59612: remote error: tls: unknown certificate authority

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Oct 31, 2019

@guggero - any suggestion?

I actually have another suggestion, if the restore from SCB doesn't work out, but ideally I think restore from SCB should work.

Where do the spend subscriptions get their spends from? In our case, they all happened on october 2nd, so quite a few blocks ago....

@MasterNeuron what's becoming increasingly clear is that the fact that you started your node with an old database at october 2nd caused the other nodes (mine included) to use the SCB protocol and do a normal force close, where the funds are just waiting for you to claim them. In my understanding, the restorechanbackup have restored the keys which should enable you to do just that.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Oct 31, 2019

Is there any LND call to claim my funds or that should happen automatically after syncing?

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Oct 31, 2019

That should happen automatically, the spend notifications should pick up those force-closes that has already happened. But it could be that you need more patience.

@guggero - how does this scan for old transactions happen? I guess it will have to scan from the height hint until it finds the transaction?

What is happening in the log files? Perhaps increasing some log level would reveal some activity.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 3, 2019

@vegardengen Thank you about everything, I'm still trying to bring back my funds somehow. Found the best backup with logs from the days when all channels were closed, uploaded the log folder on google drive: https://drive.google.com/file/d/13AP_S9rNGaUzSIzIfCsY31ntHBnz2Cod/view?usp=sharing

Btw I found that article from @guggero https://medium.com/@guggero/did-you-lose-funds-on-the-lightning-network-because-of-a-disk-crash-8971b6a92494 and looks like he have a commission "I can also try to recover the funds for you but then I would need to charge 20% of the recovered satoshis because it’s quite some work" and may need to pay him ~ 7400$ for that service (~0.8 BTC)

Currently trying to make it works, will set the debug level to "trace" and keep you updated.

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Nov 4, 2019

Do you have the logs from october 2nd? That's when all the channels were closed (at least those I have seen).

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 4, 2019

Do you have the logs from october 2nd? That's when all the channels were closed (at least those I have seen).

Sorry, I've provided wrong logs from 02.09.2019 instead of 02.10.2019.
With zgrep found many logs from 2nd of October, zipped and uploaded all of them:

.NUCLEUS/LNDS/.lnd-save/logs/bitcoin/mainnet/lnd.log
.NUCLEUS/LNDS/.lnd-save/logs/bitcoin/mainnet/lnd.log.93.gz

.NUCLEUS/LNDS/.lnd-current2/logs/bitcoin/mainnet/lnd.log

.NUCLEUS/LNDS/.lnd-current3/logs/bitcoin/mainnet/lnd.log

.NUCLEUS/LNDS/.lnd-copy/logs/bitcoin/mainnet/lnd.log
.NUCLEUS/LNDS/.lnd-copy/logs/bitcoin/mainnet/lnd.log.93.gz
.NUCLEUS/LNDS/.lnd-copy/logs/bitcoin/mainnet/lnd.log.93

.NUCLEUS/LNDS/.lnd-havesomechance/logs/bitcoin/mainnet/lnd.log.241.gz
.NUCLEUS/LNDS/.lnd-havesomechance/logs/bitcoin/mainnet/lnd.log.240.gz
.NUCLEUS/LNDS/.lnd-havesomechance/logs/bitcoin/mainnet/lnd.log.242.gz

.NUCLEUS/LNDS/.lnd-buggy/logs/bitcoin/mainnet/lnd.log

.NUCLEUS/otherbackups/.lnd-current1/logs/bitcoin/mainnet/lnd.log
.NUCLEUS/otherbackups/.lnd-current1/logs/bitcoin/mainnet/lnd.log.241.gz
.NUCLEUS/otherbackups/.lnd-current1/logs/bitcoin/mainnet/lnd.log.240.gz

.NUCLEUS/otherbackups/.lnd/logs/bitcoin/mainnet/lnd.log

There are many folders with copies, because I wasn't aware of that "latest state" requirement.

https://drive.google.com/file/d/10gKEowvm3Hx2Z70uEOTsX8trT5Rv0yyD/view?usp=sharing

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 4, 2019

I'm currently running one of those backups and unable to sweep some coins which are showing in walletbalance:
Screenshot 2019-11-04 at 15 03 26

If the output is already spent, then why it's showing in walletbalance while synced_to_chain = true

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Nov 4, 2019

LND onchain wallet can get out of sync like that, "pretty easily". I'd not really worry about that now, but if you would really like to, there are ways to rescan.

More importantly: Where is this channel balance you are trying to sweep coming from? If it coins you sent off from another backup, then they don't really exist in your copy.

But: Which backup are you now running? I was going to suggest, as a next step, to run one taken after having started your node on october 2nd, but before the restore on the 3rd. It's quite likely that that is the one with the most correct information.

But, let is hold off for someone elses opinion. @guggero ?

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 4, 2019

I'm still "fighting" with my node backups and wanna ask publicly @guggero for support, able to give him all files, wallet password and 24 words seed for 20% commission tax of recovered (if any) satoshis.

That's my own wallet address: 384UPF8UVTV8b8rA8Binvx6WmHxTQkYDPY

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Nov 5, 2019

I have now looked through some logs. In your lnd-havesomechance/log/bitcoin/mainnnet/lnd.log.242.fz is where I can see some instances of:

2019-10-02 11:59:25.724 [DBG] PEER: Sending ChannelReestablish(next_local_height=1, remote_tail_height=0) t
o 0200dfda3cb0250b2d801d200155ec056958cf598a30ffd36c207340cb9f6645a3@76.20.210.129:9735
2019-10-02 11:59:25.909 [DBG] PEER: Received ChannelReestablish(next_local_height=501, remote_tail_height=5
00) from 0200dfda3cb0250b2d801d200155ec056958cf598a30ffd36c207340cb9f6645a3@76.20.210.129:9735
2019-10-02 11:59:25.911 [INF] HSWC: Received re-establishment message from remote side for channel(9785a3d3
58ed96c435656b3d50fb1e4871167fd4b2a086c530c8de8550976f93:1)
2019-10-02 11:59:25.911 [ERR] LNWL: ChannelPoint(9785a3d358ed96c435656b3d50fb1e4871167fd4b2a086c530c8de8550
976f93:1), sync failed with local data loss: remote believes our tail height is 500, while we have 0!
2019-10-02 11:59:25.912 [WRN] LNWL: ChannelPoint(9785a3d358ed96c435656b3d50fb1e4871167fd4b2a086c530c8de8550976f93:1): detected restored triggering DLP
2019-10-02 11:59:25.912 [WRN] HSWC: Error when syncing channel states: ChannelPoint(9785a3d358ed96c435656b3d50fb1e4871167fd4b2a086c530c8de8550976f93:1) with CommitPoint(03ba85737f49b8b949d75d2d58a551d46b4bbdc1be0a35c253e24da25595498bc6) had possible local commitment state data loss
2019-10-02 11:59:25.938 [ERR] HSWC: ChannelLink(571528:2517:1) Failing link: unable to synchronize channel states: ChannelPoint(9785a3d358ed96c435656b3d50fb1e4871167fd4b2a086c530c8de8550976f93:1) with CommitPoint(03ba85737f49b8b949d75d2d58a551d46b4bbdc1be0a35c253e24da25595498bc6) had possible local commitment state data loss

I can also see, down there, and in many logs:

2019-10-10 23:55:20.294 [DBG] NTFN: Dispatching historical spend rescan for outpoint=aebc8f6f39f0057aec4ac7dafc2c61fe8576f61e3b0dcb3458c3261b37ef3253:0

These are what ought to find the force-close funds from your channel partners force-closed channels. And you are sure that they never found anything?

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Nov 5, 2019

I would now like to see the logs from your longest running restore of a channel.backup, after this incident...

Btw, more logs. You are not running with a pruned backend?

2019-10-10 16:06:14.745 [ERR] CNCT: Unable to retrieve commitment point for channel(ff77fc071c7ec3441830af465764d
0208c4028e93adcf6580bc2727bde2d8c04:0) with lost state: no commit point found. Retrying in 10m0s.

@guggero

This comment has been minimized.

Copy link
Collaborator

@guggero guggero commented Nov 5, 2019

I haven't had time to look at the logs. So thanks @vegardengen for digging into them!

It looks like there are at least two problems that block you from getting all channels back:

  1. You initialized SCB, which in general is the right thing to do. But SCB relies on the remote peer to be available to give you the data you need to recover the funds. If a peer is offline for good, you won't be able to get your channel balance back with SCB.
    The error message CNCT: Unable to retrieve commitment point for means: "I'm trying to initialize DLP but the remote peer is not responding. Trying again later."

  2. Your chain backend (bitcoind or btcd?) might be overwhelmed with all the rescans it has to do. It's also a very valid question if you run a pruned node? It looks like some rescans are getting lost if the wallet thinks it has UTXOs that are in fact already spent.

Also, I really would have liked to see what would have happened with a full chain rescan while you still had the channel DB up, after you did the "foce close all" command. Because there you would have had a chance of getting funds from non-existing peers while the DB was still intact. But you went ahead and jumped right into SCB where we know for sure that offline peers will never help us.

So it's a huge steaming mess right now with over 400 channels that we have no idea what state they are in. I'm not sure if I even want to give this a try. There's so much code that needs to be written and so much rescanning to be done, it will take forever... This is not at all the situation that I described in my article.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 5, 2019

Btw, more logs. You are not running with a pruned backend?

Sorry, I don't know what pruned backend means.

Your chain backend (bitcoind or btcd?)

Currently using bitcoind, previously btcd.

If a peer is offline for good, you won't be able to get your channel balance back with SCB.

The @vegardengen node is not offline but my funds are still locked.

And you are sure that they never found anything?

At the moment from both nodes I've recovered totally 2.4468972 BTC out of 6.18142594, still need to bring back ~3.73452874 BTC, some of the recovered coins was in my walletbalance even before "closing" those channels.

Also, I really would have liked to see what would have happened with a full chain rescan while you still had the channel DB up, after you did the "foce close all" command.

I've tried and nothing was found.

I would now like to see the logs from your longest running restore of a channel.backup, after this incident...

Let me provide the logs from that "recovery" process described above:

I've just woke up and checked my node, after ~2 days of loading it's over. That's the current node state after importing the channel.backup file from 30.09.2019 on LND 0.7.1, recovered from fresh paper wallet seed of my node:

Screenshot 2019-11-05 at 11 13 35

pending.txt

So it's a huge steaming mess right now with over 400 channels that we have no idea what state they are in. I'm not sure if I even want to give this a try.

Who should have an idea if not the software?, it must work flawlessly like an ATM machine.

After providing a backup file if something is wrong the software need to say "hey, dude, you f***ed up everything and I'm recovering only 20 channels out of 400, those are your logs!"

That would be the main purpose of our backups, to sort out everything no matter how many channels are there. Ok, if 400 are too many, then set some limitations...

It should work like a PLUG & PLAY device. flawlessly. If channel balances are still owned by my node, then what's the issue? - I'm not a programmer, but it's sure that for one reason or another, the software just fails to sweep them back.

I thought that it's enough to have a paper seed backup + static channel backup + file backups + passwords to recover them...

... like every normal software

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Nov 5, 2019

@MasterNeuron - LND will never be an end-user-friendly wallet, it will probably always require some knowledge. There exists more end-user-options from Lightning Labs, the lightning-app, which is available both in desktop, android and mobile version.

That said, everyone agrees that it has a way to go, still, which is everyone is always providing disclaimers about not putting in more than you are willing to risk.

Right now, I would like to see logs from the longest-running, most detailed restore from seed plus channel.backup.

In case it is still unclear, you should then start with a blank state: No wallet.db and no channel.db, it should all be provided by seed and channel.backup.

Do you still have such a restore running? As previously mentioned, this may definitely take time.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 5, 2019

@MasterNeuron - LND will never be an end-user-friendly wallet, it will probably always require some knowledge. There exists more end-user-options from Lightning Labs, the lightning-app, which is available both in desktop, android and mobile version.

There are some differences between end-user-friendly and stable. If the coins are still locked in a limbo state between my node and your node, and if the software was stable, then everything should be fine, but it's not.

That said, everyone agrees that it has a way to go, still, which is everyone is always providing disclaimers about not putting in more than you are willing to risk.

You are right about that.

Right now, I would like to see logs from the longest-running, most detailed restore from seed plus channel.backup.

In case it is still unclear, you should then start with a blank state: No wallet.db and no channel.db, it should all be provided by seed and channel.backup.

Do you still have such a restore running? As previously mentioned, this may definitely take time.

Here is it:
lnd.log

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Nov 5, 2019

It looks to me like things are progressing, gradually. Do you get any new incoming funds in lncli listchaintxns?

Do you get any new closed channels in lncli closedchannels

Is the number of entries in pendingchannels decreasing?

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 5, 2019

I don't have any closedchannels, pendingchannels are still the same amount without any change:

ubuntu@ip-172-31-29-90:~> /home/ubuntu/gocode/bin/lncli closedchannels
{
"channels": [
]
}

I'm still trying to fix some SCB import errors:

[lncli] unable to restore chan backups: rpc error: code = Unknown desc = unable to unpack chan backup: chacha20poly1305: message authentication failed

and another one

Unable to start server: a height hint greater than 0 must be provided

It's full of errors. Many people blame me that it's only my fault, but I have to say that the software isn't less buggy. lightning-casino is my second blockchain project, the first one was created in early 2014, co-founded one of the first bitcoin exchanges in my country, that's me from 2016: https://www.bloombergtv.bg/investbook/2016-06-28/kriptovalutata-se-razviva-po-barzo-ot-tehnologiite-za-sigurnost

@guggero

This comment has been minimized.

Copy link
Collaborator

@guggero guggero commented Nov 6, 2019

It's full of errors.

Yes, you are right. That's why there still is a -beta suffix in each version that we release.

But I am working on fixing those SCB errors.
The a height hint greater than 0 must be provided should be an easy fix. I think it happens when there is a channel in the SCB file that was not confirmed yet at the time of the file update. We should just skip this channel instead of failing the whole file.

The message authentication failed is harder to pin down. It either means that you are trying to use a channel backup file that was not created with the node/seed you are trying to restore. But it could also mean that the SCB file is corrupt for some reason.

I'm going to write a script that goes through all your 400 channels and looks at the TX outputs if there is still something to be claimed. This should also give us a final account of how much was actually lost because of justice transactions (if any).

@guggero

This comment has been minimized.

Copy link
Collaborator

@guggero guggero commented Nov 6, 2019

Looking at the log file, you need to let your node where you started the rescue run for a few days.
When you shut it down on October 31, there were a lot of messages like:

Rescan to determine the spend details of outpoint=6a60b1f49b94f31dc4409f210d86aac32454d344466ae3d92f4477c87b7a318d:1 within range 570450-601598 failed: chain notifier shutting down

Which means, the node was still rescanning channels from the chain backend (bitcoind). This can take a long time!

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 6, 2019

@guggero, Thank you, If I recover some funds will give you a percentage, because without your help I'll be lost.

@vegardengen

This comment has been minimized.

Copy link
Contributor

@vegardengen vegardengen commented Nov 6, 2019

What I meant when I said, "looks like it is progressing nicely", is this:

2019-10-31 12:32:27.931 [INF] LTND: Received interrupt 2019-10-31 12:32:27.932 [INF] LTND: Shutting down... 2019-10-31 12:32:27.932 [INF] LTND: Gracefully shutting down. 2019-10-31 12:32:28.150 [ERR] NTFN: Rescan to determine the spend details of outpoint=bbc18a3bda9d24f99b74bdc89b4 d52671365dd9265f3cb770eb27d742a527e5c:0 within range 566116-601590 failed: chain notifier shutting down 2019-10-31 12:32:30.009 [ERR] NTFN: Rescan to determine the spend details of outpoint=30a6add708ccc85aa38c5015029 b23487f6b46d3adf21d974b87260387362097:1 within range 568399-601613 failed: chain notifier shutting down

If you let these run, there's all reason to believe that your node will find spend details of those outpoints, and it has the keys to sweep them.

I believe this is likely to bring back at least quite a lot of your funds, @guggero has promised to write a script which is likely to tell us to which extent.

The reason why your restore of the static channel backup doesn't work is something else which we don't fully understand. Maybe if you could show us the exact steps you tried when restoring it?

For the record, the starting procedure for using a static channel backup should, in your case, be an empty node. No wallet, no channel database, just your 24 word seed and the static channel backup.

But I believe the best way forward is to continue the attempt which you seemed to stop at november 5th. Let it run - for a week or two if it needs to. There's a lot of scanning your node has to do.

And watch your logs, your listchaintxns and closedchannels and see what happens, if you like. Because every now and then, it should find a closing transaction that it need to sweep funds from.

@vwoo

This comment has been minimized.

Copy link
Contributor

@vwoo vwoo commented Nov 10, 2019

The logs contain commit points, such as in vegardengen's comment #2468 (comment)

As a last resort, could we derive the private key for each utxo with MasterNeuron's lnd aezeed private key and the commit point (if found in the log)? The channel point is right there as well so we can find the relevant utxo with MasterNeuron's funds.

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 10, 2019

@vwoo Thanks for your support, I'm chatting with @guggero on lightning community slack channel and he will try to recover some funds for me with 20% commission which is acceptable. Will send everything to him based on trust, looks like he helped other people before and I hope that there is a chance for my funds. Currently typing an encrypted email with node seed, files, ssh server access etc. That's our conversation:

slack.txt

@MasterNeuron

This comment has been minimized.

Copy link

@MasterNeuron MasterNeuron commented Nov 17, 2019

To keep everybody updated, @guggero unlocked more than 1 BTC of those "lost" coins:

https://www.blockchain.com/btc/tx/1478c8d3e95043f77a18a50e1f51822930c03c1ba60662a623118eae7e94b3a9
https://www.blockchain.com/btc/tx/ed16c57d711061575495ac730ff4d55c13dde5756487a5e3e762986e11030b06

He explained to me every step of that process which is still ongoing.

@guggero

This comment has been minimized.

Copy link
Collaborator

@guggero guggero commented Dec 9, 2019

I'm finally closing this issue as the remaining 3.7 BTC have been restored.
This is the tool I created to assist in the recovery, combined with SCB of course.

These are the most important things we learned from this issue (for people coming across this when searching):

  • SCB restore has a few bugs. 0.8.2 will fix those.
  • 400 channels is a lot! lnd comes to its limits when doing bulk operations on that many channels. Commands like closeallchannels or restorechanbackup can take hours (!). And aborting them will most certainly leave your node in an inconsistent state. Always check the log for activity before hitting Ctrl+C on a long-running command. And we probably should add more warnings to lncli.
  • Restoring from SCB takes a long time with more than a few dozen channels. Think days or even a week. For example, adding an SCB with 430 channels took about 2.5 hours (just the restorechanbackup command) then the node did chain re-scans for about 3 days (on a fast machine with SSD) then took another 2 days to communicate with the peers to close the channels and sweep the funds.
  • SCBs don't work with offline peers. Repeat after me: Close channels with offline peers! See this article for more information. In this case, there was almost 1 BTC in zombie channels and that would have been lost if @MasterNeuron didn't keep copies of the channel.db around. Therefore making file based backups can help as long as you know what you are doing and don't just put the files back in lnd's data directory. Do SCB for the online channels first, then use my tool to force-close zombie channels (which can still be very risky).
  • It's getting harder to actually lose coins, as this issue shows. But lnd is still in beta! That means it's far from "it just works and I don't have to know anything about it". There's still a lot of safety code that has yet to be written (I got some usable ideas from working on this issue). So please if you've got a life-altering amount of BTC in lnd (which you SHOULDN'T), at least educate yourself about the DOs and DON'Ts. I'm going to write some more explicit documentation on this too.
@guggero guggero closed this Dec 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.