Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node forced closed a channel with zapread.com, not sure why #4613

Closed
manreo opened this issue Sep 13, 2020 · 66 comments · Fixed by #4915
Closed

Node forced closed a channel with zapread.com, not sure why #4613

manreo opened this issue Sep 13, 2020 · 66 comments · Fixed by #4915
Assignees
Labels
bug Unintended code behaviour commitments Commitment transactions containing the state of the channel P1 MUST be fixed or reviewed p2p Code related to the peer-to-peer behaviour

Comments

@manreo
Copy link
Contributor

manreo commented Sep 13, 2020

Background

My node force-closed a channel with zapread.com, it was a good and active channel, I am not sure why my node decided to force-close the channel, some relevant things that I found, full log attached

2020-09-02 04:58:55.181 [ERR] HSWC: ChannelLink(563360:2528:1): failing link: ChannelPoint(25194f104e65a27cfd6867ac8e8708c9cb898e1212a51ff1862ffb03cc1e340b:1): received error from peer: chan_id=0b341ecc03fb2f86f11fa512128e89cbc908878eac6768fd7ca2654e104f1924, err=internal error with error: remote error
2020-09-02 04:58:55.185 [ERR] HSWC: ChannelLink(563360:2528:1): link failed, exiting htlcManager
2020-09-02 04:58:55.185 [INF] HSWC: ChannelLink(563360:2528:1): exited

.
.
.

2020-09-02 12:45:07.278 [INF] DISC: GossipSyncer(037659a0ac8eb3b8d0a720114efc861d3a940382dcfa1403746b4f8f6b2e8810ba): applying gossipFilter(start=0001-01-01 00:00:00 +0000 UTC, end=0001-01-01 00:00:00 +0000 UTC)
2020-09-02 12:45:07.279 [INF] DISC: GossipSyncer(0231eccc6510eb2e1c97c8a190d6ea096784aa7c358355442055aac8b20654f932): applying gossipFilter(start=2020-09-02 12:45:07.279317998 +0000 UTC m=+408745.563033846, end=2156-10-09 19:13:22.279317998 +0000 UTC m=+4295376040.563033846)
2020-09-02 12:45:37.279 [INF] DISC: Broadcasting 265 new announcements in 18 sub batches
2020-09-02 12:46:03.069 [INF] CRTR: Processed channels=0 updates=232 nodes=0 in last 59.998613183s
2020-09-02 12:47:03.069 [INF] CRTR: Processed channels=0 updates=272 nodes=0 in last 59.999710346s
2020-09-02 12:47:07.338 [INF] DISC: Broadcasting 388 new announcements in 18 sub batches
2020-09-02 12:48:03.069 [INF] CRTR: Processed channels=0 updates=170 nodes=3 in last 59.999843239s
2020-09-02 12:48:37.433 [INF] DISC: Broadcasting 219 new announcements in 17 sub batches
2020-09-02 12:48:54.682 [INF] CRTR: Pruning channel graph using block 0000000000000000000e65c5c2126e7c902fa09351b8087ab331182eeabebfab (height=646431)
2020-09-02 12:48:56.056 [INF] NTFN: New block: height=646431, sha=0000000000000000000e65c5c2126e7c902fa09351b8087ab331182eeabebfab
2020-09-02 12:48:56.059 [ERR] LTND: Unable to lookup witness: no witnesses
2020-09-02 12:48:56.060 [INF] UTXN: Attempting to graduate height=646431: num_kids=0, num_babies=0
2020-09-02 12:48:56.413 [INF] HSWC: Garbage collected 1 shared secret hashes at height=646431
2020-09-02 12:48:56.467 [INF] CNCT: ChannelArbitrator(25194f104e65a27cfd6867ac8e8708c9cb898e1212a51ff1862ffb03cc1e340b:1): force closing chan
2020-09-02 12:48:56.611 [INF] HSWC: Removing channel link with ChannelID(0b341ecc03fb2f86f11fa512128e89cbc908878eac6768fd7ca2654e104f1924)
2020-09-02 12:48:57.822 [INF] CRTR: Block 0000000000000000000e65c5c2126e7c902fa09351b8087ab331182eeabebfab (height=646431) closed 1 channels
2020-09-02 12:49:00.967 [INF] CNCT: Broadcasting force close transaction d9ad30a963638a6efe405b608e1746b55092cb877e620d7135fe083becadbc51, ChannelPoint(25194f104e65a27cfd6867ac8e8708c9cb898e1212a51ff1862ffb03cc1e340b:1): (*wire.MsgTx)(0x40016b6200)({
 Version: (int32) 2,
 TxIn: ([]*wire.TxIn) (len=1 cap=1) {
  (*wire.TxIn)(0x40072170e0)({
   PreviousOutPoint: (wire.OutPoint) 25194f104e65a27cfd6867ac8e8708c9cb898e1212a51ff1862ffb03cc1e340b:1,
   SignatureScript: ([]uint8) <nil>,
   Witness: (wire.TxWitness) (len=4 cap=4) {
    ([]uint8) <nil>,
    ([]uint8) (len=72 cap=144) {
     00000000  30 45 02 21 00 cb 77 1a  b5 0a 77 19 78 f8 6e d1  |0E.!..w...w.x.n.|
     00000010  5b d3 0d f2 c5 09 b8 0a  f1 cd 7b e4 d5 37 54 ca  |[.........{..7T.|
     00000020  03 8c fd 45 41 02 20 00  98 e9 33 4e 9b 54 72 1c  |...EA. ...3N.Tr.|
     00000030  03 46 f9 80 7b f3 3f 2c  ff 9c 9b bf 9d bf fb d6  |.F..{.?,........|
     00000040  57 0f d7 89 b8 b5 9f 01                           |W.......|
    },
    ([]uint8) (len=72 cap=144) {
     00000000  30 45 02 21 00 c7 1b 89  a9 bb 26 e3 70 62 da 92  |0E.!......&.pb..|
     00000010  18 79 e4 b3 c6 2d cd d2  23 55 33 82 6b 6d 69 e5  |.y...-..#U3.kmi.|
     00000020  4f e5 15 e8 ed 02 20 1a  0f c3 a1 e3 78 c9 60 02  |O..... .....x.`.|
     00000030  23 33 20 47 70 0b 4c a2  71 0a 75 5c 8c f0 ef 90  |#3 Gp.L.q.u\....|
     00000040  5a b2 d8 b8 21 92 e2 01                           |Z...!...|
    },
    ([]uint8) (len=71 cap=500) {
     00000000  52 21 02 10 09 28 9d 07  68 3c dc 41 e5 b6 c5 45  |R!...(..h<.A...E|
     00000010  2d e9 ae e4 2f 24 5a bd  7f c5 ca 2b 7f 81 87 f7  |-.../$Z....+....|
     00000020  1f 5f f9 21 03 fd 59 d9  e6 f3 cc 0b 54 bb 88 11  |._.!..Y.....T...|
     00000030  8b 8e b1 aa 2d 5e f7 f3  b2 35 7f 05 9d 84 d4 0e  |....-^...5......|
     00000040  ab 89 94 d2 d9 52 ae                              |.....R.|
    }
   },
   Sequence: (uint32) 2160573938
  })
 },
 TxOut: ([]*wire.TxOut) (len=2 cap=2) {
  (*wire.TxOut)(0x4004b83480)({
   Value: (int64) 18918,
   PkScript: ([]uint8) (len=22 cap=22) {
    00000000  00 14 34 61 63 e4 5b 89  ae dd 6b 9d 6f 3d bf 58  |..4ac.[...k.o=.X|
    00000010  17 fd c8 2e d4 61                                 |.....a|
   }
  }),
  (*wire.TxOut)(0x4004b834a0)({
   Value: (int64) 151877,
   PkScript: ([]uint8) (len=34 cap=34) {
    00000000  00 20 64 f9 21 c5 22 fc  e3 38 b4 97 59 36 eb a5  |. d.!."..8..Y6..|
    00000010  6d c4 1f 4a 2d 1d 25 d9  18 9b 7d 9b 3b d8 cc cf  |m..J-.%...}.;...|
    00000020  4a db                                             |J.|
   }
  })
 },
 LockTime: (uint32) 540252184
})

2020-09-02 12:49:01.181 [INF] LNWL: Inserting unconfirmed transaction d9ad30a963638a6efe405b608e1746b55092cb877e620d7135fe083becadbc51

Your environment

lnd version 0.11.0-beta commit=v0.11.0-beta
Bitcoin Core Daemon version v0.18.0
Linux rock64 4.4.132-1075-rockchip-ayufan-ga83beded8524 #1 SMP Thu Jul 26 08:22:22 UTC 2018 aarch64 GNU/Linux

Full log available here (password protected, send me a slack message and I'll send you the password)

lnd.log.80.zip

@enorrmann
Copy link

same thing happened to me, had a couple of force closings for no reason
lnd 0.11.0-beta, neutrino, linux x64

@Roasbeef
Copy link
Member

Unless we've identified a new spurious force close vector. I'd check that you didn't have an HTLC which has expired. If not, then although we've fixed a number of spurious force close vectors, both sides need to be updated to have them disappear completely.

@manreo
Copy link
Contributor Author

manreo commented Sep 15, 2020

Unless we've identified a new spurious force close vector. I'd check that you didn't have an HTLC which has expired

Not sure how to check it? I've uploaded the full log on those days.

both sides need to be updated to have them disappear completely.

Luckily he is on github. @Horndev can you verify please what was your ZapRead node lnd version around 2020-09-02 ?
Thank you

@enorrmann
Copy link

enorrmann commented Sep 15, 2020

happened again with another node, I have some logging information here
https://rayo.host/logs/force_close.log
I'm running 0.11Beta.. and the other peer is @Blockdaemon, I'm not sure which version are they on

@enorrmann
Copy link

and another one
https://rayo.host/logs/force_close_2.log

@enorrmann
Copy link

another one
https://rayo.host/logs/force_close_3.log

@Roasbeef
Copy link
Member

As mentioned above, if no HTLCs were present on the commitment transaction, then it may be due to the other node having an older version with bugs that could cause it to send an invalid commitment. In each of those statements, I'd look further back in the logs to see if the peer sent an invalid commitment earlier in the exchange.

@Crypt-iQ Crypt-iQ self-assigned this Sep 21, 2020
@Crypt-iQ
Copy link
Collaborator

If you can get the version of the peer, this would help us. Also if there's a specific "invalid commitment" log message, that would aid us.

@enorrmann
Copy link

enorrmann commented Sep 25, 2020

2 days ago I opened this channels
lncli openchannel --node_key 03a5886df676f3b3216a4520156157b8d653e262b520281d8d325c24fd8b456b9c --sat_per_byte 3 --local_amt 3000000
lncli openchannel --node_key 02004c625d622245606a1ea2c1c69cfb4516b703b47945a3647713c05fe4aaeb1c --sat_per_byte 3 --local_amt 6000000
lncli openchannel --node_key 0395033b252c6f40e3756984162d68174e2bd8060a129c0d3462a9370471c6d28f --sat_per_byte 3 --local_amt 8000000
lncli openchannel --node_key 03e50492eab4107a773141bb419e107bda3de3d55652e6e1a41225f06a0bbf2d56 --sat_per_byte 3 --local_amt 7000000

today I found all of them were force closed from my side
no "invalid commitment" log message found
Edit: node names are
yalls-clearnet
WalletOfSatoshi.com
BitMEXResearch

@Crypt-iQ
Copy link
Collaborator

Crypt-iQ commented Sep 25, 2020

Can you try to pinpoint the error messages, there should be something like link failed. The thing I'm after is what message was sent across. As you are a neutrino node, force closes with older nodes is more likely to happen. If any of the other counterparties are not on the 0.11 branch, then force closes can happen like laolu mentioned up above

@enorrmann
Copy link

@Crypt-iQ
Copy link
Collaborator

Thanks, is there a way to open it to a paste-bin like site if things aren't too sensitive?

@enorrmann
Copy link

https://pastebin.com/BLhbEC44
do you need the full log ?

@Crypt-iQ
Copy link
Collaborator

Two interesting messages:

  • 2020-09-25 04:51:17.080 [ERR] HSWC: ChannelLink(646845:1072:0): failing link: unable to handle upstream add HTLC: ID 27 on HTLC add does not match expected next ID 26 with error: invalid update
  • 2020-09-25 04:51:20.760 [WRN] HSWC: ChannelLink(646845:1072:0): error when syncing channel states: first message sent to sync should be ChannelReestablish, instead received: *lnwire.Error

First error should've been fixed in 0.11 if they upgraded too. Not sure why the other occurs

Ping @cfromknecht

@manreo
Copy link
Contributor Author

manreo commented Sep 25, 2020

Another "local force-close", with runam0k nodl:

Peer Public Key: 02f8f981a3d6cb6536fc12ea2abdcfd20f7490e28197514aafc05a7a1f08d3de09
Channel Point: c5e508504fda9c200d458bb68c8e814615fdf805a44a8bb3c5fa69ffb7f59ce5
Closing Transaction Hash: 22429d9eb955db66c638a4e8a23b0a41cfc4862dc4d713d7b0d6f478e3fe4083

Full log:
https://privatebin.net/?6fa87ce4beca6a74#C8QqFN1YY3zP985v3HQLbe6rv79LQyGyHNsSigycerNz

@Crypt-iQ
Copy link
Collaborator

So you need debug level logs on - and you keep disconnecting with a tor node so it could be that this is triggering those issues fixed in 0.11. It makes sense that a tor node would cause reliability issues.

@manreo
Copy link
Contributor Author

manreo commented Sep 25, 2020

Hi @Crypt-iQ how can I run with debug mode?
About the disconnects, is it related to this:
#4575

Still waiting for a response from @Roasbeef about #4575

@enorrmann
Copy link

Hi @Crypt-iQ how can I run with debug mode?

just edit lnd.conf
debuglevel=debug or finer detail
see https://github.com/lightningnetwork/lnd/blob/master/sample-lnd.conf

@enorrmann
Copy link

today I got another one... force closing from my side
https://pastebin.com/FtzewS93
channel point ea3e68dcba3367a4d5b59da9f6366c42a87d3023422ff79925d1659682587f43:0

@Roasbeef
Copy link
Member

First error should've been fixed in 0.11 if they upgraded too. Not sure why the other occurs

The second there can happen due to slight differences between implementations w.r.t what order they expect certain messages in.

2020-09-25 04:51:17.065 [INF] HSWC: ChannelLink(646845:1072:0): received re-establishment message from remote side
2020-09-25 04:51:17.080 [ERR] HSWC: ChannelLink(646845:1072:0): failing link: unable to handle upstream add HTLC: ID 27 on HTLC add does not match expected next ID 26 with error: invalid update

Note sure which implementation this is, but we've seen in the past that certain implementations don't re-transmit HTLCs in order of increasing HTLC ID, which can at times trigger this issue.

@enorrmann on that latest one, there were no other logs before that force close attempt, nothing on the p2p layer of maybe someone triggering a DLP scenario?

@Roasbeef
Copy link
Member

Edit: node names are

This is really useful info, thanks! So from here, we should be able to check out the node announcements themselves, then check the feature bits to see if we can place them in an lnd version bucket from there.

@Crypt-iQ
Copy link
Collaborator

Can you both resend logs with debug enabled if you can? Can do over slack, name is "eugene". Or publicly here.

@manreo
Copy link
Contributor Author

manreo commented Sep 30, 2020

Hi @Crypt-iQ , the node is now running with debung mode, but I need to wait for a new forceclose to have the log files that includes debugstate, correct ?

@Crypt-iQ
Copy link
Collaborator

@mrmanpew yes

@manreo
Copy link
Contributor Author

manreo commented Oct 6, 2020

I got two more force close, but I don't have the log files as only 3 logs are saved and now that I am in debug mode, the logs already past the time of the close 🤦
I changed the settings to save more than 3 log files... But now we need to wait again. Those force-close are expensive...

@Roasbeef
Copy link
Member

Roasbeef commented Oct 6, 2020

@mrmanpew after doing debug mode for all the logs, if you set lncli debuglevel --level=PEER=info.

@manreo
Copy link
Contributor Author

manreo commented Oct 6, 2020

@Roasbeef
If I set lncli debuglevel --level=PEER=info then what? I think you did not finish the sentence.
This is my lnd.conf:

# RaspiBolt LND Mainnet: lnd configuration
# /home/bitcoin/.lnd/lnd.conf
[Application Options]
debuglevel=debug
maxpendingchannels=5
color=#68F442
listen=localhost
nat=false
maxlogfiles=50
maxlogfilesize=20
#sync-freelist=1
rpclisten=0.0.0.0:10009
tlsextraip=192.168.1.0/24


[Bitcoin]
bitcoin.active=1
bitcoin.mainnet=1
bitcoin.node=bitcoind

[autopilot]
autopilot.active=0
autopilot.maxchannels=5
autopilot.allocation=0.6


[Tor]
tor.active=true
tor.v3=true

@manreo
Copy link
Contributor Author

manreo commented Oct 8, 2020

@Roasbeef @Crypt-iQ
Another one, with LN.SIDESHIFT.AI

2020-10-08 20:01:51.993 [DBG] CNCT: ChannelArbitrator(3114234b7b95278049b6790037b3e7d74d56a9cd13956d9fce32b05adc7ea140:0): go to chain for outgoing htlc 4811bd19193e4a29f5fcd025ab5e4d95b11f5bce0d1be89942bede0e3bec74c8: timeout=651859, blocks_until_expiry=0, broadcast_delta=0
2020-10-08 20:01:52.423 [INF] CNCT: ChannelArbitrator(3114234b7b95278049b6790037b3e7d74d56a9cd13956d9fce32b05adc7ea140:0): force closing chan
2020-10-08 20:01:59.410 [DBG] LNWL: ChannelPoint(3114234b7b95278049b6790037b3e7d74d56a9cd13956d9fce32b05adc7ea140:0): starting local commitment: 

Full log attached
lnd.log.597_pass.zip

Password sent via slack.

@mrfelton
Copy link
Contributor

mrfelton commented Oct 20, 2020

No, and the node that I had to disconnect/reconnect was a different one (Bitrefill). We have several channels with OpenNode and have not noticed any specific issues with them prior to this force closure.

@Crypt-iQ
Copy link
Collaborator

No, and the node that I had to disconnect/reconnect was a different one (Bitrefill). We have several channels with OpenNode and have not noticed any specific issues with them prior to this force closure.

Do you know what version OpenNode is running?

@mrfelton
Copy link
Contributor

Apparently they just updated from 0.7.x to to v0.11.x yesterday @Crypt-iQ

@Crypt-iQ
Copy link
Collaborator

Apparently they just updated from 0.7.x to to v0.11.x yesterday @Crypt-iQ

Do you know (from your logs maybe?) whether the force close occurred before or after the upgrade?

@manreo
Copy link
Contributor Author

manreo commented Oct 20, 2020

Two more force-close
Node names:
@coingate
@BitMEXResearch

In the case of @BitMEXResearch
out_htlc_count>0 so I assume it suppose to force close, right?
Do you want me to keep reporting it? Or should I stop?

2020-10-20 02:31:08.769 [DBG] CNCT: ChannelArbitrator(5508a3cb1bd3658bfd3b4a1b4863210e8af661f650b880dc44ac74ec7af7b375:0): checking commit chain actions at height=653506, in_htlc_count=0, out_htlc_count=2
 lnd.log.905
2020-10-20 02:31:08.774 [DBG] CNCT: ChannelArbitrator(5508a3cb1bd3658bfd3b4a1b4863210e8af661f650b880dc44ac74ec7af7b375:0): go to chain for outgoing htlc 12c242d9196a39238f262f0ab4bdd396b2c87650047739a23cbc9449858b827f: timeout=653506, blocks_until_expiry=0, broadcast_delta=0
 lnd.log.905
2020-10-20 02:31:09.451 [INF] CNCT: ChannelArbitrator(5508a3cb1bd3658bfd3b4a1b4863210e8af661f650b880dc44ac74ec7af7b375:0): force closing chan

In the case of @coingate , It catually seems that I owed them an htlc (we owe them a commitment), so I did not return it and this is why the force-closed the channel. No idea why my node did not return it.

2020-10-19 04:03:19.874 [DBG] CNCT: ChannelArbitrator(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): checking commit chain actions at height=653376, in_htlc_count=2, out_htlc_count=0
2020-10-19 04:05:08.324 [DBG] CHFT: Channel 0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0 recording event: peer_offline
2020-10-19 04:05:09.860 [DBG] CHFT: Channel 0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0 recording event: peer_online
2020-10-19 04:05:10.244 [DBG] LNWL: ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): sync: remote believes our tail height is 3434, while we have 3435, we owe them a revocation
2020-10-19 04:05:10.245 [DBG] LNWL: ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): sync: remote's next commit height is 3436, while we believe it is 3436, we owe them a commitment
2020-10-19 04:05:11.796 [ERR] HSWC: ChannelLink(584107:1286:0): failing link: ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): received error from peer: chan_id=94d89be7009f2ad541f92d22b0fd357f2025d16813f2f66386bb6e6088dc7b0b, err=invalid commitment with error: remote error
2020-10-19 04:19:49.523 [DBG] NTFN: Found spend of outpoint=0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0, script=0 cd5be724c762ad4f4c889fd4999410b01b5c1db657773c73230406b91ea40b68: spend_tx=9ac829b5ffe4ebb777f67c45f2b1806bc310627cdf6cd24d5c4cec08df697ad2, block_height=653377
2020-10-19 04:19:51.088 [DBG] NTFN: Found spend of outpoint=0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0, script=0 cd5be724c762ad4f4c889fd4999410b01b5c1db657773c73230406b91ea40b68: spend_tx=9ac829b5ffe4ebb777f67c45f2b1806bc310627cdf6cd24d5c4cec08df697ad2, block_height=653377
2020-10-19 04:19:51.088 [DBG] NTFN: Spend request outpoint=0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0, script=0 cd5be724c762ad4f4c889fd4999410b01b5c1db657773c73230406b91ea40b68 spent at tip=653377
2020-10-19 04:19:51.970 [INF] NTFN: Dispatching confirmed spend notification for outpoint=0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0, script=0 cd5be724c762ad4f4c889fd4999410b01b5c1db657773c73230406b91ea40b68 at current height=653377: 9ac829b5ffe4ebb777f67c45f2b1806bc310627cdf6cd24d5c4cec08df697ad2[0] spending 0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0 at height=653377
2020-10-19 04:19:51.973 [DBG] CNCT: ChannelArbitrator(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): new block (height=653377) examining active HTLC's
2020-10-19 04:19:51.979 [DBG] CNCT: ChannelArbitrator(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): checking commit chain actions at height=653377, in_htlc_count=2, out_htlc_count=0
2020-10-19 04:19:51.988 [DBG] CNCT: ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): local_commit_type=0, local_commit=(*channeldb.ChannelCommitment)(0x400082ebf0)({
    PreviousOutPoint: (wire.OutPoint) 0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0,
2020-10-19 04:19:51.995 [DBG] CNCT: ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): remote_commit_type=0, remote_commit=(*channeldb.ChannelCommitment)(0x400082ec70)({
    PreviousOutPoint: (wire.OutPoint) 0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0,
2020-10-19 04:19:51.999 [DBG] CNCT: ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): remote_pending_commit_type=0, remote_pending_commit=(channeldb.ChannelCommitment) {
    PreviousOutPoint: (wire.OutPoint) 0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0,
2020-10-19 04:19:52.008 [WRN] CNCT: Unprompted commitment broadcast for ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0) 
2020-10-19 04:19:52.013 [INF] CNCT: Unilateral close of ChannelPoint(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0) detected
2020-10-19 04:19:52.019 [INF] CNCT: ChannelArbitrator(0b7bdc88606ebb8663f6f21368d125207f35fdb0222df941d52a9f00e79bd894:0): remote party has closed channel out on-chain

Tell me if you are interested in the full logs.

@mrfelton
Copy link
Contributor

Apparently they just updated from 0.7.x to to v0.11.x yesterday @Crypt-iQ

Do you know (from your logs maybe?) whether the force close occurred before or after the upgrade?

They performed the upgrade at around 22:45 PST. The channel closure happened at 01:42 PST. So, seems the channel was force closed whist their node was running the older version.

@Crypt-iQ
Copy link
Collaborator

@mrmanpew for the coingate channel this is tracked as a spec issue

@mrfelton This should be normal as the older version of the software inconsistently handled some commitment tx data.

If either of you see any force closures with both channel parties running v0.11, then I think we can look into that more concretely.

@mrfelton
Copy link
Contributor

mrfelton commented Oct 22, 2020

Just had three more channels with Fold node simultaneously force closed:

2020-10-22 06:43:23.834 [DBG] CNCT: ChannelArbitrator(5e9eac0744e00bb7b62c5aad5b64047b6a9ead5676d7908b4551bac19f743683:0): checking commit chain actions at height=653793, in_htlc_count=0, out_htlc_count=0
2020-10-22 06:43:23.834 [DBG] CNCT: ChannelArbitrator(84ec882c4afbbe0490d44071bb889cd479a8c9563da8f7df1e5fa5409a38ba7f:0): checking commit chain actions at height=653793, in_htlc_count=0, out_htlc_count=0
2020-10-22 06:43:23.834 [DBG] CNCT: ChannelArbitrator(d829cae5f3a2658de40738200efc4041eb44e3f4ab35748ff121fd7b2df24890:0): checking commit chain actions at height=653793, in_htlc_count=0, out_htlc_count=0
2020-10-22 06:43:23.834 [DBG] CNCT: ChannelArbitrator(7b9e4476fddf8f4fbc0f1ffad157cde4b23fcbe86eacbb2c4eb27d0f351af2be:0): checking commit chain actions at height=653793, in_htlc_count=0, out_htlc_count=0
2020-10-22 06:43:23.835 [INF] CNCT: ChannelArbitrator(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): force closing chan
2020-10-22 06:43:23.835 [INF] CNCT: ChannelArbitrator(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): force closing chan
2020-10-22 06:43:23.838 [INF] HSWC: Garbage collected 10 shared secret hashes at height=653793
2020-10-22 06:43:23.838 [INF] HSWC: Removing channel link with ChannelID(d40b77e9e41ffe6bd7eea5d8f14d0f9460645b4022377e07fcd116ddd05c533f)
2020-10-22 06:43:23.840 [DBG] LNWL: ChannelPoint(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): starting local commitment: (*lnwallet.commitment)(0xc0051b8300)({
2020-10-22 06:43:23.843 [DBG] LNWL: ChannelPoint(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): starting remote commitment: (*lnwallet.commitment)(0xc0051b8480)({
2020-10-22 06:43:23.846 [INF] HSWC: Removing channel link with ChannelID(39d75cc2d2d192adff56def4beea62047ea994998e1e458a34504ad4aac2a48e)
2020-10-22 06:43:23.845 [DBG] LNWL: ChannelPoint(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): pending remote commitment: (*lnwallet.commitment)(0xc003aee3c0)({
2020-10-22 06:43:23.848 [DBG] LNWL: ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): starting local commitment: (*lnwallet.commitment)(0xc003a12900)({
2020-10-22 06:43:23.850 [DBG] LNWL: ChannelPoint(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): Restoring 0 dangling remote updates
2020-10-22 06:43:23.850 [DBG] LNWL: ChannelPoint(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): Restoring 0 local updates that the peer should sign
2020-10-22 06:43:23.850 [INF] CNCT: ChannelArbitrator(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): force closing chan
2020-10-22 06:43:23.851 [DBG] LNWL: ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): starting remote commitment: (*lnwallet.commitment)(0xc0033f7380)({
2020-10-22 06:43:23.852 [DBG] LNWL: ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): pending remote commitment: (*lnwallet.commitment)(0xc0045cd200)({
2020-10-22 06:43:23.852 [INF] HSWC: Removing channel link with ChannelID(03cf6441e8fd5b432e65de2ce6f8a551b71de906e53ff0a4e08affc846aadeb9)
2020-10-22 06:43:23.853 [DBG] LNWL: ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): Restoring 0 dangling remote updates
2020-10-22 06:43:23.853 [DBG] LNWL: ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): Restoring 0 local updates that the peer should sign
2020-10-22 06:43:23.854 [INF] CNCT: Broadcasting force close transaction 679ccd2c3464212f25268ca8a01e3ee90bcef0ff5efc6b80e7b85c167cf9209a, ChannelPoint(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): (*wire.MsgTx)(0xc000822940)({
2020-10-22 06:43:23.855 [INF] LNWL: Inserting unconfirmed transaction 679ccd2c3464212f25268ca8a01e3ee90bcef0ff5efc6b80e7b85c167cf9209a
2020-10-22 06:43:23.855 [DBG] LNWL: ChannelPoint(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): starting local commitment: (*lnwallet.commitment)(0xc0051b8540)({
2020-10-22 06:43:23.857 [DBG] LNWL: ChannelPoint(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): starting remote commitment: (*lnwallet.commitment)(0xc003f2a6c0)({
2020-10-22 06:43:23.858 [INF] CNCT: Broadcasting force close transaction 5a24bf384d9c0dc805510772c9aac633000401842cf7735fedb293385820d4c3, ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): (*wire.MsgTx)(0xc004ec3340)({
2020-10-22 06:43:23.858 [INF] LNWL: Inserting unconfirmed transaction 5a24bf384d9c0dc805510772c9aac633000401842cf7735fedb293385820d4c3
2020-10-22 06:43:23.858 [DBG] LNWL: ChannelPoint(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): pending remote commitment: (*lnwallet.commitment)(0xc003f2a900)({
2020-10-22 06:43:23.860 [DBG] LNWL: ChannelPoint(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): Restoring 0 dangling remote updates
2020-10-22 06:43:23.860 [DBG] LNWL: ChannelPoint(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): Restoring 0 local updates that the peer should sign
2020-10-22 06:43:23.866 [INF] CNCT: Broadcasting force close transaction 519511b702e65dd42f4b46ac6cb8f2716a9ef10e09cf60a03fdb23f3150eca50, ChannelPoint(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): (*wire.MsgTx)(0xc005680f40)({
2020-10-22 06:43:23.867 [INF] LNWL: Inserting unconfirmed transaction 519511b702e65dd42f4b46ac6cb8f2716a9ef10e09cf60a03fdb23f3150eca50
2020-10-22 06:43:23.941 [DBG] LNWL: ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): starting local commitment: (*lnwallet.commitment)(0xc00473c000)({
2020-10-22 06:43:23.941 [DBG] LNWL: ChannelPoint(3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1): starting local commitment: (*lnwallet.commitment)(0xc0049c7680)({
2020-10-22 06:43:23.943 [DBG] LNWL: ChannelPoint(8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0): starting remote commitment: (*lnwallet.commitment)(0xc003f2a000)({
2020-10-22 06:43:23.944 [DBG] LNWL: ChannelPoint(b9deaa46c8ff8ae0a4f03fe506e91db751a5f8e62cde652e435bfde84164cf03:0): starting local commitment: (*lnwallet.commitment)(0xc004b26300)({

@mrfelton
Copy link
Contributor

Just to add to that too - we are also experiencing connectivity issues with the Fold node. It randomly starts showing as offline when in fact I know it is online. For example, right now I have the following channels with Fold:

7x pending force closing
1x online
1x offline

How can one of the channels be showing as offline whist the other shows as online, given that they are both to the same peer?

Running the following makes the offline one start showing online again:

lncli disconnect 02816caed43171d3c9854e3b0ab2cf0c42be086ff1bd4005acc2a5f7db70d83774
lncli connect 02816caed43171d3c9854e3b0ab2cf0c42be086ff1bd4005acc2a5f7db70d83774@35.238.153.25:9735

I'm detailing it here since this seems to be a common thread with the nodes that keep force closing on us but let me know if I should I create a separate issue for the disconnect/reconnect issue.

@ghost
Copy link

ghost commented Oct 22, 2020

Thanks @mrfelton ! Fold was one I was having the forced closed issue with myself. Do we know what version they are on?

@Crypt-iQ
Copy link
Collaborator

Crypt-iQ commented Oct 22, 2020

Could you please create a separate issue for the connect/disconnect issue so this one is just force closes @mrfelton ? Version for fold would help as well, [PEER] & [HSWC] logs are probably mostly what is needed for diagnosis.

@tdickman
Copy link

Thanks @mrfelton ! Fold was one I was having the forced closed issue with myself. Do we know what version they are on?

Fold is on version 0.11.0-beta.

@Crypt-iQ
Copy link
Collaborator

@tdickman did Fold upgrade before or after these force closes?

@mrfelton
Copy link
Contributor

mrfelton commented Oct 22, 2020

They have been running this node version for a while. Here are the full logs from around the time of the closures.

lnd-logs.zip

@tdickman
Copy link

tdickman commented Oct 22, 2020

@mrfelton and I have been DM-ing back and forth. There are 4 channels that have been force closed between us recently:

3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1
3096b5bbf7a31c2bd9b262572cd4e8b169e6ac78fb77a420f6d8575e109f4db3:0
8ea4c2aad44a50348a451e8e9994a97e0462eabef4de56ffad92d1d2c25cd739:0
88f8307d2c8c9e15a4d2181dc9db2c10a2ae26d5f94e98d831c5ef3db0c04d20:0

None of these appear in our channels list (which is weird). Here is a filtered view of our logs, searching for any references to the first channel: https://paste.ubuntu.com/p/B2ZC4K6Twz/

@mrfelton
Copy link
Contributor

Here is our data on that same channel:

        {
            "channel": {
                "remote_node_pub": "02816caed43171d3c9854e3b0ab2cf0c42be086ff1bd4005acc2a5f7db70d83774",
                "channel_point": "3e535cd0dd16d1fc077e3722405b6460940f4df1d8a5eed76bfe1fe4e9770bd4:1",
                "capacity": "15000000",
                "local_balance": "5032983",
                "remote_balance": "0",
                "local_chan_reserve_sat": "0",
                "remote_chan_reserve_sat": "0",
                "initiator": "INITIATOR_LOCAL",
                "commitment_type": "STATIC_REMOTE_KEY"
            },
            "closing_txid": "679ccd2c3464212f25268ca8a01e3ee90bcef0ff5efc6b80e7b85c167cf9209a",
            "limbo_balance": "5032983",
            "maturity_height": 655597,
            "blocks_til_maturity": 1754,
            "recovered_balance": "0",
            "pending_htlcs": [
            ],
            "anchor": "LIMBO"
        },

@Roasbeef Roasbeef added bug Unintended code behaviour commitments Commitment transactions containing the state of the channel P1 MUST be fixed or reviewed p2p Code related to the peer-to-peer behaviour labels Oct 22, 2020
@sputn1ck
Copy link
Collaborator

We had an unexpected unilateral channel close between 2 nodes we control, both running 0.11.0. We have not found "expire" in our logs. Are there some logs we can provide?
This was the commitment error message:
Oct 21 21:13:22 donnernode lnd[24389]: 2020-10-21 21:13:22.428 [ERR] HSWC: ChannelLink(652714:1118:1): failing link: ChannelPoint(d7f0b27fe98a68640250c71414cbf598a85898ca5d970992d6115f999a410dc9:1): unable to accept new commitment: rejected commitment: commit_height=3189, invalid_commit_sig=3045022100e47d73be443dd4fb11e81a7d4720572b03dca9ff33dbeba7f6e8f8f97bed6e0202206ffa41247f2b575e4f5be5ac8fa78d611e7137833c4311ab532fa1473a0c1f11, commit_tx=0200000001c90d419a995f11d69209975dca9858a898f5cb1414c7500264688ae97fb2f0d70100000000b6d3a58002be3d1b0000000000160014ca59ef805ca3acabbf3992a66a2dd59486503f43e7be300000000000220020a292f4e51422e95a4fef0b61d84f0a54f1bb9c86cea3911ae0d8ad71d5edb4c91be2cf20, sig_hash=77fcb455787523de2045d4fab9c2c5771c5b3b4d56c2809040be8add152a5c5f with error: invalid commitment

@manreo
Copy link
Contributor Author

manreo commented Oct 23, 2020

Another force-close from my side, now with @RiverFinancial , I assume that they are running latest version
in_htlc_count=0, out_htlc_count=0

full log, debug mode (password is the same as before, see slack)

2020-10-22 15:31:48.864 [DBG] CNCT: ChannelArbitrator(2ffd7fc2c769ffac18b04f81a01e8419331ee084ca471cf4aecaa2da800e4631:0): new block (height=653846) examining active HTLC's
2020-10-22 15:31:48.864 [DBG] CNCT: ChannelArbitrator(2ffd7fc2c769ffac18b04f81a01e8419331ee084ca471cf4aecaa2da800e4631:0): checking commit chain actions at height=653846, in_htlc_count=0, out_htlc_count=0
2020-10-22 15:31:49.326 [DBG] DISC: New block: height=653846, hash=0000000000000000000867f160899f5ccbf5e93e050195a0500d3e1bdc80993b
2020-10-22 15:31:49.326 [DBG] PEER: Received ChannelUpdate(chain_hash=000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f, short_chan_id=678047930287587328, mflags=00000001, cflags=00000001, update_time=2020-10-22 15:03:21 +0000 UTC) from 0311cad0edf4ac67298805cf4407d94358ca60cd44f2e360856f3b1c088bcd4782@213.174.156.86:9735
2020-10-22 15:31:49.360 [INF] CNCT: ChannelArbitrator(2ffd7fc2c769ffac18b04f81a01e8419331ee084ca471cf4aecaa2da800e4631:0): force closing chan

lndlog.zip

@acburkhardt
Copy link

Experienced a force close in the following situation:

  • Channel flooded with HTLCs in jamming experiment
  • Endless payment loop after attempting to pay an invoice through this channel
  • Failed to reconnect; rejected commitment:
[ERR] HSWC: ChannelLink(657759:561:0): failing link: ChannelPoint(c37bec006b18df172698a84739ca47128935e0a8666fecd1a843e49b01db207c:0): received error from peer: chan_id=7c20db019be443a8d1ec6f66a8e035891247ca3947a8982617df186b00ec7bc3, err=rejected commitment: commit_height=455, invalid_commit_sig=3044022076fd65191eb6305b723fa6012be378413b6326e2786c38db58b4c02e1f3999d202207605ca31de8b4c5b1d9cd20dc1581dfa2383e0b4e06c8ad4f718ab5c434d8cf5, commit_tx=02000000017c20db019be443a8d1ec6f66a8e035891247ca3947a8982617df186b00ec7bc300000000008a792e8002210d0000000000002200201031cf10a1efef261edd3d0a1a6a953b27bc25bd7150bb2b07afdc69805e02157213000000000000160014de650929042bef58b71783ae1a44834a902a8f2d542ca720, sig_hash=4e0fb804c74376020e4c44a60969b9206eb0aaa9a89b76017d60f23ad5cf63e5 with error: remote error
  • Channel force closed by peer

@dskvr
Copy link

dskvr commented Aug 7, 2023

There must be a better way to debug these issues...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Unintended code behaviour commitments Commitment transactions containing the state of the channel P1 MUST be fixed or reviewed p2p Code related to the peer-to-peer behaviour
Projects
None yet
Development

Successfully merging a pull request may close this issue.

12 participants