[bug]: Channels are disconnected after an "unknown message" error #7273

Closed
ekimber opened this issue Dec 25, 2022 · 10 comments · Fixed by #7415
Labels
bug Unintended code behaviour interop interop with other implementations

Comments

@ekimber

ekimber commented Dec 25, 2022

Background

Channels with Core Lightning (CLN) nodes (most recent 22.11 release and master) frequently disconnect after an "unknown message" error.

Your environment

  • version of lnd 0.15.5-beta
  • 5.15.81 1-NixOS x86_64
  • bitcoind v.24.01

Steps to reproduce

Dec 25 00:06:55 lnd[538311]: 2022-12-25 00:06:55.363 [INF] PEER: Peer(02e8f0717f412f455bbcafabc7119f3e3bc1e6c7efeff3df14c98f4da14278dc95): unable to read message from peer: unable to parse message of unknown>
Dec 25 00:06:55 lnd[538311]: 2022-12-25 00:06:55.364 [INF] PEER: Peer(02e8f0717f412f455bbcafabc7119f3e3bc1e6c7efeff3df14c98f4da14278dc95): unable to read message from peer: EOF
Dec 25 00:06:55 lnd[538311]: 2022-12-25 00:06:55.364 [INF] PEER: Peer(02e8f0717f412f455bbcafabc7119f3e3bc1e6c7efeff3df14c98f4da14278dc95): disconnecting 02e8f0717f412f455bbcafabc7119f3e3bc1e6c7efeff3df14c98f>
Dec 25 00:06:55 lnd[538311]: 2022-12-25 00:06:55.364 [INF] NTFN: Cancelling epoch notification, epoch_id=6259

This happens frequently, perhaps several times in an hour.

Expected behaviour

Not disconnecting, or at least providing more information about the parse error so that it can be investigated.

Actual behaviour

Peers frequently disconnect. I cannot exclude the possibility that some of them are LND peers, but so far I have identified two of them as Core Lightning nodes.

@ekimber ekimber added bug Unintended code behaviour needs triage labels Dec 25, 2022
@ellemouton ellemouton added the interop interop with other implementations label Jan 3, 2023
@ellemouton
Collaborator

@ekimber, is this the full log line? unable to parse message of unknown?
The full line should be something like: unable to parse message of unknown type <type number here>
Which will help us understand if cln is sending a type they should not send or if it is something else.

@Roasbeef
Member

Roasbeef commented Jan 3, 2023

The disconnect seems to be from something else:

[INF] PEER: Peer(02e8f0717f412f455bbcafabc7119f3e3bc1e6c7efeff3df14c98f4da14278dc95): unable to read message from peer: EOF 

We'll parse unknown messages and just ignore them and have for some time now. If you run w/ the trace debug level, then you'll be able to see the message itself. We should update this log to print useful info tho:

return "<unknown>"
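
To illustrate why the log only shows "<unknown>", here is a minimal sketch of a message type whose String method falls back to a placeholder, together with an error that also prints the raw numeric value. The type and constant names are hypothetical stand-ins and are not copied from lnd.

```go
package main

import "fmt"

// MessageType is an illustrative 16-bit wire message type (hypothetical,
// not lnd's actual definition).
type MessageType uint16

const (
	MsgInit  MessageType = 16
	MsgError MessageType = 17
)

// String returns a readable name, or "<unknown>" for unrecognized types.
func (t MessageType) String() string {
	switch t {
	case MsgInit:
		return "Init"
	case MsgError:
		return "Error"
	default:
		return "<unknown>"
	}
}

// UnknownMessageError reports a message type that could not be parsed.
type UnknownMessageError struct {
	Type MessageType
}

// Error includes the raw numeric value so the log stays useful even when
// String falls back to "<unknown>".
func (e *UnknownMessageError) Error() string {
	return fmt.Sprintf("unable to parse message of unknown type: %v (%d)",
		e.Type, uint16(e.Type))
}

func main() {
	err := &UnknownMessageError{Type: 1}
	fmt.Println(err)
}
```

With the numeric value in the message, a reader of the log could tell which type the peer actually sent without raising the debug level.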

@ekimber
Author

ekimber commented Jan 3, 2023

@ellemouton The message type is of unknown type <unknown>

@Roasbeef So I am guessing that ignoring the message is somehow causing the peer to disconnect because it happens right after the unknown message. What module do I need to run with trace level to see the message itself? It does not log the messages with PEER set to trace.

@ekimber
Author

ekimber commented Jan 4, 2023

I believe I do not see the message in the logs because line 1128 of brontide.go, in the readNextMessage function, is never reached:
p.logWireMessage(nextMsg, true)
when the message type is unknown.

@guggero
Collaborator

guggero commented Jan 5, 2023

Could this be related to #7290? Is your peer a cln node?

@yaslama
Contributor

yaslama commented Jan 5, 2023

Could this be related to #7290? Is your peer a cln node?

@ekimber you can try the following patch to see the warning message:

diff --git a/funding/manager.go b/funding/manager.go
index 6f9384880..b96484f9c 100644
--- a/funding/manager.go
+++ b/funding/manager.go
@@ -839,15 +839,23 @@ func (f *Manager) reservationCoordinator() {
 			switch msg := fmsg.msg.(type) {
 			case *lnwire.OpenChannel:
 				f.handleFundingOpen(fmsg.peer, msg)
+
 			case *lnwire.AcceptChannel:
 				f.handleFundingAccept(fmsg.peer, msg)
+
 			case *lnwire.FundingCreated:
 				f.handleFundingCreated(fmsg.peer, msg)
+
 			case *lnwire.FundingSigned:
 				f.handleFundingSigned(fmsg.peer, msg)
+
 			case *lnwire.FundingLocked:
 				f.wg.Add(1)
 				go f.handleFundingLocked(fmsg.peer, msg)
+
+			case *lnwire.Warning:
+				f.handleWarningMsg(fmsg.peer, msg)
+
 			case *lnwire.Error:
 				f.handleErrorMsg(fmsg.peer, msg)
 			}
@@ -4075,12 +4083,16 @@ func (f *Manager) handleInitFundingMsg(msg *InitFundingMsg) {
 	}
 }
 
+// handleWarningMsg processes the warning which was received from remote peer.
+func (f *Manager) handleWarningMsg(peer lnpeer.Peer, msg *lnwire.Warning) {
+	log.Warnf("received warning message from peer %x: %v",
+		peer.IdentityKey().SerializeCompressed(), msg.Warning())
+}
+
 // handleErrorMsg processes the error which was received from remote peer,
 // depending on the type of error we should do different clean up steps and
 // inform the user about it.
-func (f *Manager) handleErrorMsg(peer lnpeer.Peer,
-	msg *lnwire.Error) {
-
+func (f *Manager) handleErrorMsg(peer lnpeer.Peer, msg *lnwire.Error) {
 	chanID := msg.ChanID
 	peerKey := peer.IdentityKey()
 
diff --git a/funding/manager_test.go b/funding/manager_test.go
index 4df3cd360..4d97eef9b 100644
--- a/funding/manager_test.go
+++ b/funding/manager_test.go
@@ -784,6 +784,14 @@ func fundChannel(t *testing.T, alice, bob *testNode, localFundingAmt,
 	// Forward the response to Alice.
 	alice.fundingMgr.ProcessFundingMsg(acceptChannelResponse, bob)
 
+	// Check that sending warning messages does not abort the funding
+	// process.
+	warningMsg := &lnwire.Warning{
+		Data: []byte("random warning"),
+	}
+	alice.fundingMgr.ProcessFundingMsg(warningMsg, bob)
+	bob.fundingMgr.ProcessFundingMsg(warningMsg, alice)
+
 	// Alice responds with a FundingCreated message.
 	fundingCreated := assertFundingMsgSent(
 		t, alice.msgChan, "FundingCreated",
diff --git a/htlcswitch/link.go b/htlcswitch/link.go
index dbf44d3f8..14c7c266a 100644
--- a/htlcswitch/link.go
+++ b/htlcswitch/link.go
@@ -2064,6 +2064,13 @@ func (l *channelLink) handleUpstreamMsg(msg lnwire.Message) {
 		// Update the mailbox's feerate as well.
 		l.mailBox.SetFeeRate(fee)
 
+	// In the case where we receive a warning message from our peer, just
+	// log it and move on. We choose not to disconnect from our peer,
+	// although we "MAY" do so according to the specification.
+	case *lnwire.Warning:
+		l.log.Warnf("received warning message from peer: %v",
+			msg.Warning())
+
 	case *lnwire.Error:
 		// Error received from remote, MUST fail channel, but should
 		// only print the contents of the error message if all
diff --git a/lnwire/lnwire.go b/lnwire/lnwire.go
index 0361d7648..5f042b746 100644
--- a/lnwire/lnwire.go
+++ b/lnwire/lnwire.go
@@ -91,60 +91,70 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case uint8:
 		var b [1]byte
 		b[0] = e
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case FundingFlag:
 		var b [1]byte
 		b[0] = uint8(e)
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case uint16:
 		var b [2]byte
 		binary.BigEndian.PutUint16(b[:], e)
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case ChanUpdateMsgFlags:
 		var b [1]byte
 		b[0] = uint8(e)
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case ChanUpdateChanFlags:
 		var b [1]byte
 		b[0] = uint8(e)
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case MilliSatoshi:
 		var b [8]byte
 		binary.BigEndian.PutUint64(b[:], uint64(e))
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case btcutil.Amount:
 		var b [8]byte
 		binary.BigEndian.PutUint64(b[:], uint64(e))
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case uint32:
 		var b [4]byte
 		binary.BigEndian.PutUint32(b[:], e)
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case uint64:
 		var b [8]byte
 		binary.BigEndian.PutUint64(b[:], e)
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case *btcec.PublicKey:
 		if e == nil {
 			return fmt.Errorf("cannot write nil pubkey")
@@ -156,6 +166,7 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if _, err := w.Write(b[:]); err != nil {
 			return err
 		}
+
 	case []Sig:
 		var b [2]byte
 		numSigs := uint16(len(e))
@@ -169,11 +180,13 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 				return err
 			}
 		}
+
 	case Sig:
 		// Write buffer
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
 	case PingPayload:
 		var l [2]byte
 		binary.BigEndian.PutUint16(l[:], uint16(len(e)))
@@ -184,6 +197,7 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
 	case PongPayload:
 		var l [2]byte
 		binary.BigEndian.PutUint16(l[:], uint16(len(e)))
@@ -194,6 +208,18 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
+	case WarningData:
+		var l [2]byte
+		binary.BigEndian.PutUint16(l[:], uint16(len(e)))
+		if _, err := w.Write(l[:]); err != nil {
+			return err
+		}
+
+		if _, err := w.Write(e[:]); err != nil {
+			return err
+		}
+
 	case ErrorData:
 		var l [2]byte
 		binary.BigEndian.PutUint16(l[:], uint16(len(e)))
@@ -204,6 +230,7 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
 	case OpaqueReason:
 		var l [2]byte
 		binary.BigEndian.PutUint16(l[:], uint16(len(e)))
@@ -214,14 +241,17 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
 	case [33]byte:
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
 	case []byte:
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
 	case PkScript:
 		// The largest script we'll accept is a p2wsh which is exactly
 		// 34 bytes long.
@@ -233,6 +263,7 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if err := wire.WriteVarBytes(w, 0, e); err != nil {
 			return err
 		}
+
 	case *RawFeatureVector:
 		if e == nil {
 			return fmt.Errorf("cannot write nil feature vector")
@@ -265,10 +296,12 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 		if _, err := w.Write(e[:]); err != nil {
 			return err
 		}
+
 	case FailCode:
 		if err := WriteElement(w, uint16(e)); err != nil {
 			return err
 		}
+
 	case ShortChannelID:
 		// Check that field fit in 3 bytes and write the blockHeight
 		if e.BlockHeight > ((1 << 24) - 1) {
@@ -399,6 +432,7 @@ func WriteElement(w *bytes.Buffer, element interface{}) error {
 				return err
 			}
 		}
+
 	case color.RGBA:
 		if err := WriteElements(w, e.R, e.G, e.B); err != nil {
 			return err
@@ -473,68 +507,78 @@ func ReadElement(r io.Reader, element interface{}) error {
 		if err != nil {
 			return err
 		}
-
 		*e = alias
+
 	case *ShortChanIDEncoding:
 		var b [1]uint8
 		if _, err := r.Read(b[:]); err != nil {
 			return err
 		}
 		*e = ShortChanIDEncoding(b[0])
+
 	case *uint8:
 		var b [1]uint8
 		if _, err := r.Read(b[:]); err != nil {
 			return err
 		}
 		*e = b[0]
+
 	case *FundingFlag:
 		var b [1]uint8
 		if _, err := r.Read(b[:]); err != nil {
 			return err
 		}
 		*e = FundingFlag(b[0])
+
 	case *uint16:
 		var b [2]byte
 		if _, err := io.ReadFull(r, b[:]); err != nil {
 			return err
 		}
 		*e = binary.BigEndian.Uint16(b[:])
+
 	case *ChanUpdateMsgFlags:
 		var b [1]uint8
 		if _, err := r.Read(b[:]); err != nil {
 			return err
 		}
 		*e = ChanUpdateMsgFlags(b[0])
+
 	case *ChanUpdateChanFlags:
 		var b [1]uint8
 		if _, err := r.Read(b[:]); err != nil {
 			return err
 		}
 		*e = ChanUpdateChanFlags(b[0])
+
 	case *uint32:
 		var b [4]byte
 		if _, err := io.ReadFull(r, b[:]); err != nil {
 			return err
 		}
 		*e = binary.BigEndian.Uint32(b[:])
+
 	case *uint64:
 		var b [8]byte
 		if _, err := io.ReadFull(r, b[:]); err != nil {
 			return err
 		}
 		*e = binary.BigEndian.Uint64(b[:])
+
 	case *MilliSatoshi:
 		var b [8]byte
 		if _, err := io.ReadFull(r, b[:]); err != nil {
 			return err
 		}
 		*e = MilliSatoshi(int64(binary.BigEndian.Uint64(b[:])))
+
 	case *btcutil.Amount:
 		var b [8]byte
 		if _, err := io.ReadFull(r, b[:]); err != nil {
 			return err
 		}
 		*e = btcutil.Amount(int64(binary.BigEndian.Uint64(b[:])))
+
 	case **btcec.PublicKey:
 		var b [btcec.PubKeyBytesLenCompressed]byte
 		if _, err = io.ReadFull(r, b[:]); err != nil {
@@ -546,13 +590,13 @@ func ReadElement(r io.Reader, element interface{}) error {
 			return err
 		}
 		*e = pubKey
+
 	case **RawFeatureVector:
 		f := NewRawFeatureVector()
 		err = f.Decode(r)
 		if err != nil {
 			return err
 		}
-
 		*e = f
 
 	case *[]Sig:
@@ -571,13 +615,13 @@ func ReadElement(r io.Reader, element interface{}) error {
 				}
 			}
 		}
-
 		*e = sigs
 
 	case *Sig:
 		if _, err := io.ReadFull(r, e[:]); err != nil {
 			return err
 		}
+
 	case *OpaqueReason:
 		var l [2]byte
 		if _, err := io.ReadFull(r, l[:]); err != nil {
@@ -589,6 +633,19 @@ func ReadElement(r io.Reader, element interface{}) error {
 		if _, err := io.ReadFull(r, *e); err != nil {
 			return err
 		}
+
+	case *WarningData:
+		var l [2]byte
+		if _, err := io.ReadFull(r, l[:]); err != nil {
+			return err
+		}
+		errorLen := binary.BigEndian.Uint16(l[:])
+
+		*e = WarningData(make([]byte, errorLen))
+		if _, err := io.ReadFull(r, *e); err != nil {
+			return err
+		}
+
 	case *ErrorData:
 		var l [2]byte
 		if _, err := io.ReadFull(r, l[:]); err != nil {
@@ -600,6 +657,7 @@ func ReadElement(r io.Reader, element interface{}) error {
 		if _, err := io.ReadFull(r, *e); err != nil {
 			return err
 		}
+
 	case *PingPayload:
 		var l [2]byte
 		if _, err := io.ReadFull(r, l[:]); err != nil {
@@ -611,6 +669,7 @@ func ReadElement(r io.Reader, element interface{}) error {
 		if _, err := io.ReadFull(r, *e); err != nil {
 			return err
 		}
+
 	case *PongPayload:
 		var l [2]byte
 		if _, err := io.ReadFull(r, l[:]); err != nil {
@@ -622,20 +681,24 @@ func ReadElement(r io.Reader, element interface{}) error {
 		if _, err := io.ReadFull(r, *e); err != nil {
 			return err
 		}
+
 	case *[33]byte:
 		if _, err := io.ReadFull(r, e[:]); err != nil {
 			return err
 		}
+
 	case []byte:
 		if _, err := io.ReadFull(r, e); err != nil {
 			return err
 		}
+
 	case *PkScript:
 		pkScript, err := wire.ReadVarBytes(r, 0, 34, "pkscript")
 		if err != nil {
 			return err
 		}
 		*e = pkScript
+
 	case *wire.OutPoint:
 		var h [32]byte
 		if _, err = io.ReadFull(r, h[:]); err != nil {
@@ -657,10 +720,12 @@ func ReadElement(r io.Reader, element interface{}) error {
 			Hash:  *hash,
 			Index: uint32(index),
 		}
+
 	case *FailCode:
 		if err := ReadElement(r, (*uint16)(e)); err != nil {
 			return err
 		}
+
 	case *ChannelID:
 		if _, err := io.ReadFull(r, e[:]); err != nil {
 			return err
@@ -833,6 +898,7 @@ func ReadElement(r io.Reader, element interface{}) error {
 		}
 
 		*e = addresses
+
 	case *color.RGBA:
 		err := ReadElements(r,
 			&e.R,
@@ -842,6 +908,7 @@ func ReadElement(r io.Reader, element interface{}) error {
 		if err != nil {
 			return err
 		}
+
 	case *DeliveryAddress:
 		var addrLen [2]byte
 		if _, err = io.ReadFull(r, addrLen[:]); err != nil {
diff --git a/lnwire/lnwire_test.go b/lnwire/lnwire_test.go
index ec528a38a..44d6cfb9b 100644
--- a/lnwire/lnwire_test.go
+++ b/lnwire/lnwire_test.go
@@ -964,6 +964,12 @@ func TestLightningWireProtocol(t *testing.T) {
 				return mainScenario(&m)
 			},
 		},
+		{
+			msgType: MsgWarning,
+			scenario: func(m Warning) bool {
+				return mainScenario(&m)
+			},
+		},
 		{
 			msgType: MsgError,
 			scenario: func(m Error) bool {
diff --git a/lnwire/message.go b/lnwire/message.go
index e6de25d25..9d0fd7236 100644
--- a/lnwire/message.go
+++ b/lnwire/message.go
@@ -22,7 +22,8 @@ type MessageType uint16
 // The currently defined message types within this current version of the
 // Lightning protocol.
 const (
-	MsgInit                    MessageType = 16
+	MsgWarning                 MessageType = 1
+	MsgInit                                = 16
 	MsgError                               = 17
 	MsgPing                                = 18
 	MsgPong                                = 19
@@ -75,6 +76,8 @@ func ErrorPayloadTooLarge(size int) error {
 // String return the string representation of message type.
 func (t MessageType) String() string {
 	switch t {
+	case MsgWarning:
+		return "Warning"
 	case MsgInit:
 		return "Init"
 	case MsgOpenChannel:
@@ -146,8 +149,8 @@ type UnknownMessage struct {
 //
 // This is part of the error interface.
 func (u *UnknownMessage) Error() string {
-	return fmt.Sprintf("unable to parse message of unknown type: %v",
-		u.messageType)
+	return fmt.Sprintf("unable to parse message of unknown type: %v, value: %v",
+		u.messageType, uint16(u.messageType))
 }
 
 // Serializable is an interface which defines a lightning wire serializable
@@ -175,6 +178,8 @@ func makeEmptyMessage(msgType MessageType) (Message, error) {
 	var msg Message
 
 	switch msgType {
+	case MsgWarning:
+		msg = &Warning{}
 	case MsgInit:
 		msg = &Init{}
 	case MsgOpenChannel:
diff --git a/lnwire/writer.go b/lnwire/writer.go
index 5b99ab368..9c8841055 100644
--- a/lnwire/writer.go
+++ b/lnwire/writer.go
@@ -241,6 +241,11 @@ func WritePongPayload(buf *bytes.Buffer, payload PongPayload) error {
 	return writeDataWithLength(buf, payload)
 }
 
+// WriteWarningData appends the data to the provided buffer.
+func WriteWarningData(buf *bytes.Buffer, data WarningData) error {
+	return writeDataWithLength(buf, data)
+}
+
 // WriteErrorData appends the data to the provided buffer.
 func WriteErrorData(buf *bytes.Buffer, data ErrorData) error {
 	return writeDataWithLength(buf, data)
diff --git a/peer/brontide.go b/peer/brontide.go
index 151962728..811f68cd0 100644
--- a/peer/brontide.go
+++ b/peer/brontide.go
@@ -1537,6 +1537,10 @@ out:
 				break out
 			}
 
+		case *lnwire.Warning:
+			targetChan = msg.ChanID
+			isLinkUpdate = p.handleWarning(msg)
+
 		case *lnwire.Error:
 			targetChan = msg.ChanID
 			isLinkUpdate = p.handleError(msg)
@@ -1671,6 +1675,38 @@ func (p *Brontide) storeError(err error) {
 	)
 }
 
+// handleWarning processes a warning message read from the remote peer. The
+// boolean return indicates whether the message should be delivered to a
+// targeted peer or not. The message gets stored in memory as an error if we
+// have open channels with the peer we received it from.
+//
+// NOTE: This method should only be called from within the readHandler.
+func (p *Brontide) handleWarning(msg *lnwire.Warning) bool {
+	switch {
+	// Connection-wide warnings should be forwarded to all the channels
+	// with this peer.
+	case msg.ChanID == lnwire.ConnectionWideID:
+		for _, chanStream := range p.activeMsgStreams {
+			chanStream.AddMsg(msg)
+		}
+
+		return false
+
+	// If the channel ID for the warning message corresponds to a pending
+	// channel, then the funding manager will handle the warning.
+	case p.cfg.FundingManager.IsPendingChannel(msg.ChanID, p):
+		p.cfg.FundingManager.ProcessFundingMsg(msg, p)
+		return false
+
+	// If not we hand the warning to the channel link for this channel.
+	case p.isActiveChannel(msg.ChanID):
+		return true
+
+	default:
+		return false
+	}
+}
+
 // handleError processes an error message read from the remote peer. The boolean
 // returns indicates whether the message should be delivered to a targeted peer.
 // It stores the error we received from the peer in memory if we have a channel
@@ -1770,6 +1806,9 @@ func messageSummary(msg lnwire.Message) string {
 		return fmt.Sprintf("chan_id=%v, id=%v, fail_code=%v",
 			msg.ChanID, msg.ID, msg.FailureCode)
 
+	case *lnwire.Warning:
+		return fmt.Sprintf("%v", msg.Warning())
+
 	case *lnwire.Error:
 		return fmt.Sprintf("%v", msg.Error())
 

@ekimber
Author

ekimber commented Jan 5, 2023

@yaslama I suspect that I am seeing the same issue. Thanks for the patch, however, I do stick to a policy of only running released binaries on mainnet

@btweenthebars

btweenthebars commented Jan 5, 2023

I (CLN 22.11.1) have an LND peer (v0.15.5-beta) who executed the SCB. My node received the invalid revocation number, but somehow it did not automatically trigger the force close. I guess this is because the connection was terminated by the "unknown message" error.

in my log

2023-01-04 23:06:16.381Z INFO    03xxx-chan#22678: Peer transient failure in CHANNELD_NORMAL: Disconnected
2023-01-04 23:06:16.683Z INFO    03xxx-chan#22678: Peer transient failure in CHANNELD_NORMAL: channeld WARNING: bad reestablish revocation_number: 477 vs 483

in his log

2023-01-04 22:49:16.877 [INF] PEER: Peer(02xxx): loading ChannelPoint(67xx:0)
2023-01-04 22:49:16.877 [WRN] PEER: Peer(02xxx): ChannelPoint(67xx:0) has status ChanStatusLocalDataLoss, won't start.
2023-01-04 22:49:16.877 [INF] PEER: Peer(02xxx): Sending 1 channel sync messages to peer after loading active channels
2023-01-04 22:49:16.877 [INF] PEER: Peer(02xxx): Negotiated chan series queries
2023-01-04 22:49:16.877 [INF] NTFN: New block epoch subscription
2023-01-04 22:49:16.877 [INF] DISC: Creating new GossipSyncer for peer=02xxx
2023-01-04 22:49:21.192 [ERR] PEER: Peer(02xxx): resend failed: unable to fetch channel sync messages for peer 02xxx@127.0.0.1:50724: unable to find closed channel summary
2023-01-04 22:49:21.745 [WRN] DISC: ignoring remote ChannelAnnouncement for own channel
2023-01-04 22:49:25.626 [ERR] RPCS: [/lnrpc.Lightning/CloseChannel]: channel not found
2023-01-04 22:49:26.437 [INF] PEER: Peer(02xxx): unable to read message from peer: unable to parse message of unknown type: <unknown>
2023-01-04 22:49:26.438 [INF] PEER: Peer(02xxx): unable to read message from peer: EOF
2023-01-04 22:49:26.438 [INF] PEER: Peer(02xxx): disconnecting 02xxx@127.0.0.1:50724, reason: read handler closed

@Roasbeef
Member

Roasbeef commented Jan 13, 2023

Thanks for all the info y'all! I think we traced things down to a CLN change that modified its behavior to send a warning and then close the connection after it gets the bad channel reestablish message. Because CLN closes the connection, it might not actually read our incoming Error message, so we reconnect and the entire thing starts over again.
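
For reference, a minimal sketch of decoding the kind of warning message involved here, assuming the BOLT 1 layout of a 2-byte type (1 for warning), a 32-byte channel_id, a 2-byte length, and the data bytes. The struct and function names are hypothetical and this is not lnd's decoder.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
)

// msgWarning is the BOLT 1 message type for "warning".
const msgWarning = 1

// Warning holds the fields of a decoded warning message (hypothetical type).
type Warning struct {
	ChanID [32]byte
	Data   []byte
}

// decodeWarning parses type(2) || channel_id(32) || len(2) || data(len).
func decodeWarning(raw []byte) (*Warning, error) {
	if len(raw) < 2+32+2 {
		return nil, errors.New("message too short")
	}
	if t := binary.BigEndian.Uint16(raw[:2]); t != msgWarning {
		return nil, fmt.Errorf("not a warning: type %d", t)
	}
	var w Warning
	copy(w.ChanID[:], raw[2:34])
	n := int(binary.BigEndian.Uint16(raw[34:36]))
	if len(raw) < 36+n {
		return nil, errors.New("truncated data")
	}
	// Copy the data so the result does not alias the input buffer.
	w.Data = append([]byte(nil), raw[36:36+n]...)
	return &w, nil
}

func main() {
	// Build an example frame with an all-zero channel_id and made-up text.
	text := []byte("bad reestablish")
	raw := []byte{0x00, 0x01}
	raw = append(raw, make([]byte, 32)...)
	raw = append(raw, byte(len(text)>>8), byte(len(text)))
	raw = append(raw, text...)

	w, err := decodeWarning(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("chan_id=%s data=%q\n", hex.EncodeToString(w.ChanID[:]), w.Data)
}
```

A node that does not know type 1 would see this frame as an unknown message, which is consistent with the log lines reported in this thread.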

We're looking into a workaround that should allow things to work as normal no matter which implementation/version we're connected to.

@Crypt-iQ Crypt-iQ self-assigned this Feb 3, 2023
@Crypt-iQ
Collaborator

Crypt-iQ commented Feb 3, 2023

If you are experiencing this issue on v0.15.5-beta after trying to recover your channel with an SCB, the following steps should resolve the issue:

  • Ensure that you have already tried connecting to the CLN node and received Warning followed by a disconnect.
  • Check out https://github.com/Crypt-iQ/lnd/tree/scb_cln_issue_7301_v0.15.5-beta
  • Build lnd with the dev build tag.
  • Boot up the newer version of lnd with the flag --protocol.custom-message=17.
  • If you have logs, try to fetch the 32-byte (64 hexadecimal characters) chan_id/ChannelID/channel_id value. This may appear in ChannelLink logs in the HSWC category. If you can't fetch it, feel free to post a comment asking for help.
  • Once you have the 32-byte chan_id/ChannelID/channel_id value, fetch the peer's hexadecimal pubkey. This should appear in listpeers if you are connected to the peer or in some of your logs.
  • Finally, use the sendcustom API like so:
    lncli sendcustom --peer=<peer pubkey in hex> --type=17 --data=<chan_id in hex>0000
  • Note that the data field in the sendcustom call is the chan_id followed by 4 zeros.
  • Once you see that your channel has been force closed via the lnd API, continue using the original v0.15.5-beta. This patch is only meant to be used temporarily.

If you are uncomfortable with using my personal branch and would instead prefer a temporary release, please let us know. Or if you don't know how to perform any of the steps above, leave a comment and I can help out.


7 participants