Node failed to negotiate security protocol and can't connect to any node. #8120

Closed
fusetim opened this issue May 10, 2021 · 10 comments
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization

Comments

@fusetim

fusetim commented May 10, 2021

Version information:

go-ipfs version: 0.8.0-ce693d7e8
Repo version: 11
System version: amd64/linux
Golang version: go1.16.2

Description:

My node stopped working two days ago, without any update or particular change on my side. It seems unable to connect to any other node because the security protocol negotiation fails.
The IPFS node is completely blank, since I reset it (rm -r $HOME/.ipfs/ and ipfs init).
The configuration is the default one, and I have a dual-stack network.

What's going wrong?

I can't connect to my local node (running go-ipfs 0.8.0 on arm64) because the security protocol negotiation fails:

$ ipfs swarm connect /ip4/192.168.1.202/tcp/4001/p2p/12D3KooWSaji41rv<redacted>J9AnazsCrEJFHUxh2
Error: connect 12D3KooWSaji41rv<redacted>J9AnazsCrEJFHUxh2 failure: failed to dial 12D3KooWSaji41rv<redacted>J9AnazsCrEJFHUxh2: all dials failed
  * [/ip4/192.168.1.202/tcp/4001] failed to negotiate security protocol: read tcp4 192.168.1.79:4001->192.168.1.202:4001: read: connection reset by peer

Furthermore, my node can't connect to any other node, not even the bootstrap ones:

$ ipfs swarm addrs
12D3<redacted>C16AKgEL (8)
	/ip4/127.0.0.1/tcp/4001
	/ip4/127.0.0.1/udp/4001/quic
	/ip4/192.168.1.79/tcp/4001
	/ip4/192.168.1.79/udp/4001/quic
	/ip4/88.127.0.0/tcp/26447
	/ip4/88.127.0.0/udp/26447/quic
	/ip6/::1/tcp/4001
	/ip6/::1/udp/4001/quic
12D3KooWSaji41rv<redacted>J9AnazsCrEJFHUxh2 (1)
	/ip4/192.168.1.202/tcp/4001
QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN (1)
	/dnsaddr/bootstrap.libp2p.io
QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa (1)
	/dnsaddr/bootstrap.libp2p.io
QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ (2)
	/ip4/104.131.131.82/tcp/4001
	/ip4/104.131.131.82/udp/4001/quic
QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb (1)
	/dnsaddr/bootstrap.libp2p.io
QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt (1)
	/dnsaddr/bootstrap.libp2p.io

For more information, I started my node with IPFS_LOGGING=verbose ipfs daemon.
The full log is here: https://gist.github.com/fusetim/cb88f3dbb69a28f0f16cb40b4dccb194
(NOTE: 192.168.1.17 is my local DNS resolver, and it is working, even for the _dnsaddr.bootstrap.libp2p.io. TXT records.)

@fusetim fusetim added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels May 10, 2021
@welcome

welcome bot commented May 10, 2021

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

  • "Priority" labels will show how urgent this is for the team.
  • "Status" labels will show if this is ready to be worked on, blocked, or in progress.
  • "Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

@TheDiscordian

TheDiscordian commented May 10, 2021

I have little data on this issue, other than that 4 other people on 0.8.0 have experienced this, or something similar, in the past day or so (I haven't):

Edit: Added SO question that fits the timeframe of the other users.

@fusetim
Author

fusetim commented May 10, 2021

Oh by the way I can connect to this node from another one (running go-ipfs 0.8.0-19a05b846d on arm64) successfully.

$ ipfs swarm connect /ip4/192.168.1.79/udp/4001/quic/p2p/12D3<redacted>C16AKgEL
connect 12D3<redacted>C16AKgEL success

@fusetim
Author

fusetim commented May 10, 2021

OK, so apparently connecting to my own node didn't solve the problem. However, after adding this node, /ip4/149.56.89.144/tcp/4001/p2p/12D3KooWDiybBBYDvEEJQmNEp1yJeTgVr6mMgxqDrm9Gi8AKeNww (from https://discuss.ipfs.io/t/go-ipfs-no-peer-connections/11206/4), the node started to work normally again.

Very strange. I guess I'll leave this issue open because the security protocol negotiation issue is still pretty weird.

@fusetim
Author

fusetim commented May 10, 2021

If needed, here is the cached DNS response from my resolver:

; <<>> DiG 9.16.12 <<>> @192.168.1.17 _dnsaddr.bootstrap.libp2p.io TXT
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51593
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;_dnsaddr.bootstrap.libp2p.io.	IN	TXT

;; ANSWER SECTION:
_dnsaddr.bootstrap.libp2p.io. 18 IN	TXT	"dnsaddr=/dnsaddr/sjc-2.bootstrap.libp2p.io/p2p/QmZa1sAxajnQjVM8WjWXoMbmPd7NsWhfKsPkErzpm9wGkp"
_dnsaddr.bootstrap.libp2p.io. 18 IN	TXT	"dnsaddr=/dnsaddr/ams-2.bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb"
_dnsaddr.bootstrap.libp2p.io. 18 IN	TXT	"dnsaddr=/dnsaddr/ewr-1.bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa"
_dnsaddr.bootstrap.libp2p.io. 18 IN	TXT	"dnsaddr=/dnsaddr/nrt-1.bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt"
_dnsaddr.bootstrap.libp2p.io. 18 IN	TXT	"dnsaddr=/dnsaddr/sjc-1.bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN"

;; Query time: 0 msec
;; SERVER: 192.168.1.17#53(192.168.1.17)
;; WHEN: lun. mai 10 20:06:39 CEST 2021
;; MSG SIZE  rcvd: 587

Other than that, queries for bootstrap.libp2p.io. itself returned nothing for TXT, A, AAAA, and CNAME records.
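
For readers who want to inspect such a response programmatically, here is a minimal Python sketch that pulls the multiaddr and peer ID out of dnsaddr TXT strings like the ones in the dig output above. The function name parse_dnsaddr_txt is made up for this example, and this is plain string handling, not how go-ipfs itself resolves dnsaddr entries.

```python
def parse_dnsaddr_txt(records):
    """Extract (multiaddr, peer_id) pairs from dnsaddr TXT record strings.

    Hypothetical helper for illustration only; it does no DNS lookups.
    """
    entries = []
    for record in records:
        value = record.strip().strip('"')
        if not value.startswith("dnsaddr="):
            continue  # not a dnsaddr record, ignore it
        multiaddr = value[len("dnsaddr="):]
        # The peer ID is whatever follows the last /p2p/ marker, if present.
        _, sep, peer_id = multiaddr.rpartition("/p2p/")
        entries.append((multiaddr, peer_id if sep else None))
    return entries


# Two of the TXT values from the dig answer section above.
records = [
    '"dnsaddr=/dnsaddr/sjc-2.bootstrap.libp2p.io/p2p/QmZa1sAxajnQjVM8WjWXoMbmPd7NsWhfKsPkErzpm9wGkp"',
    '"dnsaddr=/dnsaddr/ams-2.bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb"',
]

for multiaddr, peer_id in parse_dnsaddr_txt(records):
    print(peer_id, multiaddr)
```

Each returned multiaddr is itself a /dnsaddr/ entry, so a resolver would have to look those names up in turn.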

@aschmahmann
Contributor

aschmahmann commented May 10, 2021

Looks like you've got a lot of

error resolving /dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: lookup _dnsaddr.bootstrap.libp2p.io on 192.168.1.17:53: cannot unmarshal DNS message from the OS resolver.

One of the default bootstrappers uses a static IP address instead of a dnsaddr one, and it was down recently. So if you were having DNS issues, your node wouldn't have had anyone to bootstrap from (which is, incidentally, why connecting to another DHT server node makes your node spring to life again).
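
To make the DNS dependence concrete, here is a small Python sketch that splits a bootstrap list into entries that need DNS resolution and entries that use a static IP. The list is assembled from the ipfs swarm addrs output earlier in this thread (peer ID appended with /p2p/), and the helper names are invented for this example.

```python
# Multiaddr protocols whose first component requires a DNS lookup.
DNS_PROTOCOLS = {"dns", "dns4", "dns6", "dnsaddr"}


def needs_dns(multiaddr):
    """Return True if the first component of the multiaddr is DNS-based."""
    first = multiaddr.strip("/").split("/", 1)[0]
    return first in DNS_PROTOCOLS


def split_by_dns_dependence(addrs):
    """Split addresses into (dns_based, static) lists, preserving order."""
    dns_based = [a for a in addrs if needs_dns(a)]
    static = [a for a in addrs if not needs_dns(a)]
    return dns_based, static


# Bootstrap peers as they appear in the swarm addrs output of this issue.
bootstrap = [
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
    "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
]

dns_based, static = split_by_dns_dependence(bootstrap)
print(len(dns_based), "entries depend on DNS")
print(len(static), "entry uses a static IP:", static)
```

If DNS resolution fails and the single static entry is unreachable, nothing in this list can be dialed.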

failed to negotiate security protocol

Are these nodes both go-ipfs v0.8.0? Can you post the config files, i.e. the output of ipfs config show (you've been redacting node IDs so far, so feel free to keep doing that)? Mostly I want to see what the Transports and Addresses sections look like.

Two big ways the security protocol negotiation can fail are: 1) you have mismatched security protocols, or 2) you have the wrong PeerID associated with the IP + port.
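
The second case can be checked by hand before dialing. The Python sketch below splits a dial address into its transport part and the peer ID it expects, then compares that with the ID the remote node reports about itself (for example, copied from ipfs id on that machine). The peer IDs are placeholders and the function names are invented for this example; the real verification happens inside the libp2p handshake, not in code like this.

```python
def split_peer_multiaddr(addr):
    """Split '/ip4/.../tcp/4001/p2p/<id>' into (transport address, peer ID)."""
    transport, sep, peer_id = addr.rpartition("/p2p/")
    if not sep or not peer_id:
        raise ValueError(f"no /p2p/<peer id> component in {addr!r}")
    return transport, peer_id


def check_dial_target(dial_addr, reported_peer_id):
    """Compare the peer ID in the dial address with the one the remote reports."""
    transport, expected = split_peer_multiaddr(dial_addr)
    if expected != reported_peer_id:
        return (f"peer ID mismatch at {transport}: "
                f"dialing {expected}, remote reports {reported_peer_id}")
    return f"peer ID matches at {transport}"


# Placeholder IDs (not real peers): the ID in the address is stale.
dial_addr = "/ip4/192.168.1.202/tcp/4001/p2p/12D3KooWExampleOldPeer"
print(check_dial_target(dial_addr, "12D3KooWExampleNewPeer"))
```

A mismatch like this typically appears after a node behind the same IP and port has been re-initialized and has generated a new identity.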

@fusetim
Author

fusetim commented May 11, 2021

The two nodes are running go-ipfs 0.8.0, one of them on amd64 and the other on arm64. I will add more details about the config when I get back home.

@fusetim
Author

fusetim commented May 11, 2021

You can find my node configs here:
For my PC one (the stuck one): https://gist.github.com/fusetim/2db55f1f356f13740923d0231c4efede
For my working node (arm64 one): https://gist.github.com/fusetim/d35c190529381d6e3055d2fed2b4b644

@fusetim
Author

fusetim commented Jun 20, 2021

Since there is not much interaction here anymore, I will close this issue and possibly open two separate ones: one for the DNS issue (but not right now; I will wait for v0.9.0, as I have seen some changes there) and another for the security protocol negotiation issue, with many more details about my specific situation.

@fusetim fusetim closed this as completed Jun 20, 2021
@fusetim
Author

fusetim commented Jun 20, 2021

Well, I just resolved the security protocol negotiation issue. If you ever need to expose your IPFS node behind a k8s LoadBalancer, you need to set externalTrafficPolicy to Local, otherwise it does not work. It seems go-ipfs checks the client source IP address (which is obscured when Cluster is used).
My IPFS Service for Kubernetes:

apiVersion: v1
kind: Service
metadata:
  name: ipfs

spec:
  ports:
    - name: api
      port: 5001
    - name: swarm
      port: 4001
    - name: websocket
      port: 4002
    - name: gateway
      port: 8080
  type: LoadBalancer
  externalTrafficPolicy: Local
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  selector:
    app: ipfs
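
As a quick sanity check for a manifest like the one above, here is a minimal Python sketch that reports whether a Service, already loaded into a dict, sets externalTrafficPolicy to Local. The function name is invented for this example, and the dict is a trimmed copy of the manifest. Kubernetes defaults this field to Cluster when it is omitted, which is the case that reportedly did not work here.

```python
def uses_local_traffic_policy(service):
    """Return True if the Service sets externalTrafficPolicy to Local.

    An omitted field is treated as "Cluster", the Kubernetes default.
    """
    spec = service.get("spec", {})
    return spec.get("externalTrafficPolicy", "Cluster") == "Local"


# Trimmed version of the Service above, as it would look after YAML parsing.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "ipfs"},
    "spec": {
        "type": "LoadBalancer",
        "externalTrafficPolicy": "Local",
        "ports": [{"name": "swarm", "port": 4001}],
        "selector": {"app": "ipfs"},
    },
}

print(uses_local_traffic_policy(service))
```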
