New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go Panic using BGP on CentOS 7.4 / K8S 1.10.1 #250

Closed
FireDrunk opened this Issue Apr 24, 2018 · 11 comments

Comments

Projects
3 participants
@FireDrunk
Copy link
Contributor

FireDrunk commented Apr 24, 2018

Is this a bug report or a feature request?:
Bug Report

What happened:
After succesfully connecting BGP to my pfSense router and activating the first LoadBalancer service the Speaker pod crashed.

What you expected to happen:
An activated LoadBalancer IP, and a working pod with the attached IP.

How to reproduce it (as minimally and precisely as possible):
ConfigMap

config:
----
peers:
- peer-address: 10.0.24.1
  peer-asn: 64512
  my-asn: 64512
address-pools:
- name: default
  protocol: bgp
  addresses:
  - 10.2.2.0/24

Anything else we need to know?:
The BGP status was active before the crash (without setting a service type to LoadBalancer).
My router was activly communicationg via BGP with the Speaker and all was running well.

After adding the LoadBalancer type to my service, the Speaker crashed, and the bgp connection went down.

BGP neighbor is 10.0.24.41, remote AS 65512
 Description: k8s-master-001
  BGP version 4, remote router-id 10.0.24.41
  BGP state = Idle
  Last read 00:02:53, holdtime 240s, keepalive interval 80s

  Message statistics:
                  Sent       Received  
  Opens                 2120       2120
  Notifications         2120          0
  Updates                  0          0
  Keepalives               0          0
  Route Refresh            0          0
  Total                 4240       2120

  Update statistics:
                  Sent       Received  
  Updates                  0          0
  Withdraws                0          0
  End-of-Rib               0          0

  Last error: AS unacceptable

After inspecting the container in docker, I saw the following crash:

"caller":"bgp.go:62","event":"sessionUp","localASN":64512,"msg":"BGP session established","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-24T11:29:15.761696909Z"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x52b8d6]
goroutine 27 [running]:
bytes.(*Buffer).WriteTo(0xc420290230, 0x0, 0x0, 0x7fb6a6f72e78, 0xc420290230, 0xc42050db01)
	/usr/local/go/src/bytes/buffer.go:242 +0x96
io.copyBuffer(0x0, 0x0, 0x18fbd40, 0xc420290230, 0x0, 0x0, 0x0, 0x0, 0xc42000e870, 0xc42050dcd0)
	/usr/local/go/src/io/io.go:382 +0x30a
io.Copy(0x0, 0x0, 0x18fbd40, 0xc420290230, 0x4, 0xc420311800, 0x0)
	/usr/local/go/src/io/io.go:362 +0x68
go.universe.tf/metallb/internal/bgp.sendUpdate(0x0, 0x0, 0x0, 0xc42040a408, 0x4, 0x4, 0xc420311800, 0xc420302380, 0x581ce2)
	/go/src/go.universe.tf/metallb/internal/bgp/messages.go:307 +0x273
go.universe.tf/metallb/internal/bgp.(*Session).sendUpdates(0xc4205905b0, 0xc4201ae600)
	/go/src/go.universe.tf/metallb/internal/bgp/bgp.go:88 +0x247
go.universe.tf/metallb/internal/bgp.(*Session).run(0xc4205905b0)
	/go/src/go.universe.tf/metallb/internal/bgp/bgp.go:64 +0x4a7
created by go.universe.tf/metallb/internal/bgp.New
	/go/src/go.universe.tf/metallb/internal/bgp/bgp.go:305 +0x408

Environment:

  • MetalLB version: 0.6.1
  • Kubernetes version: 1.10.1
  • BGP router type/version: pfSense 2.4.1
  • OS (e.g. from /etc/os-release): CentOS 7.4 fully patched (as per 24-02-2018)
  • Kernel (e.g. uname -a): 3.10.0-693.21.1
@ghorofamike

This comment has been minimized.

Copy link
Contributor

ghorofamike commented Apr 28, 2018

@FireDrunk Could you listen with tcpdump and show the traffic thats going back AND forward before the crash? I worked on the significant changes to the Dialer functions required for TCP MD5, would be interesting if there is a flaw in the new Logic.
@danderson Hmm, segfaults, any idea if this is purely from the net connection changes?
I havent met this segfault, so i cant tell.

@danderson danderson self-assigned this Apr 28, 2018

@danderson danderson added the bug label Apr 28, 2018

@danderson danderson added this to To Do in BGP mode via automation Apr 28, 2018

@danderson

This comment has been minimized.

Copy link
Member

danderson commented Apr 28, 2018

Thanks for the bug report. I suspect I know what happened, based on the stack trace. There is a window right after BGP session startup where we might try sending a BGP update on a connection that was just aborted (which nil's out s.conn, so we try writing to a nil io.Writer). So, the sequence of events would be:

  • BGP session connects, we exchange OPENs, session transitions into the Established state
  • MetalLB starts goroutine to consume inbound BGP messages from peer.
  • Main BGP session goroutine releases the BGP session lock while it updates some internal book-keeping.
  • Consumer goroutine receives something it doesn't like from the peer, grabs the lock and calls s.abort() to reset the BGP state back to Idle, ready for a new connection attempt. s.conn is closed and nil'd here.
  • Main session goroutine grabs the lock again to send the initial full BGP update, tries to send a BGP UPDATE message to a nil conn, panic.

It should be a trivial fix, I'll prepare it on a branch so you can test.

The next question is: why is the consumer goroutine terminating the BGP session? That means it received something it doesn't like from the peer, which is either a BGP NOTIFY (peer terminating the session explicitly), or malformed BGP packets (or more likely, we got the parsing wrong somewhere).

Let's fix the crash first, then we can look at what's happening with your session.

danderson added a commit that referenced this issue Apr 28, 2018

danderson added a commit that referenced this issue Apr 28, 2018

@danderson

This comment has been minimized.

Copy link
Member

danderson commented Apr 29, 2018

@FireDrunk can you test the change, with kubectl set image -n metallb-system ds/speaker speaker=metallb/speaker:bug-250 ? That'll replace the docker image with one that has the above change. That should prevent the crash, and hopefully give us some more detail into why the BGP session is being closed.

@FireDrunk

This comment has been minimized.

Copy link
Contributor

FireDrunk commented Apr 30, 2018

Just updated the image, the speaker stays online now, but the logs tell me this:

{"caller":"bgp.go:54","error":"read OPEN from \"10.0.24.1:179\": EOF","localASN":64512,"msg":"failed to connect to peer","op":"connect","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-30T06:20:24.479597438Z"}
{"caller":"bgp.go:54","error":"read OPEN from \"10.0.24.1:179\": EOF","localASN":64512,"msg":"failed to connect to peer","op":"connect","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-30T06:20:26.480740327Z"}
{"caller":"bgp.go:54","error":"read OPEN from \"10.0.24.1:179\": EOF","localASN":64512,"msg":"failed to connect to peer","op":"connect","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-30T06:20:30.482237741Z"}
{"caller":"bgp.go:54","error":"read OPEN from \"10.0.24.1:179\": EOF","localASN":64512,"msg":"failed to connect to peer","op":"connect","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-30T06:20:38.483316784Z"}
{"caller":"bgp.go:54","error":"read OPEN from \"10.0.24.1:179\": EOF","localASN":64512,"msg":"failed to connect to peer","op":"connect","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-30T06:20:54.484421073Z"}
{"caller":"bgp.go:54","error":"read OPEN from \"10.0.24.1:179\": EOF","localASN":64512,"msg":"failed to connect to peer","op":"connect","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-30T06:21:26.485549018Z"}
{"caller":"bgp.go:54","error":"read OPEN from \"10.0.24.1:179\": EOF","localASN":64512,"msg":"failed to connect to peer","op":"connect","peer":"10.0.24.1:179","peerASN":64512,"ts":"2018-04-30T06:22:30.486534387Z"}

LoadBalancer IP given to service/pod is also not pingable from node that is running that pod/service.

State of pfSense:
image

TCPDump (tcpdump -i any -s 1500 port 179):
Started tcpdump, killed the pod, and waited for the traffic to quiet down. Capture is only for port 179, if you need more, let me know.

08:38:18.180476 IP k8s-node-002.infra.prov.io.55672 > gateway.bgp: Flags [S], seq 1688816803, win 29200, options [mss 1460,sackOK,TS val 565658507 ecr 0,nop,wscale 7], length 0
08:38:18.180756 IP gateway.bgp > k8s-node-002.infra.prov.io.55672: Flags [S.], seq 2872716625, ack 1688816804, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3656153533 ecr 565658507], length 0
08:38:18.180793 IP k8s-node-002.infra.prov.io.55672 > gateway.bgp: Flags [.], ack 1, win 229, options [nop,nop,TS val 565658507 ecr 3656153533], length 0
08:38:18.181095 IP gateway.bgp > k8s-node-002.infra.prov.io.55672: Flags [F.], seq 1, ack 1, win 513, options [nop,nop,TS val 3656153534 ecr 565658507], length 0
08:38:18.181236 IP k8s-node-002.infra.prov.io.55672 > gateway.bgp: Flags [.], ack 2, win 229, options [nop,nop,TS val 565658508 ecr 3656153534], length 0
08:38:18.182843 IP k8s-node-002.infra.prov.io.55672 > gateway.bgp: Flags [P.], seq 1:50, ack 2, win 229, options [nop,nop,TS val 565658509 ecr 3656153534], length 49: BGP
08:38:18.182875 IP k8s-node-002.infra.prov.io.55672 > gateway.bgp: Flags [F.], seq 50, ack 2, win 229, options [nop,nop,TS val 565658509 ecr 3656153534], length 0
08:38:18.183060 IP k8s-node-002.infra.prov.io.44186 > gateway.bgp: Flags [S], seq 3072186876, win 29200, options [mss 1460,sackOK,TS val 565658509 ecr 0,nop,wscale 7], length 0
08:38:18.183096 IP gateway.bgp > k8s-node-002.infra.prov.io.55672: Flags [R], seq 2872716627, win 0, length 0
08:38:18.183123 IP gateway.bgp > k8s-node-002.infra.prov.io.55672: Flags [R], seq 2872716627, win 0, length 0
08:38:18.183191 IP gateway.bgp > k8s-node-002.infra.prov.io.44186: Flags [S.], seq 1442524335, ack 3072186877, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 1804486041 ecr 565658509], length 0
08:38:18.183221 IP k8s-node-002.infra.prov.io.44186 > gateway.bgp: Flags [.], ack 1, win 229, options [nop,nop,TS val 565658509 ecr 1804486041], length 0
08:38:18.183287 IP k8s-node-002.infra.prov.io.44186 > gateway.bgp: Flags [P.], seq 1:50, ack 1, win 229, options [nop,nop,TS val 565658510 ecr 1804486041], length 49: BGP
08:38:18.183477 IP gateway.bgp > k8s-node-002.infra.prov.io.44186: Flags [.], ack 50, win 512, options [nop,nop,TS val 1804486041 ecr 565658510], length 0
08:38:18.183562 IP gateway.bgp > k8s-node-002.infra.prov.io.44186: Flags [F.], seq 1, ack 50, win 513, options [nop,nop,TS val 1804486041 ecr 565658510], length 0
08:38:18.183778 IP k8s-node-002.infra.prov.io.44186 > gateway.bgp: Flags [F.], seq 50, ack 2, win 229, options [nop,nop,TS val 565658510 ecr 1804486041], length 0
08:38:18.184016 IP gateway.bgp > k8s-node-002.infra.prov.io.44186: Flags [.], ack 51, win 513, options [nop,nop,TS val 1804486042 ecr 565658510], length 0
08:38:19.184322 IP k8s-node-002.infra.prov.io.55608 > gateway.bgp: Flags [S], seq 569334188, win 29200, options [mss 1460,sackOK,TS val 565659511 ecr 0,nop,wscale 7], length 0
08:38:19.184672 IP gateway.bgp > k8s-node-002.infra.prov.io.55608: Flags [S.], seq 3969739542, ack 569334189, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 671099249 ecr 565659511], length 0
08:38:19.184707 IP k8s-node-002.infra.prov.io.55608 > gateway.bgp: Flags [.], ack 1, win 229, options [nop,nop,TS val 565659511 ecr 671099249], length 0
08:38:19.184817 IP k8s-node-002.infra.prov.io.55608 > gateway.bgp: Flags [P.], seq 1:50, ack 1, win 229, options [nop,nop,TS val 565659511 ecr 671099249], length 49: BGP
08:38:19.184987 IP gateway.bgp > k8s-node-002.infra.prov.io.55608: Flags [F.], seq 1, ack 1, win 513, options [nop,nop,TS val 671099249 ecr 565659511], length 0
08:38:19.185006 IP gateway.bgp > k8s-node-002.infra.prov.io.55608: Flags [R], seq 3969739543, win 0, length 0
08:38:19.185014 IP k8s-node-002.infra.prov.io.55608 > gateway.bgp: Flags [.], ack 2, win 229, options [nop,nop,TS val 565659511 ecr 671099249], length 0
08:38:19.185047 IP k8s-node-002.infra.prov.io.55608 > gateway.bgp: Flags [F.], seq 50, ack 2, win 229, options [nop,nop,TS val 565659511 ecr 671099249], length 0
08:38:19.185137 IP gateway.bgp > k8s-node-002.infra.prov.io.55608: Flags [R], seq 3969739544, win 0, length 0
08:38:19.185177 IP gateway.bgp > k8s-node-002.infra.prov.io.55608: Flags [R], seq 3969739544, win 0, length 0
08:38:21.185458 IP k8s-node-002.infra.prov.io.59991 > gateway.bgp: Flags [S], seq 2266022264, win 29200, options [mss 1460,sackOK,TS val 565661512 ecr 0,nop,wscale 7], length 0
08:38:21.185847 IP gateway.bgp > k8s-node-002.infra.prov.io.59991: Flags [S.], seq 3868947996, ack 2266022265, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 3951336215 ecr 565661512], length 0
08:38:21.185877 IP k8s-node-002.infra.prov.io.59991 > gateway.bgp: Flags [.], ack 1, win 229, options [nop,nop,TS val 565661512 ecr 3951336215], length 0
08:38:21.185970 IP k8s-node-002.infra.prov.io.59991 > gateway.bgp: Flags [P.], seq 1:50, ack 1, win 229, options [nop,nop,TS val 565661512 ecr 3951336215], length 49: BGP
08:38:21.186074 IP gateway.bgp > k8s-node-002.infra.prov.io.59991: Flags [F.], seq 1, ack 1, win 513, options [nop,nop,TS val 3951336215 ecr 565661512], length 0
08:38:21.186128 IP k8s-node-002.infra.prov.io.59991 > gateway.bgp: Flags [F.], seq 50, ack 2, win 229, options [nop,nop,TS val 565661512 ecr 3951336215], length 0
08:38:21.186161 IP gateway.bgp > k8s-node-002.infra.prov.io.59991: Flags [R], seq 3868947997, win 0, length 0
08:38:21.186233 IP gateway.bgp > k8s-node-002.infra.prov.io.59991: Flags [R], seq 3868947998, win 0, length 0
08:38:25.186519 IP k8s-node-002.infra.prov.io.48508 > gateway.bgp: Flags [S], seq 1986455850, win 29200, options [mss 1460,sackOK,TS val 565665513 ecr 0,nop,wscale 7], length 0
08:38:25.186832 IP gateway.bgp > k8s-node-002.infra.prov.io.48508: Flags [S.], seq 3295100222, ack 1986455851, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 1364637479 ecr 565665513], length 0
08:38:25.186858 IP k8s-node-002.infra.prov.io.48508 > gateway.bgp: Flags [.], ack 1, win 229, options [nop,nop,TS val 565665513 ecr 1364637479], length 0
08:38:25.186981 IP k8s-node-002.infra.prov.io.48508 > gateway.bgp: Flags [P.], seq 1:50, ack 1, win 229, options [nop,nop,TS val 565665513 ecr 1364637479], length 49: BGP
08:38:25.187118 IP gateway.bgp > k8s-node-002.infra.prov.io.48508: Flags [.], ack 50, win 512, options [nop,nop,TS val 1364637480 ecr 565665513], length 0
08:38:25.187126 IP gateway.bgp > k8s-node-002.infra.prov.io.48508: Flags [F.], seq 1, ack 50, win 513, options [nop,nop,TS val 1364637480 ecr 565665513], length 0
08:38:25.187175 IP k8s-node-002.infra.prov.io.48508 > gateway.bgp: Flags [F.], seq 50, ack 2, win 229, options [nop,nop,TS val 565665513 ecr 1364637480], length 0
08:38:25.187365 IP gateway.bgp > k8s-node-002.infra.prov.io.48508: Flags [.], ack 51, win 513, options [nop,nop,TS val 1364637480 ecr 565665513], length 0
08:38:33.187621 IP k8s-node-002.infra.prov.io.46011 > gateway.bgp: Flags [S], seq 201089674, win 29200, options [mss 1460,sackOK,TS val 565673514 ecr 0,nop,wscale 7], length 0
08:38:33.187916 IP gateway.bgp > k8s-node-002.infra.prov.io.46011: Flags [S.], seq 671432403, ack 201089675, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 1642001523 ecr 565673514], length 0
08:38:33.187950 IP k8s-node-002.infra.prov.io.46011 > gateway.bgp: Flags [.], ack 1, win 229, options [nop,nop,TS val 565673514 ecr 1642001523], length 0
08:38:33.188048 IP k8s-node-002.infra.prov.io.46011 > gateway.bgp: Flags [P.], seq 1:50, ack 1, win 229, options [nop,nop,TS val 565673514 ecr 1642001523], length 49: BGP
08:38:33.188129 IP gateway.bgp > k8s-node-002.infra.prov.io.46011: Flags [F.], seq 1, ack 1, win 513, options [nop,nop,TS val 1642001523 ecr 565673514], length 0
08:38:33.188185 IP k8s-node-002.infra.prov.io.46011 > gateway.bgp: Flags [F.], seq 50, ack 2, win 229, options [nop,nop,TS val 565673514 ecr 1642001523], length 0
08:38:33.188269 IP gateway.bgp > k8s-node-002.infra.prov.io.46011: Flags [R], seq 671432404, win 0, length 0
08:38:33.188304 IP gateway.bgp > k8s-node-002.infra.prov.io.46011: Flags [R], seq 671432405, win 0, length 0
@danderson

This comment has been minimized.

Copy link
Member

danderson commented Apr 30, 2018

Okay, the pod is stable, good - now the problem seems to be somewhere in configuration. The BGP session is getting terminated by pfSense after MetalLB sends its OPEN message. In your original report, it stated that the last error was "AS unacceptable"

... So, why is the AS unacceptable? I have no idea. It looks to me like a perfectly fine iBGP session, but the router is rejecting it.

It looks like pfSense uses OpenBGPD, which is not one that I've tested against in the past :(. To figure out more, I'm going to have to reproduce this on a local testbench.

The tcpdump was good, however to really dissect the packets I need a pcap. Can you tcpdump -i any -s 1500 -w bug.pcap port 179, capture the same thing you just did (just kill the speaker pod and then give it a couple of minutes to collect some attempts), and attach it to the bug? That'll let me drill down into the BGP packets and see what's what.

@FireDrunk

This comment has been minimized.

Copy link
Contributor

FireDrunk commented Apr 30, 2018

Just found a typo in my OpenBGPd Neighbour config (had AS as 65512 instead of 64512). Just updated, and it seems the session is now correctly established:

BGP neighbor is 10.0.24.41, remote AS 64512
 Description: k8s-master-001
  BGP version 4, remote router-id 10.0.24.41
  BGP state = Established, up for 00:00:09
  Last read 00:00:09, holdtime 90s, keepalive interval 30s
  Neighbor capabilities:
    Multiprotocol extensions: IPv4 unicast, IPv6 unicast
    4-byte AS numbers

bug-250.zip

IP is still offline though.
The status page of OpenBGPd tells me that the metallb speaker has correctly pushed the route for the IP:

I*>   10.2.2.0/32          10.0.24.41           0     0 ?

So I think the current problem is just the container that hasn't configured the IP on the node yet.

EDIT:
After just connecting to the ip, instead of pinging the ip, everything works fine. The IP is just not responding to pings. I think this solves the issue!

@FireDrunk FireDrunk closed this Apr 30, 2018

BGP mode automation moved this from To Do to Done Apr 30, 2018

@danderson

This comment has been minimized.

Copy link
Member

danderson commented Apr 30, 2018

Ah yes, the ping issue is quite confusing, unfortunately kube-proxy doesn't make service IPs pingable, so you have to actually connect to the service to check it :(

I'm glad you got it working!

@FireDrunk

This comment has been minimized.

Copy link
Contributor

FireDrunk commented May 1, 2018

Any idea when this will make it into a release?

@danderson

This comment has been minimized.

Copy link
Member

danderson commented May 1, 2018

Doh, when you closed the bug, I forgot to go and merge it.

I'll release 0.6.2 now, should be available in ~30min.

@danderson danderson reopened this May 1, 2018

BGP mode automation moved this from Done to In progress May 1, 2018

danderson added a commit to danderson/metallb that referenced this issue May 1, 2018

BGP mode automation moved this from In progress to Done May 1, 2018

danderson added a commit that referenced this issue May 1, 2018

danderson added a commit that referenced this issue May 1, 2018

Check for closed or aborted BGP sessions before sending the initial B…
…GP update.

Fixes #250

(cherry picked from commit afd6475)
@danderson

This comment has been minimized.

Copy link
Member

danderson commented May 1, 2018

0.6.2 released with the fix.

@FireDrunk

This comment has been minimized.

Copy link
Contributor

FireDrunk commented May 1, 2018

Awesome! Works like a charm!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment