Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: udp packets not load balanced by kubernetes service #31427

Closed
MaurGi opened this issue Apr 11, 2019 · 8 comments

Comments

Projects
None yet
3 participants
@MaurGi
Copy link

commented Apr 11, 2019

What version of Go are you using (go version)?

$ go version go1.10.4 linux/amd64

Does this issue reproduce with the latest release?

sudo apt-get update && sudo apt-get install golang tells me I am on the latest version.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/maurgi/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/maurgi/go"
GORACE=""
GOROOT="/usr/lib/go-1.10"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.10/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build890901110=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I have a simple udp server that just receives packets deployed as a Kubernetes Service with 3 instances.
Then with a simple udp client I call conn = net.Dial and in a forever loop I call conn.Write

What did you expect to see?

UDP packets are randomly distributed across the three pods.

What did you see instead?

UDP packets ALWAYS go to the same server until I stop the client.
If I try to use netcat, the packets correctly go to the right server.

Server:

package main

import (
	"fmt"
	"net"
)

func main() {
	tcpListener, err := net.Listen("tcp", "0.0.0.0:3333")
	checkError(err)
	defer tcpListener.Close()
	fmt.Println("TCP-Listening on 0.0.0.0:3333")

	udpListener, err := net.ListenPacket("udp", "0.0.0.0:3333")
	checkError(err)
	defer udpListener.Close()
	fmt.Println("UDP-Listening on 0.0.0.0:3333")

	go listenUDP(udpListener)

	for {
		tcpConn, err := tcpListener.Accept()
		checkError(err)
		defer tcpConn.Close()
		fmt.Println("TCP-Connected")

		go handleTCPConn(tcpConn)
	}
}

func handleTCPConn(conn net.Conn) {
	// Make a buffer to hold incoming data.
	buf := make([]byte, 1024)
	i := 0

	for {
		// Read the incoming connection into the buffer.
		_, err := conn.Read(buf)
		if err != nil {
			fmt.Println("TCP connection error ", err.Error())
			break
		}

		// Log what received
		fmt.Printf("%d - TCP:%s", i, buf)
		i++
	}
}

func listenUDP(udpListener net.PacketConn) {
	buffer := make([]byte, 1024)
	i := 0

	for {
		//simple read
		udpListener.ReadFrom(buffer)

		// Log what received
		fmt.Printf("%d - UDP:%s", i, buffer)
		i++
	}
}

func checkError(err error) {
	if err != nil {
		fmt.Println("Fatal error ", err.Error())
		panic(err)
	}
}

Client:

package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"log"
	"net"
	"syscall"
	"time"
)

func handleError(err error) {
	if err != nil {
		log.Fatalf("Error with the statsd connection %s", err.Error())
		panic(err)
	}
}

// From https://github.com/go-sql-driver/mysql/blob/master/conncheck.go
// Note that it works only on Unix, and it caused several allocations.
// I really want this is implemented in stdlib in efficient (and probably cross platform) way.
var errUnexpectedRead = errors.New("unexpected read from socket")

func connCheck(c net.Conn) error {
	var (
		n    int
		err  error
		buff [1]byte
	)

	sconn, ok := c.(syscall.Conn)
	if !ok {
		return nil
	}
	rc, err := sconn.SyscallConn()
	if err != nil {
		return err
	}
	rerr := rc.Read(func(fd uintptr) bool {
		n, err = syscall.Read(int(fd), buff[:])
		return true
	})
	switch {
	case rerr != nil:
		return rerr
	case n == 0 && err == nil:
		return io.EOF
	case n > 0:
		return errUnexpectedRead
	case err == syscall.EAGAIN || err == syscall.EWOULDBLOCK:
		return nil
	default:
		return err
	}
}

func runClient(server *string, port *string, protocol *string, sleep int) {
	address := fmt.Sprintf("%s:%s", *server, *port)
	prot := *protocol
	conn, err := net.Dial(prot, address)
	handleError(err)

	for {
		msg := fmt.Sprintf("%sTEST\n", prot)
		log.Printf("Sending message: %s", msg)
		_, err := conn.Write([]byte(msg))
		handleError(err)

		if prot == "tcp" {
			err := connCheck(conn)
			handleError(err)
		}

		time.Sleep(time.Duration(sleep) * time.Second)
	}
}

func main() {
	// Command flag: https://gobyexample.com/command-line-flags
	var srv, port, protocol string
	var sleep int
	flag.StringVar(&srv, "server", "127.0.0.1", "the target server")
	flag.StringVar(&port, "port", "3333", "the target port")
	flag.StringVar(&protocol, "protocol", "tcp", "protocol used [tcp|udp]")
	flag.IntVar(&sleep, "sleep", 1, "sleep seconds")

	flag.Parse()

	fmt.Printf("using server: %s, port: %s, protocol: %s, sleep seconds:%d\n", srv, port, protocol, sleep)
	runClient(&srv, &port, &protocol, sleep)
}

Kubernetes deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    run: multiport
  name: multiport
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      run: multiport
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: multiport
    spec:
      containers:
      - image: multiport:latest
        imagePullPolicy: IfNotPresent
        name: multiport
        ports:
        - containerPort: 3333
          protocol: TCP
        - containerPort: 3333
          protocol: UDP
        resources: {}

Kubernetes service:

apiVersion: v1
kind: Service
metadata:
  labels:
    run: multiport
  name: multiport
  namespace: default
spec:
  ports:
  - port: 3333
    protocol: UDP
  #  targetPort: 3333
  #  name: udp-port
  selector:
    run: multiport
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

launch the client with:

go run test_client.go -server multiport.default -port 3333 -protocol udp

verify that the packets go to the same server until you stop and restart the client

run a test from bash with:

for i in {1..100}; do echo "sending"; echo "TEST" | nc -w 1 -u multiport.default 3333; done

verify that the packets end on different servers

@mikioh mikioh changed the title udp packets not load balanced by kubernetes service net: udp packets not load balanced by kubernetes service Apr 12, 2019

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Apr 12, 2019

Looks like your expectation on the distribution relies on k8s. Why do you think this is an issue of Go's standard library?

@MaurGi

This comment has been minimized.

Copy link
Author

commented Apr 12, 2019

Yeah it can be on either sides, of course - but if I use netcat, it gets distributed correctly, so I guess it has something to do with the go net library.

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Apr 12, 2019

Can you please provide a self-contained small example only using Go?

Also, please make sure the detail of your "traffic distribution"; what are the targets, just packets are a bit vague, UDP packets? and the endpoints? UDP sockets? To me, your example makes a UDP socket just listening to "0.0.0.0:3333", all available unicast and anycast addresses on the node w/ transport port # 3333, and doesn't have any traffic distribution magic.

@crvv

This comment has been minimized.

Copy link
Contributor

commented Apr 12, 2019

This is not an issue of package net.

The script calls nc many times.
nc creates a new UDP socket every time and these sockets have different local ports.

Your Go code just creates one socket and send many packets through it. All the packets have the same local port.

Generally, If some UDP packets have the same local port and remote port, they will be assumed to be belong to one "connection".

The packets of one "connection" won't be routed to different backends.

@MaurGi

This comment has been minimized.

Copy link
Author

commented Apr 12, 2019

@crvv this is possibly the answer I was looking for, thank you! What in the stack is making that assumption, is it Kubernetes proxy or just Linux networking? Can I affect it - can I change the client port in net library for each call or do I have to reconnect it every time? (that might be expensive)

@MaurGi

This comment has been minimized.

Copy link
Author

commented Apr 12, 2019

Can you please provide a self-contained small example only using Go?

Also, please make sure the detail of your "traffic distribution"; what are the targets, just packets are a bit vague, UDP packets? and the endpoints? UDP sockets? To me, your example makes a UDP socket just listening to "0.0.0.0:3333", all available unicast and anycast addresses on the node w/ transport port # 3333, and doesn't have any traffic distribution magic.

The attached files in the bug are the self contained small example and they only use G, updated comments to add some of the details - thanks!

@MaurGi

This comment has been minimized.

Copy link
Author

commented Apr 12, 2019

So made more investigation - does not look like the host iptables are involved, this is something I need to bring to the Kubernetes network stack - will open an issue there

@MaurGi

This comment has been minimized.

Copy link
Author

commented Apr 12, 2019

Opened this issue in kubernetes: kubernetes/kubernetes#76517

@MaurGi MaurGi closed this Apr 12, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.