
Huge performance gap compared to TCP #60

Open · drv1234 opened this issue Mar 14, 2023 · 29 comments

drv1234 commented Mar 14, 2023

Hi,
I am planning to use https://github.com/fiorix/go-diameter. That project includes an example Diameter client/server with a benchmark.
With TCP, 259058 Diameter messages/s can be reached (on localhost).
With SCTP, only 6938 messages/s.
cpuprof_tcp_vs_sctp.zip

ishidawataru (Owner) commented:

This library doesn't use the Go runtime polling system. I think the performance gap is caused by that.

drv1234 (Author) commented Mar 28, 2023

Thank you.
Would you have time to implement it in this module?
I would really appreciate it because I cannot use it as it is now.

Thank you in advance

ishidawataru (Owner) commented:

I think we can make similar changes to this to use the runtime polling system.

Unfortunately, I don't have enough time to work on this. PR is welcome!

alzrck commented Jul 29, 2023

I'm not an expert, but I'll give it a shot. If by any chance you can point me to where to look/change, I'll make the changes and run the tests, then I'll open the PR if it's OK.

drv1234 (Author) commented Nov 8, 2023

hi @alzrck ,
how is it going?
Did you manage to make some progress?

thanks

alzrck commented Nov 8, 2023 via email

drv1234 (Author) commented Nov 8, 2023

where can I find it?

alzrck commented Nov 8, 2023 via email

drv1234 (Author) commented Nov 9, 2023

Does it fit the standard Go "net/http" architecture?
I cannot see, e.g., dial or listen functions.

alzrck commented Nov 9, 2023 via email

drv1234 (Author) commented Nov 9, 2023

do you work in telco too? :-)

alzrck commented Nov 9, 2023 via email

SX91 commented Nov 10, 2023

I've tried to reimplement the underlying methods using the approach from https://github.com/mdlayher/socket/ (look for rwT and controlT). But for some reason TestStreams hangs for ~3 seconds in the middle of the test (around ~60 completed client cycles).

It looks like accept and/or connect doesn't get the control call in time (the write poller doesn't re-call the lambda in rwT). Or the runtime hangs. I don't know how to debug this.

PS: does anyone have a working benchmark to compare the current and via-poll implementations?
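
For context, the rwT helper in mdlayher/socket is built on syscall.RawConn: the actual read or write is attempted inside a callback, and returning false from that callback parks the goroutine on the runtime poller until the descriptor becomes ready again. A minimal sketch of that shape, assuming a non-blocking socket already wrapped in an *os.File (package and function names here are illustrative, not the library's actual code):

package sctppoll // hypothetical helper package, for illustration only

import (
	"os"
	"syscall"
)

// readViaRawConn performs one read through syscall.RawConn. The runtime
// re-invokes the callback when the fd becomes readable; returning false
// means "not done yet, wait on the poller".
func readViaRawConn(f *os.File, buf []byte) (int, error) {
	rc, err := f.SyscallConn()
	if err != nil {
		return 0, err
	}

	var n int
	var readErr error
	waitErr := rc.Read(func(fd uintptr) bool {
		n, readErr = syscall.Read(int(fd), buf)
		if readErr == syscall.EAGAIN {
			return false // fd not ready yet: park on the poller and retry
		}
		return true
	})
	if waitErr != nil {
		return 0, waitErr
	}
	return n, readErr
}

rc.Write follows the same pattern for the send side.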

drv1234 (Author) commented Nov 14, 2023

I think TCP does not use a poll implementation.
With TCP, 259058 Diameter messages/s can be reached (on localhost).
With SCTP, only 6938 messages/s.

SX91 commented Nov 15, 2023

@drv1234 Could you share the code for the throughput benchmark?

drv1234 (Author) commented Nov 15, 2023

I installed this package:
https://github.com/fiorix/go-diameter

There is an example folder in it, but I slightly modified the files:
examples.zip

  1. start server like: "go run server.go --addr=localhost:3867 --network_type=tcp -s "

  2. start client like: "go run client.go --addr=localhost:3867 --bench=True --bench_clients=4 --bench_msgs=10000 --network_type=tcp"

  3. repeat it with "... --network_type=sctp"

drv1234 (Author) commented Jan 15, 2024

hi,
how is it going?

I studied the "mdlayher/netlink" solution a bit.
But it seems to me that it is just an "interface" to Linux netlink sockets (made to behave like a network connection).
So if you want to use it, you have to write the other end, which sends/receives SCTP messages through this netlink socket. That is actually an extra layer in the communication path.

As I understand it, the problem is that the current SCTP implementation uses "syscall.Recvmsg()", which blocks not only the current goroutine but the whole thread.
So as I see it now, the "only" thing that needs to be done is to replace "syscall.Recvmsg()" with something like "Pread", which uses poll.

https://pkg.go.dev/internal/poll
"Package poll supports non-blocking I/O on file descriptors with polling. This supports I/O operations that block only a goroutine, not a thread. This is used by the net and os packages. It uses a poller built into the runtime, with support from the runtime scheduler."

Can somebody correct me if I am wrong?
@ishidawataru what do you think?
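
To make the concern above concrete, here is a minimal sketch (not this library's actual code) of a blocking read loop built on syscall.Recvmsg: while Recvmsg waits for data, the whole OS thread is stuck in the kernel, so the Go scheduler has to back every blocked reader with its own thread.

package sctppoll // illustrative only

import "syscall"

// blockingReadLoop reads from a socket fd that is in blocking mode.
// Each call to Recvmsg pins the current OS thread until data arrives;
// other goroutines can only run on other threads in the meantime.
func blockingReadLoop(fd int, handle func([]byte)) error {
	buf := make([]byte, 64*1024)
	oob := make([]byte, 4096)
	for {
		n, _, _, _, err := syscall.Recvmsg(fd, buf, oob, 0)
		if err != nil {
			return err
		}
		handle(buf[:n])
	}
}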

ishidawataru (Owner) commented:

@drv1234 mdlayher/netlink uses os.NewFile with a file descriptor that is in non-blocking mode to use the runtime polling system.

see https://github.com/mdlayher/netlink/pull/125/files#diff-13b50968875bb007d5d504aea5954a145602329215391b19588d750eab07a5cfR442-R452
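
A minimal sketch of that pattern, assuming we already hold a raw SCTP socket descriptor (package and function names are illustrative, not this library's API): the fd is switched to non-blocking mode before being handed to os.NewFile, and the resulting *os.File routes Read/Write through internal/poll and the runtime netpoller, so they block only the calling goroutine.

package sctppoll // illustrative only

import (
	"os"
	"syscall"
)

// wrapFD makes fd non-blocking and wraps it in an *os.File so that
// reads and writes go through the runtime poller instead of parking
// an OS thread in a raw blocking syscall.
func wrapFD(fd int) (*os.File, error) {
	if err := syscall.SetNonblock(fd, true); err != nil {
		return nil, err
	}
	// os.NewFile registers a non-blocking descriptor with the netpoller.
	return os.NewFile(uintptr(fd), "sctp"), nil
}

Note that plain f.Read on such a file uses read(2) and therefore drops SCTP ancillary data (e.g. sndrcvinfo); recvmsg-style calls still have to go through the RawConn mechanism sketched earlier in this thread.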

ishidawataru (Owner) commented Jan 22, 2024

@drv1234 Can you try this branch? https://github.com/ishidawataru/sctp/tree/feat-non-blocking

Still WIP. But at least the test is passing in my environment (Ubuntu 20.04).

alzrck commented Jan 22, 2024

What I did (/home/alz/sctp is where I cloned):

alz@alzdell:~/sctp$ git branch
  master
* origin/feat-non-blocking

  1. deleted the old sctp module from the local go directory + cache directory (to be sure it is not used)
  2. cloned this branch locally
  3. configured replace github.com/ishidawataru/sctp => /home/alz/sctp in go.mod (the project using the sctp code); a sketch of that go.mod follows below
  4. rebuilt the server and client examples in the go-diameter project
  5. ran the benchmark tests again
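
For step 3, the go.mod of the project under test would look roughly like this (module path and versions are placeholders, not the real ones):

module example.com/diameter-bench // hypothetical module path

go 1.21 // any reasonably recent Go version

require github.com/ishidawataru/sctp v1.0.0 // placeholder; the replace below overrides it

replace github.com/ishidawataru/sctp => /home/alz/sctp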

with tcp
2000 messages in 17.275023ms: 115774/s

with sctp
2000 messages in 1.82990384s: 1092/s

Am I doing something wrong? @drv1234 did you have the chance to try?

alzrck commented Jan 22, 2024

To double-check that I'm using the correct code and branch, in sctp_linux.go I commented out

	//fmt.Printf("SCTPRead: n: %d, oobn: %d, recvflags: %d, err: %v\n", n, oobn, recvflags, err)

and ran the test again, and I get

2000 messages in 1.847410964s: 1082/s

ishidawataru (Owner) commented Jan 22, 2024

@alzrck Thanks for testing. Could you run the test with the master branch and compare the results?
I'd like to check if the performance is at least improving with the use of non-blocking sockets.

drv1234 (Author) commented Jan 23, 2024

hi,
I repeated the test.
With the original version:
80000 messages in 11.370102203s: 7035/s

With the new one:
80000 messages in 10.837160112s: 7382/s

alzrck commented Jan 23, 2024

@ishidawataru with the master branch

2000 messages in 1.244879549s: 1606/s

ishidawataru (Owner) commented:

Thanks for testing. It seems my expectation was wrong. Using the runtime polling system does not improve the performance.

I found this old issue:

netty/netty#611 (comment)

Maybe the performance gap is caused at the OS stack level?

drv1234 (Author) commented Jan 24, 2024

This issue is really old :-)

  1. But according to the numbers, only the "min" value was double compared to TCP; the "mean" is almost the same, and the "99.99%" value was better than TCP.
  2. Currently I use a C++ implementation of Diameter/SCTP and there is no such issue.

alzrck commented Jan 24, 2024 via email

SX91 commented Jan 27, 2024

I've written some (stupid and low-quality) client/server benchmark implementations with SCTP only (no Diameter involved).

client:

package main

import (
	"errors"
	"flag"
	"io"
	"log"
	"math/rand"
	"net"
	"time"

	"github.com/ishidawataru/sctp"
)

func init() {
	rand.Seed(time.Now().UnixNano())
}

func main() {
	//  go run client.go  --addr localhost:3868 --clients 4 --count 10000 --network tcp

	addr := flag.String("addr", "localhost:3868", "address in form of ip:port to connect to")
	benchCli := flag.Int("clients", 1, "number of client connections")
	benchMsgs := flag.Int("count", 1000, "number of ACR messages to send")
	networkType := flag.String("network", "tcp", "protocol type tcp/sctp")
	drainMode := flag.Int("drain-mode", 0, "Drain incoming messages mode (0 - disable, 1 - sync, 2 - async)")

	flag.Parse()
	if len(*addr) == 0 {
		flag.Usage()
	}

	connect := func() (net.Conn, error) {
		return dial(*networkType, *addr)
	}

	done := make(chan int, 16)

	benchmark(connect, *benchCli, *benchMsgs, *drainMode, done)
}

func dial(network, addr string) (net.Conn, error) {
	switch network {
	case "sctp", "sctp4", "sctp6":
		sctpAddr, err := sctp.ResolveSCTPAddr(network, addr)
		if err != nil {
			return nil, err
		}

		return sctp.DialSCTP(network, nil, sctpAddr)
	case "tcp", "tcp4", "tcp6":
		tcpAddr, err := net.ResolveTCPAddr(network, addr)
		if err != nil {
			return nil, err
		}

		return net.DialTCP(network, nil, tcpAddr)
	}

	return nil, net.UnknownNetworkError(network)
}

type dialFunc func() (net.Conn, error)

func sender(conn net.Conn, msgs int, drainMode bool, done chan int) {
	rdbuf := make([]byte, 4096)
	total := 0

	payload := make([]byte, 1024)
	_, _ = rand.Read(payload)

	for i := 0; i < msgs; i += 1 {
		n, err := conn.Write(payload)
		if err != nil {
			log.Fatal(err)
		} else if n != len(payload) {
			log.Fatal("not all bytes written")
		}

		if !drainMode {
			if done != nil {
				done <- 1
			}

			continue
		}

		// drain
		received := 0

	drain:
		for {
			n, err := conn.Read(rdbuf[total:])

			if err != nil {
				if errors.Is(err, net.ErrClosed) || errors.Is(err, io.EOF) {
					log.Printf("connection closed")
					return
				}

				log.Fatalf("read error: %v", err)
			}

			total += n

			if total < 1024 {
				continue
			}

			for total >= 1024 {
				total = copy(rdbuf, rdbuf[1024:total])
				received++
			}

			break drain
		}

		if done != nil && received > 0 {
			done <- received
		}
	}
}

func receiver(conn net.Conn, done chan int) {
	rdbuf := make([]byte, 16384)
	total := 0

	for {
		n, err := conn.Read(rdbuf[total:])

		if err != nil {
			if errors.Is(err, net.ErrClosed) || errors.Is(err, io.EOF) {
				log.Printf("connection closed")
				return
			}

			log.Fatalf("read error: %v", err)
		}

		total += n

		received := 0

		for total >= 1024 {
			total = copy(rdbuf, rdbuf[1024:total])
			received++
		}

		if received > 0 {
			done <- received
		}
	}
}

func benchmark(df dialFunc, ncli, msgs int, drainMode int, done chan int) {
	var err error
	c := make([]net.Conn, ncli)
	log.Println("Connecting", ncli, "clients...")
	for i := 0; i < ncli; i++ {
		c[i], err = df() // Dial.
		if err != nil {
			log.Fatal(err)
		}
		defer c[i].Close()
	}
	log.Println("Done. Sending messages...")
	start := time.Now()
	for _, cli := range c {
		switch drainMode {
		case 0:
			go sender(cli, msgs, false, done)
		case 1:
			go sender(cli, msgs, true, done)
		case 2:
			go sender(cli, msgs, false, nil)
			go receiver(cli, done)
		}
	}

	count := 0
	total := ncli * msgs
wait:
	for {
		select {
		case n := <-done:
			count += n
			if count == total {
				break wait
			}
		case <-time.After(100 * time.Second):
			log.Fatal("Timeout waiting for messages.")
		}
	}
	elapsed := time.Since(start)
	log.Printf("%d messages in %s: %d/s", count, elapsed,
		int(float64(count)/elapsed.Seconds()))
}

server:

package main

import (
	"errors"
	"flag"
	"io"
	"log"
	"net"
	"syscall"

	_ "net/http/pprof"

	"github.com/ishidawataru/sctp"
)

func main() {
	// go run server.go --addr=localhost:3867 --network=tcp

	addr := flag.String("addr", ":3868", "address in the form of ip:port to listen on")
	networkType := flag.String("network", "tcp", "protocol type tcp/sctp")
	echoMode := flag.Bool("echo", false, "Send back incoming messages")
	flag.Parse()

	err := listen(*networkType, *addr, *echoMode)
	if err != nil {
		log.Fatal(err)
	}
}

func listen(network, addr string, echoMode bool) error {
	log.Println("Starting server on", addr)

	var listener net.Listener

	switch network {
	case "sctp", "sctp4", "sctp6":
		sctpAddr, err := sctp.ResolveSCTPAddr(network, addr)
		if err != nil {
			return err
		}

		sctpListener, err := sctp.ListenSCTP(network, sctpAddr)
		if err != nil {
			return err
		}

		listener = sctpListener
	case "tcp", "tcp4", "tcp6":
		tcpAddr, err := net.ResolveTCPAddr(network, addr)
		if err != nil {
			return err
		}

		tcpListener, err := net.ListenTCP(network, tcpAddr)
		if err != nil {
			return err
		}

		listener = tcpListener
	}

	log.Printf("start listening on %s://%s", network, addr)
	for {
		conn, err := listener.Accept()
		if err != nil {
			log.Fatal("dead listener")
		}

		log.Printf("accepted incoming connection")

		go reader(conn, echoMode)
	}

}

func reader(conn net.Conn, echoMode bool) {
	buf := make([]byte, 4096)
	total := 0
	totalPackets := 0

	for {
		n, err := conn.Read(buf[total:])

		if err != nil {
			log.Printf("read error: %s (processed %d packets)", err, totalPackets)

			if errors.Is(err, net.ErrClosed) || errors.Is(err, syscall.ECONNRESET) || errors.Is(err, io.EOF) {
				log.Printf("connection closed")
				return
			}

			return
		}

		if echoMode {
			total += n

			for total >= 1024 {
				payload := buf[:1024]
				wn, err := conn.Write(payload)
				if err != nil {
					log.Printf("write error: %s (processed %d packets)", err, totalPackets)
					if errors.Is(err, net.ErrClosed) || errors.Is(err, syscall.ECONNRESET) {
						return
					}
					return
				}

				if wn != len(payload) {
					log.Fatal("not all bytes written")
				}

				total = copy(buf, buf[1024:total])

				totalPackets++
			}
		}
	}
}

Try running them as follows:

  1. server in echo mode, client in sync drain-mode:
$ ./server --addr 127.0.0.1:3868 --network sctp --echo
$ ./client --addr 127.0.0.1:3868 --network sctp --clients 1 --count 100000 --drain-mode 1
  2. server in echo mode, client in async drain-mode:
$ ./server --addr 127.0.0.1:3868 --network sctp --echo
$ ./client --addr 127.0.0.1:3868 --network sctp --clients 1 --count 100000 --drain-mode 2
  3. server in receive-only mode, client in send-only mode:
$ ./server --addr 127.0.0.1:3868 --network sctp
$ ./client --addr 127.0.0.1:3868 --network sctp --clients 1 --count 100000 --drain-mode 0

I've messed up my SCTP sysctl settings, so I don't have results to share, but take a look at setups 1 and 2. The results for TCP vs SCTP in those cases are confusing, to say the least.

linouxis9 commented:

Hi all,
Any updates / ideas on this issue?
Thanks!
Valentin
