Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/ipv{4,6}: ReadBatch memory usage #26838

Open
mhr3 opened this issue Aug 7, 2018 · 3 comments

Comments

@mhr3
Copy link

commented Aug 7, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.10.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

package reader

import (
	"io"
	"net"
	"syscall"
	"testing"

	"golang.org/x/net/ipv4"
)

func getConn() net.PacketConn {
	if addr, err := net.ResolveUDPAddr("udp", net.JoinHostPort("127.0.0.1", "40006")); err != nil {
		panic(err)
	} else if l, err := net.ListenUDP("udp", addr); err != nil {
		panic(err)
	} else if f, err := l.File(); err != nil {
		panic(err)
	} else if conn, err := net.FilePacketConn(f); err != nil {
		panic(err)
	} else {
		return conn
	}
}

func BenchmarkMem(b *testing.B) {
	pc := ipv4.NewPacketConn(getConn())

	const numMsgs = 4
	msgs := [numMsgs]ipv4.Message{}
	for i := 0; i < 4; i++ {
		msgs[i].Buffers = [][]byte{make([]byte, 4096)}
	}

	process := func(data []byte, addr net.Addr) {}

	for i := 0; i < b.N; i++ {
		n, packetErr := pc.ReadBatch(msgs[:], syscall.MSG_WAITFORONE)

		if packetErr == io.EOF {
			return
		} else if packetErr != nil {
			return
		}

		for j := 0; j < n; j++ {
			msg := msgs[j]
			process(msg.Buffers[0][:msg.N], msg.Addr)
		}
	}
}

What did you expect to see?

I was benchmarking the program above (with -benchmem) and expected 0 allocations per iteration - as using ReadBatch is supposed to help with performance on heavily-utilized sockets.

What did you see instead?

I saw that each extra message in the msgs array adds 2 extra allocations for each iteration of the benchmark, negating the extra performance of ReadBatch() vs traditional Read().

@gopherbot gopherbot added this to the Unreleased milestone Aug 7, 2018

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2018

I agree that there are areas for performance improvement on {Read,Write}Batch methods. Feel free to send a CL. Well, the attached benchmark is not self-contained; no traffic generator code. You may use a benchmark in golang.org/x/net/internal/socket_go1_9_test.go.

expected 0 allocations per iteration [...] 2 extra allocations for each iteration

I guess that the allocations are for storage of iovec and sockaddr.

@mikioh mikioh added the Performance label Aug 8, 2018

@mhr3

This comment has been minimized.

Copy link
Author

commented Aug 8, 2018

Right, I used external traffic generator to make sure it doesn't skew the benchmark results (iperf -c 127.0.0.1 -p 40006 -u -b 100M -t 30)

I also used allocfreetrace to confirm your guess:

x/net/internal/socket/mmsghdr_unix.go:15 - vs := make([]iovec, len(ms[i].Buffers))
x/net/internal/socket/mmsghdr_unix.go:18 - sa = make([]byte, sizeofSockaddrInet6)
x/net/internal/socket/rawconn_mmsg.go:25 - var operr error
x/net/internal/socket/rawconn_mmsg.go:26 - var n int
x/net/internal/socket/rawconn_mmsg.go:27 - fn := func(s uintptr) bool { ...

As you suspected, it is the iovec and socketaddr which grow linearly with the number of items in the batch, but there's also a few other allocs (though at least those remain constant).

Not really sure how to fix this without changing the signature of ReadBatch()

@mikioh mikioh changed the title x/net: ReadBatch memory usage x/net/ipv{4,6}: ReadBatch memory usage Aug 12, 2018

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Aug 12, 2018

Not really sure how to fix this without changing the signature of ReadBatch()

You may think about a few ways to remove allocations on IO code path, for example, having a free list for temporary stuff. However, it's better to address your issue on the IO methods first; in general, comparing ReadBatch vs. Read doesn't make sense on connectionless datagram-based transport protocols because of the differences in the transport protocol characteristics, for example, any-to-any vs. one-to-one mapping conversation style.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.