net: UDPConn.WriteTo and UDPConn.ReadFromUDP both allocate #43451

Open
josharian opened this issue Dec 31, 2020 · 21 comments

Comments

Contributor

@josharian josharian commented Dec 31, 2020

What version of Go are you using (go version)?

$ go version
go version devel +95ce805d14 Thu Dec 31 02:24:55 2020 +0000 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="/Users/josh/bin"
GOCACHE="/Users/josh/Library/Caches/go-build"
GOENV="/Users/josh/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/josh/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/josh"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/josh/go/tip"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/josh/go/tip/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="devel +95ce805d14 Thu Dec 31 02:24:55 2020 +0000"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/josh/go/tip/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/1t/n61cbvls5bl293bbb0zyypqw0000gn/T/go-build2869058686=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I'd like to be able to write a program that uses UDPConn.WriteTo and UDPConn.ReadFromUDP without allocating per-packet.

This benchmark indicates one alloc per WriteTo and two allocs per ReadFromUDP.

func BenchmarkWriteToReadFromUDP(b *testing.B) {
	conn, err := ListenUDP("udp4", new(UDPAddr))
	if err != nil {
		b.Fatal(err)
	}
	addr := conn.LocalAddr()
	buf := make([]byte, 8)
	b.ResetTimer()
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_, err := conn.WriteTo(buf, addr)
		if err != nil {
			b.Fatal(err)
		}
		_, _, err = conn.ReadFromUDP(buf)
		if err != nil {
			b.Fatal(err)
		}
	}
}

Two of the allocs come from constructing syscall.Sockaddrs. Maybe this is fixable, but I don't see an easy way.

The last alloc is from constructing a *UDPAddr to return from ReadFromUDP. I fear the API may make this one unavoidable.

cc @bradfitz @danderson @zx2c4

Contributor

@zx2c4 zx2c4 commented Dec 31, 2020

I have a vague impression of pointing this out to @FiloSottile 2 years ago but I don't remember the conclusion of our conversation. CCing in case he has a better recollection.

Contributor

@zx2c4 zx2c4 commented Dec 31, 2020

I found some old notes. The conclusion from when I last looked into this was that the API made it unavoidable. As a result I wound up making direct syscalls on Linux but didn't port that to all platforms.

I wonder if this warrants adding a new API. ReadFromUDPWithPreallocatedSockaddr() or something...


@gopherbot gopherbot commented Jan 1, 2021

Change https://golang.org/cl/280934 mentions this issue: syscall: cache and re-use inet sockaddrs in anyToSockaddr

Contributor Author

@josharian josharian commented Jan 1, 2021

IIUC, most sockaddrs will be re-used and they are never mutated. Given that, we could intern them. E.g. we could use a sync.Pool of maps to opportunistically re-use them. (This is the technique that https://github.com/josharian/intern uses for strings; it is fast and pretty good, but not perfect. There are other options, with their own trade-offs, like go4.org/intern.)

I threw together CL 280934 to illustrate using the sync.Pool of maps approach.
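A minimal sketch of the sync.Pool-of-maps idea follows. It is not the code in CL 280934: `sockaddr4`, `key`, and `internSockaddr` are hypothetical names, and `sockaddr4` only stands in for syscall.SockaddrInet4. The approach assumes nobody ever mutates an interned value.

```go
package main

import (
	"fmt"
	"sync"
)

// sockaddr4 is a hypothetical stand-in for syscall.SockaddrInet4.
type sockaddr4 struct {
	Port int
	Addr [4]byte
}

// key is the comparable value used to look up an interned sockaddr4.
type key struct {
	port int
	addr [4]byte
}

// pool holds maps from key to a previously allocated *sockaddr4.
// sync.Pool may drop its contents at any time, so interning is
// opportunistic: a miss costs one fresh allocation and nothing more.
var pool = sync.Pool{
	New: func() interface{} { return make(map[key]*sockaddr4) },
}

// internSockaddr returns a *sockaddr4 for (addr, port), re-using an
// earlier one when the map taken from the pool already has it.
func internSockaddr(addr [4]byte, port int) *sockaddr4 {
	m := pool.Get().(map[key]*sockaddr4)
	defer pool.Put(m)

	k := key{port: port, addr: addr}
	if sa, ok := m[k]; ok {
		return sa
	}
	sa := &sockaddr4{Port: port, Addr: addr}
	m[k] = sa
	return sa
}

func main() {
	a := internSockaddr([4]byte{192, 0, 2, 1}, 53)
	b := internSockaddr([4]byte{192, 0, 2, 1}, 53)

	fmt.Println(*a == *b) // true: the contents always match
	// Pointer identity is likely but not guaranteed, because the pool
	// may hand back a different (or brand new) map on the second call.
	fmt.Println(a == b)
}
```

The maps in this sketch grow without bound for as long as the pool keeps them; a real implementation would need some cap or eviction.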

Contributor Author

@josharian josharian commented Jan 1, 2021

Ugh. Nope, that is not safe. We end up exposing the sockaddr memory to the caller in this line (udpsock_posix.go:50):

addr = &UDPAddr{IP: sa.Addr[0:], Port: sa.Port}

Avoiding that would require making a copy of the sa.Addr bytes to serve as the new net.IP. It'd be an allocation of only 4 bytes, which is better than the 32 bytes interning the sockaddr saves, but it's still an allocation.
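To make the hazard concrete, here is a small sketch. `sockaddr4`, `aliasIP`, and `copyIP` are hypothetical names for illustration, with `sockaddr4` standing in for syscall.SockaddrInet4; this is not the code in package net.

```go
package main

import (
	"fmt"
	"net"
)

// sockaddr4 is a hypothetical stand-in for syscall.SockaddrInet4.
type sockaddr4 struct {
	Port int
	Addr [4]byte
}

// aliasIP mirrors the quoted line: the returned net.IP shares memory
// with sa, so it is only safe when sa is not shared with anyone else.
func aliasIP(sa *sockaddr4) net.IP {
	return sa.Addr[0:]
}

// copyIP allocates a fresh 4-byte backing array, so the caller can
// never reach the memory inside sa.
func copyIP(sa *sockaddr4) net.IP {
	ip := make(net.IP, len(sa.Addr))
	copy(ip, sa.Addr[:])
	return ip
}

func main() {
	// Pretend this sockaddr is interned and shared between calls.
	shared := &sockaddr4{Port: 53, Addr: [4]byte{192, 0, 2, 1}}

	aliased := aliasIP(shared)
	copied := copyIP(shared)

	// A caller that modifies the aliased IP also rewrites the shared value.
	aliased[3] = 99
	fmt.Println(shared.Addr) // [192 0 2 99]

	// Modifying the copy leaves the shared value alone.
	copied[3] = 7
	fmt.Println(shared.Addr) // [192 0 2 99]
}
```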

We might still be able to intern the sockaddrs that are destined for the kernel at least.

If we (a) switched to inet.af/netaddr's IP type and (b) returned a UDPAddr instead of a *UDPAddr, that'd eliminate the non-sockaddr allocations. But (b) requires new API, and (a) is Go 2 material.

We could do both by letting people provide their own *UDPAddr to be filled in, with a preallocated net.IP that could be overwritten...but we already have too many ways to receive a UDP packet. We don't need another one. (At least not in the standard library. Maybe we could set up some of this in golang.org/x/net somehow?)

Contributor

@zx2c4 zx2c4 commented Jan 1, 2021

Evil idea: if the buf passed in is pre-allocated memory that's larger than what's needed for the data, and can fit the sockaddr and returned udpaddr and ip, could we stuff that all in at the end of buf?

Contributor Author

@josharian josharian commented Jan 1, 2021

Hyrum's Law says no. (Plus the Go standard library generally doesn't go for such evil tricks, entertaining though they be.)

Contributor

@zx2c4 zx2c4 commented Jan 1, 2021

Maybe a variant of that might be acceptable:

Right now people pass in a buffer of the maximum size of data they want:

data := make([]byte, 1472)
n, addr, err := conn.ReadFromUDP(data)
data = data[:n]

My initial idea was to place the sockaddr allocations in the region of data[n:...] that doesn't wind up getting used. You cited Hyrum.

It occurred to me that other Go APIs sometimes work by taking a slice to append to and then returning a new slice. The reasoning goes that the caller can preallocate by allocating a slice with a large capacity but a zero length, and then the appending operation is free. What if we use a related trick here:

data := make([]byte, 1472, 2000)
n, addr, err := conn.ReadFromUDP(data)
data = data[:n]

In this instance, rather than placing addr at data[n:...], it's placed after the 1472 bytes by using append -- the region data[len(data):...], which does not need to allocate for another 528 bytes, because the capacity has been preallocated. So we avoid the allocation by putting the sockaddr there.
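For readers unfamiliar with the append-style APIs mentioned above, here is a short sketch using strconv.AppendInt from the standard library. `appendPort` is a hypothetical helper written in the same style; it is not part of any proposal in this thread.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendPort is a hypothetical append-style helper: it appends ":<port>"
// to dst and returns the extended slice, like strconv.AppendInt does.
func appendPort(dst []byte, port int) []byte {
	dst = append(dst, ':')
	return strconv.AppendInt(dst, int64(port), 10)
}

func main() {
	// Zero length, generous capacity: appends fit without reallocating.
	buf := make([]byte, 0, 32)

	out := appendPort(append(buf, "192.0.2.1"...), 4242)
	fmt.Printf("%s\n", out) // 192.0.2.1:4242

	// The result still lives in buf's backing array.
	fmt.Println(&out[0] == &buf[:1][0]) // true
}
```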

Contributor

@zx2c4 zx2c4 commented Jan 1, 2021

Mmm, looks like that can still lead to unexpected problems:

package main

import (
	"fmt"
)

func doTheAliasingTrick(slice []byte) *byte {
	for i := range slice {
		slice[i] = 41
	}
	return &append(slice, 42)[len(slice)]
}

func main() {
	data := make([]byte, 1472, 2000)
	x := doTheAliasingTrick(data)
	
	fmt.Printf("*x = %d\n", *x) // Prints 42
	_ = append(data, 43)
	fmt.Printf("*x = %d\n", *x) // Prints 43
}

Contributor

@FiloSottile FiloSottile commented Jan 5, 2021

> Avoiding that would require making a copy of the sa.Addr bytes to serve as the new net.IP. It'd be an allocation of only 4 bytes, which is better than the 32 bytes interning the sockaddr saves, but it's still an allocation.

I think you can outline that. It will require some care on the caller side, but if someone needs it they can make sure they don't get in the way of the inliner.
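A rough sketch of the outlining idea, under the assumption that the wrapper stays small enough to be inlined: `addr`, `readFrom`, and `readFromInto` are hypothetical stand-ins, not the functions in package net. The exported-style wrapper only creates the result and delegates; once it is inlined into the caller, escape analysis can decide per call site whether the result needs the heap.

```go
package main

import (
	"fmt"
	"testing"
)

// addr is a hypothetical stand-in for net.UDPAddr.
type addr struct {
	ip   [4]byte
	port int
}

// readFrom is a thin wrapper: it creates the result and delegates.
// Because it is tiny, the compiler can inline it into the caller, where
// escape analysis may place a on the caller's stack.
func readFrom(buf []byte) (int, *addr) {
	a := &addr{}
	n := readFromInto(buf, a)
	return n, a
}

// readFromInto stands in for the large body that does the real work.
// It fills in a but does not retain it. The directive below keeps this
// sketch honest by preventing the whole thing from being inlined.
//
//go:noinline
func readFromInto(buf []byte, a *addr) int {
	a.ip = [4]byte{127, 0, 0, 1}
	a.port = 4242
	return copy(buf, "ping")
}

func main() {
	buf := make([]byte, 8)

	// The closure uses a but never lets it escape.
	allocs := testing.AllocsPerRun(1000, func() {
		n, a := readFrom(buf)
		_ = n
		_ = a.port
	})
	fmt.Println("allocs per call:", allocs)
}
```

The measured number depends on the compiler version and flags. If the caller stores the returned pointer somewhere that outlives the call, the allocation moves back to the heap.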

@toothrot toothrot added this to the Backlog milestone Jan 5, 2021

@gopherbot gopherbot commented Feb 11, 2021

Change https://golang.org/cl/291390 mentions this issue: net: use mid-stack inlining with ReadFromUDP to avoid an allocation


@gopherbot gopherbot commented Feb 12, 2021

Change https://golang.org/cl/291509 mentions this issue: net: use mid-stack inlining with ReadFromUDP to avoid an allocation

josharian added a commit to tailscale/go that referenced this issue Feb 12, 2021
This commit rewrites ReadFromUDP to be mid-stack inlined
and pass a UDPAddr for lower layers to fill in.

This lets performance-sensitive clients avoid an allocation.
It requires some care on their part to prevent the UDPAddr
from escaping, but it is now possible.
The UDPAddr trivially does not escape in the benchmark,
as it is immediately discarded.

name                  old time/op    new time/op    delta
WriteToReadFromUDP-8    17.2µs ± 6%    17.1µs ± 5%     ~     (p=0.387 n=9+9)

name                  old alloc/op   new alloc/op   delta
WriteToReadFromUDP-8      112B ± 0%       64B ± 0%  -42.86%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
WriteToReadFromUDP-8      3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=10+10)

Updates golang#43451

Co-authored-by: Filippo Valsorda <filippo@golang.org>
Change-Id: I1f9d2ab66bd7e4eff07fe39000cfa0b45717bd13
josharian added a commit to josharian/go that referenced this issue Feb 12, 2021
Contributor Author

@josharian josharian commented Feb 12, 2021

I played with this a bit more. Results:

With CL 291509 and those two commits, we would have zero allocs per write and one net.IP backing array allocated per receive (4 or 16 bytes).

But those two commits involve new API and duplicating subtle code. :(

The new API is:

// WriterTo returns an io.Writer that writes UDP packets to addr.
// This is more efficient than WriteTo when many packets will be sent to the same addr.
func (c *UDPConn) WriterTo(addr Addr) (io.Writer, error)

I'm happy to propose the API and mail the other change if there's interest, but my working assumption is that they're both non-starters.

bradfitz added a commit to tailscale/go that referenced this issue Feb 18, 2021
gopherbot pushed a commit that referenced this issue Mar 15, 2021
This commit rewrites ReadFromUDP to be mid-stack inlined
and pass a UDPAddr for lower layers to fill in.

This lets performance-sensitive clients avoid an allocation.
It requires some care on their part to prevent the UDPAddr
from escaping, but it is now possible.
The UDPAddr trivially does not escape in the benchmark,
as it is immediately discarded.

name                  old time/op    new time/op    delta
WriteToReadFromUDP-8    17.2µs ± 6%    17.1µs ± 5%     ~     (p=0.387 n=9+9)

name                  old alloc/op   new alloc/op   delta
WriteToReadFromUDP-8      112B ± 0%       64B ± 0%  -42.86%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
WriteToReadFromUDP-8      3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=10+10)

Updates #43451

Co-authored-by: Filippo Valsorda <filippo@golang.org>
Change-Id: I1f9d2ab66bd7e4eff07fe39000cfa0b45717bd13
Reviewed-on: https://go-review.googlesource.com/c/go/+/291509
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Trust: Filippo Valsorda <filippo@golang.org>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Trust: Jason A. Donenfeld <Jason@zx2c4.com>
josharian added a commit to tailscale/go that referenced this issue May 26, 2021
Duplicate some code to avoid an interface.

name                  old time/op    new time/op    delta
WriteToReadFromUDP-8    6.38µs ±20%    5.59µs ±10%  -12.38%  (p=0.001 n=10+9)

name                  old alloc/op   new alloc/op   delta
WriteToReadFromUDP-8     64.0B ± 0%     32.0B ± 0%  -50.00%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
WriteToReadFromUDP-8      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

Updates golang#43451

Change-Id: Ied15ff92268c652cf445836e0446025eaeb60cc9
Contributor Author

@josharian josharian commented May 26, 2021

tailscale@1074dae removes all allocations for WriteTo.

tailscale@4b4fb83 reduces the size of the allocation for ReadFrom from 32 bytes down to the size of net.IP (4 bytes for IPv4, 16 for IPv6).

Both come at the cost of duplicating a bunch of code, some of it non-trivial. I'm happy to mail for 1.18, if folks aren't too horrified by that.

Removing the last ReadFrom allocation is impossible without an API change. The key change is having a value IP type, such as netaddr.IP.
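As an illustration of what a value IP type buys, the sketch below uses net/netip, which is not mentioned in this thread: it assumes Go 1.18 or later, where the standard library gained that package (derived from inet.af/netaddr). `fromSockaddr4` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/netip"
)

// fromSockaddr4 is a hypothetical helper that builds an address value
// from raw bytes and a port. The result is a plain value with no backing
// array, so returning it does not require a slice allocation.
func fromSockaddr4(ip [4]byte, port uint16) netip.AddrPort {
	return netip.AddrPortFrom(netip.AddrFrom4(ip), port)
}

func main() {
	a := fromSockaddr4([4]byte{192, 0, 2, 1}, 4242)
	b := a // a plain copy; nothing is shared between a and b

	fmt.Println(a)      // 192.0.2.1:4242
	fmt.Println(a == b) // true: values compare with ==

	// Comparable values also work directly as map keys.
	packets := map[netip.AddrPort]int{}
	packets[a]++
	packets[b]++
	fmt.Println(packets[a]) // 2
}
```

Go 1.18 also added UDPConn.ReadFromUDPAddrPort and UDPConn.WriteToUDPAddrPort, which use netip.AddrPort instead of *UDPAddr.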

cc @neild @bradfitz

Contributor

@neild neild commented May 26, 2021

I'm not horrified by the amount of code duplication in those CLs.

Contributor Author

@josharian josharian commented Jun 2, 2021

> I'm not horrified by the amount of code duplication in those CLs.

Great. I'll plan to mail them soon or early in the 1.18 cycle. And if I forget or don't get to it quickly enough, you (or anyone) have my explicit permission to do so.


@gopherbot gopherbot commented Jun 28, 2021

Change https://golang.org/cl/331489 mentions this issue: net: remove allocation from UDPConn.WriteTo


@gopherbot gopherbot commented Jun 28, 2021

Change https://golang.org/cl/331490 mentions this issue: net: reduce allocation size in ReadFromUDP


@gopherbot gopherbot commented Jun 28, 2021

Change https://golang.org/cl/331511 mentions this issue: net: reduce allocations for UDP send/recv on Windows

josharian added a commit to tailscale/go that referenced this issue Jun 30, 2021
Duplicate some code to avoid an interface.

name                  old time/op    new time/op    delta
WriteToReadFromUDP-8    6.38µs ±20%    5.59µs ±10%  -12.38%  (p=0.001 n=10+9)

name                  old alloc/op   new alloc/op   delta
WriteToReadFromUDP-8     64.0B ± 0%     32.0B ± 0%  -50.00%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
WriteToReadFromUDP-8      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

Windows is temporarily stubbed out.

Updates golang#43451

(cherry picked from golang.org/cl/331489)

Change-Id: Ied15ff92268c652cf445836e0446025eaeb60cc9
josharian added a commit to tailscale/go that referenced this issue Jun 30, 2021
Switch to concrete types. Bring your own object to fill in.

Allocate just enough for the IP byte slice.
The allocation is now just 4 bytes for IPv4,
which puts it in the tiny allocator, which is much faster.

name                  old time/op    new time/op    delta
WriteToReadFromUDP-8    13.7µs ± 1%    13.4µs ± 2%   -2.49%  (p=0.000 n=10+10)

name                  old alloc/op   new alloc/op   delta
WriteToReadFromUDP-8     32.0B ± 0%      4.0B ± 0%  -87.50%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
WriteToReadFromUDP-8      1.00 ± 0%      1.00 ± 0%     ~     (all equal)

Windows is temporarily stubbed out.

Updates golang#43451

(cherry picked from golang.org/cl/331490)

Change-Id: Ief506f891b401d28715d22dce6ebda037941924e
josharian added a commit to tailscale/go that referenced this issue Jun 30, 2021
This brings the optimizations added in CLs 331489 and 331490 to Windows.

Updates golang#43451

(cherry picked from golang.org/cl/331511)

Change-Id: I75cf520050325d9eb5c2785d6d8677cc864fcac8
steeve added a commit to znly/go that referenced this issue Jul 5, 2021
steeve added a commit to znly/go that referenced this issue Jul 15, 2021
josharian added a commit to tailscale/go that referenced this issue Jul 29, 2021
josharian added a commit to tailscale/go that referenced this issue Jul 29, 2021
josharian added a commit to tailscale/go that referenced this issue Jul 29, 2021
josharian added a commit to tailscale/go that referenced this issue Jul 30, 2021
josharian added a commit to tailscale/go that referenced this issue Jul 30, 2021
josharian added a commit to tailscale/go that referenced this issue Jul 30, 2021
josharian added a commit to tailscale/go that referenced this issue Aug 5, 2021
josharian added a commit to tailscale/go that referenced this issue Aug 5, 2021
josharian added a commit to tailscale/go that referenced this issue Aug 5, 2021
gopherbot pushed a commit that referenced this issue Aug 16, 2021
Duplicate some code to avoid an interface.

name                  old time/op    new time/op    delta
WriteToReadFromUDP-8    6.38µs ±20%    5.59µs ±10%  -12.38%  (p=0.001 n=10+9)

name                  old alloc/op   new alloc/op   delta
WriteToReadFromUDP-8     64.0B ± 0%     32.0B ± 0%  -50.00%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
WriteToReadFromUDP-8      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

Windows is temporarily stubbed out.

Updates #43451

Change-Id: Ied15ff92268c652cf445836e0446025eaeb60cc9
Reviewed-on: https://go-review.googlesource.com/c/go/+/331489
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
gopherbot pushed a commit that referenced this issue Aug 16, 2021
Switch to concrete types. Bring your own object to fill in.

Allocate just enough for the IP byte slice.
The allocation is now just 4 bytes for IPv4,
which puts it in the tiny allocator, which is much faster.

name                  old time/op    new time/op    delta
WriteToReadFromUDP-8    13.7µs ± 1%    13.4µs ± 2%   -2.49%  (p=0.000 n=10+10)

name                  old alloc/op   new alloc/op   delta
WriteToReadFromUDP-8     32.0B ± 0%      4.0B ± 0%  -87.50%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
WriteToReadFromUDP-8      1.00 ± 0%      1.00 ± 0%     ~     (all equal)

Windows is temporarily stubbed out.

Updates #43451

Change-Id: Ief506f891b401d28715d22dce6ebda037941924e
Reviewed-on: https://go-review.googlesource.com/c/go/+/331490
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
gopherbot pushed a commit that referenced this issue Aug 16, 2021
This brings the optimizations added in CLs 331489 and 331490 to Windows.

Updates #43451

Change-Id: I75cf520050325d9eb5c2785d6d8677cc864fcac8
Reviewed-on: https://go-review.googlesource.com/c/go/+/331511
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
steeve added a commit to znly/go that referenced this issue Aug 19, 2021
steeve added a commit to znly/go that referenced this issue Oct 28, 2021