Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: calling (*UnixConn).File leads to performance degredation #22953

Closed
stevvooe opened this issue Dec 1, 2017 · 12 comments

Comments

Projects
None yet
5 participants
@stevvooe
Copy link

commented Dec 1, 2017

What version of Go are you using (go version)?

$ go version
go version go1.9.2 linux/amd64

Does this issue reproduce with the latest release?

Yes, this is on 1.9.2. I haven't tried master.

What operating system and processor architecture are you using (go env)?

linux/amd64, Ubuntu 17.04 with 4.10.0-40-generic kernel

What did you do?

I was trying to track down a performance degradation when using unix socket credentials on an abstract socket (bind address starts with a null byte \x00). The following benchmark shows the discrepancy:

goos: linux
goarch: amd64
pkg: github.com/stevvooe/ttrpc
BenchmarkRoundTrip-8                  	   50000	     22078 ns/op	    2593 B/op	      43 allocs/op
BenchmarkRoundTripUnixSocketCreds-8   	   10000	    141333 ns/op	    2593 B/op	      43 allocs/op
PASS
ok  	github.com/stevvooe/ttrpc	3.142s

Notice that the benchmark that uses the unix socket credentials is much slower. The code in question is available here:

https://github.com/stevvooe/ttrpc/blob/45d16b41b590938186c5c7cde8088607b3933231/unixcreds.go#L23-L35

After some experimentation, I found that removing the call to (*UnixConn).File got rid of the performance discrepancy. As an experiment, I applied the following patch to the net package to access the fd directly and pass that to unix.GetsockoptUcred:

diff --git a/net/unixsock.go b/net/unixsock.go.fdfix
index 057940a..df44b1b 100644
--- a/net/unixsock.go
+++ b/net/unixsock.go.fdfix
@@ -63,6 +63,10 @@ type UnixConn struct {
 	conn
 }
 
+func (c *UnixConn) Fd() uintptr {
+	return uintptr(c.conn.fd.pfd.Sysfd)
+}
+
 // SyscallConn returns a raw network connection.
 // This implements the syscall.Conn interface.
 func (c *UnixConn) SyscallConn() (syscall.RawConn, error) {

When I ran the benchmark with that change, passing (*UnixConn).Fd directly to unix.GetsockoptUcred, without calling (*UnixConn).File, I then got the following benchmark numbers:

goos: linux
goarch: amd64
pkg: github.com/stevvooe/ttrpc
BenchmarkRoundTripUnixSocketCreds-8      50000	     21897 ns/op	    2593 B/op	      43 allocs/op
PASS
ok  	github.com/stevvooe/ttrpc	1.693s

While this probably isn't a great patch, as making that fd available can probably cause problems with the poller, hopefully this demonstrates the issue with call (*UnixConn).File. If you need a more minimal example, let me know.

What did you expect to see?

I expected no performance degradation when calling (*UnixConn).File.

What did you see instead?

A performance degradation after calling (*UnixConn).File.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Dec 1, 2017

@stevvooe

This comment has been minimized.

Copy link
Author

commented Dec 1, 2017

@bradfitz Thanks for the quick response!

Using the (*syscall.RawConn).Control addresses the performance issue on my end. The results are available in containerd/ttrpc#17.

Is there a problem with dup that is causing this problem? Should applications avoid calling (*net.UnixConn).File? Is there anyway I can help here?

@bradfitz

This comment has been minimized.

Copy link
Member

commented Dec 1, 2017

Doing more work will always be slower than doing less work.

Generally people don't want the thing dup'd. If anything, we could do more with documentation here and point people in the right direction from the various Fd methods.

I'll keep this open as a doc bug.

@bradfitz bradfitz added this to the Go1.11 milestone Dec 1, 2017

@stevvooe

This comment has been minimized.

Copy link
Author

commented Dec 1, 2017

I agree a pointer to the (*syscall.RawConn).Control call for fd access is needed in the docs, as this wasn't immediately obvious. It seems to be the correct way for this application.

Doing more work will always be slower than doing less work.

I'm not sure I made this clear, but the benchmark doesn't include the extra call to (*net.UnixConn).File. All of that is before the timing starts in the benchmark, so the problem manifests itself while reading and writing from the socket (unless my benchmark is setup incorrectly). You can try out containerd/ttrpc@256c17b to experiment.

You can even put a call before the benchmark loop to prove the connection is active and it doesn't affect the results. To ensure that was correct, I applied the following patch to containerd/ttrpc@256c17b:

diff --git a/server_test.go b/server_test.go
index 8be008b..c3fde5c 100644
--- a/server_test.go
+++ b/server_test.go
@@ -388,6 +388,7 @@ func BenchmarkRoundTripUnixSocketCreds(b *testing.B) {
 	// unix socket credentials. See (UnixCredentialsFunc).Handshake for
 	// details.
 
+	fmt.Println("\n\n-------------------------")
 	var (
 		ctx             = context.Background()
 		server          = mustServer(b)(NewServer(WithServerHandshaker(UnixSocketRequireSameUser())))
@@ -406,6 +407,11 @@ func BenchmarkRoundTripUnixSocketCreds(b *testing.B) {
 	defer server.Shutdown(ctx)
 
 	var tp testPayload
+	// do one call to prime the server
+	if _, err := tclient.Test(ctx, &tp); err != nil {
+		b.Fatal(err)
+	}
+	fmt.Println("reset timer")
 	b.ResetTimer()
 
 	for i := 0; i < b.N; i++ {
diff --git a/unixcreds.go b/unixcreds.go
index 5de8b92..e958d90 100644
--- a/unixcreds.go
+++ b/unixcreds.go
@@ -4,6 +4,7 @@ package ttrpc
 
 import (
 	"context"
+	"fmt"
 	"net"
 	"os"
 	"syscall"
@@ -38,6 +39,7 @@ func (fn UnixCredentialsFunc) Handshake(ctx context.Context, conn net.Conn) (net
 		return nil, nil, errors.Wrapf(err, "ttrpc.UnixCredentialsFunc: credential check failed")
 	}
 
+	fmt.Println("handshake complete")
 	return uc, ucred, nil
 }

When running the benchmark, we can both see that the handshake is completed before the timer reset and the result is not affected:

$ go test -run - -benchmem -bench 'Unix'


-------------------------
handshake complete
reset timer
goos: linux
goarch: amd64
pkg: github.com/stevvooe/ttrpc
BenchmarkRoundTripUnixSocketCreds-8   	

-------------------------
handshake complete
reset timer


-------------------------
handshake complete
reset timer
   10000	    139126 ns/op	    2593 B/op	      43 allocs/op
PASS
ok  	github.com/stevvooe/ttrpc	1.406s

Am I missing something?

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Dec 1, 2017

BenchmarkRoundTrip-8                  	   50000	     22078 ns/op	    2593 B/op	      43 allocs/op
BenchmarkRoundTripUnixSocketCreds-8   	   10000	    141333 ns/op	    2593 B/op	      43 allocs/op

The result (6x slower) makes me guess that your code probably may disturb IO multiplexing. If you set the underlying socket of UnixConn blocking by calling the File method of UnixConn and leave the socket as it is, not calling the FileConn function to set the socket non-blocking again, it may be a cause of ineffective IO multiplexing; losing the deadline feature of UnixConn means that you lose half of trigger mechanisms for IO multiplexing on UnixConn. See #21862.

@stevvooe

This comment has been minimized.

Copy link
Author

commented Dec 1, 2017

@mikioh There isn't really anything special about the code. It just calls (*UnixConn).File, grabs the fd then closes the file. Are you saying that calling (*UnixConn).File changes the behavior of the original *UnixConn socket from that point forward? Should the original connection not be used after calling File?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 1, 2017

Currently when you call the File method it expects that you want a descriptor in blocking mode. On Unix systems that unavoidably puts the descriptor that the net package is using into blocking mode as well. Everything continues to work, but reads and writes on the descriptor will now block instead of using the poller, meaning that each operation on the descriptor will tie up an operating thread. That may or may not be the cause of what you are seeing.

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Dec 2, 2017

@stevvooe,

Should the original connection not be used after calling File?

Try this:

f, err := c.File()
c.Close()
if err != nil {
        // error handling
}
syscall.Setsockopt(int(f.Fd()), ...)
c, err = net.FileConn(f)
f.Close()
if err != nil {
        // error handling
}
n, err := c.Read(...)
@stevvooe

This comment has been minimized.

Copy link
Author

commented Dec 4, 2017

@mikioh I ended up merging https://github.com/stevvooe/ttrpc/pull/17/files, as there was no performance degradation using that method. Per your instructions, I applied the following to containerd/ttrpc@256c17b:

diff --git a/unixcreds.go b/unixcreds.go
index 5de8b92..dbf2096 100644
--- a/unixcreds.go
+++ b/unixcreds.go
@@ -38,7 +38,12 @@ func (fn UnixCredentialsFunc) Handshake(ctx context.Context, conn net.Conn) (net
 		return nil, nil, errors.Wrapf(err, "ttrpc.UnixCredentialsFunc: credential check failed")
 	}
 
-	return uc, ucred, nil
+	fc, err := net.FileConn(fp)
+	if err != nil {
+		return nil, nil, err
+	}
+
+	return fc, ucred, nil
 }
 
 func UnixSocketRequireUidGid(uid, gid int) UnixCredentialsFunc {

Running the benchmarks shows that the performance problem goes away:

goos: linux
goarch: amd64
pkg: github.com/stevvooe/ttrpc
BenchmarkRoundTrip-8                  	   50000	     23390 ns/op
BenchmarkRoundTripUnixSocketCreds-8   	  100000	     25180 ns/op
PASS
ok  	github.com/stevvooe/ttrpc	4.191s

Looking at this code path, it seems like the FileConn fd gets added to the poller, whereas the file returned directly from (*UnixConn).File does not. Also note that in the patch above, whether or not the file connection or original unix connection is returned does not affect the result.

@ianlancetaylor That sounds like a reasonable theory. Is there a way to confirm that?

It sounds like the current docs need to make it clear that calling (*UnixConn).File affects the connection which it is called. This is fairly surprising, since it is the duped fd that has blocking set, not the original.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Dec 4, 2017

@stevvooe You can see exactly what is happening by looking at the strace -f output.

On Unix systems whether a file/socket is in blocking mode or not affects all descriptors. It is a per-file flag, not a per-descriptor flag.

Documentation patches welcome but I find it quite hard to document these things concisely and usefully. It may be a better topic for a blog post.

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Dec 5, 2017

@stevvooe,

I won't take a look at your code, but please don't forget to close transient file descriptors; you make two new transient descriptors referring to the same underlying socket by calling net.UnixConn.File and net.FileConn. In general using syscall.RawConn is a simpler solution (unless the user-defined function breaks something under the hood) and using a pair of File+FileConn is a safer solution (unless the caller makes a resource leak.)

@mikioh mikioh removed the WaitingForInfo label Dec 5, 2017

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Apr 20, 2018

Merging into #24942.

@mikioh mikioh closed this Apr 20, 2018

@golang golang locked and limited conversation to collaborators Apr 20, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.