net: not possible to set TCP backlog parameter for listener #39000

Open
nemirst opened this issue May 11, 2020 · 9 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

@nemirst

nemirst commented May 11, 2020

What version of Go are you using (go version)?

go version go1.14.2 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/cppko/.cache/go-build"
GOENV="/home/cppko/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/cppko//Source/gocode"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/snap/go/current"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/snap/go/current/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build230275253=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I'm cross-building from Linux to Windows (but that is not important here). When 500 clients connect simultaneously to my Go program (a TCP server), it fails to accept them all because the backlog queue fills up, even though I process them as fast as possible. On Linux I worked around this by raising the net.core.somaxconn kernel parameter. On Windows Server (at least on Microsoft Windows Server 2019 Datacenter) I didn't find a similar OS-level parameter or solution, so I had to rebuild Go from source after changing net.maxListenerBacklog to return a custom backlog value (for me SOMAXCONN_HINT(500)=-500 worked): https://github.com/golang/go/blob/master/src/net/sock_windows.go#L16. There are probably other workarounds, but I did this just to verify that setting this parameter fixes the problem.

What did you expect to see?

A way to set the backlog parameter before the listener starts accepting connections, so that clients don't get disconnected in bursty scenarios just because the backlog is full.

What did you see instead?

Clients were disconnected when the backlog was full (its size is 200 on my system).

@kostix

kostix commented May 11, 2020

The discussion which led to creation of this issue.

I'll try to rephrase the issue a bit so that it is hopefully easier to understand.

To figure out the size of the TCP listening backlog, the Go library code reads the relevant system setting once.
Linux (and several other supported platforms) has such a setting, while other platforms do not.
Windows currently falls into the latter category, so there the backlog is set to syscall.SOMAXCONN, which is currently defined as 0x7fffffff.
Still, as someone behind the Windows SDK suggests here, since Windows 8 there is actually a way to specify the backlog when executing a TCP listen syscall: pass a negative integer as the second argument; ostensibly, the sign tells the Winsock stack that this is a direct backlog override. (It's still not clear whether Windows has a system-wide setting to control the cap on the backlog.)

What @nemirst suggests is to have a way to explicitly specify the backlog size parameter when creating a listening TCP socket, something akin to what has been implemented in #9661 for tweaking options of already created sockets.
This would solve the problem on Windows, where a programmer could set the value they want, and it could also be useful on other platforms by giving more fine-grained control over listening sockets.

@ianlancetaylor changed the title from "Not possible to set TCP backlog parameter for listener" to "net: not possible to set TCP backlog parameter for listener" May 11, 2020
@ianlancetaylor
Contributor

In general we try to pass the maximum acceptable value as the listen backlog parameter. And it sounds like if we could do that on Windows, it would fix the problem that you are seeing.

From the docs (https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-listen), it looks like using SOMAXCONN sets a "maximum reasonable value." And we can instead use SOMAXCONN_HINT(N), where N is limited to be between 200 and 65535. So what if we always use SOMAXCONN_HINT(65535)? It seems that would fix your problem. Would it have any negative consequences?

@nemirst
Author

nemirst commented May 12, 2020

Yes, that would fix my problem completely, without any real disadvantages in my case. I'm just not sure whether it is appropriate for other cases, such as:

  1. If the resources allocated for the socket are very important:

When calling the listen function in a Bluetooth application, it is strongly recommended that a much lower value be used for the backlog parameter (typically 2 to 4), since only a few client connections are accepted. This reduces the system resources that are allocated for use by the listening socket. This same recommendation applies to other network applications that expect only a few client connections.

  2. If SYN flood attacks are possible
    (https://tangentsoft.net/wskfaq/advanced.html, 4.14 - What is the connection backlog?):

Beware that large backlogs make SYN flood attacks much more, shall we say, effective. When Winsock creates the backlog queue, it starts small and grows as required. Since the backlog queue is in non-pageable system memory, a SYN flood can cause the queue to eat a lot of this precious memory resource.
You will note that SYN attacks are dangerous for systems with both very short and very long backlog queues. The point is that a middle ground is the best course if you expect your server to withstand SYN attacks. Either use Microsoft’s dynamic backlog feature, or pick a value somewhere in the 20-200 range and tune it as required.

For me, slightly higher memory usage is not important, and SYN flood attacks are not possible either, as the server will not be public.

@acln0
Contributor

acln0 commented May 12, 2020

ISTM that it's not possible to use the ListenConfig.Control function to achieve this. Maybe we can add a backlog size knob to net.ListenConfig, then? If the knob is set (!= 0), we use that value. If it's not, we use the value we already use.

@ianlancetaylor
Contributor

@acln0 I agree that we could add a knob to ListenConfig, I just want to understand whether we really have to. My general impression of the backlog parameter to listen is that it has historical meaning but is pretty much useless on modern systems. Perhaps I am wrong.

@acln0
Contributor

acln0 commented May 12, 2020

@ianlancetaylor You may be right. I don't know about useless, but it's at least a little strange.

I implemented the knob I was talking about in package net, and then tried to write a test for it. The test tried to do something like this, on Linux:

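// Backlog is the knob from my implementation; it is not in the released net package.
lncfg := net.ListenConfig{Backlog: 1}
ln, _ := lncfg.Listen(context.Background(), "tcp", ":0")

// dial is a test helper that connects to the given address and reports the error.
done1 := make(chan error)
go func() { done1 <- dial(ln.Addr()) }()

time.Sleep(aShortWhile) // to give the first dialer time to race ahead, send a SYN, etc.

// With a backlog of 1, the second dial is expected not to succeed until Accept is called.
done2 := make(chan error)
go func() { done2 <- dial(ln.Addr()) }()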

select {
case err := <-done2:
	if err == nil {
		// bad
	}
case <-time.After(notVeryLong):
	// took too long, also bad
}

ln.Accept()
<-done1

Naively, I expected this to honor the backlog of 1. It didn't: both dials succeeded immediately. I don't know what this means, and I haven't had time to dig into the kernel sources in order to understand the behavior. Maybe Linux ignores backlog values that are this small.

On the other hand, leaving Linux aside for the moment, I am reading https://docs.microsoft.com/en-us/archive/blogs/winsdk/winsocks-listen-backlog-offers-more-flexibility-in-windows-8, which seems reasonably official, and mentions backlog parameters as low as 2 - 4, so maybe a test like that one would pass on Windows, and would also capture the essence of this issue (and make the knob worth doing).

@acln0
Contributor

acln0 commented May 12, 2020

If you think it would help, I could mail a CL of my implementation and the test (with DO NOT SUBMIT, etc.)

@gopherbot

Change https://golang.org/cl/233577 mentions this issue: net: add Backlog knob to ListenConfig

@acln0
Contributor

acln0 commented May 12, 2020

I mailed it in the end. The relevant test is TestListenConfigBacklog, in listen_windows_test.go.

@cagedmantis added the NeedsDecision label May 18, 2020
@cagedmantis added this to the Backlog milestone May 18, 2020
6 participants