net: 1.14 performance regression on mac #36298
Comments
Thank you for this report @erikdubbelboer. Unfortunately, I don't have a Mac, so I can't test this. Would you be open to running a git bisect to pinpoint the offending commit? That would help a lot in the investigation.
Some information I found bisecting:

c1635ad (
I'm seeing similar behavior with Go 1.12, Go 1.13, and tip. The program starts out reporting a latency of about 200ms. As the connections get up to the full 9000, the latency starts to increase. For me on macOS 10.15 it seems to settle around 220ms to 250ms. This is an example of what I see. Are these the kinds of numbers you see as well? Do you really see a difference between 1.13 and tip? I'm not seeing it. I see similar delays on GNU/Linux on my laptop.
I think part of the problem is that the timer implementation tends to cause all your servers to synchronize. They all write data at the same time and the readers all run into each other, so some see higher latency. If I change the handler as follows, adding a small random offset to each sleep:

```go
func handler(c net.Conn) {
	buff := make([]byte, 4)
	for {
		if _, err := c.Read(buff); err != nil {
			panic(err)
		}
		off := (rand.Int() % 16) - 8
		time.Sleep(responseDelay + time.Duration(off)*time.Millisecond)
		if _, err := c.Write(buff); err != nil {
			panic(err)
		}
	}
}
```
Any program like this is bound to collapse at some level of connections. As I said above, I'm seeing similar results with Go 1.13 and tip. Can you confirm that you see better results with 1.13? If you push up the number of connections, it will collapse on 1.13; what number of connections causes that collapse? Thanks.
I have a 2019 MacBook Pro (2.3 GHz 8-core) with

This is a run on tip (
@ianlancetaylor what minor version of 1.13 did you use? 1.13.2 was the first version where it was fixed. With 1.13.5 the collapsing seems to start happening around 16000 connections on my machine:
I was testing with the tip of the 1.13 branch, which is a couple of CLs past 1.13.5. What happens on your machine if you tweak the
The tip of the 1.13 branch is steady at around 202ms for me at 9000 connections. These are two runs with the
When I look at an execution trace on GNU/Linux I see that the program is idle about half the time, so I'm currently still inclined to think that this is more of a timing synchronization problem than a performance problem as such. That is, I'm not yet persuaded that this would be a problem for a real program that does network activity but doesn't use timers. But I'll keep looking.
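(For reference, a minimal way to capture such an execution trace with the standard `runtime/trace` package; the exact method used above isn't stated, so this is just one option. The resulting file can then be inspected with `go tool trace trace.out`.)

```go
package main

import (
	"log"
	"os"
	"runtime/trace"
	"time"
)

func main() {
	f, err := os.Create("trace.out")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Record an execution trace for a bounded window while the workload runs.
	if err := trace.Start(f); err != nil {
		log.Fatal(err)
	}
	defer trace.Stop()

	// ... start the server/client workload here ...
	time.Sleep(10 * time.Second)
}
```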
cc @eliasnaur @odeke-em as other macOS devs...

@gopherbot remove WaitingForInfo
@ianlancetaylor for me it's also idle about half the time (with both 1.13 and 1.14) so I agree it should be reworded as a timing problem. But that doesn't change the fact that it seems the new timer code could potentially add a lot of unnecessary delay to high traffic network code on Mac. I'm not running any production environment on Mac so I can't say if any real programs are affected. Maybe a more realistic scenario is having the delay between requests: https://gist.github.com/erikdubbelboer/d493450f38a77fc74b57f4a1cd4f7380
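(The linked gist is the authoritative version; as a rough, hypothetical fragment of what a client with the delay between requests looks like — `clientLoop`, `requestDelay`, and `latencies` are illustrative names, not taken from the gist, and the function is meant to sit alongside the rest of the test program:)

```go
package main

import (
	"io"
	"net"
	"time"
)

// clientLoop sends a 4-byte request, reads the 4-byte echo, records the
// round-trip time on a channel, and sleeps *between* requests rather than
// the server sleeping inside its handler.
func clientLoop(c net.Conn, requestDelay time.Duration, latencies chan<- time.Duration) {
	buf := make([]byte, 4)
	for {
		start := time.Now()
		if _, err := c.Write(buf); err != nil {
			return
		}
		if _, err := io.ReadFull(c, buf); err != nil {
			return
		}
		latencies <- time.Since(start)
		time.Sleep(requestDelay)
	}
}
```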
Separating the server and client parts of the process into separate processes, I can still reproduce the issue if the server is running go1.13.5 (without any sleep in the handler) and the client is running tip with a 200ms delay between requests. This I think shows that the issue is with the net code in combination with the sleep. Using go1.13.5 for the client and tip for the server I could never reproduce the issue.
Another interesting thing which might give a hint is that all runs have a drop in the number of requests handled right before the latency increase. This is with the separate server and client process and a random sleep between requests.

run 1
```
connections: 5920 latency: 108.459µs (requests: 29157)
connections: 6167 latency: 263.384µs (requests: 30345)
connections: 6415 latency: 106.456µs (requests: 31641)
connections: 6663 latency: 111.774µs (requests: 32885)
connections: 6912 latency: 115.933µs (requests: 34132)
connections: 7160 latency: 119.558µs (requests: 35398)
connections: 7202 latency: 262.22µs (requests: 6035) <-- here
connections: 7453 latency: 1.919001617s (requests: 25186)
connections: 7685 latency: 55.58679ms (requests: 31844)
connections: 7939 latency: 37.682556ms (requests: 30672)
connections: 8158 latency: 39.397319ms (requests: 34357)
connections: 8347 latency: 74.686898ms (requests: 26174)
connections: 8567 latency: 94.930925ms (requests: 28020)
connections: 8782 latency: 108.449594ms (requests: 29237)
connections: 8953 latency: 106.607694ms (requests: 29934)
```

run 2
```
connections: 5671 latency: 122.542µs (requests: 27920)
connections: 5918 latency: 189.623µs (requests: 29170)
connections: 6165 latency: 137.299µs (requests: 30348)
connections: 6413 latency: 160.631µs (requests: 31650)
connections: 6662 latency: 175.661µs (requests: 32864)
connections: 6706 latency: 237.109µs (requests: 5985) <-- here
connections: 6942 latency: 1.528003839s (requests: 24469)
connections: 7198 latency: 39.526885ms (requests: 31393)
connections: 7424 latency: 48.19922ms (requests: 26490)
connections: 7682 latency: 54.880246ms (requests: 33168)
connections: 7921 latency: 61.12974ms (requests: 27962)
```

run 3
```
connections: 6165 latency: 83.668µs (requests: 30374)
connections: 6413 latency: 79.824µs (requests: 31655)
connections: 6662 latency: 157.149µs (requests: 32870)
connections: 6910 latency: 88.691µs (requests: 34132)
connections: 7159 latency: 93.118µs (requests: 35371)
connections: 7180 latency: 311.564µs (requests: 2984) <-- here
connections: 7412 latency: 1.362051106s (requests: 24037)
connections: 7624 latency: 79.527981ms (requests: 26877)
connections: 7850 latency: 35.147355ms (requests: 32510)
connections: 8089 latency: 17.806359ms (requests: 35870)
connections: 8336 latency: 43.741719ms (requests: 35076)
connections: 8577 latency: 6.058145ms (requests: 41767)
```

run 4
```
connections: 5173 latency: 65.637µs (requests: 25426)
connections: 5422 latency: 71.411µs (requests: 26658)
connections: 5670 latency: 66.32µs (requests: 27903)
connections: 5919 latency: 64.958µs (requests: 29189)
connections: 5947 latency: 136.446µs (requests: 3431) <-- here
connections: 6176 latency: 1.189237636s (requests: 20437)
connections: 6416 latency: 46.963571ms (requests: 28186)
connections: 6640 latency: 45.592822ms (requests: 24123)
connections: 6857 latency: 52.164676ms (requests: 29145)
connections: 7069 latency: 54.679434ms (requests: 25002)
connections: 7259 latency: 58.74997ms (requests: 24658)
connections: 7453 latency: 69.040101ms (requests: 25217)
```

Something seems to be happening at these points that probably causes all requests to synchronize around the same time. Even with a random sleep. And this is with

But weirdly I don't see this same drop if I run everything in the same process with otherwise the same code. If it's all in the same process the latency just gradually increases.
But if I'm right this will only happen if the program is using
I tweaked the runtime to print every time it runs more than 500 timers in a single call to
I've convinced myself that this is due to timer synchronization. With the original program I see delays increase at around 7000 connections on my GNU/Linux laptop. Modifying

I'm retargeting this for 1.15, but happy to hear counter-arguments.
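(As an aside, the synchronization effect is easy to see in isolation. The following is a standalone sketch, not the issue's reproducer: thousands of goroutines sleeping for the same duration all become runnable at the same instant, and the worst observed oversleep grows with the size of that burst.)

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	const (
		goroutines = 9000
		period     = 200 * time.Millisecond
		rounds     = 20
	)

	var (
		mu    sync.Mutex
		worst time.Duration
		wg    sync.WaitGroup
	)

	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < rounds; j++ {
				start := time.Now()
				time.Sleep(period)
				// Oversleep: how long past the requested 200ms this goroutine
				// actually slept; with all timers firing together this grows.
				if over := time.Since(start) - period; over > 0 {
					mu.Lock()
					if over > worst {
						worst = over
					}
					mu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	fmt.Println("worst oversleep:", worst)
}
```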
@ianlancetaylor I agree. I guess the next step is to investigate why the new timer code is so much more likely to synchronize the timers.
One more comment: if I change the timer code to call `ready(arg.(*g), 0, false)`, the program can run a few more connections before collapsing. The difference is that the current version queues up the goroutine whose sleep time has expired to be the next goroutine that the P runs. That is normally the desired behavior. But when a bunch of sleep timers expire in the same run of
#38860 has some CLs which might apply here, including https://golang.org/cl/232298

Didn't do anything for 1.15, rolling on to 1.16.

No action for Go1.16, punting to Go1.17.

I don't think there is anything to do here. Tip is a bit better for me, but times do still increase at some point. I think this is inevitable, and I think that without calling

Closing. Please comment if you disagree.
What version of Go are you using (`go version`)?

go1.13.5 and go1.14beta1
Does this issue reproduce with the latest release?
Yes, the issue started with the latest beta release.
What operating system and processor architecture are you using (`go env`)?

What did you do?
I wrote a simple program that uses the `net` package to use 9000 connections and 9000*2 goroutines (*2 for both sides of the connection) to send requests and responses every 200 milliseconds. Then I measure the average delay it takes for the response to arrive (so a little bit over 200ms expected).

Program: https://gist.github.com/erikdubbelboer/d925c6e271aa0c5c8bb20a6ec16455c5
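(The gist above is the authoritative version. As a self-contained approximation of the shape described — server handler that sleeps 200ms before echoing, one goroutine per side of each connection, and a loop printing the average latency — all names here are illustrative, and opening 9000 loopback connections may require raising the open-file limit:)

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync/atomic"
	"time"
)

const (
	connections   = 9000
	responseDelay = 200 * time.Millisecond
)

// handler echoes each 4-byte request back after responseDelay.
func handler(c net.Conn) {
	buf := make([]byte, 4)
	for {
		if _, err := io.ReadFull(c, buf); err != nil {
			return
		}
		time.Sleep(responseDelay)
		if _, err := c.Write(buf); err != nil {
			return
		}
	}
}

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return
			}
			go handler(c) // one goroutine per server side of a connection
		}
	}()

	var totalNanos, count int64
	for i := 0; i < connections; i++ {
		go func() { // one goroutine per client side of a connection
			c, err := net.Dial("tcp", ln.Addr().String())
			if err != nil {
				panic(err)
			}
			buf := make([]byte, 4)
			for {
				start := time.Now()
				if _, err := c.Write(buf); err != nil {
					panic(err)
				}
				if _, err := io.ReadFull(c, buf); err != nil {
					panic(err)
				}
				atomic.AddInt64(&totalNanos, int64(time.Since(start)))
				atomic.AddInt64(&count, 1)
			}
		}()
	}

	// Print the average round-trip latency once per second.
	for {
		time.Sleep(time.Second)
		n := atomic.SwapInt64(&count, 0)
		t := atomic.SwapInt64(&totalNanos, 0)
		if n > 0 {
			fmt.Println("avg latency:", time.Duration(t/n))
		}
	}
}
```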
What did you expect to see?

On go1.13.5 the average response time is around 206ms. So 6ms more than the expected delay.

What did you see instead?

On go1.14beta1 the average response time is 300ms. That's 94ms extra delay just for upgrading Go.

The program only consumes about half of the CPUs I have, so it's not a CPU bottleneck.
I have tested the exact same thing on an Ubuntu machine and there the average response time is 200ms (rounded down; it's only a couple of microseconds of delay) for both versions of Go.