
net: SetDeadline performance is poor #25729

Closed
sandyskies opened this issue Jun 5, 2018 · 31 comments

@sandyskies
commented Jun 5, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.10.2 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/svn/jessemjchen/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/svn/jessemjchen"
GORACE=""
GOROOT="/home/svn/go"
GOTMPDIR=""
GOTOOLDIR="/home/svn/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build019705885=/tmp/go-build"

What did you do?

I wrote a server in Go, and I use SetReadDeadline/SetWriteDeadline to set timeouts. But when I benchmark it and collect a CPU profile with pprof, I see poor performance because of these two functions.
[image: CPU profile screenshot]

What did you expect to see?

SetReadDeadline/SetWriteDeadline occupy less cpu time.

What did you see instead?

SetReadDeadline/SetWriteDeadline occupy lots of cpu time.

@ianlancetaylor
Contributor
commented Jun 5, 2018

Please show us the whole profile and/or show us the code so that we can recreate the problem ourselves. Thanks.

@sandyskies
Author
commented Jun 5, 2018

@ianlancetaylor Can I upload the svg file?

@davecheney
Contributor
commented Jun 5, 2018 (comment minimized)

@valyala
Contributor
commented Jun 5, 2018

@sandyskies, could you build and profile the same program under Go 1.9? This may be related to #15133.

The svg file shows the program spends 15% of CPU time in a send syscall and 20% of CPU time in conn.SetWriteDeadline. The send syscall is usually fast, so it's usually OK if its time share is comparable to conn.SetWriteDeadline's time share.

It looks like the program calls tars/transport.ConnectHandler.send too frequently (hundreds of thousands or even a million calls per second). It would probably be better to optimize the program by reducing the frequency of tars/transport.ConnectHandler.send calls; for instance, process requests in batches. Additionally, conn.SetWriteDeadline may be called on each connection every timeout / 10 seconds instead of before each send syscall. But even if conn.SetWriteDeadline overhead is completely eliminated, the program will run faster only by 1/(1-0.2) = 1.25 times.

@sandyskies
Author
commented Jun 5, 2018

@valyala Thanks for the answer. I ran it under Go 1.9 and got worse performance because of the multi-CPU timer problem in the issue you mentioned, so I had to upgrade to 1.10. As I mentioned, I am running a benchmark with a server and client model: the client sends some data to the server, and the server simply sends it back. transport.ConnectHandler.send is the function the server uses to send the data back to the client. Because these are short-lived connections, I have to use SetDeadline on every socket/connection. That is what the benchmark exercises, so I should not change the program logic.

@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 Jun 19, 2018

@davecheney
Contributor
commented Oct 30, 2018

Hello,

I had an opportunity to spend some time with the OP last week and I believe the following benchmark reproduces the issue they are seeing.

package pthreadwait

import (
        "io"
        "net"
        "testing"
        "time"
)

const BUFSIZ = 4096

func BenchmarkSetDeadline(b *testing.B) {
        l, err := net.Listen("tcp", "127.0.0.1:0")
        check(b, err)
        defer l.Close()
        go func() {
                c, err := l.Accept()
                check(b, err)
                _, err = io.Copy(c, c)
                check(b, err)
        }()
        c, err := net.Dial("tcp", l.Addr().String())
        check(b, err)

        b.ReportAllocs()
        b.SetBytes(BUFSIZ)
        b.ResetTimer()

        var buf [BUFSIZ]byte
        deadline := 1 * time.Second
        for i := 0; i < b.N; i++ {
                c.SetWriteDeadline(time.Now().Add(deadline))
                _, err := c.Write(buf[:])
                check(b, err)
                c.SetReadDeadline(time.Now().Add(deadline))
                _, err = c.Read(buf[:])
                check(b, err)
                deadline += 1 * time.Second
        }
}

func check(tb testing.TB, err error) {
        tb.Helper()
        if err != nil {
                tb.Fatal(err)
        }
}

On this laptop I see 18.95% of the time spent in pthread_cond_wait

zapf(~/src/pthreadwait) % go test -bench=.  -benchtime=5s -cpuprofile=c.p
goos: darwin
goarch: amd64
pkg: pthreadwait
BenchmarkSetDeadline-4            200000             37122 ns/op         110.34 MB/s         320 B/op          4 allocs/op
PASS
ok      pthreadwait     8.021s
zapf(~/src/pthreadwait) % go tool pprof -png c.p
Generating report in profile001.png

[profile001.png: CPU profile graph]

% go version
go version devel +2e9f0817f0 Tue Oct 30 04:39:53 2018 +0000 darwin/amd64

From the spelunking the OP and I did, my interpretation of the trace is that there is thrashing between the goroutine that owns the timer and the main goroutine, which needs to wake the former so it can wrest control of the lock over the timer wheel.

@davecheney
Contributor
commented Oct 31, 2018

Some results on a different laptop

(~/src/pthreadtest) % go1.9 test -bench=. -cpu=1,2,4,8 -benchtime=5s
goos: darwin
goarch: amd64
pkg: pthreadtest
BenchmarkSetDeadline              200000             35134 ns/op         116.58 MB/s         288 B/op          4 allocs/op
BenchmarkSetDeadline-2            200000             34476 ns/op         118.80 MB/s         288 B/op          4 allocs/op
BenchmarkSetDeadline-4            200000             30776 ns/op         133.09 MB/s         288 B/op          4 allocs/op
BenchmarkSetDeadline-8            200000             30638 ns/op         133.69 MB/s         288 B/op          4 allocs/op
PASS
ok      pthreadtest     27.563s
(~/src/pthreadtest) % go1.11 test -bench=. -cpu=1,2,4,8 -benchtime=5s
goos: darwin
goarch: amd64
pkg: pthreadtest
BenchmarkSetDeadline              200000             37773 ns/op         108.44 MB/s         320 B/op          4 allocs/op
BenchmarkSetDeadline-2            200000             37212 ns/op         110.07 MB/s         320 B/op          4 allocs/op
BenchmarkSetDeadline-4            200000             33654 ns/op         121.71 MB/s         320 B/op          4 allocs/op
BenchmarkSetDeadline-8            200000             33783 ns/op         121.24 MB/s         320 B/op          4 allocs/op
PASS
ok      pthreadtest     29.961s

/cc @ianlancetaylor @dvyukov

@dvyukov
Member
commented Oct 31, 2018

A bunch of things to improve here. I am on it.

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146342 mentions this issue: runtime: mark poll_runtimeNano and time_runtimeNano as nosplit

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146340 mentions this issue: runtime, time: refactor startNano handling

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146345 mentions this issue: runtime: use StorepNoWB instead of atomicstorep in netpoll

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146339 mentions this issue: runtime: add and use modtimer in netpoll

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146343 mentions this issue: runtime: execute memory barrier conditionally when changing netpoll timers

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146337 mentions this issue: runtime: don't wake timeproc needlessly

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146341 mentions this issue: time: speed up Since and Until

@gopherbot
commented Oct 31, 2018

Change https://golang.org/cl/146338 mentions this issue: runtime: don't recreate netpoll timers if they don't change

@bradfitz bradfitz changed the title from “net: SetDeadline perform a poor performance” to “net: SetDeadline performance is poor” Oct 31, 2018

@dvyukov
Member
commented Oct 31, 2018

@sandyskies could you post the full pprof profile? Or at least the netpoll part. You stripped the most important info. We see that lots of time is spent in descendants of setDeadlineImpl, but we don't see where exactly. setDeadlineImpl itself does not consume any significant time.

@dvyukov
Member
commented Oct 31, 2018

@davecheney for your benchmark, with the whole series applied, I got:

name           old time/op  new time/op  delta
SetDeadline    34.6µs ± 1%  20.4µs ± 1%  -41.11%  (p=0.008 n=5+5)
SetDeadline-6  24.2µs ± 1%  21.1µs ± 0%  -12.96%  (p=0.008 n=5+5)

and for the standard net conn benchmark and SetDeadline stress:

name                  old time/op  new time/op  delta
TCP4OneShotTimeout    99.0µs ± 2%  87.9µs ± 0%  -11.20%  (p=0.008 n=5+5)
TCP4OneShotTimeout-6  18.6µs ± 1%  17.0µs ± 0%   -8.65%  (p=0.008 n=5+5)
SetReadDeadline        320ns ± 0%   204ns ± 1%  -36.14%  (p=0.016 n=4+5)
SetReadDeadline-6      562ns ± 5%   205ns ± 1%  -63.50%  (p=0.008 n=5+5)
@sandyskies (comment minimized)

@dvyukov
Member
commented Nov 1, 2018

@sandyskies The file seems to be corrupted, it gives me:

This page contains the following errors:
error on line 44 at column 89: Specification mandates value for attribute data-pjax-transient
Below is a rendering of the page up to the first error.
test/profile003.svg at master · sandyskies/test
@sandyskies
Author
commented Nov 1, 2018

@dvyukov I tried showing it as blame on GitHub, saved it as an svg file, and opened it with Internet Explorer; it works fine.

@dvyukov
Member
commented Nov 1, 2018

Whatever I try to open it with, it fails.
Chrome fails with:

This page contains the following errors:
error on line 44 at column 89: Specification mandates value for attribute data-pjax-transient
Below is a rendering of the page up to the first error.
test/profile003.svg at master · sandyskies/test

Firefox fails with:


XML Parsing Error: not well-formed
Location: file:///tmp/profile003.svg
Line Number 44, Column 89:  <meta name="request-id" content="9D4D:44ED:488F43:7E3ACF:5BDADD61" data-pjax-transient>
----------------------------------------------------------------------------------------^

ovenmitts fails with:

Error domain 1 code 41 on line 44 column 89 of file:///tmp/profile003.svg: Specification mandate value for attribute data-pjax-transient

and github webpage fails too:
https://github.com/sandyskies/test/blob/master/profile003.svg

@davecheney
Contributor
commented Nov 1, 2018 (comment minimized)

@sandyskies
Author
commented Nov 2, 2018

I think the reason is that git converts the svg file automatically. I put it into a zip file; please check again. @davecheney @dvyukov
profile003.zip

@dvyukov
Member
commented Nov 2, 2018

This works.
So a large fraction of addTimerLocked goes to notewakeup, and there is also lots of contention on the timers lock from timerproc. So my patch series can potentially help. You can already test the patch series.

gopherbot pushed a commit that referenced this issue Nov 2, 2018

runtime: don't wake timeproc needlessly
It's not always necessary to wake timerproc even if we add
a new timer to the top of the heap. Since we don't wake and
reset timerproc when we remove timers, it can still be sleeping
with a shorter timeout. In such a case it's more profitable to let it
sleep and then update the timeout when it wakes on its own, rather than
proactively wake it, let it update the timeout, and go to sleep again.

name                  old time/op  new time/op  delta
TCP4OneShotTimeout-6  18.6µs ± 1%  17.2µs ± 0%   -7.66%  (p=0.008 n=5+5)
SetReadDeadline-6      562ns ± 5%   319ns ± 1%  -43.27%  (p=0.008 n=5+5)

Update #25729

Change-Id: Iec8eacb8563dbc574a82358b3bac7ac479c16826
Reviewed-on: https://go-review.googlesource.com/c/146337
Reviewed-by: Ian Lance Taylor <iant@golang.org>

gopherbot pushed a commit that referenced this issue Nov 2, 2018

runtime: don't recreate netpoll timers if they don't change
Currently we always delete both read and write timers and then
add them again. However, if the user sets the read and write
deadlines separately, then we don't need to touch the other one.

name                  old time/op  new time/op  delta
TCP4OneShotTimeout-6  17.2µs ± 0%  17.2µs ± 0%     ~     (p=0.310 n=5+5)
SetReadDeadline-6      319ns ± 1%   274ns ± 2%  -13.94%  (p=0.008 n=5+5)

Update #25729

Change-Id: I4c869c3083521de6d0cd6ca99a7609d4dd84b4e4
Reviewed-on: https://go-review.googlesource.com/c/146338
Reviewed-by: Ian Lance Taylor <iant@golang.org>

gopherbot pushed a commit that referenced this issue Nov 2, 2018

runtime: add and use modtimer in netpoll
Currently, when a netpoll deadline is incrementally prolonged,
we delete and re-add the timer each time.
Add a modtimer function that does both, and use it when we need
to modify an existing netpoll timer, to avoid unnecessary lock/unlock.

TCP4OneShotTimeout-6  17.2µs ± 0%  17.0µs ± 0%  -0.82%  (p=0.008 n=5+5)
SetReadDeadline-6      274ns ± 2%   261ns ± 0%  -4.89%  (p=0.008 n=5+5)

Update #25729

Change-Id: I08b89dbbc1785dd180e967a37b0aa23b0c4613a8
Reviewed-on: https://go-review.googlesource.com/c/146339
Reviewed-by: Ian Lance Taylor <iant@golang.org>

gopherbot pushed a commit that referenced this issue Nov 2, 2018

runtime, time: refactor startNano handling
Move startNano from runtime to time package.
In preparation for a subsequent change that speeds up Since and Until.
This also makes the code simpler: we have less assembly as a result,
and monotonic time handling is better localized in the time package.
This changes the values returned from nanotime on windows
(it does not account for startNano anymore); current comments state
that this is important, but it's unclear how it can be important
since no other OS does this.

Update #25729

Change-Id: I2275d57b7b5ed8fd0d53eb0f19d55a86136cc555
Reviewed-on: https://go-review.googlesource.com/c/146340
Reviewed-by: Ian Lance Taylor <iant@golang.org>

gopherbot pushed a commit that referenced this issue Nov 2, 2018

time: speed up Since and Until
time.now is somewhat expensive (much more expensive than nanotime).
In the common case when a Time has a monotonic reading, we don't
actually need to call time.now in Since/Until, as we can do the
calculation based purely on monotonic times.

name                  old time/op  new time/op  delta
TCP4OneShotTimeout-6  17.0µs ± 0%  17.1µs ± 1%     ~     (p=0.151 n=5+5)
SetReadDeadline-6      261ns ± 0%   234ns ± 1%  -10.35%  (p=0.008 n=5+5)

Benchmark that only calls Until:

benchmark            old ns/op     new ns/op     delta
BenchmarkUntil       54.0          29.5          -45.37%

Update #25729

Change-Id: I5ac5af3eb1fe9f583cf79299f10b84501b1a0d7d
Reviewed-on: https://go-review.googlesource.com/c/146341
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

gopherbot pushed a commit that referenced this issue Nov 2, 2018

runtime: move nanotime wrappers to time and poll packages
The nanotime wrappers in runtime introduce a bunch
of unnecessary code onto hot paths, e.g.:

0000000000449d70 <time.runtimeNano>:
  449d70:       64 48 8b 0c 25 f8 ff    mov    %fs:0xfffffffffffffff8,%rcx
  449d77:       ff ff
  449d79:       48 3b 61 10             cmp    0x10(%rcx),%rsp
  449d7d:       76 26                   jbe    449da5 <time.runtimeNano+0x35>
  449d7f:       48 83 ec 10             sub    $0x10,%rsp
  449d83:       48 89 6c 24 08          mov    %rbp,0x8(%rsp)
  449d88:       48 8d 6c 24 08          lea    0x8(%rsp),%rbp
  449d8d:       e8 ae 18 01 00          callq  45b640 <runtime.nanotime>
  449d92:       48 8b 04 24             mov    (%rsp),%rax
  449d96:       48 89 44 24 18          mov    %rax,0x18(%rsp)
  449d9b:       48 8b 6c 24 08          mov    0x8(%rsp),%rbp
  449da0:       48 83 c4 10             add    $0x10,%rsp
  449da4:       c3                      retq
  449da5:       e8 56 e0 00 00          callq  457e00 <runtime.morestack_noctxt>
  449daa:       eb c4                   jmp    449d70 <time.runtimeNano>

Move them to the corresponding packages which eliminates all of this.

name                  old time/op  new time/op  delta
TCP4OneShotTimeout-6  17.1µs ± 1%  17.0µs ± 0%  -0.66%  (p=0.032 n=5+5)
SetReadDeadline-6      234ns ± 1%   232ns ± 0%  -0.77%  (p=0.016 n=5+4)

Update #25729

Change-Id: Iee05027adcdc289ba895c5f5a37f154e451bc862
Reviewed-on: https://go-review.googlesource.com/c/146342
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

gopherbot pushed a commit that referenced this issue Nov 2, 2018

runtime: execute memory barrier conditionally when changing netpoll timers

We need the memory barrier in poll_runtime_pollSetDeadline only
when one of the timers has fired, which is not the expected case.
A memory barrier can be somewhat expensive on some archs,
so execute it only if one of the timers has in fact fired.

name                  old time/op  new time/op  delta
TCP4OneShotTimeout-6  17.0µs ± 0%  17.1µs ± 0%  +0.35%  (p=0.032 n=5+5)
SetReadDeadline-6      232ns ± 0%   230ns ± 0%  -1.03%  (p=0.000 n=4+5)

Update #25729

Change-Id: Ifce6f505b9e7ba3717bad8f454077a2e94ea6e75
Reviewed-on: https://go-review.googlesource.com/c/146343
Reviewed-by: Ian Lance Taylor <iant@golang.org>

gopherbot pushed a commit that referenced this issue Nov 2, 2018

runtime: use StorepNoWB instead of atomicstorep in netpoll
We only need the memory barrier from these stores,
and we only store nil over nil or over a static function value.
The write barrier is unnecessary.

name                  old time/op  new time/op  delta
TCP4OneShotTimeout-6  17.0µs ± 0%  17.0µs ± 0%  -0.43%  (p=0.032 n=5+5)
SetReadDeadline-6      205ns ± 1%   205ns ± 1%    ~     (p=0.683 n=5+5)

Update #25729

Change-Id: I66c097a1db7188697ddfc381f31acec053dfed2c
Reviewed-on: https://go-review.googlesource.com/c/146345
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@davecheney
Contributor
commented Nov 3, 2018

@dvyukov thank you very much, this patch set has had a substantial impact on my benchmark.

(~/src/pthreadtest) % benchstat old.txt new.txt
name           old time/op    new time/op    delta
SetDeadline      37.7µs ± 0%    27.4µs ± 1%  -27.31%  (p=0.000 n=20+20)
SetDeadline-2    37.3µs ± 1%    27.5µs ± 1%  -26.23%  (p=0.000 n=15+20)
SetDeadline-4    33.4µs ± 1%    27.4µs ± 1%  -17.96%  (p=0.000 n=19+20)
SetDeadline-8    33.5µs ± 1%    27.4µs ± 1%  -18.24%  (p=0.000 n=19+10)

name           old speed      new speed      delta
SetDeadline     109MB/s ± 0%   149MB/s ± 1%  +37.57%  (p=0.000 n=20+20)
SetDeadline-2   110MB/s ± 1%   149MB/s ± 1%  +35.56%  (p=0.000 n=15+20)
SetDeadline-4   123MB/s ± 1%   149MB/s ± 1%  +21.88%  (p=0.000 n=19+20)
SetDeadline-8   122MB/s ± 1%   150MB/s ± 1%  +22.31%  (p=0.000 n=19+10)

name           old alloc/op   new alloc/op   delta
SetDeadline        320B ± 0%      320B ± 0%     ~     (all equal)
SetDeadline-2      320B ± 0%      320B ± 0%     ~     (all equal)
SetDeadline-4      320B ± 0%      320B ± 0%     ~     (all equal)
SetDeadline-8      320B ± 0%      320B ± 0%     ~     (all equal)

name           old allocs/op  new allocs/op  delta
SetDeadline        4.00 ± 0%      4.00 ± 0%     ~     (all equal)
SetDeadline-2      4.00 ± 0%      4.00 ± 0%     ~     (all equal)
SetDeadline-4      4.00 ± 0%      4.00 ± 0%     ~     (all equal)
SetDeadline-8      4.00 ± 0%      4.00 ± 0%     ~     (all equal)
@davecheney
Contributor
commented Nov 3, 2018

@dvyukov I've removed the errant allocations in the benchmark (they were coming from the check helper). The updated results are even more promising

(~/src/pthreadtest) % benchstat old.txt new.txt
name           old time/op    new time/op    delta
SetDeadline      34.3µs ± 1%    24.2µs ± 1%  -29.28%  (p=0.000 n=20+20)
SetDeadline-2    34.8µs ± 0%    24.2µs ± 1%  -30.43%  (p=0.000 n=19+20)
SetDeadline-4    29.6µs ± 1%    24.1µs ± 1%  -18.40%  (p=0.000 n=20+20)
SetDeadline-8    29.6µs ± 0%    24.1µs ± 1%  -18.52%  (p=0.000 n=16+20)

name           old speed      new speed      delta
SetDeadline     119MB/s ± 1%   169MB/s ± 1%  +41.40%  (p=0.000 n=20+20)
SetDeadline-2   118MB/s ± 0%   169MB/s ± 1%  +43.74%  (p=0.000 n=19+20)
SetDeadline-4   138MB/s ± 1%   170MB/s ± 1%  +22.55%  (p=0.000 n=20+20)
SetDeadline-8   138MB/s ± 0%   170MB/s ± 1%  +22.72%  (p=0.000 n=16+20)

name           old alloc/op   new alloc/op   delta
SetDeadline       0.00B          0.00B          ~     (all equal)
SetDeadline-2     0.00B          0.00B          ~     (all equal)
SetDeadline-4     0.00B          0.00B          ~     (all equal)
SetDeadline-8     0.00B          0.00B          ~     (all equal)

name           old allocs/op  new allocs/op  delta
SetDeadline        0.00           0.00          ~     (all equal)
SetDeadline-2      0.00           0.00          ~     (all equal)
SetDeadline-4      0.00           0.00          ~     (all equal)
SetDeadline-8      0.00           0.00          ~     (all equal)

Both the throughput and latency improvements mean the results are far less affected by the -cpu values. Nice job!

Before (go 1.11): [profile image]

After (tip): [profile image]

@agnivade
Member
commented Nov 3, 2018

@dvyukov - Did you have anything else in mind for this issue?

@dvyukov
Member
commented Nov 3, 2018

@agnivade No.
@sandyskies Please reopen if this does not help, but from your profile it looked like it should.

@dvyukov dvyukov closed this Nov 3, 2018

@sandyskies
Author
commented Nov 4, 2018

Thank you all!
