New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: make([]byte, n) becomes much slower compared with go 1.11.1 #28479

Open
lni opened this Issue Oct 30, 2018 · 19 comments

Comments

Projects
None yet
8 participants
@lni

lni commented Oct 30, 2018

What version of Go are you using (go version)?

go version devel +2e9f081 Tue Oct 30 04:39:53 2018 +0000 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lni/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/lni/golang_ws"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build682632202=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I tried the devel version of go for both devel +80b8377 and devel +2e9f081, my go program became slower. When checking the benchmarks, I noticed that make([]byte, n) accessed in parallel is much slower when compared to go 1.11.1

package makebenchmark

import (
  "testing"
)

func benchmarkMakeByteSliceInParalell(b *testing.B, sz int) {
  b.RunParallel(func(pb *testing.PB) {
    for pb.Next() {
      m := make([]byte, sz)
      if len(m) != sz {
        b.Errorf("unexpected len")
      }
    }
  })
}

func BenchmarkMakeByteSliceInParalell512(b *testing.B) {
  benchmarkMakeByteSliceInParalell(b, 512)
}

func BenchmarkMakeByteSliceInParalell8(b *testing.B) {
  benchmarkMakeByteSliceInParalell(b, 8)
}

go 1.11.1:

go test -count=1 -v   -benchmem -run ^$ -bench=BenchmarkMakeByteSliceInParalell
goos: linux
goarch: amd64
pkg: github.com/lni/goplayground/makebenchmark
BenchmarkMakeByteSliceInParalell512-40    	10000000	       239 ns/op	     512 B/op	       1 allocs/op
BenchmarkMakeByteSliceInParalell8-40      	300000000	         4.08 ns/op	       8 B/op	       1 allocs/op
PASS

go version
go version go1.11.1 linux/amd64

go version devel +2e9f081 and devel +80b8377 reported similar results

/home/lni/src/go/bin/go test -count=1 -v   -benchmem -run ^$ -bench=BenchmarkMakeByteSliceInParalell
goos: linux
goarch: amd64
pkg: github.com/lni/goplayground/makebenchmark
BenchmarkMakeByteSliceInParalell512-40    	 3000000	       564 ns/op	     512 B/op	       1 allocs/op
BenchmarkMakeByteSliceInParalell8-40      	300000000	         5.09 ns/op	       8 B/op	       1 allocs/op
PASS

/home/lni/src/go/bin/go version
go version devel +2e9f081 Tue Oct 30 04:39:53 2018 +0000 linux/amd64

What did you expect to see?

make([]byte, n) has similar ns/op when compared with 1.11.1

What did you see instead?

make([]byte, n) is much slower

@agnivade

This comment has been minimized.

Member

agnivade commented Oct 30, 2018

A git bisect would be very helpful.

Also, for reference the actual commit links for 2e9f081 and 80b8377.

@agnivade agnivade changed the title from make([]byte, n) becomes much slower compared with go 1.11.1 to cmd/compile: make([]byte, n) becomes much slower compared with go 1.11.1 Oct 30, 2018

@reusee

This comment has been minimized.

reusee commented Oct 30, 2018

Bisect to 07e738e

@lni

This comment has been minimized.

lni commented Oct 30, 2018

note that the ns/op difference is much more obvious when running with large GOMAXPROCS

@mvdan

This comment has been minimized.

Member

mvdan commented Oct 30, 2018

@mvdan mvdan added the Performance label Oct 30, 2018

@mvdan mvdan added this to the Go1.12 milestone Oct 30, 2018

@lni

This comment has been minimized.

lni commented Oct 30, 2018

also noticed that 9108ae7 made make([]byte, 512) slower as well. not sure whether it is a feature or a bug ;)

/home/lni/src/go/bin/go test -bench .
goos: linux
goarch: amd64
pkg: github.com/lni/goplayground/makebenchmark
BenchmarkMakeByteSliceInParalell512-40    	10000000	       241 ns/op
BenchmarkMakeByteSliceInParalell8-40      	500000000	         4.00 ns/op
PASS
ok  	github.com/lni/goplayground/makebenchmark	5.013s

/home/lni/src/go/bin/go version
go version devel +a2a2901 Tue Oct 2 20:35:19 2018 +0000 linux/amd64
/home/lni/src/go/bin/go test -bench .
goos: linux
goarch: amd64
pkg: github.com/lni/goplayground/makebenchmark
BenchmarkMakeByteSliceInParalell512-40    	 5000000	       308 ns/op
BenchmarkMakeByteSliceInParalell8-40      	300000000	         4.07 ns/op
PASS
ok  	github.com/lni/goplayground/makebenchmark	3.476s

/home/lni/src/go/bin/go version
go version devel +9108ae7 Tue Oct 2 20:35:20 2018 +0000 linux/amd64
@agnivade

This comment has been minimized.

Member

agnivade commented Oct 30, 2018

Thank you @reusee for the bisect !

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Oct 30, 2018

@martisch

This comment has been minimized.

Member

martisch commented Oct 30, 2018

There were also some changes to runtime.makeslice. Modifying the benchmark to do no allocation in runtime.mallocgc shows the expected speed-up between go1.11.1 and tip. So its likely the slowdown to originates in runtime.mallocgc or below.

var LenVar = 0

func BenchmarkMakeSliceCallInParalell(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			m := make([]byte, LenVar)
			if len(m) != LenVar {
				b.Errorf("unexpected len")
			}
		}
	})
}
name                       old time/op  new time/op  delta
MakeSliceCallInParalell-4  3.50ns ± 5%  3.24ns ±10%  -7.35%  (p=0.000 n=17+20)
@martisch

This comment has been minimized.

Member

martisch commented Oct 30, 2018

running the existing runtime MakeSlice benchmarks go1.11.1 vs tip:

MakeSlice/Byte-4       16.7ns ± 1%  15.7ns ± 2%   -5.93%  (p=0.000 n=10+10)
MakeSlice/Int16-4      19.1ns ± 2%  18.4ns ± 1%   -3.51%  (p=0.000 n=10+10)
MakeSlice/Int-4        25.9ns ± 1%  26.3ns ± 1%   +1.54%  (p=0.000 n=10+9)
MakeSlice/Ptr-4        64.1ns ± 2%  66.0ns ± 1%   +3.04%  (p=0.000 n=9+9)
MakeSlice/Struct/24-4  36.4ns ± 2%  35.7ns ± 5%     ~     (p=0.085 n=10+10)
MakeSlice/Struct/32-4  43.5ns ± 2%  39.7ns ± 1%   -8.75%  (p=0.000 n=10+10)
MakeSlice/Struct/40-4  56.5ns ± 1%  45.5ns ± 1%  -19.47%  (p=0.000 n=10+9)

so the slow down may only happen for larger or parallel allocations.

@martisch

This comment has been minimized.

Member

martisch commented Oct 30, 2018

only see a smaller slow down for 512byte and speedup for 8byte on my laptop 2-core/4-thread for go1.11.1 vs go tip. So it is likely easier detectable with higher thread counts:

go1.11.1 vs go tip:

name                          old time/op    new time/op    delta
MakeByteSliceInParalell512-4     101ns ± 4%     118ns ± 7%  +16.54%  (p=0.000 n=19+19)
MakeByteSliceInParalell8-4      11.0ns ±18%    10.4ns ±15%   -5.50%  (p=0.003 n=18+18)

go1.11.1 vs 2e9f081 (commit before 9108ae7):

name                          old time/op    new time/op    delta
MakeByteSliceInParalell512-4     101ns ± 4%      97ns ± 4%  -3.78%  (p=0.000 n=19+20)
MakeByteSliceInParalell8-4      11.0ns ±18%    11.5ns ±28%    ~     (p=0.394 n=18+19)
@mknyszek

This comment has been minimized.

Contributor

mknyszek commented Oct 30, 2018

As reusee@ sleuthed out, it's probably my commit (07e738e). Thank you everyone for looking into this.

I took some time to dig in and understand exactly what was going on. This benchmark exercises the case where we're allocating size-segregated spans with object size 512, which we then free, and re-allocate from. At first, I thought that the regression might have been the result of span allocation having to traverse the treap, instead of walking over a fixed-size array. But, CPU profiling shows that it's more likely a result of freeing spans. Before my change, freeing these size-segregated spans was a very fast operation: indexing into an array followed by a linked list insertion at head. Now, one needs to traverse a treap, allocate a treap node out of a SLAB, and rebalance it. While all of these operations should be reasonably fast, it's definitely more work than what was being done before. I think the regression in performance we see as a result of a higher GOMAXPROCS arises out of the fact that the heap lock is held when a span is freed, and my change causes this benchmark to spend more time holding the heap lock.

So, for a microbenchmark like this which really hammers on the GC and the allocator, it makes sense to me why my change had such an impact. Running other benchmarks (such as the go1 benchmarks, or the garbage benchmark, whose numbers are in my change's commit message) the actual performance hit isn't quite so dramatic. I quickly ran github.com/dr2chase/bent (which runs a number of benchmarks from actual Go programs in the wild) with my change and without it and I haven't yet seen any significant change there either. However, I will run those benchmarks with more iterations and report back with actual numbers.

The motivation behind my change was that it really simplified a number of changes I just landed today for #14045. I'm erring on the side that it was worth it, as it seems as though in practice the performance impact of my change appears to be relatively small.

@mknyszek

This comment has been minimized.

Contributor

mknyszek commented Oct 31, 2018

Here are the numbers I promised: https://perf.golang.org/search?q=upload:20181031.2

The "before" is e508a5f072 (my commit's parent) and the "after" is 07e738ec32 (my commit).

@lni

This comment has been minimized.

lni commented Nov 1, 2018

@mknyszek , thanks for the above benchmark results and the explanation on 07e738e

I had a quick look at those numbers, it seems that they were probably generated using small GOMAXPROCS value (8)?

Below is the throughput results (in million ops/sec) of my library, I tried 4 different go versions and report the results of three runs. Sadly the library is not open sourced yet so I fully understand that these numbers are probably not very convincing/useful - just another data point to demonstrate the problem I am facing.

84b96c2
6.18, 6.12, 6.14

07e738e
6.26, 6.21, 6.23

e508a5f
6.66, 6.46, 6.48

go 1.11.1
6.70, 6.57, 6.54

Given that freeing spans are now more expensive, I am trying to further reduce my allocations. It is not always easy as some of them are caused in code auto generated by 3rd party libraries, e.g. Colfer in my case. I just have the feeling that now users have to pay more attention to all allocations.

@mknyszek

This comment has been minimized.

Contributor

mknyszek commented Nov 2, 2018

Yeah GOMAXPROCS was at 8, and that's a good point. I've re-run the benchmarks with GOMAXPROCS=32. https://perf.golang.org/search?q=upload:20181102.1 The performance does indeed degrade more than with 8, but not by that much.

Thanks for sharing some of your numbers, and for letting me know it's impacting your application/library directly. Would you be willing to share a GC trace for each of the 4 revisions? (GODEBUG=gctrace=1)

@lni

This comment has been minimized.

lni commented Nov 3, 2018

@mknyszek , thanks for the updated benchmark results. yes, I agree that the differences are not huge.

GC traces you mentioned are posted below. Please feel free to let me know if there is any other trace/stats that can help to analyze the issue.

84b96c2

gc 1 @0.060s 0%: 0.009+1.2+0.021 ms clock, 0.36+0.21/2.2/1.0+0.87 ms cpu, 4->5->4 MB, 5 MB goal, 40 P
gc 2 @0.062s 0%: 0.018+1.1+0.033 ms clock, 0.75+0.21/2.3/0.94+1.3 ms cpu, 7->8->8 MB, 9 MB goal, 40 P
gc 3 @0.066s 0%: 0.014+1.1+0.032 ms clock, 0.57+0.19/3.3/0.62+1.2 ms cpu, 14->15->15 MB, 16 MB goal, 40 P
gc 4 @0.070s 0%: 0.012+1.7+0.015 ms clock, 0.50+0.21/4.0/1.0+0.63 ms cpu, 27->29->28 MB, 30 MB goal, 40 P
gc 5 @0.080s 0%: 0.013+1.5+0.043 ms clock, 0.53+0.30/8.8/1.5+1.7 ms cpu, 53->55->55 MB, 57 MB goal, 40 P
gc 6 @0.087s 2%: 1.0+4.5+0.020 ms clock, 40+4.1/29/0.001+0.83 ms cpu, 134->199->198 MB, 135 MB goal, 40 P
gc 7 @0.097s 4%: 1.6+2.1+0.013 ms clock, 64+1.4/12/9.5+0.54 ms cpu, 455->583->582 MB, 584 MB goal, 40 P
gc 8 @0.292s 1%: 0.013+2.5+0.025 ms clock, 0.53+0.18/10/1.5+1.0 ms cpu, 583->584->583 MB, 1165 MB goal, 40 P
gc 9 @9.513s 0%: 0.12+11+0.059 ms clock, 5.1+20/107/2.5+2.3 ms cpu, 874->882->735 MB, 1167 MB goal, 40 P
gc 10 @9.925s 0%: 0.22+17+0.37 ms clock, 9.1+23/125/0.67+15 ms cpu, 1281->1295->776 MB, 1470 MB goal, 40 P
gc 11 @10.209s 0%: 0.11+17+0.17 ms clock, 4.6+36/142/9.9+6.8 ms cpu, 1447->1471->806 MB, 1553 MB goal, 40 P
gc 12 @10.460s 0%: 0.078+18+0.42 ms clock, 3.1+55/140/1.7+17 ms cpu, 1544->1558->815 MB, 1612 MB goal, 40 P
gc 13 @10.693s 0%: 0.15+12+0.056 ms clock, 6.2+135/116/17+2.2 ms cpu, 1588->1598->823 MB, 1630 MB goal, 40 P
gc 14 @10.898s 0%: 0.16+13+0.023 ms clock, 6.4+114/120/13+0.94 ms cpu, 1605->1615->829 MB, 1646 MB goal, 40 P
gc 15 @11.111s 0%: 0.11+12+0.023 ms clock, 4.4+158/111/4.5+0.95 ms cpu, 1617->1627->831 MB, 1659 MB goal, 40 P
gc 16 @11.313s 0%: 0.058+10+0.075 ms clock, 2.3+137/92/39+3.0 ms cpu, 1621->1630->836 MB, 1662 MB goal, 40 P
gc 17 @11.519s 0%: 0.10+13+0.071 ms clock, 4.1+170/120/2.6+2.8 ms cpu, 1630->1641->844 MB, 1672 MB goal, 40 P
gc 18 @11.735s 0%: 0.14+13+0.037 ms clock, 5.9+138/123/13+1.4 ms cpu, 1646->1657->846 MB, 1688 MB goal, 40 P
gc 19 @11.980s 0%: 0.18+11+0.079 ms clock, 7.3+164/107/1.1+3.1 ms cpu, 1650->1661->859 MB, 1692 MB goal, 40 P
gc 20 @12.236s 0%: 1.0+15+0.022 ms clock, 40+116/121/39+0.91 ms cpu, 1676->1686->869 MB, 1718 MB goal, 40 P
gc 21 @12.483s 0%: 0.15+16+0.040 ms clock, 6.0+129/142/3.3+1.6 ms cpu, 1696->1707->876 MB, 1739 MB goal, 40 P
gc 22 @12.694s 0%: 0.15+15+0.089 ms clock, 6.0+109/108/37+3.5 ms cpu, 1710->1721->866 MB, 1753 MB goal, 40 P
gc 23 @12.915s 0%: 0.10+11+0.083 ms clock, 4.1+160/99/21+3.3 ms cpu, 1689->1699->870 MB, 1732 MB goal, 40 P
gc 24 @13.134s 0%: 0.15+10+0.088 ms clock, 6.3+162/97/10+3.5 ms cpu, 1697->1708->861 MB, 1741 MB goal, 40 P
gc 25 @13.346s 0%: 0.20+9.8+0.029 ms clock, 8.1+181/93/1.3+1.1 ms cpu, 1680->1691->871 MB, 1723 MB goal, 40 P
gc 26 @13.554s 0%: 0.12+9.9+0.030 ms clock, 4.9+171/93/2.8+1.2 ms cpu, 1699->1710->878 MB, 1743 MB goal, 40 P
gc 27 @13.771s 0%: 0.13+10+0.033 ms clock, 5.4+165/95/5.3+1.3 ms cpu, 1714->1725->883 MB, 1757 MB goal, 40 P
gc 28 @13.988s 0%: 0.16+10+0.072 ms clock, 6.5+171/96/3.4+2.9 ms cpu, 1723->1733->881 MB, 1766 MB goal, 40 P
gc 29 @14.203s 0%: 0.13+10+0.084 ms clock, 5.4+199/98/0+3.3 ms cpu, 1720->1730->887 MB, 1763 MB goal, 40 P
gc 30 @14.417s 0%: 0.13+11+0.027 ms clock, 5.2+140/105/9.0+1.1 ms cpu, 1731->1742->888 MB, 1775 MB goal, 40 P
gc 31 @14.633s 1%: 0.15+10+0.029 ms clock, 6.3+176/99/2.7+1.1 ms cpu, 1732->1743->895 MB, 1776 MB goal, 40 P
gc 32 @14.848s 1%: 0.11+10+0.075 ms clock, 4.5+159/98/0.66+3.0 ms cpu, 1745->1755->880 MB, 1790 MB goal, 40 P
gc 33 @15.063s 1%: 0.12+11+0.058 ms clock, 5.1+152/98/11+2.3 ms cpu, 1717->1728->890 MB, 1761 MB goal, 40 P
gc 34 @15.278s 1%: 0.10+10+0.054 ms clock, 4.0+190/93/3.8+2.1 ms cpu, 1736->1746->889 MB, 1781 MB goal, 40 P
gc 35 @15.495s 1%: 0.11+11+0.065 ms clock, 4.4+174/95/5.8+2.6 ms cpu, 1735->1746->884 MB, 1779 MB goal, 40 P
gc 36 @15.713s 1%: 0.17+10+0.082 ms clock, 6.9+191/94/3.7+3.2 ms cpu, 1725->1736->889 MB, 1769 MB goal, 40 P
gc 37 @15.928s 1%: 0.093+9.4+0.081 ms clock, 3.7+188/91/0.19+3.2 ms cpu, 1735->1745->878 MB, 1779 MB goal, 40 P
gc 38 @16.141s 1%: 0.11+10+0.18 ms clock, 4.7+185/95/0.94+7.4 ms cpu, 1713->1724->862 MB, 1757 MB goal, 40 P
gc 39 @16.353s 1%: 0.11+10+0.076 ms clock, 4.4+183/98/2.0+3.0 ms cpu, 1682->1693->856 MB, 1725 MB goal, 40 P
gc 40 @16.564s 1%: 0.083+10+0.078 ms clock, 3.3+184/93/2.7+3.1 ms cpu, 1670->1680->860 MB, 1712 MB goal, 40 P
gc 41 @16.771s 1%: 0.14+12+0.030 ms clock, 5.8+175/96/6.0+1.2 ms cpu, 1679->1690->862 MB, 1721 MB goal, 40 P
gc 42 @16.975s 1%: 0.11+11+0.069 ms clock, 4.4+150/106/7.1+2.7 ms cpu, 1682->1692->868 MB, 1724 MB goal, 40 P
gc 43 @17.185s 1%: 0.18+11+0.026 ms clock, 7.4+167/101/6.9+1.0 ms cpu, 1693->1704->869 MB, 1736 MB goal, 40 P
gc 44 @17.401s 1%: 0.15+10+0.080 ms clock, 6.0+149/97/16+3.2 ms cpu, 1696->1708->876 MB, 1739 MB goal, 40 P
gc 45 @17.620s 1%: 0.17+10+0.077 ms clock, 7.1+186/95/6.4+3.1 ms cpu, 1710->1721->883 MB, 1753 MB goal, 40 P
gc 46 @17.836s 1%: 0.11+11+0.075 ms clock, 4.4+152/101/8.1+3.0 ms cpu, 1722->1733->885 MB, 1766 MB goal, 40 P
gc 47 @18.051s 1%: 0.099+10+0.019 ms clock, 3.9+129/98/30+0.79 ms cpu, 1726->1736->887 MB, 1770 MB goal, 40 P
gc 48 @18.267s 1%: 0.11+10+0.064 ms clock, 4.4+165/99/7.8+2.5 ms cpu, 1730->1740->892 MB, 1774 MB goal, 40 P
gc 49 @18.484s 1%: 0.13+10+0.079 ms clock, 5.3+152/98/14+3.1 ms cpu, 1740->1751->896 MB, 1784 MB goal, 40 P
gc 50 @18.704s 1%: 0.15+10+0.056 ms clock, 6.2+187/93/1.8+2.2 ms cpu, 1747->1758->895 MB, 1792 MB goal, 40 P
gc 51 @18.925s 1%: 0.12+11+0.093 ms clock, 4.9+171/98/5.2+3.7 ms cpu, 1747->1758->900 MB, 1791 MB goal, 40 P
gc 52 @19.144s 1%: 0.072+10+0.026 ms clock, 2.8+177/97/1.3+1.0 ms cpu, 1756->1766->893 MB, 1800 MB goal, 40 P
gc 53 @19.361s 1%: 0.054+10+0.086 ms clock, 2.1+185/96/1.5+3.4 ms cpu, 1742->1753->894 MB, 1786 MB goal, 40 P
gc 54 @19.581s 1%: 0.14+10+0.027 ms clock, 5.6+180/92/1.7+1.1 ms cpu, 1743->1754->893 MB, 1788 MB goal, 40 P
gc 55 @19.802s 1%: 0.16+10+0.19 ms clock, 6.4+181/98/2.7+7.7 ms cpu, 1743->1753->894 MB, 1787 MB goal, 40 P
gc 56 @20.023s 1%: 0.13+13+0.016 ms clock, 5.3+176/98/1.6+0.67 ms cpu, 1745->1756->884 MB, 1789 MB goal, 40 P
gc 57 @20.247s 1%: 0.13+10+0.091 ms clock, 5.4+191/92/0.28+3.6 ms cpu, 1725->1734->879 MB, 1769 MB goal, 40 P
gc 58 @20.466s 1%: 0.14+10+0.019 ms clock, 5.8+182/98/5.0+0.79 ms cpu, 1715->1725->864 MB, 1758 MB goal, 40 P
gc 59 @20.679s 1%: 0.11+10+0.18 ms clock, 4.4+188/94/0.39+7.2 ms cpu, 1685->1696->862 MB, 1728 MB goal, 40 P
gc 60 @20.887s 1%: 0.10+10+0.083 ms clock, 4.3+197/93/0.17+3.3 ms cpu, 1681->1692->866 MB, 1724 MB goal, 40 P
gc 61 @21.098s 1%: 0.14+9.9+0.091 ms clock, 5.9+193/94/0.11+3.6 ms cpu, 1690->1700->866 MB, 1733 MB goal, 40 P
gc 62 @21.300s 1%: 0.12+10+0.067 ms clock, 4.8+157/100/7.7+2.6 ms cpu, 1690->1701->867 MB, 1733 MB goal, 40 P
gc 63 @21.508s 1%: 0.16+10+0.094 ms clock, 6.6+200/95/0.13+3.7 ms cpu, 1691->1701->874 MB, 1734 MB goal, 40 P
gc 64 @21.719s 1%: 0.16+10+0.071 ms clock, 6.5+186/94/0.13+2.8 ms cpu, 1706->1717->875 MB, 1749 MB goal, 40 P
gc 65 @21.936s 1%: 0.14+10+0.059 ms clock, 5.6+194/92/0.42+2.3 ms cpu, 1706->1716->873 MB, 1750 MB goal, 40 P
gc 66 @22.151s 1%: 0.14+10+0.082 ms clock, 5.6+194/94/1.8+3.3 ms cpu, 1704->1714->871 MB, 1747 MB goal, 40 P
gc 67 @22.363s 1%: 0.12+9.7+0.084 ms clock, 4.9+182/91/8.1+3.3 ms cpu, 1700->1710->874 MB, 1743 MB goal, 40 P
gc 68 @22.576s 1%: 0.14+10+0.021 ms clock, 5.8+179/97/2.4+0.85 ms cpu, 1706->1717->872 MB, 1749 MB goal, 40 P
gc 69 @22.791s 1%: 0.13+10+0.066 ms clock, 5.2+203/96/1.1+2.6 ms cpu, 1702->1714->872 MB, 1745 MB goal, 40 P
gc 70 @23.005s 1%: 0.19+10+0.026 ms clock, 7.7+208/97/0.61+1.0 ms cpu, 1702->1713->879 MB, 1745 MB goal, 40 P
gc 71 @23.223s 1%: 0.29+10+0.027 ms clock, 11+193/94/0.39+1.1 ms cpu, 1714->1724->864 MB, 1758 MB goal, 40 P
gc 72 @23.438s 1%: 0.18+10+0.074 ms clock, 7.4+190/95/0.30+2.9 ms cpu, 1685->1695->868 MB, 1728 MB goal, 40 P
gc 73 @23.649s 1%: 0.24+11+0.078 ms clock, 9.8+180/99/0.28+3.1 ms cpu, 1694->1706->875 MB, 1737 MB goal, 40 P
gc 74 @23.860s 1%: 0.14+11+0.094 ms clock, 5.7+156/99/11+3.7 ms cpu, 1706->1717->882 MB, 1750 MB goal, 40 P
gc 75 @24.074s 1%: 0.15+11+0.047 ms clock, 6.0+170/101/5.9+1.9 ms cpu, 1720->1731->888 MB, 1764 MB goal, 40 P
gc 76 @24.292s 1%: 0.11+10+0.18 ms clock, 4.7+203/95/0.10+7.4 ms cpu, 1733->1744->901 MB, 1777 MB goal, 40 P
gc 77 @24.520s 1%: 0.21+11+0.025 ms clock, 8.6+182/101/0.34+1.0 ms cpu, 1758->1770->920 MB, 1803 MB goal, 40 P

07e738e

gc 1 @0.069s 0%: 0.012+1.1+0.020 ms clock, 0.50+0.36/2.0/0.54+0.81 ms cpu, 4->5->4 MB, 5 MB goal, 40 P
gc 2 @0.071s 0%: 0.074+1.0+0.022 ms clock, 2.9+0.25/1.9/0.80+0.91 ms cpu, 6->8->8 MB, 8 MB goal, 40 P
gc 3 @0.074s 0%: 0.013+0.96+0.037 ms clock, 0.55+0.19/3.0/0.79+1.5 ms cpu, 13->14->14 MB, 16 MB goal, 40 P
gc 4 @0.079s 0%: 0.013+1.5+0.031 ms clock, 0.54+0.23/5.1/0.96+1.2 ms cpu, 25->28->28 MB, 29 MB goal, 40 P
gc 5 @0.090s 0%: 0.013+1.3+0.031 ms clock, 0.55+0.24/7.5/2.7+1.2 ms cpu, 51->52->52 MB, 57 MB goal, 40 P
gc 6 @0.100s 2%: 0.81+5.7+0.038 ms clock, 32+6.4/25/0+1.5 ms cpu, 102->392->390 MB, 105 MB goal, 40 P
gc 7 @0.110s 3%: 0.89+2.2+0.037 ms clock, 35+1.2/18/11+1.4 ms cpu, 423->583->582 MB, 781 MB goal, 40 P
gc 8 @6.510s 0%: 0.048+8.6+0.020 ms clock, 1.9+2.1/80/129+0.83 ms cpu, 750->757->698 MB, 1165 MB goal, 40 P
gc 9 @27.255s 0%: 0.16+12+0.10 ms clock, 6.5+17/95/0+4.1 ms cpu, 1143->1159->756 MB, 1396 MB goal, 40 P
gc 10 @27.513s 0%: 0.14+13+0.30 ms clock, 5.7+33/102/1.5+12 ms cpu, 1368->1384->785 MB, 1513 MB goal, 40 P
gc 11 @27.777s 0%: 0.093+14+0.20 ms clock, 3.7+38/104/0.13+8.0 ms cpu, 1487->1508->805 MB, 1571 MB goal, 40 P
gc 12 @28.046s 0%: 0.10+9.7+0.033 ms clock, 4.2+93/81/0.29+1.3 ms cpu, 1555->1566->806 MB, 1610 MB goal, 40 P
gc 13 @28.270s 0%: 0.14+10+0.051 ms clock, 5.9+141/93/0+2.0 ms cpu, 1573->1582->816 MB, 1613 MB goal, 40 P
gc 14 @28.483s 0%: 0.17+10+0.081 ms clock, 6.9+139/94/0.017+3.2 ms cpu, 1591->1602->823 MB, 1632 MB goal, 40 P
gc 15 @28.693s 0%: 0.12+8.8+0.048 ms clock, 4.9+154/79/1.5+1.9 ms cpu, 1606->1616->825 MB, 1647 MB goal, 40 P
gc 16 @28.901s 0%: 0.14+10+0.066 ms clock, 5.6+142/91/1.6+2.6 ms cpu, 1610->1620->830 MB, 1651 MB goal, 40 P
gc 17 @29.122s 0%: 0.13+12+0.028 ms clock, 5.5+170/110/0.45+1.1 ms cpu, 1620->1630->834 MB, 1661 MB goal, 40 P
gc 18 @29.344s 0%: 0.10+11+0.072 ms clock, 4.3+158/109/16+2.9 ms cpu, 1628->1639->845 MB, 1669 MB goal, 40 P
gc 19 @29.578s 0%: 0.21+13+0.091 ms clock, 8.5+153/115/19+3.6 ms cpu, 1649->1660->862 MB, 1691 MB goal, 40 P
gc 20 @29.802s 0%: 0.22+12+0.079 ms clock, 9.0+173/111/11+3.1 ms cpu, 1681->1692->865 MB, 1724 MB goal, 40 P
gc 21 @30.017s 0%: 0.11+11+0.031 ms clock, 4.4+192/111/0.92+1.2 ms cpu, 1688->1699->867 MB, 1731 MB goal, 40 P
gc 22 @30.233s 0%: 0.13+11+0.096 ms clock, 5.4+196/105/0.78+3.8 ms cpu, 1691->1703->874 MB, 1734 MB goal, 40 P
gc 23 @30.455s 0%: 0.22+12+0.030 ms clock, 8.8+178/116/5.1+1.2 ms cpu, 1705->1715->877 MB, 1748 MB goal, 40 P
gc 24 @30.681s 0%: 0.14+12+0.065 ms clock, 5.8+176/109/4.8+2.6 ms cpu, 1710->1720->881 MB, 1754 MB goal, 40 P
gc 25 @30.893s 0%: 0.10+11+0.027 ms clock, 4.0+171/109/4.2+1.1 ms cpu, 1720->1730->889 MB, 1763 MB goal, 40 P
gc 26 @31.113s 0%: 0.12+12+0.029 ms clock, 5.1+174/119/18+1.1 ms cpu, 1735->1746->887 MB, 1779 MB goal, 40 P
gc 27 @31.332s 0%: 0.11+10+0.079 ms clock, 4.7+185/96/8.2+3.1 ms cpu, 1732->1741->890 MB, 1775 MB goal, 40 P
gc 28 @31.548s 0%: 0.13+11+0.074 ms clock, 5.2+189/100/5.2+2.9 ms cpu, 1737->1748->889 MB, 1781 MB goal, 40 P
gc 29 @31.765s 0%: 0.14+11+0.059 ms clock, 5.7+183/100/1.6+2.3 ms cpu, 1734->1745->894 MB, 1778 MB goal, 40 P
gc 30 @31.983s 0%: 0.35+10+0.072 ms clock, 14+200/101/2.3+2.8 ms cpu, 1745->1756->900 MB, 1789 MB goal, 40 P
gc 31 @32.200s 0%: 0.12+11+0.073 ms clock, 4.9+196/107/0+2.9 ms cpu, 1755->1766->904 MB, 1800 MB goal, 40 P
gc 32 @32.417s 0%: 0.16+10+0.022 ms clock, 6.7+167/101/5.8+0.88 ms cpu, 1763->1774->898 MB, 1808 MB goal, 40 P
gc 33 @32.634s 0%: 0.12+11+0.089 ms clock, 4.9+202/102/0.058+3.5 ms cpu, 1752->1763->900 MB, 1797 MB goal, 40 P
gc 34 @32.855s 0%: 0.17+11+0.071 ms clock, 7.0+173/103/2.2+2.8 ms cpu, 1756->1767->905 MB, 1801 MB goal, 40 P
gc 35 @33.075s 0%: 0.10+10+0.073 ms clock, 4.3+188/103/10+2.9 ms cpu, 1766->1777->916 MB, 1811 MB goal, 40 P
gc 36 @33.300s 0%: 0.20+10+0.050 ms clock, 8.3+219/102/0.41+2.0 ms cpu, 1787->1795->911 MB, 1832 MB goal, 40 P
gc 37 @33.522s 0%: 0.13+10+0.027 ms clock, 5.5+177/98/7.5+1.0 ms cpu, 1778->1789->906 MB, 1823 MB goal, 40 P
gc 38 @33.746s 0%: 0.11+10+0.033 ms clock, 4.7+205/99/0+1.3 ms cpu, 1767->1779->908 MB, 1812 MB goal, 40 P
gc 39 @33.971s 0%: 0.13+11+0.072 ms clock, 5.2+212/103/2.9+2.9 ms cpu, 1771->1783->899 MB, 1816 MB goal, 40 P
gc 40 @34.195s 0%: 0.11+10+0.069 ms clock, 4.5+217/102/7.0+2.7 ms cpu, 1754->1764->900 MB, 1798 MB goal, 40 P
gc 41 @34.414s 0%: 0.14+10+0.089 ms clock, 5.6+220/104/0.18+3.5 ms cpu, 1757->1768->894 MB, 1801 MB goal, 40 P
gc 42 @34.633s 0%: 0.17+10+0.079 ms clock, 6.8+214/101/0.19+3.1 ms cpu, 1745->1756->886 MB, 1789 MB goal, 40 P
gc 43 @34.853s 0%: 0.11+11+0.027 ms clock, 4.6+197/101/2.7+1.0 ms cpu, 1729->1740->894 MB, 1772 MB goal, 40 P
gc 44 @35.067s 0%: 0.091+11+0.030 ms clock, 3.6+191/102/1.3+1.2 ms cpu, 1744->1755->898 MB, 1788 MB goal, 40 P
gc 45 @35.291s 0%: 0.16+9.8+0.10 ms clock, 6.6+193/96/4.4+4.2 ms cpu, 1753->1764->903 MB, 1797 MB goal, 40 P
gc 46 @35.511s 0%: 0.12+10+0.057 ms clock, 5.1+214/103/1.4+2.2 ms cpu, 1763->1773->903 MB, 1807 MB goal, 40 P
gc 47 @35.733s 0%: 0.13+10+0.027 ms clock, 5.2+208/101/2.4+1.1 ms cpu, 1762->1773->904 MB, 1807 MB goal, 40 P
gc 48 @35.954s 0%: 0.19+9.9+0.034 ms clock, 7.8+185/97/13+1.3 ms cpu, 1764->1776->907 MB, 1809 MB goal, 40 P
gc 49 @36.177s 0%: 0.12+10+0.037 ms clock, 4.8+206/103/1.9+1.4 ms cpu, 1769->1780->915 MB, 1814 MB goal, 40 P
gc 50 @36.399s 0%: 0.12+10+0.082 ms clock, 4.8+212/98/0.15+3.2 ms cpu, 1786->1798->923 MB, 1831 MB goal, 40 P
gc 51 @36.621s 0%: 0.16+12+0.064 ms clock, 6.6+209/115/0.083+2.5 ms cpu, 1800->1810->914 MB, 1846 MB goal, 40 P
gc 52 @36.848s 0%: 0.21+10+0.031 ms clock, 8.5+184/103/4.8+1.2 ms cpu, 1784->1795->900 MB, 1829 MB goal, 40 P
gc 53 @37.074s 0%: 0.18+10+0.025 ms clock, 7.5+194/101/4.4+1.0 ms cpu, 1757->1768->904 MB, 1801 MB goal, 40 P
gc 54 @37.292s 0%: 0.15+10+0.062 ms clock, 6.2+224/100/0.72+2.5 ms cpu, 1763->1774->909 MB, 1808 MB goal, 40 P
gc 55 @37.516s 0%: 0.11+10+0.058 ms clock, 4.5+210/103/3.4+2.3 ms cpu, 1774->1786->913 MB, 1819 MB goal, 40 P
gc 56 @37.740s 0%: 0.20+10+0.029 ms clock, 8.1+194/100/8.9+1.1 ms cpu, 1780->1791->903 MB, 1826 MB goal, 40 P
gc 57 @37.964s 0%: 0.12+10+0.024 ms clock, 5.1+213/101/3.5+0.96 ms cpu, 1761->1773->911 MB, 1806 MB goal, 40 P
gc 58 @38.187s 0%: 0.15+11+0.080 ms clock, 6.3+216/103/0.60+3.2 ms cpu, 1777->1788->915 MB, 1822 MB goal, 40 P
gc 59 @38.413s 0%: 0.12+11+0.077 ms clock, 5.1+213/105/1.1+3.1 ms cpu, 1785->1795->916 MB, 1830 MB goal, 40 P
gc 60 @38.635s 0%: 0.13+10+0.073 ms clock, 5.2+203/103/3.7+2.9 ms cpu, 1787->1797->914 MB, 1832 MB goal, 40 P
gc 61 @38.859s 1%: 0.14+10+0.032 ms clock, 5.9+197/101/0.68+1.2 ms cpu, 1782->1794->918 MB, 1828 MB goal, 40 P
gc 62 @39.084s 1%: 0.12+11+0.083 ms clock, 5.1+216/103/1.6+3.3 ms cpu, 1790->1802->925 MB, 1836 MB goal, 40 P
gc 63 @39.306s 1%: 0.13+10+0.080 ms clock, 5.4+209/101/0.53+3.2 ms cpu, 1805->1817->925 MB, 1851 MB goal, 40 P
gc 64 @39.536s 1%: 0.13+10+0.034 ms clock, 5.2+228/103/0.38+1.3 ms cpu, 1806->1817->920 MB, 1851 MB goal, 40 P
gc 65 @39.764s 1%: 0.12+11+0.098 ms clock, 4.8+201/102/0.34+3.9 ms cpu, 1796->1807->922 MB, 1841 MB goal, 40 P
gc 66 @39.988s 1%: 0.16+10+0.036 ms clock, 6.5+201/99/4.7+1.4 ms cpu, 1800->1812->926 MB, 1845 MB goal, 40 P
gc 67 @40.221s 1%: 0.17+10+0.024 ms clock, 6.9+201/101/0.28+0.96 ms cpu, 1807->1818->928 MB, 1853 MB goal, 40 P
gc 68 @40.446s 1%: 0.12+14+0.086 ms clock, 4.9+187/107/10+3.4 ms cpu, 1811->1822->925 MB, 1857 MB goal, 40 P
gc 69 @40.677s 1%: 0.13+12+0.062 ms clock, 5.4+146/114/11+2.4 ms cpu, 1805->1816->923 MB, 1851 MB goal, 40 P
gc 70 @40.907s 1%: 0.11+11+0.020 ms clock, 4.6+175/105/5.8+0.82 ms cpu, 1801->1811->934 MB, 1847 MB goal, 40 P
gc 71 @41.137s 1%: 0.20+11+0.075 ms clock, 8.2+145/105/23+3.0 ms cpu, 1822->1834->939 MB, 1869 MB goal, 40 P
gc 72 @41.368s 1%: 0.12+11+0.074 ms clock, 5.1+177/103/11+2.9 ms cpu, 1831->1844->948 MB, 1878 MB goal, 40 P
gc 73 @41.598s 1%: 0.12+11+0.071 ms clock, 5.1+166/110/17+2.8 ms cpu, 1849->1860->946 MB, 1896 MB goal, 40 P
gc 74 @41.841s 1%: 0.24+11+0.084 ms clock, 9.6+234/111/0.19+3.3 ms cpu, 1845->1856->964 MB, 1892 MB goal, 40 P
gc 75 @42.098s 1%: 0.19+12+0.038 ms clock, 7.6+226/112/0.17+1.5 ms cpu, 1881->1891->955 MB, 1929 MB goal, 40 P

e508a5f

gc 1 @0.065s 0%: 0.010+1.7+0.030 ms clock, 0.43+0.39/1.1/1.3+1.2 ms cpu, 4->7->6 MB, 5 MB goal, 40 P
gc 2 @0.067s 0%: 0.053+1.8+0.018 ms clock, 2.1+0.10/2.5/0.44+0.73 ms cpu, 7->14->14 MB, 12 MB goal, 40 P
gc 3 @0.071s 0%: 0.015+1.1+0.024 ms clock, 0.60+0.20/3.2/1.3+0.96 ms cpu, 16->17->17 MB, 29 MB goal, 40 P
gc 4 @0.076s 0%: 0.011+1.5+0.038 ms clock, 0.47+0.22/4.3/0.99+1.5 ms cpu, 27->28->27 MB, 35 MB goal, 40 P
gc 5 @0.085s 0%: 0.008+2.4+0.023 ms clock, 0.32+0.23/6.6/0.23+0.95 ms cpu, 49->50->50 MB, 55 MB goal, 40 P
gc 6 @0.095s 1%: 0.66+3.3+0.041 ms clock, 26+0.94/12/0+1.6 ms cpu, 102->231->230 MB, 103 MB goal, 40 P
gc 7 @0.103s 3%: 1.5+2.7+0.038 ms clock, 63+6.2/15/6.5+1.5 ms cpu, 263->519->518 MB, 461 MB goal, 40 P
gc 8 @0.112s 4%: 1.0+2.5+0.020 ms clock, 41+0.15/9.6/0.38+0.83 ms cpu, 550->582->582 MB, 1037 MB goal, 40 P
gc 9 @11.995s 0%: 0.10+10+0.063 ms clock, 4.2+11/93/0+2.5 ms cpu, 844->855->731 MB, 1165 MB goal, 40 P
gc 10 @12.279s 0%: 0.14+17+0.21 ms clock, 5.7+20/120/0+8.6 ms cpu, 1254->1282->784 MB, 1462 MB goal, 40 P
gc 11 @12.505s 0%: 0.15+16+0.20 ms clock, 6.3+49/116/0.019+8.2 ms cpu, 1444->1468->803 MB, 1569 MB goal, 40 P
gc 12 @12.727s 0%: 0.14+13+0.034 ms clock, 5.6+64/109/0+1.3 ms cpu, 1529->1548->813 MB, 1607 MB goal, 40 P
gc 13 @12.954s 0%: 0.17+10+0.042 ms clock, 6.9+108/91/0.070+1.6 ms cpu, 1576->1587->814 MB, 1627 MB goal, 40 P
gc 14 @13.163s 0%: 0.13+11+0.028 ms clock, 5.2+142/104/0.83+1.1 ms cpu, 1588->1597->826 MB, 1628 MB goal, 40 P
gc 15 @13.357s 0%: 0.21+11+0.024 ms clock, 8.5+148/107/2.7+0.99 ms cpu, 1612->1623->833 MB, 1653 MB goal, 40 P
gc 16 @13.550s 0%: 0.14+12+0.042 ms clock, 5.9+138/108/8.2+1.7 ms cpu, 1624->1634->841 MB, 1666 MB goal, 40 P
gc 17 @13.745s 0%: 0.15+11+0.080 ms clock, 6.2+111/108/28+3.2 ms cpu, 1641->1651->845 MB, 1682 MB goal, 40 P
gc 18 @13.938s 0%: 0.11+11+0.089 ms clock, 4.5+136/107/11+3.5 ms cpu, 1649->1659->851 MB, 1691 MB goal, 40 P
gc 19 @14.136s 0%: 0.13+12+0.067 ms clock, 5.2+157/112/3.6+2.6 ms cpu, 1659->1668->849 MB, 1702 MB goal, 40 P
gc 20 @14.343s 0%: 0.14+10+0.026 ms clock, 5.9+136/93/16+1.0 ms cpu, 1656->1667->856 MB, 1698 MB goal, 40 P
gc 21 @14.560s 0%: 1.5+11+0.048 ms clock, 62+167/102/0.34+1.9 ms cpu, 1670->1679->867 MB, 1712 MB goal, 40 P
gc 22 @14.782s 0%: 0.17+13+0.019 ms clock, 7.0+167/97/3.1+0.79 ms cpu, 1691->1701->871 MB, 1734 MB goal, 40 P
gc 23 @14.992s 0%: 0.11+10+0.068 ms clock, 4.4+160/92/18+2.7 ms cpu, 1700->1711->870 MB, 1743 MB goal, 40 P
gc 24 @15.197s 0%: 0.13+10+0.18 ms clock, 5.3+205/96/0+7.5 ms cpu, 1698->1707->858 MB, 1741 MB goal, 40 P
gc 25 @15.398s 0%: 0.26+9.9+0.048 ms clock, 10+188/91/5.1+1.9 ms cpu, 1674->1684->862 MB, 1716 MB goal, 40 P
gc 26 @15.605s 0%: 0.16+10+0.021 ms clock, 6.6+176/95/5.6+0.85 ms cpu, 1681->1691->870 MB, 1724 MB goal, 40 P
gc 27 @15.807s 0%: 0.19+10+0.11 ms clock, 7.7+210/97/0.058+4.6 ms cpu, 1697->1706->870 MB, 1740 MB goal, 40 P
gc 28 @16.010s 0%: 0.23+10+0.019 ms clock, 9.4+191/97/0.94+0.78 ms cpu, 1697->1707->869 MB, 1740 MB goal, 40 P
gc 29 @16.201s 0%: 3.0+10+0.047 ms clock, 120+181/98/2.1+1.9 ms cpu, 1696->1706->873 MB, 1739 MB goal, 40 P
gc 30 @16.400s 0%: 0.093+10+0.081 ms clock, 3.7+169/99/1.1+3.2 ms cpu, 1703->1712->879 MB, 1746 MB goal, 40 P
gc 31 @16.600s 0%: 0.28+11+0.084 ms clock, 11+149/102/10+3.3 ms cpu, 1715->1726->884 MB, 1759 MB goal, 40 P
gc 32 @16.802s 0%: 0.13+10+0.071 ms clock, 5.3+165/93/9.5+2.8 ms cpu, 1725->1736->892 MB, 1768 MB goal, 40 P
gc 33 @17.011s 0%: 0.18+10+0.064 ms clock, 7.5+154/97/5.0+2.5 ms cpu, 1739->1750->891 MB, 1784 MB goal, 40 P
gc 34 @17.215s 1%: 0.075+10+0.025 ms clock, 3.0+158/102/5.3+1.0 ms cpu, 1738->1748->899 MB, 1783 MB goal, 40 P
gc 35 @17.424s 1%: 0.13+10+0.081 ms clock, 5.4+170/95/0.99+3.2 ms cpu, 1754->1764->895 MB, 1799 MB goal, 40 P
gc 36 @17.635s 1%: 0.14+10+0.081 ms clock, 5.7+148/99/11+3.2 ms cpu, 1747->1758->886 MB, 1791 MB goal, 40 P
gc 37 @17.841s 1%: 0.13+11+0.030 ms clock, 5.4+167/102/6.2+1.2 ms cpu, 1729->1739->893 MB, 1773 MB goal, 40 P
gc 38 @18.045s 1%: 0.14+10+0.058 ms clock, 5.6+145/99/9.3+2.3 ms cpu, 1742->1752->902 MB, 1786 MB goal, 40 P
gc 39 @18.253s 1%: 0.13+10+0.086 ms clock, 5.3+162/95/3.6+3.4 ms cpu, 1761->1771->912 MB, 1805 MB goal, 40 P
gc 40 @18.470s 1%: 0.14+10+0.082 ms clock, 5.7+176/100/0.54+3.2 ms cpu, 1779->1789->911 MB, 1824 MB goal, 40 P
gc 41 @18.680s 1%: 0.13+11+0.069 ms clock, 5.5+141/100/24+2.7 ms cpu, 1776->1788->899 MB, 1822 MB goal, 40 P
gc 42 @18.891s 1%: 0.12+10+0.059 ms clock, 4.9+194/96/0.14+2.3 ms cpu, 1754->1763->884 MB, 1798 MB goal, 40 P
gc 43 @19.095s 1%: 0.25+10+0.083 ms clock, 10+193/95/0.18+3.3 ms cpu, 1724->1735->884 MB, 1768 MB goal, 40 P
gc 44 @19.300s 1%: 0.17+10+0.033 ms clock, 7.1+194/96/0.36+1.3 ms cpu, 1726->1736->895 MB, 1769 MB goal, 40 P
gc 45 @19.507s 1%: 0.11+9.9+0.023 ms clock, 4.6+171/94/7.1+0.92 ms cpu, 1745->1756->903 MB, 1790 MB goal, 40 P
gc 46 @19.715s 1%: 0.11+9.9+0.071 ms clock, 4.6+191/93/1.9+2.8 ms cpu, 1762->1773->900 MB, 1807 MB goal, 40 P
gc 47 @19.920s 1%: 0.15+10+0.027 ms clock, 6.0+198/96/0.019+1.1 ms cpu, 1756->1766->893 MB, 1800 MB goal, 40 P
gc 48 @20.130s 1%: 0.31+9.8+0.023 ms clock, 12+178/94/1.2+0.93 ms cpu, 1742->1753->888 MB, 1787 MB goal, 40 P
gc 49 @20.336s 1%: 0.099+10+0.17 ms clock, 3.9+198/98/0.60+7.1 ms cpu, 1732->1743->898 MB, 1776 MB goal, 40 P
gc 50 @20.547s 1%: 0.17+10+0.037 ms clock, 6.9+179/94/1.1+1.5 ms cpu, 1752->1762->889 MB, 1796 MB goal, 40 P
gc 51 @20.750s 1%: 0.10+10+0.17 ms clock, 4.3+180/92/4.5+7.1 ms cpu, 1735->1746->894 MB, 1779 MB goal, 40 P
gc 52 @20.954s 1%: 0.22+9.8+0.078 ms clock, 9.0+192/94/0.58+3.1 ms cpu, 1744->1753->901 MB, 1788 MB goal, 40 P
gc 53 @21.163s 1%: 0.25+10+0.031 ms clock, 10+181/97/4.9+1.2 ms cpu, 1758->1769->898 MB, 1802 MB goal, 40 P
gc 54 @21.364s 1%: 0.14+10+0.078 ms clock, 5.8+197/96/4.9+3.1 ms cpu, 1752->1762->907 MB, 1796 MB goal, 40 P
gc 55 @21.576s 1%: 0.21+10+0.18 ms clock, 8.4+188/95/1.2+7.4 ms cpu, 1770->1779->893 MB, 1814 MB goal, 40 P
gc 56 @21.781s 1%: 0.16+11+0.027 ms clock, 6.7+189/96/1.2+1.1 ms cpu, 1741->1753->889 MB, 1786 MB goal, 40 P
gc 57 @21.986s 1%: 0.11+9.9+0.032 ms clock, 4.4+192/94/0.30+1.3 ms cpu, 1735->1746->892 MB, 1779 MB goal, 40 P
gc 58 @22.195s 1%: 0.22+10+0.027 ms clock, 8.9+179/98/0.67+1.1 ms cpu, 1740->1750->895 MB, 1784 MB goal, 40 P
gc 59 @22.400s 1%: 0.15+10+0.015 ms clock, 6.0+218/103/0.96+0.61 ms cpu, 1745->1757->896 MB, 1790 MB goal, 40 P
gc 60 @22.608s 1%: 0.19+10+0.050 ms clock, 7.9+185/96/0.74+2.0 ms cpu, 1748->1758->898 MB, 1793 MB goal, 40 P
gc 61 @22.818s 1%: 0.094+11+0.078 ms clock, 3.7+166/99/2.1+3.1 ms cpu, 1751->1762->910 MB, 1796 MB goal, 40 P
gc 62 @23.038s 1%: 0.13+10+0.077 ms clock, 5.4+194/99/0.37+3.0 ms cpu, 1776->1785->911 MB, 1821 MB goal, 40 P
gc 63 @23.253s 1%: 0.14+9.8+0.059 ms clock, 5.7+187/94/0+2.3 ms cpu, 1778->1789->912 MB, 1823 MB goal, 40 P
gc 64 @23.466s 1%: 0.13+10+0.082 ms clock, 5.5+186/96/0.030+3.3 ms cpu, 1780->1791->916 MB, 1825 MB goal, 40 P
gc 65 @23.684s 1%: 0.16+10+0.018 ms clock, 6.5+182/98/4.7+0.72 ms cpu, 1787->1799->912 MB, 1833 MB goal, 40 P
gc 66 @23.898s 1%: 0.17+10+0.080 ms clock, 7.1+184/102/0.36+3.2 ms cpu, 1780->1792->918 MB, 1825 MB goal, 40 P
gc 67 @24.114s 1%: 0.17+10+0.069 ms clock, 7.1+189/96/1.1+2.7 ms cpu, 1791->1803->919 MB, 1837 MB goal, 40 P
gc 68 @24.329s 1%: 0.17+10+0.075 ms clock, 6.8+194/93/0.20+3.0 ms cpu, 1793->1804->914 MB, 1838 MB goal, 40 P
gc 69 @24.541s 1%: 0.14+9.8+0.058 ms clock, 5.7+196/96/2.8+2.3 ms cpu, 1783->1793->917 MB, 1828 MB goal, 40 P
gc 70 @24.754s 1%: 0.12+10+0.089 ms clock, 5.1+201/94/0.10+3.5 ms cpu, 1789->1800->918 MB, 1835 MB goal, 40 P
gc 71 @24.964s 1%: 0.24+10+0.075 ms clock, 9.7+195/96/0.096+3.0 ms cpu, 1792->1803->911 MB, 1837 MB goal, 40 P
gc 72 @25.174s 1%: 0.23+10+0.078 ms clock, 9.4+184/96/0.27+3.1 ms cpu, 1777->1787->916 MB, 1822 MB goal, 40 P
gc 73 @25.387s 1%: 0.19+10+0.025 ms clock, 7.8+201/96/2.7+1.0 ms cpu, 1788->1799->915 MB, 1833 MB goal, 40 P
gc 74 @25.597s 1%: 0.15+10+0.087 ms clock, 6.2+204/96/0.043+3.4 ms cpu, 1785->1795->914 MB, 1831 MB goal, 40 P
gc 75 @25.813s 1%: 0.14+10+0.085 ms clock, 5.8+206/100/0+3.4 ms cpu, 1782->1794->915 MB, 1828 MB goal, 40 P
gc 76 @26.040s 1%: 0.29+10+0.027 ms clock, 11+206/101/1.2+1.1 ms cpu, 1785->1796->927 MB, 1831 MB goal, 40 P

go 1.11.1

gc 1 @0.062s 0%: 0.009+1.4+0.097 ms clock, 0.38+0.39/1.5/1.4+3.9 ms cpu, 4->5->4 MB, 5 MB goal, 40 P
gc 2 @0.065s 0%: 0.019+0.88+0.12 ms clock, 0.77+0.060/2.8/0.31+5.0 ms cpu, 7->8->8 MB, 9 MB goal, 40 P
gc 3 @0.069s 0%: 0.018+1.0+0.14 ms clock, 0.73+0.24/3.5/0.37+5.8 ms cpu, 14->14->14 MB, 16 MB goal, 40 P
gc 4 @0.075s 1%: 0.010+1.3+0.16 ms clock, 0.41+0.21/3.8/0.53+6.7 ms cpu, 26->27->27 MB, 28 MB goal, 40 P
gc 5 @0.084s 1%: 0.010+2.0+0.094 ms clock, 0.41+0.18/9.1/0.58+3.7 ms cpu, 51->51->51 MB, 54 MB goal, 40 P
gc 6 @0.095s 2%: 0.71+3.2+0.12 ms clock, 28+4.3/14/0.30+4.9 ms cpu, 102->199->198 MB, 103 MB goal, 40 P
gc 7 @0.102s 4%: 2.2+2.4+0.20 ms clock, 88+0.69/13/13+8.1 ms cpu, 231->583->582 MB, 488 MB goal, 40 P
gc 8 @0.299s 1%: 0.010+1.5+0.15 ms clock, 0.43+0.15/11/2.3+6.2 ms cpu, 583->584->583 MB, 1165 MB goal, 40 P
gc 9 @9.741s 0%: 0.090+11+0.34 ms clock, 3.6+25/94/1.6+13 ms cpu, 874->883->736 MB, 1167 MB goal, 40 P
gc 10 @9.991s 0%: 0.11+12+0.35 ms clock, 4.7+28/104/6.3+14 ms cpu, 1282->1292->770 MB, 1472 MB goal, 40 P
gc 11 @10.208s 0%: 0.11+13+0.30 ms clock, 4.4+37/107/2.7+12 ms cpu, 1437->1446->786 MB, 1541 MB goal, 40 P
gc 12 @10.440s 0%: 0.10+12+0.42 ms clock, 4.1+33/100/12+16 ms cpu, 1514->1523->796 MB, 1573 MB goal, 40 P
gc 13 @10.678s 0%: 0.10+12+0.18 ms clock, 4.0+55/97/2.7+7.5 ms cpu, 1552->1560->800 MB, 1592 MB goal, 40 P
gc 14 @10.893s 0%: 0.10+14+0.34 ms clock, 4.3+108/121/1.1+13 ms cpu, 1562->1571->817 MB, 1601 MB goal, 40 P
gc 15 @11.094s 0%: 0.12+14+0.18 ms clock, 4.9+101/128/3.8+7.2 ms cpu, 1593->1603->822 MB, 1634 MB goal, 40 P
gc 16 @11.294s 0%: 0.12+14+0.30 ms clock, 5.1+110/130/0.25+12 ms cpu, 1604->1615->832 MB, 1645 MB goal, 40 P
gc 17 @11.492s 0%: 0.13+16+0.21 ms clock, 5.5+109/131/10+8.5 ms cpu, 1623->1634->837 MB, 1664 MB goal, 40 P
gc 18 @11.691s 0%: 0.12+13+0.22 ms clock, 5.1+121/113/0.81+8.9 ms cpu, 1633->1643->839 MB, 1674 MB goal, 40 P
gc 19 @11.898s 0%: 0.14+11+0.33 ms clock, 5.9+145/96/0.45+13 ms cpu, 1638->1647->836 MB, 1679 MB goal, 40 P
gc 20 @12.111s 0%: 0.12+10+0.27 ms clock, 5.0+156/90/2.1+10 ms cpu, 1630->1641->850 MB, 1672 MB goal, 40 P
gc 21 @12.332s 0%: 0.10+17+0.27 ms clock, 4.3+115/143/11+10 ms cpu, 1658->1668->861 MB, 1700 MB goal, 40 P
gc 22 @12.537s 0%: 0.10+15+0.49 ms clock, 4.2+100/119/34+19 ms cpu, 1680->1691->867 MB, 1723 MB goal, 40 P
gc 23 @12.736s 0%: 0.12+17+0.16 ms clock, 5.0+84/143/21+6.4 ms cpu, 1691->1703->861 MB, 1734 MB goal, 40 P
gc 24 @12.936s 0%: 0.098+11+0.15 ms clock, 3.9+149/89/1.2+6.3 ms cpu, 1679->1689->863 MB, 1722 MB goal, 40 P
gc 25 @13.132s 0%: 0.096+9.4+0.14 ms clock, 3.8+162/86/0.099+5.8 ms cpu, 1684->1694->865 MB, 1727 MB goal, 40 P
gc 26 @13.331s 0%: 0.18+9.6+0.21 ms clock, 7.3+162/87/1.1+8.4 ms cpu, 1687->1699->866 MB, 1730 MB goal, 40 P
gc 27 @13.525s 0%: 0.13+9.5+0.15 ms clock, 5.4+168/85/1.0+6.0 ms cpu, 1690->1700->878 MB, 1733 MB goal, 40 P
gc 28 @13.724s 0%: 0.20+9.4+0.21 ms clock, 8.1+151/87/8.6+8.5 ms cpu, 1713->1724->875 MB, 1756 MB goal, 40 P
gc 29 @13.920s 0%: 0.27+9.4+0.14 ms clock, 10+166/88/0.26+5.9 ms cpu, 1707->1717->882 MB, 1750 MB goal, 40 P
gc 30 @14.118s 0%: 0.12+8.8+0.15 ms clock, 5.0+165/82/2.2+6.1 ms cpu, 1721->1730->888 MB, 1765 MB goal, 40 P
gc 31 @14.317s 0%: 0.15+9.8+0.20 ms clock, 6.3+142/91/8.2+8.2 ms cpu, 1733->1743->890 MB, 1777 MB goal, 40 P
gc 32 @14.516s 1%: 0.12+9.5+0.25 ms clock, 5.1+159/86/1.6+10 ms cpu, 1737->1748->894 MB, 1781 MB goal, 40 P
gc 33 @14.717s 1%: 0.12+9.3+0.15 ms clock, 5.1+153/85/0.15+6.3 ms cpu, 1744->1755->895 MB, 1788 MB goal, 40 P
gc 34 @14.919s 1%: 0.12+9.6+0.13 ms clock, 5.1+146/88/5.6+5.3 ms cpu, 1746->1757->901 MB, 1790 MB goal, 40 P
gc 35 @15.123s 1%: 0.17+9.9+0.20 ms clock, 6.9+146/91/8.3+8.2 ms cpu, 1757->1768->907 MB, 1802 MB goal, 40 P
gc 36 @15.329s 1%: 0.12+10+0.19 ms clock, 4.8+142/95/1.2+7.6 ms cpu, 1769->1781->903 MB, 1814 MB goal, 40 P
gc 37 @15.532s 1%: 0.10+9.9+0.14 ms clock, 4.0+146/91/1.7+5.8 ms cpu, 1761->1772->905 MB, 1806 MB goal, 40 P
gc 38 @15.734s 1%: 0.12+10+0.20 ms clock, 5.1+140/92/8.2+8.3 ms cpu, 1765->1776->912 MB, 1810 MB goal, 40 P
gc 39 @15.939s 1%: 0.10+10+0.15 ms clock, 4.2+146/91/1.8+6.1 ms cpu, 1778->1789->917 MB, 1824 MB goal, 40 P
gc 40 @16.148s 1%: 0.13+10+0.19 ms clock, 5.4+142/94/1.5+7.9 ms cpu, 1789->1799->923 MB, 1834 MB goal, 40 P
gc 41 @16.358s 1%: 0.16+9.5+0.21 ms clock, 6.6+152/88/5.2+8.7 ms cpu, 1801->1813->925 MB, 1847 MB goal, 40 P
gc 42 @16.569s 1%: 0.10+9.1+0.15 ms clock, 4.0+167/84/0+6.1 ms cpu, 1804->1815->916 MB, 1850 MB goal, 40 P
gc 43 @16.779s 1%: 0.22+9.3+0.18 ms clock, 8.8+169/87/0.22+7.5 ms cpu, 1786->1796->910 MB, 1832 MB goal, 40 P
gc 44 @16.986s 1%: 0.14+9.6+0.19 ms clock, 5.8+151/87/10+7.7 ms cpu, 1776->1787->905 MB, 1821 MB goal, 40 P
gc 45 @17.189s 1%: 0.14+8.9+0.22 ms clock, 5.8+175/79/0.82+8.9 ms cpu, 1766->1777->900 MB, 1811 MB goal, 40 P
gc 46 @17.393s 1%: 0.14+12+0.15 ms clock, 5.6+170/93/0+6.1 ms cpu, 1756->1766->899 MB, 1800 MB goal, 40 P
gc 47 @17.596s 1%: 0.16+9.0+0.24 ms clock, 6.4+167/84/0.24+9.6 ms cpu, 1754->1765->896 MB, 1798 MB goal, 40 P
gc 48 @17.800s 1%: 0.55+9.4+0.15 ms clock, 22+176/89/0.27+6.1 ms cpu, 1748->1758->900 MB, 1792 MB goal, 40 P
gc 49 @18.006s 1%: 0.20+9.7+0.20 ms clock, 8.3+160/88/0.64+8.3 ms cpu, 1755->1766->896 MB, 1800 MB goal, 40 P
gc 50 @18.209s 1%: 0.16+9.8+0.24 ms clock, 6.6+167/87/0+9.6 ms cpu, 1749->1760->898 MB, 1793 MB goal, 40 P
gc 51 @18.414s 1%: 0.11+9.4+0.16 ms clock, 4.4+165/88/0.28+6.5 ms cpu, 1753->1763->890 MB, 1797 MB goal, 40 P
gc 52 @18.615s 1%: 0.14+10+0.15 ms clock, 5.8+165/94/1.6+6.3 ms cpu, 1736->1748->893 MB, 1780 MB goal, 40 P
gc 53 @18.818s 1%: 0.13+9.1+0.20 ms clock, 5.4+171/87/0.016+8.0 ms cpu, 1743->1752->900 MB, 1787 MB goal, 40 P
gc 54 @19.027s 1%: 0.19+8.6+0.23 ms clock, 7.6+165/83/0.14+9.2 ms cpu, 1755->1766->900 MB, 1800 MB goal, 40 P
gc 55 @19.234s 1%: 0.12+11+0.15 ms clock, 4.8+177/92/0.86+6.0 ms cpu, 1757->1767->903 MB, 1801 MB goal, 40 P
gc 56 @19.445s 1%: 0.18+9.8+0.16 ms clock, 7.4+160/90/1.1+6.7 ms cpu, 1762->1773->894 MB, 1807 MB goal, 40 P
gc 57 @19.645s 1%: 0.12+9.6+0.24 ms clock, 5.0+169/86/0.30+9.6 ms cpu, 1745->1756->900 MB, 1789 MB goal, 40 P
gc 58 @19.852s 1%: 0.18+10+0.20 ms clock, 7.3+157/93/1.8+8.2 ms cpu, 1756->1766->906 MB, 1800 MB goal, 40 P
gc 59 @20.060s 1%: 0.12+9.6+0.17 ms clock, 5.1+161/87/0.10+6.9 ms cpu, 1769->1780->915 MB, 1813 MB goal, 40 P
gc 60 @20.268s 1%: 0.14+9.6+0.20 ms clock, 5.7+153/88/0.87+8.0 ms cpu, 1785->1796->916 MB, 1831 MB goal, 40 P
gc 61 @20.479s 1%: 0.18+9.6+0.24 ms clock, 7.2+156/90/0.55+9.8 ms cpu, 1787->1797->916 MB, 1832 MB goal, 40 P
gc 62 @20.690s 1%: 0.11+10+0.21 ms clock, 4.4+144/97/1.2+8.6 ms cpu, 1787->1797->918 MB, 1832 MB goal, 40 P
gc 63 @20.903s 1%: 0.17+10+0.21 ms clock, 6.9+163/90/0.57+8.7 ms cpu, 1792->1802->925 MB, 1837 MB goal, 40 P
gc 64 @21.118s 1%: 0.15+9.6+0.23 ms clock, 6.0+154/92/0.21+9.3 ms cpu, 1805->1816->930 MB, 1851 MB goal, 40 P
gc 65 @21.331s 1%: 0.11+9.6+0.19 ms clock, 4.5+163/89/0.57+7.7 ms cpu, 1814->1824->917 MB, 1860 MB goal, 40 P
gc 66 @21.539s 1%: 0.17+9.6+0.21 ms clock, 7.0+153/85/5.5+8.4 ms cpu, 1790->1801->911 MB, 1835 MB goal, 40 P
gc 67 @21.744s 1%: 0.17+9.8+0.20 ms clock, 7.1+160/86/3.1+8.2 ms cpu, 1777->1788->905 MB, 1822 MB goal, 40 P
gc 68 @21.954s 1%: 0.22+9.9+0.21 ms clock, 8.8+164/91/0.23+8.4 ms cpu, 1766->1776->903 MB, 1811 MB goal, 40 P
gc 69 @22.165s 1%: 0.17+9.4+0.21 ms clock, 6.9+158/85/5.6+8.4 ms cpu, 1763->1774->904 MB, 1807 MB goal, 40 P
gc 70 @22.368s 1%: 0.16+9.2+0.20 ms clock, 6.4+160/87/0.56+8.1 ms cpu, 1764->1774->913 MB, 1809 MB goal, 40 P
gc 71 @22.578s 1%: 0.14+10+0.22 ms clock, 5.8+165/94/1.0+8.8 ms cpu, 1782->1793->899 MB, 1827 MB goal, 40 P
gc 72 @22.785s 1%: 0.17+8.9+0.14 ms clock, 7.0+166/84/0.59+5.9 ms cpu, 1754->1764->904 MB, 1799 MB goal, 40 P
gc 73 @22.988s 1%: 0.16+9.9+0.21 ms clock, 6.6+164/91/0.15+8.4 ms cpu, 1764->1775->908 MB, 1809 MB goal, 40 P
gc 74 @23.195s 1%: 0.18+9.5+0.22 ms clock, 7.4+167/84/1.5+8.8 ms cpu, 1771->1782->905 MB, 1816 MB goal, 40 P
gc 75 @23.399s 1%: 0.24+8.9+0.16 ms clock, 9.6+169/85/0.42+6.6 ms cpu, 1766->1776->910 MB, 1810 MB goal, 40 P
gc 76 @23.616s 1%: 0.26+9.4+0.19 ms clock, 10+161/89/2.4+7.9 ms cpu, 1776->1787->918 MB, 1821 MB goal, 40 P
@gopherbot

This comment has been minimized.

gopherbot commented Nov 27, 2018

Change https://golang.org/cl/151537 mentions this issue: runtime: add iterator abstraction for mTreap

@gopherbot

This comment has been minimized.

gopherbot commented Nov 27, 2018

Change https://golang.org/cl/151538 mentions this issue: runtime: specialize mTreap for spans of size 1

@mknyszek

This comment has been minimized.

Contributor

mknyszek commented Nov 27, 2018

@lni do you mind trying out this patch with your application/library? (https://go-review.googlesource.com/c/go/+/151538)

The original numbers are a little harder to reproduce now as it seems other runtime/compiler improvements are hiding the original performance regression a little bit, but in some preliminary benchmarking it looked like the change above helped with your original benchmark.

I still need to start some longer runs for better statistical significance so I'll update here with those soon.

@lni

This comment has been minimized.

lni commented Nov 28, 2018

@mknyszek , thanks for the patch!

I didn't tag my library used for those initial benchmarks, my bad. Using the latest version of my library and slightly different configurations, I tried it on the following 3 Go revisions. Results below are in million op/sec.

It seems to me that the patch improved its performance.

go1.11.2
8.85, 8.72, 8.78

3ed46a7 (the new patch)
8.70, 8.74, 8.78, 8.88, 8.78, 8.79

57c8eb9 (before the patch)
8.45, 8.81, 8.37, 8.49, 8.66, 8.78

@mknyszek mknyszek self-assigned this Dec 11, 2018

@mknyszek mknyszek changed the title from cmd/compile: make([]byte, n) becomes much slower compared with go 1.11.1 to runtime: make([]byte, n) becomes much slower compared with go 1.11.1 Dec 11, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment