
x/build: openbsd-amd64-64 trybots are too slow #29223

Closed
bradfitz opened this issue Dec 13, 2018 · 82 comments

@bradfitz (Member) commented Dec 13, 2018

openbsd-amd64-64 trybots are taking 11+ minutes (which causes TryBots as a whole to take 11+ minutes rather than ~5)

We need to figure out what's slow on them, and/or just shard it out more.

/cc @dmitshur @bcmills @andybons

@bradfitz bradfitz added the NeedsFix label Dec 13, 2018

@gopherbot gopherbot added this to the Unreleased milestone Dec 13, 2018

@gopherbot gopherbot added the Builders label Dec 13, 2018

@bradfitz (Member Author) commented Jan 18, 2019

```sql
SELECT Builder, AVG(Seconds) AS Sec
FROM builds.Builds
WHERE IsTry = TRUE
  AND StartTime > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 100 HOUR)
  AND Repo = "go"
  AND FailureURL = ""
GROUP BY 1
ORDER BY Sec DESC;
```


Row | Builder | Sec
-- | -- | --
1 | openbsd-amd64-64 | 579.7221071917958 |  
2 | linux-amd64-race | 488.40762166018374 |  
3 | nacl-386 | 434.8139809606734 |  
4 | windows-amd64-2016 | 424.604860819551 |  
5 | nacl-amd64p32 | 418.4696299015715 |  
6 | windows-386-2008 | 414.7469431190204 |  
7 | js-wasm | 371.9747404238125 |  
8 | misc-vet-vetall | 358.80661270393875 |  
9 | linux-386 | 353.81094730244905 |  
10 | linux-amd64 | 345.036077108898 |  
11 | misc-compile | 337.44598333253055 |  
12 | misc-compile-mips | 335.70810570520416 |  
13 | freebsd-amd64-12_0 | 328.52744295724483 |  
14 | misc-compile-openbsd | 293.41003601271416 |  
15 | misc-compile-netbsd | 292.8116776015307 |  
16 | misc-compile-freebsd | 292.80485985481636 |  
17 | misc-compile-nacl | 288.17948818259185 |  
18 | misc-compile-plan9 | 273.5849724516735 |  
19 | misc-compile-ppc | 251.7265086680816
@bradfitz (Member Author) commented Jan 18, 2019

```sql
SELECT Builder, Event, AVG(Seconds) AS Sec
FROM builds.Spans
WHERE Builder LIKE 'openbsd-amd64%'
  AND Error = ''
  AND IsTry = TRUE
  AND StartTime > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 100 HOUR)
  AND Repo = "go"
GROUP BY 1, 2
ORDER BY Sec DESC;
```

Row | Builder | Event | Sec
-- | -- | -- | --
1 | openbsd-amd64-64 | make_and_test | 534.0490917608572 |  
2 | openbsd-amd64-64 | make | 292.82514376291306 |  
3 | openbsd-amd64-64 | run_test:go_test:cmd/go | 150.02306724040426 |  
4 | openbsd-amd64-64 | run_test:cgo_test | 88.3467336194681 |  
5 | openbsd-amd64-64 | run_test:runtime:cpu124 | 78.4045334334468 |  
6 | openbsd-amd64-64 | run_test:go_test:cmd/compile/internal/gc | 63.46040540948936 |  
7 | openbsd-amd64-64 | run_test:go_test:net | 62.921281942893614 |  
8 | openbsd-amd64-64 | run_test:go_test:runtime | 59.65075333531915 |  
9 | openbsd-amd64-64 | run_test:cgo_errors | 39.74668672670213 |  
10 | openbsd-amd64-64 | get_helper | 36.28992712104489 |  
11 | openbsd-amd64-64 | create_gce_buildlet | 35.651753116401366 |  
12 | openbsd-amd64-64 | get_buildlet | 33.989285913102044 |  
13 | openbsd-amd64-64 | run_test:go_test:cmd/compile/internal/ssa | 31.60199228663415 |  
14 | openbsd-amd64-64 | create_gce_instance | 30.614607733554422 |  
15 | openbsd-amd64-64 | run_test:go_test:cmd/vet | 25.966327777999997 |  
16 | openbsd-amd64-64 | run_test:go_test:net/http | 23.022429393063828 |  
17 | openbsd-amd64-64 | write_snapshot_to_gcs | 19.260539691499996 |  
18 | openbsd-amd64-64 | run_test:doc_progs | 16.33576500525532 |  
19 | openbsd-amd64-64 | run_test:go_test:runtime/pprof | 13.31036519180851 |  
20 | openbsd-amd64-64 | run_test:go_test:reflect | 11.999205834765958 |  
21 | openbsd-amd64-64 | run_test:go_test:time | 11.741334522617022 |  
22 | openbsd-amd64-64 | run_test:sync_cpu | 11.533254003170732 |  
23 | openbsd-amd64-64 | write_snapshot_tar | 10.8773563 |  
24 | openbsd-amd64-64 | run_test:go_test:cmd/compile | 10.752162821125 |  
25 | openbsd-amd64-64 | run_test:go_test:cmd/fix | 10.435803139355556 |  
26 | openbsd-amd64-64 | run_tests_multi | 10.243288797263473 |  
27 | openbsd-amd64-64 | run_test:go_test:strings | 10.110943746 |  
28 | openbsd-amd64-64 | run_test:go_test:cmd/link | 10.061246428116279 |  
29 | openbsd-amd64-64 | run_test:go_test:cmd/link/internal/ld | 10.037645228209302 |  
30 | openbsd-amd64-64 | run_test:go_test:go/types | 9.813017223727273 |  
31 | openbsd-amd64-64 | run_test:go_test:syscall | 9.579807829382979 |  
32 | openbsd-amd64-64 | run_test:go_test:strconv | 9.096669728574469 |  
33 | openbsd-amd64-64 | run_test:nolibgcc:net | 8.353404652658536 |  
34 | openbsd-amd64-64 | run_test:go_test:os/signal | 8.149411695148936 |  
35 | openbsd-amd64-64 | run_test:go_test:os | 8.032081740425532 |  
36 | openbsd-amd64-64 | run_test:go_test:math | 7.8759232913157895 |  
37 | openbsd-amd64-64 | run_test:go_test:net/http/httptrace | 7.7749100352500005 |  
38 | openbsd-amd64-64 | run_test:go_test:math/big | 7.75858940580851 |  
39 | openbsd-amd64-64 | run_test:go_test:cmd/internal/obj/x86 | 7.6362746259500005 |  
40 | openbsd-amd64-64 | get_source_from_gitmirror | 7.515116951666666 |  
41 | openbsd-amd64-64 | get_source | 7.277790430666666 |  
42 | openbsd-amd64-64 | run_test:bench_go1 | 7.0304439564893615 |  
43 | openbsd-amd64-64 | run_test:moved_goroot | 6.851026539365853 |  
44 | openbsd-amd64-64 | run_test:go_test:cmd/nm | 6.5756059088085115 |  
45 | openbsd-amd64-64 | run_test:go_test:cmd/cover | 6.451060486723404 |  
46 | openbsd-amd64-64 | run_test:go_test:cmd/objdump | 6.444223596553191 |  
47 | openbsd-amd64-64 | run_test:go_test:runtime/trace | 6.383058027941177 |  
48 | openbsd-amd64-64 | run_test:go_test:testing | 5.998117573319149 |  
49 | openbsd-amd64-64 | run_test:go_test:cmd/vendor/github.com/google/pprof/internal/driver | 5.980447624906977 |  
50 | openbsd-amd64-64 | run_test:wiki | 5.823946847042554
@bradfitz (Member Author) commented Jan 18, 2019

Wow, just running make.bash (which isn't sharded out over N buildlets) is more than twice as slow as other platforms:

```sql
SELECT Builder, Event, AVG(Seconds) AS Sec
FROM builds.Spans
WHERE Event = 'make'
  AND Error = ''
  AND IsTry = TRUE
  AND StartTime > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 100 HOUR)
  AND Repo = "go"
GROUP BY 1, 2
ORDER BY Sec DESC;
```

Row | Builder | Event | Sec
-- | -- | -- | --
1 | openbsd-amd64-64 | make | 292.82514376291306 |  
2 | nacl-386 | make | 176.9535785543913 |  
3 | nacl-amd64p32 | make | 169.24032677876087 |  
4 | windows-386-2008 | make | 158.65642536708697 |  
5 | windows-amd64-2016 | make | 142.23586712976086 |  
6 | js-wasm | make | 137.46539279367394 |  
7 | linux-386 | make | 134.50720768395655 |  
8 | freebsd-amd64-12_0 | make | 124.52324519041304 |  
9 | misc-vet-vetall | make | 124.14415335852175 |  
10 | linux-amd64-race | make | 123.95929911093478 |  
11 | linux-amd64 | make | 123.54718755441306
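As a sanity check on the "more than twice as slow" claim, the ratio of the two averages above (throwaway arithmetic; the values are copied from the table, not re-measured):

```go
package main

import "fmt"

// Average 'make' times (seconds) copied from the table above.
const (
	openbsdMake = 292.825
	linuxMake   = 123.547
)

func main() {
	fmt.Printf("openbsd-amd64-64 make is %.2fx slower than linux-amd64\n", openbsdMake/linuxMake)
}
```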
@bradfitz (Member Author) commented Jan 18, 2019

Likely suspect: #18314 (use a tmpfs on OpenBSD)

@bradfitz (Member Author) commented Jan 18, 2019

I tried using a memory filesystem on /tmp in an OpenBSD 6.4 amd64 instance (via gomote ssh). It works, but the build is still not any faster.

Still 5 minutes ....

```
bradfitz@gdev:~/src/golang.org/x/build$ time gomote run user-bradfitz-openbsd-amd64-64-0 go/src/make.bash
Building Go cmd/dist using /tmp/workdir/go1.4.
Building Go toolchain1 using /tmp/workdir/go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for openbsd/amd64.
---
Installed Go for openbsd/amd64 in /tmp/workdir/go
Installed commands in /tmp/workdir/go/bin

real    5m3.824s
user    0m0.136s
sys     0m0.024s
bradfitz@gdev:~/src/golang.org/x/build$ time gomote run -system user-bradfitz-openbsd-amd64-64-0 mount
/dev/sd0a on / type ffs (local, wxallowed)
mfs:85198 on /tmp type mfs (asynchronous, local, nodev, nosuid, size=2097152 512-blocks)

real    0m0.108s
user    0m0.064s
sys     0m0.044s
bradfitz@gdev:~/src/golang.org/x/build$ time gomote run -system user-bradfitz-openbsd-amd64-64-0 df
Filesystem  512-blocks      Used     Avail Capacity  Mounted on
/dev/sd0a     18153212   1652976  15592576    10%    /
mfs:85198      2057756   1656516    298356    85%    /tmp

real    0m0.107s
user    0m0.096s
sys     0m0.012s
```

It sees 4 cores:

```
buildlet$ sysctl hw.ncpufound
hw.ncpufound=4

buildlet$ sysctl -a | grep cpu
kern.ccpu=1948
hw.ncpu=4
hw.cpuspeed=2500
hw.ncpufound=4
hw.ncpuonline=4
machdep.cpuvendor=GenuineIntel
machdep.cpuid=0x306e4
machdep.cpufeature=0x1f9bfbff

buildlet$ dmesg | grep -i core
cpu0: smt 0, core 0, package 0
cpu1: smt 0, core 1, package 0
cpu2: smt 1, core 0, package 0
cpu3: smt 1, core 1, package 0
```

The kernel we're running is:

```
OpenBSD 6.4 (GENERIC.MP) #364: Thu Oct 11 13:30:23 MDT 2018
    deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
```

Is this the Spectre/Meltdown mitigations shutting down SMT? Can we disable that for the builders?

/cc @mdempsky

@mdempsky (Member) commented Jan 18, 2019

@bradfitz I think you can try setting "sysctl hw.smt=1" to re-enable hyper threading.

https://man.openbsd.org/sysctl.2#HW_SMT_2

@bradfitz (Member Author) commented Jan 18, 2019

It's already enabled:

```
$ sysctl -a | grep -i smt
hw.smt=1
```

So, that's not it. It's crazy that OpenBSD is 2x slower. If it were 10% slower I'd assume, "Oh, OpenBSD prioritizes security over performance" and be fine with that. But 2x makes me think we have a configuration problem somewhere.

@stmuk commented Jan 18, 2019

Have you tried increasing login.conf limits (as I suggested on twitter)?

@bradfitz (Member Author) commented Jan 18, 2019

Which would you increase? We have:

```
default:\
        :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin /usr/local/sbin:\
        :umask=022:\
        :datasize-max=768M:\
        :datasize-cur=768M:\
        :maxproc-max=256:\
        :maxproc-cur=128:\
        :openfiles-max=1024:\
        :openfiles-cur=512:\
        :stacksize-cur=4M:\
        :localcipher=blowfish,a:\
        :tc=auth-defaults:\
        :tc=auth-ftp-defaults:
```
@stmuk commented Jan 18, 2019

The default settings are low. You could try setting datasize-max/cur and stacksize-cur to "unlimited"

@mdempsky (Member) commented Jan 18, 2019

@stmuk Wouldn't the resource limits being too low just cause the build to fail rather than to proceed slowly?

@bradfitz (Member Author) commented Jan 18, 2019

Yeah. The issue is speed, not failure to build.

@stmuk commented Jan 18, 2019

@mdempsky Maybe but it's easy enough to try.
@bradfitz How does your build work? Do you build bootstrap 1.4 with OpenBSD clang and then compile go? If so do you see the slow down with both steps?

@bradfitz (Member Author) commented Jan 18, 2019

> @mdempsky Maybe but it's easy enough to try.

This is all very tedious & slow to work on, so I don't eagerly pursue avenues that don't at least make sense. Maybe if I were really desperate. But given limited time, I'd rather spend it on trying to collect system-wide profiling information or otherwise getting more visibility into the problem, rather than just changing random things.

> How does your build work? Do you build bootstrap 1.4 with OpenBSD clang and then compile go? If so do you see the slow down with both steps?

We push a pre-built Go 1.4 to it and use that.

@mdempsky (Member) commented Jan 18, 2019

@bradfitz Maybe a first step would be to use cmd/dist's GOBUILDTIMELOGFILE to see if any particular steps are slower, or the whole thing is proportionally slower?

```
$ GOBUILDTIMELOGFILE=/tmp/buildtime.txt ./make.bash
Building Go cmd/dist using /usr/lib/google-golang.
Building Go toolchain1 using /usr/lib/google-golang.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for linux/amd64.
---
Installed Go for linux/amd64 in /usr/local/google/home/mdempsky/wd/go
Installed commands in /usr/local/google/home/mdempsky/wd/go/bin
$ cat /tmp/buildtime.txt
Fri Jan 18 13:37:01 PST 2019 start make.bash
Fri Jan 18 13:37:03 PST 2019 +2.2s start dist bootstrap
Fri Jan 18 13:37:03 PST 2019 +2.6s build toolchain1
Fri Jan 18 13:37:18 PST 2019 +18.0s build go_bootstrap
Fri Jan 18 13:37:28 PST 2019 +27.9s build toolchain2
Fri Jan 18 13:37:45 PST 2019 +44.1s build toolchain3
Fri Jan 18 13:38:00 PST 2019 +59.9s build toolchain
Fri Jan 18 13:38:11 PST 2019 +70.3s end dist bootstrap
```
@stmuk commented Jan 18, 2019

@bradfitz Too many negatives in that for me to parse, or to motivate me to try to help further. I just regret wasting my time trying to help.

@bradfitz (Member Author) commented Jan 18, 2019

@stmuk, sorry, I didn't mean to waste your time. But with me and @mdempsky both thinking that such a tweak wouldn't do anything, it's not a high priority of mine to try. I appreciate you throwing it out there, even if it's not the answer. I at least went and read the OpenBSD man pages for those knobs.

@stmuk commented Jan 19, 2019

@bradfitz You were right: the login cap limit relaxation made no difference whatsoever.

@mdempsky Running on an i5 (5th gen) VirtualBox host with OEL 7.6 and OpenBSD 6.4 guests under Vagrant, I get the unexpected result of a slightly faster OpenBSD build!

Different compilers are in use to build the Go 1.4 I bootstrapped tip from: OpenBSD has its patched clang 6, whereas Linux has gcc 4.8.5. OpenBSD has a noatime mount, but otherwise no changes were made.

I'm wondering if we are just seeing differences due to the underlying virtualisation. I may experiment with QEMU and more similar C compilers if I get a chance.

```
go version devel +5538a9a Fri Jan 18 22:41:47 2019 +0000 linux/amd64
real 3m10.367s
user 2m41.822s
sys 0m14.216s
Sat Jan 19 12:37:19 UTC 2019 start make.bash
Sat Jan 19 12:37:20 UTC 2019 +1.4s start dist bootstrap
Sat Jan 19 12:37:20 UTC 2019 +1.4s build toolchain1
Sat Jan 19 12:37:38 UTC 2019 +19.0s build go_bootstrap
Sat Jan 19 12:37:57 UTC 2019 +38.2s build toolchain2
Sat Jan 19 12:38:57 UTC 2019 +98.2s build toolchain3
Sat Jan 19 12:39:48 UTC 2019 +149.1s build toolchain
Sat Jan 19 12:40:29 UTC 2019 +190.6s end dist bootstrap
```

```
go version devel +5538a9a Fri Jan 18 22:41:47 2019 +0000 openbsd/amd64
real 2m41.425s
user 1m55.670s
sys 2m0.150s
Sat Jan 19 04:51:44 PST 2019 start make.bash
Sat Jan 19 04:51:46 PST 2019 +2.2s start dist bootstrap
Sat Jan 19 04:51:46 PST 2019 +2.3s build toolchain1
Sat Jan 19 04:52:07 PST 2019 +23.6s build go_bootstrap
Sat Jan 19 04:52:25 PST 2019 +41.4s build toolchain2
Sat Jan 19 04:53:14 PST 2019 +90.0s build toolchain3
Sat Jan 19 04:53:51 PST 2019 +127.2s build toolchain
Sat Jan 19 04:54:26 PST 2019 +162.1s end dist bootstrap
```

@juanfra684 commented Jan 20, 2019

The problem is not the compiler, the VM software, or the FS. I'm the maintainer of BaCon, which also runs a big bash script, and it's slow as hell. Something between bash and the OpenBSD base makes bash scripts slow. Maybe something related to the memory protections.

@bradfitz (Member Author) commented Jan 21, 2019

@juanfra684, our bash script is a few lines that just calls into a Go program that does the rest. Our performance issue is not bash related.

@juanfra684 commented Jan 21, 2019

You're right, sorry for the misunderstanding.

I've built the go port (make only, no tests) on 6.4 and -current, and there is about a 14% difference:

```
6.4
real       205.50
user       303.09
sys        138.58

-current
real       186.69
user       300.99
sys         73.94
```

Recently the malloc code was changed to work better with multithreaded programs. (I reverted the malloc changes and the result is the same.)

OpenBSD doesn't have magic knobs to speed things up, but you could tune a few things to help the bootstrap. Firstly, if the VM host is using flash drives for storage, forget mfs. It's not equivalent in speed to Linux tmpfs, and FS operations are usually faster on a simple consumer-grade SSD.

About the mount options, use noatime, softdep. Linux is using relatime and a log-backed FS by default, so the comparison with a plain OpenBSD installation is not fair.

You could also add a few entries to /etc/sysctl.conf:

  • kern.pool_debug=0: 0 is the default for stable versions, but don't forget it if you're comparing the performance of stable with current.
  • kern.bufcachepercent=80: the default is 20 and the maximum is 80. The correct percentage depends on how much RAM the system has.
  • kern.maxvnodes: check what value the system has and increase it. The default value is very conservative. It's the limit of vnodes which the system can maintain in memory.
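Taken together, the suggested /etc/sysctl.conf fragment would look like the output of this throwaway Go helper (the values are the ones proposed in this thread, not measured optima; kern.maxvnodes in particular should be sized from systat output on the actual builder):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// suggested holds the tuning values proposed in this thread; they are
// starting points for the builder VMs, not measured optima.
var suggested = map[string]string{
	"kern.pool_debug":      "0",
	"kern.bufcachepercent": "80",
	"kern.maxvnodes":       "100000",
}

// sysctlConf renders settings as /etc/sysctl.conf lines, sorted for
// stable output.
func sysctlConf(settings map[string]string) string {
	keys := make([]string, 0, len(settings))
	for k := range settings {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s=%s\n", k, settings[k])
	}
	return b.String()
}

func main() {
	fmt.Print(sysctlConf(suggested))
}
```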
@bradfitz (Member Author) commented Jan 22, 2019

> Firstly, if the VM host is using flash drives for storage, forget mfs. It's not equivalent in speed to Linux tmpfs, and FS operations are usually faster on a simple consumer-grade SSD.

Sounds good. Yes, we're using GCE's SSDs, which are fast.

> About the mount options, use noatime, softdep. Linux is using relatime and a log-backed FS by default, so the comparison with a plain OpenBSD installation is not fair.

We already had noatime, but not softdep.

> kern.bufcachepercent=80: the default is 20 and the maximum is 80. The correct percentage depends on how much RAM the system has.

We had 20.

> kern.maxvnodes: check what value the system has and increase it. The default value is very conservative. It's the limit of vnodes which the system can maintain in memory.

We had kern.maxvnodes=37669, but during a build I see it only go up to kern.numvnodes=25417, so it doesn't appear we're running into that limit.

@juanfra684 commented Jan 22, 2019

>> About the mount options, use noatime, softdep. Linux is using relatime and a log-backed FS by default, so the comparison with a plain OpenBSD installation is not fair.
>
> We already had noatime, but not softdep.

One of your commands shows: /dev/sd0a on / type ffs (local, wxallowed). That's why I suggested noatime.

>> kern.bufcachepercent=80: the default is 20 and the maximum is 80. The correct percentage depends on how much RAM the system has.
>
> We had 20.

If you have a few GB of RAM, try with 80. If the system crashes, select something lower. The reason OpenBSD is not using 80 by default is that there is a bug somewhere which crashes the system when the value is high. Anyway, most users will never hit that crash.

>> kern.maxvnodes: check what value the system has and increase it. The default value is very conservative. It's the limit of vnodes which the system can maintain in memory.
>
> We had kern.maxvnodes=37669, but during a build I see it only go up to kern.numvnodes=25417, so it doesn't appear we're running into that limit.

A better way to check the correct value for maxvnodes is to use systat -s1 during the build:

```
Namei         Sys-cache    Proc-cache    No-cache
    Calls     hits    %    hits     %    miss   %
      185      131   71      25    14      29  16
```

No-cache will tell you if you have enough vnodes or not. Probably kern.maxvnodes=100000 is a safe value for your system.

Anyway, those things will only speed up the process slightly. There is an underlying problem on OpenBSD slowing down the build and tests.

@bradfitz (Member Author) commented Jan 22, 2019

> One of your commands shows: /dev/sd0a on / type ffs (local, wxallowed). That's why I suggested noatime.

We remount it when the build daemon starts up:

https://github.com/golang/build/blob/c61dc90b4c2e3fd9aff9ce059a985cab89ef9e1b/cmd/buildlet/buildlet.go#L1685

That runs before any build.

@juanfra684 commented Jan 22, 2019

async is faster than softdep and you can't use both, so forget my suggestion about that.

@mdempsky (Member) commented Jan 30, 2019

> What is making OpenBSD choose acpitimer0 instead of tsc?

TSC frequency isn't specified on older CPUs, so OpenBSD tries to compute the frequency from another timecounter. However, there are several ways that this bootstrapping can fail, and unfortunately tsc.c doesn't log why it fails.

If I had to guess, the problem is that tsc.c assumes something went wrong if a timecounter read takes more than 50,000 cycles (RECALIBRATE_SMI_THRESHOLD), and acpitimer is so slow under kvm that it always takes more than this, so tsc.c is never able to bootstrap successfully from it.

> Should we be worried about any problems by forcing this?

Since tsc.c isn't able to calibrate the TSC frequency, it's assuming it's the same as the CPU frequency. As long as this assumption is right, it's fine. But otherwise, there's a chance of clock skew. In particular, if you're measuring the 2m24s build time on the VM itself, there's a chance it simply looks faster because the clock is bad.

I would try a simple validity test of running rdate -p -v time.nist.gov; sleep 60; rdate -p -v time.nist.gov and just make sure the rdate results advanced by a minute. As long as that's the case, then I think we're good to use tsc on GCE.

@mdempsky (Member) commented Jan 30, 2019

@reyk

> A low priority says that a TSC has been found but did not indicate the invariant flag or did not calibrate correctly.

My reading of tsc.c is that we wouldn't even call tc_init(&tsc_timecounter); if the TSC isn't invariant. I think being registered but having a -1000 quality (as seen in @bradfitz's sysctl output above) can only happen if it's invariant/constant but can't be calibrated from another timecounter?

@reyk commented Jan 30, 2019

@mdempsky

Calibration is the hardest part of those TSCs. This code has been tweaked multiple times.

We have to figure out the base frequency of the TSC and this is measured with an algorithm that uses delay() and RDTSC on older CPUs. If the calibration fails it leaves the priority at -1000.

Some Skylake-and-later CPUs provide CPUIDs to read the constant TSC frequency and to skip the calibration. This is the most reliable method but it is also not very generally applicable as it needs a table with hardcoded defaults. See tsc_freq_req() - it potentially needs to be updated for newer CPU models. I guess that this might be the problem in what you’ve seen.

Fun fact: I’ve seen kernel code in FreeBSD that parses the frequency in the CPUID model string as a fallback. But this only worked on old constant TSCs that had the same base frequency as the CPU. For example, it would parse 1.7GHz from “Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz” while this CPU has a TSC that runs at about 2.3GHz.

gopherbot pushed a commit to golang/build that referenced this issue Jan 30, 2019

cmd/debugnewvm: add --just-make flag, more timing info in log
For debugging builder make.bash speed.

Updates golang/go#29223

Change-Id: I030c61ec3fdd7af45c6a96ea5cede0bbb54f97bc
Reviewed-on: https://go-review.googlesource.com/c/160317
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
@bradfitz (Member Author) commented Jan 30, 2019

> In particular, if you're measuring the 2m24s build time on the VM itself, there's a chance it simply looks faster because the clock is bad.

No, it's being measured from a program (on Linux) that drives the buildlet over the network.

> I would try a simple validity test of running rdate -p -v time.nist.gov; sleep 60; rdate -p -v time.nist.gov and just make sure the rdate results advanced by a minute. As long as that's the case, then I think we're good to use tsc on GCE.

Doesn't look great:

```
SERIAL: 2019/01/30 07:59:18 Running /usr/local/bin/bash with args ["bash" "-c" "rdate -p -v time.nist.gov; sleep 60; rdate -p -v time.nist.gov"] and env [...] in dir /tmp/workdir
Wed Jan 30 07:59:11 PST 2019
rdate: adjust local clock by -7.503640 seconds
Wed Jan 30 08:00:07 PST 2019
rdate: adjust local clock by -21.513980 seconds
```

Getting off by 21.5 seconds in 60 seconds looks like the TSC calibration definitely didn't work, which explains why it wasn't selected in the first place.

But then I ran it again (new VM boot, different CPU type?) and:

```
2019/01/30 16:10:40 SERIAL: 2019/01/30 08:10:38 [0xc0001f2c60] Running /usr/local/bin/bash with args ["bash" "-c" "rdate -p -v time.nist.gov; sleep 60; rdate -p -v time.nist.gov"] and env ["HOME=/" "PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin" "PWD=/" "INRC=1" "_=/bin/sh" "WORKDIR=/tmp/workdir" "GOROOT_BOOTSTRAP=/tmp/workdir/go1.4" "TMPDIR=/tmp/workdir/tmp" "GOCACHE=/tmp/workdir/gocache"] in dir /tmp/workdir
Wed Jan 30 08:10:54 PST 2019
rdate: adjust local clock by 0.507475 seconds
Wed Jan 30 08:11:52 PST 2019
rdate: adjust local clock by 0.530718 seconds
2019/01/30 16:11:52 sleep: <nil>, <nil>
```

There the adjustments were much more reasonable, but from the client's point of view it took from 2019/01/30 16:10:40 to 2019/01/30 16:11:52 ... 72 seconds. But maybe rdate takes 6 seconds per run?

I wonder if we're not specifying which CPU type we want so we're getting something random. (https://cloud.google.com/compute/docs/cpu-platforms)

I'd just run -CURRENT with pvclock instead of worrying about TSC, but even though -CURRENT is effectively 6.4 right now, I worry that in a month or two, CURRENT might introduce some breaking change that Go isn't ready for and we'll be unable to rebuild our builder's VM image if we need something new in it.

So I'd like to do one of:

  • fix the TSC calibration
  • somehow backport the pvclock driver to 6.4
  • modify our VM-building make.bash to force a particular snapshot date (say, 2019-01-30) instead of relying on the snapshot du jour. But not sure the mirrors even retain all those old snapshots. Likely not.

Recommendations?

@bradfitz (Member Author) commented Jan 30, 2019

> I wonder if we're not specifying which CPU type we want so we're getting something random.

Nevermind, it's not random. The docs say:

> If you do not actively specify a desired CPU platform, the instance uses the default platform for the zone where the instance will run. You can see a list of zones and their default CPU platforms in the Regions and Zones page.

We run in us-central1-f, which says:

> Available CPU platforms:
> Intel Xeon E5 v2 (Ivy Bridge) platform (default)
> Intel Xeon E5 v3 (Haswell) platform (default from April 5th, 2019)
> Intel Xeon E5 v4 (Broadwell) platform
> Intel Xeon (Skylake) platform

So we're on Ivy Bridge for now, and will switch to Haswell by default in a couple months here.

We could request a different CPU type if the TSC/OpenBSD is better in one of those.

@bradfitz (Member Author) commented Jan 30, 2019

On third run, the whole rdate+sleep 60+rdate took exactly 60 seconds from client's perspective, with 9.4 second drift:

```
2019/01/30 16:15:48 WorkDir: /tmp/workdir, <nil>
2019/01/30 16:15:48 SERIAL: 2019/01/30 08:15:51 [0xc0001f4b00] Running /usr/local/bin/bash with args ["bash" "-c" "rdate -p -v time.nist.gov; sleep 60; rdate -p -v time.nist.gov"] and env ["HOME=/" "PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin" "PWD=/" "INRC=1" "_=/bin/sh" "WORKDIR=/tmp/workdir" "GOROOT_BOOTSTRAP=/tmp/workdir/go1.4" "TMPDIR=/tmp/workdir/tmp" "GOCACHE=/tmp/workdir/gocache"] in dir /tmp/workdir
Wed Jan 30 08:15:49 PST 2019
rdate: adjust local clock by -2.915089 seconds
Wed Jan 30 08:16:48 PST 2019
rdate: adjust local clock by -9.437488 seconds
2019/01/30 16:16:48 sleep: <nil>, <nil>
```
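For scale, those rdate adjustments can be converted into drift rates (throwaway arithmetic; the adjustment values are copied from the runs above):

```go
package main

import "fmt"

// driftPPM converts a clock adjustment over a measurement interval
// into parts-per-million of drift.
func driftPPM(adjustSec, intervalSec float64) float64 {
	return adjustSec / intervalSec * 1e6
}

func main() {
	// Adjustments observed above: -21.5s, -9.4s, and +0.53s over ~60s windows.
	for _, adj := range []float64{21.513980, 9.437488, 0.530718} {
		fmt.Printf("%9.6fs over 60s = %8.0f ppm\n", adj, driftPPM(adj, 60))
	}
}
```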
@mbelop commented Jan 30, 2019

Hi,

I agree with the analysis by @mdempsky that PMTimer is too slow on GCE which prevents TSC calibration.

> So I'd like to do one of:

Probably the easiest until 6.5 arrives in May.

> • pass some addition sysctl to force a certain TSC frequency?

You cannot adjust the TSC frequency via a sysctl interface. It would be useless for our users in the long run anyway.

> • somehow backport the pvclock driver to 6.4

Backporting is straightforward, but would you be able to compile and store the kernel yourself? OpenBSD doesn't backport drivers to -stable due to a short development cycle.

> • modify our VM-building make.bash to force a particular snapshot date (say, 2019-01-30) instead of relying on the snapshot du jour. But not sure the mirrors even retain all those old snapshots. Likely not.

Mirrors don't retain snapshots. If you're able to store and fetch the snapshot yourself that might be a way to go. Then you can schedule updates on your own. FYI, right now LLVM 7 is being integrated into OpenBSD-current.

> Recommendations?

@bradfitz (Member Author) commented Jan 30, 2019

> Probably the easiest until 6.5 arrives in May.

@mbelop, I take that to mean that OpenBSD does do better on Skylake?

At least my test seems to confirm. On Skylake it boots and says:

```
Automatic boot in progress: starting file system checks.
                            /dev/sd0a (f56cff0d5491ef20.a): file system is clean; not checking
                            setting tty flags
                            pf enabled
                            hw.smt: 0 -> 1
                            kern.timecounter.hardware: tsc -> tsc
```

So looks like it had already selected TSC on its own with Skylake and the sysctl was a no-op.

But the rdate result wasn't great:

```
Wed Jan 30 09:02:10 PST 2019
rdate: adjust local clock by -1.932407 seconds
Wed Jan 30 09:03:10 PST 2019
rdate: adjust local clock by -8.596295 seconds
```

8.5 seconds off in 60 seconds.

But fastest build yet: 2m11s. So, yay Skylake?

Maybe the wall clock drift or rdate result doesn't matter.

@mbelop commented Jan 30, 2019

> > Probably the easiest until 6.5 arrives in May.
>
> @mbelop, I take that to mean that OpenBSD does do better on Skylake?
>
> At least my test seems to confirm. On Skylake it boots and says:
>
>     Automatic boot in progress: starting file system checks.
>                                 /dev/sd0a (f56cff0d5491ef20.a): file system is clean; not checking
>                                 setting tty flags
>                                 pf enabled
>                                 hw.smt: 0 -> 1
>                                 kern.timecounter.hardware: tsc -> tsc
>
> So looks like it had already selected TSC on its own with Skylake and the sysctl was a no-op.

As long as your hypervisor sets the CPU model of the VCPU to whatever value Skylake uses, we don't calibrate the TSC at all, since Skylake and Kaby Lake CPUs provide the TSC frequency in one of the CPUID outputs.

> But the rdate result wasn't great:

Supposedly because KVM doesn't compensate for VM switches.
That's what the PVCLOCK driver is supposed to do (the shared page is updated on VMENTER),
and that's why PVCLOCK is the preferred clock source.

> Wed Jan 30 09:02:10 PST 2019
> rdate: adjust local clock by -1.932407 seconds
> Wed Jan 30 09:03:10 PST 2019
> rdate: adjust local clock by -8.596295 seconds
>
> 8.5 seconds off in 60 seconds.
>
> But fastest build yet: 2m11s. So, yay Skylake?
>
> Maybe the wall clock drift or rdate result doesn't matter.

I don't think it matters a whole lot for building and running regress tests, especially since you
probably don't have long-running VMs (i.e. you're probably tearing the VM down after a single build?)

@bradfitz (Member Author) commented Jan 30, 2019

> (i.e. you're probably tearing VM down after a single build?)

Yes.

@paulzhol (Member) commented Jan 30, 2019

Should the FreeBSD builders be made to stop using TSC as well? Can Broadwell be set as the minimum CPU type (I think it's available in all the regions)?

@bradfitz (Member Author) commented Jan 30, 2019

So Skylake alone doesn't seem to do the trick for OpenBSD.

I tried booting our original OpenBSD 6.4 image (without the sysctl to set kern.timecounter.hardware) and it comes up and says...

cpu0: Intel(R) Xeon(R) CPU @ 2.00GHz, 1999.32 MHz, 06-55-03
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL,IBRS,IBPB,STIBP,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache 
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 909MHz  

(The 2.0 GHz to me implies Skylake based on https://cloud.google.com/compute/docs/cpu-platforms at least)

But then when I run sysctl kern.timecounter.hardware:

kern.timecounter.hardware=acpitimer0

And it's slow.

So it seems I still need to force it to TSC?

Or is GCE not giving me Skylake, despite (seemingly?) asking for it?
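For reference, forcing the timecounter boils down to one boot-time sysctl in the builder image (a sketch; I'm assuming the image build scripts write it to /etc/sysctl.conf, which is what the CL mentioned in the thread appears to do):

```
# /etc/sysctl.conf on the builder image: pin the timecounter to the TSC
# instead of the slow acpitimer0 that OpenBSD otherwise falls back to.
kern.timecounter.hardware=tsc
```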

@bradfitz (Member Author) commented Jan 30, 2019

@paulzhol, I don't know. You want to investigate and open a separate bug? But they're not slow (same speed as Linux, which does use the kvm clock), so I'm not worried. Whatever FreeBSD is doing by default seems to work.

@mbelop commented Jan 30, 2019

> So Skylake alone doesn't seem to do the trick for OpenBSD.
>
> I tried booting our original OpenBSD 6.4 image (without the sysctl to set kern.timecounter.hardware) and it comes up and says...
>
> cpu0: Intel(R) Xeon(R) CPU @ 2.00GHz, 1999.32 MHz, 06-55-03
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,AVX512F,AVX512DQ,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,AVX512CD,AVX512BW,AVX512VL,IBRS,IBPB,STIBP,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 909MHz
>
> (The 2.0 GHz to me implies Skylake based on https://cloud.google.com/compute/docs/cpu-platforms at least)

As you can see from 06-55-03, the CPU model is 0x55 (most likely a Skylake X), while the models recognized by the kernel are 0x4e (mobile) or 0x5e (desktop). Skylake X is no longer part of the fixed frequency list since this commit: openbsd/src@c0b7f8b. Unfortunately, I don't have any further information as to why that is so.

> But then when I run sysctl kern.timecounter.hardware:
>
> kern.timecounter.hardware=acpitimer0
>
> And it's slow.
>
> So it seems I still need to force it to TSC?
>
> Or is GCE not giving me Skylake, despite (seemingly?) asking for it?

@gopherbot commented Jan 30, 2019

Change https://golang.org/cl/160457 mentions this issue: buildlet, dashboard: add min CPU platform knob, require Skylake for OpenBSD

@bradfitz (Member Author) commented Jan 30, 2019

> As you can see from 06-55-03 the CPU model is 0x55 (most likely a Skylake X) while models recognized by the kernel are 0x4e (mobile) or 0x5e (desktop). Skylake X is no longer part of the fixed frequency list since this commit: openbsd/src@c0b7f8b. Unfortunately, I don't have any further information as to why that is so.

Linux fix is torvalds/linux@b511203

https://lwn.net/Articles/752159/ says:

> * Intel Core X-Series (Skylake) processors use a hardcoded Time Stamp
> Counter (TSC) frequency of 25 MHz. In some cases this can be imprecise and
> lead to timing-related problems such as time drift, timers being triggered
> early, or TSC clock instability. This update mitigates these problems by no
> longer using the "native_calibrate_tsc()" function to define the TSC
> frequency. Refined calibration is now used to update the clock rate
> accordingly in these cases. (BZ#1547854)

So Linux redoes calibration on this CPU. OpenBSD gives up on it.

I'm starting to think mirroring ~today's CURRENT snapshot to our own storage for reproducible VM image builds might be the best path forward, rather than backporting pvclock to 6.4 or worrying about the TSC / CPU issues.

@paulzhol (Member) commented Jan 30, 2019

> @paulzhol, I don't know. You want to investigate and open a separate bug? But they're not slow (same speed as Linux, which does use the kvm clock), so I'm not worried. Whatever FreeBSD is doing by default seems to work.

@bradfitz FreeBSD is not doing the right thing by default; I've forced it to use TSC in golang/build@2e001a4#diff-ca6ebc387c9f22d5808dfac3060344bd

based on https://lists.freebsd.org/pipermail/freebsd-cloud/2017-January/000080.html (so the 10.4, 11.2 and 12.0 images are affected).

That's why I'm asking if we should revert (or maybe you can request Broadwell/Skylake machines for the FreeBSD builders as well, since no kvmclock is being used).

@mbelop commented Jan 30, 2019

> As you can see from 06-55-03 the CPU model is 0x55 (most likely a Skylake X) while models recognized by the kernel are 0x4e (mobile) or 0x5e (desktop). Skylake X is no longer part of the fixed frequency list since this commit: openbsd/src@c0b7f8b. Unfortunately, I don't have any further information as to why that is so.
>
> Linux fix is torvalds/linux@b511203
>
> https://lwn.net/Articles/752159/ says:
>
> * Intel Core X-Series (Skylake) processors use a hardcoded Time Stamp
> Counter (TSC) frequency of 25 MHz. In some cases this can be imprecise and
> lead to timing-related problems such as time drift, timers being triggered
> early, or TSC clock instability. This update mitigates these problems by no
> longer using the "native_calibrate_tsc()" function to define the TSC
> frequency. Refined calibration is now used to update the clock rate
> accordingly in these cases. (BZ#1547854)

Doesn't really explain much (-;
I take this ^^^^ back since the commit does explain what happened.

> So Linux redoes calibration on this CPU. OpenBSD gives up on it.

This isn't correct. We don't give up immediately: we attempt to recalibrate with a PM timer or an HPET timer, except that it doesn't work for you. Then we give up.

> I'm starting to think mirroring ~today's CURRENT snapshot to our own storage for reproducible VM image builds might be the best path forward, rather than backporting pvclock to 6.4 or worrying about the TSC / CPU issues.

I can provide a backport patch for 6.4, but I don't know how you would go about integrating it into your test system. I totally support your decision to move to -current, however.

@paulzhol (Member) commented Jan 30, 2019

A relevant discussion about TSC calibration (on FreeBSD): http://freebsd.1045724.x6.nabble.com/TSC-calibration-in-virtual-machines-td6266249.html#a6266308

> It seems that TSC calibration in virtual machines sometimes can do more harm
> than good. Should we default to trusting the information provided by a hypervisor?

@mbelop commented Jan 30, 2019

> A relevant discussion about TSC calibration (on FreeBSD): http://freebsd.1045724.x6.nabble.com/TSC-calibration-in-virtual-machines-td6266249.html#a6266308
>
> > It seems that TSC calibration in virtual machines sometimes can do more harm
> > than good. Should we default to trusting the information provided by a hypervisor?

Pardon me if this is just taken out of context, but this question sounds silly: if you're running under a hypervisor that has total control of the guest's memory and execution flow, you're trusting it by the very definition. There's no question.

@bradfitz (Member Author) commented Jan 30, 2019

> This isn't correct. We don't give up immediately, we attempt to recalibrate with a PM timer or an HPET timer, except that it doesn't work for you. Then we give up.

Ah, okay. Thanks.

It'd be nice of course if it worked, or at least whitelisted GCE as having a correctly advertised value like it sounds like FreeBSD is considering.

I'll probably go the -current route.

@mbelop commented Jan 30, 2019

> > This isn't correct. We don't give up immediately, we attempt to recalibrate with a PM timer or an HPET timer, except that it doesn't work for you. Then we give up.
>
> Ah, okay. Thanks.
>
> It'd be nice of course if it worked, or at least whitelisted GCE as having a correctly advertised value like it sounds like FreeBSD is considering.

Correctly advertised value of what?

> I'll probably go the -current route.

@bradfitz (Member Author) commented Jan 30, 2019

> Correctly advertised value of what?

It was my impression from that FreeBSD thread that some hypervisors advertise the TSC frequency. For instance, VMware does:

https://svnweb.freebsd.org/base/head/sys/x86/x86/tsc.c?r1=221178&r2=221214&pathrev=221214

That thread made it sound like GCE does something similar.

@mbelop commented Jan 30, 2019

> > Correctly advertised value of what?
>
> It was my impression from that FreeBSD thread that some hypervisors advertise the TSC frequency. For instance, VMware does:
>
> https://svnweb.freebsd.org/base/head/sys/x86/x86/tsc.c?r1=221178&r2=221214&pathrev=221214
>
> That thread made it sound like GCE does something similar.

Yeah, I don't think that's the case. But if you do, please let us know.

@paulzhol (Member) commented Jan 30, 2019

> Yeah, I don't think that's the case. But if you do, please let us know.

Doesn't GCE (kvm) have this commit qemu/qemu@9954a15 ?

@bradfitz (Member Author) commented Jan 30, 2019

> Doesn't GCE (kvm) have this commit qemu/qemu@9954a15 ?

GCE uses KVM but with its own implementation; it doesn't use QEMU. So it might do something like that, but that QEMU code wouldn't tell us whether it does.

@paulzhol (Member) commented Jan 30, 2019

So if I'm reading https://www.kernel.org/doc/Documentation/virtual/kvm/timekeeping.txt correctly

> 3.7) TSC virtualization - VMX
>
> VMX provides conditional trapping of RDTSC, RDMSR, WRMSR and RDTSCP
> instructions, which is enough for full virtualization of TSC in any manner. In
> addition, VMX allows passing through the host TSC plus an additional TSC_OFFSET
> field specified in the VMCS. Special instructions must be used to read and
> write the VMCS field.

My understanding is that if the hypervisor reports a TSC to the guest, then I assume it also does all the validation required (trapping RDTSC instructions if necessary, keeping the vCPUs' TSCs in sync, etc.) and no calibration should be required (or possible) in the guest. So we are safe to keep forcing TSC on Broadwell/Skylake hosts until something like https://svnweb.freebsd.org/base/head/sys/x86/x86/tsc.c?r1=221178&r2=221214&pathrev=221214 is applied upstream for GCE/KVM?

gopherbot pushed a commit to golang/build that referenced this issue Jan 30, 2019

buildlet, dashboard: add min CPU platform knob, require Skylake for OpenBSD

I thought this would be enough for OpenBSD to select the TSC on its own without
being forced to (as in CL 160319), but apparently it is not:

    golang/go#29223 (comment)

So it seems like we want both this CL and CL 160319.

Updates golang/go#29223

Change-Id: I0a092d62881d8dcce0ef1129d8d32d8f4025b6ac
Reviewed-on: https://go-review.googlesource.com/c/160457
Reviewed-by: Andrew Bonventre <andybons@golang.org>