x/build: monitor/graph GCE instance-create-to-buildlet latencies #21148

bradfitz opened this issue Jul 24, 2017 · 5 comments

@bradfitz bradfitz commented Jul 24, 2017

Over the past few days our Windows GCE instances appear to be created successfully, but the buildlet then fails to come up within 5 minutes.


Also, we need to monitor & alert on this.

/cc @adams-sarah @cybrcodr @johnsonj

@gopherbot gopherbot added this to the Unreleased milestone Jul 24, 2017

@bradfitz bradfitz commented Jul 24, 2017

(The build system does retry, though, and it seems to eventually work. But something's being flaky and thus our builds and trybots are slow.)


@johnsonj johnsonj commented Jul 24, 2017

+1 on monitor/alert. Looks like the buildlet process starts but then nothing:

Serial console output for buildlet-windows-amd64-2012-rnb5c1b2a

 SeaBIOS (version 1.8.2-20170419_170401-google)
Total RAM Size = 0x00000000e6600000 = 3686 MiB
CPUs found: 4     Max CPUs supported: 4
found virtio-scsi at 0:3
virtio-scsi vendor='Google' product='PersistentDisk' rev='1' type=0 removable=0
virtio-scsi blksize=512 sectors=104857600 = 51200 MiB
drive 0x000f31a0: PCHS=0/0/0 translation=lba LCHS=1024/255/63 s=104857600
Booting from Hard Disk 0...
7/24/2017 7:55:26 PM UTC: GCE Agent started (version
7/24/2017 7:55:28 PM UTC: Starting startup scripts (version
7/24/2017 7:55:33 PM UTC: Finished running startup scripts.
2017/07/24 19:55:51 buildlet starting.

@johnsonj johnsonj commented Jul 24, 2017

Created a builder and captured console output:

2017/07/24 20:31:07 network is up.
2017/07/24 20:31:07 Downloading to .\buildlet.exe ...
2017/07/24 20:31:07 Downloaded .\buildlet.exe (7617536 bytes)
fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x0 addr=0xffffffffffffffff pc=0x427e42]

runtime stack:
runtime.throw(0x7620f5, 0x2a)
        /home/bradfitz/go/src/runtime/panic.go:605 +0x9c
        /home/bradfitz/go/src/runtime/signal_windows.go:155 +0x184
runtime.netpoll(0xc042019901, 0xc042019901)
        /home/bradfitz/go/src/runtime/netpoll_windows.go:105 +0x332
runtime.findrunnable(0xc042016000, 0x0)
        /home/bradfitz/go/src/runtime/proc.go:2107 +0x610
        /home/bradfitz/go/src/runtime/proc.go:2245 +0x13a
        /home/bradfitz/go/src/runtime/proc.go:2396 +0x24b
        /home/bradfitz/go/src/runtime/asm_amd64.s:286 +0x5e

goroutine 1 [select]:
net/http.(*Transport).getConn(0x8fe240, 0xc0421720f0, 0x0, 0xc04217e000, 0x4, 0x
c04216c120, 0x12, 0x0, 0x0, 0xc042067648)
        /home/bradfitz/go/src/net/http/transport.go:948 +0x5c6
net/http.(*Transport).RoundTrip(0x8fe240, 0xc042182000, 0x8fe240, 0x0, 0x0)
        /home/bradfitz/go/src/net/http/transport.go:400 +0x6ad
net/http.send(0xc042182000, 0x8c92c0, 0x8fe240, 0x0, 0x0, 0x0, 0xc04216e020, 0x1
00, 0xc0420679c8, 0x1)
        /home/bradfitz/go/src/net/http/client.go:249 +0x1b0
net/http.(*Client).send(0x8f92a0, 0xc042182000, 0x0, 0x0, 0x0, 0xc04216e020, 0x0
, 0x1, 0x4)
        /home/bradfitz/go/src/net/http/client.go:173 +0x104
net/http.(*Client).Do(0x8f92a0, 0xc042182000, 0xa, 0x757505, 0x11)
        /home/bradfitz/go/src/net/http/client.go:602 +0x294, 0xc04216c100, 0x1c, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0)
+0x1f7, 0x1c, 0x14, 0x7543c6, 0x8
, 0xc04216c100)
+0x48, 0x8, 0xc04
2067d98, 0x42b07d, 0x76ce20, 0xc042067da8)
main.metadataValue(0x7543c6, 0x8, 0x0, 0x0)
        /home/bradfitz/src/ +0x40
main.defaultListenAddr(0x757af6, 0x12)
        /home/bradfitz/src/ +0x4e
        /home/bradfitz/src/ +0x8b

goroutine 49 [IO wait]:
internal/poll.runtime_pollWait(0x2d4e40, 0x77, 0xc04218a0b8)
        /home/bradfitz/go/src/runtime/netpoll.go:173 +0x5e
internal/poll.(*pollDesc).wait(0xc04218a158, 0x77, 0xc04216a000, 0x0, 0x0)
        /home/bradfitz/go/src/internal/poll/fd_poll_runtime.go:85 +0xb5
internal/poll.(*ioSrv).ExecIO(0x900e28, 0xc04218a0b8, 0x76c4b8, 0xc04214b1a8, 0x
c04214b1b0, 0xc04214b1a0)
        /home/bradfitz/go/src/internal/poll/fd_windows.go:191 +0x126
internal/poll.(*FD).ConnectEx(0xc04218a000, 0x8c9b00, 0xc04216c140, 0xc042162240
, 0xc04218a000)
        /home/bradfitz/go/src/internal/poll/fd_windows.go:721 +0x80
net.(*netFD).connect(0xc04218a000, 0x8cdf80, 0xc042162240, 0x0, 0x0, 0x8c9b00, 0
xc04216c140, 0x0, 0x0, 0x0, ...)
        /home/bradfitz/go/src/net/fd_windows.go:116 +0x243
net.(*netFD).dial(0xc04218a000, 0x8cdf80, 0xc042162240, 0x8cf240, 0x0, 0x8cf240,
 0xc0421721b0, 0xc04214b3a0, 0x56a395)
        /home/bradfitz/go/src/net/sock_posix.go:142 +0xf3
net.socket(0x8cdf80, 0xc042162240, 0x7533a6, 0x3, 0x2, 0x1, 0x0, 0x0, 0x8cf240,
0x0, ...)
        /home/bradfitz/go/src/net/sock_posix.go:93 +0x1c1
net.internetSocket(0x8cdf80, 0xc042162240, 0x7533a6, 0x3, 0x8cf240, 0x0, 0x8cf24
0, 0xc0421721b0, 0x1, 0x0, ...)
        /home/bradfitz/go/src/net/ipsock_posix.go:141 +0x158
net.doDialTCP(0x8cdf80, 0xc042162240, 0x7533a6, 0x3, 0x0, 0xc0421721b0, 0x920f40
, 0x0, 0x0)
        /home/bradfitz/go/src/net/tcpsock_posix.go:62 +0xc0
net.dialTCP(0x8cdf80, 0xc042162240, 0x7533a6, 0x3, 0x0, 0xc0421721b0, 0xbe55b423
665a8374, 0x77d9bc38, 0x902ee0)
        /home/bradfitz/go/src/net/tcpsock_posix.go:58 +0xeb
net.dialSingle(0x8cdf80, 0xc042162240, 0xc042180080, 0x8cbd00, 0xc0421721b0, 0x0
, 0x0, 0x0, 0x0)
        /home/bradfitz/go/src/net/dial.go:547 +0x3e9
net.dialSerial(0x8cdf80, 0xc042162240, 0xc042180080, 0xc042186090, 0x1, 0x1, 0x0
, 0x0, 0x0, 0x0)
        /home/bradfitz/go/src/net/dial.go:515 +0x24e
net.(*Dialer).DialContext(0xc042096120, 0x8cdf40, 0xc04204c078, 0x7533a6, 0x3, 0
xc04216c120, 0x12, 0x0, 0x0, 0x0, ...)
        /home/bradfitz/go/src/net/dial.go:397 +0x6f5
net.(*Dialer).Dial(0xc042096120, 0x7533a6, 0x3, 0xc04216c120, 0x12, 0x1240042176
120, 0x110, 0x110, 0xc042188000)
        /home/bradfitz/go/src/net/dial.go:320 +0x7c
net.(*Dialer).Dial-fm(0x7533a6, 0x3, 0xc04216c120, 0x12, 0xc042186060, 0xc042117
998, 0x403580, 0x60)
        /home/bradfitz/src/ +
net/http.(*Transport).dial(0x8fe240, 0x8cdf40, 0xc04204c078, 0x7533a6, 0x3, 0xc0
4216c120, 0x12, 0x0, 0x0, 0x0, ...)
        /home/bradfitz/go/src/net/http/transport.go:887 +0x82
net/http.(*Transport).dialConn(0x8fe240, 0x8cdf40, 0xc04204c078, 0x0, 0xc04217e0
00, 0x4, 0xc04216c120, 0x12, 0x0, 0xc042130120, ...)
        /home/bradfitz/go/src/net/http/transport.go:1060 +0x1d69
net/http.(*Transport).getConn.func4(0x8fe240, 0x8cdf40, 0xc04204c078, 0xc0421721
20, 0xc042176060)
        /home/bradfitz/go/src/net/http/transport.go:943 +0x7f
created by net/http.(*Transport).getConn
        /home/bradfitz/go/src/net/http/transport.go:942 +0x39a

goroutine 50 [select]:
net.(*netFD).connect.func2(0x8cdf80, 0xc042162240, 0xc04218a000, 0xc0421761e0)
        /home/bradfitz/go/src/net/fd_windows.go:105 +0xf9
created by net.(*netFD).connect
        /home/bradfitz/go/src/net/fd_windows.go:104 +0x218
2017/07/24 20:31:07 Error running buildlet: exit status 2
2017/07/24 20:31:07 (sleeping for 1 minute before failing)

@gopherbot gopherbot commented Jul 24, 2017

CL mentions this issue.


@bradfitz bradfitz commented Jul 24, 2017

A few days ago I'd replaced the Windows buildlet with a Go 1.9-built one.

I've reverted it to a Go 1.8-built one and it now seems to work again.

That's disconcerting, so I'm hoping I had unrelated code changes in there too. I'm going to try to repro in staging. I really hope we don't have Go 1.9-on-Windows/GCE problems.

@golang golang locked and limited conversation to collaborators Jul 31, 2018