x/build: monitor all the parts of the build system #15760

Open
bradfitz opened this Issue May 19, 2016 · 9 comments

@bradfitz
Member

bradfitz commented May 19, 2016

The watcher is not pushing new commits to build.golang.org.

I see:

  949 ?        Ssl    8:51 docker daemon --host=fd:// --selinux-enabled
17215 ?        Ssl   15:55  \_ /usr/local/bin/watcher -role=watcher -watcher.repo=https://go.googlesource.com/go -watcher.dash=https://build.golang.org/ -watcher.poll=10s -watcher.http=127.0.0.1:21536 -watcher.mirror=htt
 5570 ?        Z      0:00      \_ [git-remote-http] <defunct>
 5608 ?        Z      0:00      \_ [git-remote-http] <defunct>
 5618 ?        Z      0:00      \_ [git-remote-http] <defunct>
16843 ?        S      1:34      \_ git push -f --mirror dest
18371 ?        S      0:00      |   \_ git-remote-https dest https://gopherbot:[redacted]@github.com/golang/go
18178 ?        S      0:00      \_ git push -f --mirror dest
18389 ?        S      0:00      |   \_ git-remote-https dest https://gopherbot:[redacted]@github.com/golang/mobile
18335 ?        S      0:00      \_ git push -f --mirror dest
18387 ?        R      0:00          \_ git-remote-https dest https://gopherbot:[redacted]@github.com/golang/oauth2

And http://farmer.golang.org/debug/watcher at the end says only:

2016/05/19 18:43:40 go: sending commit to dashboard: ---40hexomitted---[master]("build: unset GOBIN during build")
2016/05/19 19:29:24 exp: found new commit ---40hexomitted---("shiny/widget/flex: add HTML printing of tests")
2016/05/19 19:29:24 exp: updated branch head: "master"(Head: ---40hexomitted---[master]("shiny/widget/flex: add HTML printing of tests") LastSeen: ---40hexomitted---[master]("shiny/driver/x11driver: have textureImpl.draw use an opaque mask."))
2016/05/19 19:29:24 exp: sending commit to dashboard: ---40hexomitted---[master]("shiny/widget/flex: add HTML printing of tests")
2016/05/19 19:36:46 exp: found new commit ---40hexomitted---("shiny/widget/flex: basics of flex algorithm")
2016/05/19 19:36:46 exp: updated branch head: "master"(Head: ---40hexomitted---[master]("shiny/widget/flex: basics of flex algorithm") LastSeen: ---40hexomitted---[master]("shiny/widget/flex: add HTML printing of tests"))
2016/05/19 19:36:46 exp: sending commit to dashboard: ---40hexomitted---[master]("shiny/widget/flex: basics of flex algorithm")
2016/05/19 22:45:58 exp: found new commit ---40hexomitted---("shiny/driver/x11driver: tighten the textureImpl.draw dst rectangle.")
2016/05/19 22:45:58 exp: updated branch head: "master"(Head: ---40hexomitted---[master]("shiny/driver/x11driver: tighten the textureImpl.draw dst rectangle.") LastSeen: ---40hexomitted---[master]("shiny/widget/flex: basics of flex algorithm"))
2016/05/19 22:45:58 exp: sending commit to dashboard: ---40hexomitted---[master]("shiny/driver/x11driver: tighten the textureImpl.draw dst rectangle.")

Server time is:

$ date
Thu May 19 23:06:28 UTC 2016

So it's just the go repo that's hung?

But the git subprocess for the go repo keeps coming & going. It's not hung there.

We should have a debugging endpoint to get the watcher's goroutines too.
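
A minimal sketch of such an endpoint, assuming the watcher can register one more handler on its existing debug HTTP server. The path /debug/watcher/goroutines and the names goroutineDumpHandler and dumpOnce are hypothetical, not taken from the watcher's code. It relies on the standard runtime/pprof goroutine profile, which at debug level 2 prints stacks in the same format as an unrecovered panic.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"runtime/pprof"
	"strings"
)

// goroutineDumpHandler writes the stacks of all goroutines in this process.
// debug=2 selects the same text format that an unrecovered panic prints.
func goroutineDumpHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	if err := pprof.Lookup("goroutine").WriteTo(w, 2); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// dumpOnce exercises the handler in-process and returns the response body.
func dumpOnce() string {
	rec := httptest.NewRecorder()
	// Hypothetical path; any path on the debug mux would do.
	req := httptest.NewRequest("GET", "/debug/watcher/goroutines", nil)
	goroutineDumpHandler(rec, req)
	return rec.Body.String()
}

func main() {
	// In a real binary the handler would be registered on the debug mux, e.g.
	//   http.HandleFunc("/debug/watcher/goroutines", goroutineDumpHandler)
	out := dumpOnce()
	// Print only the header line of the first goroutine's stack.
	fmt.Println(strings.SplitN(out, "\n", 2)[0])
}
```

Importing net/http/pprof for its side effects is an alternative: it registers the standard /debug/pprof/ handlers, including the goroutine profile, on http.DefaultServeMux.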

I'll kick it for now.

This also needs to be monitored.

@bradfitz bradfitz added the Builders label May 19, 2016

@bradfitz bradfitz added this to the Unreleased milestone May 19, 2016

@adg adg changed the title from x/build: watcher is hung to x/build: monitor that the watcher is doing its job May 20, 2016

@adg

Contributor

adg commented May 20, 2016

The watcher has caught up again, so this issue is now about watching the watcher.
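
One way "watching the watcher" could look is sketched below, as an assumption rather than a description of what x/build does. Each repo's poll loop records a heartbeat when an iteration completes, and a monitor flags repos whose heartbeat is older than a threshold. A heartbeat per poll is used instead of "time since last commit" because a quiet repo legitimately goes hours without commits. All names here (heartbeats, beat, stale, demo) are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// heartbeats records, per repo, the last time its poll loop completed.
type heartbeats struct {
	mu   sync.Mutex
	last map[string]time.Time
}

func newHeartbeats() *heartbeats {
	return &heartbeats{last: make(map[string]time.Time)}
}

// beat marks repo as having completed a poll iteration at time now.
func (h *heartbeats) beat(repo string, now time.Time) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.last[repo] = now
}

// stale returns, in sorted order, the repos whose last heartbeat
// is older than maxAge as of now.
func (h *heartbeats) stale(now time.Time, maxAge time.Duration) []string {
	h.mu.Lock()
	defer h.mu.Unlock()
	var out []string
	for repo, t := range h.last {
		if now.Sub(t) > maxAge {
			out = append(out, repo)
		}
	}
	sort.Strings(out)
	return out
}

// demo simulates one repo that stopped polling and one that is healthy.
// The timestamps are made up for illustration.
func demo() []string {
	h := newHeartbeats()
	base := time.Date(2016, 1, 1, 0, 0, 0, 0, time.UTC)
	h.beat("go", base)                       // last polled 10m30s before the check
	h.beat("exp", base.Add(10*time.Minute)) // last polled 30s before the check
	return h.stale(base.Add(10*time.Minute+30*time.Second), time.Minute)
}

func main() {
	fmt.Println(demo()) // prints "[go]"
}
```

With a 10s poll interval, a threshold of a minute or so would tolerate a few slow polls while still catching a loop that has stopped making progress. The right value is a judgment call.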

@quentinmit

Contributor

quentinmit commented Jul 15, 2016

This sounds like the fd leak I noticed recently. Do you remember if you noticed a large number of fds when you saw this in May?

@adg

Contributor

adg commented Jul 15, 2016

I didn't notice, sorry.

@quentinmit

Contributor

quentinmit commented Sep 29, 2016

The FD leak that caused most of these problems was fixed with golang/build@44e74d5.
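
A leak like this is the kind of thing monitoring could catch early. As a rough sketch, assuming a Linux host where /proc/self/fd lists the process's open descriptors, a process can count its own fds and compare against a threshold. The names countEntries and overLimit and the limit of 1000 are hypothetical, chosen only for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// countEntries returns the number of entries in dir.
// On Linux, dir = "/proc/self/fd" yields the process's open file
// descriptors (including the one briefly opened to read the directory).
func countEntries(dir string) (int, error) {
	ents, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return len(ents), nil
}

// overLimit reports whether n open descriptors exceeds the alert threshold.
func overLimit(n, limit int) bool {
	return n > limit
}

func main() {
	n, err := countEntries("/proc/self/fd")
	if err != nil {
		// Expected on systems without a Linux-style /proc.
		fmt.Println("cannot count fds:", err)
		return
	}
	fmt.Printf("open fds: %d, over limit: %v\n", n, overLimit(n, 1000))
}
```

Exporting the count on a status page or metrics endpoint would make a slow leak visible as a rising line well before the process runs out of descriptors.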

Can we close this issue for now and reopen if we have different problems with the watcher again?

@bradfitz

Member

bradfitz commented Sep 29, 2016

This bug is about setting up monitoring, which we still don't have.

@bradfitz bradfitz changed the title from x/build: monitor that the watcher is doing its job to x/build: monitor all the parts of the build system Feb 25, 2017

@gopherbot

gopherbot commented Feb 25, 2017

CL https://golang.org/cl/37457 mentions this issue.

gopherbot pushed a commit to golang/build that referenced this issue Feb 25, 2017

cmd/coordinator: clean up reverse buildlet code, export status JSON
Will be used for dynamic creation/destruction of Mac VMs in subsequent CL.

Updates golang/go#9495 (Mac virtualization)
Updates golang/go#15760 (monitoring)

Change-Id: I48b17589b258d5d742bad5a3ddae18de98778149
Reviewed-on: https://go-review.googlesource.com/37457
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

@bradfitz bradfitz assigned cybrcodr and adams-sarah and unassigned adg Apr 12, 2017

gopherbot pushed a commit to golang/build that referenced this issue Apr 12, 2017

cmd/coordinator: break up status into active vs pending builds
Currently all builds start and think they're running, but most are
just fighting over a mutex to grab a builder. That will be fixed, but
in the meantime it's nice to see what's actually working vs what's
waiting on e.g. arm5 hardware which won't be available for hours.

This is a baby step towards more monitoring. Currently this is just HTML
output, but the same data could be exported via JSON or something else later
for graphing.

Updates golang/go#19178 (add a buildlet scheduler)
Updates golang/go#15760 (monitor everything)

Change-Id: I36e16ea0919afe8023fe7fedd981f2e857f0d6df
Reviewed-on: https://go-review.googlesource.com/40397
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
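
The active/pending split and the JSON export mentioned in the commit message above can be sketched as follows. This is not the coordinator's actual code: the build and status types, the HasBuildlet field, and the builder names are hypothetical stand-ins for whatever the coordinator tracks.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// build is a hypothetical, simplified view of one build.
type build struct {
	Builder     string `json:"builder"`
	HasBuildlet bool   `json:"hasBuildlet"` // true once a builder has been acquired
}

// status groups builds by whether they are doing work or still waiting.
type status struct {
	Active  []build `json:"active"`
	Pending []build `json:"pending"`
}

// splitStatus partitions builds into those that hold a buildlet (active)
// and those still waiting for one (pending).
func splitStatus(all []build) status {
	// Start with empty, non-nil slices so JSON shows [] instead of null.
	st := status{Active: []build{}, Pending: []build{}}
	for _, b := range all {
		if b.HasBuildlet {
			st.Active = append(st.Active, b)
		} else {
			st.Pending = append(st.Pending, b)
		}
	}
	return st
}

// statusJSON renders the split status as JSON suitable for graphing tools.
func statusJSON(all []build) (string, error) {
	b, err := json.Marshal(splitStatus(all))
	return string(b), err
}

func main() {
	out, err := statusJSON([]build{
		{Builder: "linux-amd64", HasBuildlet: true}, // illustrative names
		{Builder: "linux-arm5", HasBuildlet: false},
	})
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(out)
}
```

Keeping the partition in one function means the HTML status page and a JSON endpoint can share the same data, which matches the commit's note that the HTML output could later be exported in another format.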

@gopherbot

gopherbot commented Apr 17, 2017

CL https://golang.org/cl/40397 mentions this issue.

@gopherbot

gopherbot commented Jul 5, 2017

CL https://golang.org/cl/47490 mentions this issue.

@gopherbot

gopherbot commented Jul 10, 2017

CL https://golang.org/cl/47934 mentions this issue.

gopherbot pushed a commit to golang/build that referenced this issue Jul 20, 2017

cmd/coordinator: add basic monitoring for reverse buildlets
Carried over from https://golang.org/cl/47490.

Updates golang/go#15760

Change-Id: I8b4cc007dea8e32a23cac4cb13bb313d9ec5d4ac
Reviewed-on: https://go-review.googlesource.com/47934
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>