Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build: reverse pool locking problem in the coordinator #10750

bradfitz opened this issue May 7, 2015 · 2 comments


None yet
2 participants
Copy link

commented May 7, 2015

@crawshaw, is hanging. Interesting stack goroutine:

goroutine 46046 [semacquire, 18 minutes]:
    /home/bradfitz/go/src/runtime/sema.go:43 +0x2d
    /home/bradfitz/go/src/sync/mutex.go:82 +0x1c7
main.(*reverseBuildletPool).WriteHTMLStatus(0xc733c0, 0x7f27f805d588, 0xc20883aaf0)
    /home/bradfitz/src/ +0x16a
main.handleStatus(0x7f27f806bfb8, 0xc208503970, 0xc20821b1e0)
    /home/bradfitz/src/ +0xf61


goroutine 132 [semacquire, 23 minutes]:
    /home/bradfitz/go/src/runtime/sema.go:43 +0x2d
    /home/bradfitz/go/src/sync/mutex.go:82 +0x1c7
main.(*reverseBuildletPool).tryToGrab(0xc733c0, 0x9f50f0, 0xe, 0x0, 0x0, 0x0)
    /home/bradfitz/src/ +0x64
main.(*reverseBuildletPool).GetBuildlet(0xc733c0, 0x9f50f0, 0xe, 0xc2083df4a0, 0x28, 0x7f27f80739c0, 0xc2082addc0, 0xc2083c0a00, 0x0, 0x0)
    /home/bradfitz/src/ +0x67
main.(*buildStatus).build(0xc2082addc0, 0x0, 0x0)
    /home/bradfitz/src/ +0x1f6
    /home/bradfitz/src/ +0x33
created by main.(*buildStatus).start
    /home/bradfitz/src/ +0x7e


This comment has been minimized.

Copy link
Member Author

commented May 7, 2015

In the reverse buildlet healthcheck, this channel receive is blocking for 33+ minutes while holding the mutex:

// reverseHealthCheck requests the status page of each idle buildlet.                                                                                              
// If the buildlet fails to respond promptly, it is removed from the pool.                                                                                         
func (p *reverseBuildletPool) reverseHealthCheck() {
        responses := make(map[*reverseBuildlet]chan error)
        for _, b := range p.buildlets {
                if b.inUseAs == "health" { // sanity check                                                                                                         
                        panic("previous health check still running")
                if b.inUseAs != "" {
                        continue // skip busy buildlets                                                                                                            
                b.inUseAs = "health"
                res := make(chan error, 1)
                responses[b] = res
                client := b.client
                go func() {
                        _, err := client.Status()
                        res <- err
        time.Sleep(5 * time.Second) // give buildlets time to respond                                                                                             

        var buildlets []*reverseBuildlet
        for _, b := range p.buildlets {
                res := responses[b]
                if b.inUseAs != "health" || res == nil {
                        // buildlet skipped or registered after health check                                                                                       
                        buildlets = append(buildlets, b)
                b.inUseAs = ""
                err, done := <-res     // <------------ HERE

(that final line)


This comment has been minimized.

Copy link

commented May 7, 2015

CL mentions this issue.

@golang golang locked and limited conversation to collaborators Jun 25, 2016

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.