runtime: tight loop hangs process completely after some time #15442

Closed
creker opened this Issue Apr 26, 2016 · 26 comments

creker commented Apr 26, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    go version go1.6.2 windows/amd64
  2. What operating system and processor architecture are you using (go env)?
    Windows 10.0.10586 amd64
  3. What did you do?
    Ran this code
package main

import (
    "log"
    "runtime"
)

func main() {
    runtime.GOMAXPROCS(2)
    ch := make(chan bool)

    go func() {
        for {
            ch <- true
            log.Println("sent")
        }
    }()

    go func() {
        for {
            <-ch
            log.Println("received")
        }
    }()

    for {   
    }
}
  4. What did you expect to see?
    Process printing "sent" and "received" until terminated
  5. What did you see instead?
    Process runs and prints as expected for about 2 seconds and then hangs. Nothing is printed after that, process just eats up CPU. No panics or anything.

I put runtime.GOMAXPROCS(2) to make sure that there are multiple threads that goroutines can run on. Obviously, with runtime.GOMAXPROCS(1) the process would hang immediately, as expected: the for loop will not yield execution.

I tried to replace the for loop with this so that main goroutine can yield execution:

go func() {
    for {
    }
}()

select {}

But exactly the same thing happens. Now, if I put time.Sleep(10 * time.Millisecond) or longer after log.Println("sent"), the process no longer hangs. I ran it for a minute and it just kept going. I don't know, maybe it would still hang much later. If I change it to 2 ms, it hangs after 30 seconds. I tried to collect trace data, but it looks like it gets corrupted because the trace doesn't finish. When I try to view the trace it says "failed to parse trace: no EvFrequency event".

Everything behaves exactly the same on Mac OS X El Capitan 10.11.4 (15E65) with Go 1.6.2.

I read #10958, but the weird thing here is that it actually runs completely fine for a while and only then hangs.

@ianlancetaylor ianlancetaylor changed the title from Tight loop hangs process completely after some time to runtime: tight loop hangs process completely after some time Apr 26, 2016

ianlancetaylor commented Apr 26, 2016

I cannot recreate the problem on GNU/Linux (using the select {} version; I don't think the for {} version is interesting for us). I don't see how this could be Windows-specific, but could somebody with a Windows machine try to recreate the problem on Windows? Thanks.

creker commented Apr 26, 2016

It's not Windows-specific. The same thing happens on OS X.

Just tested both versions in an Ubuntu 14.04 LTS (3.13.0-24-generic) virtual machine with Go 1.6.2 64-bit. Both versions hang after 20 seconds. Adding time.Sleep(10 * time.Millisecond) gives the same result as on other OSes.

ianlancetaylor commented Apr 26, 2016

I just ran the program using select {} on GNU/Linux for over six minutes without a problem. This was on a native kernel, not a VM, on Ubuntu 14.04.

When the program hangs on GNU/Linux, kill it by typing ^\. That should dump a complete stack backtrace. Attach that here. Thanks.

creker commented Apr 26, 2016

Another interesting find. I was running the program through ssh, which caused the program to output more slowly, and the process was no longer hanging. Once I ran it in the VM terminal itself, it did hang. I tried writing the output to a file instead of the console to remove the bottleneck: it hangs within a second. So it looks like execution speed affects this issue.

Source

package main

import (
    "log"
    "runtime"
)

func main() {
    runtime.GOMAXPROCS(2)
    ch := make(chan bool)

    go func() {
        for {
            ch <- true
            log.Println("sent")
        }
    }()

    go func() {
        for {
            <-ch
            log.Println("received")
        }
    }()

    go func() {
        for {
        }
    }()

    select {
    }
}

Linux backtrace

SIGQUIT: quit
PC=0x401310 m=0

goroutine 7 [running]:
main.main.func3()
        /home/uweb/gowork/src/issue/main.go:27 fp=0xc820022fc0 sp=0xc820022fb8
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc820022fc8 sp=0xc820022fc0
created by main.main
        /home/uweb/gowork/src/issue/main.go:29 +0x9e

goroutine 1 [select (no cases)]:
main.main()
        /home/uweb/gowork/src/issue/main.go:31 +0xa3

goroutine 5 [running]:
        goroutine running on other thread; stack unavailable
created by main.main
        /home/uweb/gowork/src/issue/main.go:17 +0x64

goroutine 6 [chan receive]:
main.main.func2(0xc8200140c0)
        /home/uweb/gowork/src/issue/main.go:21 +0x42
created by main.main
        /home/uweb/gowork/src/issue/main.go:24 +0x86

rax    0x0
rbx    0x401310
rcx    0xc820022800
rdx    0x52e288
rdi    0x42f690
rsi    0x589b60
rbp    0x0
rsp    0xc820022fb8
r8     0x589ea0
r9     0x0
r10    0x0
r11    0x0
r12    0x2c
r13    0x52d8e4
r14    0x0
r15    0x8
rip    0x401310
rflags 0x206
cs     0x33
fs     0x0
gs     0x0
exit status 2

OS X backtrace

SIGQUIT: quit
PC=0x2350 m=0

goroutine 7 [running]:
main.main.func3()
    /Users/creker/Documents/Projects/go/src/hello/main.go:27 fp=0xc82002afc0 sp=0xc82002afb8
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82002afc8 sp=0xc82002afc0
created by main.main
    /Users/creker/Documents/Projects/go/src/hello/main.go:29 +0x9e

goroutine 1 [select (no cases)]:
main.main()
    /Users/creker/Documents/Projects/go/src/hello/main.go:31 +0xa3

goroutine 5 [chan send]:
main.main.func1(0xc8200140c0)
    /Users/creker/Documents/Projects/go/src/hello/main.go:14 +0x4b
created by main.main
    /Users/creker/Documents/Projects/go/src/hello/main.go:17 +0x64

goroutine 6 [running]:
    goroutine running on other thread; stack unavailable
created by main.main
    /Users/creker/Documents/Projects/go/src/hello/main.go:24 +0x86

rax    0x0
rbx    0x2350
rcx    0xc82002a800
rdx    0x12c7b0
rdi    0x303f0
rsi    0x1875c0
rbp    0x0
rsp    0xc82002afb8
r8     0x187900
r9     0x0
r10    0x0
r11    0x0
r12    0x2c
r13    0x12be30
r14    0x0
r15    0x8
rip    0x2350
rflags 0x206
cs     0x2b
fs     0x0
gs     0x0
exit status 2

rhedile commented Apr 27, 2016

I can confirm the behaviour on 14.04 on a KVM with 3 vCPUs. Go is 1.6.0.

This is the scheduler state as the program begins to spin.

2016/04/27 05:48:11 received
2016/04/27 05:48:11 sent
SCHED 1016ms: gomaxprocs=2 idleprocs=0 threads=5 spinningthreads=0 idlethreads=2 runqueue=0 gcwaiting=1 nmidlelocked=0 stopwait=1 sysmonwait=0
P0: status=3 schedtick=25 syscalltick=163151 m=4 runqsize=0 gfreecnt=0
P1: status=1 schedtick=2 syscalltick=0 m=0 runqsize=0 gfreecnt=0
M4: p=0 curg=20 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
M3: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
M0: p=1 curg=21 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
G1: status=4(select (no cases)) m=-1 lockedm=-1
G2: status=4(force gc (idle)) m=-1 lockedm=-1
G17: status=4(GC sweep wait) m=-1 lockedm=-1
G18: status=4(finalizer wait) m=-1 lockedm=-1
G19: status=4(chan send) m=-1 lockedm=-1
G20: status=2(chan receive) m=4 lockedm=-1
G21: status=2() m=0 lockedm=-1
G3: status=4(GC worker (idle)) m=-1 lockedm=-1
G4: status=4(GC worker (idle)) m=-1 lockedm=-1

davecheney commented Apr 27, 2016

@creker I'm sorry but we cannot accept a bug with a for {} infinite loop.

The reason this program stalls is that the for {} will consume a proc, and this proc will not stop for garbage collection.

I am going to close this issue as I do not believe there is an issue. If you want to discuss this further, I recommend taking it to another forum, such as the mailing list.
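
For illustration, a minimal sketch of the point above (not taken from the thread): if the spinning goroutine yields on every iteration, the collector gets a chance to stop it and the other goroutines keep making progress. The names spinUntilStopped and exchange, the atomic stop flag, and the explicit runtime.GC() calls are assumptions made for this example. The stall described here concerns the Go 1.6 scheduler; Go 1.14 and later can preempt tight loops asynchronously, so the bare for {} behaves differently there.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// spinUntilStopped busy-waits until *stop becomes non-zero. Unlike a bare
// for {}, every iteration calls runtime.Gosched, which hands the processor
// back to the scheduler and gives the garbage collector a point at which it
// can stop this goroutine.
func spinUntilStopped(stop *int32, done chan<- struct{}) {
	for atomic.LoadInt32(stop) == 0 {
		runtime.Gosched()
	}
	close(done)
}

// exchange runs the yielding spinner next to a sender/receiver pair, forces
// a collection after every message, and returns how many messages arrived.
func exchange(n int) int {
	runtime.GOMAXPROCS(2) // same setting as the program in the report
	ch := make(chan bool)
	var stop int32
	done := make(chan struct{})

	go spinUntilStopped(&stop, done)

	go func() {
		for i := 0; i < n; i++ {
			ch <- true
		}
		close(ch)
	}()

	received := 0
	for range ch {
		received++
		// Forces a collection. On Go 1.6 this is the step that would wait
		// forever if the spinner were a bare for {}.
		runtime.GC()
	}

	atomic.StoreInt32(&stop, 1) // ask the spinner to finish
	<-done
	return received
}

func main() {
	fmt.Println(exchange(100))
}
```

Run as written, the program prints 100 and exits. The yield is the only difference from the spinning goroutine in the report.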

@davecheney davecheney closed this Apr 27, 2016

josharian commented Apr 27, 2016

@davecheney it looks from my skimming of the issue that it also reproduces with select {}.

davecheney commented Apr 27, 2016

@josharian I think there is still a for {} in there:

    go func() {
        for {
        }
    }()

    select {
    }

If this issue can be reproduced without a for {} then I am happy to see this issue reopened and investigated further.

josharian commented Apr 27, 2016

Hmm. The original report doesn't match the later one. Those who can reproduce this: Does it reproduce without any for {} loops?

rhedile commented Apr 27, 2016

Using an empty select has other side effects. Being very old and just a
user, I am very uncomfortable with the thought that "tick, tock" constructs
accepted by the compiler and vet lead to a program that initially works as
intended and then enters an undefined condition without a panic.
Naturally, this doesn't spin the runtime.

package main

import (
    // "fmt"
)

func main() {
    ch := make(chan int)
    exit := make(chan bool)

    go func() {
        for {
            ch <- 1
            // fmt.Println("sent i is ", i)
        }
    }()

    go func() {
        var i int = 0
        for {
            i += <-ch
            // fmt.Println("received i is", i)
            if i > 1000000 {
                exit <- true
            }
        }
    }()

    <-exit
}

davecheney commented Apr 27, 2016

Using an empty select has other side effects.

What other side effects?

rhedile commented Apr 27, 2016

On 27 April 2016 at 07:14, Dave Cheney notifications@github.com wrote:

Using an empty select has other side effects.

What other side effects?

Caveat: knowledge state Go 1.6.

For the same reason banks still use COBOL heap sorts for some tasks:
predictability.

select{}, correctly, reads the channel list on entry. Then, correctly, it
checks the senders/receivers for the state of its cases. Unfortunately, the
time spent reading the channel list is undefined. The channel list is protected
by mutexes. If the rate of channel creation is proportional to load and
the time spent waiting to read exceeds a GC cycle then pseudo random
determines the read list complete. Every time select{} is woken it can
block for an undefined period of time. We had a similar discussion last
year. The consensus was "do not use defaults in select". One real reason
was that the read message in the select case was being held until the select
exited. However, time spent reentering the select in a for{} under load was
the main cause of our performance loss.

rgds, Nigel Vickers


davecheney commented Apr 27, 2016

I'm sorry, this seems unrelated to the original issue. The reason for using select {} over for {} is that, while both block the current goroutine from making any further progress, the former does it by removing the goroutine from the scheduler (as none of its zero cases are selectable), whereas the latter does so by spinning in a loop which cannot be interrupted.

If you believe there is a bug, can you please produce a runnable sample that does not use a for {} loop, preferably on play.golang.org, that demonstrates the issue?
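
As a small illustration of the distinction drawn above (a sketch under stated assumptions, not code from the thread): a goroutine parked in select {} takes no processor time, so forced collections complete while it stays blocked. The function name collectWhileParked and the number of rounds are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// collectWhileParked parks one goroutine in select {} and then forces the
// given number of garbage collections, returning how many completed.
// The parked goroutine is never resumed; it is intentionally leaked for the
// lifetime of the process.
func collectWhileParked(rounds int) int {
	go func() {
		select {} // no cases: the goroutine blocks forever without spinning
	}()

	completed := 0
	for i := 0; i < rounds; i++ {
		runtime.GC() // returns once the collection has finished
		completed++
	}
	return completed
}

func main() {
	fmt.Println(collectWhileParked(3))
}
```

The program prints 3 and exits. The call returns normally because the parked goroutine never needs to be stopped by the collector.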

rhedile commented Apr 27, 2016

I confirm that the behaviour experienced using for{} in main() in the test
code was not experienced when replaced by select{} in our environment.

creker commented Apr 27, 2016

The reason this program stalls is the for {} will consume a proc, and this proc will not stop for garbage collection.

Thank you, that does explain why this is happening. If I insert runtime.GC() in one of the goroutines, but not the one with the for loop, then the program hangs upon calling it for the first time.

It still looks like strange behaviour to lock up the entire process, but at least I understand why it's happening. I hope that #10958 will be fixed, as it does look like it may affect real production code.

ianlancetaylor commented Apr 27, 2016

@rhedile A literal select {} does not have any channels. It is compiled into a call to the runtime function block. The function does not acquire any mutexes; it simply blocks forever.

@dr2chase

This comment has been minimized.

Show comment
Hide comment
@dr2chase

dr2chase Apr 29, 2016

Contributor

I'm starting to think that if the compiler sees an (obviously) infinite loop, it could arrange to insert a call to select{}

minux May 1, 2016

Member

creker May 1, 2016

Maybe the compiler should instead generate an error if it encounters an infinite loop? Right now the program just locks up without any diagnostic message, and to understand why you need to understand how goroutines are scheduled. In the case of this issue even that didn't help me; I didn't know that the GC could also cause this.

for {} is not usable for anything; it just triggers the issue. Even if a for loop has a body, the compiler can probably detect that it will never call the runtime. For example, if every function call in the body (none of which calls the runtime) is inlined, then the scheduler will not be invoked on function entry. But I suspect that would require much more complex analysis. On the other hand, if the loop body does anything useful then it's no longer an issue, because it will eventually call the runtime.

cznic May 1, 2016

Contributor

> Maybe the compiler should instead generate an error if it encounters an infinite loop?

Then there would be no way to write a CPU baking program.

On a more serious note, an empty for loop is a legal language construct, and sending SIGQUIT diagnoses it easily if needed.

creker May 1, 2016

Well, it didn't help me. SIGQUIT didn't output anything that would tell me that it's the GC that locked up the process. The stack trace doesn't even mention any relevant Go runtime sources, so I had nowhere to start.

Yes, a for {} loop is legal, but it leads to a program that locks up without telling you why. To know why, you have to understand the Go runtime, and not just the basics of it. There are three solutions that I can think of right now:

  1. Leave everything as it is but output better diagnostic messages so that the cause of the issue is obvious.
  2. Insert a runtime.Gosched() call into such loops.
  3. Don't allow infinite loops at all.

cznic May 1, 2016

Contributor

If SIGQUIT doesn't show the for {} loop line, it's probably worth filing an issue. Meanwhile, grep 'for {}' to the rescue. Most programs should not have that line, ever.

creker May 1, 2016

It does show it, but it doesn't tell you the reason. for {} is not the reason; how the GC works is what causes the issue, and for {} just triggers it. The whole point here is to understand why.

I agree, and as I said, for {} is useless in real code. What I forgot to mention is that it wasn't me who found this issue: http://stackoverflow.com/questions/36826622/why-is-the-following-code-sample-stuck-after-some-iterations/ I couldn't understand why it behaves the way it does, started playing with it, and decided to open the issue to help me and everyone else understand what's going on.

It's an edge case that people hit when learning Go. Most of the time such edge cases are about goroutine scheduling. For example, you insert for {} and suddenly your goroutines are no longer scheduled, because GOMAXPROCS=1 and the scheduler is never given a chance to execute any other goroutine. People still have difficulties with that, but at least SO has many great answers that cover exactly why it works like that. There are blog posts that cover the scheduler, and from those it's obvious why.
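
The GOMAXPROCS=1 case mentioned above can be sketched as follows. This is a minimal illustration, assuming a waiting loop that yields explicitly with runtime.Gosched() so the other goroutine gets the single processor; the function name runWithOneProc is invented for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// runWithOneProc limits the program to one processor, starts a goroutine
// that sets a flag, and waits for that flag in a loop. The runtime.Gosched
// call in the loop hands the processor to the other goroutine.
func runWithOneProc() bool {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev) // restore the previous setting

	var ran int32
	go func() {
		atomic.StoreInt32(&ran, 1)
	}()
	for atomic.LoadInt32(&ran) == 0 {
		runtime.Gosched() // yield so the goroutine above can run
	}
	return atomic.LoadInt32(&ran) == 1
}

func main() {
	fmt.Println("other goroutine ran:", runWithOneProc())
}
```

With an empty loop body in place of the runtime.Gosched() call, the waiting loop is the kind of for {} described in this thread.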

But the issue here is not covered anywhere, which leads to a bigger problem: the lack of good diagnostic messages when a process locks up and people don't understand why. Yes, it's useless non-production code, but it's very important when you're learning new stuff. You're playing with it, deliberately triggering edge cases to understand the limitations, and it's good when the program tells you that you have reached a limit. Right now your program just hangs. To understand why, you either need to ask another question on SO, which will be closed as a duplicate or left unanswered, or you google anything you can find on the Go runtime and read blog posts, Go team mailing lists and Google Docs. Here that didn't help me. No one gave an answer to that SO question, and the accepted answer is wrong. And it's not as if nobody there understands Go well: many answers are from Google employees themselves.

So it would be great to either print a diagnostic message somehow, which might not be very easy in these cases, or insert runtime.Gosched() and solve these issues once and for all. Right now it's like C++: something is broken, but only a chosen few understand why. For me, that's not what Go is about.

Sorry for such a long comment.

RLH May 4, 2016

Contributor

The GC needs to preempt a goroutine in a timely fashion. Preemption happens
at GC safepoints which include function calls as well as various channel
and scheduler commands. If the time between these safepoints is large then
the GC may not be able to make progress. For loops such as "for {}" that do
not contain a safepoint this can hang the system.

There are a couple of ways to avoid this issue. One is to accept the fact
that the GC may be delayed until the loop is exited and, if need be, add a
runtime.Gosched call in the loop. Another is to teach the compiler to
detect loops that do not contain a GC safepoint and insert a check and a
safepoint. This overhead may adversely affect the performance of tight
loops that folks care a lot about. At the cost of increasing the size of
binaries, the compiler could unroll the loop to reduce that overhead.
Unfortunately a compiler can't tell how long a loop will run or, in fact,
whether or not it will exit. Any fix will have a downside.

At the end of the day it is a matter of where the community wants to put
its resources. Education seems to be the best way forward for now. "Write
programs that terminate" is a good first bit of advice. Another piece is to
avoid tight loops that execute for a long time and do not contain function
calls, yields, or channel operations.
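
A minimal sketch of the first option, done by hand and not by the
compiler: a long-running loop with no function calls in its hot path yields
explicitly every so often, so the cost of the check is spread over many
iterations. The constant yieldEvery and the function sumTo are hypothetical
names chosen for this example, and the interval is an arbitrary assumption.

```go
package main

import (
	"fmt"
	"runtime"
)

// yieldEvery is a hypothetical tuning knob: the number of iterations
// that run between two explicit yields.
const yieldEvery = 1 << 20

// sumTo adds the integers 0..n-1. Once every yieldEvery iterations it
// calls runtime.Gosched, giving the scheduler and the GC a point at
// which this goroutine can be paused.
func sumTo(n uint64) uint64 {
	var total uint64
	for i := uint64(0); i < n; i++ {
		total += i
		if i%yieldEvery == yieldEvery-1 {
			runtime.Gosched() // explicit yield
		}
	}
	return total
}

func main() {
	fmt.Println(sumTo(10000000))
}
```

A larger yieldEvery lowers the overhead per iteration but lengthens the
time the GC may have to wait, which is the trade-off described above.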

minux May 4, 2016

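Member

I just don't think we need to solve the problem.

Tight loops are created for a reason, and the
compiler should respect that.

for {} is troublesome, but most of them are used
in toy examples.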

davecheney May 4, 2016

Contributor

I agree. I don't think this is a problem that needs to be solved in code.

On Wed, May 4, 2016 at 8:14 AM, Minux Ma wrote:

> I just don't think we need to solve the problem.
>
> Tight loops are created for a reason, and the
> compiler should respect that.
>
> for {} is troublesome, but most of them are used
> in toy examples.


golang locked and limited conversation to collaborators on May 4, 2017
