runtime, net: OS X 10.9 kernel dampens quicker spinning applications down by default #7582

Closed
gopherbot opened this issue Mar 19, 2014 · 15 comments

@gopherbot

commented Mar 19, 2014

by jake.net:

I have the following code, and when I run it, after a certain amount of time it
crashes. Could this be a file descriptor exhaustion problem?

What does 'go version' print?
go version go1.2.1 darwin/386

Mac OS X 10.9

What steps reproduce the problem?
If possible, include a link to a program on play.golang.org.

http://play.golang.org/p/-74yJshfTk

package main

import (
    "log"
    "net/http"
    "sync"
)

const MaxOutstanding int = 2000

var semaphore = make(chan int, MaxOutstanding)
var wg sync.WaitGroup

func init() {
    for i := 0; i < MaxOutstanding; i++ {
        semaphore <- 1
    }
}

func main() {
    log.Println("start")
    for i := 0; i < 5000; i++ {
        wg.Add(1)
        go handle(i)
    }
    wg.Wait()
    log.Println("finish")
}

func handle(i int) {
    <-semaphore
    process(i)
    semaphore <- 1
}

func process(i int) {
    resp, err := http.Get("http://localhost:3000")
    panicIf(err)
    defer resp.Body.Close()

    log.Println("handle", i)
    wg.Done()
}

func panicIf(err error) {
    if err != nil {
        panic(err)
    }
}

What happened?
Sometimes it says: 
panic: Get http://localhost:3000: dial tcp 127.0.0.1:3000: connection reset by peer
panic: Get http://localhost:3000: dial tcp 127.0.0.1:3000: can't assign requested address

What should have happened instead?
It should not panic :)

Please provide any additional information below.
I have asked on the golang-nuts mailing list 
https://groups.google.com/forum/#!topic/golang-nuts/NY7NMx1jAVo
@davecheney

Contributor

commented Mar 19, 2014

Comment 1:

What does `ulimit -a` print on your system ?
@ianlancetaylor

Contributor

commented Mar 19, 2014

Comment 2:

Labels changed: added release-go1.3, repo-main, os-macosx.

@gopherbot

Author

commented Mar 19, 2014

Comment 4 by jake.net:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 2560
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited
@davecheney

Contributor

commented Mar 19, 2014

Comment 5:

@jake you are opening 5,000 http connections and only have 2,560 file descriptors
available.
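
The descriptor limit that `ulimit -a` reports can also be read from inside a Go program. The following is a minimal sketch, assuming a Unix-like system where `syscall.Getrlimit` is available; the helper name `openFileLimit` is hypothetical and not part of this report. Newer Go releases may raise the soft limit at startup, so the value printed here can differ from what `ulimit -n` shows in the shell.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard limits on open file descriptors
// for the current process (RLIMIT_NOFILE). The name is hypothetical.
func openFileLimit() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	// The field types differ between platforms, so convert explicitly.
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

Comparing the soft limit against the number of connections held open at once (MaxOutstanding here, not the total of 5,000) shows whether descriptors can be the bottleneck.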
@gopherbot

Author

commented Mar 19, 2014

Comment 6 by jake.net:

No, even with `const MaxOutstanding int = 200` I can get that error.
@davecheney

Contributor

commented Mar 19, 2014

Comment 7:

Can you try to make a smaller example ? If this is about DNS exhaustion you don't need
to bring in the entire HTTP package, just use net.Dial("tcp", "localhost:3000")
@gopherbot

Author

commented Mar 20, 2014

Comment 8 by jake.net:

I can still get the error just using net.Dial.
Every time I run it there is a different result. I can even set MaxOutstanding to 1 or
10 and still get an error.
package main
import (
    "log"
    "net"
    "sync"
)
const MaxOutstanding int = 300
var semaphore = make(chan int, MaxOutstanding)
var wg sync.WaitGroup
func init() {
    for i := 0; i < MaxOutstanding; i++ {
        semaphore <- 1
    }
}
func main() {
    log.Println("start")
    for i := 0; i < 10000; i++ {
        wg.Add(1)
        go handle(i)
    }
    wg.Wait()
    log.Println("finish")
}
func handle(i int) {
    <-semaphore
    process(i)
    semaphore <- 1
}
func process(i int) {
    conn, err := net.Dial("tcp", "localhost:3000")
    panicIf(err)
    defer conn.Close()
    log.Println("handle", i)
    wg.Done()
}
func panicIf(err error) {
    if err != nil {
        panic(err)
    }
}
@davecheney

Contributor

commented Mar 20, 2014

Comment 9:

I can't reproduce this, 
http://play.golang.org/p/rqHasq8tbW
^ your example slightly shortened. The default number of files on my system is 256,
setting MaxOutstanding to 200 causes the test to pass. Well, until panicIf complains
because there is nothing listening on port 3000
@gopherbot

Author

commented Mar 20, 2014

Comment 10 by jake.net:

I spun up a Martini web server on port 3000.
Then I ran your example with const MaxOutstanding int = 100 and it works on the first run.
Then I run it again and it fails.
@davecheney

Contributor

commented Mar 20, 2014

Comment 11:

What error does it fail with ?
@gopherbot

Author

commented Mar 20, 2014

Comment 12 by jake.net:

panic: dial tcp 127.0.0.1:3000: can't assign requested address
@crawshaw

Contributor

commented Apr 24, 2014

Comment 13:

The concurrency is not necessary to replicate this. Here is a minimal version:
package main
import (
    "fmt"
    "net"
)
func do() {
    for i := 0; i < 200; i++ {
        conn, err := net.Dial("tcp", ":3000")
        if err != nil {
            panic(err)
        }
        if err := conn.Close(); err != nil {
            panic(err)
        }
    }
}
func main() {
    for i := 0; i < 100; i++ {
        fmt.Println("loop", i)
        do()
    }
}
With tip
    go version devel +9eacb9c0d810 Thu Apr 24 12:24:22 2014 -0700 + darwin/amd64
and starting a server
    godoc -http=:3000
this regularly panics about half way through with:
panic: dial tcp :3000: can't assign requested address
goroutine 1 [running]:
runtime.panic(0xf0700, 0xc2100460c0)
    /usr/local/go/src/pkg/runtime/panic.c:266 +0xb6
main.do()
    /Users/crawshaw/junk2.go:12 +0xaa
main.main()
    /Users/crawshaw/junk2.go:24 +0xe2
exit status 2
@mikioh

Contributor

commented Apr 25, 2014

Comment 14:

Looks like this is a kind of resource exhaustion issue on the latest OS X, but it is not
related to the number of file/socket descriptors, so you don't need to tweak launchd. I just
tried to repro comment 13 on OS X and got the following:
/var/log/systemlog:
process issue7582[3042] caught causing excessive wakeups. Observed wakeups rate (per
sec): 10143; Maximum permitted wakeups rate (per sec): 150; Observation period: 300
seconds; Task lifetime number of wakeups: 45005
So adding time.Sleep (with an appropriate value) to the for-loop does give a
different result, but I'm not sure what we could do for quicker spinning applications on OS
X 10.9 and beyond.

Labels changed: removed release-go1.3.

Status changed to HelpWanted.

@ja30278


commented Apr 28, 2014

Comment 15:

After some digging, I'm fairly certain that this is just ephemeral port exhaustion. OS X
only allocates 16k ports for dynamic use, as compared to ~28k on Linux.
Linux:
jonallie@foo:~$ cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000
OSX
jonallie-macbookpro2:gophercon jonallie$ sysctl net.inet.ip.portrange
net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 49152
net.inet.ip.portrange.last: 65535
net.inet.ip.portrange.hifirst: 49152
net.inet.ip.portrange.hilast: 65535
Tellingly, the repro script runs 100 loops that open 200 connections each, and usually
fails (for me) around loop 80, i.e. after roughly 16,000 connections. Increasing the port range via:
jonallie-macbookpro$ sudo sysctl -w net.inet.ip.portrange.first=32768
net.inet.ip.portrange.first: 49152 -> 32768
jonallie-macbookpro$ sudo sysctl -w net.inet.ip.portrange.hifirst=32768
net.inet.ip.portrange.hifirst: 49152 -> 32768
allows the test script to complete a full 100 loops, and increasing the script to 200 loops
causes it to fail again, as expected.
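
The arithmetic behind these numbers can be checked with a short sketch. It assumes that every closed connection keeps its local port unavailable for the duration of the run, which is a simplification. The helper names `portCount` and `loopsBeforeExhaustion` are hypothetical.

```go
package main

import "fmt"

// portCount returns how many ports lie in the inclusive range [first, last].
func portCount(first, last int) int {
	return last - first + 1
}

// loopsBeforeExhaustion returns how many full loops of dialsPerLoop
// connections fit into the range, assuming no port is reused during the run.
func loopsBeforeExhaustion(first, last, dialsPerLoop int) int {
	return portCount(first, last) / dialsPerLoop
}

func main() {
	// Default OS X range from the sysctl output above.
	fmt.Println(portCount(49152, 65535))                  // 16384
	fmt.Println(loopsBeforeExhaustion(49152, 65535, 200)) // 81

	// Linux range from ip_local_port_range above.
	fmt.Println(portCount(32768, 61000)) // 28233

	// OS X range after lowering portrange.first to 32768.
	fmt.Println(portCount(32768, 65535))                  // 32768
	fmt.Println(loopsBeforeExhaustion(32768, 65535, 200)) // 163
}
```

Under that assumption the default range runs out after about 81 loops, which matches the failure around loop 80, and the widened range covers 100 loops but not 200.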
@mikioh

Contributor

commented Apr 28, 2014

Comment 16:

Good catch!

Status changed to Retracted.

@golang golang locked and limited conversation to collaborators Jun 25, 2016

This issue was closed.
