New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: preemption takes longer on OpenBSD #17712

Open
mdempsky opened this Issue Oct 31, 2016 · 16 comments

Comments

Projects
None yet
10 participants
@mdempsky
Member

mdempsky commented Oct 31, 2016

TestGCFairness2 (0.12s)
    proc_test.go:351: want OK
	    , got goroutine 1 did not run

https://storage.googleapis.com/go-build-log/bc7889a5/openbsd-amd64-gce58_a614fab1.log

/cc @aclements @RLH

I don't see any evidence that this is a known flake in the issue tracker, so filing a bug and tagging Go1.8Maybe in case it's of interest.

@mdempsky mdempsky added the OS-OpenBSD label Oct 31, 2016

@mdempsky mdempsky added this to the Go1.8Maybe milestone Oct 31, 2016

@mundaym

This comment has been minimized.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Nov 1, 2016

I've got this on trybots a number of times in the past day.

@mdempsky

This comment has been minimized.

Member

mdempsky commented Nov 1, 2016

@bradfitz All openbsd-amd64?

@bradfitz

This comment has been minimized.

@RLH

This comment has been minimized.

Contributor

RLH commented Nov 1, 2016

I can not duplicate it on my Ubuntu box. I suspect that it is openbsd time
or time slice problem. Any history of problems in this area?

On Tue, Nov 1, 2016 at 1:22 PM, Brad Fitzpatrick notifications@github.com
wrote:

IIRC. Email finds at least:

https://storage.googleapis.com/go-build-log/a5085c96/
openbsd-amd64-gce58_66ea63b8.log
https://storage.googleapis.com/go-build-log/9c33aed7/
openbsd-amd64-gce58_313f81fd.log

(they failed slightly differently)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#17712 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AA7Wny9ILYzUtS2VIZYU2nzma6ZRF7Fsks5q53VJgaJpZM4KlpiC
.

@quentinmit

This comment has been minimized.

@RLH

This comment has been minimized.

Contributor

RLH commented Nov 2, 2016

Looks like BSD is the only place that flakes.

On Tue, Nov 1, 2016 at 6:30 PM, Quentin Smith notifications@github.com
wrote:

Just flaked for me at https://storage.googleapis.
com/go-build-log/ad82e14d/openbsd-amd64-gce58_81897bfd.log


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#17712 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AA7Wn0Ok8ge9MaXYFjYm86U6NUp1GwQFks5q572dgaJpZM4KlpiC
.

@mdempsky

This comment has been minimized.

Member

mdempsky commented Nov 2, 2016

@RLH I'm not aware of clock time precision problems on OpenBSD, but OpenBSD's scheduler has been the culprit of flakes in the past. For example, see #14183.

@martisch

This comment has been minimized.

Member

martisch commented Nov 2, 2016

and another flake of TestGCFairness2 on OpenBSD AMD64: https://storage.googleapis.com/go-build-log/a9a27137/openbsd-amd64-gce58_5d2f625d.log

@gopherbot

This comment has been minimized.

gopherbot commented Nov 2, 2016

CL https://golang.org/cl/32634 mentions this issue.

@quentinmit

This comment has been minimized.

Contributor

quentinmit commented Nov 2, 2016

@mdempsky I spent the past few hours investigating this with @RLH. This is indeed fallout from OpenBSD's usleep granularity. Austin's d35dfd4 causes the test to no longer block in GC assists, which means only preemptions cause goroutine switches. Those preemptions are supposed to happen every 10ms, but on OpenBSD the sysmon thread is trying to sleep for 20µs and is actually sleeping for 20ms, so preemptions are taking 50+ms to happen.

I just sent a CL that allows up to 1s for the test to complete, which should deflake it on OpenBSD. I'll repurpose this bug for investigating the OpenBSD scheduler behavior; I don't understand why sysmon requires 2+ passes through the loop (at 20ms each) before it decides to preempt a thread that has been running for >10ms.

@quentinmit quentinmit changed the title from runtime: TestGCFairness2 "want OK, got goroutine 1 did not run" flake on OpenBSD to runtime: preemption takes forever on OpenBSD Nov 2, 2016

@mdempsky mdempsky changed the title from runtime: preemption takes forever on OpenBSD to runtime: preemption takes longer on OpenBSD Nov 2, 2016

@mdempsky

This comment has been minimized.

Member

mdempsky commented Nov 2, 2016

@quentinmit Ah, yeah, that explanation makes sense. I was wondering how scheduler granularity could affect the test, since it didn't obviously rely on sleeping (other than the 30ms sleep, which is itself fine).

@mdempsky

This comment has been minimized.

Member

mdempsky commented Nov 2, 2016

Looks like a lot of runtime's usleep calls might be significantly oversleeping on OpenBSD, and maybe Windows based on the comment in runqgrab.

gopherbot pushed a commit that referenced this issue Nov 3, 2016

runtime/testdata/testprog: increase GCFairness2 timeout to 1s
OpenBSD's scheduler causes preemption to take 20+ms, so 30ms is not
enough time for 3 goroutines to run. This change continues to sleep for
30ms, but if it finds that the 3 goroutines have not run, it sleeps for
an additional 1s before declaring failure.

Updates #17712

Change-Id: I3e886e40d05192b7cb71b4f242af195836ef62a8
Reviewed-on: https://go-review.googlesource.com/32634
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Quentin Smith <quentin@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@bradfitz

This comment has been minimized.

Member

bradfitz commented Nov 3, 2016

@mdempsky

This comment has been minimized.

Member

mdempsky commented Nov 3, 2016

@bradfitz That looks like a trybot failure? Do you happen to know the parent revision? (In particular, if it's before or after 21c114e.)

@bradfitz

This comment has been minimized.

Member

bradfitz commented Nov 3, 2016

No, its git parent is well before that test tweak.

(Btw, the failure log contains a git hash at the top you can paste into Gerrit's search.)

@rsc rsc modified the milestones: Go1.9, Go1.8Maybe Nov 11, 2016

@aclements aclements modified the milestones: Go1.10, Go1.9 Jun 9, 2017

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jul 10, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment