runtime: repeated syscalls inhibit periodic preemption #28701

Open
benesch opened this Issue Nov 9, 2018 · 13 comments

benesch (Contributor) commented Nov 9, 2018

What version of Go are you using (go version)?

go version go1.11 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/benesch/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/benesch/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/Cellar/go/1.11/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.11/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/tw/njf8lc695t70f1lkhts5d0nr0000gn/T/go-build056467710=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Ran the following program repeatedly.

package cgotest

// #include <unistd.h>
import "C"

import (
	"context"
	"math/rand"
	// "runtime"
	"sync"
	"testing"
	"time"
)

func test27660(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ints := make([]int, 100)
	locks := make([]sync.Mutex, 100)
	// Slowly create threads so that ThreadSanitizer is forced to
	// frequently resize its SyncClocks.
	for i := 0; i < 100; i++ {
		go func() {
			for ctx.Err() == nil {
				// Sleep in C for long enough that it is likely that the runtime
				// will retake this goroutine's currently wired P.
				C.usleep(1000 /* 1ms */)
				// runtime.Gosched()
			}
		}()
		go func() {
			// Trigger lots of synchronization and memory reads/writes to
			// increase the likelihood that the race described in #27660
			// results in corruption of ThreadSanitizer's internal state
			// and thus an assertion failure or segfault.
			for ctx.Err() == nil {
				j := rand.Intn(100)
				locks[j].Lock()
				ints[j]++
				locks[j].Unlock()
			}
		}()
		time.Sleep(time.Millisecond)
	}
}

The program usually completes in 100ms, but occasionally takes upwards of 60s, and sometimes more than 10m (i.e., maybe it never completes). It appears to be caused by the scheduler starving out the outer goroutine, as uncommenting the runtime.Gosched call makes the test pass reliably in under 60s.

It's possible that this is #10958, but I think that the call to ctx.Err() should allow for preemption, as should the call into cgo. I'm filing as a separate bug so I have an issue to reference when I add this test in http://golang.org/cl/148717. I'll dig a little deeper in a bit.

/cc @dvyukov

benesch (Contributor) commented Nov 9, 2018

Gathered some evidence that this isn't related to preemptibility.

Here's a trace of a good run:

[trace screenshot: good run]

And here's a trace of a bad run:

[trace screenshot: bad run]

Besides the vastly different time scales (130ms vs 90s), one other big difference stands out. The good run slowly increases the number of threads in syscalls towards 100 as the number of sleeping goroutines grows to 100. This is what allows the non-sleeping goroutines to make progress. In the bad run, the number of threads is fixed at eight, and so the sleeping goroutines hog the Ps and grind everything to a halt.

dvyukov (Member) commented Nov 9, 2018

Try adding some debug logging to the retake function; it is supposed to retake Ps blocked in syscalls. Does it run? Does it see Ps in syscalls? Why doesn't it retake them? Or does it retake them, but they somehow don't run new goroutines?

benesch (Contributor) commented Nov 9, 2018

Huh, it appears not to be retaking them at all. With this patch

diff --git a/src/runtime/proc.go b/src/runtime/proc.go
index 864efcdfed..f236f36a5e 100644
--- a/src/runtime/proc.go
+++ b/src/runtime/proc.go
@@ -4413,6 +4413,7 @@ func retake(now int64) uint32 {
 			// Otherwise the M from which we retake can exit the syscall,
 			// increment nmidle and report deadlock.
 			incidlelocked(-1)
+			println("retaking ", _p_.id)
 			if atomic.Cas(&_p_.status, s, _Pidle) {
 				if trace.enabled {
 					traceGoSysBlock(_p_)

I see a continual stream of retaking messages:

retaking  2
retaking  4
retaking  7
...

In a bad run I see no retaking messages at all. Digging in further now.

dvyukov (Member) commented Nov 9, 2018

sysmon can block completely and stop doing anything if it thinks there is nothing for it to do. But exitsyscall should wake it.

benesch (Contributor) commented Nov 9, 2018

Yeah, I added some logging to the beginning of retake, which indicated that it was, in fact, getting invoked frequently but was choosing not to actually retake any Ps. There's something very fishy about this code:

go/src/runtime/proc.go, lines 4397 to 4402 at e496e61:

	t := int64(_p_.syscalltick)
	if int64(pd.syscalltick) != t {
		pd.syscalltick = uint32(t)
		pd.syscallwhen = now
		continue
	}

In bad runs we see that these ticks are permanently misaligned:

retake starts
can't retake p  0  yet;  1452  vs  1461
can't retake p  1  yet;  1460  vs  1469
can't retake p  2  yet;  1450  vs  1461
can't retake p  3  yet;  1455  vs  1464
can't retake p  4  yet;  1448  vs  1457
can't retake p  5  yet;  1448  vs  1457
can't retake p  6  yet;  1450  vs  1459
can't retake p  7  yet;  1399  vs  1408
retake starts
can't retake p  0  yet;  1461  vs  1470
can't retake p  1  yet;  1469  vs  1478
can't retake p  2  yet;  1461  vs  1470
can't retake p  3  yet;  1464  vs  1473
can't retake p  4  yet;  1457  vs  1466
can't retake p  5  yet;  1457  vs  1466
can't retake p  6  yet;  1459  vs  1468
can't retake p  7  yet;  1408  vs  1417
retake starts
can't retake p  0  yet;  1470  vs  1479
can't retake p  1  yet;  1478  vs  1487
can't retake p  2  yet;  1470  vs  1479
can't retake p  3  yet;  1473  vs  1482
can't retake p  4  yet;  1466  vs  1475
can't retake p  5  yet;  1466  vs  1475
can't retake p  6  yet;  1468  vs  1477
can't retake p  7  yet;  1417  vs  1426

benesch (Contributor) commented Nov 9, 2018

Oh, this is fascinating. If the sysmon delay creeps up to 10ms, retake is never going to be called frequently enough to see the same syscall twice. In fact, we can see that on each call to retake, each P's syscalltick is about ten greater than on the previous call, since about ten calls to usleep(1000) have completed in the interim.

I think the solution is actually pretty simple. We need to consider Ps in syscalls for preemption based on schedtick, and not just syscalltick.

dvyukov (Member) commented Nov 9, 2018

If retake sees that a P makes constant progress executing syscalls, then it must not retake it. It's meant to retake Ps from syscalls blocked for a long period of time. If a goroutine is running, making progress and returns from all syscalls in a timely manner then it should continue to run.

What's failing here seems to be goroutine preemption, the next check in retake (s == _Prunning). The goroutine is running for too long, but we don't preempt it, since we don't see P in _Prunning state (right?). What if we try to retake P for _Psyscall status, and then, if that fails, try to preempt it for s == _Prunning || s == _Psyscall?

benesch (Contributor) commented Nov 9, 2018 (comment minimized; content hidden)

dvyukov (Member) commented Nov 9, 2018

Speaking of which, https://go-review.googlesource.com/c/go/+/148717/2/src/runtime/proc.go would break this logic, because we clear p.m even for _Psyscall Ps. Need to be careful there.

dvyukov (Member) commented Nov 9, 2018

Yes, preemptone would require keeping p.m (or p.oldm).

gopherbot commented Nov 11, 2018

Change https://golang.org/cl/148901 mentions this issue: runtime: preempt goroutines that make repeated cgo calls

@bcmills bcmills added the Performance label Nov 19, 2018

@bcmills bcmills changed the title from runtime: go routine starvation to runtime: cgo calls disable preemption of tight loops Nov 19, 2018

@bcmills bcmills added this to the Go1.12 milestone Nov 19, 2018

bcmills (Member) commented Nov 19, 2018

@dr2chase, @aclements: is it important to include the fix for this in 1.12 alongside #10958? If not, should we mention the caveat in the release notes?

@aclements aclements changed the title from runtime: cgo calls disable preemption of tight loops to runtime: repeated syscalls inhibit periodic preemption Dec 11, 2018

aclements (Member) commented Dec 11, 2018

Given that this has been a bug for quite a while, I'm not too worried about fixing it in 1.12 (obviously it would be nice to). CL 148901 seems rather complex and could have performance implications.

Have we considered not clearing pp.m in reentersyscall and instead delaying this until we actually preempt a syscall? Then that information would be available to preemptone, without having to add oldm. I don't know if this would end up being simpler or not, but it's a thought.
