Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime, internal/poll: darwin: ensure that no thread is consumed, nor a syscall.Read if FD isn't yet ready for I/O #33953

Closed
odeke-em opened this issue Aug 29, 2019 · 11 comments

Comments

@odeke-em
Copy link
Member

@odeke-em odeke-em commented Aug 29, 2019

Coming here from #32326.
That test was meant to ensure that an (*os.File).Read which is backed by internal/poll would ONLY consume a thread for a read syscall if I/O was ready.

Examining @MichaelTJones' crash dump in reveals in deed that when it crashed

$ grep 'os_test.go' stack | sort | uniq
/Users/mtj/go/src/os/os_test.go:2283 
/Users/mtj/go/src/os/os_test.go:2283 +0x370 
/Users/mtj/go/src/os/os_test.go:2286 +0x84 
/Users/mtj/go/src/os/os_test.go:2286 +0x84 fp=0xc00025afa8 sp=0xc00025af48 
/Users/mtj/go/src/os/os_test.go:2297 +0x3aa 

non of the code had gotten past blocking on a syscall.Read as per

<-creading

and also from his execution traces
image

Somehow the poller is tripping out and not blocking until the file descriptor is ready for I/O.

In that test, our high water mark for the max number of threads was 50 but for regular machines where GOMAXPROCS=2,4,8, the maximum value of threads if every call was doing a syscall.Read would be 10, 18, 34 if we follow

        P := runtime.GOMAXPROCS(0)
        n := threads
        maxThreads := 0
        if n/P-1 >= P {
                maxThreads = 2 + 4*P
        } else {
                maxThreads = 1 + (7 * P / 4) + (n / P)
        }

but in his case, given GOMAXPROCS=32,36, it would be 61, 67.

Hence why his test failed but it went undetected for usually small values of GOMAXPROCS. Let's also examine the behaviors between versions of Go to see when things regressed.

@odeke-em odeke-em added this to the Go1.14 milestone Aug 29, 2019
@odeke-em odeke-em changed the title runtime, internal/poll: ensure that nothread is consumed if FD isn't yet ready for I/O runtime, internal/poll: ensure that no thread is consumed, nor a syscall.Read if FD isn't yet ready for I/O Aug 29, 2019
@julieqiu
Copy link
Contributor

@julieqiu julieqiu commented Aug 29, 2019

@gopherbot
Copy link

@gopherbot gopherbot commented Aug 29, 2019

Change https://golang.org/cl/192342 mentions this issue: os: skip TestPipeThreads on GOOS=darwin

@ianlancetaylor ianlancetaylor changed the title runtime, internal/poll: ensure that no thread is consumed, nor a syscall.Read if FD isn't yet ready for I/O runtime, internal/poll: darwin: ensure that no thread is consumed, nor a syscall.Read if FD isn't yet ready for I/O Aug 31, 2019
gopherbot pushed a commit that referenced this issue Aug 31, 2019
Updates #32326.
Updates #33953.

Change-Id: I97a1cbe682becfe9592e19294d4d94f5e5b16c21
Reviewed-on: https://go-review.googlesource.com/c/go/+/192342
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Aug 31, 2019

Change https://golang.org/cl/192757 mentions this issue: os: skip TestPipeThreads on GOOS=darwin

tomocy added a commit to tomocy/go that referenced this issue Sep 1, 2019
Updates golang#32326.
Updates golang#33953.

Change-Id: I97a1cbe682becfe9592e19294d4d94f5e5b16c21
Reviewed-on: https://go-review.googlesource.com/c/go/+/192342
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@odeke-em
Copy link
Member Author

@odeke-em odeke-em commented Sep 2, 2019

I found the commit that introduced this problem, commit a3b0144#diff-38e21adab3460688e3ce0b71fa9422d0 aka CL https://go-review.googlesource.com/c/141639 which is a part of Go1.12.

This program should be able to run alright with GOMAXPROCS=8 and a max of 8 threads before that commit

package main

import (
	"log"
	"os"
	"runtime/debug"
	"sync"
	"syscall"
	"time"
)

func main() {
	n := 100

	var rl, wl []*os.File
	for i := 0; i < n; i++ {
		prc, pwc, err := os.Pipe()
		if err != nil {
			// Cleanup.
			for j := 0; j < i; j++ {
				wl[i].Close()
				rl[i].Close()
			}
		}
		rl = append(rl, prc)
		wl = append(rl, pwc)
	}

	defer debug.SetMaxThreads(debug.SetMaxThreads(8))

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			bs := make([]byte, 4)
			wg.Done()
			prc := rl[i]
			if _, err := prc.Read(bs); err != nil {
				log.Fatalf("Failed to read %d: %v\n", i, err)
			}
		}(i)
	}

	wg.Wait()
	println("Waiting now")
	for {
		<-time.After(5 * time.Second)
		if true {
			return
		}
		proc, _ := os.FindProcess(os.Getpid())
		proc.Signal(syscall.SIGQUIT)
	}

	for i := 0; i < n; i++ {
		if _, err := wl[i].Write([]byte("Hello")); err != nil {
			log.Fatalf("Write #%d failed: %v", err)
		}
	}
}

Kindly pinging @randall77.
For Go1.13, we've disabled the test on Darwin but after we've fixed the issue, we'll need to backport to Go1.12 and Go1.13 as well.

@odeke-em odeke-em added NeedsFix and removed NeedsInvestigation labels Sep 2, 2019
gopherbot pushed a commit that referenced this issue Sep 3, 2019
Updates #32326.
Updates #33953.

Change-Id: I97a1cbe682becfe9592e19294d4d94f5e5b16c21
Reviewed-on: https://go-review.googlesource.com/c/go/+/192342
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
(cherry picked from commit bac5b3f)
Reviewed-on: https://go-review.googlesource.com/c/go/+/192757
Run-TryBot: Andrew Bonventre <andybons@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
t4n6a1ka added a commit to t4n6a1ka/go that referenced this issue Sep 5, 2019
Updates golang#32326.
Updates golang#33953.

Change-Id: I97a1cbe682becfe9592e19294d4d94f5e5b16c21
Reviewed-on: https://go-review.googlesource.com/c/go/+/192342
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@odeke-em odeke-em added the Soon label Sep 14, 2019
@odeke-em
Copy link
Member Author

@odeke-em odeke-em commented Oct 6, 2019

Some great news, @minux's CL https://go-review.googlesource.com/c/go/+/197938 fixed this issue in that change from entersyscallblock() to entersyscall() so we'll need to backport Minux's CL to both Go1.12 and Go1.13 as per https://groups.google.com/d/msg/golang-dev/UYDTDWHJsC8/a1_4KiKnCwAJ.

@gopherbot please backport this issue to Go1.12 and Go1.13.

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 6, 2019

Backport issue(s) opened: #34713 (for 1.12), #34714 (for 1.13).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 6, 2019

Change https://golang.org/cl/199477 mentions this issue: os: re-enable TestPipeThreads on darwin

@gopherbot gopherbot closed this in cfe2320 Oct 6, 2019
@networkimprov
Copy link

@networkimprov networkimprov commented Oct 6, 2019

@odeke-em also need to revert 4c8037b

@odeke-em
Copy link
Member Author

@odeke-em odeke-em commented Oct 6, 2019

@networkimprov
Copy link

@networkimprov networkimprov commented Oct 6, 2019

I don't follow you... The commit I referenced dropped the test on the 1.13 branch. The test has been reinstated on tip.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Oct 7, 2019

We don't need to re-enable the test on the branch. That's not a bug fix.

@golang golang locked and limited conversation to collaborators Oct 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.