Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: ioutil.ReadFile when reading big files may cause long STW pauses #39479

Closed
IlyaM opened this issue Jun 9, 2020 · 4 comments
Closed

runtime: ioutil.ReadFile when reading big files may cause long STW pauses #39479

IlyaM opened this issue Jun 9, 2020 · 4 comments
Milestone

Comments

@IlyaM
Copy link

@IlyaM IlyaM commented Jun 9, 2020

What version of Go are you using (go version)?

go version go1.14.4 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ilya/.cache/go-build"
GOENV="/home/ilya/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/ilya/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build777436491=/tmp/go-build -gno-record-gcc-switches"

What did you do?

When troubleshooting long STW pauses in production code we noticed that they seem to be often happening in conjunction with reading big files what seems to be visible in go traces - the execution on all but two processors stops with one of them running a goroutine which reads a file using standard library call ioutil.ReadFile and another being in STW phase which seemingly waits for the file reading goroutine to finish. As the files we are reading are relatively big this stops execution of the program on other CPU cores for rather long time, in the worst cases up to 100ms.

The following a very simple program demonstrates the problem:

package main

import (
	"bufio"
	"io/ioutil"
	"os"
	"runtime"
	"runtime/trace"
)

func check(e error) {
	if e != nil {
		panic(e)
	}
}

func main() {
	f, err := os.Create("trace")
	defer f.Close()
	check(err)

	w := bufio.NewWriter(f)
	defer w.Flush()

	err = trace.Start(w)
	check(err)
	defer trace.Stop()

	go func() {
		// dat-1G is 1G file created with command 'fallocate -l 1G dat-1G'
		_, err := ioutil.ReadFile("dat-1G")
		check(err)
	}()

	// forces GC to trigger STW
	runtime.GC()
}

If you run it and then look into the trace you will see that STW phase last untl the code finishes reading the file. My possibly incomplete understanding of the problem is that internally ioutil.ReadFile tries to read the file with a single 'read' syscall which go cannot preemt so STW phase cannot pause the goroutine which does the very long file read operation.

trace

Probably ioutil.ReadFile implementation should be changed to read file in blocks using multiple 'read' syscalls to allow go to preemt it when necessary.

@ianlancetaylor ianlancetaylor changed the title ioutil.ReadFile() when reading big files may cause long STW pauses runtime: ioutil.ReadFile when reading big files may cause long STW pauses Jun 9, 2020
@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone Jun 9, 2020
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jun 9, 2020

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Jun 9, 2020

If you run it and then look into the trace you will see that STW phase last untl the code finishes reading the file.

I don't think this is what is happening. If you wait for the goroutine doing ReadFile to finish (before the main function exits), it will take a lot longer. From your trace it seems it took ~40 ms to preempt that goroutine and stop the world, but it didn't wait the whole ReadFile to finish.

(~40 ms is still kind of long. I guess it is probably allocating and zeroing the memory of the big slice, which is currently nonpreemptible (#36365))

@IlyaM
Copy link
Author

@IlyaM IlyaM commented Jun 10, 2020

@cherrymui You are right. I tried to replace ioutil.ReadFile with manual allocation of the big slice and then explicitly reading the file via os.Open() and Read() and I can see in the trace that long STW pause is due to the big slice allocation as you wrote.

@aclements
Copy link
Member

@aclements aclements commented Jun 11, 2020

Thanks for the report. Since this is really about preemption of large allocations, I'm going to close it as a dup of #31222 (which is tracked in tracking bug #36365).

@aclements aclements closed this Jun 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.