Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: write should handle EINTR #38033

Open
kolyshkin opened this issue Mar 23, 2020 · 22 comments
Open

os: write should handle EINTR #38033

kolyshkin opened this issue Mar 23, 2020 · 22 comments

Comments

@kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Mar 23, 2020

Go 1.14 release notes (https://golang.org/doc/go1.14#runtime) say

programs that use packages like syscall or golang.org/x/sys/unix will see more slow system calls fail with EINTR errors. Those programs will have to handle those errors in some way, most likely looping to try the system call again. For more information about this see man 7 signal for Linux systems or similar documentation for other systems.

This gives me an impression that unless I'm using syscall or golang.org/x/sys/unix directly, my code is fine.

From the signal(7) man page on Linux:

If a blocked call to one of the following interfaces is interrupted by a signal handler, then the call is automatically restarted after the signal handler returns if the SA_RESTART flag was used;
<...>
read(2), readv(2), write(2), writev(2), and ioctl(2) calls <...>

Assuming that golang runtime always sets signal handlers with SA_RESTART, taken all the above into account, it seems right to conclude that using a high-level write function such as ioutils.WriteFile or os.Write should be fine.

Turns out it's not.

Originally reported in opencontainers/runc#2258

What version of Go are you using (go version)?

$ go1.14 version
go version go1.14 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go1.14 env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/kir/.cache/go-build"
GOENV="/home/kir/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/kir/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/kir/sdk/go1.14"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/kir/sdk/go1.14/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/kir/go/src/github.com/opencontainers/runc/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build157351814=/tmp/go-build -gno-record-gcc-switches"

What did you do?

The code below should be compiled (go test -c) and run as root on a modern Linux system.

package fscommon

import (
	"io/ioutil"
	"os"
	"path"
	"strconv"
	"testing"
)

const iter = 1000000

func TestWriteHandlesEINTR(t *testing.T) {
	memoryCgroupMount := "/sys/fs/cgroup/memory"
	cgroupPath, err := ioutil.TempDir(memoryCgroupMount, "test-")
	if err != nil {
		t.Fatal(err)
	}
	defer os.RemoveAll(cgroupPath)
	filename := path.Join(cgroupPath, "memory.limit_in_bytes")

	for i := 0; i < iter; i++ {
		limit := 1024*1024 + i

		f, err := os.OpenFile(filename, os.O_WRONLY, 0644)
		if err != nil {
			t.Fatal(err)
		}
		if n, err := f.WriteString(strconv.Itoa(limit)); err != nil {
			t.Fatalf("Failed to write %d on attempt %d (wrote %d bytes): %s", limit, i, n, err)
		}
		f.Close()
	}
}

func TestWriteFileHandlesEINTR(t *testing.T) {
	memoryCgroupMount := "/sys/fs/cgroup/memory"
	cgroupPath, err := ioutil.TempDir(memoryCgroupMount, "test-")
	if err != nil {
		t.Fatal(err)
	}
	defer os.RemoveAll(cgroupPath)
	file := path.Join(cgroupPath, "memory.limit_in_bytes")

	for i := 0; i < iter; i++ {
		limit := 1024*1024 + i
		if err := ioutil.WriteFile(file, []byte(strconv.Itoa(limit)), 0644); err != nil {
			t.Fatalf("Failed to write %d on attempt %d: %s", limit, i, err)
		}
	}
}

What did you expect to see?

PASS

(the test passes with go 1.13.6 and probably all earlier versions)

What did you see instead?

[kir@kir-rhat fscommon]$ go1.14 test -c fscommon_test.go 
[kir@kir-rhat fscommon]$ sudo ./fscommon.test 
[sudo] password for kir: 
--- FAIL: TestWriteHandlesEINTR (0.10s)
    fscommon_test.go:30: Failed to write 1072354 on attempt 23778 (wrote 0 bytes): write /sys/fs/cgroup/memory/test-298000507/memory.limit_in_bytes: interrupted system call
--- FAIL: TestWriteFileHandlesEINTR (0.19s)
    fscommon_test.go:48: Failed to write 1094511 on attempt 45935: write /sys/fs/cgroup/memory/test-239836062/memory.limit_in_bytes: interrupted system call
FAIL
@ianlancetaylor ianlancetaylor changed the title go1.14 write should handle EINTR os: go1.14 write should handle EINTR Mar 23, 2020
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 23, 2020

Am I reading this correctly in that you are writing to files opened under /sys/fs/cgroup/memory? Is that a case that we should care about?

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 23, 2020

@kolyshkin, does that imply that the Linux kernel is out-of-spec for those write calls, or does libc retry EINTR itself for write?

(But we should arguably handle EINTR throughout the standard library regardless: see #20400 and #36644.)

@bcmills bcmills added this to the Go1.15 milestone Mar 23, 2020
@bcmills bcmills changed the title os: go1.14 write should handle EINTR os: write should handle EINTR Mar 23, 2020
@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 23, 2020

Marking as release-blocker for 1.15. Once we decide what (if anything) to do here, we may want to backport to 1.14.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 23, 2020

@bcmills glibc does not retry write on EINTR. It's possible that this is a kernel bug in the sysfs file system. Although as sysfs is RAM-based I don't know why that would be. Even if it is a bug that doesn't necessarily mean that we can avoid working around it.

@kolyshkin

This comment has been minimized.

Copy link
Contributor Author

@kolyshkin kolyshkin commented Mar 23, 2020

Am I reading this correctly in that you are writing to files opened under /sys/fs/cgroup/memory? Is that a case that we should care about?

I'm afraid so, yes. Seems like a regression, too, from where I stand.

The alternative is to document that go1.14 may fail with EINTR from the standard library, not just syscall or x/sys/unix as it is stated in release notes.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 23, 2020

For my own edification, can you explain when code would write to /sys/fs/cgroup/memory? Thanks.

@kolyshkin

This comment has been minimized.

Copy link
Contributor Author

@kolyshkin kolyshkin commented Mar 23, 2020

It's possible that this is a kernel bug in the sysfs file system. Although as sysfs is RAM-based I don't know why that would be.

Sorry I haven't mentioned it earlier, this is actually cgroupfs (used to set various per-cgroup controller limits). In particular, memory.limit_in_bytessets the memory limit per cgroup, and I guess this is a relatively costly operation in the kernel.

For my own edification, can you explain when code would write to /sys/fs/cgroup/memory?

In our case, it happens during a container start (a "container" is Docker, Kubernetes or something similar). We can certainly workaround it in runc. I'm just not sure what other Linux pseudo-filesystems can behave like that, and so what other similar workarounds are needed.

@ymmt2005 ymmt2005 mentioned this issue Mar 24, 2020
0 of 4 tasks complete
@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 25, 2020

In my experience with weird kernel FSes like this, you can experience all sorts of weirdness, but all of it falls within the documented bounds of the relevant syscalls.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 25, 2020

@thockin In this case the reported behavior seems clearly not permitted. The signal handler is installed with SA_RESTART, so the write system call should never return EINTR.

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 25, 2020

The test code above doesn't trap any signals. If there's a signal being delivered - what signal? And if it is being trapped - who is installing that handler?

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 25, 2020

And importantly - why does this appear in go-1.14 and not go-1.12 (which is what I had available)?

I dug around /proc looking for some clue on the differences between the 1.14 compiled binary and the 1.12 binary. Nothing obvious. Both show the same Sig* fields in /proc/pid/status:

/proc/39878/task/39887/status:SigQ:	2/257040
/proc/39878/task/39887/status:SigPnd:	0000000000000000
/proc/39878/task/39887/status:SigBlk:	0000000000000000
/proc/39878/task/39887/status:SigIgn:	0000000000000000
/proc/39878/task/39887/status:SigCgt:	fffffffe7fc1feff

And yet the 1.14 fails reliably and 1.12 passes reliably.

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 25, 2020

Throwing a signal handler into the test shows that 1.14 experiences SIGURG and 1.2 does not.

The existence of the generic handler does not change the failure mode.

Commenting out the bulk of the test, the SIGURG does not get triggered.

Adding back just the os.OpenFile (leaving the write commented out) I see SIGURG delivered.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 25, 2020

@thockin The Go 1.14 runtime delivers SIGURG signals frequently. See https://golang.org/doc/go1.14#runtime. This is intended to be hidden from the program (unless the program decides to look for SIGURG signals).

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 25, 2020

who is installing that handler?

The signal handler--all Go signal handlers--is installed by the Go runtime when a Go program starts.

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 25, 2020

Well, I am seeing a SIGURG perfectly with syscall.Open() on 1.14 when I do not see that signal on 1.12. I don't know if that is a red herring, but the correlation is there.

Reducing the test case, just to get a handle on SIGURG:

package foo                                                                                                                                                                    

import (
    "fmt"
    "io/ioutil"
    "os"
    "os/signal"
    "path"
    "syscall"
    "testing"
)

const iter = 1000000

func foo(name string, flag int, perm uint32) error {
    fd, e := syscall.Open(name, flag|syscall.O_CREAT|syscall.O_CLOEXEC, perm)
    if e != nil {
        fmt.Fprintf(os.Stderr, "open(%s): %v\n", name, e)
        return e
    }
    syscall.Close(fd)
    return nil
}

func TestFoo(t *testing.T) {
    c := make(chan os.Signal, 1)
    signal.Notify(c)
    go func() {
        s := <-c
        fmt.Println("Got signal:", s)
    }()

    dir, err := ioutil.TempDir("/tmp", "test-")
    if err != nil {
        t.Fatal(err)
    }   
    defer os.RemoveAll(dir)
    filename := path.Join(dir, "file")

    for i := 0; i < iter; i++ {
        if foo(filename, os.O_WRONLY, 0644) != nil {
            break
        }
    }   
}

Again, maybe SIGURG is not the issue, but it is the most obvious visible difference between 12 and 14, and the error is related to signals...

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 25, 2020

I am only pursuing this for fun, and I have run out of time for it right now. It's clearly reproducible, and it MAY be a problem in cgroupfs, but it didn't happen in Go 1.12 and it does in Go 1.14.

Happy hunting.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 25, 2020

The SIGURG is definitely expected in Go 1.14.

What is weird is that it interrupts the write system call.

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 26, 2020

Just to be clear, it's expected that my user program would see SIGURG ? If so, OK. It's not clear what is interrupting write() - I only see that SIGURG once, and importantly - no other signals.

It's also clear that the signal handler is not happening synchronously to the main execution - putting a sleep in there, my test still dies almost immediately.

const iter = 1000000

func TestWriteHandlesEINTR(t *testing.T) {
    c := make(chan os.Signal, 1)
    signal.Notify(c)
    go func() {
        s := <-c 
        fmt.Printf("Got signal: %q - sleeping 5 seconds\n", s)
        time.Sleep(5 * time.Second)
    }() 

    memoryCgroupMount := "/sys/fs/cgroup/memory"
    cgroupPath, err := ioutil.TempDir(memoryCgroupMount, "test-")
    if err != nil {
        t.Fatal(err)
    }   
    defer os.RemoveAll(cgroupPath)
    filename := path.Join(cgroupPath, "memory.limit_in_bytes")

    for i := 0; i < iter; i++ {
        limit := 1024*1024 + i 

        f, err := os.OpenFile(filename, os.O_WRONLY, 0644)
        if err != nil {
            t.Fatal(err)
        }   
        if n, err := f.WriteString(strconv.Itoa(limit)); err != nil {
            t.Fatalf("Failed to write %d on attempt %d (wrote %d bytes): %s", limit, i, n, err)
        }   
        f.Close()
    }   
}
$ time sudo ./foo-1.14.test 
Got signal: "urgent I/O condition" - sleeping 5 seconds
--- FAIL: TestWriteHandlesEINTR (0.54s)
    eintr_test.go:41: Failed to write 1095844 on attempt 47268 (wrote 0 bytes): write /sys/fs/cgroup/memory/test-844556083/memory.limit_in_bytes: interrupted system call
FAIL

real	0m0.579s
user	0m0.339s
sys	0m0.528s
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 26, 2020

As of the Go 1.14 release, the Go runtime uses SIGURG to preempt goroutines. Because there is no way to distinguish a SIGURG sent by the runtime from a SIGURG sent for any other reasons, all SIGURG signals will be delivered to your program if you ask for them via the os/signal package.

To me the important point here is that the Go runtime installs the SIGURG signal handler with the SA_RESTART flag set, but the write system call is not being restarted.

I ran your test under strace -f and I see the signal handler installed with

rt_sigaction(SIGURG, {sa_handler=0x4669f0, sa_mask=~[], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x466b20}, NULL, 8) = 0

so the SA_RESTART flag is set, and I see this just before the crash:

[pid 46250] tgkill(46249, 46255, SIGURG <unfinished ...>
[pid 46255] write(3, "1048778", 7 <unfinished ...>
[pid 46250] <... tgkill resumed> )      = 0
[pid 46255] <... write resumed> )       = -1 EINTR (Interrupted system call)
[pid 46250] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 46255] --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=46249, si_uid=0} ---

so the signal is indeed interrupting the write, which should not happen.

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 26, 2020

If I repeatedly send SIGURG to the go-1.12 binary, I have not been able to reproduce the problem.

I can see in kernel code where cgroups' memory code returns -EINTR, but I can also see arch/x86 code that handles SA_RESTART.

Weird

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Mar 26, 2020

@thockin, the runtime sends thread-targeted signals. Perhaps you are testing by sending a process-targeted SIGURG instead? The difference may be relevant.

@thockin

This comment has been minimized.

Copy link

@thockin thockin commented Mar 26, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.