runtime: allow reading sysfs, etc with ignoring event scanning error #30817

Closed
stgraber opened this Issue Mar 13, 2019 · 7 comments

stgraber commented Mar 13, 2019

What version of Go are you using (go version)?

go version devel +870cfe6484 Wed Mar 13 21:44:45 2019 +0000 linux/amd64

Does this issue reproduce with the latest release?

No, this is only reproducible with master; 1.10, 1.11 and 1.12 are all unaffected.

What operating system and processor architecture are you using (go env)?

Linux buildd01 4.15.0-46-generic #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

go env Output
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/go"
GOPROXY=""
GORACE=""
GOROOT="/lxc-ci/build/cache/gimme/versions/go"
GOTMPDIR=""
GOTOOLDIR="/lxc-ci/build/cache/gimme/versions/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build092171511=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Attempted multiple parallel reads of the same file from /sys

package main

import (
	"fmt"
	"io/ioutil"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup

	// Each goroutine reads the same sysfs file 100 times, pausing between reads.
	readSysfs := func() {
		defer wg.Done()
		for i := 0; i < 100; i++ {
			_, err := ioutil.ReadFile("/sys/devices/system/cpu/cpu0/topology/core_id")
			if err != nil {
				fmt.Printf("error: %v\n", err)
			}
			time.Sleep(100 * time.Millisecond)
		}
	}

	// Run four readers in parallel and wait for all of them to finish.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go readSysfs()
	}

	wg.Wait()
}

What did you expect to see?

No errors, all reads succeeding as they do with all other tested Go versions

What did you see instead?

root@buildd01:~# /lxc-ci/build/cache/gimme/versions/go/bin/go run test.go 
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable
error: read /sys/devices/system/cpu/cpu0/topology/core_id: not pollable

Tests so far show about a 15% failure rate on that ReadFile call.

Author

stgraber commented Mar 13, 2019

We've seen this popping up in LXD's automated testing starting over the past two days or so, suggesting a pretty recent regression.

Contributor

tmthrgd commented Mar 13, 2019

This is possibly caused by CL 166497 for #30624.

mikioh changed the title from "Current Go master breaks reading files from /sys" to "runtime: allow reading sysfs, etc with ignoring event scanning error" Mar 14, 2019

mikioh added the OS-Linux label Mar 14, 2019

mikioh added this to the Go1.13 milestone Mar 14, 2019

Contributor

mikioh commented Mar 14, 2019

Thanks for the report. In general, it is better to use your own polling mechanism to read sysfs-like virtual files, because the runtime-integrated network poller never uses EPOLLPRI, or the EPOLLPRI+EPOLLERR pair, which such files use to signal special conditions such as actual data reception from the underlying device. For backward compatibility, I'll make the poller a bit more conservative so that most user-configured files can ignore event scanning errors, except for cases like a misconfigured /dev/net/tun, where subsequent I/O calls would otherwise block forever.

@ianlancetaylor, any opinion?

Contributor

ianlancetaylor commented Mar 15, 2019

I don't know enough to have an opinion. Why would the poller sometimes return EPOLLERR here?

Contributor

mikioh commented Mar 15, 2019

For example, on some special files, EPOLLIN+EPOLLOUT indicates that the underlying device is ready for operation and EPOLLPRI+EPOLLERR indicates actual data reception. Other files use other combinations. Fortunately, by convention we can treat a lone POLLERR, EPOLLERR or EV_ERROR as a critical state, in the same way that having all events set marks the end of a session; see https://go-review.googlesource.com/c/go/+/167777


gopherbot commented Mar 15, 2019

Change https://golang.org/cl/167777 mentions this issue: runtime, internal/poll: report only critical event scanning error

Member

mvdan commented Mar 17, 2019

I hit this while trying to use https://github.com/aclements/perflock on a recent Go build. I was scratching my head for a good fifteen minutes until I realised it wasn't setting the right CPU frequency because of these read errors.

mikioh removed the OS-Linux label Mar 18, 2019

gopherbot closed this in 451a2eb Mar 19, 2019
