Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: TestEPIPE fails on android #18100

Closed
crawshaw opened this issue Nov 29, 2016 · 9 comments
Closed

os: TestEPIPE fails on android #18100

crawshaw opened this issue Nov 29, 2016 · 9 comments
Assignees
Milestone

Comments

@crawshaw
Copy link
Contributor

@crawshaw crawshaw commented Nov 29, 2016

At tip, TestEPIPE causes the package os test binary to immediately exit. It shows up as no output on the builder, but run locally as go test -short -v os:

--- SKIP: TestMkdirAllAtSlash (0.00s)
	path_test.go:203: skipping on android
=== RUN   TestEPIPE
exitcode=141go_android_exec: adb shell rm -rf /data/local/tmp/os.test-39331
FAIL	os	1.788s
$

The failure is caused by the change in https://golang.org/cl/32796. (I am not sure why, but reverting it fixes the test.)

cc @ianlancetaylor

@crawshaw crawshaw added this to the Go1.8 milestone Nov 29, 2016
@crawshaw

This comment has been minimized.

Copy link
Contributor Author

@crawshaw crawshaw commented Nov 29, 2016

This is also responsible for the os/exec test failure on android, which exits with no error in TestContextCancel (which uses a pipe).

@eliasnaur

This comment has been minimized.

Copy link
Contributor

@eliasnaur eliasnaur commented Nov 29, 2016

On it.

@quentinmit quentinmit added the NeedsFix label Nov 29, 2016
@eliasnaur

This comment has been minimized.

Copy link
Contributor

@eliasnaur eliasnaur commented Nov 30, 2016

Here are my results so far.

CL 32796 did two things: First, the Go signal handler for SIGPIPE is installed even in c-archive and c-shared mode. Second, sigfwdgo is changed to forward SIGPIPE signals to any C signal handler installed before Go. The problem is the second part, SIGPIPE forwarding.

sigfwdgo contains this check:

    // Determine if the signal occurred inside Go code. We test that:
    //   (1) we were in a goroutine (i.e., m.curg != nil), and
    //   (2) we weren't in CGO (i.e., m.curg.syscallsp == 0).
    g := getg()
    if g != nil && g.m != nil && g.m.curg != nil && g.m.curg.syscallsp == 0 {
        return false
    }

However, it seems that (2) in the comment and check disagree: when TestEPIPE triggers a SIGPIPE through the write system call, syscallsp is not 0 because syscallsp is nonzero in both CGO and in syscalls. So sigfwdgo decides to forward the SIGPIPE because it thinks it originates from CGO. The default handler in turn ends up crashing the process.

I was puzzled for a while why Android failed and other unixes didn't. The reason seems to be that the earlier check in sigfwdgo

    // If there is no handler to forward to, no need to forward.
    if fwdFn == _SIG_DFL {
        return false
    }

catches the SIGPIPE early on (my) Linux, while on Android there is a non-default handler for SIGPIPE and so SIGPIPEs fall through to the syscallsp check.

I'm not sure how to proceed from here. Perhaps the CGO check should be fixed to exclude syscalls from Go (if not, the (2) comment should be updated). If so, how? @ianlancetaylor ?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Nov 30, 2016

Thanks for investigating. The case you mention--a signal occurring during a system call--was never a problem before we started handling SIGPIPE in this way, because of course a synchronous signal can never occur while executing a system call. That did not occur to me while reviewing the earlier CL.

Unfortunately I can't think of a way to reliably distinguish running in cgo code from running in a system call. It may be necessary to add a field like ncgo that is true or false depending on whether we are executing in cgo code. But I'm somewhat reluctant to do that at this point in the release cycle.

@eliasnaur

This comment has been minimized.

Copy link
Contributor

@eliasnaur eliasnaur commented Nov 30, 2016

I suspected that much. Without ncgo or similar would you accept a CL that removed forwarding of SIGPIPE for Go 1.8? Forwarding it is most correct, but since SIGPIPE seems like a pretty useless signal I'd rather not forward them than allowing SIGPIPEs from Go to crash c-archive and c-shared programs.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Nov 30, 2016

Frankly at this point my inclination is to roll back the whole change series and try again in 1.9. Failing to forward SIGPIPE will break Unix command line programs that expect to be killed by SIGPIPE when writing to a closed pipe. I don't know if there are any such programs that call Go libraries, but I don't particularly want to fix some programs while potentially breaking others. Especially since the currently broken programs should be able to call signal.Notify(c, SIGPIPE) to collect and discard SIGPIPE signals themselves.

@eliasnaur

This comment has been minimized.

Copy link
Contributor

@eliasnaur eliasnaur commented Dec 1, 2016

@eliasnaur

This comment has been minimized.

Copy link
Contributor

@eliasnaur eliasnaur commented Dec 1, 2016

With the revert in, this issue should be closed. Also, could someone please re-open #17393 and mark it for Go 1.9 (or show me how to do it myself)? Thanks.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Dec 2, 2016

@eliasnaur Issues adjusted. I think I added you to the appropriate github group so that you can frob issues yourself.

@golang golang locked and limited conversation to collaborators Dec 2, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.