Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: Inconsistent error codes from Process.Signal #7658

Closed
gopherbot opened this issue Mar 28, 2014 · 3 comments
Closed

os: Inconsistent error codes from Process.Signal #7658

gopherbot opened this issue Mar 28, 2014 · 3 comments
Milestone

Comments

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 28, 2014

by glyn.normington:

I noticed this problem while reading the source code of Process.Signal. A discussion on
go-nuts ([1]) concluded that a feature request would be beneficial.

The problem is that one of two errors can be returned if the process terminates before
the signal is sent. If the caller wants to treat this situation as a soft error and
ignore it, she must check for both errors. Not only is this inconvenient, but it exposes
the implementation of Signal to the caller.

Signal starts by checking p.done() and, if this shows the process has terminated, Signal
returns errors.New("os: process already finished"). But if the process
terminates after p.done() has returned false but before Signal calls syscall.Kill, then
(apparently, see [1], but I can't find any documentation of this behaviour) syscall.Kill
will return syscall.ESRCH which will then be returned by Signal.

So the caller who wishes to ignore this error needs to test the Error() string of the
message and, if that compares false, also check to see if the error is syscall.ESRCH.
Not only is this inconvenient and potentially fragile, but it exposes the implementation
of Signal.

The solution is to return a consistent error which needs to be chosen carefully.
syscall.ESRCH seems a little risky to me because it relies on the implementation of
Signal and of syscall.Kill. It seems better for Signal to return a (possibly new)
os.XXXX error.

After choosing the appropriate error code to return, there are two solutions. One is to
fix up the current code so that the same error is returned regardless of whether
p.done() or syscall.Kill detects that the process has terminated.

However a much better solution, in my opinion, is to delete the p.done() check and lean
on the check performed by syscall.Kill. This solution removes a small race and ensures
that a consistent error code is returned. I would still recommend converting
syscall.ESRCH to an os.XXXX error to hide the implementation of Signal from the caller.
The other benefit of this solution is that it reduces the amount of code, which makes
the code easier to read and maintain.

Ian Lance Taylor disagreed with this solution (in discussion [1]) and thinks it would be
better to continue to issue p.done() but ensure that the error code is consistent. Ian's
objection to removing the p.done() check is that he is concerned about its effect on a
quite different race where the process in question terminates *and* its PID is reused.
This race is a known problem in *IX and not something that Go can or should address.
However, Ian was of the opinion that removing the p.done() check makes this race
(significantly) worse - a conclusion with which I disagree.

Let's analyse this second race. If the PID has already been reused before Signal is
called, the proposed change will make no difference and Signal will operate on the wrong
process (and the caller is out of luck). If the PID is reused after Signal has returned,
then the proposed change will again make no difference: Signal will fail and return a
non-nil error code.

The situation of concern to Ian is where the PID is reused in the window between Signal
being called and it returning. Deleting the p.done() check decreases the size of the
window and makes the situation of concern less likely. But also let's consider the
effect of decreasing the size of the window on the race. The effect is to increase the
probability that Signal will return before the PID is reused, which is a beneficial
effect: Signal will operate on the wrong PID a little less often.

[1]
https://groups.google.com/forum/#!searchin/golang-nuts/Process.signal/golang-nuts/2Zmut0tcnPE/wIsFGWmWz1MJ
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 28, 2014

Comment 1:

Labels changed: added repo-main, release-go1.4.

@gopherbot

This comment has been minimized.

Copy link
Author

@gopherbot gopherbot commented Oct 6, 2014

Comment 2:

CL https://golang.org/cl/152240043 mentions this issue.
@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Oct 6, 2014

Comment 3:

This issue was closed by revision d21b37b.

Status changed to Fixed.

@gopherbot gopherbot added fixed labels Oct 6, 2014
@rsc rsc added this to the Go1.4 milestone Apr 14, 2015
@rsc rsc removed the release-go1.4 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
wheatman added a commit to wheatman/go-akaros that referenced this issue Jun 25, 2018
While we're here, fix the implementation of Release on both
Unix and Windows: Release is supposed to make Signal an error.

While we're here, make sure we never Signal pid 0.
(Don't try this at home.)

Fixes golang#7658.

LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant
https://golang.org/cl/152240043
wheatman added a commit to wheatman/go-akaros that referenced this issue Jun 26, 2018
While we're here, fix the implementation of Release on both
Unix and Windows: Release is supposed to make Signal an error.

While we're here, make sure we never Signal pid 0.
(Don't try this at home.)

Fixes golang#7658.

LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant
https://golang.org/cl/152240043
wheatman added a commit to wheatman/go-akaros that referenced this issue Jul 9, 2018
While we're here, fix the implementation of Release on both
Unix and Windows: Release is supposed to make Signal an error.

While we're here, make sure we never Signal pid 0.
(Don't try this at home.)

Fixes golang#7658.

LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant
https://golang.org/cl/152240043
wheatman added a commit to wheatman/go-akaros that referenced this issue Jul 30, 2018
While we're here, fix the implementation of Release on both
Unix and Windows: Release is supposed to make Signal an error.

While we're here, make sure we never Signal pid 0.
(Don't try this at home.)

Fixes golang#7658.

LGTM=r
R=golang-codereviews, r
CC=golang-codereviews, iant
https://golang.org/cl/152240043
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.