Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/signal: add NotifyContext function #37255

Open
henvic opened this issue Feb 17, 2020 · 43 comments
Open

os/signal: add NotifyContext function #37255

henvic opened this issue Feb 17, 2020 · 43 comments

Comments

@henvic
Copy link

@henvic henvic commented Feb 17, 2020

As previously discussed on issues such as #21521 and #16472, or on several other places handling POSIX signals through context seems to be somewhat useful (and a little bit hard to do correctly?), and I would like to propose a way to handle it by adding a new signal.WithContext function.

There's also an idea to improve the current approach to handling signals (see #21521 (comment)). I didn't give it a try yet, unfortunately. I only found the earlier proposals here after trying to write some code, so I decided to create this new issue anyway to share my idea.

People using some sort of it or talking about the subject:

@gopherbot gopherbot added this to the Proposal milestone Feb 17, 2020
@gopherbot gopherbot added the Proposal label Feb 17, 2020
@gopherbot
Copy link

@gopherbot gopherbot commented Feb 17, 2020

Change https://golang.org/cl/219640 mentions this issue: os/signal: add WithContext to control single usage signals easily.

@henvic
Copy link
Author

@henvic henvic commented Feb 17, 2020

Also please notice that this change would, unfortunately, require introducing packages time and context as dependencies for package os/signal.

@ianlancetaylor ianlancetaylor changed the title proposal: signal.WithContext proposal: os/signal: add WithContext function Feb 17, 2020
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Feb 17, 2020

Could you describe the new function in this issue, so that we don't have to figure it out from the CL? Thanks.

@henvic
Copy link
Author

@henvic henvic commented Feb 17, 2020

The new function receives a parent context and a variadic number of signals that can be used to cancel this context. When a parent context is canceled or when one of the signals is handled, the context is canceled.

Example of handling cancelation of a slow process through SIGTERM and SIGINT:

func main() {
	ctx, cancel := signal.WithContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer cancel()

	// ...
	slow.Run(ctx)
	cancel()

	// to verify if the slow function was terminated by a signal, we can use:
	var sigErr signal.ContextError
	if errors.As(ctx.Err(), &sigErr) {
		fmt.Printf("slow function terminated by signal %q\n")
	}
}

On another side, we have the example shown for https://golang.org/pkg/net/http/#Server.Shutdown that, in my opinion, would not be simplified by this.

@robpike
Copy link
Contributor

@robpike robpike commented Feb 17, 2020

Perhaps this should be a function of the context package instead, since it creates a context and it's pretty much all about contexts.

ctx, cancel := context.Signal(context.Background(), syscall.SIGTERM, syscall.SIGINT)

Not sure it's the right idea at all, just throwing it out there.

@mattn
Copy link
Member

@mattn mattn commented Feb 17, 2020

If several packages handle same signal, and runtime does not handle those contexts equally, context.Signal won't work correctly, I think. If runtime does not handle them, context.Signals is not best way . The os/signal package has the potential feeling for us that it is only used in applications, but the context package is used by everyone. So I'm thinking signal.WithContext is better since it make users known that we should do it in only one place. (But I'm also not sure this is right idea)

@rsc
Copy link
Contributor

@rsc rsc commented Mar 11, 2020

To be clear, the runtime does already broadcast signals to anyone who has signed up to hear about them, using signal.Notify. So this could be done as a separate package without worrying about other possible uses of os/signal.

context.WithCancelSignal seems like the clearest name, and it would appear right next to context.WithCancel in the docs.

/cc @Sajmani for thoughts

@rsc rsc changed the title proposal: os/signal: add WithContext function proposal: context: add WithCancelSignal function Mar 25, 2020
@rsc
Copy link
Contributor

@rsc rsc commented Mar 25, 2020

It sounds like maybe we have converged on context.WithCancelSignal. I've retitled.
Based on the discussion above, this seems like a likely accept.

@rsc rsc moved this from Active to Likely Accept in Proposals Mar 26, 2020
@henvic
Copy link
Author

@henvic henvic commented Mar 27, 2020

Great! Thanks for the feedback. I'll try to send a new CL proposing an implementation for a context.WithCancelSignal command Monday.

@undeconstructed
Copy link

@undeconstructed undeconstructed commented Mar 28, 2020

Is there a plan for how this would be documented? Signals are process scoped, so any library using this feature could be problematic. And in a server I would normally catch signals at main level and propagate down, which can be easily/lazily done already by cancelling a single parent context. So using signals for local task cancellation seems only suited for interactive apps. If that's not too controversial, could it be made clear in the docs?

@henvic
Copy link
Author

@henvic henvic commented Mar 31, 2020

I pushed a new implementation to https://go-review.googlesource.com/c/go/+/219640/1

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 31, 2020

I note that currently according to go/build/deps_test.go context depends only on

{"errors", "internal/reflectlite", "sync", "sync/atomic", "time"}

This is going to add a dependency of context on os and os/signal, which pulls in a lot more. I don't know that it matters much, but it seems worth mentioning. Is anybody concerned about the additional dependencies?

@robpike
Copy link
Contributor

@robpike robpike commented Mar 31, 2020

I was worried too, but if you look at the full lists it doesn't add much transitively. In the current tree:

% deps io os os/signal | sort -u
errors
internal/bytealg
internal/cpu
internal/oserror
internal/poll
internal/race
internal/reflectlite
internal/syscall/execenv
internal/syscall/unix
internal/testlog
io
os
runtime
runtime/internal/atomic
runtime/internal/math
runtime/internal/sys
sync
sync/atomic
syscall
time
unsafe
% deps context
errors
internal/bytealg
internal/cpu
internal/oserror
internal/race
internal/reflectlite
runtime
runtime/internal/atomic
runtime/internal/math
runtime/internal/sys
sync
sync/atomic
syscall
time
unsafe
%

The only addition would be io and os themselves, and three internals:internal/syscall/{unix,execenv} and internal/testlog.

I'm not saying I'm happy about this, but it's arguably OK.

@mattn
Copy link
Member

@mattn mattn commented Mar 31, 2020

If receiving the signal in goroutine and context.WithCancelSignal, I wonder how this work.

package main

import (
	"context"
	"os"
	"os/signal"
	"time"
)

func doSomething(ctx context.Context) {
	// ...
}

func main() {
	s := make(chan os.Signal, 1)
	signal.Notify(s, os.Interrupt)

	ctx := context.WithCancelSignal(s)
	doSomething(ctx)

	go func() {
		<-s
		println("interrupted")
	}()

	var b [1]byte
	os.Stdin.Read(b[:])
	time.Sleep(time.Second)
	println("exit")
}
@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 31, 2020

One possibility to avoid adding a dependency on os and os/signal is to have the context package reach into the runtime package as the os/signal package already does. One consideration about depending on the os/signal package is that the os/signal package has an init function that slightly changes how the runtime package handles signals. But we might be able to get rid of that anyhow.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Mar 31, 2020

@mattn The proposed context.WithCancelSignal function takes a signal as an argument, not a channel.

But assuming you meant to pass os.Interrupt to context.WithCancelSignal, then on receipt of a signal both 1) the context would be canceled; 2) the signal would be sent on the channel s.

@mattn
Copy link
Member

@mattn mattn commented Mar 31, 2020

Oh, I see.

@henvic
Copy link
Author

@henvic henvic commented Apr 1, 2020

I'm worried about the number of changes in the signal package too. I'll take a look into the runtime package Friday to see if I understand @ianlancetaylor's idea and to see if I can come up with something.

@henvic henvic closed this Apr 1, 2020
@henvic henvic reopened this Apr 1, 2020
@rsc
Copy link
Contributor

@rsc rsc commented Apr 1, 2020

FWIW, I think we can easily hook this into the runtime package directly (with an internal package if needed) and bypass os and os/signal, to keep the deps lite. The general trend is to avoid os where possible, and all this code only needs the runtime.

No change in consensus here, so accepted.

@rsc rsc moved this from Likely Accept to Accepted in Proposals Apr 1, 2020
@rsc rsc added this to the Proposal milestone Apr 15, 2020
@rsc
Copy link
Contributor

@rsc rsc commented Apr 15, 2020

@rogpeppe, I am a little confused about "the inclusion of syscall in the public API".
The public API would be

package context
func WithCancelSignal(ctx Context, ... os.Signal) (Context, func())

It mentions os, not syscall. It's true that calls might pass signals defined in syscall, but that will happen regardless.

It seems like the three choices would be:

  • syscall.SomethingContext - syscall imports context, argument can't be os.Signal, would need to be syscall.Signal, more OS-dependent than what was previously accepted.

  • os/signal.SomethingContext - os/signal imports context. argument could be os.Signal

  • os.SomethingContext - os imports context. argument could be os.Signal

It seemed cleaner to keep the context constructors in context. Is there a good argument for os or os/signal to start importing context and turn it around?

@Merovius
Copy link

@Merovius Merovius commented Apr 15, 2020

I know that "it feels wrong to me" isn't a great argument, but I agree with @rogpeppe - this felt wrong to me when it was accepted and it still feels wrong. context feels, to me, as a relatively pure abstraction and signal-handling like a relatively platform-dependent use-case.

To try a more tangible argument: Say, hypothetically, the opentracing API would make it into the stdllib. Would we put StartSpanFromContext into the context-package, just because it returns a Context? That seems far-fetched. Of course, we wouldn't put the opentracing API into the stdlib as is and whatever would make it might well have a more natural answer. But IMO it still illustrates that something might return a context, but still fit more naturally into another package.

Another "argument" is that signal.WithContext (or whatever) is naturally implemented as a pretty simple wrapper around context.WithCancel, so it at least has no need to share a namespace with the rest of the context package. Of course it also has no need to share a namespace with the signal-package, but at least it seems closer aligned with that concern IMO.

At the end of the day, both locations seem not fully satisfying to me and as a corollary I can also live with either. But if I was the one who's vague feelings it gets down to, I'd put it into signal as well :)

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Apr 15, 2020

If we do decide that os/signal is the right place for this functionality, I think the name should be simply signal.Context.

package signal
func Context(ctx context.Context, ...os.Signal) (context.Context, context.CancelFunc)

I don't see how signal.WithContext makes sense, and any name that really describes the scenario gets pretty long, as in signal.CancelContextOnSignal. I think signal.Context is clear enough.

@robpike Note that you suggested the context package above in #37255 (comment) and then changed your mind in #37255 (comment). Which is fine, just wanted to double check.

@robpike
Copy link
Contributor

@robpike robpike commented Apr 16, 2020

@ianlancetaylor I didn't exactly change my mind. If you read those comments again, I think you'll see my position has been tentative at best all along. @Merovius explains it well, I think. Context is and perhaps should remain an abstraction free of outside dependency.

I want to reiterate though that I am speaking only from the point of view of a general API designer, not as someone who is a heavy user of either the context or signal packages.

@rogpeppe
Copy link
Contributor

@rogpeppe rogpeppe commented Apr 16, 2020

It mentions os, not syscall. It's true that calls might pass signals defined in syscall, but that will happen regardless.

Yes, sorry, my mistake - I read the syscall values in the examples and assumed the wrong signature. I think my point still stands though. The os package is quite large and system specific, which doesn't seem to fit with the small context API to me.

It seemed cleaner to keep the context constructors in context. Is there a good argument for os or os/signal to start importing context and turn it around?

I think so. Anyone that wishes to add any kind of custom cancellation to a context must provide their own constructor, and I don't see why signal-based cancellation should be any different.
If it's inefficient to do so, then perhaps that's something more general that could be addressed in the context package.

FWIW I support @ianlancetaylor's suggestion of a signal.Context function.

@rsc
Copy link
Contributor

@rsc rsc commented Apr 22, 2020

@ianlancetaylor @bradfitz and I bikeshedded the name for a while. This is the best we came up with:

package signal
func NotifyContext(parent context.Context, signals ...os.Signal) (ctx context.Context, cancel context.CancelFunc)

NotifyContext returns a copy of the parent context that is marked done
(it's Done channel is closed) when one of the listed signals arrives,
when the returned cancel function is called, or when the parent context's
Done channel is closed, whichever happens first.

Canceling this context releases resources associated with it, so code should
call cancel as soon as the operations running in this Context complete.

Thoughts?

@henvic
Copy link
Author

@henvic henvic commented Apr 22, 2020

From the perspective of a user, this sounds good and very useful to me to write cleaner code.

Also, what about an error type such as the ContextError I implemented in patchset 1 https://go-review.googlesource.com/c/go/+/219640/1?

// ContextError is the error returned by Context.Err when the context is
// canceled by an operating system signal.
type ContextError struct {
	sig os.Signal
}

func (e ContextError) Error() string {
	return "context canceled after receiving signal " + e.sig.String()
}

// Signal that canceled the context.
func (e ContextError) Signal() os.Signal { return e.sig }

The idea is to make it possible to identify what signal canceled it. This is useful for deciding what to do next. For example, you might want to delegate handling your signals to a function and then differentiate what is happening there, like:

SIGHUP: configuration file reload.
SIGTERM, SIGINT: stop handling new HTTP requests and exit applications gracefully.

  • I'm not satisfied with the name ContextError, though.
@bcmills
Copy link
Member

@bcmills bcmills commented Apr 22, 2020

Canceling this context releases resources associated with it, so code should
call cancel as soon as the operations running in this Context complete.

Does canceling the context also (potentially) change the signal behavior of the program?

If so, does that signal behavior only change when cancel is called, or can that also happen implicitly when the parent context becomes done and/or the signal arrives? (Personally, I would prefer that it only change when cancel is called.)

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Apr 22, 2020

I'm not necessarily opposed to context.SignalError, but I think that if you want to make decisions based on which signal was received then you shouldn't be using signal.NotifyContext. You should be using context.WithCancel and signal.Notify, and canceling the context when receiving a value from the signal.Notify channel. We already have a mechanism for handling signals; the point of this change is to provide a convenient way to cancel a context on receipt of a signal. If you have to do different things based on different signals, we already have mechanisms for that.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Apr 22, 2020

Ouch, I hadn't thought about changes in signal handling behavior. I agree that the only possible choice here is that the signal handling behavior only changes when or if the cancel function is explicitly called. Otherwise the wrong thing would happen if the program received two SIGHUP signals in a row.

@pierrre
Copy link

@pierrre pierrre commented Apr 22, 2020

Personally I will probably not use this new function.
I'm in favor of not adding this to the stdlib.
It's very simple to write your own implementation in your app and the need may be different in each app.

I've already built a package that I'm using at my company, and I don't think it's compatible with your proposal.
Here what it is doing:

  • the first time it receives a signal, it cancels the context (without unregistering the signal watcher)
  • the second time it receives the signal, it exits the app
  • the caller must explicitly unregister the signal watcher

Here is the code if someone is interested:

// Package ctxsignal provides a bridge for context and POSIX signals.
package ctxsignal

import (
	"context"
	"log"
	"os"
	"os/signal"
	"sync"
	"syscall"

	"github.com/xxx/yyy/goroutine"
)

// RegisterCancel registers a listener for the SIGINT/SIGTERM signals.
// The first time that the signal is received, the context is canceled.
// The second time, the program exits with status 0.
func RegisterCancel(parent context.Context) (ctx context.Context, unregister func()) {
	ctx, cancel := context.WithCancel(parent)
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)
	waitWatchSignal := goroutine.Go(func() {
		watchSignal(c, cancel)
	})
	unregister = newUnregisterFunc(c, waitWatchSignal)
	return ctx, unregister
}

func watchSignal(c <-chan os.Signal, cancel func()) {
	sig, ok := <-c
	if !ok {
		return
	}
	log.Printf("%q signal received, context canceled. Send it again to exit the program immediately.", sig)
	cancel()
	_, ok = <-c
	if !ok {
		return
	}
	log.Println("Signal received twice, exit now.")
	os.Exit(0)
}

func newUnregisterFunc(c chan os.Signal, waitWatchSignal func()) func() {
	var once sync.Once
	return func() {
		once.Do(func() {
			signal.Stop(c)
			close(c)
		})
		waitWatchSignal()
	}
}

(waitWatchSignal is a function that blocks until the goroutine is finished).

PS: I use it at the very beginning of my applications, in order to support graceful stop.

@henvic
Copy link
Author

@henvic henvic commented Apr 22, 2020

Not an expert in signal handling by any means, but...

Why would it be bad to change the signal handling behavior when the parent context is canceled? Wouldn't also be similarly bad if it doesn't (especially if there is a way to check what signal caused the context to end)? It's not clear to me what an example would be. Instead, shouldn't NotifyContext be used first when this happens?

Another thing to worry: if you want to use this for something like handling configuration reload through SIGHUP, it might be fine if you lose a signal (if you replace the context immediately to keep this going on - I'm not sure if this would be good btw). However, if you're listening to, say, SIGTERM, if a second SIGTERM arrives before you called signal.Notify again, your process might terminate before it should.

@bcmills
Copy link
Member

@bcmills bcmills commented Apr 22, 2020

The idea is to make it possible to identify what signal canceled it.

There is already a pretty strong precedent that ctx.Err() is exactly either context.Canceled or context.DeadlineExceeded. Rather than deviating from that precedent, it might be less intrusive to encode the signal in the Value method and provide a top-level accessor for it:

package signal

// ContextSignal reports which (if any) signal caused ctx to be canceled.
//
// If ctx was not canceled due to a signal, ContextSignal returns (nil, false).
func ContextSignal(ctx context.Context) (os.Signal, bool) {
	sig, ok := ctx.Value(signalContextKey{}).(os.Signal)
	return sig, ok
}
@henvic
Copy link
Author

@henvic henvic commented Apr 22, 2020

@bcmills, I like it.

@narqo
Copy link
Contributor

@narqo narqo commented Apr 23, 2020

@bcmills, If somewhere within a call stack, I had an operation, that accepted the context and reacted to its cancellation, I don't think I would check something related to the signal in this operation, but rather returned ctx.Err(). After this error bubbled up to someone who "cares" about the signals, it would

  1. check if the error wasn't nil
  2. check if the context was cancelled
  3. check if there is a signal in the context's value

Is that how ContextSignal expected to be used?

@rsc
Copy link
Contributor

@rsc rsc commented Apr 29, 2020

To fix the double-signal problem, it sounds like we have to deviate from the usual context.WithCancel to require calling the cancellation function if you want to stop the captured signal behavior. That is, the docs would change to:

package signal
func NotifyContext(parent context.Context, signals ...os.Signal) (ctx context.Context, stop context.CancelFunc)

NotifyContext returns a copy of the parent context that is marked done
(it's Done channel is closed) when one of the listed signals arrives,
when the returned stop function is called, or when the parent context's
Done channel is closed, whichever happens first.

The stop function unregisters the signal behavior, which, like signal.Reset,
may restore the default behavior for a given signal. For example, the default
behavior of a Go program receiving os.Interrupt is to exit. Calling
NotifyContext(parent, os.Interrupt) will change the behavior to cancel
the returned context. Future interrupts received will not trigger the default
(exit) behavior until the returned stop function is called.

The stop function releases resources associated with it, so code should
call stop as soon as the operations running in this Context complete and
signals no longer need to be diverted to the context.


It remains unclear that we really need the signal details. For that, it's easy enough to write the simple code that handles the signal, records it, and cancels a context. That is, NotifyContext is meant to be more a convenience than a complete Swiss army knife. We can always add it later if we need to, but it probably makes sense to start without it.

Thoughts?

@henvic
Copy link
Author

@henvic henvic commented Apr 29, 2020

It makes sense to me. I liked the idea of calling the cancel function stop to differentiate things.

@rsc rsc changed the title context: add WithCancelSignal function os/signal: add NotifyContext function May 6, 2020
@rsc
Copy link
Contributor

@rsc rsc commented May 6, 2020

Based on the discussion above, it seems like we've reconverged on signal.NotifyContext, documented in #37255 (comment).

This seems like a likely accept (take two).

@rsc rsc moved this from Active to Likely Accept in Proposals May 6, 2020
@rsc rsc changed the title os/signal: add NotifyContext function proposal: os/signal: add NotifyContext function May 6, 2020
@rsc
Copy link
Contributor

@rsc rsc commented May 20, 2020

No change in consensus (yet?) so accepted (again).

@rsc rsc moved this from Likely Accept to Accepted in Proposals May 20, 2020
@rsc rsc modified the milestones: Proposal, Backlog May 20, 2020
@rsc rsc changed the title proposal: os/signal: add NotifyContext function os/signal: add NotifyContext function May 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Proposals
Accepted
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.