Feature request: A way to signal pods #24957

Open · thockin opened this issue Apr 29, 2016 · 63 comments

@thockin (Member) commented Apr 29, 2016

This has come up a bunch of times in conversations. The idea is that some external orchestration is being performed and the user needs to signal (SIGHUP usually, but arbitrary) pods. Sometimes this is related to ConfigMap (#22368) and sometimes not. This also can be used to "bounce" pods.

We can't currently signal across containers in a pod, so any sort of sidecar is out for now.

We can do it by docker kill, but something like kubectl signal is clumsy - it's not clearly an API operation (unless we revisit 'operation' constructs for managing imperative async actions).

Another issue is that a pod doesn't really exist - do we signal one container? Which one? Maybe all containers? Or do we nominate a signalee in the pod spec?
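(For reference, the docker kill route mentioned above amounts to running something like the following on the node that hosts the pod; the container ID and signal are illustrative, and it bypasses the Kubernetes API entirely.)

# run on the node hosting the pod; CONTAINER_ID is the Docker ID of the
# target container, e.g. taken from `docker ps`
docker kill --signal=SIGHUP CONTAINER_ID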

@bprashanth commented Apr 29, 2016

We can just use config map as the kube-event-bus. It's what I was going to do for petset.

I need to notify petset peers about changes in cluster membership. The idea is that people will write "on-change" hook handlers; the petset will create a config map and write to it any time it creates a new pet. The kubelet runs a handler, exec/http-probe style, on every change to the config map.

I'd rather have a hook than an explicit signal, because then I don't need a proxy to listen for the signal and reload the pet. In fact, to do this reliably I need a handler, because I need to poll till the pet shows up in DNS.

I vastly prefer this to running a pid 1 that is cluster aware (autopilot strategy).
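(For illustration only: the per-pet "on-change" handler the kubelet would exec might be little more than a script that waits for the new peer to resolve and then reconfigures the local member. The handler path and the reconfigure command are made-up placeholders.)

#!/bin/sh
# hypothetical on-change handler; $1 is the hostname of the newly created pet
NEW_PEER="$1"
# poll until the peer shows up in DNS, since membership changes lag DNS
until nslookup "$NEW_PEER" >/dev/null 2>&1; do
  sleep 2
done
# apply the new membership locally (application-specific placeholder command)
/usr/local/bin/reconfigure-cluster --add "$NEW_PEER"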

@thockin (Member, Author) commented Apr 29, 2016

I opened this as a separate issue because I have been asked several times for literal signals. Asking people to do something instead of signals, especially something kube-centric, is a non-starter for people who have apps that need signals. We could ask them to bundle sidecars as bridges, but because we can not signal across containers, they have to bundle into their own containers. That is gross to the max.


@bprashanth commented Apr 29, 2016

You won't need a sidecar, you'll need a shell script that sends pkill -HUP nginx, right? The kubelet will exec-probe that script when the config map changes. This will require modification of the pod spec but is more flexible IMO.
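(I.e., roughly this, as a minimal sketch; nginx stands in for whatever process needs to reload.)

#!/bin/sh
# reload hook the kubelet would exec on every ConfigMap change
pkill -HUP nginx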

@thockin (Member, Author) commented Apr 30, 2016

You're missing the point. There are other things people want to do that involve sending a signal, and "create a config map, modify your container to include bash and pkill, modify your pod to mount the config map, write a script that runs your process and then watches the configmap and signals your process, and then run that instead of your real app" is a crappy solution. It's a kludgey workaround for lack of a real feature.


@bprashanth commented Apr 30, 2016

"write a script that runs your process and then watches the configmap and signals your process, and then run that instead of your real app"

That's not part of what I want. I explicitly don't want a long-running process aware of the config map.

"create a config map, modify your container to include bash and pkill, modify your pod to mount the config map"

I think this is trivial. Not bash - sh or ash. Or HTTP. The same things you need for a probe, or a post-start hook, or a pre-stop hook: concepts we already have.

In short, I want a way to tell the container about a reconfigure event, and I don't want to teach all databases to handle a signal properly.

@bprashanth commented Apr 30, 2016

And I'm trying not to end up with 2 overlapping concepts when you can achieve one with the other. Somehow I don't think people will be against a probe-like thing instead of a straight signal.

@thockin (Member, Author) commented Apr 30, 2016

It feels like a hack to me. I'd be fine with notifiers, and with notifiers attached to configmap changes, and with notifiers that included "send a signal". None of that obviates the utility of "I want to signal my pods".


@bprashanth commented Apr 30, 2016

Hmm, my feature request doesn't actually require a config map either - that's just the common case. Maybe a notifier of type=probe,exec,signal?

@smarterclayton (Contributor) commented Apr 30, 2016

Regarding signals and restarts, is kill the right signal, or do users want to restart the pod itself? I feel like the granularity on signals is processes (of which container is a relatively good proxy), but the granularity on restart / exec actions is containers or pods.

@thockin (Member, Author) commented Apr 30, 2016

I could see an argument for "restart" as a verb on pods (albeit with the same issues around imperatives), distinct from signal, though I assumed signalling a pod meant signalling all containers, or a self-nominated signal-receiver container.

@smarterclayton (Contributor) commented Apr 30, 2016

@bprashanth commented Apr 30, 2016

I'm still kind of stuck at the point where we're special-casing a specific form of Linux IPC. I really want a cluster notification system, the last hop of which is some implementation of a node-local IPC. I.e., say I have a pod with:

notification:
  exec: /on-change.sh
  stdin: ""

and I write "foo" to stdin, the kubelet will deliver that at least once, to the one container with the hook. I then observe the generation number and know that my container has received echo foo | on-change.sh. Now say I want to send a signal, I'd just do:

notification:
  signal: POSIX number??

or

notification:
  exec: "sighup.sh"
  stdin: ""

And I still write "" to the stdin field, and poll on generation.

Is that a different feature request?
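(A purely hypothetical sketch of how a client might drive the fields proposed above; neither a spec.notification field nor a notification generation exists in the Pod API.)

# write "foo" to the hypothetical stdin field...
kubectl patch pod my-pet-0 --type=merge -p '{"spec":{"notification":{"stdin":"foo"}}}'
# ...then poll the (equally hypothetical) generation until the kubelet reports the hook ran
kubectl get pod my-pet-0 -o jsonpath='{.status.notificationGeneration}'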

@smarterclayton (Contributor) commented Apr 30, 2016

I kind of agree signal is a subset of "pod notification". The security aspects of exec make it something I'd prefer to require the pod author to define, in which case it's a named hook / notification.


@smarterclayton (Contributor) commented May 10, 2016

I would like to have a concrete way to force and guarantee a restart of a pod from the outside without hacking. One option that is not signals is a monotonic value on the pod that a user can update (essentially, generation). That's more "whole pod" signaling.


@bprashanth commented May 10, 2016

So all containers BUT pause need to restart? (otherwise why not just delete the pod?)

@eghobo commented May 10, 2016

It should be optional and probably support rollout restart; it's too dangerous to restart everything right away.

@bprashanth commented May 10, 2016

it should be optional and probably support rollout restart,

If it's too dangerous to restart containers in a pod at the same time, put them in different pods/services? Everything else in the system will treat the pod as a unit, so there's probably no avoiding this. Or are you asking for a dependency chain of restarts on containers in a pod?

@smarterclayton (Contributor) commented May 10, 2016

Classic cases:

  • service registers itself on start with a system, and that system has failed
  • server has caches that need to be flushed because of a failure elsewhere
  • server is flaking in a way that liveness probe doesn't/can't catch
  • debugging a pod on a particular node
  • database that needs a restart after a schema change

In all of these cases the invariant enforced by pod deletion is excessive.

@eghobo commented May 10, 2016

@bprashanth: for example, we have a replication controller with N replicas; if we restart all of them at the same time (e.g. when we roll out a new config change), we will affect customer traffic.
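(For what it's worth, the usual way to get a rolling rather than simultaneous restart with a Deployment is to bump something in the pod template, so the controller replaces pods at its normal rolling-update pace; the annotation name below is arbitrary. Newer kubectl versions formalize this as kubectl rollout restart.)

# patching the pod template triggers a rolling update instead of
# restarting every replica at once
kubectl patch deployment my-app -p \
  "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"restartedAt\":\"$(date +%s)\"}}}}}"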

@therc (Contributor) commented May 12, 2016

More use cases: some software can be told to increase logging or debugging verbosity with SIGUSR1 and SIGUSR2, to rotate logs, open diagnostic ports, etc.

@pmorie (Member) commented May 24, 2016

I also view this as notification. I would prefer to expose an intentional API instead of something in terms of POSIX signals. I would take two actions to start with (sketched after this comment):

  1. bump - signal the container somehow that it should reload any config
  2. restart - what it says on the tin

The problems with an intentional API are:

  1. The ways of carrying out intent are not going to be universal across platforms; we need to define them for each platform we support; we can say that on Linux, bump means SIGHUP
  2. As others have commented, there will be some overlap with other areas like the use of SIGUSR1 and SIGUSR2 - where do we draw the line between rich APIs for specific platforms versus more abstract (but harder to implement universally) APIs?
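(A purely hypothetical sketch of what such intentional verbs could look like on the CLI; neither command exists in kubectl.)

kubectl bump pod/my-app-0      # "reload your config" - on Linux this could mean SIGHUP
kubectl restart pod/my-app-0   # restart the pod's containers in place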

@pmorie (Member) commented May 24, 2016

Also, for the record, it seems plausible that we would constrain the list of signals you could send through such an API using PodSecurityPolicy.

@nhlfr commented Jun 3, 2016

I have no strong preference about the "intentional API vs POSIX" problem. The only proposal I'm opposed to is using any kind of exec or scripts as a workaround. Docker has docker kill --signal; rkt has an issue (rkt/rkt#1496) where the idea of supporting custom signals is being considered. I'd prefer to go with these features.

@pmorie sorry for my ignorance, but what exact non-Linux platforms do we have to care about in k8s? I had no luck googling/grepping that information.

@smarterclayton (Contributor) commented Jun 3, 2016

@bprashanth commented Jun 3, 2016

For the record, we should be able to exec scripts from sidecars for notification delivery with shared PID namespaces, which is apparently coming in Docker 1.12 (#1615).

@nhlfr commented Jun 3, 2016

@smarterclayton OK, so in that case an intentional API sounds better.

So, for now, are we going to implement the bump and restart actions as:

  1. API actions on the pod
  2. Actions which may be triggered by an event bus

Also, as far as I understand, this event bus is something we would have to implement? Sorry if I'm missing an already existing feature.

If I understand the API part correctly, then I'm eager to implement it.

@thockin (Member, Author) commented Aug 15, 2017

There's no generic tag for "design needed". The "feature request" in the title says that, to me.

@chrishiestand (Contributor) commented Sep 29, 2017

Does PR #45236 solve the issue for some/most use cases? E.g., AFAIK a signal sidecar should now be possible.

@nhlfr commented Sep 29, 2017

@chrishiestand It depends on whether you're asking about a hacky solution you can use here and now for killing containers/processes, or whether you're concerned with designing a proper API for killing containers.

Well, if you have a shared PID namespace, you can add a container with the kill binary to the pod (one that keeps running, i.e. sleep in a loop), exec into it (kubectl exec), and kill any process in the pod you want. This may be a good solution for today.
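(Concretely, that amounts to something like the following; the container name "signaler" and the target process are illustrative, and the pod is assumed to have been created with shareProcessNamespace: true and a long-running sidecar whose image includes pkill.)

kubectl exec my-pod -c signaler -- pkill -HUP nginx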

But I don't think it really helps in the discussion we had about implementing an API.

@fejta-bot commented Jan 6, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@discordianfish (Contributor) commented Jan 6, 2018

/remove-lifecycle stale

@thockin (Member, Author) commented Feb 1, 2018

/lifecycle frozen
/remove-lifecycle stale

@jeffrey4l (Contributor) commented May 2, 2018

Any news on this?

Using kubectl exec is not convenient when there are lots of pod replicas.

@verb (Contributor) commented May 2, 2018

It's not an API for signaling, but signaling across containers in a pod is available as an alpha feature in 1.10, which could enable sidecars.

You can follow along and give feedback in kubernetes/enhancements#495

@subos2008 commented Apr 4, 2020

This issue was linked from #29761 (and the source issue closed).

That original issue is about restarting containers when secrets change. I see that as a simpler case than the more complex signalling issue here. Secrets can be set in the environment, and the environment a process sees is fixed at startup, so if a secret used to set an environment variable changes, you could say that container is outdated and flag it for re-creation.

How are rolling updates on image changes handled? Can we use similar logic to flag pods with out-of-date environments as needing an 'update' and restart them in a safe, rolling manner?

Would others agree that splitting the issues of "I want to send custom signals to a pod" and "outdated pods getting replaced" means we might move some of the less complex use cases forward?
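(One common pattern along those lines is to hash the secret/config contents into a pod-template annotation, so any change shows up as a template change and flows through the normal rolling update. The annotation key is a convention, not an API field.)

# recompute a checksum of the secret and patch it into the pod template;
# a changed checksum makes the Deployment roll its pods at the usual pace
CHECKSUM=$(kubectl get secret my-secret -o yaml | sha256sum | cut -d' ' -f1)
kubectl patch deployment my-app -p \
  "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"checksum/secret\":\"$CHECKSUM\"}}}}}"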

@alitoufighi commented May 22, 2020

Regarding the comparison of doing a rolling restart instead of just sending signals: in my case, I have a Prometheus deployment that uses a volume to persist the time series, and its data is huge. When restarting, it takes about 5 minutes to read the TSDB before becoming available.

But if I send a SIGHUP signal after changing its config in ConfigMap, it only reloads the config and is available without any downtime.

Currently I am using kubectl exec $PROMETHEUS_POD -- /bin/sh -c "/bin/pkill -HUP prometheus" as a workaround, but I think there must be a native Kubernetes approach to do this.
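(To cover every replica, the same workaround can be looped over the pods behind the Deployment; the label selector is illustrative.)

# send SIGHUP to prometheus in every pod matching the selector
for pod in $(kubectl get pods -l app=prometheus -o jsonpath='{.items[*].metadata.name}'); do
  kubectl exec "$pod" -- /bin/sh -c '/bin/pkill -HUP prometheus'
done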

@joemiller commented May 22, 2020

@alinbalutoiu You could use a sidecar container that watches the mounted configmap for changes and sends a signal to prometheus. That's the best/closest to native way I'm aware of.

in the pod spec: shareProcessNamespace: true

A bash script with a while loop around inotifywait:

#!/bin/bash
# requires these rpm packages on fedora/centos flavors:
# - procps-ng
# - inotify-tools
# or, requires these deb packages on debian/ubuntu flavors:
# - procps
# - inotify-tools

set -eu -o pipefail

WATCH_FILE="/configmap-mount/prometheus.config"
INOTIFYWAIT_OPTS="${INOTIFYWAIT_OPTS:-"-e modify -e delete -e delete_self"}"

main() {
  while true; do
    while [[ ! -f "$WATCH_FILE" ]]; do
      echo "Waiting for $WATCH_FILE to appear ..."
      sleep 1
    done

    echo "Waiting for '$WATCH_FILE' to be deleted or modified..."
    # shellcheck disable=SC2086
    if ! inotifywait $INOTIFYWAIT_OPTS "$WATCH_FILE"; then
      echo "WARNING: inotifywait exited with code $?"
    fi
    # small grace period before sending SIGHUP:
    sleep 1

    echo "sending SIGHUP to prometheus"
    if ! pkill -HUP prometheus; then
      echo "WARNING: 'pkill' exited with code $?"
    fi
  done
}
main "$@"

inotifywait will block until any changes to the $WATCH_FILE are detected.

You could also try my go-init-sentinel, which is a pid-1 compliant wrapper that allows for watching 1 or more files and sending signals to child processes when those files change. This does not require a sidecar container. https://github.com/joemiller/go-init-sentinel
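(For completeness, the sidecar wiring that the script above assumes looks roughly like this; the reloader image name is a placeholder, and both containers mount the same ConfigMap at /configmap-mount.)

# rough pod spec for the sidecar approach
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: prometheus-with-reloader
spec:
  shareProcessNamespace: true        # lets the sidecar pkill the prometheus process
  volumes:
    - name: config
      configMap:
        name: prometheus-config
  containers:
    - name: prometheus
      image: prom/prometheus
      volumeMounts:
        - name: config
          mountPath: /configmap-mount
    - name: config-reloader
      image: example/inotify-sighup   # placeholder image that runs the script above
      volumeMounts:
        - name: config
          mountPath: /configmap-mount
EOF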

@haircommander (Contributor) commented Jun 25, 2021

Is @joemiller's workaround sufficient for this? It doesn't seem there's a path forward for a clean API for this behavior, and there seem to be multiple workarounds (sidecars, shared PID namespaces, etc.).

/priority backlog

@zffocussss commented Aug 12, 2021

[quotes @joemiller's sidecar workaround above in full]

That is a useful way for us to control the process lifecycle inside the application container in the same pod.
May I know if there is a common way (maybe a CRD or controller) to manage all the pods from a central container/pod?
