Evented Process Monitor #11529

Closed
crosbymichael opened this Issue Mar 20, 2015 · 42 comments

@crosbymichael (Contributor) commented Mar 20, 2015

TLDR:

Pros:

  • Evented, lock-free code
  • 1 goroutine for all containers
  • Generic zombie fighting abilities for free

Cons:

  • none

Currently Docker uses exec.Cmd and cmd.Wait() inside a goroutine to block on a container's process until it finishes. After a container dies, one of two things happens. One, the container's process is restarted, depending on whether the user requested that the container always restart. Two, we tear down the container, update its status with the exit code, and release any resources it held.

Writing a process monitor like this is not very efficient, and Docker is unable to handle SIGCHLD events to reap zombies that were not direct children of the daemon's process.

It also means that with one goroutine per container, 1000 containers means 1000 blocking goroutines in the daemon. Booooo.

We can do better. The proper way is to move to an evented system for monitoring when child processes change state. This can be handled via SIGCHLD. A process can set up a signal handler for SIGCHLD events, and when the status of a child process changes this signal is sent to the handler. We can use this to extract the pid and exit status, and make decisions on how to handle the event.

Using an evented system like this, we can reduce the number of goroutines to 1 for N containers, and also reduce the number of locks required to handle the previous level of concurrency. Running one container will require 1 goroutine; running 10k containers will require 1 goroutine. Win. This model also allows us to reap zombies in the daemon process that are not direct children (i.e. non-container PID 1's), because zombies are bad, m'kay.

Sample code of what the process monitor would look like follows:

package main

import (
    "os"
    "os/exec"
    "os/signal"
    "sync"
    "syscall"
    "time"

    "github.com/Sirupsen/logrus"
)

var pidsICareAbout map[int]string

func handleEvents(signals chan os.Signal, group *sync.WaitGroup) {
    defer group.Done()
    for sig := range signals {
        switch sig {
        case syscall.SIGTERM, syscall.SIGINT:
            // return cuz user said so
            return
        case syscall.SIGCHLD:
            var (
                status syscall.WaitStatus
                usage  syscall.Rusage
            )
            pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, &usage)
            if err != nil {
                logrus.Error(err)
            }
            logrus.WithField("pid", pid).Info("process status changed")
            if _, ok := pidsICareAbout[pid]; ok {
                logrus.Infof("i care about %d", pid)
                // we delete from the map here without a lock on the theory
                // that only this single signal-handling goroutine touches it;
                // note that runSomething also writes to it from main, so a
                // real implementation would need to synchronize those writes.
                delete(pidsICareAbout, pid)
                if len(pidsICareAbout) == 0 {
                    // return after everything is dead
                    return
                }
            } else {
                logrus.Infof("---> i don't care about %d", pid)
            }
        }
    }
}

func runSomething() error {
    // just add some delay for demo
    time.Sleep(1 * time.Second)
    cmd := exec.Command("sh", "-c", "sleep 5")
    // cmd.Start() is non blocking
    if err := cmd.Start(); err != nil {
        return err
    }
    // get the pid because I care about this one.
    logrus.WithField("pid", cmd.Process.Pid).Info("spawned new process")
    pidsICareAbout[cmd.Process.Pid] = "sleep 5"
    return nil
}

func randomFork() error {
    syscall.ForkLock.Lock()
    pid, _, err := syscall.RawSyscall(syscall.SYS_FORK, 0, 0, 0)
    syscall.ForkLock.Unlock()
    if err != 0 {
        return err
    }
    if pid == 0 {
        logrus.Info("i'm on a boat")
        os.Exit(0)
    } else {
        logrus.Infof("forked off %d", pid)
    }
    return nil
}

func main() {
    signals := make(chan os.Signal, 1024)
    signal.Notify(signals, syscall.SIGCHLD, syscall.SIGTERM, syscall.SIGINT)
    pidsICareAbout = make(map[int]string)
    group := &sync.WaitGroup{}
    group.Add(1)
    go handleEvents(signals, group)
    for i := 0; i < 5; i++ {
        if err := runSomething(); err != nil {
            logrus.Fatal(err)
        }
    }
    // fork off a random process that we don't care about but make sure the
    // signal handler reaps it when it dies.
    if err := randomFork(); err != nil {
        logrus.Error(err)
    }
    logrus.Info("waiting on processes to finish")
    group.Wait()
    logrus.Info("all processes are done, exiting...")
}

We should build this in a generic way because we want this monitor with restart capabilities to be available to any type of process the daemon can spawn.

@crosbymichael (Contributor) commented Mar 20, 2015

@mrunalp (Contributor) commented Mar 20, 2015

Multiple child processes dying quickly could generate a single SIGCHLD, so you would have to call syscall.Wait4 in a loop until you get -1, unless golang is changing signal semantics somehow.

@crosbymichael (Contributor) commented Mar 20, 2015

@mrunalp no, you are correct.

@cpuguy83 (Contributor) commented Mar 21, 2015

I do declare, that's awesome.

@randywallace commented Mar 21, 2015

+1

@ibuildthecloud (Contributor) commented Mar 21, 2015

@crosbymichael I was thinking of a design where this would be a drop-in replacement for exec.Command. Since this impacts all code that forks processes, quite a bit will have to change. Additionally, no new code should ever use exec.Command in Docker. By having a compatible API we could easily switch, then essentially blacklist exec.Command and scan the code for references to it in CI.

Obviously this doesn't buy any benefits in terms of reducing goroutines. The new Command.Wait() will just communicate over a channel to the central signal handling loop. To address the goroutines, I'm thinking we add Command.OnExit(func(int)) and Command.StartFor(handler). OnExit() will allow one to register a callback for that specific process. StartFor(handler) would start a process and then, on exit, notify a common handler that could be registered through some API like ProcessManager.AddHandler(name string, handler ProcessHandler). I imagine libcontainer would have one generic handler for all containers.

@pwaller (Contributor) commented Mar 21, 2015

I'd just like to raise the canary of caution 🐦 with respect to introducing a non-idiomatic approach to such a common task. Callbacks seem quite a strange suggestion in Go. But then, in a way, avoiding a few thousand goroutines also seems a bit strange. It's usually just said, "start a goroutine, they're cheap!". Have large numbers of goroutines proven themselves to be a problem?

@ibuildthecloud (Contributor) commented Mar 21, 2015

The approach I described still allows for an idiomatic Go Wait(), but it will have to live in a custom package, not exec. The number of goroutines isn't a huge issue; the root issue has to do with the calling style. With the current approach it's really hard to cleanly manage zombies. It's a really common pattern in supervisor-style code to have an event-driven model for handling processes. Because of some things in the guts of golang, we can't combine an event-driven model with other goroutines calling Command.Wait().

@sirupsen (Contributor) commented Mar 22, 2015

Generic zombie fighting abilities for free

This is a big deal. It's one of the things that is hurting people in production. I don't think it should be a requirement for a large-scale Docker deployment to have to care about init.

A lot of people currently don't write custom init processes for their containers, because they don't realize what responsibilities an init actually has (or that they even need one). It's fine if you're always running a single process inside your container, but most applications fork+exec at some point (e.g. to shell out) and don't handle the case where the parent of such a process dies before the child does. In the past our init didn't reap children, which means you can't terminate the container completely since the zombie sticks around in the kernel data structure. This can lead to incredibly obscure bugs:

The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes.

Meaning: if someone is running a Docker deployment with high container churn (which is common), e.g. an application with a master-worker model where the workers get killed if they take too long, and those workers are wait()ing on e.g. a shelled-out command, that'll leak. Eventually the process table will fill up—I'm sure this is already happening in production deployments and people have no clue what's going on.

I don't think it should be required for Docker users, even in very large deployments, to have this much of an understanding of the PID namespace (it's documented on the man page) and their container layouts. Docker should definitely be reaping children to avoid this problem.

Concerns

What happens when you have multiple processes trying to wait4(2) on the same process? The custom init people have inside their containers may be updating state (internally or externally) when a process quits. AFAIK both processes receive a SIGCHLD and get the process returned by wait(2), but that's something to write a test for.

Another question I'd have is whether the kernel assumes anything else about init. As far as I know it doesn't, but kernel developers develop with the reaper living inside the PID namespace in mind, as it's not common for processes to wait(2) across immediate parent/child boundaries.

Observations from inits in the wild

I can't think of any other responsibilities an init should have that can't be covered by the planned plugin/extension API. This means people would not have to care about init processes anymore. Most custom inits I've seen in the wild do things such as:

  • Secret decryption. This should eventually be solved by Docker, possibly with a certificate-based model or through a local daemon where containers request secrets.
  • Refreshing TTLs for service discovery. This can be solved with the API and a sidekick container.
  • Ensuring the ENV is right. Many deployments rely on some environment variables being set for containers in their environment to work (and have wrapper scripts around docker run to make it convenient). This could be solved in Docker; however, I think in most cases this is rather hacky and not something that should be encouraged.
  • Executing from a Procfile. I've seen people in the wild, including us, who have a key-value file inside their application code (in JSON, YAML, or whatever) and the init translates the key (e.g. web) to the command. However, I don't think it's a big deal to get rid of it.
  • Proxying signals. It needs to proxy signals down to the child. This is no longer necessary if the init does not need to be custom, as PID 1 will be the main process of the container.

Mostly I think all this functionality is dangerous to encourage, because if any of it blocks (particularly easy in service discovery code) the container will not shut down. Similarly, if it crashes, the entire container crashes. Docker should not encourage people to write custom inits.

The reason why this is currently happening is that larger deployments need an init anyway, with a little bit of custom functionality—it becomes a slippery slope to just shove more in there. Most of this could be implemented by using the Docker API, which IMO should be the encouraged way since it's much easier. It's currently somewhat painful, because it does require running a sidekick process on every host to listen to the API and report (error reporting, service discovery, ...).

To sum up: let's encourage people to not write custom inits. It's dangerous. This is a major step in that direction.

@sirupsen (Contributor) commented Mar 22, 2015

I can open a separate issue on this matter since it's somewhat orthogonal to this discussion.

@crosbymichael (Contributor) commented Mar 25, 2015

The missing piece from my description is to set the subreaper option on the docker daemon to support this. The full docs can be found in the prctl(2) man page, but here is a quick description.

PR_SET_CHILD_SUBREAPER (since Linux 3.4)
              If arg2 is nonzero, set the "child subreaper" attribute of the
              calling process; if arg2 is zero, unset the attribute.  When a
              process is marked as a child subreaper, all of the children
              that it creates, and their descendants, will be marked as
              having a subreaper.  In effect, a subreaper fulfills the role
              of init(1) for its descendant processes.  Upon termination of
              a process that is orphaned (i.e., its immediate parent has
              already terminated) and marked as having a subreaper, the
              nearest still living ancestor subreaper will receive a SIGCHLD
              signal and be able to wait(2) on the process to discover its
              termination status.
@sirupsen (Contributor) commented Mar 25, 2015

Had a good discussion with @crosbymichael about this on IRC; I agree that setting PR_SET_CHILD_SUBREAPER does exactly what I want. This needs to come with a docs change clarifying Docker and init processes, as well as recommended patterns.

@mrunalp (Contributor) commented Mar 25, 2015

Yeah, it won't be much of a reaper without it, unless running as pid 1.

@crosbymichael (Contributor) commented Mar 27, 2015

@ibuildthecloud I haven't thought about it much, but maybe we could model this after http.Handler, with a router/muxer in the middle for handling exits.

Don't stress over the first implementation; we can get something working and then iterate on the API.

@crosbymichael (Contributor) commented Mar 27, 2015

Following up on what @mrunalp said about looping to reap fast-exiting children, I experimented and you have to handle ECHILD, like below:

func reapChildren() (exits []exit, err error) {
        for {
                var (
                        ws  syscall.WaitStatus
                        rus syscall.Rusage
                )
                pid, err := syscall.Wait4(-1, &ws, syscall.WNOHANG, &rus)
                if err != nil {
                        if err == syscall.ECHILD {
                                return exits, nil
                        }
                        return nil, err
                }
                if pid <= 0 {
                        // WNOHANG: children remain but none have exited yet
                        return exits, nil
                }
                exits = append(exits, exit{
                        pid:    pid,
                        status: ws.ExitStatus(),
                })
        }
}
@mrunalp (Contributor) commented Mar 27, 2015

@ibuildthecloud (Contributor) commented Mar 30, 2015

@crosbymichael Are you working on an implementation of this? If you're not I can start on something.

@crosbymichael (Contributor) commented Apr 7, 2015

No, I'm not working on it.

@jessfraz (Contributor) commented Apr 7, 2015

WANT!!!!

@tiborvass (Collaborator) commented Apr 7, 2015

Spammy +1 :P

@ibuildthecloud (Contributor) commented Apr 9, 2015

No fair, why does @crosbymichael get all the fun PRs :) j/k, looking forward to the implementation.

@sirupsen (Contributor) commented Apr 9, 2015

@ibuildthecloud I think he meant he's not working on it :)

@ibuildthecloud (Contributor) commented Apr 9, 2015

@sirupsen lol, man I suck at reading. I'll start working on this.

@sirupsen (Contributor) commented Apr 9, 2015

@tiborvass (Collaborator) commented Jul 22, 2015

Added to 1.9.0 coz #yolo

EDIT: we're still looking for volunteers to help fix this :) /cc @sirupsen @ibuildthecloud

@thaJeztah (Member) commented Jul 22, 2015

@tiborvass should this be added to ROADMAP.md?

@runcom (Member) commented Jul 22, 2015

😸

@tiborvass (Collaborator) commented Jul 22, 2015

@thaJeztah you're late to the game :)

@ibuildthecloud (Contributor) commented Jul 29, 2015

@tiborvass @crosbymichael I was under the impression that runc was going to address this, as it already has the reaping code in it. Is that not the case? Otherwise I'll try to take this on for 1.9.

@crosbymichael (Contributor) commented Jul 31, 2015

@ibuildthecloud some of the reaping issues will be fixed, but using an evented process monitor is a much more lightweight and robust method of monitoring processes.

@jakirkham commented Sep 11, 2015

Now that it exists, would it be possible to just use tini, or are there still some issues that need to be resolved, and, if so, what are they?

jakirkham referenced this issue in jupyter/notebook Oct 8, 2015

Dockerfile: Install `tini` and use it as the entrypoint to handle zombie process reaping. Run notebook with some default arguments if no other command is provided.
@mxplusb commented Oct 9, 2015

@tiborvass still looking for volunteers?

@tiborvass (Collaborator) commented Oct 9, 2015

@mynameismevin yes please :)

@mxplusb commented Oct 9, 2015

I'll work on coming up with something based off the discussion that's been had thus far.

@tiborvass removed this from the 1.9.0 milestone Oct 9, 2015

@edevil commented Dec 18, 2015

Has progress stalled on this issue? I noticed that @mynameismevin's PR was not merged.

Just to be clear, this would solve the current problem of processes having PID 1 in a container and having to reap grand-children, right?

@thaJeztah (Member) commented Dec 18, 2015

@edevil possibly this will now be implemented through containerd; https://github.com/docker/containerd

@crosbymichael am I right?

@edevil commented Jan 5, 2016

@crosbymichael, is @thaJeztah right? :)

@mxplusb commented Jan 6, 2016

@edevil my PR was really just a summation of this conversation; I did nothing beyond providing some code in a package for others to use in solving this problem. I didn't fix the issue, just helped move it closer to resolution.

Also, I didn't test it when it came to reaping grandchildren, but I did test abandoned child processes, so theoretically it could clean up abandoned grandchildren as long as the child cleaned up the grandchild first. No promises on that actually working, as it wasn't tested.

@icecrime (Contributor) commented Feb 17, 2016

This is directly linked to #20361, and is on our 1.11 roadmap.

@icecrime added this to the 1.11.0 milestone Feb 17, 2016

@tonistiigi referenced this issue Feb 24, 2016: Containerd integration #20662 (merged, 11 of 11 tasks complete)

@jessfraz closed this in #20662 Mar 19, 2016

@ibuildthecloud (Contributor) commented Apr 3, 2016

AFAIK containerd doesn't solve the zombie problem. This issue was opened for evented process monitoring, which I assume has been fixed with containerd, but the sub-discussions about zombie killing have not been addressed. @sirupsen I would probably open a new issue. Can somebody correct me if I'm wrong, because I'm not seeing zombie killing working in 1.11.0-rc3. It's my understanding that process subreapers don't work across a PID namespace boundary. So if your child creates a new PID namespace and you are set as the subreaper of the child, processes will still be reparented to the PID 1 of the child's PID namespace, not to you (the subreaper).
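For reference, marking a daemon as a child subreaper is a single `prctl(2)` call. The sketch below (a hypothetical, Linux-only illustration, not containerd code) sets `PR_SET_CHILD_SUBREAPER`; as the comment above notes, this only catches orphans in the caller's own PID namespace, never orphans inside a nested PID namespace.

```go
package main

import (
	"fmt"
	"syscall"
)

// Values from <linux/prctl.h> (available since Linux 3.4).
const prSetChildSubreaper = 36 // PR_SET_CHILD_SUBREAPER

// setChildSubreaper marks the calling process as a subreaper:
// orphaned descendants in the *same* PID namespace are reparented
// to it instead of to pid 1. Descendants living inside a nested
// PID namespace still reparent to that namespace's own pid 1,
// which is exactly the limitation discussed in this thread.
func setChildSubreaper() error {
	if _, _, errno := syscall.Syscall(syscall.SYS_PRCTL, prSetChildSubreaper, 1, 0); errno != 0 {
		return errno
	}
	return nil
}

func main() {
	if err := setChildSubreaper(); err != nil {
		panic(err)
	}
	fmt.Println("this process is now a child subreaper")
}
```

A subreaper still has to `wait(2)` on the orphans it inherits (for example with a SIGCHLD-driven loop as above), otherwise they linger as zombies under the subreaper instead of under pid 1.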

@cpuguy83 (Contributor) commented Apr 12, 2016

@ibuildthecloud You are correct, sadly :(

@fiunchinho commented May 28, 2016

So using an init system (like tini) is still the way to go to solve the zombie processes, right?
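For anyone landing here with the same question: running a minimal init as PID 1 inside the container is the common workaround, since that PID 1 reaps any orphans within the container's own PID namespace. A sketch of a Dockerfile using tini (the base image, tini version, and `/your/program` placeholder are illustrative; check tini's releases page for a current version):

```dockerfile
FROM ubuntu:16.04

# tini becomes PID 1 in the container: it forwards signals to the
# child command and reaps zombie processes.
# (Version is illustrative; see https://github.com/krallin/tini/releases)
ADD https://github.com/krallin/tini/releases/download/v0.19.0/tini /tini
RUN chmod +x /tini

ENTRYPOINT ["/tini", "--"]
CMD ["/your/program"]
```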
