(revised caption) When doing async without poll, and with timeout of 0, don't babysit the process, just execute directly #4778

willthames · 2013-11-02T03:19:24Z

See https://groups.google.com/forum/?fromgroups=#!topic/ansible-project/bMuOs5lLg_8 for more background.

Using command or shell to start a synchronous task that kicks off a job in the background under nohup does not succeed (for some reason popen.communicate does not return).

It is not possible to start a long-running task without giving it an incredibly large timeout. In the meantime, there are three supervisory processes on top of the task itself, checking every five seconds whether the timeout has expired.
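To make the supervision described above concrete, here is a minimal Python sketch of a watcher that checks at a fixed interval and kills the task's process group once the time limit expires. It is an illustration only, not the actual async_wrapper code: the function name `babysit` and its parameters are hypothetical, and it assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
import time


def babysit(argv, time_limit, step=5):
    """Run argv, checking every `step` seconds; SIGKILL its group on timeout.

    Hypothetical sketch, not Ansible's implementation.
    Returns the child's return code (negative signal number if killed).
    """
    # start_new_session=True calls setsid() in the child, so the child leads
    # its own process group and killpg() below cannot hit the watcher itself.
    child = subprocess.Popen(argv, start_new_session=True)
    remaining = time_limit
    while child.poll() is None:
        if remaining <= 0:
            # SIGKILL cannot be caught, blocked, or ignored by the task.
            os.killpg(child.pid, signal.SIGKILL)
            child.wait()
            break
        time.sleep(step)
        remaining -= step
    return child.returncode


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(babysit(sleeper, time_limit=1, step=0.5))
```

With a fixed check interval, the task can outlive its time limit by up to one interval, which is why the hardcoded five seconds matters for short timeouts.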

Because async_wrapper kills the entire process group with a SIGKILL (and no reasonable mechanism seems to exist in bash for a process to remove itself from a process group, unlike removing itself from its parent), it's nearly impossible to protect the long-running task from being killed when the timeout expires, even if that's what you'd like to achieve. A SIGHUP would have a similar effect but could be protected against with nohup. Even SIGTERM etc. could be protected against with trap.
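The difference between the signals can be checked directly. The following Python sketch (the helper name `can_be_handled` is made up for this example, and it assumes a POSIX system and that it runs in the main thread) asks the operating system whether a process may change the disposition of a given signal. SIGHUP and SIGTERM can be ignored or trapped; SIGKILL cannot.

```python
import signal


def can_be_handled(signum):
    """Return True if this process may ignore or trap the given signal."""
    try:
        previous = signal.signal(signum, signal.SIG_IGN)
    except (OSError, RuntimeError, ValueError):
        # The kernel refuses to change the disposition (as it does for SIGKILL).
        return False
    # Restore the earlier disposition; None means it was not set from Python.
    if previous is not None:
        signal.signal(signum, previous)
    return True


if __name__ == "__main__":
    for name in ("SIGHUP", "SIGTERM", "SIGKILL"):
        print(name, can_be_handled(getattr(signal, name)))
```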

I quite like the asynchronous approach, and believe that changing the signal sent, either by default or through an override, would be a reasonable solution. The hardcoded check every five seconds might also warrant examination, but I'm really not that concerned if I can set a timeout of 5 seconds but protect a process from being killed by the timeout.

mpdehaan · 2013-11-02T03:38:40Z

Original caption was "There is no way to kick off a task that runs forever"

Ok, so this defect is not entirely correct. There is.

"async" with a poll interval of 0 does fire and forget a task; there's no need for nohup. You could set a timeout that was longer than the death of the universe, and it would work.

As I understand it, the above problem was reported when trying to get around the babysitter process that still runs around that task. The babysitter gives "cleaner process output" (not a critical feature, but nice to have), but it is also needed to set a maximum lifetime for the program. So what if a maximum lifetime of 0 implied that the process should run forever, without passing in a crazy large value?

What was discussed in the thread was that it would be nice to just exec the task and not have a helper process in the case that you have given up on the idea that you want to poll it. This should only happen if the poll interval is 0 and the lifetime is set to 0 -- both 0.

There should be no need to change the kill handling -- just don't spawn the babysitter task in the async_wrapper if the user has waived the idea of polling the task, and let it go down a completely different path.

mpdehaan · 2013-11-02T03:43:10Z

Note: updated the comment above to make it clearer what I'm proposing when both the poll interval and the lifetime are 0.

The current way to execute an infinite task is a poll interval of 0 -- fire and forget -- and a lifetime of VERY LARGE NUMBER.

The request is basically to let 0 mean "infinite timeout" and, if the poll interval is also 0, just execute directly without the watcher process -- just the necessary levels of daemonization.
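The rule being proposed can be written down as a small decision function. This is only a sketch of the proposal as stated in this thread, not code from async_wrapper; the function name `launch_mode`, its parameters, and the returned labels are all hypothetical.

```python
def launch_mode(poll_interval, time_limit):
    """Pick how an async task would be started under the proposal.

    Hypothetical sketch: returns "direct" (exec the task with no watcher)
    only when the user has waived both polling and the lifetime limit.
    """
    if poll_interval == 0 and time_limit == 0:
        return "direct"
    # Any other combination keeps the existing watcher process.
    return "supervised"


if __name__ == "__main__":
    for poll, limit in [(0, 0), (0, 3600), (5, 0)]:
        print(poll, limit, launch_mode(poll, limit))
```

Requiring both values to be 0 keeps today's behaviour unchanged for every existing combination of poll interval and lifetime.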

willthames · 2013-11-02T03:45:34Z

Sounds like a sensible approach - and you're right about the original title, it was incorrect.

I'll see if I can come up with a way of implementing this in the way you're proposing.

mpdehaan · 2013-11-02T04:06:49Z

Outstanding, thank you!

romabysen · 2013-11-02T15:39:54Z

Shouldn't a process that is supposed to run forever really be some kind of service though? I just have a hard time seeing this as a common use-case that isn't better solved outside of ansible.

willthames · 2013-11-04T02:45:55Z

I suppose that would be a reasonable alternative (handle the daemonization in the application rather than through ansible). Certainly my investigations of code fixes haven't come up with a means of doing this in any nice way.

bcoca · 2013-11-05T18:34:49Z

Daemonizing over ssh is not normally a good idea. You can use many things (runit, uwsgi, monit, start-stop-daemon, daemontools, supervisorctl, an init/upstart/systemd script, etc.) to do this correctly and then call that from ssh.

willthames · 2013-11-05T22:53:17Z

I don't really disagree - I just wonder if the documentation could be improved (I think it's the term fire-and-forget in the async documentation that led me down the wrong path).

I haven't done much documentation for Ansible, so I guess now's my opportunity to contribute by outlining some of the mechanisms. (In the end I solved my issue by running the task synchronously under setsid, so there's no supervision, but I can live with that.)
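For readers unfamiliar with setsid, the effect can be sketched in Python. This is not the command used in the thread, just an illustration of the same mechanism under the assumption of a POSIX system: `subprocess.Popen` with `start_new_session=True` calls setsid() in the child, which places it in a new session and a new process group, so a signal sent to the launcher's process group no longer reaches it. The helper name `start_detached` is hypothetical.

```python
import os
import subprocess
import sys


def start_detached(argv):
    """Start argv in its own session and process group, with no supervision."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # child calls setsid() before exec
    )


if __name__ == "__main__":
    child = start_detached([sys.executable, "-c", "import time; time.sleep(5)"])
    print("launcher process group:", os.getpgid(0))
    print("child process group:   ", os.getpgid(child.pid))
    child.kill()  # clean up the demo process
    child.wait()
```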

This was referenced Nov 4, 2013

Allow async commands to run forever without watching #4800

Closed

Allow async commands to run forever without watching #4807

Closed

willthames closed this as completed Dec 20, 2013

ryshah mentioned this issue Jun 3, 2016

Add the scaleup/scaledown option to swift-ring blueboxgroup/ursula#1931

Merged

ansibot added the feature label and removed the feature_idea label Mar 2, 2018

ansible locked and limited conversation to collaborators Apr 24, 2019
