There are three important bits here:
signed types in ei++.h make date calculations actually work when the result is negative.
check_children reuses stop_child to end processes that have gone on beyond their deadline.
the internal kill function ensures that kill(-1, ...) is never called; that call would signal every process we have permission to signal, which is a fairly bad thing to have happen.
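To illustrate the kill(-1) guard, here is a minimal sketch of the idea. The name `safe_kill` and the choice of `EINVAL` are assumptions made for this example, not the code in the branch; the only thing taken from the description above is that a pid of -1 must never reach kill(2).

```cpp
#include <cerrno>
#include <csignal>
#include <sys/types.h>

// Hypothetical wrapper (name and error code are assumptions for this sketch).
// kill(-1, sig) signals every process the caller is allowed to signal,
// so refuse that pid outright instead of forwarding it to kill(2).
int safe_kill(pid_t pid, int sig) {
    if (pid == -1) {
        errno = EINVAL;   // report the refusal the same way kill(2) reports errors
        return -1;
    }
    return kill(pid, sig);
}
```

Callers would use this in place of a direct kill() call, so a pid that was never filled in (and still holds -1) fails loudly instead of signalling everything.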
Use signed values since these can be used for negative values as well.
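A small sketch of why signedness matters for deadline arithmetic. These helpers are hypothetical and are not the types in ei++.h; they only show what happens when "deadline minus now" is computed after the deadline has already passed.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.

// Signed: a deadline in the past gives a negative remainder.
int64_t remaining_ms(int64_t deadline_ms, int64_t now_ms) {
    return deadline_ms - now_ms;
}

// A deadline has passed once the signed remainder is zero or negative.
bool deadline_passed(int64_t deadline_ms, int64_t now_ms) {
    return remaining_ms(deadline_ms, now_ms) <= 0;
}

// Unsigned: the same subtraction wraps around to a huge positive value,
// so a "remaining <= 0" style check never sees an overdue deadline.
uint64_t remaining_ms_unsigned(uint64_t deadline_ms, uint64_t now_ms) {
    return deadline_ms - now_ms;
}
```

With the unsigned version an overdue child looks like it still has a very long time left, which is the kind of failure the signed types avoid.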
Use our own kill function in order to avoid kill(-1), and rework killing in check_children.
check_children now uses stop_child to kill child
processes, reusing the same logic used elsewhere.
Increase Shutdown timeout in exec app definition.
Hrm... pull request has improved things, but not all is right. When I do application:stop(exec) it does not clean up properly.
Thinking out loud: terminate() should wait on a message from the c++ app...
This branch has a few more fixes that seem to stop things cleanly when we do application:stop(exec). One bit that I was unsure of is where I just put a sleep(1) - maybe that's not the right way of doing things, but it seems to work for me.
Here's the motivation behind a lot of this work:
I'm managing some external processes, and init:stop gets called, which in turn stops exec, but some of the applications that get stopped catch SIGTERM and take a few seconds to shut down (or have issues and require SIGKILL). When those processes are shutting down, they'd still like to log stuff which they do via Erlang (they're really open_ports that are managed by exec). So it'd be nice if Erlang and the logger application would wait until exec has finished shutting down, which this code seems to accomplish.
I will be testing it more on Monday...
This pull request looks good except for the fprintf() calls, which are terminated by "\n" rather than "\r\n". The latter is needed so that when the port program prints debug info it shows up properly in the console window (otherwise new lines are continued without a carriage return on some platforms).
Also, regarding the application_stop_fix branch: the unconditional use of sleep(1) is a problem when there is a stuck process that can't be killed for some reason. Say you called manage(OsPid) on a pid that is owned by a different user and can't be killed. In that case the port program would get stuck. The prior solution relied on a deadline given to exec to terminate. I believe the original approach is better. Perhaps the default deadline needs to be increased?
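As a rough sketch of the deadline idea (not the port program's actual code), a bounded wait could poll with waitpid(WNOHANG) and give up once the deadline passes. The function name `wait_with_deadline` and the 10 ms poll interval are assumptions for this example.

```cpp
#include <chrono>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: wait for `pid` to exit, but never longer than `timeout`.
// Returns true if the child was reaped, false if the deadline passed or
// waitpid() failed (for example ECHILD when the pid is not our child).
bool wait_with_deadline(pid_t pid, std::chrono::milliseconds timeout) {
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + timeout;
    for (;;) {
        int status = 0;
        pid_t r = waitpid(pid, &status, WNOHANG);   // non-blocking check
        if (r == pid) return true;                  // exited and reaped
        if (r < 0) return false;                    // cannot wait on this pid
        if (clock::now() >= deadline) return false; // still running: give up
        usleep(10 * 1000);                          // poll every 10 ms
    }
}
```

Note that a pid adopted through manage(OsPid) is not a child of the port program, so waitpid() fails on it right away; the helper then returns false immediately rather than blocking, and the caller would need another liveness check (such as kill(pid, 0)) for that case.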
Merge branch 'master' into stop_child_fix
Use \r\n instead of \n for fprintf.
Ok, this one is fixed up. I will open a separate pull request for the other branch so we can hash out the details there.
I didn't realize that the cmd decode was fetching the command name as well. Fixed.
Merge branch 'manage' into stop_child_fix
Thanks for your contribution!