Benchmarker does not properly terminate child processes #927
Comments
msmith-techempower
Jul 15, 2014
Member
Agreed... in fact we have had this trouble with a few frameworks and done some EXTREMELY silly things in stop() to try to ensure that their processes (and child processes) die.
I am up for suggestions on this one.
hamiltont
Jul 16, 2014
Contributor
@methane I was thinking I could attempt to include this solution in #913. However, AFAIK it won't work on Windows, so it's only a partial fix even if it works.
Also, it's a PITA to fix this in such a way that it would immediately work for all the setup.py files. We would likely need to change __run_tests to create a subprocess instead of a thread, and then we could use a process group to automatically clean up any process the setup.py files created. The other option is just to add this to #913 and modify the setup.py files one by one to use whatever new method can automatically clean up the processes.
methane
Jul 16, 2014
Contributor
> We would likely need to change __run_tests to create a subprocess instead of a thread,
FYI, you already use a process instead of a thread:
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/toolset/benchmark/benchmarker.py#L492
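As methane points out, the runner already uses multiprocessing. A minimal sketch (the run_test stand-in is mine, not the actual benchmarker code) of how a bounded-time run with multiprocessing.Process works, which also shows the limitation at the heart of this issue: terminate() signals only the worker, not anything the worker itself spawned.

```python
import multiprocessing
import time

def run_test():
    # Stand-in for a long-running test; in the real toolset this would
    # drive a framework's setup.py start/stop logic.
    time.sleep(60)

p = multiprocessing.Process(target=run_test)
p.start()
p.join(timeout=1)    # bound how long the test may run
if p.is_alive():
    p.terminate()    # SIGTERM the worker only -- its children would survive
    p.join()
```

After join(), exitcode is the negative signal number if the worker was terminated.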
hamiltont
Jul 16, 2014
Contributor
Good point, for some reason I thought multiprocessing was thread-based. There may be hope for a generic fix.
methane
Jul 16, 2014
Contributor
On Linux, calling setsid() at the top of __run_tests may reduce the problem.
On Windows, using TASKKILL /T may resolve it.
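A minimal sketch of both suggestions (the helper names are mine, not from the toolset): on POSIX, starting the child with setsid() makes it a session and process-group leader so the whole tree can be signalled with one killpg(); on Windows, TASKKILL /T walks the process tree by PID.

```python
import os
import signal
import subprocess
import sys

def start_in_new_group(cmd):
    """Start cmd in its own group so the whole tree can be signalled at once."""
    if os.name == "nt":
        # Windows: a new process group lets TASKKILL /T reach descendants.
        return subprocess.Popen(cmd, creationflags=subprocess.CREATE_NEW_PROCESS_GROUP)
    # POSIX: setsid() makes the child a session/process-group leader.
    return subprocess.Popen(cmd, preexec_fn=os.setsid)

def kill_group(proc):
    """Terminate proc and every descendant still in its group."""
    if os.name == "nt":
        # /T kills the tree, /F forces termination.
        subprocess.call(["TASKKILL", "/F", "/T", "/PID", str(proc.pid)])
    else:
        os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    proc.wait()
```

This still will not catch processes that later call setsid() themselves and leave the group, which matters later in this thread.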
methane
Jul 16, 2014
Contributor
But I don't have confidence, because I'm not an expert in process management.
added a commit to hamiltont/FrameworkBenchmarks that referenced this issue on Jul 19, 2014
hamiltont
Jul 19, 2014
Contributor
I've fixed this (as well as possible) for Linux with the above commit. It's not based on your master; I'll have to do that later.
Naturally, this code is just building a guarantee of termination above and beyond what is provided by a good stop function in each setup.py, so a number of the changes are just ensuring that stop is called correctly regardless of how the exit happens, and ensuring that stop is only called one time.
The guarantee provided beyond stop is pretty good: with one exception, any processes created as a side effect of __run_test are guaranteed to be destroyed before __run_test is called again. The exception is processes that intentionally daemonize themselves by changing their session identifier. This is fairly odd for TFB; it normally means something was installed into the local system via a mechanism such as apt-get, and it is being started from a system directory (/usr, /etc/init.d, etc.) with sudo permissions. There is just no non-hackish way I can find to keep track of those processes and clean them up, so for this scenario we have to rely on people writing a good stop() function. My main test case here has been aspnet-mono, which internally runs nginx (an exception to the above guarantee) and mono (not an exception) with around 20 open ports and 50-ish processes; this commit appears to be stopping all the mono and nginx processes every single time.
I don't have a Windows box, so I can't help there. There was a comment saying "These features don't work on windows" that was confusing: multiprocessing works just fine on Windows, IIRC.
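The cleanup described above can be sketched roughly as follows (a simplified illustration; terminate_tree is my name, not the commit's): a polite SIGTERM to the test's whole process group, escalating to SIGKILL if it survives. Processes that re-daemonize by changing their session identifier have left the group and are exactly the ones this cannot reach.

```python
import os
import signal
import subprocess
import sys

def terminate_tree(proc, timeout=5.0):
    """Terminate proc and every process still in its group.

    proc must have been started as a group leader (preexec_fn=os.setsid).
    Processes that later called setsid() themselves have left the group
    and will NOT be reached -- the daemonization exception described above.
    Linux/POSIX only.
    """
    pgid = os.getpgid(proc.pid)
    try:
        os.killpg(pgid, signal.SIGTERM)   # polite request to the whole group
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)   # group ignored SIGTERM; force it
        proc.wait()

# A runner would start each setup.py in its own group, e.g.:
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    preexec_fn=os.setsid,
)
terminate_tree(proc)
```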
added a commit to hamiltont/FrameworkBenchmarks that referenced this issue on Jul 19, 2014
msmith-techempower
Jul 21, 2014
Member
The real test, from my experience, is whether Plack will be terminated correctly on an EC2 MLarge instance (not enough machine for Plack, haha). We regularly had the situation you were mentioning where it would run nginx from some root-perm folder, spawn 8 threads, and nothing called from stop() seemed a bad enough dude to kill it (ultimately, sudo killall plackup was the 'nuke from orbit' fix just for EC2).
aschneider-techempower added this to the Round 11 milestone on Jul 21, 2014
aschneider-techempower added the postpone label on Jul 21, 2014
aschneider-techempower modified the milestone: Round 11 on Jul 21, 2014
resolved in master!
hamiltont commented Jul 15, 2014
If I understand correctly, Benchmarker calls __run_test in a thread specifically to avoid long-running tests. If a test takes too long, Benchmarker can terminate that thread and resume operation.
Any thread termination should also ensure that any child processes created are stopped. This seems to be tricky to do correctly, and I've seen a number of times where a test is not properly terminated and the underlying servers are still active and bound to ports. I suppose an ideal solution would be to politely call test.stop(), but also to create a process group and SIGTERM everything in the group, so that any child processes are definitely stopped.