Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cvise doesn't reliably kill things upon timeout. #145

Open
vext01 opened this issue Jun 21, 2024 · 1 comment
Open

cvise doesn't reliably kill things upon timeout. #145

vext01 opened this issue Jun 21, 2024 · 1 comment

Comments

@vext01
Copy link

vext01 commented Jun 21, 2024

Hi,

I'm reducing the lua interpreter using cvise. The interestingness oracle compiles and runs the interpreter.

I noticed that cvise would frequently get stuck, and an inspection of the process table shows that many lua processes were running well beyond the allocated timeout. It appears that you need to be more forceful than SIGTERM to reliably kill lua. SIGKILL seems to be sufficient (but see the caveat below).

So I wonder if it would be nice to add the ability to specify the signal that is sent to the process upon timeout (or maybe you could even allow an arbitrary shell command, in case you want to do something that isn't sending a signal?).

For now I've hacked the cvise source code to have it send a sigkill:

diff --git a/cvise/utils/testing.py b/cvise/utils/testing.py
index 2300c3c..15da7fa 100644
--- a/cvise/utils/testing.py
+++ b/cvise/utils/testing.py
@@ -362,7 +362,7 @@ class TestManager:
                 children.append(process)
                 for child in children:
                     try:
-                        child.terminate()
+                        child.kill()
                     except psutil.NoSuchProcess:
                         pass
             except psutil.NoSuchProcess:

A colleague mentioned that he had this same issue and wrapped the oracle in timeout(1) in order to SIGKILL the interpreter.

But here's a hitch, with the above patch applied, cvise gets stuck much, much less frequently, but on occasion still does:

image

For example, above the timeout is 60 seconds, yet a lua process has been running for over an hour. When this happens, so far I've just been manually kill -9ing the PID to allow cvise to continue. The processes have realiably died when I've manually killed them in this way. SIGKILL is unmaskable, so that's what I'd expect.

But I wonder how this can happen. Could there be a bug in cvise's killing logic?

I'm using cvise master branch (commit 16a34b2) with the above patch on Debian/amd64.

@marxin
Copy link
Owner

marxin commented Jun 26, 2024

Hello!

Good to hear from somebody using the tool to reduce a code in Lua!

Yeah, I must confess I've also encountered similar issues in the part. The most reliable approach for me was something like:

cvise -c "timeout 5 gcc -O2 x.c 2>&1 | grep "internal compiler error"

(one can potentially specify a signal for the timeout command). Can you please give it a try if it's a more reliable approach?

Sure, I would be happy to accept such a PR that will add the possibility to specify the signal, but I'm a bit sad one can still face the unpleasant situation where the process is blocking C-Vise :/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants