Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

killall does not kill greenlets before they were started #1473

Closed
kochelmonster opened this issue Oct 30, 2019 · 2 comments · Fixed by #1499
Closed

killall does not kill greenlets before they were started #1473

kochelmonster opened this issue Oct 30, 2019 · 2 comments · Fixed by #1499

Comments

@kochelmonster
Copy link

@kochelmonster kochelmonster commented Oct 30, 2019

  • gevent version: 1.4.0
  • Python version: "cPython 3.7.9 downloaded from python.org"

Description:

Run the following example code:

import gevent

def test():
    print("test called")

print("try kill")
g = gevent.spawn(test)
g.kill()
gevent.sleep(0.1)
print()
print("try killall")
g = gevent.spawn(test)
gevent.killall([g])
gevent.sleep(0.1)

The output is:

try kill

try killall
test called

But test called may never be printed.

@jamadden

This comment has been minimized.

Copy link
Member

@jamadden jamadden commented Oct 31, 2019

What's happening is that spawn and killall are both asynchronous. They don't actually do anything until the next iteration of the event loop. The killall function takes a block parameter that defaults to true; when it is true, killall appears to be synchronous, but it actually hides yielding to the event loop inside it. (Meaning that gevent.sleep(0.1) isn't necessary to see the described behaviour.) Because the event loop process things in order, first the greenlet is spawned, then the callback that killall registered gets to run, by which time it's too late.

Greenlet.kill is itself asynchronous in exactly the same way, but it has a special synchronous case just to handle kill-before-start. That was added way back in 1.1a2. It was needed because there was no way to do the same thing from outside the class (all the necessary details were private).

I don't think it's necessarily unreasonable to expect that killall would handle unstarted greenlets the same way. However, it's a very subtle change and I worry that if it causes problems it would be hard to track them down: the caller of killall may or may not be anywhere close to the site that created the greenlet, and it's easier to trace the path of a single object than it is the contents of an arbitrary mutable collection.

Fortunately, if that is the desired behaviour, it can be achieved now without any changes in gevent because of the previous change to Greenlet.kill. Simply iterate the greenlets once, killing them without blocking to catch any that aren't started, and then pass it off to killall to deal with the rest:

def my_aggressive_killall(greenlets):
    for g in greenlets:
        g.kill(block=False) # Make sure any that haven't started won't.
    return killall(greenlets)

Alternatively, it's often possible and frequently more clear to work within the application domain. Communicate to the greenlet such that when it does get a chance to run, it realizes it's no longer necessary. This is very often helpful if the greenlet was supposed to take ownership of some resource that needs to be cleaned up, such as a socket. This way it still gets a chance to handle the cleanup. (The same holds true for Greenlet.kill too, of course. These are both low-level primitive operations.)

# (rough untested code, demonstrating the idea)
# The first one to find the answer stops the other ones.
class Worker(Greenlet):
    should_run = True

    def __init__(self, resource, other_workers):
        self.resource = resource
        self.other_workers.append(self)

    def _run(self):
        answer = None
        try:
            if self.should_run:
                answer = search_for_answer(self.resource)
        except HTTPRedirect as e:
             self.spawn(query_at_redirect(e), self.other_workers)
        finally:
            self.resource.close()
            if answer is not None:
                for worker in self.other_workers: 
                    worker.should_run = False
        return answer

    def kill(self, *args, **kwargs):
        # If we're killed before we get a chance to run, make sure to 
        # clean up our resource in a timely fashion. Don't leave it to the whims
        # of the garbage collector.
        self.resource.close()
        Greenlet.kill(self, *args, **kwargs)

sockets = query_a_bunch_of_hosts()
workers = []
for possible_answer in sockets:
    Worker.spawn(possible_answer, workers)

This is also the first time I recall hearing a request for this; I'd be interested to know more about the use case and hear input from others.

@kochelmonster

This comment has been minimized.

Copy link
Author

@kochelmonster kochelmonster commented Oct 31, 2019

Personally I would say: killall and kill should behave equally. At least note the differences in the documentation. In my case the difference was especially fatal otherwise I didn't even noticed it:
The Code is like this:

class TestServers(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.greenlet = {}
        for s in servers:
            greenlets[s] = spawn(start_and_monitor, s)

    @classmethod
    def tearDownClass(cls):
        killall(cls.greenlets.values())

    def test_case1(self):
        ...

    def test_case2(self):
        ...

    def test_caseN(self):
        greenlets[server].kill()
        ...
        greenlet[server] = spawn(start_and_monitor, server)

start_and_monitor starts a process with Popen and terminates the process if the greenlet is killed.
test_caseN was called last. And it seems that the (monitor)-greenlet was killed during the popen process. Leaving back some zombie processes.

jamadden added a commit that referenced this issue Dec 31, 2019
And stop a fresh greenlet from even running.

Fixes #1473.
jamadden added a commit that referenced this issue Dec 31, 2019
And stop a fresh greenlet from even running.

Fixes #1473.

Free user-provided resources as soon as the greenlet is cancelled, if it's not running.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.