Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup gevent resources on thread destruction #1601

Closed
woutdenolf opened this issue May 4, 2020 · 2 comments
Closed

Cleanup gevent resources on thread destruction #1601

woutdenolf opened this issue May 4, 2020 · 2 comments
Assignees
Labels
Type: Bug

Comments

@woutdenolf
Copy link

woutdenolf commented May 4, 2020

Two questions:

  1. What do I need to do at thread cleanup to destroy all gevent resources used by that thread?
  2. How do I make sure during cleanup that no hub or other gevent resources are accidentally instantiated again? (answer might be: it's your responsibility, but just asking whether there is a way)

Here is a minimal working example. Destroying the hub does not seem to help:

import gc
import gevent
from greenlet import greenlet
import threading


def allgreenlets():
    return {obj for obj in gc.get_objects() if isinstance(obj, greenlet)}


class MyGreenlet(gevent.Greenlet):
    def _run(self):
        pass


def thread_main():
    g = MyGreenlet()
    g.start()
    g.join()
    hub = gevent.get_hub()
    hub.join()
    hub.destroy(destroy_loop=True)
    # What do I do here to prevent new gevent resources allocation in this thread?

before = allgreenlets()
t = threading.Thread(target=thread_main)
t.start()
t.join()
del t
while gc.collect():
    pass
after = allgreenlets()
notcleaned = after - before
assert not notcleaned, notcleaned
@jamadden
Copy link
Member

jamadden commented May 6, 2020

Thanks for the report!

What do I need to do at thread cleanup to destroy all gevent resources used by that thread?

Deliberately running gevent hubs in multiple native threads, while something that is expected to work, isn't something that is heavily tested. Most usage of gevent that I'm aware of, and that the gevent test cases execute, uses a single native thread to run a hub (CPU processing may be moved to background threads). So this is a little bit "here there be dragons." And you've definitely found a dragon.

Your sample code is basically correct as far as cleaning up all resources go (I would just add del hub, g). There are, however, some hidden references keeping the root greenlet of this thread, along with its hub object alive. These are buried in the way that hub.join() waits for the hub to finish running by catching the LoopExit that gets thrown into the parent greenlet; that produces some cycles that live in C outside of where the GC can get to them.

This image is the object graph this code was originally producing. It doesn't tell the whole story (parts of the story are in C), but it does start to point in the right direction.

objgraph-2haqm887

I think I have a way to make this better, if the backwards compatibility issues work out. Preliminary results are looking good with the following script (here, I use objgraph, which is based on object IDs, not keeping actual objects around, and am extra careful with variable scoping—module scope is a tricky, annoying thing). (If I add a similar loop to the original script, it usually still reports one extra greenlet, but that's constant no matter how many iterations happen, so no actual growth is happening.)

import gc
import gevent
import threading
import objgraph
import io

# Collect original object counts and discard the output
objgraph.show_growth(file=io.StringIO() if str is not bytes else io.BytesIO())

def thread_main():
    g = gevent.Greenlet(run=lambda: 0)
    g.start()
    g.join()
    hub = gevent.get_hub()
    hub.join()
    hub.destroy(destroy_loop=True)
    del hub

def tester():
    t = threading.Thread(target=thread_main)
    t.start()
    t.join()

    while gc.collect():
        pass
    # Show anything that has still leaked
    objgraph.show_growth(limit=15)

for _ in range(50):
    tester()

With the changes I'm working on, there's no output. Previously there would be growth on every iteration:

dict                           1008        +3
weakref                         625        +2
frame                            11        +1
list                            385        +1
greenlet                          5        +1
builtin_function_or_method      763        +1
instancemethod                   30        +1
Hub                               4        +1
...
dict                           1011        +3
weakref                         627        +2
frame                            12        +1
list                            386        +1
greenlet                          6        +1
builtin_function_or_method      764        +1
instancemethod                   31        +1
Hub                               5        +1

How do I make sure during cleanup that no hub or other gevent resources are accidentally instantiated again? (answer might be: it's your responsibility, but just asking whether there is a way)

Like the doctor said when the patient said "it hurts when I do this": "Well then don't do that." 😄 There's no official way to enforce something like that. There are certainly ways to crash the process accidentally if you try too hard; if you're going to destroy the hub+loop, it's best to do so as the last action of the thread.

@jamadden jamadden added the Type: Bug label May 6, 2020
@jamadden jamadden self-assigned this May 6, 2020
jamadden added a commit that referenced this issue May 6, 2020
Addressing #1601.

This solution works, but breaks anything that tries to use the hub again after joining it once, because the hub greenlet dies. So that can't be right.
@jamadden
Copy link
Member

jamadden commented May 11, 2020

Closed in #1604

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug
Projects
None yet
Development

No branches or pull requests

2 participants