
Celery instances leak #1949

Closed
graffic opened this Issue Apr 2, 2014 · 2 comments

@graffic

graffic commented Apr 2, 2014

While running my application tests I was creating a huge number of Celery instances, each used to send a single message to a task without waiting for the result. Watching the RabbitMQ connections I noticed they weren't being freed, and I traced the problem to two leaks of Celery instance references.

multiprocessing.util from billiard

When the pool attribute of the Celery instance is accessed, the instance is registered with the multiprocessing module: https://github.com/celery/celery/blob/master/celery/app/base.py#L593

In this graph from objgraph you can see a dict with 139 Celery instances.

[objgraph graph: multiproc]
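
For illustration, here is a minimal sketch (not from the report) of why this keeps apps alive. It uses the stdlib multiprocessing.util, which billiard mirrors: register_after_fork stores a strong reference to the object in a module-level registry, so the instance can never be garbage collected:

import gc
import weakref
from multiprocessing import util  # billiard's multiprocessing.util behaves the same


class App(object):
    """Stand-in for a Celery app instance."""


app = App()
ref = weakref.ref(app)

# Roughly what happens when app.pool is first accessed: the instance is
# stored (strongly) in the module-level after-fork registry.
util.register_after_fork(app, lambda obj: None)

del app
gc.collect()
print(ref() is None)  # False: the after-fork registry still holds the instance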

The memoize method in celery.utils.functional

https://github.com/celery/celery/blob/master/celery/utils/functional.py#L124

The dictionary doesn't store weak references, but full references to the loader. The loader is also referenced by the Celery instance, and this keeps at least 100 Celery instances alive.

This graph shows a dict with 100 loader instances.

[objgraph graph: backend]
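
As a rough illustration (a toy stand-in, not Celery's actual memoize implementation), a cache keyed on the call arguments holds a strong reference to the loader passed in, so the loader, and whatever it references, can never be collected:

import gc
import weakref


def memoize(fun):
    # Toy stand-in for celery.utils.functional.memoize:
    # a plain dict keeping strong references to arguments and results.
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = fun(*args)
        return cache[args]
    return wrapper


class Loader(object):
    """Stand-in for a Celery loader instance."""


@memoize
def get_backend_cls(loader):
    return object


loader = Loader()
ref = weakref.ref(loader)
get_backend_cls(loader)

del loader
gc.collect()
print(ref() is None)  # False: the memoize cache still references the loader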

Other checks

I've removed these two leaks and the Celery instances were collected without problems. But I'm pretty sure that they were there for a reason.

@ask

ask (Member) commented Apr 2, 2014

Thanks! (and especially for such an illustrative bug report, which is a joy)

The cache of get_backend_cls predates the app instances, and it was probably never questioned later. I don't think there is any good reason for these to be cached; the lookup happens rarely anyway, so it's unlikely to have any effect on performance.

The second one is tricky. I guess App.close() would have to remove the app from the after-fork registry, but there is no API to do this, and the after-fork registry is very private and does not support fast removal.

Snippet from Python's multiprocessing.util.register_after_fork:

def register_after_fork(obj, func):
    _afterfork_registry[(next(_afterfork_counter), id(obj), func)] = obj

I guess we would need some way to keep weakrefs in the after fork registry instead,
something like:

import weakref
import multiprocessing.util


class _after_fork_registry(object):

    def __init__(self):
        # weak references only, so apps can still be garbage collected
        self.apps = weakref.WeakSet()
        self._registered = False

    def add(self, app):
        if not self._registered:
            # register this singleton once, instead of every app instance
            multiprocessing.util.register_after_fork(self, self)
        self._registered = True
        self.apps.add(app)

    def __call__(self, *args):
        # register_after_fork calls func(obj); both are this registry
        for app in self.apps:
            app._after_fork()
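
For context, a hypothetical way such a registry could be wired in (the names below are assumptions for illustration, not taken from the eventual fix):

# hypothetical module-level singleton; only this object is ever registered
# with multiprocessing/billiard, so apps themselves stay weakly referenced
_after_fork = _after_fork_registry()


class Celery(object):
    def _maybe_register_after_fork(self):
        _after_fork.add(self)  # weak reference only; the app can still be collected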

ask added a commit that referenced this issue Apr 2, 2014

ask closed this in 9384218 Apr 2, 2014

@graffic

graffic commented Apr 2, 2014

Thank you people for your great work :) Awesome!
