Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiprocessing Manager exceptions create cyclic references #106558

Closed
pteromys opened this issue Jul 9, 2023 · 5 comments · Fixed by #106559
Closed

multiprocessing Manager exceptions create cyclic references #106558

pteromys opened this issue Jul 9, 2023 · 5 comments · Fixed by #106559
Labels
performance Performance or resource usage topic-multiprocessing type-feature A feature request or enhancement

Comments

@pteromys
Copy link
Contributor

pteromys commented Jul 9, 2023

Bug report

multiprocessing.managers uses convert_to_error(kind, result) to make a raisable exception out of result when a call has responded with some sort of error. If kind == "#ERROR", then result is already an exception and the caller raises it directly—but because result was created in a frame at or under the caller, this creates a reference cycle resultresult.__traceback__ → (some frame).f_locals['result'].

In particular, every time I've used a manager queue I've expected frequent occurrences of queue.Empty, and the buildup of reference cycles sporadically wakes up the garbage collector and wrecks my hopes of consistent latency.

I'm including an example script below. PR coming in a moment, so please let me know if I should expand the example into a test and bundle that in. (Please also feel free to tell me if this is a misuse of queue.Empty and I should buzz off.)

Your environment

  • CPython versions tested on: 3.11.3
  • Operating system and architecture: uname -a says Linux delia 6.3.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 11 May 2023 16:40:42 +0000 x86_64 GNU/Linux

Minimal example

Output

net allocations:  0
got from queue:   [0, 1, 2, 3, 4]
net allocations:  23
garbage produced: Counter({<class 'traceback'>: 3, <class 'frame'>: 3, <class 'list'>: 1, <class '_queue.Empty'>: 1})
net allocations:  0

Script

#!/usr/bin/env python

import collections
import gc
import multiprocessing
import queue
import time


def sender(q):
    for i in range(5):
        q.put_nowait(i)

def get_all_available(q):
    result = []
    try:
        while True:
            result.append(q.get_nowait())
    except queue.Empty:
        ...
    return result

def main():
    q = multiprocessing.Manager().Queue()
    p = multiprocessing.Process(target=sender, args=(q,))
    p.start()

    # take control of gc
    gc.disable()
    gc.collect()
    gc.set_debug(gc.DEBUG_SAVEALL)
    time.sleep(0.1)  # just in case the new process took a while to create
    print('net allocations: ', gc.get_count()[0])

    # trigger a queue.Empty
    print('got from queue:  ', get_all_available(q))

    # check for collectable garbage and print it
    print('net allocations: ', gc.get_count()[0])
    gc.collect()
    print('garbage produced:', collections.Counter(type(x) for x in gc.garbage))
    gc.set_debug(0)
    gc.garbage.clear()
    gc.collect()
    print('net allocations: ', gc.get_count()[0])

    # clean up
    p.join()


if __name__ == '__main__':
    main()

Linked PRs

@pteromys
Copy link
Contributor Author

pteromys commented Jul 9, 2023

Shorter example:

#!/usr/bin/env python

import gc
import multiprocessing
import queue
import weakref

def main():
    gc.disable()
    q = multiprocessing.Manager().Queue()
    try:
        q.get_nowait()
    except queue.Empty as e:
        w = weakref.ref(e)
    else:
        raise RuntimeError("queue.Empty not raised when expected")
    print(type(w()))  # queue.Empty if it has circular refs, NoneType if not
    gc.enable()

if __name__ == '__main__':
    main()

@gaogaotiantian
Copy link
Member

I'm not 100% supporting this change for the following reasons:

  1. First of all, this is definitely not a bug - it's not unexpected behavior, it's well-defined and cycle reference is common in Python. So I'll change the tag to feature request.
  2. This is a very generic pattern as long as you save the exception in a local variable. If we accept this, there could be a lot of similar cases and we have to admit the current solution is not "pretty".
    import gc
    def f():
        result = None
        try:
            raise Exception()
        except Exception as e:
            result = e
            return e
    
    f()
    print(gc.collect())
  3. The patch does not fix your problem for good. It elimiates one usage of cycle reference, but as it's an allowed (and even common) pattern in Python, the garbage collector would probably be triggered when your code gets complicated. Not to mention your OS can introduce latency just when it's scheduling the tasks. To me, changing the code just for trying to get a consistent latency seems a bit off (I might be wrong). This patch might fix a very specific case for you, but in general, it would still cause similar issues on most of the programs. (Or maybe Python is not the best choice when you try to achieve very real-time responses?).

@gaogaotiantian gaogaotiantian added type-feature A feature request or enhancement performance Performance or resource usage and removed type-bug An unexpected behavior, bug, or error labels Jul 9, 2023
@pteromys
Copy link
Contributor Author

3 is a fair point, and I've made my peace with the risk of having to give up on eliminating reference cycles once that problem becomes hard. For now, the difficult cases are rare enough that one of my microservices already runs without gc activity past an initial warmup period.

I actually partly agree with (2) also—in general, the tradeoff that I'd be happiest making is this: saving an exception in a local variable is going deep enough that if the library user does it, then we have no responsibility to help them avoid reference cycles.

But if it's library code doing it, there's prior art of us trying just a bit harder—not only that it's documented behavior that except Exception as e: implicitly deletes e from scope on exit, but also all of the following examples of deleting exceptions and tracebacks in except or finally blocks:

  • concurrent/futures/_base.py in _result_or_cancel()
  • logging/__init__.py in Handler.handleError()
  • _strptime.py in _strptime() (search for del err)
  • threading.py in _make_invoke_excepthook()
  • xml/etree/ElementTree.py in XMLParser.close()

The linked PR is similarly low-hanging fruit—I would hope we can add two del statements without risking a slippery slope of more aggressive and meaning-obfuscating interventions!

@gaogaotiantian
Copy link
Member

Hmm, if we have done this in our code base before then it's a different story. That means the core devs have already determined that a little bit lost of readability is worth it to break the reference cycle. In that case, I would ask @iritkatriel for the PR review as she's the expert in Exceptions (and I think she was the reviewer for a similar PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage topic-multiprocessing type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
@pteromys @gaogaotiantian @sunmy2019 and others