Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SystemError in guest mode when cancelling finished scope #1576

Closed
altendky opened this issue Jun 5, 2020 · 7 comments
Closed

SystemError in guest mode when cancelling finished scope #1576

altendky opened this issue Jun 5, 2020 · 7 comments

Comments

@altendky
Copy link
Member

altendky commented Jun 5, 2020

Full code at the end but here's the 'interesting' snippet that causes trouble when run in guest mode. Tests run with Python 3.8.3 in Linux and 08fca2a.

async def mine():
    global cscope
    widget = QtWidgets.QWidget()
    with trio.CancelScope() as cscope:
        app.lastWindowClosed.connect(cscope.cancel)
        widget.show()
    return 1

With just widget.show() I get:

/home/altendky/repos/trio/trio/_core/_run.py:2161: RuntimeWarning: Trio guest run got abandoned without properly finishing... weird stuff might happen
  warnings.warn(

With app.lastWindowClosed.connect(cscope.cancel) and widget.show() I get:

StopIteration: 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/altendky/repos/trio/trio/_core/_ki.py", line 159, in wrapper
    locals()
SystemError: <built-in function locals> returned a result with an error set
/home/altendky/repos/trio/trio/_core/_run.py:2161: RuntimeWarning: Trio guest run got abandoned without properly finishing... weird stuff might happen
  warnings.warn(

I've tried to recreate the error without PyQt by just putting the scope cancellation further out but it results in no issues. I thought maybe PyQt was holding onto a reference to a no longer valid object but global cscope didn't avoid the SystemError. The same SystemError occurs back to the original commit in #1551, at least there and a few spot checks along the way.

I just bothered to follow the why-not-pyside2 link https://bugreports.qt.io/projects/PYSIDE/issues/PYSIDE-1313 and at least the StopIteration and SystemError are similar albeit against a different function call.

This isn't any sort of a blocker, just a thing where you have to be careful to disconnect a signal.

Thanks again for guest mode. Someday I'll do something more than a trivial demo with it. :]

Full example source
import trio
import sys
from outcome import Error
import traceback

# Can't use PySide2 currently because of
# https://bugreports.qt.io/projects/PYSIDE/issues/PYSIDE-1313
from PyQt5 import QtCore, QtWidgets

async def mine():
    global cscope
    widget = QtWidgets.QWidget()
    with trio.CancelScope() as cscope:
        app.lastWindowClosed.connect(cscope.cancel)
        widget.show()
    return 1

app = QtWidgets.QApplication(sys.argv)

# This is substantially faster than using a signal... for some reason Qt
# signal dispatch is really slow (and relies on events underneath anyway, so
# this is strictly less work)
REENTER_EVENT = QtCore.QEvent.Type(QtCore.QEvent.registerEventType())

class ReenterEvent(QtCore.QEvent):
    pass

class Reenter(QtCore.QObject):
    def event(self, event):
        event.fn()
        return False

reenter = Reenter()

def run_sync_soon_threadsafe(fn):
    event = ReenterEvent(REENTER_EVENT)
    event.fn = fn
    app.postEvent(reenter, event)

def done_callback(outcome):
    print(f"Outcome: {outcome}")
    if isinstance(outcome, Error):
        exc = outcome.error
        traceback.print_exception(type(exc), exc, exc.__traceback__)
    app.quit()

trio.lowlevel.start_guest_run(
    mine,
    run_sync_soon_threadsafe=run_sync_soon_threadsafe,
    done_callback=done_callback,
)

app.exec_()
@njsmith
Copy link
Member

njsmith commented Jun 5, 2020

Maybe you need to set quitOnLastWindowClosed to False?

https://doc.qt.io/qt-5/qguiapplication.html#quitOnLastWindowClosed-prop

@altendky
Copy link
Member Author

altendky commented Jun 5, 2020

It's different, but I'm not sure it's better. :]

Just to be clear, I don't see this as an issue for me. It's easy enough to disconnect the signal. But, I generally expect that I shouldn't be able to generate stuff like SystemError when writing pure Python code myself so I report it for completeness. I'm tempted to hold PyQt5 responsible but since the message happened in Trio code and in a feature you just added, I started with a report here.

Outcome: Error(SystemError('<built-in method send of coroutine object at 0x7f28d1fb3340> returned NULL without setting an error'))
StopIteration: 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/altendky/repos/trio/trio/_core/_ki.py", line 159, in wrapper
    locals()[LOCALS_KEY_KI_PROTECTION_ENABLED] = enabled
SystemError: <built-in function locals> returned a result with an error set
SystemError: <built-in method send of coroutine object at 0x7f28d1fb3340> returned NULL without setting an error
Source with `app.setQuitOnLastWindowClosed(False)`
import trio
import sys
from outcome import Error
import traceback

# Can't use PySide2 currently because of
# https://bugreports.qt.io/projects/PYSIDE/issues/PYSIDE-1313
from PyQt5 import QtCore, QtWidgets

async def mine():
    global cscope
    widget = QtWidgets.QWidget()
    with trio.CancelScope() as cscope:
        app.lastWindowClosed.connect(cscope.cancel)
        widget.show()
    return 1

app = QtWidgets.QApplication(sys.argv)
app.setQuitOnLastWindowClosed(False)

# This is substantially faster than using a signal... for some reason Qt
# signal dispatch is really slow (and relies on events underneath anyway, so
# this is strictly less work)
REENTER_EVENT = QtCore.QEvent.Type(QtCore.QEvent.registerEventType())

class ReenterEvent(QtCore.QEvent):
    pass

class Reenter(QtCore.QObject):
    def event(self, event):
        event.fn()
        return False

reenter = Reenter()

def run_sync_soon_threadsafe(fn):
    event = ReenterEvent(REENTER_EVENT)
    event.fn = fn
    app.postEvent(reenter, event)

def done_callback(outcome):
    print(f"Outcome: {outcome}")
    if isinstance(outcome, Error):
        exc = outcome.error
        traceback.print_exception(type(exc), exc, exc.__traceback__)
    app.quit()

trio.lowlevel.start_guest_run(
    mine,
    run_sync_soon_threadsafe=run_sync_soon_threadsafe,
    done_callback=done_callback,
)

app.exec_()

@njsmith
Copy link
Member

njsmith commented Jun 5, 2020

Oh, I missed the SystemError messages on the first pass... that does strongly suggest that PyQt5 has a similar bug to PySide2 :-/ AFAIK that error message should only be possible from bugs in extension modules.

@altendky
Copy link
Member Author

altendky commented Jun 5, 2020

Hey, I didn't even realize that trio had managed to stay pure Python. :| Anyways, I guess we'll call this 'good' for now. If I get around to following up with PyQt5 I'll post any results back here for reference. I just wrote up a trivial context manager for signal connections to avoid the issue all together.

@contextlib.contextmanager
def connection(signal, slot):
    this_connection = signal.connect(slot)
    try:
        yield this_connection
    finally:
        signal.disconnect(this_connection)

@altendky altendky closed this as completed Jun 5, 2020
@njsmith
Copy link
Member

njsmith commented Jun 5, 2020

It definitely looks like PyQt5 / sip has a similar bug as PySide2 / shiboken.

I ran your test script under gdb, and did:

  • b _PyGen_SetStopIterationValue (after checking the cpython source to see that this is called whenever a coroutine returns a non-None value)
  • hit c until I hit the breakpoint with the value argument of 1, since I know that that's the return value that's getting corrupted
  • added a watchpoint: watch -l _PyThreadState_UncheckedGet()->curexc_type, this breaks whenever Python's "current exception" changes
  • hit c, and saw that the StopIteration was getting removed by some code down in _Py_DeallocsipWrapper_dealloc

Full traceback:

#0  0x00000000004af8b2 in _PyErr_Fetch
    (p_traceback=0x7fffffffc588, p_value=0x7fffffffc578, p_type=0x7fffffffc570, tstate=0x95b410) at ../Python/errors.c:533
#1  _PyErr_FormatVFromCause
    (vargs=0x7fffffffc590, format=0x6c33e0 "%R returned a result with an error set", exception=<type at remote 0x8f0fa0>, tstate=0x95b410) at ../Python/errors.c:502
#2  _PyErr_FormatFromCause
    (exception=<type at remote 0x8f0fa0>, format=0x6c33e0 "%R returned a result with an error set") at ../Python/errors.c:533
#3  0x00000000004b669e in _Py_CheckFunctionResult
    (where=0x0, result=<optimized out>, callable=<optimized out>) at ../Objects/call.c:51
#4  _Py_CheckFunctionResult (where=0x0, result=<optimized out>, callable=<optimized out>)
    at ../Objects/call.c:23
#5  _PyObject_Vectorcall
    (kwnames=0x0, nargsf=<optimized out>, args=0x7ffff70011d8, callable=<optimized out>)
    at ../Include/cpython/abstract.h:128
#6  call_function
    (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x95b410)
    at ../Python/ceval.c:4987
#7  _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>)
    at ../Python/ceval.c:3500
#8  0x00000000005654d2 in PyEval_EvalFrameEx
    (throwflag=0, f=Frame 0x7ffff7001040, for file /home/njs/trio/trio/_core/_ki.py, line 159, in wrapper (args=(<CancelScope at remote 0x7ffff14dbdc0>,), kwargs={}))
    at ../Python/ceval.c:741
#9  _PyEval_EvalCodeWithName
    (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x7fffffffc960, kwcount=<optimized out>, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=(<cell at remote 0x7ffff71f07f0>, <cell at remote 0x7ffff70c0f70>), name='cancel', qualname='CancelScope.cancel') at ../Python/ceval.c:4298
#10 0x00000000005f1bc5 in _PyFunction_Vectorcall
    (func=<optimized out>, stack=0x7fffffffc958, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/call.c:435
#11 0x0000000000507e1c in _PyObject_Vectorcall
    (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ../Include/cpython/abstract.h:127
#12 method_vectorcall
    (method=<optimized out>, args=0x7ffff7632058, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/classobject.c:89
#13 0x00000000005f0e3e in PyVectorcall_Call
    (kwargs=<optimized out>, tuple=<optimized out>, callable=<method at remote 0x7ffff14f5dc0>) at ../Objects/call.c:199
#14 PyObject_Call
    (callable=<method at remote 0x7ffff14f5dc0>, args=<optimized out>, kwargs=<optimized out>) at ../Objects/call.c:227
#15 0x00007ffff69005f5 in PyQtSlot::call(_object*, _object*) const ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/QtCore.abi3.so
#16 0x00007ffff6900af8 in PyQtSlot::invoke(void**, _object*, void*, bool) const ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/QtCore.abi3.so
#17 0x00007ffff6900da0 in PyQtSlotProxy::unislot(void**) ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/QtCore.abi3.so
#18 0x00007ffff69018a7 in PyQtSlotProxy::qt_metacall(QMetaObject::Call, int, void**) ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/QtCore.abi3.so
#19 0x00007ffff61f4657 in void doActivate<false>(QObject*, int, void**) ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/Qt/lib/libQt5Core.so.5
#20 0x00007ffff28d4dec in QWidgetPrivate::close_helper(QWidgetPrivate::CloseMode) ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/Qt/lib/libQt5Widgets.so.5
#21 0x00007ffff28d52b0 in QWidget::~QWidget() ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/Qt/lib/libQt5Widgets.so.5
#22 0x00007ffff318e4d9 in sipQWidget::~sipQWidget() ()
    at /home/njs/.user-python3.8/lib/python3.8/site-packages/PyQt5/QtWidgets.abi3.so
#23 0x00007ffff374cb74 in forgetObject (sw=sw@entry=0x7ffff14f9310) at siplib.c:11339
#24 0x00007ffff374cb99 in sipWrapper_dealloc (self=0x7ffff14f9310) at siplib.c:10961
#25 0x00000000005a4fbc in subtype_dealloc (self=<optimized out>)
    at ../Objects/typeobject.c:1289
#26 0x00000000005e8c08 in _Py_Dealloc (op=<optimized out>) at ../Objects/object.c:2215
#27 _Py_DECREF
    (filename=0x881795 "../Objects/frameobject.c", lineno=430, op=<optimized out>)
    at ../Include/object.h:478

@njsmith
Copy link
Member

njsmith commented Jun 5, 2020

Hmm, though the way we're hitting it here is a bit different. I think what's happening is:

  • your code calls widget.show()
  • then it immediately exits the mine function, because there's nothing to wait for
  • this causes the QWidget object to be garbage-collected
  • which causes its window to close
  • which triggers your lastWindowClosed callback
  • which calls cscope.cancel
  • and now we're running cscope.cancel inside the GC collection at the end of the mine function, when CPython is in the middle of raising StopIteration, and CPython gets very confused

@altendky
Copy link
Member Author

altendky commented Jun 5, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants