Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Qt5: SIGINT kills just the mpl window and not the process itself #13306

Merged
merged 5 commits into from Aug 23, 2021

Conversation

vdrhtc
Copy link
Contributor

@vdrhtc vdrhtc commented Jan 28, 2019

PR Summary

This pull request fixes issue #13302 using signal.set_wakeup_fd().

The implemented test_fig_sigint() uses the example from the aforementioned issue. It's rather an integration test, though. It does not fail with the previous behaviour, but never passes either hanging the whole testing process.

I also changed test_fig_signals() to pass with the new code.

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant

@joelfrederico
Copy link
Contributor

I'd downvote. Timers kill performance, they're an exceptionally bad idea. If you really want Python to handle a signal, you should register a handler with Qt: http://doc.qt.io/qt-5/unix-signals.html. (There is an analogue Windows method to this as well.)

There is some discussion as to whether this is a good idea or not: #13302 (comment)

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Jan 29, 2019

Removed QTimer, using signal.set_wakeup_fd() instead. Only tested on Linux!

KeyboardInterrupt is raised after qApp.exec_() when exit is due to caught SIGINT. Should be fine for everyone, I think, and be the default behaviour.

Copy link
Contributor

@joelfrederico joelfrederico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think Windows testing is important. Windows does sockets differently than Unix and I don't know what the Windows gotchas are. Also, your ctrl-c handlers for sure only get handled once the app processes some Python in Windows. If there's a path of execution through the event loop that stays in C, the Python callback writing to the socket will never happen. Windows really does need to have that callback triggered by its own ctrl-c handler a la https://github.com/zeromq/pyzmq/blob/39e679937bcedb178860847e81e931cda1b534a0/zmq/utils/win32.py#L116.

lib/matplotlib/backends/backend_qt5.py Outdated Show resolved Hide resolved
lib/matplotlib/backends/backend_qt5.py Outdated Show resolved Hide resolved
lib/matplotlib/tests/test_backend_qt.py Outdated Show resolved Hide resolved
lib/matplotlib/tests/test_backend_qt.py Outdated Show resolved Hide resolved
@vdrhtc
Copy link
Contributor Author

vdrhtc commented Jan 30, 2019

Current approach seems to work on Windows almost out of the box, only had to change the socket type to default (STREAM) since Windows didn't allow DGRAM. The test however now doesn't work since os.kill(os.getpid(), signal.SIGINT) kills CPython with pytest and everything.

I again faced (and captured) the bug which is about qApp.exit() not being called after the sole plot window is closed. Is this connected with %pylab qt5?

@tacaswell tacaswell added this to the v3.2.0 milestone Jan 30, 2019
Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be done more simply, I want to have final review on this.

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Jan 30, 2019

@tacaswell, it's certainly not ready yet, I need to test more and try to simplify =)

@joelfrederico, if I understand everything correctly, Python registers some low-level handler for SIGINTS that is actually setting a flag for the VM and writing to the file descriptor according to signal.set_wakeup_fd() independently of what the code is doing at the high level (this is undocumented). So when we make a Qt class listening to socketpair's other end, it should react and reacts (from what I see) immediately, even when not running the Python interpreter.

I also think I grasped what is in your zmq link with kernel32.SetConsoleCtrlHandler, I think it's also a way to try.

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Jan 30, 2019

Sorry, messed up with commits and the author names a bit, it was all me.

I've implemented the context manager (thanks, @joelfrederico), fixed the tests.

I think I am starting to like this code, but now I get
QSocketNotifier: Invalid socket 75 and type 'Read', disabling...
in the jupyter notebook terminal stderr. Probably due to the TCP socket type.

@joelfrederico
Copy link
Contributor

So... yes, Python actually has a SIGINT handler that, in an undocumented manner, records that a SIGINT has occurred. That's all the handler does.

Then, before every python instruction, python checks to see if a SIGINT has occurred. If it has, the python code runs the current Python-registered signal handler function.

But what actually happens if you're in the C part of Qt when a signal happens? The handler records that a SIGINT has occurred and returns. That's it. The Python-registered signal handler function will happen the next time any Python code runs. But what happens if no Python code ever runs again? Qt will continue along as if nothing happened. It will never call the Python-registered signal handler function, so there will be no socket event, so it will never quit.

Instead, what we probably want to happen is to register our own low-level signal handler. Then, if a signal happens during Qt C, our low-level signal handler runs. It then writes to a socket and returns. Our Qt should have a QSocketNotifier listening on that socket. Once the Qt event loops gets around to checking that socket, it sees the write and then emits a signal that it has detected an interrupt. Up until this point, everything is in C, no Python.

The question is, what do we want the SIGINT slots to do? We have options. I think a solid option is to simply call the custom handler (that is effectively a closure) that you designed. That custom handler records that a SIGINT has happened, followed by a call to Qt exit(). That way, whether a SIGINT happens in Qt or in Python, the result is the same. Then the Qt event loop continues until it checks to see if exit() has been called. It then exits and returns from the place exec_() was called. Your closure had registered that a SIGINT had happened, so then we call the previous signal handler.

@joelfrederico
Copy link
Contributor

The astute observer will notice the problem: what happens with us overriding Python's SIGINT handler? Does it get reset at some point, or does ours take precedence? If so, we need to make sure it gets reset. I don't really know how to do this. It should be possible in theory, just a question of whether Python's low-level SIGINT handler is documented.

Like I said, I switched away from dealing with this, so I didn't tackle this issue.

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Jan 31, 2019

@joelfrederico,

But what actually happens if you're in the C part of Qt when a signal happens? The handler records that a SIGINT has occurred and returns. That's it. The Python-registered signal handler function will happen the next time any Python code runs. But what happens if no Python code ever runs again? Qt will continue along as if nothing happened. It will never call the Python-registered signal handler function, so there will be no socket event, so it will never quit.

https://github.com/python/cpython/blob/a1f9a3332bd4767e47013ea787022f06b6dbcbbd/Modules/signalmodule.c#L237-L252

Here I can vaguely see that the handler indeed sets the flag for the VM to check AND writes to wakeup fd after that. The trip_signal() func is called in the default low-level handler signal_handler() used then in PyOS_setsig()(defined here), so I guess it gets fully executed no matter what code is running, pure C library or CPython. This means that we can rely on that wakeup fd, as I did, and do not invent anything -- it looks like changing this low-level handler would be hard.

@joelfrederico
Copy link
Contributor

Is that file descriptor/socket documented to be used always and only for SIGINT? If so, great! Done.

If not, I'm personally very wary of relying on undocumented behavior.

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Jan 31, 2019

@joelfrederico
No, from the code it's for any signal -- it writes the signal number into the fd. Qt registers that write (from now on we don't care about the value written), and then calls Python. Python now can find it's flag, finds that was actually a SIGINT, and, finally, acts according to our set handler.

It's not fully documented when the descriptor is written to (I mean, at the same time the signal is caught by the process), but I think it's the only purpose of this function. I think I may ask about it in their repo, should I? Maybe they will document that more precisely, I can't imagine why they wouldn't want to.

@joelfrederico
Copy link
Contributor

Hey it is documented! Python function signal.set_wakeup_fd. That's awesome! I was rooting around in the C API and it was in the signal module the whole time!

Okay, so all you have to do is adapt your code to create a listening socket, pass that in to signal.set_wakeup_fd and into a QSocketNotifier, and you're good! Just reset it when you're done. The QSocketNotifier just needs to run some Python, which runs your signal handler and everything's fine.

I feel silly for not noticing this earlier. Well done!

@joelfrederico
Copy link
Contributor

And AFAICT this is cross-platform, so no futzing with ugly Windows hacks! I have no idea why PyZMQ gets this so wrong. That's just neat, I like it.

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Jan 31, 2019

it was in the signal module the whole time!

Yes, that's what I what trying to convey, it was from the beginning all on StackOverflow as well =) But not in the first answer, so coding was much harder this time!

I have no idea why PyZMQ gets this so wrong

Because they didn't have the function on Windows back in 2014, I guess:
default

And now no one would want to change that =)

Is it possible to set a handler like QSocketNotifier in ZMQ? I really don't know much about how it works

@joelfrederico
Copy link
Contributor

So the issue seems to be that the low-level Python SIGINT handler doesn't have an analogue in Windows. You have to set it yourself the way PyZMQ does. So signal.set_wakeup_fd does nothing. Oh well, it was a nice try.

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Feb 1, 2019

@joelfrederico, sorry, I wasn't precise: I said they didn't have it in 2014, but that's not true. The thing is that signal.set_wakeup_fd started supporting socket handles only since Python 3.5 (released on Fall, 2015), as said in the doc:
default
Additionally, you could not create a socketpair in Windows before Python 3.5:
default
So I guess these problems made them to create their own win32 utility in 2014 when Python 3.5 has not yet been released.

But now the approach works as expected, as I have tested -- the sigint test passes on Windows and on Linux.

@tacaswell tacaswell modified the milestones: v3.2.0, v3.3.0 Aug 27, 2019
@QuLogic QuLogic modified the milestones: v3.3.0, v3.4.0 May 2, 2020
@jklymak
Copy link
Member

jklymak commented Jul 23, 2020

I'm going to close this as abandoned, but please feel free to reopen if anyone wants to pursue it...

@jklymak jklymak closed this Jul 23, 2020
@QuLogic
Copy link
Member

QuLogic commented Jul 24, 2020

I'm not sure what you mean; this seems to have been updated to match the latest request. It needs a review now, but I tested it and it works.

@QuLogic QuLogic reopened this Jul 24, 2020
@jklymak
Copy link
Member

jklymak commented May 12, 2021

@vdrhtc I moved this to "Draft" to keep it off our radar until you were done with it. However, if you are done, feel free to move to "Ready for review". Thanks!

@anntzer
Copy link
Contributor

anntzer commented May 23, 2021

Can you describe how to repro the crash you mention?

@vdrhtc
Copy link
Contributor Author

vdrhtc commented May 23, 2021

@anntzer, the way is to replace this line by a pass statement so the socket is not read from. Can you reproduce?

@tacaswell tacaswell self-assigned this Jun 30, 2021
@tacaswell
Copy link
Member

@vdrhtc I'm going to do this rebase and take over this PR, sorry this has sat for so long.

vdrhtc and others added 3 commits August 19, 2021 15:37
signal.set_wakeup_fd() used instead, start_event_loop() is
interruptible now, allow_interrupt context manager, do not override
None, SIG_DFL, SIG_IGN
@tacaswell
Copy link
Member

I am now concerned why this did work for me locally before changing out the application for the loop in pause. My guess is something in detailed qt versions that I do not want to chase down.

These tests also pass for me locally.

@tacaswell tacaswell marked this pull request as ready for review August 20, 2021 19:44
@QuLogic
Copy link
Member

QuLogic commented Aug 20, 2021

They also pass on CI, so is all fine now?

@tacaswell
Copy link
Member

They also pass on CI, so is all fine now?

Yes I think so.

lib/matplotlib/backends/qt_compat.py Outdated Show resolved Hide resolved
lib/matplotlib/backends/qt_compat.py Outdated Show resolved Hide resolved
Co-authored-by: Elliott Sales de Andrade <quantum.analyst@gmail.com>
@QuLogic QuLogic merged commit 9e8c8fa into matplotlib:master Aug 23, 2021
@QuLogic
Copy link
Member

QuLogic commented Aug 23, 2021

Oops, crossed with your review.

@anntzer
Copy link
Contributor

anntzer commented Aug 23, 2021

my fault, shouldn't have decided to review after all :) do you want to address them separately?

@vdrhtc
Copy link
Contributor Author

vdrhtc commented Aug 26, 2021

@anntzer @tacaswell
Hi! I have just tested the master branch on Windows and found that there is indeed a problem with the QSocketNotifier (it appeared here: #13306 (comment)). Previously, on that Windows machine, I was running my old code with QAbstractSocket which was fine, and did not test the newest version with QSocketNotifier there (my fault, sorry). I have tested on Linux, and there is no such problem there.

The code to reproduce on Windows (in Jupyter):

%matplotlib qt
from matplotlib import pyplot as plt
plt.plot([1,2])
plt.pause(.1)
plt.pause(.1)
plt.pause(.1)

The problem seems to be that if socketpair generates a socket with the same fileno value as it generated in the previous pause() call, the QSocketNotifier unexpectedly fires and hangs here:

sn.activated.connect(lambda *args: rsock.recv(1))

because nothing was actually written to the wakeup_fd. I don't understand how that can be.

Workaround fix:

rsock.set_blocking(False)

@sn.activated.connect
def on_signal(*args):
    try:
        signal = rsock.recv(1)  # clear the socket to re-arm the notifier
    except BlockingIOError:
        pass # false-trigger, nothing on rsock

What should we do with this? Is there a better solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GUI: Qt Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants