fix: avoid iopub losing messages by polling two iopub AND shell socket #1183
nbconvert/preprocessors/execute.py
    if monotonic() > deadline:
        self._handle_timeout(exec_timeout, cell)
    if xlist:
        raise RuntimeError("Oops, unexpected rror")
    -    raise RuntimeError("Oops, unexpected rror")
    +    raise RuntimeError("Oops, unexpected error from zmq")
Yeah, this is partly why this is a draft: we don't have a test that could trigger this. My guess is that closing the socket from the kernel side might do it, or segfaulting the kernel.
Thanks for putting this together. I'll want to dedicate some time to comparing and testing it, as this section of the code is hard to fully reason about on inspection alone. The good news is this code is still the same in nbclient, so it should translate over cleanly. I can help with the tests if we get to approval and there are still gaps.
Hi, was this implemented in another PR? I still see IOPub timeouts sometimes.
@echuber2 the work here is no longer applicable, as it was rewritten and moved into nbclient with the 6.0 release. There is always an internal buffer somewhere in the zmq communication. Timeouts usually occur when thousands or tens of thousands of messages arrive in a very short time (about a second). I'd check whether your notebook is trying to print far too much information at once, since the output formatting is likely a mess as well with that many messages. If you're still hitting timeouts with a smaller number of messages, I'd post to nbclient with more details on what you're executing.
A POC/port to nbconvert of voila-dashboards/voila#536
Solves: nteract/papermill#426
Alternative to #994
I think eventually this should go into nbclient, but opening this just to show how it can be done.
#994 polls for 1 second on the shell channel before polling/reading from the iopub channel. During that second, the iopub socket can hit the high water mark of 0mq (default of 1000 messages).
Once that happens, some messages are dropped, causing an IOPub timeout. Instead, in this PR, we poll both sockets and receive from whichever is ready ASAP, so no messages are lost.
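To illustrate the idea, here is a minimal sketch that is not the code in this PR. It uses the standard library's `select.select` on Unix datagram socket pairs as a stand-in for `zmq.select` on the shell and iopub channels, so it assumes a Unix-like platform. The names `wait_for_reply_and_idle`, `fake_kernel` and `demo`, and the `b"idle"` marker, are made up for the example.

```python
import select
import socket
import threading
from time import monotonic


def wait_for_reply_and_idle(shell, iopub, timeout=5.0):
    """Wait on both sockets at once until a reply and an idle marker arrive."""
    deadline = monotonic() + timeout
    outputs = []
    reply = None
    idle = False
    while reply is None or not idle:
        remaining = deadline - monotonic()
        if remaining <= 0:
            raise TimeoutError("timed out waiting for reply/idle")
        # One call watches both sockets, so iopub is never left unread
        # while we sit waiting on the shell socket.
        rlist, _, xlist = select.select(
            [shell, iopub], [], [shell, iopub], remaining
        )
        if xlist:
            raise RuntimeError("unexpected error from select")
        if iopub in rlist:
            msg = iopub.recv(4096)
            if msg == b"idle":  # hypothetical end-of-execution marker
                idle = True
            else:
                outputs.append(msg)
        if shell in rlist:
            reply = shell.recv(4096)
    return reply, outputs


def fake_kernel(shell, iopub, n_outputs):
    """Hypothetical kernel side: a burst of outputs, then reply, then idle."""
    for i in range(n_outputs):
        iopub.send(b"output %d" % i)
    shell.send(b"execute_reply")
    iopub.send(b"idle")


def demo(n_outputs=50):
    # Datagram socket pairs keep message boundaries, loosely like zmq frames.
    shell_k, shell_c = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    iopub_k, iopub_c = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    sender = threading.Thread(
        target=fake_kernel, args=(shell_k, iopub_k, n_outputs), daemon=True
    )
    sender.start()
    try:
        return wait_for_reply_and_idle(shell_c, iopub_c)
    finally:
        sender.join(timeout=1.0)
        for sock in (shell_k, shell_c, iopub_k, iopub_c):
            sock.close()


if __name__ == "__main__":
    reply, outputs = demo()
    print(reply, len(outputs))
```

The point of the sketch is only the shape of the loop: a single wait covers both sockets, and each ready socket is read right away, so the output channel cannot back up while the reader waits on the reply channel.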
Fixing the unit tests might be difficult, since I think it requires monkeypatching zmq.select.