KernelManagers don't use zmq eventloop properly #967
Comments
I should also note that the polling logic in the HB channel is unnecessarily complicated, and actually wrong in some places. There is a comment interpreting
* The heartbeat channel had some erroneous zeromq logic, and entirely False comments (as described in #967). This has been fixed.
* KernelManager.is_alive() checks if the hb_channel is running if the kernel is not owned, rather than always returning True.
* BlockingKM's hb_channel has been relaxed to 1s polling, because replies are not reliably much faster than that. There are occasional >0.5s outlier responses.
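The relaxed 1s heartbeat poll described above can be sketched with the standard library alone: send a ping, then wait up to one second for the echo before declaring the peer dead. This is an illustrative stand-in, not jupyter's actual heartbeat code; plain `socketpair` sockets substitute for the zmq REQ heartbeat socket, and `heartbeat_alive`, `echo_once`, and `HB_POLL_TIMEOUT` are hypothetical names.

```python
import select
import socket
import threading

# Poll window relaxed to 1s, since replies are not reliably much faster.
HB_POLL_TIMEOUT = 1.0

def heartbeat_alive(sock, timeout=HB_POLL_TIMEOUT):
    """Send one ping and wait up to `timeout` seconds for the echo."""
    sock.sendall(b"ping")
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False  # no reply within the poll window: treat peer as dead
    return sock.recv(4) == b"ping"

def echo_once(peer):
    """Simulate a responsive kernel: echo one message back."""
    peer.sendall(peer.recv(4))

a, b = socket.socketpair()
threading.Thread(target=echo_once, args=(b,), daemon=True).start()
print(heartbeat_alive(a))  # True: the echo arrives well within 1s
```

The point of the wider window is to avoid false "dead kernel" verdicts from the occasional >0.5s outlier response.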
closed by PR #1030
The KernelManagers use sockets and tornado handlers directly via add/drop_io_state, thereby duplicating the state-triggering, queuing, and other logic that ZMQStream objects already provide. Approximately all of the private methods on these channels are redundant with code already in ZMQStream.
Also, the channels should probably not each run in a separate thread; they should share one ioloop instance between them, and it should be possible to run this loop in the main thread if so desired. The most apparent problem the current design causes (a minor one) is that stopping the channels can take a full second if there is no traffic on the network, which is a long time. I cannot think of any benefit to the channels running in their own threads, as they do now.
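The single-shared-loop design suggested above can be sketched with the stdlib `selectors` module: one loop services every channel, instead of one thread per channel. This is a minimal illustration of the idea, not jupyter's API; `make_channel` and the callback shape are hypothetical, and `socketpair` sockets stand in for zmq sockets.

```python
import selectors
import socket

# One selector shared by all channels: the single "ioloop" of the sketch.
sel = selectors.DefaultSelector()

def make_channel(name):
    """Create a socket pair; register the receive end with the shared loop."""
    rx, tx = socket.socketpair()
    rx.setblocking(False)
    received = []
    def on_readable(sock):
        received.append((name, sock.recv(64)))
    sel.register(rx, selectors.EVENT_READ, on_readable)
    return tx, received

shell_tx, shell_msgs = make_channel("shell")
iopub_tx, iopub_msgs = make_channel("iopub")

shell_tx.sendall(b"execute_request")
iopub_tx.sendall(b"status:busy")

# One loop iteration dispatches every ready channel. Stopping is immediate:
# there is a single loop to break out of, not one poll timeout per thread.
for key, _ in sel.select(timeout=1.0):
    key.data(key.fileobj)

print(shell_msgs)  # [('shell', b'execute_request')]
print(iopub_msgs)  # [('iopub', b'status:busy')]
```

With all channels multiplexed on one loop, shutdown no longer has to wait out a per-thread poll timeout, which is the one-second stop delay complained about above.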
This is a low-ish priority, because the code we have works, but it should definitely be fixed.