KernelManagers don't use zmq eventloop properly #967

Closed
minrk opened this Issue · 2 comments

@minrk
Owner

The KernelManagers use sockets and tornado handlers directly via add/drop_io_state, essentially duplicating ZMQStream, which already handles state triggering, queuing, etc. Nearly all of the private methods on these channels are redundant with code already in ZMQStream.
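To illustrate what ZMQStream already takes care of, here is a minimal sketch, not the KernelManager code itself. It assumes pyzmq with tornado installed; the function name `echo_once` and the `inproc://stream-demo` endpoint are made up for the example. The stream registers the socket with the current loop and calls the `on_recv` callback with the message frames, so there is no manual add/drop of io state.

```python
import asyncio

import zmq
from zmq.eventloop.zmqstream import ZMQStream


async def echo_once(payload, endpoint="inproc://stream-demo", timeout=2.0):
    """Send one message over a PAIR socket and receive it through a ZMQStream."""
    ctx = zmq.Context.instance()
    server = ctx.socket(zmq.PAIR)
    client = ctx.socket(zmq.PAIR)
    server.bind(endpoint)
    client.connect(endpoint)

    received = asyncio.get_running_loop().create_future()

    def on_msg(frames):
        # ZMQStream hands the callback a list of message frames.
        if not received.done():
            received.set_result(frames)

    # The stream attaches to the current tornado IOLoop (which wraps the
    # running asyncio loop) and manages the read state by itself.
    stream = ZMQStream(server)
    stream.on_recv(on_msg)

    client.send(payload)
    try:
        return await asyncio.wait_for(received, timeout)
    finally:
        stream.close(linger=0)
        client.close(linger=0)
```

Running `asyncio.run(echo_once(b"ping"))` should hand back the frames that were sent, without any explicit polling in user code.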

Also, each channel should probably not run in a separate thread; they should share one ioloop instance, and it should be possible to run this loop in the main thread if desired. The most apparent problem this causes (a minor one) is that stopping the channels can take a full second if there is no traffic on the network, which is a long time. I cannot think of any benefit to the channels running in their own threads, as they do now.
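A rough sketch of the shared-loop idea, assuming pyzmq with tornado; the channel names, the `inproc://shared-loop-*` endpoints, and `run_channels` are hypothetical and only stand in for the real channels. Both streams register with the same loop, which runs in the calling thread, and stopping them is just closing the streams, with no per-channel thread to join.

```python
import asyncio
import time

import zmq
from zmq.eventloop.zmqstream import ZMQStream


async def run_channels(names=("shell", "iopub"), timeout=2.0):
    """Receive one message per channel on a single loop, then stop them all."""
    ctx = zmq.Context.instance()
    received = {}
    all_done = asyncio.Event()
    streams, senders = [], []

    for name in names:
        url = f"inproc://shared-loop-{name}"  # hypothetical endpoint names
        receiver = ctx.socket(zmq.PAIR)
        receiver.bind(url)
        sender = ctx.socket(zmq.PAIR)
        sender.connect(url)

        def on_msg(frames, name=name):
            received[name] = frames[0]
            if len(received) == len(names):
                all_done.set()

        stream = ZMQStream(receiver)  # every stream uses the same current loop
        stream.on_recv(on_msg)
        streams.append(stream)
        senders.append(sender)

    for name, sender in zip(names, senders):
        sender.send(name.encode())

    try:
        await asyncio.wait_for(all_done.wait(), timeout)
    finally:
        started = time.monotonic()
        for stream in streams:
            stream.close(linger=0)  # unregisters from the loop; nothing to join
        stop_seconds = time.monotonic() - started
        for sender in senders:
            sender.close(linger=0)
    return received, stop_seconds
```

With `asyncio.run(run_channels())`, shutdown does not depend on a poll timeout expiring, so it should not take anywhere near a second.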

This is a low-ish priority, because the code we have works, but it should definitely be fixed.

@minrk
Owner

I should also note that the polling logic in the HB channel is unnecessarily complicated, and actually wrong in some places. There is a comment interpreting poll() returning an empty list as a zeromq bug, which it certainly is not, and the code appears to follow this misunderstanding. Since this bug is actually critical to getting the terminal-frontend working, I will at least fix that part over in #864.
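For reference, a minimal sketch of a heartbeat check that treats an empty poll() result as what it is: a timeout. This is not the code from the HB channel; `heartbeat_once`, the REQ socket choice, and the `b"ping"` payload are assumptions made for the example, and only pyzmq is required.

```python
import zmq


def heartbeat_once(ctx, url, timeout_ms=1000):
    """Send one ping to url; return True if it is echoed within timeout_ms."""
    sock = ctx.socket(zmq.REQ)
    sock.setsockopt(zmq.LINGER, 0)  # do not block on close if nobody answers
    sock.connect(url)
    poller = zmq.Poller()
    poller.register(sock, zmq.POLLIN)
    try:
        sock.send(b"ping")
        events = poller.poll(timeout_ms)
        if not events:
            # An empty list means the timeout expired with nothing to read.
            # That is normal poll() behavior, not a zeromq bug.
            return False
        return sock.recv() == b"ping"
    finally:
        sock.close()
```

Using a fresh socket per beat sidesteps the REQ send/recv lockstep after a missed reply; a long-lived channel would need to handle that case differently.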

@minrk minrk referenced this issue from a commit
@minrk minrk Fixes to the heartbeat channel
* The heartbeat channel had some erroneous zeromq logic, and entirely False comments (as described in #967).  This has been fixed.

* KernelManager.is_alive() checks if the hb_channel is running if
the kernel is not owned, rather than always returning True.

* BlockingKM's hb_channel has been relaxed to 1s polling, because replies are not
reliably much faster than that.  There are occasional >0.5s outlier responses.
f9de65b
@minrk minrk referenced this issue from a commit
@minrk minrk Fixes to the heartbeat channel
ac70da8
@minrk minrk referenced this issue from a commit
@minrk minrk Fixes to the heartbeat channel
649aa17
@minrk
Owner

closed by PR #1030

@minrk minrk closed this
@minrk minrk referenced this issue from a commit
@minrk minrk Fixes to the heartbeat channel
804dd6f
@minrk minrk referenced this issue from a commit
@minrk minrk Fixes to the heartbeat channel
f7e44e6
@mattvonrocketstein mattvonrocketstein referenced this issue from a commit in mattvonrocketstein/ipython
@minrk minrk Fixes to the heartbeat channel
7d56d1a