New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Notebook] Clear kernel queue #2524
Comments
At what point can we clear the queue? After |
Can you clarify this. Is it a problem with just the Javascript side of things or the actual kernels queue. Please give more details. |
Only JS IIRC. |
I looked up the source. In terminal console, it seems there is no callback dict as request is done in a blocking manner. In QT console, there is callback dict for execute reply etc., but not for iopub (e.g., inline pylab). I guess this is because in QT console, result always appears linearly (after the already executed input prompt). In contrast, in notebook you need to remember in which cell the reply goes. Looking at ZMQTerminalInteractiveShell.run_cell, it seems that you can remove the callback after execute reply. But I am wondering if it is safe to assume that all iopub reply reaches at the client at the point execute reply arrives. As both messages go through different channels on different ports, isn't it possible to execute_reply message to arrive before iopub messages? Or ZMQ solves this problem wisely? |
I was also wondering what would append if a js plugin would register a callback itself. |
The traffic on iopub can arrive in any order, before or after the execute reply is received. We can't make any assumptions about ordering. Is the issue you are worrying about that the iopub callbacks never get cleared? |
If "you" is me, yes :) How about something like "end of IO signal" from kernel to tell clients that it is safe to clear callback queues. This way, you don't need to assume sequential execution and it will be thread-friendly. |
The problem is that the kernel doesn't know when code is done running for a cell. Users can always start a thread and that thread could keep pushing output at later times. But there is a problem with this right now - that output will be associated with the wrong cell. So maybe it is not too bad to do what you suggest? Thoughts? |
Yes, it is hard for threads created by user. Probably we can monkey patch threading module, or provide a cell magic command to temporally do monkey patch? The monkey patch I have in mind wraps the target function so that it sends "end of IO signal" if it is the last thread started in the corresponding execution request. At least, we can provide IPython-aware threading API so that it can be used in, for example, some magic functions such as But implementing "end of IO signal" does not require changes for making IPython thread-aware. Other way I can think of is to move |
It is a bit more complicated than this and I don't think it will work to muck with threads like this. Here is how the logic works:
What this means for threading is that if a thread lives beyond a single command, whenever it writes to iopub, it will use whatever request id is active at the time. This means that the threading output will appear in multiple cells - all of those that are run while the thread lives. I don't know of any way for the thread's output to be redirected to the current original cell. Here is one option:
BUT,threading or other long lived code (!mycode &) would still write output to iopub. But the frontend simply wouldn't have a callback handler for it. But maybe this is not any worse than the current situation where that output is written to multiple cells. So here is the big question underlying all of this: How do we want to handle output of long lived things (threads and !mycode &)? @fperez @minrk ? |
The iopub behavior you described is more less the same as what I think it would be (actually I thought request id will be cleared after execute_reply). And output_finished message you suggested is the same as what I thought it should be for the first implementation (i.e., no thread support). For the thread support, I am imagining doing something like the following in def flush(self):
data = self._buffer.getvalue()
content = {u'name': self.name, u'data': data}
parent_header = THREAD_HEADER_MAP[threading.current_thread()]
self.session.send(
self.pub_socket, u'stream', content=content,
parent=parent_header, ident=self.topic) Here def target_wrapper(parent_header, real_target):
THREAD_HEADER_MAP[threading.current_thread()] = parent_header
try:
real_target()
finally:
THREAD_HEADER_MAP.pop(threading.current_thread())
for (thread, header) in THREAD_HEADER_MAP.items():
if header == parent_header:
break
else:
get_session().send(self.pub_socket, u'output_finished',
parent=parent_header) (please imagine that |
This makes sense. The Right now, we have 'status=busy / idle' messages, which imply that the kernel has started / finished working on a task. It makes a bit more sense to me to have 'started / finished task' messages that imply that the kernel is busy / idle. |
@minrk makes a great point. We already send a message at exactly the right On Tue, Dec 4, 2012 at 1:03 PM, Min RK notifications@github.com wrote:
Brian E. Granger |
@tkf I really don't want to hack threads like this. We will face the same issues when users background processes using popen or |
What do you mean by general? If that means "it can be applied to many cases", I think my example is general enough. You can write the magic to execute command in background using the thread target function I mentioned. But of course I agree we should not do this by default. Also I agree that the magic to execute shell command does not need to use |
By general, I mean that there are many ways of running external processes: I also don't want to build anything into the ipython core that requires On Tue, Dec 4, 2012 at 1:41 PM, Takafumi Arakaki
Brian E. Granger |
I also don't want to create the expectation that output from threads will On Wed, Dec 5, 2012 at 4:35 PM, Brian Granger ellisonbg@gmail.com wrote:
Brian E. Granger |
If you want thread support only for calling command line program, why do you need to support popen, pexpect and os.system? If there is no thread API, what user can do is to run command line program through magic command or subprocess API, right? Then, there is no need for any general way to support thread. And I don't uderstand why my threading trick cannot handle all these ways of running subprocess, if the trick actually works as intended. It's not like I am 100% sure about my approach. I am just puzzled that why you just want to restrict threading support to subprocess related stuff while talking about generality. Supporting any subprocess related programming in thread is as difficult as supporting everything you can do in thread. It's a stupid example, but you can execute any python code after running some external command. If you are interested only in running external command, I guess it's better to just provide a magic command to do that. Having said that, it would be really nice if IPython provide thread API. For example, as ctypes releases GIL, you can call c function in a thread and update its output as a graph once in a while. Most of the time (i.e. computation is done in C), you can execute code in the same notebook. |
But if we add a new message to the iopub channel that clears the iopub On Fri, Dec 7, 2012 at 11:17 AM, Takafumi Arakaki
Brian E. Granger |
But all of this is much less important than simply clearing the JS On Mon, Dec 10, 2012 at 9:29 AM, Brian Granger ellisonbg@gmail.com wrote:
Brian E. Granger |
I propose Attaching cell id to callbacks (as callbacks are function one can attach other attributes to it) Note that this will only be an issue for really long running notebook. |
This is a pretty simple way of going about it, but I still think we want to On Mon, Dec 10, 2012 at 10:12 AM, Bussonnier Matthias <
Brian E. Granger |
the request id is already in the parent_header field. That's what parent_header has always been for - identifying the request that caused the reply / side-effect. What's missing? |
@minrk, What @Carreau suggested was cell id. Currently it is not in the message, right? And I guess he is suggesting for adding JS callback queues, not message. I think once output_finished is implemented there will be no left over in the message queue. There is no need for extra care to clear the queue, no? |
That's why I send output_finished in the last remaining thread. That |
Gotcha, I misunderstood - that was about the callback itself, not any addition to message content. So, I guess here's the question getting back to the issue at hand: In what way does the current status=idle message not address this issue, other than being poorly named? When status=idle is received, the iopub callback should be unregistered. I expect that the situation is a bit different for aborted tasks, so that will need to be looked at. Further, the callbacks can't be keyed by cell_id, because that won't work for widget-type requests, which are not associated with any cell. |
I thought if you send 100 execution request at once there will be only one status=idle after the last execution reply. But I just found out that there will be 100 replies with msg_id in its parent_header. So I think status=idle is functionally same as output_finished I was talking about. |
Yes, I don't think we need another message to be sent. On Mon, Dec 10, 2012 at 5:51 PM, Takafumi Arakaki
Brian E. Granger |
In kernel.js,
execution callback Queue keep growing in size even when callback are no more needed.
The text was updated successfully, but these errors were encountered: