Add locking around output buffer #123
Conversation
```
@@ -314,7 +316,9 @@ def write(self, string):
        string = string.decode(self.encoding, 'replace')

        is_child = (not self._is_master_process())
        self._buffer_lock.acquire(True)
```
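For context, here is a minimal sketch of what a thread-safe buffered stream along these lines might look like. The class and method names are illustrative, not the actual ipykernel code, and it uses the `with lock:` form suggested in the review rather than explicit `acquire()`/`release()`:

```python
import threading
from io import StringIO

class BufferedStream:
    """Illustrative sketch: a write buffer guarded by a lock."""

    def __init__(self, encoding='utf-8'):
        self.encoding = encoding
        self._buffer = StringIO()
        self._buffer_lock = threading.Lock()

    def write(self, string):
        # Decode bytes first, as in the patched write() above.
        if isinstance(string, bytes):
            string = string.decode(self.encoding, 'replace')
        # Serialize concurrent writers so interleaved writes can't
        # corrupt the underlying buffer.
        with self._buffer_lock:
            self._buffer.write(string)

    def flush(self):
        # Swap in a fresh buffer under the lock, then return the old contents.
        with self._buffer_lock:
            data = self._buffer.getvalue()
            self._buffer = StringIO()
        return data
```

Writers on any thread call `write()`, and whoever drains the buffer calls `flush()`; both operations take the same lock, so a flush never observes a half-written string.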
you can use `with self._buffer_lock:` in both places
To expand a bit: if the code between `acquire()` and `release()` throws an error for some reason, the lock would never be released. You can make sure that some 'after' code is run even if an exception is thrown:

```python
lock.acquire()
try:
    do_stuff()
finally:
    lock.release()
```

But locks support a more convenient way to achieve the same thing:

```python
with lock:
    do_stuff()
```
Thanks for the PR! Sorry for the brief comment; hopefully @takluyver clarified it a bit. I think you are right that it's a threading issue, and this is an appropriate fix. We may be able to find a more efficient solution in a later patch that avoids locking on every write, but this is a great place to start.

And since this is your first PR, some further information: you can keep adding and pushing commits to this branch, and it will continue to be updated; then we can merge it when everything is all set.
@minrk in terms of a better approach, would it make sense to pass the data between threads by putting the strings into a `Queue`?
That's possible. One thing I don't like about the Queue is that it makes reading off the queue require N thread-safe `get()` calls.
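A minimal illustration of that drawback: draining a `queue.Queue` requires one thread-safe `get()` per queued item, each of which takes and releases the queue's internal lock:

```python
import queue

q = queue.Queue()
for s in ['a', 'b', 'c']:
    q.put(s)

# Draining costs one lock round-trip per item, not one per flush.
parts = []
while True:
    try:
        parts.append(q.get_nowait())
    except queue.Empty:
        break

assert ''.join(parts) == 'abc'
```

By contrast, a single buffer guarded by one lock can be drained in a single locked operation, regardless of how many writes it accumulated.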
Interesting. I think the queue is conceptually neater, so I played around a bit more to see if I could make it perform better. I couldn't improve it much using the `Queue` API, but internally a `Queue` is based around a deque and a lock, and using those directly, I could get performance comparable to `StringIO`+`Lock`: https://gist.github.com/takluyver/04d8df0cea9e9deb959aba9a3b089f9f

Clearing the deque seems slightly nicer to me than recreating the `StringIO` on each flush, but we need a lock in either scenario, so I don't think it's worth the trouble to switch. (In fact, for this we don't even need a deque; we could just append to a list. I just tried that, and performance is about the same.)
Yeah, I would probably use a list+lock rather than a Queue, if that's preferable. I don't see much of a difference, so I'm inclined to let inertia decide, but I wouldn't object to a change like that.
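For reference, the list+lock alternative discussed above can be sketched as follows (the class name is hypothetical; this is not code from the PR or the gist):

```python
import threading

class ListBuffer:
    """Hypothetical sketch: accumulate string parts in a list under a lock."""

    def __init__(self):
        self._parts = []
        self._lock = threading.Lock()

    def write(self, s):
        # list.append is cheap, but the lock still serializes writers.
        with self._lock:
            self._parts.append(s)

    def flush(self):
        # Swap the whole list out in one locked step, then join outside
        # the lock so writers aren't blocked during the join.
        with self._lock:
            parts, self._parts = self._parts, []
        return ''.join(parts)
```

This mirrors the `StringIO`+`Lock` design: one lock, one drain operation per flush, no per-item `get()` overhead.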
Updated to use `with self._buffer_lock:` in both places.
@thomasj02 thanks! Welcome to the project.
Please bear with me, as this is my first PR. The locking added in this PR appears to fix ipython/ipython#9397 and ipython/ipython#9168.
The lock/unlock does slow down output, but it should only be noticeable for notebooks that are writing data very quickly (i.e., the ones most likely to run into this issue).
TBH I don't fully understand the threading model in ipykernel, but the fact that the issue was hard to reproduce consistently (plus the fact that the locking seems to fix it) led me to believe it was a multithreading issue.
Apologies in advance if I'm totally off-base on how the threading in ipykernel works.