
Add locking around output buffer #123

Merged (3 commits, Apr 15, 2016)

Conversation

thomasj02 (Contributor)

Please bear with me, as this is my first PR. The locking added here appears to fix ipython/ipython#9397 and ipython/ipython#9168.

The lock/unlock does slow down output, but it should only be noticeable for notebooks that write data very quickly (i.e., the ones most likely to run into this issue).

TBH I don't fully understand the threading model in ipykernel, but the fact that the issue was hard to reproduce consistently (plus the fact that the locking seems to fix it) led me to believe it was a multithreading issue.

Apologies in advance if I'm totally off-base on how the threading in ipykernel works.

@@ -314,7 +316,9 @@ def write(self, string):
string = string.decode(self.encoding, 'replace')

is_child = (not self._is_master_process())
self._buffer_lock.acquire(True)
minrk (Member)
you can use with self._buffer_lock in both places

takluyver (Member)
To expand a bit: if the code between acquire() and release() throws an error for some reason, the lock would never be released. You can make sure that the cleanup code runs even if an exception is thrown:

lock.acquire()
try:
    do_stuff()
finally:
    lock.release()

But locks support a more convenient way to achieve the same thing:

with lock:
    do_stuff()
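A quick self-contained demonstration (not from the PR) of why the with-statement form is safer: the lock is released even when an exception escapes the block, so a later writer can still acquire it.

```python
import threading

lock = threading.Lock()

def risky():
    # `with lock` guarantees release even when an exception escapes.
    with lock:
        raise ValueError("boom")

try:
    risky()
except ValueError:
    pass

# The lock is free again despite the exception inside the with-block;
# with a bare acquire()/release() pair it would still be held.
acquired_after_exception = lock.acquire(blocking=False)
lock.release()
```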

minrk (Member) commented Apr 14, 2016

Thanks for the PR! Sorry for the brief comment, hopefully @takluyver clarified it a bit. I think you are right that it's a threading issue, and this is an appropriate fix. We may be able to find a more efficient solution in a later patch that avoids locking on every write, but this is a great place to start.

@minrk minrk added this to the 4.4 milestone Apr 14, 2016
minrk (Member) commented Apr 14, 2016

And since this is your first PR, some further information: you can keep adding and pushing commits to this branch, and it will continue to be updated; then we can merge it when everything is all set.

takluyver (Member)

@minrk in terms of a better approach, would it make sense to pass the data between threads by putting the strings into a Queue rather than manipulating StringIO instances?

minrk (Member) commented Apr 14, 2016

That's possible. One thing I don't like about the Queue is that reading requires N thread-safe .get() calls on the reading side, matching the number of .put() calls, instead of a single .getvalue(). I did a quick test, and using a Queue is about an order of magnitude slower than using StringIO + locking, so I think this solution is a good one.
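A rough, hypothetical sketch (not the benchmark actually run in the PR) of the comparison described above, assuming Python 3 (the `queue` module; at the time of the PR this would have been Python 2's `Queue`). The chunk size `data`, count `N`, and function names are illustrative; actual timings will vary by machine and workload.

```python
import io
import queue
import threading
import timeit

N = 10_000
data = "x" * 64  # an illustrative write chunk

def via_queue():
    # Every write is a thread-safe put(); draining needs N get() calls.
    q = queue.Queue()
    for _ in range(N):
        q.put(data)
    chunks = []
    while not q.empty():
        chunks.append(q.get())
    return "".join(chunks)

def via_stringio_lock():
    # Writes take a lock around StringIO.write(); a single getvalue()
    # retrieves the whole buffer at flush time.
    buf = io.StringIO()
    lock = threading.Lock()
    for _ in range(N):
        with lock:
            buf.write(data)
    with lock:
        return buf.getvalue()

t_queue = timeit.timeit(via_queue, number=5)
t_stringio = timeit.timeit(via_stringio_lock, number=5)
```

Both approaches produce the same output; the question raised in the thread is purely about overhead.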

takluyver (Member)

Interesting. I think the queue is conceptually neater, so I played around a bit more to see if I could make it perform better. I couldn't improve it much using the Queue API, but internally a Queue is based around a deque and a lock, and using those directly, I could get comparable performance to StringIO+Lock:

https://gist.github.com/takluyver/04d8df0cea9e9deb959aba9a3b089f9f

Clearing the deque seems slightly nicer to me than recreating the StringIO on each flush, but we need a lock in either scenario, so I don't think it's worth the trouble to switch.

(In fact, for this we don't even need a deque; we could just append to a list. I just tried that, and performance is about the same.)
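For illustration, a minimal sketch of the list+lock alternative being discussed; the class name `ListBuffer` and its methods are hypothetical, not part of ipykernel.

```python
import threading

class ListBuffer:
    """Hypothetical sketch: buffer writes in a list under a lock,
    join and clear atomically on flush."""

    def __init__(self):
        self._parts = []
        self._lock = threading.Lock()

    def write(self, string):
        # Appending under the lock keeps concurrent writers safe.
        with self._lock:
            self._parts.append(string)

    def flush(self):
        # Join everything written so far and clear the buffer atomically,
        # analogous to getvalue() plus recreating the StringIO.
        with self._lock:
            data = "".join(self._parts)
            self._parts.clear()
        return data
```

Clearing the list in place plays the role that recreating the StringIO plays in the merged fix.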

minrk (Member) commented Apr 14, 2016

Yeah, I would probably use a list+lock rather than a Queue, if that's preferable. I don't see much of a difference, so I'm inclined to let inertia decide, but I wouldn't object to a change like that.

thomasj02 (Contributor, Author)

Updated to use with statements for locking.

minrk (Member) commented Apr 15, 2016

@thomasj02 thanks! Welcome to the project.

Successfully merging this pull request may close these issues.

Consistent reproduction of 9168