New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exceptions when updating plots in new example #4000
Comments
Hi everybody, I have the same error with a personal application. I regularly update some plots (by updating the underlying
I am using Bokeh 0.11.0. |
I personally think that some kind of race condition in the bokeh server is occuring. It seems that if you update a plot from the server side, the javascript realizes that something has changed and triggers an 'on change' to make the server aware that something happened. Kind of a vicious circle. Most of the time this thing is properly handled by the server but sometimes it even leads to an endless loop between the browser and the bokeh server like I pointed out in my initial post. |
Hey guys, I also experienced exactly the same issue a few month ago and it actually prevented bokeh from being used in our company in a production environment. |
Hey Joe I suggest next time you encounter such an issue you better directly post it here since it could have been fixed by now. In my experience the bokeh community is pretty fast dealing with issues especially @bryevdv is very active. |
@benoit9126 when exactly does your Exception occur? Does it happen when you update multiple ColumnDataSources on the server side at the same time? This is what I am doing. |
@the-neworder : yes exactly. It happens with a plot which is built from two my_column_data_source.data.update(**my_new_data) |
Ah I think we are getting to the core of the problem. In my example from above I have three ColumnDataSources per plot (one for the rects, one for the green line and one for the baseline) all of them get updated. Also I do not only have one but four plots so 4*3 data sources which can get updated at the same time. |
In my case, I have two plots. Each of them has two |
@the-neworder thanks very much for the kind words, I certainly try to do my best to be responsive. I will admit as the interest in Bokeh has recently grown this has become more of a challenge. (The scaling properties of individuals are not great!) I'm excited to see some new people interested in more significant development involvement starting to show up. Anyway, there is definitely code specifically in place to prevent propagation of updates resulting from the latest patch message, to prevent ping-ponging. It's possible there is a situation that slips through, of course. Or perhaps it is something else. The first step to being able to fix this, and also put in tests to ensure it does not creep back in, is to find a way to reliably reproduce the problem. Any help along these lines would be extremely useful and appreciated. I do think the two of you have narrowed it down nicely, which is a great start. |
@bryevdv Here is an example which fails after a few tenth of seconds. #!/usr/bin/env python
# -*- coding: utf-8 -*-
import time
from threading import Thread
import numpy as np
from bokeh.client import push_session
from bokeh.document import Document
from bokeh.models import ColumnDataSource
from bokeh.plotting import Figure
class Test(Thread):
def __init__(self, data1, data2):
super(Test, self).__init__(daemon=True)
self.data1 = data1
self.data2 = data2
self.cds1 = ColumnDataSource(data=data1)
self.cds2 = ColumnDataSource(data=data2)
self.fig = Figure(plot_width=500, plot_height=500, toolbar_location='left')
self.fig.circle_x(source=self.cds1, x='x', y='y', size='size')
self.fig.circle_cross(source=self.cds2, x='x', y='y', size='size')
def run(self):
while True:
print('ok')
self.data1['y'] = np.roll(self.data1['y'], 1)
self.data2['y'] = np.roll(self.data2['y'], 1)
self.data1['size'] = np.random.random(10) * 20
self.data2['size'] = np.random.random(10) * 20
self.cds1.data.update(**self.data1)
self.cds2.data.update(**self.data2)
time.sleep(0.1)
data1 = dict(x=np.arange(10), y=np.arange(10), size=np.random.random(10) * 20)
data2 = dict(x=np.arange(10), y=np.arange(10, 0, -1), size=np.random.random(10) * 20)
test = Test(data1=data1, data2=data2)
doc = Document()
doc.add_root(test.fig)
session = push_session(document=doc)
session.show()
test.start()
session.loop_until_closed() It is not beautiful but it does the work. 🎉 Update: if you try to zoom with the wheel, it sometimes trigger the bug. |
@benoit9126 your example is not executable unless i remove daemon=True.
also your code does not produce the error on my machine |
I am using python 3.4 and the daemon argument has been added in python 3.3. This example is pretty unpredictable. Sometimes, it works fine and sometimes it fails (generally quickly)... |
iam also trying to create an example but need some time to make it small |
I think the quantity of data is a key point... |
@benoit9126 can you try my example? execute
iam using python 2.7 |
You guys are great :) I will commit to running both of these today and try confirm if they reproduce the problem. Also ping @havocp just in case you have any quick thoughts. I immediately am suspicious that both examples above seem to use threads, and AFAIK none of our examples do. Is there an explicit document lock that ought to be acquired at some place? |
I will try to create a threadless version which uses periodic callback |
@the-neworder Your code fails more than mine which is good!
So there is only one thread which modifies the data... |
ok it seems that if data is added through add_periodic_callback, that this synchronized everything a lot better. I cannot get it to fail anymore even if I do:
also it still works if I do:
instead of
To summarize it: We get this error independet of using threads if we modify the datasource outside of the periodic callback. |
I haven't reviewed the whole discussion, but item 2 above is not allowed; you can't loop in one thread and update data in another. All touching of the document (or any part of bokeh really) needs to be in the ioloop thread. To get data from the worker thread to the ioloop thread, you could add a next tick callback; I'm not sure if we've actually made it safe to add a next tick callback from a thread but we should if not. That's the ONLY operation that tornado itself supports doing from another thread. Another possible approach is to hide the thread behind a function which returns a future (aka a coroutine) and then call that from the main ioloop thread and handle the future's result when it arrives (in the main thread). tldr - there are absolutely no thread locks or thread safety attempts in Bokeh that I'm aware of. It is categorically not thread safe. Doesn't mean threads are the cause of that exception, but I'd expect attempts to use bokeh from multiple threads to explode in all sorts of ways. |
fwiw the exceptions here look like corrupted protocol which could certainly happen if two threads both try to modify models (or one touches a model while another receives messages from the client). In that case you have uncontrolled random interleaving of reads/writes from the websocket. |
hey @havocp thx for clarification. I now changed my initial example and i can now add data every millisecond and update 4 plots at the same time and it never fails. Probably it should be strongly highlighted somewhere that modifying the data outside of the ioloop thread leads to exceptions. The first thing that i did was to put the loop_until_closed into a thread and I bet a lot of other also do. |
Yes there will absolutely be documentation support added as a result of this (very productive) discussion. I'm also trying to investigate the next tick callback a bit more, I have not used it but an example might go a long way as well. |
For reference here is a link to Tornado's http://www.tornadoweb.org/en/stable/ioloop.html#tornado.ioloop.IOLoop.add_callback It explicitly mentions that it "may be used to transfer control from other threads to the IOLoop‘s thread." so perhaps the task is to make an appropriately simple (or at least predictable and documentable) pattern for users to follow. |
fwiw I think you could put loop_until_closed in a thread, as long as every other use of Bokeh is also in that same thread. But kind of just a technicality ;-) not useful. The deal with next tick callback is just that we have our own data structures around it (on Document and ClientSession / ServerSession) which would need locking to make document.add_next_tick_callback threadsafe. If we add that locking, then I think Tornado supports adding a next tick callback from another thread. In the meantime as a workaround, it would probably work to use Tornado directly to call IOLoop.add_callback and get into the ioloop thread. Check out the docs at http://tornadokevinlee.readthedocs.org/en/latest/ioloop.html#tornado.ioloop.IOLoop.add_callback
So only next tick callbacks can be added from another thread - but of course you can always add a next tick callback which in turn adds another kind of callback or does anything else. The next tick callback is just a way to move code into the ioloop thread. For most applications I think the better option may be to use futures as discussed here: We really should have a nice API for that like the server_context.add_session_updater or similar suggested in that post. Another useful link perhaps about ways to do concurrency/threads with tornado: |
Thanks a lot for all this documentation. I have never used Tornado IOLoop directly. Tomorrow, I will try to use the |
Indeed a very productive discussion. I know now how to proceed. Thx a lot. |
It seems to work fine with |
Awesome! OK I will aim to write up some docs while I am traveling early next week. If there is a small-ish self-contained example that anyone would like to contribute or have added, that would certainly be good. A few questions: @havocp would it work to just wrap (internally) all callback additions in a next tick callback that actually makes the IOloop call to add the real callback? If I understand this could be a way to allow callback adding methods to be called from any thread? Are there any downsides? If that is not workable, @the-neworder @benoit9126 do y'all have any ideas or suggestions about either a thin convenience APIs or common patterns we could endorse and promote? |
I think this can be closed, there is an example in the server docs. |
Which is example on the server is this referring to? Thanks a lot for making this contribution! |
I created a pretty cool application with bokeh, which shows functionality that is not yet covered by any existing example. The code is attached to this issue and can freely be used as an official example. In the code custom barplots (not charts) for year, month, daily and named data were created. Tapping on a year will trigger a server side select of a month and refreshes the month plot data. The month plot then at the same time selects a day and refreshes the daily data and so on.
You should give it a try it is very cool.
Download and unpack bars_example.zip
Start the bokeh server (iam using bokeh 0.11.1):
run the example with:
Try selecting a year/month/day and see how everything is adapting x_range, y_range, secondary y_range. Also a mechanism was implemented to make it impossible to not have selected any bar. (Try clicking on the white space next to a bar).
The example was created with the goal to be able to continuously add more data and have the barplots adapt accordingly live. E.g. adding data from year 2017 will create the bar 2017 and directly show it if it does not exist yet.
A secondary mode was created which continuously adds days, which will create new month and even years. If new data was created it is directly selected and displayed. You can execute this mode by:
The mode works pretty well but if you wait a few minutes the following exception will appear. I tried it on multiple machines .. if you wait long enough the exception always appears.
Sometimes no exception appears and cpu usage by bokeh server is 100% and firefox and chrome usage also goes to 100%, while the plots stop updating.
The Exception that shows up is:
And sometimes directly after starting the program:
The text was updated successfully, but these errors were encountered: