Server rewrite based on Tornado #2794
Conversation
My first reaction to the proposal is absolutely positive. I need to dedicate some more time to going through it more carefully, but here are a couple of quick thoughts:
Before pouring out the river of thoughts I have 😝 it's better to wait for the last question.
@fpliger Yes, I need to add some notes about updates; that's basically what the "decide what messages/operations exist" bullet is about. I had planned on supporting both full updates and incremental/partial updates (i.e., a message specifically for a streaming append, for instance, that checks that all columns get appended to equally and complains informatively if not). I'll put up some initial thoughts today and we can iterate on them. Regarding applets, I was still imagining a way to associate "scripts" or "apps" with a document, so the code could be run in the server (in thread pool executors, since we are async) with "direct" access to the models. This is basically @havocp's "server callbacks", afaict. Edit: but maybe everything interacting through messages would be simpler and enough? I'm open to discussion, though I feel like the want is to be able to "publish" an app, and further I think in people's minds this implies it is somehow stored in the server as well.
```python
@gen.coroutine
def workon(self, item):
    yield self._queue.put(item)
```
This means you are not interested in the result of the processing, or even the information that the processing is done, right?
Not exactly, most of the processing here will involve updating the persistent store, perhaps as a result of a longer running computation. Those updates might trigger sending of notifications. Right now this part is just basic scaffolding, though.
For instance, a computationally expensive callback gets queued here that eventually updates the data source for a plot. That causes another chunk of work to get queued on the main IOLoop that will either push updates to the connected client, or send a message that there are updates available to pull. Does that seem reasonable? I suppose the yield here could return that secondary work and queue it on the other queue here, though...
Speaking of this topic, I was planning on serializing any operations that might write to the document, and servicing reads from the document immediately.
I'm very enthusiastic about this direction! Some comments from an initial skim of the code.
```python
METADATA = 2
CONTENT = 3
BUFFER_HEADER = 4
BUFFER_PAYLOAD = 5
```
A possibly-more-concise way to do this state machine is to make each state a function, and have a variable containing the current function, like self.current_consumer = consume_hmac kind of thing. Then you avoid the "if" statements and the enum itself. Just a thought.
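A minimal sketch of that function-per-state idea (the names `FragmentReceiver`, `consume_hmac`, `consume_header`, etc. are hypothetical illustrations, not Bokeh's actual API): each state is a bound method, and advancing the state machine is just reassigning one attribute.

```python
class FragmentReceiver:
    """Sketch of the function-per-state parser suggested above.

    Instead of an enum plus if/elif dispatch, each state is a method,
    and self._current_consumer points at whichever state runs next.
    All names here are illustrative, not Bokeh's real implementation.
    """

    def __init__(self):
        self.parts = {}
        self._current_consumer = self._consume_hmac

    def consume(self, fragment):
        # Delegate to whatever state we are currently in.
        self._current_consumer(fragment)

    def _consume_hmac(self, fragment):
        self.parts['hmac'] = fragment
        self._current_consumer = self._consume_header

    def _consume_header(self, fragment):
        self.parts['header'] = fragment
        self._current_consumer = self._consume_content

    def _consume_content(self, fragment):
        self.parts['content'] = fragment
        # a complete message was assembled; reset for the next one
        self._current_consumer = self._consume_hmac
```

Adding a message sub-part then means adding one method and one reassignment, with no central dispatch table to keep in sync.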
Yup, I was already thinking of making basically that change.
IMO deciding what to do with messages is the prerogative of the server and client, respectively, so those methods have no place on the message objects themselves. Message objects should provide methods pertaining to their inner logic, but that's all.
```python
raise gen.Return(None)

# make sure the session ID from the client message matches
if message.header['sessid'] != self.session.id:
```
I would expect this check to be once when the connection opens, rather than per-message
Just being paranoid, I guess; it seemed that perhaps a client could forge a different session id if it wanted to.
What I meant, I guess, is that you could send a special "my session id is ___" message as the first thing on the connection and then omit it from each message; then it can't be wrong and doesn't need to be checked per-message.
Yes, sending the session ID is something that happens when the server replies to a connection open with an ACK. I think it would still need to be sent, if we are going to simplify the message IDs?
Yah, I tend to agree; I was just imagining an explosion of little handler classes. But better to do things right.
Full confession: some of this part was cargo culting on my part. I cribbed the HMAC bit from the IPython protocol. I had assumed it was to detect corruption or adulteration of message contents over HTTP connections. I do agree that an incremental counter could suffice in conjunction with a session ID, but also with an origin tag (client or server) as well? So the client has one counter, and the server has another. Otherwise how to synchronize one counter across both? Edit: perhaps an option to enable/disable HMAC checking?
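For reference, IPython-style HMAC signing of message parts needs only the standard library. The `sign`/`check` helpers and the per-session key below are hypothetical names, just to illustrate the scheme being discussed:

```python
import hashlib
import hmac

# Hypothetical helpers sketching the IPython-style scheme: sign the
# serialized message sub-parts with a shared per-session key, and verify
# the signature on receipt. Names and signatures are illustrative only.

def sign(session_key: bytes, *parts: bytes) -> str:
    """Return a hex HMAC-SHA256 digest over the message sub-parts."""
    digest = hmac.new(session_key, digestmod=hashlib.sha256)
    for part in parts:
        digest.update(part)
    return digest.hexdigest()

def check(session_key: bytes, signature: str, *parts: bytes) -> bool:
    """Verify a received signature; compare_digest avoids timing leaks."""
    return hmac.compare_digest(signature, sign(session_key, *parts))
```

Making this optional would then just be a question of skipping `check` (and sending an empty signature) when HMAC checking is disabled.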
It's a hedge for sure, on the notion that if you do need it later, it's a royal pain to tack on after the fact. I have definitely seen protocol versioning used, so I dunno. One scenario I can easily imagine is adding new message types, which might not be incompatible per se, but it's nice to have a record of exactly where and how these things happen. And no, I think a user could create a document that, e.g., loads BokehJS from CDN but also expects to talk to a Bokeh server for the document data. Edit: I should say that scenario is currently doable, but not trivially so. But in general, if a person publishes a document with a specific version of Bokeh, is there never an expectation that it be rendered with that same version? If so, I think a server deployed as a "public service" would actually need to be able to serve several versions of BokehJS.
@bokeh/dev OK some notes on the latest pushes:
I think if just the new
@bryevdv do you intend to replace the current server completely with the new (limited) one in 0.10? If so, we should shout out loud about those limitations, so users that have bokeh-server in production know that their app may break. I don't expect many users are running bokeh-server in production, but still, there may be a significant number... My gut feeling is that maybe there are users using the server streaming capabilities.
@fpliger I may not have been clear: any limitations would be relative to the future new server, not relative to the current one. As I mentioned, implementing push/pull should reproduce current server capability. I don't intend to replace anything until that parity is reached. (Maybe some syntax will break or usage will have to change, but the feature set should be at least equal before replacing.) I don't have strong feelings about shipping both servers, though I think my preference is to have a clean break. If people need the old server for a time, they could just hold off on upgrading.
I also don't think getting this in
@bryevdv sounds good. I misunderstood your previous comment. 👍
```python
def _start_helper(self):
    self.io_loop = IOLoop.current()
    atexit.register(self._atexit)
    signal.signal(signal.SIGTERM, self._sigterm)
```
Probably a stupid question, but why is this done asynchronously? Can't this simply be done in __init__ or start()?
Well, this was loosely inspired by the notebook server init code. I'll try simplifying it.
@pitrou thinking about this a bit more -- I think it will have to be a little more sophisticated. Basically we need to serialize all the "work" for a given document. For instance, we don't want any operations that write to a given document to overlap, but operations for different documents can be concurrent. I think that it is possible for multiple operations on a single doc to overlap with the single queue as it stands. My original thinking was that each doc would have its own executor (with thread/process limit 1, so that jobs for that document would get serialized). Does this seem reasonable? I guess that makes it hard to control the total number of threads/processes. Do you have any other suggestions? BTW I played around with ProcessPoolExecutor, but it seemed flaky.
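The "one single-worker executor per document" idea above can be sketched with `concurrent.futures`; the class and method names here are hypothetical, purely for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

class PerDocumentExecutors:
    """Illustrative sketch: one single-worker executor per document.

    max_workers=1 means jobs submitted for one document run strictly in
    submission order, while jobs for different documents run on different
    executors and can proceed concurrently. As noted above, the drawback
    is that the total number of threads grows with the number of documents.
    """

    def __init__(self):
        self._executors = {}

    def submit(self, doc_id, fn, *args):
        # Lazily create the per-document executor on first use.
        pool = self._executors.setdefault(
            doc_id, ThreadPoolExecutor(max_workers=1))
        return pool.submit(fn, *args)
```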
Flaky how so? It's true that process execution can be more fragile (the child process may be killed independently of the parent), although recent concurrent.futures versions should normally be quite robust.
How many documents do you have running tasks simultaneously? If it's a small multiple of the number of cores, then it sounds reasonable. Otherwise, you risk memory consumption issues, as well as cache thrashing. Using processes only makes the issue more severe.

Another possibility is to use a per-document Lock, but inside the main thread, not inside the workers. A threading Lock inside a worker would block the whole worker thread, while a coroutine Lock from the main-thread code calling the executor will simply yield execution to other coroutines. See http://tornado.readthedocs.org/en/stable/locks.html#tornado.locks.Lock

So your scheduling code could look like:
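The scheduling snippet itself did not survive in this thread. As a hedged sketch of the per-document coroutine-lock idea, using asyncio's `Lock` as a stand-in for `tornado.locks.Lock` (the APIs are analogous), it might look something like:

```python
import asyncio

class DocumentScheduler:
    """Sketch of per-document locking held on the main loop.

    The lock is acquired by the coroutine, not inside the worker, so
    waiting for it yields to other coroutines instead of blocking a
    worker thread. All names here are illustrative, not Bokeh's API.
    """

    def __init__(self, executor=None):
        self._locks = {}           # one Lock per document id
        self._executor = executor  # e.g. a shared ThreadPoolExecutor

    def _lock_for(self, doc_id):
        return self._locks.setdefault(doc_id, asyncio.Lock())

    async def run_write(self, doc_id, func, *args):
        # Writes to the same document are serialized by its lock;
        # writes to different documents can proceed concurrently,
        # all sharing one bounded pool of workers.
        async with self._lock_for(doc_id):
            loop = asyncio.get_running_loop()
            return await loop.run_in_executor(self._executor, func, *args)
```

This keeps a single bounded worker pool for the whole server, which addresses the "hard to control the total number of threads" concern with per-document executors.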
I mean running more than two or three clients immediately results in this:
and then no more work is processed at all. Regarding the number of documents, it's hard to say; I think the best we can do to start is to build in some monitoring capability, so that it's more evident when adding resources is necessary. I will play around with your second idea though, thanks!
Do you have a short reproducer? If it's a Python bug, it may be worth reporting it...
I believe you can just change the thread pool to a process pool with what is currently checked in, and start up several clients at once, to reproduce.
Please permit me to summarise my understanding of the architecture and flow here, so that my misunderstandings can be squashed earlier rather than later. The whole system revolves around the BokehServer Tornado application. It receives input from clients (which can, e.g., create and update Documents) and from BokehJS in users' browsers (which can request to view documents and pass back user actions in a view). It provides views and updates back to BokehJS, and can request actions by clients which have created documents and are still connected. The server holds a set of Documents, which may have different states in each session (browser view). Is this along the right lines?
@martindurant yes, that is basically on the right track. One thing to clarify is that a session (containing a view of documents) is basically an in-memory cache for a browser tab. That's one of the nice things about using web sockets: you just automatically get this 1-1 correspondence. There are complications; we are trying to figure out what to do about "callbacks" that a user might want to specify to do actual work in response to UI events. These obviously might involve CPU work. After a discussion with @havocp today we have a few (hopefully) clarifying ideas. I am going to try to get those into a Wiki document for wider review tonight.
…t and hplot with VBox and HBox to avoid auto add to plotting.curdoc
…OTE: population and taylor are still failing for other reasons
remove superfluous server examples
…izer I had somehow reinvented Property.from_json in a very broken way, when Property already had exactly the right method. This should fix tuples with PlotObject members, and also no doubt a number of other cases.
Python default out of sync with js
an integer cannot be a callback arg, must be plot object. pass in nlines into code directly. also use the renderer that's returned from calling line()
clean-up ajax_source pylint - excuse to make an examples commit
two questions about selection_histogram.py example - https://github.com/bokeh/bokeh/blob/tornado/examples/plotting/server/selection_histogram.py
Edit - I think I know the answer to the first question:
The two browser tabs are synced because the session id is in the URL, giving them the same session. It has to be that way because the example works by having a Python client also talk to this same session. Remove the session ID and you should get a blank page (because there is no Python client to fill it in).

This whole setup is for local messing around; it isn't a production-multiuser-server-compatible setup. For a real app you would want to run the selection_histogram.py code in the server rather than as a client, which is the "bokeh serve foo.py" way. Ultimately we should port most examples to that way, and only leave a couple maybe that show you can have a Python client talking to a fixed session id. But for step 1 I guess we're trying to keep the examples more similar to how they worked originally.
The main reason we have this single-session behavior, if I remember right, is for use with a local notebook, or for something like a dashboard display or demo that runs on a single computer.
We don't have a Set property type for now
This fixes the box_annotation.py example
This PR is intended to be a place to have concrete discussions about technical points in the new server development. I intend to submit a new PR with all commits squashed, when this is ready to merge.
Notes
To see some basic operation with messages passed back and forth from client and server, run
and
The class/logic in `client.py` is eventually intended to be rolled into `bokeh.session`, and then real control and updating of the server will be possible with no more ugly polling, ever.

Concepts

- `WSHandler` — takes individual message fragments and passes them to `Receiver` until a message is completed, then passes the message to a handler
- `Receiver` — responsible for assembling message fragments into complete messages
- `Message` — defines operations on the server or client
- `Protocol` — defines a specific collection of message revisions and helps automate their creation
- `ServerSession` — an in-memory document cache for one client connection (e.g., a single browser tab)
- `ServerDocument` — encapsulates a Bokeh Document on the server; manages a work queue that serializes any storage operations on the document

Wire Protocol
Modeled loosely after the IPython protocol. A message comprises these sub-parts, sent as individual websocket messages:
New Document Idea

The basic idea is for `Document` to be a data structure composed of three parts: `variables`, `models`, `stores`:

Various use-cases are possible:

- `vars`, `models`, `stores` (updates get pushed to sessions)

Additionally there should be some provision or mode for clients to store changes that they push to a session back to the persistent backend. Need to distinguish between changes pushed to the session cache vs. changes intended to be saved. GH commit vs. push analogy?
Questions

- `Message` subclasses have `handle_server` and `handle_client` methods on them, which are intended to be used by the server and client respectively to handle incoming messages. Would it be worthwhile to factor out the handlers? There are definitely some nice reasons to do that, but it would also spread out the codebase somewhat even more.
- A `create` method on the `Protocol` picks out the right message and calls `create` on the message class. It just passes on any required params as `args` and `kwargs`, but the user who calls `create` has to just know what any additional params are for the specific message type. Maybe there is a better way?
- `subprocess`, etc., though perhaps there might be some tighter IPython integration possible.
- The `Protocol` is defined in an `__init__`; there's probably a better arrangement. Source layout suggestions welcome in general.

Todo
- `Session` integration