Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notebook + Server Not Working #3461

Closed
bryevdv opened this issue Dec 31, 2015 · 34 comments
Closed

Notebook + Server Not Working #3461

bryevdv opened this issue Dec 31, 2015 · 34 comments

Comments

@bryevdv
Copy link
Member

bryevdv commented Dec 31, 2015

It's possible the examples just need to be updated to current practice somehow, but I suspect there is little more to it than that.

examples/plotting/notebook/animated.ipynb:
 notebook + server does not work

examples/plotting/notebook/interact_numba.ipynb:
 image update performance degraded

examples/plotting/notebook/line_server.ipynb:
 notebook + server does not work

examples/plotting/notebook/random_walk.ipynb:
 notebook + server does not work
@bryevdv
Copy link
Member Author

bryevdv commented Jan 1, 2016

OK, "got this working". Seems to boil down to:

  • now needing to open up bokeh server for connections from notebook server with --allow-websocket-origin=localhost:8888
  • session ids mismatching
  • unncessary push (is io.push good for anything? should it be removed?)

With this patch

diff --git a/bokeh/io.py b/bokeh/io.py
index d7b593f..6c8d3d0 100644
--- a/bokeh/io.py
+++ b/bokeh/io.py
@@ -299,7 +299,6 @@ def _show_file_with_state(obj, state, new, controller):

 def _show_notebook_with_state(obj, state):
     if state.server_enabled:
-        push(state=state)
         snippet = autoload_server(obj, session_id=state.session_id_allowing_none, url=state.url, app_path=state.app_path)
         publish_display_data({'text/html': snippet})
     else:

And if I explicitly add a session id to session = push_session(curdoc(), session_id='default') (as shown in (https://github.com/bokeh/bokeh/blob/notebook_server/examples/plotting/notebook/animated.ipynb) then things work as expected with . I guess this is mostly a documentation issue, but if there are ideas to make the notebook use-case closer to "just works" they are welcome

@havocp
Copy link
Contributor

havocp commented Jan 1, 2016

Removing that push seems logical since I think we push to the server elsewhere anyway right?

Could we get the session id from the server push and then set it as the state's session id and be sure we do the notebook push second so that autoload_server gets it ?

Unfortunate that the notebook is on a different port so can't use the websocket. Possible solution: when we are using the default --host, also default to allowing websocket port 8888 if that's the notebook default ... we should probably be careful not to enable that websocket origin if the --host has been changed to a real prod host though.

@havocp
Copy link
Contributor

havocp commented Jan 1, 2016

Here's an untested diff of things I saw, including your removal of push:
notebook-changes.txt

There's an "elif" in _show_with_state that I think is wrong, and I played with the idea here of saving the session id from the server push (which means _show_with_state needs to push to server first before doing notebook... which makes sense anyway so the server is ready for the notebook to load the output).
In fact with this setup maybe defaulting output_server to DEFAULT_SESSION_ID doesn't make sense any more and we could get rid of DEFAULT_SESSION_ID, I"m not sure.

Needs testing, and probably could have some unit tests too, etc.

@havocp
Copy link
Contributor

havocp commented Jan 1, 2016

Here's that thing inline, hadn't ever tried the github file attach feature, it appears to be limited.

diff --git a/bokeh/io.py b/bokeh/io.py
index d7b593f..2cdbf3f 100644
--- a/bokeh/io.py
+++ b/bokeh/io.py
@@ -280,14 +280,17 @@ def show(obj, browser=None, new="tab"):
 def _show_with_state(obj, state, browser, new):
     controller = browserlib.get_browser_controller(browser=browser)

-    comms_handle = None
+    # we show ALL enabled outputs, they are not mutually exclusive.
+
+    # show server first so if we send an autoload_server snippet
+    # to the notebook, the server will be ready.
+    if state.server_enabled:
+        _show_server_with_state(obj, state, new, controller)

+    comms_handle = None
     if state.notebook:
         comms_handle = _show_notebook_with_state(obj, state)

-    elif state.server_enabled:
-        _show_server_with_state(obj, state, new, controller)
-
     if state.file:
         _show_file_with_state(obj, state, new, controller)

@@ -299,7 +302,6 @@ def _show_file_with_state(obj, state, new, controller):

 def _show_notebook_with_state(obj, state):
     if state.server_enabled:
-        push(state=state)
         snippet = autoload_server(obj, session_id=state.session_id_allowing_none, url=state.url, app_path=state.app_path)
         publish_display_data({'text/html': snippet})
     else:
@@ -399,6 +401,7 @@ def _push_to_server(session_id, url, app_path, document, io_loop):
     session = push_session(document, session_id=session_id, url=url, app_path=app_path, io_loop=io_loop)
     session.close()
     session.loop_until_closed()
+    return session.id

 def push(session_id=None, url=None, app_path=None, document=None, state=None, io_loop=None, validate=True):
     ''' Update the server with the data for the current document.
@@ -439,7 +442,6 @@ def push(session_id=None, url=None, app_path=None, document=None, state=None, io
         app_path = state.app_path

     # State is supposed to ensure these are set
-    assert session_id is not None
     assert url is not None
     assert app_path is not None

@@ -452,8 +454,13 @@ def push(session_id=None, url=None, app_path=None, document=None, state=None, io
     if validate:
         document.validate()

-    _push_to_server(session_id=session_id, url=url, app_path=app_path,
-                    document=document, io_loop=io_loop)
+    new_session_id = _push_to_server(session_id=session_id, url=url, app_path=app_path,
+                                     document=document, io_loop=io_loop)
+    # fill in the session id we used, so _show_notebook_with_state
+    # can use it and so any future push() maybe after modifying
+    # the document further can use it.
+    if session_id is None:
+        state._session_coords.session_id = new_session_id

 def push_notebook(document=None, state=None, handle=None):
     ''' Update the last-shown plot in a Jupyter notebook with the new data
diff --git a/bokeh/resources.py b/bokeh/resources.py
index ca2ffa0..807a571 100644
--- a/bokeh/resources.py
+++ b/bokeh/resources.py
@@ -114,6 +114,10 @@ class _SessionCoordinates(object):
             self._session_id = generate_session_id()
         return self._session_id

+    @session_id.setter
+    def session_id(self, value):
+        self._session_id = value
+
     @property
     def session_id_allowing_none(self):
         """ Session ID provided in kwargs, keeping it None if it hasn't been generated yet.
diff --git a/bokeh/tests/test_io.py b/bokeh/tests/test_io.py
index 74fbd28..9d368e7 100644
--- a/bokeh/tests/test_io.py
+++ b/bokeh/tests/test_io.py
@@ -352,7 +352,6 @@ class Test_ShowNotebookWithState(DefaultStateTester):
         mock_autoload_server.return_value = "snippet"

         io._show_notebook_with_state("obj", s)
-        self._check_func_called(mock_push, (), {"state": s})
         self._check_func_called(mock_publish_display_data, ({"text/html":"snippet"},), {})

     @patch('bokeh.io.get_comms')

@bryevdv
Copy link
Member Author

bryevdv commented Jan 1, 2016

The elif is to keep it from opening a new browser tab with a server autoload in it, in addition to inlining a server autoload in the notebook. It also keeps comms and server exclusive, since the notebook server handling is only handled in _show_notebook_with_state where one branch does comms and another does server.

@bryevdv
Copy link
Member Author

bryevdv commented Jan 1, 2016

Regarding opening the websocket port, how about --notebook that is just an alias for adding an --allow-websocket-origin=localhost:8888

It's slightly more work to make users add --notebook but I'm just thinking "foo is a shortcut for bar" is easier to document than "foo does bar, except when it does baz" This would also be cleaner to implement and easier to test. We could add it as a suggestion in the "can't connect" error message too: "if you are using the Jupyter notebook, need --notebook" etc.

@pzwang do you have UX thoughts here? Do you have a strong preference?

@bryevdv
Copy link
Member Author

bryevdv commented Jan 1, 2016

Regarding the release: I'm -1 on changing the logic much but I will see about keeping track of ID's for users implicitly in the notebook case in case that is simple.

@havocp
Copy link
Contributor

havocp commented Jan 1, 2016

I think the elif needs to be only on the show-in-browser aspect and not the push ?

if you output_server and output_notebook then show has to push to the server sometime right

@havocp
Copy link
Contributor

havocp commented Jan 1, 2016

check over the docs on all these functions, whatever we do they need to be kept in sync ...

@havocp
Copy link
Contributor

havocp commented Jan 1, 2016

--notebook sounds like an ok approach

@damianavila damianavila modified the milestones: 0.11.1, 0.11.0 Jan 4, 2016
@damianavila damianavila modified the milestones: 0.11.1, short-term Feb 5, 2016
@bryevdv bryevdv modified the milestones: 0.12, short-term Apr 15, 2016
@bryevdv bryevdv assigned bryevdv and unassigned havocp Apr 15, 2016
@MulticastMatt
Copy link

Issue here: Sever doesn't work in conjunction with the notebook. If you want to add Bokeh widgets that trigger callbacks, it doesn't work. There are some suggested ways to get it working, but they're hacking. How do we want to startup the Bokeh server? There are a lot of plumbing issues. We want to do tests. Selenium--open up a notebook and automated.

Did this work with the old server? Yes, somewhat. You had to start the server separately, etc.

Some groundwork has been laid here. We want documentation and examples as well

Sarah - Is the notebook going to restart the server?
Bryan - Out of order execution, build up something incrementally, combine widgets with plot, etc. There are technical questions in getting it working overall.

Bryan - The goal is getting it to work in the notebook and what should it look like.
Fabio - not talking about publishing it out.

Priority - This is important as it will let us turn notebooks into apps. High priority from Peter.

@MulticastMatt
Copy link

MulticastMatt commented Apr 22, 2016

Team decided that this is a 40 (the maximum) and decided it would make sense to break this down.

Sarah's suggestion - the first person that goes through this should be breaking it down. Bryan 👍

@bryevdv bryevdv removed this from the 0.12 milestone Jul 8, 2016
@bryevdv
Copy link
Member Author

bryevdv commented Dec 20, 2016

Here is a notebook that demonstrates piggybacking a bokeh server on the notebook ioloop, and displaying inline:

https://gist.github.com/bryevdv/ff84871fcd843aceea4f0197be9c57e0

Things that would be needed to consider this "done"

  • some convenience functions to wrap some of the boilerplate
  • some thought on an API that would minimize "bad" situations (i.e. some cells in that notebook cannot be re-executed, is there API that could precent such situations?)

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

Right now the major problem with re-running is that in the example/proof of concept the server is started as normal by giving a port, etc:

from bokeh.server.server import Server
server = Server({'/': app}, io_loop=loop, port=5002, allow_websocket_origin=["127.0.0.1:8888"])
server.start()

Re-running blows up with a "port in use" error. Instead of having to give a port and websocket origin like this, perhaps there an explicit NotebookServer subclass that will connect BokehJS and the server via notebook comms with just the arguments:

server = NotebookServer({'/': app}, io_loop=loop)
server.start()

I believe the basic operation would need to create one of these for every output cell that shows/embeds an app, so there would need to be an API that wraps up taking an app, creating a server for it, and displaying the autoload HTML. Then also, if the notebook comm is broken, the server should remove itself from the IOloop.

Something like:

from bokeh.embed import notebook_app

bokeh_app = Application(handler)

# This will detect that a Bokeh app is passed in. Internally it will:
* create a server connected via notebook comms with a URL based on output cell number
* start it
* display an autoload_server script that loads that URL
show(bokeh_app)

Does that seem reasonable? Alternatively could have a completely new function and not show but overloading show to detect apps seems OK to me.

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

An alternative general approach would be to allow apps to be loaded removed from one server "on the fly" but given that a server is really just some tornado callbacks (that are otherwise idle unless events happen) hitching onto the existing IOLoop, it seems much simpler to just create them and tear them down as needed.

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

Another specific approach instead of implementing a notebook comms connection would be to add auto-port selection when there is a conflict, and just leave old (unused) servers to accumulate. Certainly that might be a quicker "first cut".

As a concrete, hacky example, this something like this:

def show_app(app):
    from bokeh.server.server import Server
    PORT = 5006 
    while True:
        try:
            server = Server({'/': app}, io_loop=loop, port=PORT, 
                            allow_websocket_origin=["127.0.0.1:8888"])
            server.start()
        except OSError:
            PORT += 1
        finally:
            break;
    
    from IPython.display import HTML, display
    script = autoload_server(model=None, url='http://127.0.0.1:%d' % PORT)
    display(HTML(script))

Then this cell to show:

show_app(app)

such a notebook can be "Run All Cells" repeatedly:

screen shot 2017-01-07 at 2 40 21 pm

@mrocklin @pzwang do you have any thoughts on preferences?

@havocp
Copy link
Contributor

havocp commented Jan 7, 2017

auto-port-selection already exists doesn't it, should be something like:

server = Server() # do NOT specify port
PORT = server.port

You don't need the loop trying a bunch of ports by hand.

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

auto-port-selection already exists doesn't it

Not that I am aware of.

screen shot 2017-01-07 at 2 43 02 pm

I'm definitely not saying that code is a good long term solution. I am saying it's temporary guidance we could offer now (for the monday release) with a big "there be improvements later that simplify this" disclaimer. And then secondarily, wondering what those improvements and simplifications should look like

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

I guess the main question is: How bad is just accumulating "started" servers on the notebook ioloop at least, for a first cut? If it is not terrible, then I could probably implement port collision handling and update show to detect bokeh apps in a few hours today.

@havocp
Copy link
Contributor

havocp commented Jan 7, 2017

Try port=0

I'm pretty sure there's a way to do it. Unit tests may even do it.

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

@havocp thank you :)

OK so this works:

def show_app(app):
    from bokeh.server.server import Server
    server = Server({'/': app}, io_loop=loop, port=0, host='*', allow_websocket_origin=["127.0.0.1:8888"])
    server.start()
    from IPython.display import HTML, display
    script = autoload_server(model=None, url='http://127.0.0.1:%d' % server.port)
    display(HTML(script))

So question: is accumulating these Server objects on re-run reasonable, at least temporarily?

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

I guess the answer is: it depends.

If the app has periodic callbacks, then probably not. I'm not sure how best to detect that a server should be stopped tho. Not sure if the notebook has cell hooks that could be used to detect when a previously shown app is overwritten with a new cell, otherwise the "detect closed notebook comms" might be the only viable approach.

@havocp
Copy link
Contributor

havocp commented Jan 7, 2017

It seems ok to me probably? Is it a server per cell or a server per re-run? Anyway a memory leak beats not working I suppose. But I'm not confident.

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

It's a server per output cell that is created with show_app

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

it's not really the accumulated server thats a potential issue though, it's the apps. Whether we create a server per app every show, or make a way to dynamically install apps in to a single global "notebook" server every show, the case where things really need to be torn down is if there are periodic callbacks in an app. Accumulating those probably has a far higher likelihood of "being bad".

@mrocklin
Copy link
Contributor

mrocklin commented Jan 7, 2017

I was about to suggest adding a new app to an already running server.

Also, I've found that Tornado's random port=0 solution is mostly effective, but can still collide with existing ports.

In Dask we do this:

    def listen(self, port=None):
        if port is None:
            port = self.default_port
        while True:
            try:
                super(Server, self).listen(port)
                break
            except (socket.error, OSError):
                if port:
                    raise
                else:
                    logger.info('Randomly assigned port taken for %s. Retrying',
                                type(self).__name__)

https://github.com/dask/distributed/blob/master/distributed/core.py#L148

@havocp
Copy link
Contributor

havocp commented Jan 7, 2017

Random port assignment should be done by the OS kernel, should not fail. Sort of a side topic for this issue though. Maybe a tornado issue who knows.

@mrocklin
Copy link
Contributor

mrocklin commented Jan 7, 2017

In principle I totally agree. However in practice we definitely needed to add the while loop in the code above. We create and tear down thousands of concurrent TCPServers for our tests. This issue comes up if we don't address it.

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

I was about to suggest adding a new app to an already running server.

The reasons I prefer the other route is that I've been down the "try to maintain a global state thing in a notebook" already and its not fun. I'd rather 1) avoid that bookkeeping and 2) not add more complexity to the server unnecessarily

@havocp
Copy link
Contributor

havocp commented Jan 7, 2017

I'm curious now why bind fails, I guess strace might turn it up. EINPROGRESS could be possible according to http://pubs.opengroup.org/onlinepubs/9699919799/functions/bind.html don't know if EINTR is another possibility. Anyhow... sorry to hijack the issue :-) guess this could be its own issue. Your workaround loop might be worth adding in bokeh itself in server.py ...

@bryevdv
Copy link
Member Author

bryevdv commented Jan 7, 2017

@mrocklin can you open a separate issue? I think its worth investigating (and possibly adding protection against in server.py) but I think your "thousands of concurrent TCPServers" is at the end of the usage spectrum so can be handled separately from this

@mrocklin
Copy link
Contributor

mrocklin commented Jan 7, 2017

Sorry, meant to say "set up a few concurrent TCPServers thousands of times." We only employ hundreds of concurrent servers at once in a few cases. This isn't an issue for you until it becomes an issue, just thought I'd bring it up. I'd tend to pass on making it a github issue until you actually decide to go down this path.

Regarding global state and the notebook. I understand the concern and I've never tried to do anything like this in the notebook before. Still, I'll go ahead and say the approach that seems intuitive to me. I would make a global singleton bokeh notebook server that gets instantiated the first time it's called (similar to IOLoop.current()). I would add isolated apps with unique names to this as necessary.

name = str(uuid.uuid4())
get_bokeh_server().apps[name] = MyNewApp()

I'm not sure how much state or interaction between apps there is in the Bokeh server however, so this may be naive.

@bryevdv
Copy link
Member Author

bryevdv commented Mar 17, 2017

Additional work can go in new issues/PRs

Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants