Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tornado 5.0 compatibility #7308

Closed
mrocklin opened this issue Dec 10, 2017 · 23 comments · Fixed by #7329
Closed

Tornado 5.0 compatibility #7308

mrocklin opened this issue Dec 10, 2017 · 23 comments · Fixed by #7329

Comments

@mrocklin
Copy link
Contributor

Note that the next verion of Tornado, 5.0, will include breaking changes which break the existing Bokeh test suite.

@bryevdv
Copy link
Member

bryevdv commented Dec 13, 2017

Is there a description of the changes somewhere?

@bryevdv bryevdv added this to the 0.12.x milestone Dec 13, 2017
@bryevdv
Copy link
Member

bryevdv commented Dec 13, 2017

Well the majority of failures (though not all I don't think) seem to be this:

TypeError: initialize() got an unexpected keyword argument 'io_loop'

So let's start there. @bdarnell do you have any comments or guidance? Are there any migration notes available?

@bryevdv
Copy link
Member

bryevdv commented Dec 13, 2017

Well, it looks like this change to HTTPServer:

.. versionchanged:: 5.0
   The ``io_loop`` argument has been removed.

I'm not sure what the appropriate change to make is, though.

@mrocklin
Copy link
Contributor Author

I think that whenever you explicitly provide an io_loop= keyword the thing to do is to not provide that keyword, and instead ensure that you create the object in the right event loop. Given that Bokeh doesn't do particularly odd things I suspect that this is always the case, and that you can remove these keywords without concern.

@bryevdv
Copy link
Member

bryevdv commented Dec 13, 2017

Unfortunately it doesn't seem to be as simple as that. Doing that causes several tests to hang completely.

@bryevdv
Copy link
Member

bryevdv commented Dec 13, 2017

FWIW bokeh serve app invocations seem to work, but tornado_embed.py, e.g. does not:

❯ python tornado_embed.py
Traceback (most recent call last):
  File "tornado_embed.py", line 47, in <module>
    server = Server({'/bkapp': modify_doc}, num_procs=4, extra_patterns=[('/', IndexHandler)])
  File "/Users/bryan/work/bokeh/bokeh/server/server.py", line 398, in __init__
    http_server.start(opts.num_procs)
  File "/Users/bryan/anaconda/lib/python3.6/site-packages/tornado-5.0.dev1-py3.6-macosx-10.7-x86_64.egg/tornado/tcpserver.py", line 222, in start
    process.fork_processes(num_processes)
  File "/Users/bryan/anaconda/lib/python3.6/site-packages/tornado-5.0.dev1-py3.6-macosx-10.7-x86_64.egg/tornado/process.py", line 130, in fork_processes
    raise RuntimeError("Cannot run in multiple processes: IOLoop instance "
RuntimeError: Cannot run in multiple processes: IOLoop instance has already been initialized. You cannot call IOLoop.instance() before calling start_processes()

As for the tests we do manage io_loops explicitly in a way I only vaguely understand. Is there an expected timeframe for 5.0? I think in the short term may only be able to update the dependencies to <5.0 for bokeh

@bdarnell
Copy link

There's no strict timeline for Tornado 5.0 yet; I want to work with projects like bokeh to make sure the new version will work for you before it's released.

As @mrocklin says, the io_loop arguments have been removed because we're now stricter about everything running on the proper thread, and these arguments mainly give you a way to mess that up (and implementing them consistently with asyncio is tricky). Can you point to the specific tests that are failing, or where that explicit IOLoop management happens? The places that I looked at all seemed to be doing the right thing (mainly calling make_current()). I tried running the tests and after wrestling with dependencies all of the "unit and not selenium" tests are passing.

The tornado_embed.py problem looks like a regression - I think I may have broken this entry point to multi-process mode.

@bryevdv
Copy link
Member

bryevdv commented Dec 14, 2017

@bdarnell First, please let me say thank you for your help with this.

First here is a diff I'm starting with

❯ git diff
diff --git a/bokeh/server/server.py b/bokeh/server/server.py
index ba5243730..2e80103df 100644
--- a/bokeh/server/server.py
+++ b/bokeh/server/server.py
@@ -393,10 +393,7 @@ class Server(BaseServer):
         try:
             tornado_app = BokehTornado(applications, extra_websocket_origins=extra_websocket_origins, prefix=self.prefix, **kwargs)

-            if io_loop is not None:
-                http_server = HTTPServer(tornado_app, io_loop=io_loop, **http_server_kwargs)
-            else:
-                http_server = HTTPServer(tornado_app, **http_server_kwargs)
+            http_server = HTTPServer(tornado_app, **http_server_kwargs)

             http_server.start(opts.num_procs)
             http_server.add_sockets(sockets)
diff --git a/bokeh/server/tests/test_server.py b/bokeh/server/tests/test_server.py
index ac930aeaa..d851d8430 100644
--- a/bokeh/server/tests/test_server.py
+++ b/bokeh/server/tests/test_server.py
@@ -143,16 +143,14 @@ def test__lifecycle_hooks():
         def check_done():
             if len(handler.hooks) == 4:
                 server.io_loop.stop()
-        server_load_checker = PeriodicCallback(check_done, 1,
-                                               io_loop=server.io_loop)
+        server_load_checker = PeriodicCallback(check_done, 1)
         server_load_checker.start()
         server.io_loop.start()
         server_load_checker.stop()

         # now we create a session
         client_session = pull_session(session_id='test__lifecycle_hooks',
-                                      url=url(server),
-                                      io_loop=server.io_loop)
+                                      url=url(server))
         client_doc = client_session.document
         assert len(client_doc.roots) == 1

diff --git a/bokeh/server/tornado.py b/bokeh/server/tornado.py
index 153e64644..10903b9b7 100644
--- a/bokeh/server/tornado.py
+++ b/bokeh/server/tornado.py
@@ -232,15 +232,13 @@ class BokehTornado(TornadoApplication):
         self._clients = set()

         self._stats_job = PeriodicCallback(self._log_stats,
-                                           self._stats_log_frequency_milliseconds,
-                                           io_loop=self._loop)
+                                           self._stats_log_frequency_milliseconds)

         self._cleanup_job = PeriodicCallback(self._cleanup_sessions,
-                                             self._check_unused_sessions_milliseconds,
-                                             io_loop=self._loop)
+                                             self._check_unused_sessions_milliseconds)

         if self._keep_alive_milliseconds > 0:
-            self._ping_job = PeriodicCallback(self._keep_alive, self._keep_alive_milliseconds, io_loop=self._loop)
+            self._ping_job = PeriodicCallback(self._keep_alive, self._keep_alive_milliseconds)
         else:
             self._ping_job = None

Then if I run:

py.test -m "not (examples or integration or js or quality)" bokeh/server bokeh/command/subcommands/tests/test_serve.py

I see these two failures

_________________________________________________________ TestCallbackGroup.test_periodic_runs _________________________________________________________

self = <bokeh.server.tests.test_callbacks.TestCallbackGroup testMethod=test_periodic_runs>

    def test_periodic_runs(self):
        with (LoopAndGroup()) as ctx:
            func = _make_invocation_counter(ctx.io_loop, stop_after=5)
            self.assertEqual(0, len(ctx.group._periodic_callbacks))
            ctx.group.add_periodic_callback(func, period_milliseconds=1)
            self.assertEqual(1, len(ctx.group._periodic_callbacks))
>       self.assertEqual(5, func.count())
E       AssertionError: 5 != 0

bokeh/server/tests/test_callbacks.py:74: AssertionError
----------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------
Keyboard interrupt
----------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------
DEBUG:asyncio:Using selector: KqueueSelector
------------------------------------------------------------------ Captured log call -------------------------------------------------------------------
selector_events.py          65 DEBUG    Using selector: KqueueSelector
_______________________________________________ TestCallbackGroup.test_same_callback_as_all_three_types ________________________________________________

self = <bokeh.server.tests.test_callbacks.TestCallbackGroup testMethod=test_same_callback_as_all_three_types>

    def test_same_callback_as_all_three_types(self):
        with (LoopAndGroup()) as ctx:
            func = _make_invocation_counter(ctx.io_loop, stop_after=5)
            # we want the timeout and next_tick to run before the periodic
            ctx.group.add_periodic_callback(func, period_milliseconds=2)
            ctx.group.add_timeout_callback(func, timeout_milliseconds=1)
            ctx.group.add_next_tick_callback(func)
>       self.assertEqual(5, func.count())
E       AssertionError: 5 != 2

bokeh/server/tests/test_callbacks.py:129: AssertionError

But those tests hang completely. This is only if I hit ctrl-c to stop them. Then test_server.py hangs entirely and hitting ctrl-c exits the entire test run so I don't know yet what might be failing there.

@bryevdv
Copy link
Member

bryevdv commented Dec 14, 2017

As for explicit management I think this is about it:

https://github.com/bokeh/bokeh/blob/master/bokeh/server/tests/utils.py#L59

@bryevdv
Copy link
Member

bryevdv commented Dec 15, 2017

Jus as a heads up, ipython notebeook does not work either:

File "/Users/bryan/anaconda/lib/python3.6/site-packages/notebook/notebookapp.py", line 43, in <module>
    ioloop.install()
  File "/Users/bryan/anaconda/lib/python3.6/site-packages/zmq/eventloop/ioloop.py", line 215, in install
    ioloop.IOLoop.configure(ZMQIOLoop)
  File "/Users/bryan/anaconda/lib/python3.6/site-packages/tornado-5.0.dev1-py3.6-macosx-10.7-x86_64.egg/tornado/ioloop.py", line 182, in configure
    "only AsyncIOLoop is allowed when asyncio is available")
RuntimeError: only AsyncIOLoop is allowed when asyncio is available

@mrocklin
Copy link
Contributor Author

mrocklin commented Dec 15, 2017 via email

@bdarnell
Copy link

OK, when I said all the tests were passing, I had fat-fingered my PYTHONPATH tweaks to run against tornado master, so it was still using 4.5. Now I'm able to reproduce the behavior from #7308 (comment)

Here's a diff that gets all of those tests passing but one:

diff --git a/bokeh/server/server.py b/bokeh/server/server.py
index ba5243730..2e80103df 100644
--- a/bokeh/server/server.py
+++ b/bokeh/server/server.py
@@ -393,10 +393,7 @@ class Server(BaseServer):
         try:
             tornado_app = BokehTornado(applications, extra_websocket_origins=extra_websocket_origins, prefix=self.prefix, **kwargs)
 
-            if io_loop is not None:
-                http_server = HTTPServer(tornado_app, io_loop=io_loop, **http_server_kwargs)
-            else:
-                http_server = HTTPServer(tornado_app, **http_server_kwargs)
+            http_server = HTTPServer(tornado_app, **http_server_kwargs)
 
             http_server.start(opts.num_procs)
             http_server.add_sockets(sockets)
diff --git a/bokeh/server/tests/test_callbacks.py b/bokeh/server/tests/test_callbacks.py
index c08bf752f..05f07f645 100644
--- a/bokeh/server/tests/test_callbacks.py
+++ b/bokeh/server/tests/test_callbacks.py
@@ -31,6 +31,12 @@ def run(loop):
 class LoopAndGroup(object):
     def __init__(self, quit_after=None):
         self.io_loop = IOLoop()
+        try:
+            import asyncio
+        except ImportError:
+            pass
+        else:
+            asyncio.set_event_loop(self.io_loop.asyncio_loop)
         self.group = _CallbackGroup(self.io_loop)
 
         if quit_after is not None:
diff --git a/bokeh/server/tests/test_server.py b/bokeh/server/tests/test_server.py
index ac930aeaa..15b6a9d20 100644
--- a/bokeh/server/tests/test_server.py
+++ b/bokeh/server/tests/test_server.py
@@ -143,8 +143,7 @@ def test__lifecycle_hooks():
         def check_done():
             if len(handler.hooks) == 4:
                 server.io_loop.stop()
-        server_load_checker = PeriodicCallback(check_done, 1,
-                                               io_loop=server.io_loop)
+        server_load_checker = PeriodicCallback(check_done, 1)
         server_load_checker.start()
         server.io_loop.start()
         server_load_checker.stop()
diff --git a/bokeh/server/tests/utils.py b/bokeh/server/tests/utils.py
index b43c2e540..417f306d5 100644
--- a/bokeh/server/tests/utils.py
+++ b/bokeh/server/tests/utils.py
@@ -41,7 +41,7 @@ def websocket_open(io_loop, url, origin=None):
     request = HTTPRequest(url)
     if origin is not None:
         request.headers['Origin'] = origin
-    websocket_connect(request, callback=handle_connection, io_loop=io_loop)
+    websocket_connect(request, callback=handle_connection)
 
     io_loop.start()
 
@@ -60,6 +60,12 @@ class ManagedServerLoop(object):
     def __init__(self, application, **server_kwargs):
         loop = IOLoop()
         loop.make_current()
+        try:
+            import asyncio
+        except ImportError:
+            pass
+        else:
+            asyncio.set_event_loop(loop.asyncio_loop)
         server_kwargs['io_loop'] = loop
         self._server = Server(application, **server_kwargs)
     def __exit__(self, type, value, traceback):
diff --git a/bokeh/server/tornado.py b/bokeh/server/tornado.py
index 153e64644..10903b9b7 100644
--- a/bokeh/server/tornado.py
+++ b/bokeh/server/tornado.py
@@ -232,15 +232,13 @@ class BokehTornado(TornadoApplication):
         self._clients = set()
 
         self._stats_job = PeriodicCallback(self._log_stats,
-                                           self._stats_log_frequency_milliseconds,
-                                           io_loop=self._loop)
+                                           self._stats_log_frequency_milliseconds)
 
         self._cleanup_job = PeriodicCallback(self._cleanup_sessions,
-                                             self._check_unused_sessions_milliseconds,
-                                             io_loop=self._loop)
+                                             self._check_unused_sessions_milliseconds)
 
         if self._keep_alive_milliseconds > 0:
-            self._ping_job = PeriodicCallback(self._keep_alive, self._keep_alive_milliseconds, io_loop=self._loop)
+            self._ping_job = PeriodicCallback(self._keep_alive, self._keep_alive_milliseconds)
         else:
             self._ping_job = None

I'm going to change IOLoop.make_current() to call asyncio.set_event_loop for you; clients shouldn't have to deal with that. You might need to add a make_current() call in LoopAndGroup, though.

Note that I had to reintroduce the io_loop passed to pull_session. Passing None there doesn't work as expected due to this special case in ClientConnection:

if io_loop is None:
# We can't use IOLoop.current because then we break
# when running inside a notebook since ipython also uses it
io_loop = IOLoop()

The remaining failure is test__lifecycle_hooks, where the order of two hooks was reversed:

===================================================== FAILURES ======================================================
_______________________________________________ test__lifecycle_hooks _______________________________________________

    def test__lifecycle_hooks():
        application = Application()
        handler = HookTestHandler()
        application.add(handler)
        with ManagedServerLoop(application, check_unused_sessions_milliseconds=30) as server:
            # wait for server callbacks to run before we mix in the
            # session, this keeps the test deterministic
            def check_done():
                if len(handler.hooks) == 4:
                    server.io_loop.stop()
            server_load_checker = PeriodicCallback(check_done, 1)
            server_load_checker.start()
            server.io_loop.start()
            server_load_checker.stop()
    
            # now we create a session
            client_session = pull_session(session_id='test__lifecycle_hooks',
                                          url=url(server),
                                          io_loop=server.io_loop)
            client_doc = client_session.document
            assert len(client_doc.roots) == 1
    
            server_session = server.get_session('/', client_session.id)
            server_doc = server_session.document
            assert len(server_doc.roots) == 1
    
            client_session.close()
            # expire the session quickly rather than after the
            # usual timeout
            server_session.request_expiration()
    
            def on_done():
                server.io_loop.stop()
    
            server.io_loop.call_later(0.1, on_done)
    
            server.io_loop.start()
    
>       assert handler.hooks == ["server_loaded",
                                 "next_tick_server",
                                 "timeout_server",
                                 "periodic_server",
                                 "session_created",
                                 "next_tick_session",
                                 "modify",
                                 "timeout_session",
                                 "periodic_session",
                                 "session_destroyed",
                                 "server_unloaded"]
E       assert ['server_load...session', ...] == ['server_loade...session', ...]
E         At index 6 diff: 'timeout_session' != 'modify'
E         Full diff:
E         ['server_loaded',
E         'next_tick_server',
E         'timeout_server',
E         'periodic_server',
E         'session_created',
E         'next_tick_session',
E         +  'modify',
E         'timeout_session',
E         -  'modify',
E         'periodic_session',
E         'session_destroyed',
E         'server_unloaded']

bokeh/server/tests/test_server.py:174: AssertionError

This is a known consequence of the move to asyncio: the timing of callbacks in some situations has changed. If this is significant for you I can look into this further, but otherwise I would suggest relaxing this test.

@bryevdv bryevdv mentioned this issue Dec 17, 2017
3 tasks
@bryevdv
Copy link
Member

bryevdv commented Dec 17, 2017

@bdarnell thanks, I have made a PR with your initial diff to help focus work and iteration. A few observations on my machine:

  • With your diff and tornado 4, many tests fail with

    >           asyncio.set_event_loop(loop.asyncio_loop)
    E           AttributeError: 'KQueueIOLoop' object has no attribute 'asyncio_loop'
    

    I believe your recent proposed PR to Tornado would solve this? But wanted to confirm.

  • bokeh serve apps under example/apps seem to work (spot testing)

  • none of the example under howto/server_embed seem to work anymore. It's possible we will need to maintain different guidance for embedding with 4.x vs 5? That's fine if so but if there's a way to make them work identically that would be a plus

  • All unit tests pass for me except this one, which actually hangs and I have to Ctrl-C:

def test__yield_for_all_futures():
        loop = IOLoop()
        loop.make_current()

        @gen.coroutine
        def several_steps():
            value = 0
            value += yield async_value(1)
            value += yield async_value(2)
            value += yield async_value(3)
            raise gen.Return(value)

        result = {}
        def on_done(future):
            result['value'] = future.result()
            loop.stop()

        loop.add_future(yield_for_all_futures(several_steps()),
                        on_done)

        try:
            loop.start()
        except KeyboardInterrupt:
            print("keyboard interrupt")

>       assert 6 == result['value']
E       KeyError: 'value'
bokeh/util/tests/test_tornado.py:38: KeyError

@bdarnell
Copy link

I believe your recent proposed PR to Tornado would solve this? But wanted to confirm.

Correct. That whole block goes away.

none of the example under howto/server_embed seem to work anymore. It's possible we will need to maintain different guidance for embedding with 4.x vs 5? That's fine if so but if there's a way to make them work identically that would be a plus

I haven't dug into this yet but I believe I'll be able to make it work the way it did before (at least as long as there's not a deeper issue lurking behind the multiprocess issue)

I think the test__yield_for_all_futures failure is the same issue as before (and this test just wasn't included in my command line). The same patch (tornadoweb/tornado#2218) should fix it (by changing make_current()). I'll try it again once that patch is merged.

@bryevdv
Copy link
Member

bryevdv commented Dec 18, 2017

@bdarnell great I pulled the new changes and updated the PR to remove the explicit asyncio blocks. I believe all tests are passing now on both 4.x and 5.0dev. The examples/app spot testing still seems favorable, it's just the server_embed ones that seem problematic, with messages like

   raise RuntimeError("Cannot run in multiple processes: IOLoop instance "
RuntimeError: Cannot run in multiple processes: IOLoop instance has already been initialized. You cannot call IOLoop.instance() before calling start_processes()

(there may be other messages in other examples) So perhaps just some usage on our end to tighten up.

@bdarnell
Copy link

The server_embed issues are my fault: tornadoweb/tornado#2220 should fix them (I think - my bokeh venv appears to have bit-rotted somehow so I'm currently unable to test them)

@bryevdv
Copy link
Member

bryevdv commented Dec 19, 2017

@bdarnell great, I look forward to testing it out. On another issue, do you expect to make 5.0 pre-releases available any time? At some point we will need to add a build stage to test with Tornado 5.0

@bdarnell
Copy link

Yes, I've just uploaded an alpha to pypi: https://pypi.python.org/pypi/tornado/5.0a1

@bryevdv
Copy link
Member

bryevdv commented Dec 28, 2017

@bdarnell fantastic, I will start to figure out how to include it in our CI tests.

@bdarnell
Copy link

And there's now a beta, 5.0b1.

I've been filing this issue on a number of other projects; since you already have one open I'm including the message here.


Tornado 5.0 is currently in beta, and I'm filing issues with prominent
users of Tornado to give them a chance to test it before release and
provide feedback on the changes and make adjustments as necessary. You
can install it with pip install --pre tornado==5.0b1. I expect to
release the final version of Tornado 5.0 in early March, about a month
away.

Full release notes are
here. The
major change is that the IOLoop is now always backed by asyncio on
Python 3, which leads to various subtle changes at runtime but is
mostly compatible with previous versions. A more superficial but
source-incompatible change is that the io_loop argument to many
functions in Tornado has been removed and IOLoop.current() must be
used instead.

If you've already tested with Tornado 5.0b1 (or temporarily pinned
your dependency to 4.x so you can postpone dealing with this), feel
free to close this issue.

@mrocklin
Copy link
Contributor Author

Dask tests show the following bokeh/tornado issues

Traceback (most recent call last):
  File "/home/mrocklin/workspace/tornado/tornado/gen.py", line 831, in callback
    result_list.append(f.result())
  File "/home/mrocklin/workspace/tornado/tornado/gen.py", line 301, in wrapper
    yielded = next(result)
  File "/home/mrocklin/workspace/distributed/distributed/worker.py", line 352, in _start
    self.start_services(listen_host)
  File "/home/mrocklin/workspace/distributed/distributed/worker.py", line 313, in start_services
    self.services[k].listen((listen_ip, port))
  File "/home/mrocklin/workspace/distributed/distributed/bokeh/core.py", line 29, in listen
    **self.server_kwargs)
  File "/home/mrocklin/Software/anaconda/lib/python3.6/site-packages/bokeh/server/server.py", line 397, in __init__
    http_server = HTTPServer(tornado_app, io_loop=io_loop, **http_server_kwargs)
  File "/home/mrocklin/workspace/tornado/tornado/util.py", line 312, in __new__
    instance.initialize(*args, **init_kwargs)
TypeError: initialize() got an unexpected keyword argument 'io_loop'

Also

../../Software/anaconda/lib/python3.6/site-packages/bokeh/server/server.py:413: in __init__
    super(Server, self).__init__(io_loop, tornado_app, http_server)
../../Software/anaconda/lib/python3.6/site-packages/bokeh/server/server.py:125: in __init__
    self._tornado.initialize(io_loop)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <bokeh.server.tornado.BokehTornado object at 0x7f6d6ba5c390>
io_loop = <tornado.platform.asyncio.AsyncIOLoop object at 0x7f6d6bd14fd0>

    def initialize(self, io_loop):
        ''' Start a Bokeh Server Tornado Application on a given Tornado IOLoop.
    
            '''
        self._loop = io_loop
    
        for app_context in self._applications.values():
            app_context._loop = self._loop
    
        self._clients = set()
    
        self._stats_job = PeriodicCallback(self._log_stats,
                                           self._stats_log_frequency_milliseconds,
>                                          io_loop=self._loop)
E       TypeError: __init__() got an unexpected keyword argument 'io_loop'

Versions

In [1]: import bokeh, tornado

In [2]: tornado.version
Out[2]: '5.0b1'

In [3]: bokeh.__version__
Out[3]: '0.12.14dev6'

@bryevdv
Copy link
Member

bryevdv commented Jan 31, 2018

@mrocklin this PR has not been merged yer

@mrocklin
Copy link
Contributor Author

Ah! I missed the reference to the PR. I'll test against that. Thanks @bryevdv

@bryevdv bryevdv removed this from the 0.12.x milestone Feb 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants