Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a Future constructor is missing the loop kwarg #747

Open
petri opened this issue Nov 2, 2016 · 20 comments
Open

a Future constructor is missing the loop kwarg #747

petri opened this issue Nov 2, 2016 · 20 comments

Comments

@petri
Copy link

petri commented Nov 2, 2016

There's a Future created by the _consume method of autobahn.asyncio.websocket.WebSocketAdapterProtocol.

It is not passing the loop kwarg, which leads to NotImplementedErrorwhen the implicit default 'global' loop is not being used. If I understand correctly.

I am guessing it's possible to reproduce this simply by creating a custom loop, rather than asking for the default via asyncio.get_event_loop(), and passing that to the connection factory.

I encountered the issue when creating what amounts to an autobahn websocket connection test using py.test (with help from the pytest-asyncio package that provides some useful asyncio testing fixtures). To repeat that, write a test function to make a websocket connection so that:

  • the test is decorated with the pytest.mark.asyncio decorator, with the forbid_global_loopparameter set to True)
  • the event_loop fixture is used for the test, and explicitly used as the loop passed to autobahn

So the 'stanza' for such a test becomes like this:

from pytest import mark

@mark.asyncio(forbid_global_loop=True)
async def test_ws_connect(event_loop):
   "some test that connects"
...

Then, upon connect, the NotImplementedError will be raised.

As for fixing the issue, the loop appears to be available in the protocol as self.transport._loop, and I can confirm that passing it as loopkwarg fixes the issue.

@meejah
Copy link
Contributor

meejah commented Nov 2, 2016

Hmm, yeah that's wrong. It actually should be using txaio.create_future(), probably? (Although, inside autobahn.asyncio.* we can often get away with using asyncio-specifics).

If we keep the bare Future(), it should probably use txaio.config.loop in any case...which then makes the "loop=" kwarg a bit of a lie. I'm not actually sure why this websocket code isn't using txaio.create_future and friends already ...

Thanks for the pointers on your unit-test. We're not using pytest in this particular case, but do have a similar test-helper in txaio.testutil.replace_loop https://github.com/crossbario/txaio/blob/master/txaio/testutil.py#L33

@petri if you can post your code somewhere that'd be ideal :) (For your use-case, replacing "the" loop globally via txaio would work, correct? i.e. you don't actually need different loops per websocket listener/client?)

@soulmerge
Copy link

I was having the exact some problem, but I think the loop needs to be referenced as self.factory.loop. Added a pull request that should resolve this issue.

@meejah
Copy link
Contributor

meejah commented Nov 3, 2016

Yes, #748 may be "a" stop-gap solution, but we need some tests for this and also need to figure out the implications of not using the txaio stuff -- that is, most of Autobahn uses txaio so that we can have mostly the same code across Twisted and asyncio. The problem here is that this loop= code bypasses that.

Several options:

  • only use the loop= loop
  • if no loop= is passed, use txaio's loop (via txaio.config.loop) if one is configured
  • just always use txaio directly (i.e. no bare Future(..) use, switch to txaio.create_future() like elsewhere). This option would imply removing the loop= kwarg to these factories, IMO, which may be a compatibility problem.

@meejah
Copy link
Contributor

meejah commented Nov 3, 2016

I have a use-case question, too: does anyone need a loop-per-WebSocket* factory, or is the ability to change "the" loop sufficient?

txaio is basically a lowest-common-denominator library between asyncio and Twisted -- and since Twisted only has "a" reactor loop currently, so does txaio.

@petri
Copy link
Author

petri commented Nov 3, 2016

I thought txaio is only meant to be used to provide an abstraction layer so code can run unmodified both on asyncio and twisted? So why would it be needed in the asyncio-specific package?

@meejah
Copy link
Contributor

meejah commented Nov 3, 2016

Yes, we can use asyncio-only APIs inside this asyncio-specific package. However, txaio has its own concept of "the" event loop and this stuff bypasses all that -- so needs some consideration.

That is, if you set up an event loop in txaio and then run this you'll be sad that your event-loop isn't getting used by these websocket connections. (Also, yields() in particular is exactly the same as txaio.is_future() but that's relatively minor)

@meejah
Copy link
Contributor

meejah commented Nov 3, 2016

What I think we should do is: remove the loop= kwarg entirely, and fix this code so it uses the txaio-configured event-loop (if available) or the default event loop otherwise (whether through txaio APIs or straight Future() calls). Also add test-coverage.

...BUT that removes a currently-existing public API is mostly why I added the needs-discussion tag.

@laz2
Copy link

laz2 commented Nov 3, 2016

Explicit pass of loop is de facto for asyncio applications. So another hard option:

  • refactor txaio for instantiate configured object with loop or reactor
  • call methods on this object instead on txaio module

@meejah
Copy link
Contributor

meejah commented Nov 3, 2016

Well, we can't have txaio.create_future(loop=..) because Twisted doesn't support that -- this is why txaio has one event-loop configured globally.

But, I guess you're suggesting some sort of intermediate object that gets passed a loop (and in Twisted case I guess it would just have to throw an exception if the loop isn't "the one true reactor"). e.g. something like txaio.with_loop(loop) or something, and the object that returns has all the API methods of txaio (create_future, etc).

I still ask, though: is there really a use-case for N loops? Or is "default loop or 1 explicit loop" sufficient? Can you point me to examples of asyncio code that use > 1 loop (and why they do that)? The use-case in this particular issue is for testing, which seems completely sufficient to have just one replacement loop.

Another possible solution:

  • use asyncio-specific code in this module (i.e. Future(loop=) directly
  • make sure that if no loop is passed (or None), we use txaio.config.loop (if it's not None)

This would, hopefully (and, should have tests!) mean that someone who never passed loop= but did use txaio to set an explicit loop would still get that txaio-known explicit loop but people who either currently use the loop= API or who need > 1 loop can do so. I still think this will lead to odd situations and weird edge-cases, though: any "common" code in the generic stuff will be using txaio.* and only see the one txaio loop (e.g. any code in autobahn.websocket.protocol).

@soulmerge
Copy link

I don't know about the cross-platform compatibility, but the autobahn.asyncio.websocket.WebSocketServerFactory (which is a dedicated class for use with asyncio) is explicitly accepting a loop keyword in its constructor and the usage of the Future() without a loop keyword breaks the promise of that factory's constructor.

As for an example usage with multiple event loops: I am currently working on a deployment script that can manage multiple services, each running in a separate thread, and each service (possibly) using different main-loop implementations. So this library needs to be able to handle a deployment of (I'm making up an example scenario here) four different services, where one service implements a threaded web-server where the main thread just accepts connections and spawns a new thread for each connection, an FTP service that makes use of an ftp library that operates using calls to select.epoll(), and two other services that use asyncio, but possibly using different helper libraries.

The library in this scenario has no clue whatsoever what technologies are being used for each service and the services don't know about each other's implementations, either. This is why each of these service implementations must be able to manage a dedicated event loop in my case.

I encountered this bug during the implementation of an autobahn-powered websocket service.

@meejah
Copy link
Contributor

meejah commented Nov 3, 2016

I agree this class is breaking the implied contract from the loop= kwarg.
It's also breaking the contract of "I set up an alternate event-loop in txaio".

BOTH of the above need fixing, IMO. Possibly this means augmenting txaio to deal with multiple event-loops.

Is your example scenario covered if "all the Autobahn WebSockets use the custom event loop you gave to txaio"? (e.g. by setting txaio.config.loop = custom_loop)

@soulmerge
Copy link

Not really. The issue for me is that there is a global event loop, which prevents me from running multiple loops in different threads.

Actually my requirements are even more restrictive than that: I cannot even work with a thread-local loop, since the loop might receive new tasks from other threads.

I'm aware, that these are pretty unusual constraints.

@oberstet
Copy link
Contributor

oberstet commented Nov 4, 2016

Here is my view on this:

  • there are valid, though probably mostly uncommon use cases for multiple loops
  • Twisted made a historic design mistake in not allowing for multiple/different event loops, asyncio is doing it right
  • allowing Twisted to work with multiple event loops (reactors) would probably be a pretty big effort, so I wouldn't hold my breath

In principle there seem to be 3 options of what we could do:

a) remove loop argument in Autobahn factories, use the txaio configured (global) loop (which is fixed for Twisted, and which could be changed for asyncio - but globally) - I think this is what @meejah suggests
b) add loop as an argument to all Autobahn factories, also for the Twisted ones, but for Twisted bail out if that is not "the one true reactor" - also hinted at by @meejah
c) let Autobahn interfaces for Twisted vs asyncio classes diverge (like today)


a) covers more use cases: it would allow switching the one global event loop, but it does not allow multiple loops concurrently - neither in one and the same, nor in different threads (unless the loop in txaio would be stored in thread local storage - but that seems to be a can of worms).

b) covers all use cases (for asyncio). However, it would trigger a lot of change in Autobahn both in impl. and interfaces - with practically zero gain for Autobahn itself, and for most use cases.

c) is obviously not so cool. we already have a few places diverged (ApplicationRunner etc), and this doesn't feel right - it's a goal conflict: we want to paper over differences in Twisted/asyncio, without rewrapping everything and expose all framework specific extra features. Mission impossible.

In any case, I think this is an important discussion. Right now, I tend to a) ..

@meejah
Copy link
Contributor

meejah commented Nov 4, 2016

I hadn't really considered a "thread-local" txaio loop (which obviously would just be "the" reactor for Twisted). I wonder: would this cover @soulmerge's use-cases?

Twisted I believe realizes the mistake of a global loop (and recent suggested-practice is to take a "reactor" argument for things that require it) -- but yes, it'll be a looong road to multiple reactors "actually" working in Twisted. I think the real mistake here is having an implicit reactor (which, unfortunately, asyncio also has).

Pragmatically, right now the only public API in Autobahn that accepts an extra loop= argument are the WebSocket factories -- and the merged #750 goes a decent way to at least making this "sort-of work when you squint" vs. txaio etc. (i.e. it fixes the original issue in this ticket). I don't believe anyone has ever asked for this from the WAMP factories.

As a refinement of "c" (basically @oberstet implies this mostly) we could do this: allow a loop= kwarg to txaio.create_future() (on all platforms). On Twisted, you get an exception if loop is not reactor or similar. On asyncio, one of two things could happen: either it's just blindly passed on to the Future() call using any global loop as a backup/default or it checks if there's a global txaio loop configured (and fails if you've configured a loop and passed in a loop). Then, we could selectively add ways to do custom loops for WebSocket (e.g. leave as now, probably) or WAMP protocols (would also take consideration for the Connection class -- which is probably where it make sense to do this for the WAMP side, instead of [or in addition to] loop= args on the WAMP factories).

I'd still like to see actual open-source asyncio code that does really use multiple loops ;)

(Personally, I'd recommend strongly against mixing threads and async code -- especially with WAMP, where you already have good tools to go multi-process which a) avoids confusion and b) gives way better performance). That said, asyncio does give you multiple event-loops out of the box (which is nice) and obviously people are using these even without threads.

@oberstet
Copy link
Contributor

oberstet commented Nov 4, 2016

Personally, I'd recommend strongly against mixing threads and async code -- especially with WAMP

Absolutely. (few exceptions, eg implicit background thread pools to use block database client libraries)

Thread local storage: I was more mentioning this for the sake of completeness .. I would strongly argue against introducing that into txaio. Thread local storage can get tricky, very hard to debug and in particular when combined with event loops, the GIL, etc - so no, pls no;)

As far as I understand @soulmerge , it wouldn't be enough anyway: he wants multiple concurrently running loops in one thread - but I might be wrong here.

As a refinement of "c" (basically @oberstet implies this mostly) we could do this: allow a loop= kwarg to txaio.create_future() (on all platforms). ....

I like that: essentially, first add support for this in txaio (where changes are required anyway), and then selectively see where we use / expose that in Autobahn.

Where else within txaio interfaces would we need to add that? Essentially everything refering a loop/reactor in the implementation needs the additional kwargs.

Minor: should we go into that direction, a trivial, but annoying question: what's the parameter name? loop, reactor, loop_or_reactor, ..?

I'd still like to see actual open-source asyncio code that does really use multiple loops ;)

Yes, me too! Because of two reasons:

  • if we sunk work into this, we should at least make sure it actually covers the use cases supposed to be addressed by the change
  • we have too much to do, and though all this seems desirable from a design point of view, we need to concentrate to where the real gains for common use cases / most users are

@meejah
Copy link
Contributor

meejah commented Nov 4, 2016

Things that use txaio.config.loop right now are:

  • txaio.create_future
  • txaio.as_future
  • txaio.call_later
  • txaio.make_batched_timer

(The above for the asyncio side). For Twisted, it's a subset of that -- just the call_later and make_batched_timer.

soulmerge added a commit to score-framework/py.websockets that referenced this issue Nov 10, 2016
The autobahn library's asyncio support is limited by the authors'
attempt to support other asyncio libraries:

crossbario/autobahn-python#747

These limitations would have prevented us from using score.ws with other
asyncio-based Workers. That's why we decided to switch to the websockets
library instead.
@meejah
Copy link
Contributor

meejah commented Dec 6, 2016

In addition to the above, the web-socket base protocol uses make_batched_timer to reduce the number of timer objects we make when running with lots of connections. Currently there is one of these.

This would need to changed to be a per-loop batched-timer object. Twisted can use a single batched-timer (as now) and asyncio would need a per-loop implementation (so, push the batched-timer down to autobahn.{asycnio,twisted}.websocket from the generic class).

@JoshHeitzman
Copy link

Just hit this when moving a websocket server from the main thread to a background thread to enable keyboard interrupts to work on Windows. Never heard of txaio prior to this, and was definitely surprised to find its aio.config.loop attribute being used instead of the loop I provided to the factory. In my case there is still only a single loop in use, it just isn't the one from the main thread. Here's the relevant part of the callstack where I discovered what was going wrong:

event_loop.py, line 28, in run_loop
loop.run_forever()
base_events.py, line 309, in run_forever
self._run_once()
base_events.py, line 1217, in _run_once
handle._run()
events.py, line 136, in _run
self._callback(*self._args)
websocket.py, line 114, in process
self._dataReceived(data)
protocol.py, line 1174, in _dataReceived
self.consumeData()
protocol.py, line 1203, in consumeData
self.processHandshake()
protocol.py, line 2785, in processHandshake
f = txaio.as_future(self.onConnect, request)
aio.py, line 305, in as_future
return create_future_success(res)
aio.py, line 285, in create_future_success
return create_future(result=result)
aio.py, line 276, in create_future
f = Future(loop=config.loop)

@graingert
Copy link

graingert commented Mar 9, 2021

asyncio is deprecating the loop kwarg (for removal in 3.10) throughout its api. It's also not recommended to call asyncio.Future() or asyncio.get_event_loop():

The rule of thumb is to never expose Future objects in user-facing APIs, and the recommended way to create a Future object is to call loop.create_future()

Because this function has rather complex behavior (especially when custom event loop policies are in use), using the get_running_loop() function is preferred to get_event_loop() in coroutines and callbacks.

Specifically futures should only be created with:

def create_future():
    return asyncio.get_running_loop().create_future()

@oberstet
Copy link
Contributor

oberstet commented Mar 10, 2021

actually, we only want to determine "the default loop" when the user does not explicitly provides one. the other aspect: yeah, we should create future using the factory present on the loop object for that. and creating a future in autobahn should use the txaio function for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants