API: highlevel strategy discussion #42

Open
njsmith opened this Issue Sep 7, 2018 · 6 comments

4 participants
@njsmith
Member

njsmith commented Sep 7, 2018

I feel like trio-asyncio's core functionality has been pretty solid for a while, but we're still flailing a bit trying to find the most usable way to expose it to users. I guess right now it has 3 different APIs, we're discussing a 4th, and it's not clear whether any of them is actually what we want? And I'm frustrated that I don't feel like I understand what the pitfalls are or what features users actually need. And my main coping strategy for that kind of frustration is to open an issue and write a bunch of text to organize my (our?) thoughts, so here we go. Hopefully laying out all the information in one place will give us a clearer picture of what the trade-offs actually are, and help us find the best way forward. (4 overlapping APIs and constant churn cannot be the solution!)

What I think I know

Strategies we've tried:

  • run_asyncio, run_future, run_trio: explicit call-in-other-mode transitions, inspired by trio.run
    • Advantages: very explicit, captures the core functionality (these are technically enough to do anything), and the idiom is familiar to trio users
    • Limitation: while you can do anything with just these, it often requires frustratingly verbose code. Some cases that aren't handled well: context managers, iterables, handlers (in particular, defining a function that asyncio can call but runs in trio mode)
    • Averted limitation: Recall that await whatever() is shorthand for coro = whatever(); await coro. It turns out that there is some asyncio code (in aiohttp? which code?), that calls asyncio.current_task() from inside the coro = whatever() part, not within the await coro part. This breaks a naive implementation of run_asyncio, that only runs the await coro part inside an asyncio.Task. However, now that we know about this, it's easy to fix by making run_asyncio perform both halves of the await whatever() dance inside the new task. (Code like this is also broken in pure asyncio if you do asyncio.create_task(whatever()), but never mind that...)
  • aio_as_trio, trio_as_aio: translators for common protocols (specifically: CM, async CM, iterable, async iterable, and async callable – but not sync callable).
    • Advantages: can handle the three problem cases listed above.
    • Limitation: the syntax in aio_as_trio(fn)(...) is going to confuse the heck out of newcomers ("Readability counts")
    • Limitation: often CM/iterables are constructed by making a synchronous call, and we still don't have any good way to write that. And... this shouldn't matter, because generally speaking it's fine to use synchronous trio APIs from aio mode, and vice-versa... with the ONE exception of asyncio.current_task, and it turns out that people (aiohttp) call this annoyingly often. And... you can't really spawn a task just to call asyncio.current_task, that's not going to give useful results; the task it returns will be gone before you can do anything with it.
    • Limitation: some context managers, like the one in async_timeout, assume that their __aenter__, __aexit__, and body all execute in the same task. That's not true for a naive implementation of aio_as_trio that just uses run_asyncio to call the __aenter__ and __aexit__. Is a non-naive implementation even possible? I don't see how...
      • This is also very problematic when going in the other direction: any trio context manager that contains a nursery or cancel scope is currently totally broken if used with trio_as_aio
  • allow_asyncio: use some Clever Coroutine Tricks to create a hybrid asyncio/trio mode where you can call either kind of function.
    • Advantages: super ergonomic!
    • Limitation: We don't know where the boundaries are between trio code and asyncio code, so we can't translate asyncio.CancelledError <-> trio.Cancelled at the boundary. It's not clear how bad this is in practice... mainly it means that trio.Cancelled might pass through asyncio code. This will probably be treated like a generic unhandled exception, which in many cases will be what you want. Grepping aiohttp, there are a number of places that look for asyncio.CancelledError, but they mostly seem to be ad hoc attempts to implement something like Trio's cancel scope delimitation, so maybe we already handle that fine?
    • Limitation: We know how to go from trio-mode to hybrid-mode, but currently can't go from asyncio-mode to hybrid-mode. I guess composing run_trio with allow_asyncio would do it though.
    • Limitation: current_task doesn't work at all. This breaks the popular async_timeout library (used by e.g. aiohttp, homeassistant, and others). This seems like a showstopper problem. If we can't fix this, then I don't think we can in good conscience offer "hybrid mode" as a feature. Even if we document that it only works in "simple cases", then the first thing people will try is writing a simple little program... that uses aiohttp, and it won't work.
  • async with aio_mode, async with trio_mode, or maybe even async with hybrid_mode: Hypothetical approach that would let you switch modes in the middle of a function (details: python-trio/trio#649).
    • Basically run_asyncio/run_trio, but with better ergonomics because you don't need to define and call another function.
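
To make the "averted limitation" under run_asyncio concrete, here's a minimal pure-asyncio sketch (no trio-asyncio involved). The names whatever and body are made up for illustration; the point is just that the synchronous half of a call can see a different current_task() than the awaited half.

```python
import asyncio


def whatever():
    # Sync half: runs when whatever() is *called*, before anything is awaited.
    captured = asyncio.current_task()

    async def body():
        # Async half: runs when the coroutine object is actually driven.
        return captured is asyncio.current_task()

    return body()


async def main():
    # Both halves run inside main()'s task.
    same_direct = await whatever()
    # The sync half runs in main()'s task, the async half in a new task.
    same_spawned = await asyncio.create_task(whatever())
    return same_direct, same_spawned


result = asyncio.run(main())
print(result)  # (True, False)
```

A run_asyncio that only wraps the await coro half behaves like the create_task line here, which is why both halves need to happen inside the new task.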

Some tentative conclusions?

  • The allow_asyncio hybrid-mode approach is never going to be 100% reliable (because of the cancellation issue), and currently is pretty badly broken (because of the current_task issue making it incompatible with super-popular libraries like aiohttp). Conclusion: it will never be the only option (we want to give people the option of using something less magical and more reliable when they have to), and right now we probably shouldn't be shipping it at all. So let's put it aside for the moment and focus on the other options.

  • From an API design standpoint, I think it makes sense to provide the basic run_asyncio, run_future, run_trio primitives. They aren't necessarily the thing we expect people to use all the time, but they provide a set of simple, reliable core primitives that you can always fall back to if you have some confusing situation where our more ergonomic options don't work. Alternatively, if we get async with aio_mode/async with trio_mode working, those could also serve as a set of basic primitives.

  • The aio_as_trio/trio_as_aio are... maybe not actually a good idea after all? That's not the conclusion I was expecting to reach; I thought I was going to end up arguing for them as convenience shorthand on top of the run_* primitives. But I'm really concerned about the issues caused by running __aenter__ and __aexit__ in different tasks – that really will break all kinds of stuff. The async with aio_mode/async with trio_mode approach avoids this problem, e.g.:

    async with trio_mode:
        async with trio.open_nursery() as nursery:
            nursery.start_soon(...)
            async with aio_mode:
                ...
            # Here we leave aio_mode and return back to *the original trio task*
        # So when we exit the nursery here, that happens in the same task where it was created
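
For contrast, the failure mode that the nested blocks above avoid can be reproduced in plain asyncio. This is only a sketch: TaskBoundCM is a made-up stand-in for a context manager that remembers its task (the way async_timeout is described as doing), and NaiveWrapper is a made-up stand-in for a translator that runs each half of the protocol in its own short-lived task.

```python
import asyncio


class TaskBoundCM:
    """Hypothetical CM that expects __aenter__ and __aexit__ in the same task."""

    async def __aenter__(self):
        self._task = asyncio.current_task()
        return self

    async def __aexit__(self, *exc_info):
        if asyncio.current_task() is not self._task:
            raise RuntimeError("__aexit__ ran in a different task than __aenter__")
        return False


class NaiveWrapper:
    """Hypothetical translator: each protocol method gets its own fresh task."""

    def __init__(self, cm):
        self._cm = cm

    async def __aenter__(self):
        return await asyncio.create_task(self._cm.__aenter__())

    async def __aexit__(self, *exc_info):
        return await asyncio.create_task(self._cm.__aexit__(*exc_info))


async def main():
    # Used directly, enter and exit happen in main()'s task.
    async with TaskBoundCM():
        pass
    direct_ok = True

    # Through the naive wrapper, enter and exit land in two different tasks.
    try:
        async with NaiveWrapper(TaskBoundCM()):
            pass
        wrapped_ok = True
    except RuntimeError:
        wrapped_ok = False
    return direct_ok, wrapped_ok


outcome = asyncio.run(main())
print(outcome)  # (True, False)
```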

So....... maybe I've argued myself into thinking that async with aio_mode / async with trio_mode really are something we have to dig into more, and might even be The Solution To All Our Problems? There are still a bunch of details to work out first to figure out how these can work, but maybe we should do that.

They even make a reasonable substitute for @aio2trio/@trio2aio, e.g. instead of

@aio2trio
async def this_is_called_by_homeassistant():
    ...  # And I write Trio code inside it

You write

async def this_is_called_by_homeassistant():
    async with trio_mode:
        ...  # And I write Trio code inside it

One extra level of indentation, but the same number of lines, and no need for anything really annoying like writing a trampoline function.

What am I missing?

Probably a bunch of stuff, but hopefully laying out my thinking will make it obvious to someone what terrible mistakes I'm making? Please help me be smarter :-)

@njsmith

Member

njsmith commented Sep 7, 2018

A few more questions that didn't fit into the above:

  • Can we salvage allow_asyncio by making asyncio.current_task return a pseudo-Task object that we define? Basically the only thing you can do after calling current_task is call task.cancel(), so if we have a custom object we could just... make that do whatever we want (e.g. set a flag so that the next time we await an asyncio future, it gets cancelled).

  • Why do we have trio_asyncio.run? It's confusing (I can never remember whether it's a replacement for trio.run or asyncio.run), and I don't understand the use case. I feel like either people are writing a trio app that uses aio libraries, in which case they want trio.run, or they're adapting a big asyncio framework like homeassistant or aiohttp server to use trio as a main loop, in which case they want a replacement for asyncio.run. So trio_asyncio.run isn't useful for either of these groups?
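
On the pseudo-Task question in the first bullet, a rough sketch of the "set a flag, cancel the next awaited future" idea might look like this. Everything here is hypothetical: PseudoTask and await_future are invented names, and nothing below hooks into asyncio.current_task or into trio-asyncio. It only shows the flag mechanics in plain asyncio.

```python
import asyncio


class PseudoTask:
    """Hypothetical object that current_task() could hand out in hybrid mode."""

    def __init__(self):
        self.cancel_requested = False

    def cancel(self):
        # Only record the request; nothing is interrupted yet.
        self.cancel_requested = True
        return True


async def await_future(pseudo_task, fut):
    """Hypothetical hook that would run whenever hybrid code awaits a future."""
    if pseudo_task.cancel_requested:
        pseudo_task.cancel_requested = False
        fut.cancel()  # deliver the deferred cancellation now
    return await fut


async def demo():
    loop = asyncio.get_running_loop()
    task = PseudoTask()

    first = loop.create_future()
    first.set_result("ok")
    before = await await_future(task, first)

    task.cancel()  # e.g. what a timeout helper would call
    second = loop.create_future()
    try:
        await await_future(task, second)
        after = "not cancelled"
    except asyncio.CancelledError:
        after = "cancelled"
    return before, after


pseudo_result = asyncio.run(demo())
print(pseudo_result)  # ('ok', 'cancelled')
```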

@pquentin

Member

pquentin commented Sep 7, 2018

Thanks for the write up!

I don't know what you're missing and never used trio-asyncio, but if your premises are correct, I do agree with your conclusions: async with aio_mode / async with trio_mode should be the primary API and run_asyncio / run_future / run_trio should be put in a hazmat/core module.

@Fuyukai

Fuyukai commented Sep 7, 2018

In my opinion, the API should be simply the async with ... style, maybe with the allow_asyncio helper included optionally.

The basic primitives should be exposed, but like, not the way to do anything unless you need them. This would be, in my opinion, the simplest and easiest way to do everything.

@smurfix

Collaborator

smurfix commented Sep 7, 2018

@njsmith You're right in that aio_as_trio etc. are a bit cumbersome (though I got quite used to it). The problem I have with run_asyncio etc. (and I've gotten some feedback saying that I'm not the only one) is that it's not quite as obvious whether the wrapped code is asyncio code, or the caller is asyncio code …

The async with trio_mode: idea would be cool to pull off / have as the main interface, but I'm afraid that somebody else will have to implement it – I've been laid up for a month+ and the Real Work I've accumulated feels like it'll take a couple of lifetimes to clean up. :-/

@njsmith

Member

njsmith commented Sep 8, 2018

> I've been laid up for a month+ and the Real Work I've accumulated feels like it'll take a couple of lifetimes to clean up. :-/

@smurfix Oof, I hear you on that :-(. Hope you're feeling better, and good luck.

njsmith added a commit to njsmith/pytest-trio that referenced this issue Sep 26, 2018

Avoid using deprecated run_asyncio
I guess this will break in the future as we work through

  python-trio/trio-asyncio#42

but right now I don't care I just want these freaking tests to pass.
@njsmith

Member

njsmith commented Oct 12, 2018

It looks like the three primitives we need to implement coroutine runner switching are:

  • Pause a task temporarily, in such a way that we get some notification when it gets cancelled (but don't have to resume immediately). Might also be nice to be able to get at the next value sent/thrown into the coroutine. Used when transitioning into a different coroutine runner temporarily (e.g., when entering an async with trio_mode block, we need to tell asyncio to stop scheduling this coroutine object temporarily).

  • Resume a task that was paused as per above. This is done by yielding some value to the other coroutine runner, and then telling the first coroutine runner to start scheduling this task again. (E.g., when exiting an async with trio_mode block, so asyncio needs to start scheduling the coroutine object again.)

  • Kill a task permanently, as if the coroutine object had completed (with some value or exception), even though it hasn't. Used when exiting a task that we don't plan to re-use (e.g., when exiting an async with aio_mode block, we need to free the asyncio backing task that we created, because it will never run again).

On the Trio side, we control everything, so these are all pretty straightforward to implement. On the asyncio side I think we can implement these as:

  • Temporary task pausing: yield a custom Future-like object (it has to be custom so we can override cancellation handling – otherwise we could just use a regular Future).

  • Task resuming: call whatever callback was passed to our Future object's add_done_callback method.

  • Task killing: this is the tricky one.
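
The first two bullets (pausing and resuming) can be sketched roughly like this. PauseToken and resume are made-up names, and the set of attributes it duck-types (_asyncio_future_blocking, get_loop, add_done_callback, result, cancel) is my reading of what CPython's Task looks for when a coroutine yields a non-Future object. That's an implementation detail, so treat this as an assumption and not a supported API.

```python
import asyncio


class PauseToken:
    """Hypothetical Future-like object; duck-types just enough for asyncio.Task."""

    _asyncio_future_blocking = False

    def __init__(self, loop):
        self._loop = loop
        self._callbacks = []
        self._value = None
        self.cancel_seen = False

    def get_loop(self):
        return self._loop

    def add_done_callback(self, fn, *, context=None):
        # The Task registers its wakeup callback here; we just hold on to it.
        self._callbacks.append((fn, context))

    def result(self):
        return self._value

    def cancel(self, *args, **kwargs):
        # Custom cancellation handling: note the request, but stay paused.
        self.cancel_seen = True
        return False

    def __await__(self):
        self._asyncio_future_blocking = True
        yield self  # asyncio stops scheduling the task here
        return self._value

    def resume(self, value):
        # Resuming = calling whatever was passed to add_done_callback.
        self._value = value
        for fn, context in self._callbacks:
            self._loop.call_soon(fn, self, context=context)


async def paused_worker(token, log):
    log.append("pausing")
    value = await token
    log.append(f"resumed with {value}")
    return value


async def demo():
    token = PauseToken(asyncio.get_running_loop())
    log = []
    task = asyncio.create_task(paused_worker(token, log))
    await asyncio.sleep(0)  # let the worker run up to its pause point
    await asyncio.sleep(0)
    still_paused = not task.done()
    token.resume("go")
    return still_paused, await task, log


pause_demo = asyncio.run(demo())
print(pause_demo)
```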

A first attempt would be to (a) yield a Future that will never be resumed, (b) call asyncio.tasks._unregister_task. This is a bit gross (uses internal APIs), and has a flaw:

https://github.com/python/cpython/blob/fc439d20de32b0ebccca79a96e31f83b85ec4eaf/Lib/asyncio/tasks.py#L137-L146

self._state == futures._PENDING will be true whenever the task hasn't completed, so in general this logic might fire after we abandon a task to the abyss. If we only had to care about the Python implementation of Task shown above, this wouldn't be an issue, because we could just mutate task._state ourselves to pretend that it had finished. But in the C implementation that's used by default, _state is immutable. (Try it: f = asyncio.Future(); f._state = "CANCELLED" → exception.) We could mutate it with ctypes, but that's pretty ugly, so let's see if there are any other options. I looked for methods that mutate _state, and found Future.cancel. (Note: not Task.cancel. Task inherits from Future, but overrides the cancel method to do something completely different, so normally Future.cancel would never run on a Task object. But, Python allows us to do it explicitly if we want, by writing Future.cancel(task_obj).) This could work, but it also schedules all registered callbacks to run, which is... maybe not what we want? Or is it? If we're killing a task like this, maybe we should run any registered callbacks? For trio-asyncio's purposes, where the task represents a single async with aio_mode embedded inside a larger function, it would be pretty perverse to have any callbacks registered for the task. You'd have to do something like asyncio.current_task().add_done_callback(...), which makes no sense. Anyway, those seem to be the options for manipulating task._state.
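
If you want to try the _state experiment in a script, here's a small sketch. Whether the assignment fails depends on which Future implementation your interpreter loaded (the C accelerator or the pure-Python fallback), so the snippet just records the outcome instead of assuming one.

```python
import asyncio


async def probe_state_mutability():
    fut = asyncio.get_running_loop().create_future()
    try:
        # Private attribute; reported above as read-only in the C implementation.
        fut._state = "CANCELLED"
    except (AttributeError, TypeError):
        return False
    return True


state_writable = asyncio.run(probe_state_mutability())
print("Future._state writable from Python:", state_writable)
```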

Alternatively we could try mutating task._log_destroy_pending. If that's False, then it defangs the Task.__del__ code shown above. And... oh nice, it looks like _log_destroy_pending actually is mutable from the Python level. I guess because asyncio.gather mutates it for some reason.

We should also check Future.__del__, since Task.__del__ invokes it:

https://github.com/python/cpython/blob/fc439d20de32b0ebccca79a96e31f83b85ec4eaf/Lib/asyncio/futures.py#L90-L104

So we might want to make sure __log_traceback is set to False. Fortunately it looks like we can do that by simply writing task._log_traceback = False.
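
Putting the two flag tweaks together, a minimal sketch of the "defang the destructors" step could look like the following. These are private attributes taken from the CPython source linked above, so this is an assumption about current behavior, not something asyncio promises. The demo doesn't actually abandon the task (asyncio.run cancels it on exit); it only checks that both flags can be set from Python.

```python
import asyncio


async def never_finishes():
    # Blocks forever on a future that nobody resolves.
    await asyncio.get_running_loop().create_future()


async def demo():
    task = asyncio.create_task(never_finishes())
    await asyncio.sleep(0)  # let the task start and block on its future

    # Private attributes; names come from asyncio/tasks.py and asyncio/futures.py.
    task._log_destroy_pending = False  # Task.__del__ skips its "pending" complaint
    task._log_traceback = False        # Future.__del__ skips its exception log

    return task._log_destroy_pending, task._log_traceback, task.done()


flags = asyncio.run(demo())
print(flags)  # (False, False, False)
```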

These options are all pretty terrible; if we go ahead with this we should talk to the asyncio folks about making sure that in 3.8 there's a nicer API for this, or at least that our terrible hacks don't break, to avoid a repeat of #23. BUT, it does look like there aren't any show-stoppers currently, so we should probably prototype this out and decide whether we're actually going to commit to it before we talk to the asyncio folks about it.
