Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trio concurency backend PoC #214

Closed
wants to merge 9 commits into from
Closed

Trio concurency backend PoC #214

wants to merge 9 commits into from

Conversation

florimondmanca
Copy link
Member

Refs #120

For now I basically copy-pasted the asyncio backend from concurrency module and replaced asyncio-specific parts with the trio equivalent. Did the same thing for the async client tests to get a quick idea of how things were going.

Tests are read timing out for now, I'll try and look as to why that is the case tomorrow. :-)

@sethmlarson
Copy link
Contributor

This is awesome, thanks for being the first to take a stab at this! :)

Something I think we ought to do for a trio backend: have some way we can run all our test cases with the trio backend instead of the asyncio backend so we can double the value of our test suite easily. :)

This'll definitely need @tomchristie's eyes as well.

@florimondmanca
Copy link
Member Author

have some way we can run all our test cases with the trio backend instead of the asyncio backend so we can double the value of our test suite easily. :)

Absolutely. I copy-pasted the async client tests knowing that it was definitely not ideal. I think making use of pytest fixtures and using the backend inside fixtures/tests instead of asyncio/trio would allow something like this.

@tomchristie
Copy link
Member

Ah fabulous, great stuff!

Something I think we ought to do for a trio backend: have some way we can run all our test cases with the trio backend instead of the asyncio backend so we can double the value of our test suite easily.

Indeed. I'm open to suggestions here around how best to marshal support for alternate backends, but this PR is a great start!

@florimondmanca
Copy link
Member Author

florimondmanca commented Aug 15, 2019

So, having looked a bit more into this, I think a necessary step before pulling off a complete trio backend would be to have end-to-end usage of the concurrency backend in the whole code base.

This requires to remove remaining asyncio-specific code, in particular in ASGIDispatch — I already have something up and running and will open a separate PR for it, then rebase this PR off it to confirm it would be sufficient to implement a trio backend. :-)

@florimondmanca
Copy link
Member Author

florimondmanca commented Aug 15, 2019

This now contains the changes from #217, and there's progress: only tests for the blocking client are now failing! 🎉 (Again because of a ReadTimeout, indicating that the client can't reach the server running in the other thread.)

Edit: talked too soon — still having issues with timeouts. I suspect there's a bug in my trio read implementation. Need more investigation…

@florimondmanca
Copy link
Member Author

florimondmanca commented Aug 16, 2019

@sethmlarson Today's update: I realized that the timeouts I was having were due to incorrectly handling the necessary coexistance of the asyncio and trio event loops.

Since the uvicorn server has to run on asyncio (because uvicorn is asyncio-only), I figured that:

  1. Any test using the parametrized concurrency backend should run in its own thread, and that this thread had to be fired via the asyncio threadpool (previously it was fired by the backend itself).
  2. We can't use pytest-trio/pytest.mark.trio on tests that use the trio backend, because they must run on asyncio anyway.

So what I did is add a pytest hook that automatically defers to the asyncio threadpool tests that use the backend fixture. (It starts to look like a lot of black magic, but once I get my head around all this we can clean things up for maintainability.)

The hook works like a charm on async tests, but I'm having issues with the blocking ones, e.g. test_client.py. In that case, httpx.Client() ends up calling concurrency_backend.run() (keep in mind — this would be running within the thread fired by asyncio in the hook). I'd expect the resulting new trio event loop to do its business as usual (it works on the asyncio backend), but instead I get a mysterious TrioInternalError (see on CI). 🤷‍♂ I tried running the blocking test function in a plain thread as well (with good ol' threading.Thread()), but I ended up with the same result.

Any clues on this? Am I missing a crucial piece of info that would allow simplifying things considerably?

@sethmlarson
Copy link
Contributor

Your guess is as good as mine. Gonna cc the I/O wizard: @njsmith

@njsmith
Copy link

njsmith commented Aug 17, 2019

I don't really understand what you've been trying – it sounds complicated! – and I can't debug a TrioInternalError without any details :-). So uh, here's some general comments?

  • It might be worth considering using hypercorn instead of uvicorn, since hypercorn already supports both asyncio and trio?
  • It should be totally fine to run asyncio from a trio thread (trio.to_thread.run_sync(asyncio.run, ...)), and to run trio from an asyncio thread (loop.run_in_executor(..., trio.run, ...))
  • I'm missing something about why a "blocking test" involves multiple async backends?

@florimondmanca
Copy link
Member Author

Thanks for pitching in @njsmith!

Here's a minimal repro example for the TrioInternalError:

# app.py
import asyncio

import httpx
from httpx.contrib.trio import TrioBackend


async def app(scope, receive, send):
    assert scope["type"] == "http"
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [[b"content-type", b"text/plain"]],
        }
    )
    await send({"type": "http.response.body", "body": b"Hello, world!"})


def test_get(backend):
    url = "http://127.0.0.1:8000/"
    with httpx.Client(backend=backend) as http:
        response = http.get(url)
    assert response.status_code == 200


async def main():
    loop = asyncio.get_event_loop()
    await loop.run_in_executor(None, test_get, TrioBackend())


if __name__ == "__main__":
    asyncio.run(main())

Run it with:

  • uvicorn app:app
  • python app.py

Full traceback below:

$ python app.py
Traceback (most recent call last):
  File "app.py", line 51, in <module>
    asyncio.run(main())
  File "/Users/florimond/.pyenv/versions/3.7.3/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/Users/florimond/.pyenv/versions/3.7.3/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "app.py", line 47, in main
    await loop.run_in_executor(None, test_get, TrioBackend())
  File "/Users/florimond/.pyenv/versions/3.7.3/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "app.py", line 40, in test_get
    response = http.get(url)
  File "/Users/florimond/Developer/python-projects/httpx/httpx/client.py", line 750, in get
    trust_env=trust_env,
  File "/Users/florimond/Developer/python-projects/httpx/httpx/client.py", line 718, in request
    response.read()
  File "/Users/florimond/Developer/python-projects/httpx/httpx/models.py", line 969, in read
    self._content = b"".join([part for part in self.stream()])
  File "/Users/florimond/Developer/python-projects/httpx/httpx/models.py", line 969, in <listcomp>
    self._content = b"".join([part for part in self.stream()])
  File "/Users/florimond/Developer/python-projects/httpx/httpx/models.py", line 980, in stream
    for chunk in self.raw():
  File "/Users/florimond/Developer/python-projects/httpx/httpx/models.py", line 1008, in raw
    for part in self._raw_stream:
  File "/Users/florimond/Developer/python-projects/httpx/httpx/interfaces.py", line 224, in iterate
    yield self.run(async_iterator.__anext__)
  File "/Users/florimond/Developer/python-projects/httpx/httpx/contrib/trio.py", line 175, in run
    return trio.run(coroutine, *args)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.7/site-packages/trio/_core/_run.py", line 1769, in run
    run_impl(runner, async_fn, args)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.7/site-packages/trio/_core/_run.py", line 1918, in run_impl
    runner.task_exited(task, final_outcome)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.7/site-packages/trio/_core/_run.py", line 1404, in task_exited
    raise TrioInternalError
trio.TrioInternalError

The issue seems to be with running async_iterator.__anext__, where async_iterator refers to the result from H11Connection._receive_response_data.

I am seemingly getting the same error if slimming down the repro example to just this:

import asyncio
import trio


def test_get():
    async def gen():
        yield "hi"

    g = gen()
    trio.run(g.__anext__)


async def main():
    loop = asyncio.get_event_loop()
    await loop.run_in_executor(None, test_get)


if __name__ == "__main__":
    asyncio.run(main())

Traceback:

$ python app.py
Traceback (most recent call last):
  File "app.py", line 54, in <module>
    asyncio.run(main())
  File "/Users/florimond/.pyenv/versions/3.7.3/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/Users/florimond/.pyenv/versions/3.7.3/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "app.py", line 50, in main
    await loop.run_in_executor(None, test_get)
  File "/Users/florimond/.pyenv/versions/3.7.3/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "app.py", line 45, in test_get
    trio.run(g.__anext__)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.7/site-packages/trio/_core/_run.py", line 1769, in run
    run_impl(runner, async_fn, args)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.7/site-packages/trio/_core/_run.py", line 1918, in run_impl
    runner.task_exited(task, final_outcome)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.7/site-packages/trio/_core/_run.py", line 1404, in task_exited
    raise TrioInternalError
trio.TrioInternalError

Is it an issue to run a step of an async generator with trio.run()?

@njsmith
Copy link

njsmith commented Aug 17, 2019

Filed and debugged at: python-trio/trio#1191

(The problem is that Trio is not expecting you to have an async function that's implemented in C, because those are extremely rare, and it's crashing while trying to do some introspection that doesn't work on C code. Then because of some annoying details the error gets lost.)

We can fix that but there's a more fundamental problem: the only way it makes sense to call trio.run(__anext__) like this is if you're going to call trio.run on every iteration. But not only is this super wasteful (trio has to reconstruct the whole async environment on every iteration, including allocating a self-pipe etc.), it's also a violation of trio's semantics – you're not in general allowed to pass trio objects from one trio.run call to another.

I assume that this is happening because httpx's sync API is using a strategy where it allocates an asyncio loop and then calls loop.run_until_complete inside every sync call. That strategy only works with asyncio. Fortunately, I don't think httpx needs it to work with anything except asyncio? I don't think the sync API should accept a backend argument at all; the details of how it does sync I/O should be up to it.

@florimondmanca
Copy link
Member Author

florimondmanca commented Aug 17, 2019

Hmm, now that you say it, what would prevent us from always using asyncio for the sync API? I don’t see a use case where we’d use the sync client instead of the async one in a trio (async) context anyway.

I believe the backend argument on Client was added in #60 — any thoughts on this @tomchristie?

Also, thanks for debugging this @njsmith!

@sethmlarson
Copy link
Contributor

I think you're totally right on the sync client not accepting a backend @njsmith. Thanks for all the debugging! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants