[RFC] Option 1 for an async API to libcloud #1016

Closed · wants to merge 4 commits · 4 participants
@tonybaloney
Contributor

tonybaloney commented Apr 2, 2017

This is one option for having asynchronous support in libcloud.

The idea is to have an async context manager that takes either a single instance or a collection (tuple) of NodeDrivers, then gives the user the ability to await common methods like list_nodes and create_node.

import asyncio

from integration.driver.test import TestNodeDriver
from libcloud.async_util import AsyncSession

driver = TestNodeDriver('apache', 'libcloud', secure=False,
                        host='localhost', port=9898)

async def run():
    async with AsyncSession(driver) as async_instance:
        nodes = await async_instance.list_nodes()

    assert len(nodes) == 2

loop = asyncio.get_event_loop()
loop.run_until_complete(run())
loop.close()

tonybaloney added some commits Apr 2, 2017


Contributor

tonybaloney commented Apr 2, 2017

@Kami keen for your thoughts. One of the biggest pieces of feedback on the requests re-implementation was: why didn't we just go straight to aiohttp? That would have wrecked backward compatibility for pretty much every user. So I'm seeking a common ground that doesn't mean having a base class with twice as many methods (async_list_nodes and list_nodes).

This approach replaces list_nodes (and would replace other methods) with an awaitable coroutine. It could also swap out requests for aiohttp in the underlying connection attribute for the duration of the context session.
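A minimal sketch of what such a context manager could look like. This is hypothetical code, not the PR's actual implementation: AsyncSession, WRAPPED, and DummyDriver are all made-up names, and it just offloads the driver's existing blocking methods to a thread pool via run_in_executor rather than swapping the transport.

```python
import asyncio
import functools

class AsyncSession:
    """Sketch of an async context manager over a sync driver.

    Hypothetical: wraps selected blocking methods as awaitables by
    running them in the event loop's default thread-pool executor.
    """
    WRAPPED = ('list_nodes', 'create_node')

    def __init__(self, driver):
        self._driver = driver

    async def __aenter__(self):
        loop = asyncio.get_event_loop()
        for name in self.WRAPPED:
            method = getattr(self._driver, name, None)
            if method is not None:
                # Calling session.list_nodes() now returns an awaitable
                # that runs the original blocking method off-loop.
                setattr(self, name,
                        functools.partial(loop.run_in_executor, None, method))
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

class DummyDriver:
    """Stand-in for a NodeDriver; returns canned data."""
    def list_nodes(self):
        return ['node-1', 'node-2']

async def main():
    async with AsyncSession(DummyDriver()) as session:
        return await session.list_nodes()

loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
nodes = loop.run_until_complete(main())
loop.close()
```

Note that this keeps the calling loop responsive, but each request still occupies a thread; the transport itself remains synchronous.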


Member

Kami commented Apr 2, 2017

I like that approach 👍

One option would also be to have an async version of each driver (e.g. EC2AsyncNodeDriver / AsyncEC2NodeDriver), or simply libcloud.compute.drivers.async.EC2NodeDriver. The user would then select which one to use via the get_driver factory method.

That way the user wouldn't need to wrap the whole code, or multiple operations, in the context manager.

Having said that, I do think that backward compatibility is very important and as you have said this limits some of the options which are available to us.
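A rough sketch of this driver-variant idea. All names here (AsyncEC2NodeDriver, the registries, the use_async flag) are hypothetical, and the real libcloud get_driver takes a Provider constant rather than a string; this only illustrates the selection mechanism.

```python
import asyncio

class EC2NodeDriver:
    """Stand-in sync driver with canned data."""
    def list_nodes(self):
        return ['i-123']

class AsyncEC2NodeDriver(EC2NodeDriver):
    """Hypothetical async twin of the sync driver."""
    async def list_nodes(self):
        # A real implementation would use a non-blocking transport;
        # here we just delegate to the sync method for illustration.
        return EC2NodeDriver.list_nodes(self)

_DRIVERS = {'ec2': EC2NodeDriver}
_ASYNC_DRIVERS = {'ec2': AsyncEC2NodeDriver}

def get_driver(name, use_async=False):
    """Factory sketch: pick the sync or async variant of a driver."""
    registry = _ASYNC_DRIVERS if use_async else _DRIVERS
    return registry[name]

cls = get_driver('ec2', use_async=True)
loop = asyncio.new_event_loop()
nodes = loop.run_until_complete(cls().list_nodes())
loop.close()
```

The appeal of this shape is that the async-ness is chosen once, at driver construction, instead of per call site.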


Member

Kami commented Apr 2, 2017

On a related note, it would also be good to check what other Python HTTP libraries that support both sync and async implementations do (and maybe take some inspiration from them).

We also had some debate about this in the past; it may be worth digging out the mailing list thread to see if we can salvage some ideas from there.


Contributor

s0undt3ch commented Apr 2, 2017

While this is a good start, you're still blocking the loop because your transport still isn't async.
And just to make sure you know: you're returning a future which needs to be awaited, right?


Contributor

s0undt3ch commented Apr 2, 2017

For my own projects I tend to choose asyncio, but if you still want to support py2, either you have two versions of each driver (py2 sync, perhaps with a thread pool for async; py3 async or sync), or you choose something like Tornado, which still works with both py2 and py3.
The drawback is that you're no longer using native coroutines.
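The py2-compatible thread-pool flavour can be sketched with concurrent.futures (available on py2 via the futures backport). DummyDriver is a made-up stand-in for a real driver.

```python
from concurrent.futures import ThreadPoolExecutor

class DummyDriver:
    """Stand-in driver with a blocking method."""
    def list_nodes(self):
        return ['node-a', 'node-b']

executor = ThreadPoolExecutor(max_workers=4)
driver = DummyDriver()

# Kick off two calls concurrently; each submit() returns a Future.
f1 = executor.submit(driver.list_nodes)
f2 = executor.submit(driver.list_nodes)

# .result() blocks until that particular call finishes.
nodes = f1.result() + f2.result()
executor.shutdown()
```

This gives concurrency without an event loop, but as noted, no native coroutines: callers juggle Future objects instead of using await.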


Contributor

pquentin commented Apr 2, 2017

I think trying to add async support to libcloud is a worthwhile goal. Unfortunately, you did not really accomplish much here. On the API side, to use the async version of libcloud, users will still need to use the await keyword, so sparing them from writing list_nodes_async (or something similar) is not a huge win. On the internal side, the world is still divided in two, because you still have two versions of list_nodes.

The real problem is: how do you make this manageable internally in libcloud? If you look at the call tree of a method like list_nodes, it makes network calls at the leaves of the tree. Each such call has to be in an async function to avoid blocking the event loop, and each caller of those functions has to be an async function too! Everything gets tainted. You wanted a banana, but what you got was a gorilla holding the banana and the entire jungle. The very nature of libcloud makes this a real challenge. Try to get an async list_nodes to work without changing any other functions, and look at all the code you had to modify or copy-paste. How do you keep the two versions in sync?

One option is the approach of https://github.com/njsmith/h11 or https://effect.readthedocs.io/en/latest/index.html, which totally decouples the I/O from the business logic and allows different wrappers to be used. But that basically requires rewriting libcloud, introducing lots of bugs.

Another is to write fully async code and provide sync wrappers. This is probably not a good idea, because the library then needs to introduce its own event loop, and that's not the way asyncio is supposed to work in Python. There are probably other, more subtle concerns (C# had a lot more time to think about this). And I'm not even talking about Python 2 support, which does not have a usable event loop. (Trollius is not maintained and should not be used right now.)

See this requests issue for more discussion (requests/requests#1390). No wonder no popular library I know of currently offers both async and sync support. It's much easier to write two different libraries, but that's probably not a good idea for libcloud, dividing the eyeballs even more.
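The "fully async core with sync wrappers" option might look roughly like this (all names illustrative), with the wrapper forced to manage an event loop of its own per call, which is exactly the drawback described:

```python
import asyncio

class AsyncDriver:
    """Hypothetical async-native core."""
    async def list_nodes(self):
        await asyncio.sleep(0)  # stand-in for real non-blocking I/O
        return ['node-1']

class SyncDriver:
    """Blocking facade that drives the async core with its own loop."""
    def __init__(self):
        self._impl = AsyncDriver()

    def list_nodes(self):
        # The wrapper must create and own an event loop, which clashes
        # with any loop the application itself is already running.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(self._impl.list_nodes())
        finally:
            loop.close()

nodes = SyncDriver().list_nodes()
```
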

Contributor

tonybaloney commented Apr 3, 2017

I'll try to address those points in order:

  1. The transport isn't async: correct. I'm going to raise a few pull requests with different design patterns and ask the community to discuss the pros and cons. We'll see the benefits and performance gains from the second pattern. I don't completely believe the transport needs to be a coroutine.

  2. "You know you're returning a future": yes, that was kind of the point. The idea here is that the user can build a coroutine which runs a sequence of steps against the API, then waits for the event loop to close out.

  3. On the API side, users will still need to use the await keyword: exactly. I work on another C# library; it's common practice now for a C#/.NET package to expose all async methods as awaitable, giving the caller control over which Task API to use: they can just await it, use a thread pool, etc.

All the logic inside the drivers is about manipulating request and response dictionaries, authenticating against APIs, and sometimes doing list comprehensions and model projections. Whether the call to the underlying API is the coroutine, or the method itself is the coroutine, won't make a difference, since the method will still need to await the response.

From the requests-futures example, I want to copy a similar principle but apply it in a context manager; the context manager in this PR changes a NodeDriver's core methods to become coroutines.

from requests_futures.sessions import FuturesSession

session = FuturesSession()
# first request is started in background
future_one = session.get('http://httpbin.org/get')
# second request is started immediately
future_two = session.get('http://httpbin.org/get?foo=bar')
# wait for the first request to complete, if it hasn't already
response_one = future_one.result()
print('response one status: {0}'.format(response_one.status_code))
print(response_one.content)
# wait for the second request to complete, if it hasn't already
response_two = future_two.result()
print('response two status: {0}'.format(response_two.status_code))
print(response_two.content)

Contributor

tonybaloney commented Apr 3, 2017

It would make sense to write down some use cases and demonstrate them using each design pattern. I'll work on that.


Contributor

pquentin commented Apr 3, 2017

It looks like we're having trouble understanding each other. Maybe stating the goals and non-goals of async in libcloud would help, and code examples would help even more; thanks for offering that while we only look on from a distance!


Contributor

tonybaloney commented Apr 3, 2017

@pquentin I only started async in Python last week so there's a good chance I'm confused 🙃

Assume for a second that libcloud had async support: how would you want it to work? Write a code snippet showing the example and we can work back from there. @s0undt3ch too


Contributor

pquentin commented Apr 3, 2017

Taking inspiration from a gist using storage that you posted in January, I would like to be able to do things like this:

GoogleStorageDriver = get_driver(Provider.GOOGLE_STORAGE)
driver = await GoogleStorageDriver(key='GOOG***', secret='****')

tasks = []
async for container in driver.list_containers():
    async for object in driver.list_container_objects(container):
        tasks.append(asyncio.ensure_future(do_stuff_with_object(object)))
await asyncio.gather(*tasks)

If the event loop were blocked by a synchronous transport, this code would be just as slow as synchronous code. With an async transport, the do_stuff_with_object calls would run in parallel. The other calls (especially the async fors) would still be sequential, but would free the event loop, allowing other unrelated code to run concurrently.
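That difference can be demonstrated with a toy experiment, using asyncio.sleep as a stand-in for a genuinely non-blocking transport (do_stuff_with_object here is a made-up callback, not part of any real API):

```python
import asyncio
import time

async def do_stuff_with_object(obj):
    await asyncio.sleep(0.1)  # stands in for a non-blocking HTTP call
    return obj.upper()

async def main():
    objects = ['a', 'b', 'c']
    # ensure_future schedules each call immediately, so the three
    # "network waits" overlap instead of running back to back.
    tasks = [asyncio.ensure_future(do_stuff_with_object(o)) for o in objects]
    return await asyncio.gather(*tasks)

start = time.monotonic()
loop = asyncio.new_event_loop()
results = loop.run_until_complete(main())
loop.close()
elapsed = time.monotonic() - start
# Because the waits overlap, the total time is close to one call's
# latency (~0.1 s) rather than the sum of all three (~0.3 s).
```

With a blocking transport the same structure would still take the sum of the latencies, because each call would hold the loop for its full duration.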

Contributor

tonybaloney commented Apr 7, 2017


Contributor

pquentin commented Apr 7, 2017

@tonybaloney to make things clear, I'm not saying I don't want to use an async context manager. I'm just saying I'd like to be able to use await and asyncio.ensure_future() with an async transport. As a user, the details of the API itself are somewhat irrelevant.
