Async python support: aiopynamodb #802

kamadorueda · 2020-06-21T21:00:51Z

https://github.com/aio-libs/aiobotocore
https://github.com/terrycain/aioboto3

garrettheel · 2020-07-25T17:58:16Z

The discussion here is relevant: #525 (comment)

I'd like to support asyncio natively in the library, but I'm still a little hesitant to adopt aiobotocore right now, as it's not maintained by AWS. We don't rely on all that much of botocore right now, so one option would be to drop it altogether and provide a separate async interface.

dwatkinsweb · 2020-08-20T17:53:00Z

Any idea when this might happen? We could really use this feature right now. I've been attempting to do this myself but I've been having to duplicate a lot of your code for a few small changes.

kamadorueda · 2020-09-05T04:21:52Z

There is another approach that is used by many libraries out there (keep reading for examples):

When a library exposes a high-latency function, for instance:

for item in TestModel.view_index.query(1):
    print("Item queried from index: {0}".format(item))

One can wrap the calls in a sub-thread via loop.run_in_executor.
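
As a rough sketch of what that wrapping looks like with only the standard library: `blocking_query` below is a hypothetical stand-in for any blocking PynamoDB call, not a real API. Note that `loop.run_in_executor` forwards positional arguments only, so keyword arguments would need `functools.partial`.

```python
import asyncio
import time


def blocking_query(key):
    # Hypothetical stand-in for a blocking, high-latency library call.
    time.sleep(0.1)
    return {"key": key}


async def query_async(key):
    loop = asyncio.get_running_loop()
    # Passing None selects the event loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_query, key)


async def main():
    # The three calls overlap in time because each one runs in its own worker thread.
    return await asyncio.gather(*(query_async(key) for key in range(3)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```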

Since that is a little verbose, there are nice libraries that make it human-friendly, for example aioextensions.

So the syntax would be something like:

from aioextensions import in_thread

for item in await in_thread(TestModel.view_index.query, 1):
    print("Item queried from index: {0}".format(item))

Which would run the high-latency thing in a sub-thread that allows for concurrency.
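
One caveat, assuming the wrapped call returns a lazy iterator that only does its work as it is consumed (I believe PynamoDB's query results behave this way, but verify it for your version): wrapping just the call moves only the creation of the iterator into the thread, and the iteration still happens on the event loop's thread. The sketch below uses a hypothetical generator, `lazy_query`, to show the difference between wrapping the call and wrapping the full consumption.

```python
import asyncio
import threading


def lazy_query(n):
    # Hypothetical stand-in for a query returning a lazy iterator:
    # each item is produced only when the iterator is advanced.
    for i in range(n):
        yield (i, threading.current_thread().name)


def fetch_all(n):
    # Consume the whole iterator so the work happens in the calling thread.
    return list(lazy_query(n))


async def main():
    loop = asyncio.get_running_loop()
    loop_thread = threading.current_thread().name

    # Only the generator object is created in the worker thread; its body
    # runs later, on the event loop's thread, while we iterate over it.
    lazy = await loop.run_in_executor(None, lazy_query, 3)
    lazy_threads = {name for _, name in lazy}

    # Here the iteration itself runs inside the worker thread.
    eager = await loop.run_in_executor(None, fetch_all, 3)
    eager_threads = {name for _, name in eager}

    return loop_thread, lazy_threads, eager_threads


if __name__ == "__main__":
    print(asyncio.run(main()))
```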

It's a very minimalistic interface and requires no work from pynamodb since it's on the consumer side to do the wrapping:

from aioextensions import in_thread, collect

# Equivalent to pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4)
one_query = await in_thread(pynamodb_func, arg_1, arg_2, kwarg_a=3, kwarg_b=4)

# The same call for many argument sets, run concurrently (overlapping in time)
many_queries = await collect([
    in_thread(pynamodb_func, arg_1, arg_2, kwarg_a=kwarg_a, kwarg_b=kwarg_b)
    for arg_1, arg_2, kwarg_a, kwarg_b in things_to_fetch  # long list of argument tuples
])

Another alternative is to provide _async versions of the functions, which could internally use the wrappers mentioned above and hide them from the end user:

def pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4) -> Data:
    ...

async def async_pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4) -> Data:
    return await in_thread(pynamodb_func, arg_1, arg_2, kwarg_a=kwarg_a, kwarg_b=kwarg_b)
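
To avoid writing each _async variant by hand, the wrapper could be generated. This is only a sketch under assumptions: `make_async` is a hypothetical helper (not part of PynamoDB or aioextensions), it uses the standard library's `asyncio.to_thread` (Python 3.9+) in place of `in_thread`, and `pynamodb_func` is a stand-in for a blocking call.

```python
import asyncio
import functools


def make_async(func):
    """Return an async variant of a blocking function (hypothetical helper)."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        # Run the blocking function in a worker thread and await its result.
        return await asyncio.to_thread(func, *args, **kwargs)

    return wrapper


def pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4):
    # Stand-in for a blocking library call; it just echoes its arguments.
    return (arg_1, arg_2, kwarg_a, kwarg_b)


# The async flavor keeps the signature and name of the sync one.
async_pynamodb_func = make_async(pynamodb_func)

if __name__ == "__main__":
    print(asyncio.run(async_pynamodb_func(1, 2, kwarg_b=5)))
```

The sync function stays untouched, so users of the sync interface pay nothing for the extra flavor.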

The library also offers some nice helpers that we could find useful like workers, batching and rate limits.

I think I'm volunteering to implement the async wrappers if you think it's a nice approach, you tell me! @garrettheel

These are examples of the mentioned sub-thread wrapping:

I've personally used it in production and the benefits from concurrency are worth the small overhead it adds to every call

It's common to use a_, async_ or _async notation when both flavors are offered by a library

garrettheel · 2020-09-28T14:13:05Z

loop.run_in_executor is an interesting approach, but I have tried it before and seen performance issues in high-throughput applications. Introducing threads also introduces new and interesting failure modes that didn't exist before. I'd be concerned about going down that path, especially since the vast majority of users would still use the sync interface and pay that tax.

I've been experimenting with a different approach in #853, which could be characterized as a hackier version of the above suggestion (to the benefit of not requiring threads).

brunobelloni · 2022-11-25T16:11:43Z

This can also be done using only asyncio from the standard library, and the calling code will already be prepared for an eventual real async PynamoDB.
Works on Python 3.9+ (asyncio.to_thread was added in 3.9).

asyncio.to_thread uses ThreadPoolExecutor under the hood

import asyncio


async def main():
    # Equivalent to pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4)
    one_query = await asyncio.to_thread(pynamodb_func, arg_1, arg_2, kwarg_a=3, kwarg_b=4)

    # The same call for many argument sets, run concurrently (overlapping in time).
    # asyncio.gather takes awaitables as positional arguments, so the list is unpacked.
    many_queries = await asyncio.gather(*[
        asyncio.to_thread(pynamodb_func, arg_1, arg_2, kwarg_a=kwarg_a, kwarg_b=kwarg_b)
        for arg_1, arg_2, kwarg_a, kwarg_b in things_to_fetch  # long list of argument tuples
    ])


if __name__ == '__main__':
    asyncio.run(main())
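
asyncio.to_thread submits work to the event loop's default thread pool, so for a long list the real concurrency is capped by that pool's size. If that needs to be controlled, a dedicated pool is one option. A minimal sketch, where `blocking_fetch` and `fetch_many` are hypothetical names and the worker count of 4 is arbitrary:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_fetch(key):
    # Hypothetical stand-in for a blocking library call.
    time.sleep(0.05)
    return key * 2


async def fetch_many(keys, max_workers=4):
    loop = asyncio.get_running_loop()
    # At most max_workers calls run at once; the rest wait in the pool's queue.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [loop.run_in_executor(pool, blocking_fetch, key) for key in keys]
        # gather returns results in the order of its arguments, not of completion.
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    print(asyncio.run(fetch_many(range(10))))
```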

aaronclong · 2023-02-07T14:07:21Z

Would it be possible to create a separate async module in this library and create a similar but async api for people to use?

There are a few third-party async DynamoDB/boto3 libraries available. They could be used until Amazon finally updates boto3 to support asyncio (😔 cries from botocore maintainer).

I think this approach has a lot of benefits. PynamoDB will have a working async module when boto3 supports it, and if designed correctly, could be swapped out with these third party libs dynamically. Would the maintainer be okay with that?

aaronclong · 2023-02-07T14:16:58Z

@tasn I notice you tried to do this with threading: #968

abend-arg · 2023-02-09T15:31:57Z

I am working on a project that would benefit from adding async support to this package. We will implement our solution by basically wrapping everything you have using Gevent. Why Gevent? Because you do not need to worry about async/await syntax, and you do not need to rewrite everything as async methods.

We will probably implement this before June, so as soon as I get some results from it, I will come back with a PR implementing it.

In the meantime, I would really appreciate some feedback. To give you more context: Gevent is great, but its support for Windows, for example, is limited:

http://www.gevent.org/install.html#supported-platforms

It will probably also narrow the range of Python versions that your library currently supports.

aaronclong · 2023-02-11T15:02:26Z

@AbendGithub I think long-term async/await is the future of python, though. Gevent isn't native or widely used by most python programmers.

ikonst · 2023-02-11T18:50:58Z

We use pynamodb with gevent pretty much everywhere at Lyft without any modifications to this library (with standard gevent monkey-patching).

There's been a lot of community interest in adding an asyncio layer to this library over the years. It's not entirely trivial and will probably result in lots of duplication (seen this in redis-py) which is probably why we haven't yet.

I'd also see it as a negative testimony to the asyncio approach (aka blue/green functions), but this train left the station and most of us are invested into one of those two approaches, so I can definitely see the value in an asyncio layer.

aaronclong · 2023-02-14T01:12:41Z

Yeah, I know the blue/green function debate is quite polarizing. However, as you said, the language is natively adopting the one approach. Eventually, I feel like even boto3 will be forced to adopt asyncio.

dbfreem · 2024-09-11T22:26:08Z

Hey, just curious whether this ever gained traction. I find asyncio to be one of the easiest ways to improve I/O-bound apps.

krrishdholakia mentioned this issue Jan 9, 2024

[Feature]: Proxy - Allow users to set DynamoDB URL for Key Management BerriAI/litellm#1307

Closed
