
Consider supporting promise-based dataloaders in v3 #1391

AlexCLeduc opened this issue Nov 25, 2021 · 21 comments

@AlexCLeduc

I know graphql-core dropped support for promises, but the author seems to think promise support can be added via hooks like middleware and execution context (see the response to my identical issue in graphql-python/graphql-core#148).

Since most people using syrusakbary's promise library are probably already graphene users, if anyone is going to help make graphql-core 3 and promises play together, it makes sense for that to be done in graphene.

Why not just use async?

I think I have a decent use-case for non-async, promise-based resolution. Async is nice and all, and having a standard is great, but many of us are just using dataloaders as an execution pattern, not because we actually have async data sources. Moving everything to run in an async environment can have consequences.

We are calling the django ORM from our dataloaders. Because django 3.0 forces us to isolate ORM calls and wrap them in sync_to_async, we stuck with promise-based dataloaders for syntactic reasons. Examples below:

What we'd like to do, but django doesn't allow

class MyDataLoader(...):
    async def batch_load(self, ids):
        data_from_other_loader = await other_loader.load_many(ids)
        data_from_orm = MyModel.objects.filter(id__in=ids)  # error! can't call the django ORM from an async context
        # return processed combination of orm/loader data

What django would like us to do

class MyDataLoader(...):
    async def batch_load(self, ids):
        data_from_other_loader = await other_loader.load_many(ids)
        data_from_orm = await get_orm_data(ids)
        # return processed combination of orm/loader data

@sync_to_async
def get_orm_data(ids):
    return MyModel.objects.filter(id__in=ids)
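As an aside, the isolation django asks for can be sketched with stdlib asyncio alone. This is a toy (asyncio.to_thread instead of asgiref's sync_to_async, and a stand-in function instead of the ORM call), but the shape is the same:

```python
import asyncio

def blocking_query(ids):
    # stand-in for MyModel.objects.filter(id__in=ids); any blocking call works
    return [i * 10 for i in ids]

async def batch_load(ids):
    # run the blocking call in a worker thread so the event loop stays free,
    # similar in spirit to asgiref's sync_to_async
    return await asyncio.to_thread(blocking_query, ids)

print(asyncio.run(batch_load([1, 2, 3])))  # [10, 20, 30]
```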

What we settled on instead (use generator-syntax around promises)

class MyDataLoader(...):
    def batch_load(self, ids):
        data_from_other_loader = yield other_loader.load_many(ids)
        data_from_orm = MyModel.objects.filter(id__in=ids)
        # return processed combination of orm/loader data

A simple generator_function_to_promise helper is used as part of our dataloaders, along with a middleware that converts generators returned from resolvers into promises. I have hundreds of dataloaders following this pattern, and I don't want to be stuck isolating all the ORM calls as django recommends; that would be noisy and hurt legibility.
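For readers curious what such a helper looks like, here is a minimal sketch of the trampoline. SyncPromise is a toy stand-in for syrusakbary's Promise (it only mimics the resolve/then surface used here); the real helper would use the promise library instead:

```python
class SyncPromise:
    """Toy stand-in for promise.Promise: resolves eagerly, supports .then()."""
    def __init__(self, value):
        self.value = value

    def then(self, fn):
        result = fn(self.value)
        return result if isinstance(result, SyncPromise) else SyncPromise(result)

    @staticmethod
    def resolve(value):
        return value if isinstance(value, SyncPromise) else SyncPromise(value)


def generator_function_to_promise(gen_fn):
    """Drive a generator: each yielded promise is resolved and its value sent
    back in; the generator's return value resolves the final promise."""
    def wrapper(*args, **kwargs):
        gen = gen_fn(*args, **kwargs)

        def step(sent):
            try:
                yielded = gen.send(sent)
            except StopIteration as stop:
                return SyncPromise.resolve(stop.value)
            return SyncPromise.resolve(yielded).then(step)

        return step(None)

    return wrapper


@generator_function_to_promise
def load_batch(ids):
    doubled = yield SyncPromise([i * 2 for i in ids])
    return [d + 1 for d in doubled]

print(load_batch([1, 2]).value)  # [3, 5]
```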

It seems there may be other issues around using async dataloaders with connections. Admittedly that problem sounds more easily surmountable and wouldn't require supporting promises.

@Topher-the-Geek

Graphene's own docs still talk about using promises for DataLoaders. How does one actually implement DataLoaders in v3?

@MaehMaeh

MaehMaeh commented Feb 4, 2022

Probably not via Promise anymore. One possibility would be aiodataloader.
https://github.com/syrusakbary/aiodataloader

There is an entry about it here:
Graphene 3 dataloader #1273

Here you can see the modified documentation:
https://github.com/graphql-python/graphene/pull/1190/files/380166989d9112073c170d795801e2cd068ea5db#diff-3c9528a2b7a1be6a0549aeea89b4090c51f2ae426fd9ffb72d96112abecd02d8

If you can get a working solution, I would be very interested :)

p.s.: The manual you linked is probably still for Graphene 2!

@DenisseRR


I tested with aiodataloader as suggested. It doesn't work.

I get:

There is no current event loop in thread 'ThreadPoolExecutor-0_0'.
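For context, that error is what asyncio.get_event_loop() raises in a thread that has no event loop set; django's WSGI handling typically runs requests in worker threads, and the loader presumably looks up the current loop when scheduling its batch. A stdlib-only repro of the error (exact behavior of get_event_loop varies slightly across Python versions):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def lookup_loop():
    try:
        asyncio.get_event_loop()  # worker threads get no event loop by default
        return None
    except RuntimeError as exc:
        return str(exc)  # e.g. "There is no current event loop in thread ..."

with ThreadPoolExecutor() as pool:
    error = pool.submit(lookup_loop).result()
print(error)
```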

@superlevure

I ran into the same error as @DenisseRR. @AlexCLeduc, did you manage to make aiodataloader work with Django?

If yes, could you share an example of doing so?

@AlexCLeduc
Author

@superlevure
I'm still on graphene 2 and promise-based dataloaders, but the async dataloaders definitely work. The docs should ideally cover this; I can't think of a scenario where I'd use the GraphQLView as is, without subclassing it with a bunch of stuff.

Below, an explicit async loop is created within a synchronous view. The important parts are making sure you execute the resolvers in "async mode" and that you reuse the same dataloader instances across resolver calls. There's some extra stuff you'll almost certainly want, like logging and attaching the user to the context:

# app_name/my_graphql_view.py
import asyncio

from graphene_django.views import GraphQLView
from graphql.execution.executors.asyncio import AsyncioExecutor

from .loaders import UserLoader

class GraphQLContext:
    def __init__(self, dataloaders, user):
        self.dataloaders = dataloaders
        self.user = user

class MyGraphQLView(GraphQLView):

    def __init__(self, *args, executor=None, **kwargs):
        self.loop = asyncio.new_event_loop()
        asyncio.set_event_loop(self.loop)
        executor = AsyncioExecutor()
        super().__init__(*args, **kwargs, executor=executor)

    def execute_graphql_request(self, *args, **kwargs):
        result = super().execute_graphql_request(*args, **kwargs)
        self.loop.close()
        if result.errors:
            self.log_errors(result.errors)
        return result

    def log_errors(self, errors):
        ...  # your logging logic

    def get_dataloaders(self):
        return {
            "user_loader": UserLoader() 
        }

    def get_context(self, request):
        dataloaders = self.get_dataloaders()
        return GraphQLContext(
            user=request.user,
            dataloaders=dataloaders,
        )

Now your resolvers can access the user dataloader through context, e.g.

class Person(graphene.ObjectType):
    # ...
    @staticmethod
    def resolve_parent(parent, info):
        return info.context.dataloaders['user_loader'].load(parent.id)

    # or, using an async resolver
    @staticmethod
    async def resolve_parent_name(parent, info):
        parent = await info.context.dataloaders['user_loader'].load(parent.id)
        return parent.name

By the way, it would be nice if graphene provided an AsyncGraphqlView, like Strawberry does. Directly messing with event loops can be a little daunting.
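The loop handling in the view above boils down to this stdlib pattern (with a stand-in coroutine in place of the schema execution call):

```python
import asyncio

async def execute_async(query):
    # stand-in for the schema's async execution
    return {"data": query.upper()}

def execute_sync(query):
    # create a private loop, run the coroutine to completion, close the loop;
    # this keeps the view a plain synchronous (WSGI) view
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(execute_async(query))
    finally:
        loop.close()

print(execute_sync("query"))  # {'data': 'QUERY'}
```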

@erikwrede
Member

@AlexCLeduc w.r.t. the AsyncGraphqlView, I think this is a graphene-django issue. Maybe there is a way to upgrade graphql-server instead, supporting both sync and async views for all possible backends. But that is definitely a larger task.
I'm not using Django - how's the progress on event loops there? Can we expect event loops to be available in future versions, or will it always be necessary to create our own graphene-exclusive event loop?

@superlevure

Thank you for the explanation and the snippet @AlexCLeduc. AsyncioExecutor is no longer available in graphene 3; I'm working on porting your example to the new API and will post the result if I manage to make it work.

@AlexCLeduc
Author

@superlevure, after playing around a bit with v3, getting async resolvers/dataloaders working correctly is quite easy when calling the schema directly (e.g. schema.execute_async), but there's more work involved in getting graphene_django's GraphQLView class to play nice with that.

There is an open PR adding an AsyncGraphqlView, although that's more ambitious: it's an asgi django view, not just a plain wsgi view that uses async purely for batching purposes (like I had working in v2 above). Since I'm not particularly interested in an asgi view, I just copied graphene_django's GraphQLView.execute_graphql_request to call a wrapped version of schema.execute_async:

import asyncio

from django.db import connection, transaction
from django.http import HttpResponseBadRequest, HttpResponseNotAllowed
from graphene_django.constants import MUTATION_ERRORS_FLAG
from graphene_django.settings import graphene_settings
from graphene_django.views import GraphQLView, HttpError
from graphql import ExecutionResult, OperationType, get_operation_ast, parse, validate

# GraphQLContext and UserLoader as defined in the earlier snippet


class MyGraphQLView(GraphQLView):
    def execute_graphql_request(
        self, request, data, query, variables, operation_name, show_graphiql=False
    ):
        """
        Copied from GraphQLView, but swapping self.schema.execute for self.execute_graphql_as_sync.
        """
        if not query:
            if show_graphiql:
                return None
            raise HttpError(HttpResponseBadRequest("Must provide query string."))

        try:
            document = parse(query)
        except Exception as e:
            return ExecutionResult(errors=[e])

        if request.method.lower() == "get":
            operation_ast = get_operation_ast(document, operation_name)
            if operation_ast and operation_ast.operation != OperationType.QUERY:
                if show_graphiql:
                    return None

                raise HttpError(
                    HttpResponseNotAllowed(
                        ["POST"],
                        "Can only perform a {} operation from a POST request.".format(
                            operation_ast.operation.value
                        ),
                    )
                )

        validation_errors = validate(self.schema.graphql_schema, document)
        if validation_errors:
            return ExecutionResult(data=None, errors=validation_errors)

        try:
            extra_options = {}
            if self.execution_context_class:
                extra_options["execution_context_class"] = self.execution_context_class

            options = {
                "source": query,
                "root_value": self.get_root_value(request),
                "variable_values": variables,
                "operation_name": operation_name,
                "context_value": self.get_context(request),
                "middleware": self.get_middleware(request),
            }
            options.update(extra_options)

            operation_ast = get_operation_ast(document, operation_name)
            if (
                operation_ast
                and operation_ast.operation == OperationType.MUTATION
                and (
                    graphene_settings.ATOMIC_MUTATIONS is True
                    or connection.settings_dict.get("ATOMIC_MUTATIONS", False) is True
                )
            ):
                with transaction.atomic():
                    result = self.execute_graphql_as_sync(**options) # alex changed this line
                    if getattr(request, MUTATION_ERRORS_FLAG, False) is True:
                        transaction.set_rollback(True)
                return result

            return self.execute_graphql_as_sync(**options) # alex changed this line
        except Exception as e:
            return ExecutionResult(errors=[e])


    def execute_graphql_as_sync(self, **options):
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        options["context_value"] = GraphQLContext(
            user=self.request.user,  # self.request is set by django's View.setup()
            dataloaders=self.get_dataloaders(),  # dataloaders must be created in the loop in which they're used
        )
        try:
            result = loop.run_until_complete(self.schema.execute_async(**options))
        finally:
            loop.close()
        if result.errors:
            self.log_errors(result.errors)
        return result

    def log_errors(self, errors):
        ...  # your logging logic


    def get_dataloaders(self):
        return {
            "user_loader": UserLoader() 
        }

It seems to work, and I validated that the batching is working properly, but it would not surprise me if mutations were broken; they sometimes behave strangely when executed asynchronously.
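The batching being validated here can be illustrated with a minimal stdlib-only loader. This is a toy (not aiodataloader's implementation) that coalesces load() calls made in the same event-loop tick into a single batch call:

```python
import asyncio

class TinyLoader:
    """Toy dataloader: load() calls made in the same tick share one batch."""
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.queue = []  # pending (key, future) pairs

    def load(self, key):
        loop = asyncio.get_running_loop()
        if not self.queue:
            loop.call_soon(self.dispatch)  # flush once the current tick ends
        fut = loop.create_future()
        self.queue.append((key, fut))
        return fut

    def dispatch(self):
        pending, self.queue = self.queue, []
        values = self.batch_fn([key for key, _ in pending])
        for (_, fut), value in zip(pending, values):
            fut.set_result(value)

async def main():
    batches = []
    loader = TinyLoader(lambda keys: batches.append(keys) or [k * 2 for k in keys])
    values = await asyncio.gather(loader.load(1), loader.load(2))
    return batches, values

batches, values = asyncio.run(main())
print(batches, values)  # one batch call with both keys: [[1, 2]] [2, 4]
```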

@superlevure

Thank you @AlexCLeduc !

In the meantime I had found graphql-python/graphene-django#1256 too, and especially this repo https://github.com/fabienheureux/graphene-async (thanks to @fabienheureux), which provides a working example of aiodataloader with Django and the AsyncGraphqlView of that PR.

I could make both of your examples work (the performance gain is quite significant: I went from 1102 SQL queries to only 2 in my specific use case), but introducing asyncio in my app leads to many issues. In particular, it seems I need to wrap every call to the ORM (even the ones not part of a dataloader) inside a sync_to_async, which is not feasible in my case.

I'm considering going back to Graphene 2 and promise-based dataloaders.

@felixmeziere

This working would be life-changing for us!

@jkimbo
Copy link
Member

jkimbo commented Sep 23, 2022

So after investigating lots of other options for supporting DataLoaders with Django I've come to the conclusion that the best approach is by implementing a custom execution context class in graphql-core to handle "deferred/future" values. I've proposed this change here graphql-python/graphql-core#155 but until we can get it merged into graphql-core I've published a library to let you integrate it with Graphene (v3) now: https://github.com/jkimbo/graphql-sync-dataloaders

@superlevure @felixmeziere @AlexCLeduc I hope this is able to help you.

@AlexCLeduc
Author

Thanks @jkimbo! Will this work in a nested context, e.g. dataloaders that compose one another, or a resolver that calls multiple dataloaders? It's not obvious to me how to chain futures.

@jkimbo
Member

jkimbo commented Sep 23, 2022

@AlexCLeduc can you give an example of what you mean?

@AlexCLeduc
Author

@jkimbo Promises were chainable using the .then(handler) api. Are your future objects also chainable?

If a resolver requires 2 dataloader calls, e.g. a grandparent field that calls a parent_loader twice, using the old promise API would look something like this:

def resolve_grandparent(person, info):
    return parent_loader.load(person.id).then(lambda parent: parent_loader.load(parent.id))

@jkimbo
Member

jkimbo commented Sep 23, 2022

@AlexCLeduc ah yeah that's not possible yet but I'll have a look at adding it.

@ericls

ericls commented Oct 4, 2022

Thanks @jkimbo for graphql-sync-dataloaders.
We were having the same issue and decided to translate the default execution context to be promise-aware and promise-based. I started translating it as a fork of graphql-core a few weeks ago, but decided to publish the execution context as a separate package here: https://github.com/fellowapp/graphql-core-promise.

@AlexCLeduc
Author

@jkimbo thanks for adding chaining! I tried to use it, but I couldn't find an analog to Promise.all or DataLoader.load_many.

Thanks @ericls, your execution-context class was a drop-in replacement and works beautifully! I didn't think this was possible without modifying graphql-core.

@ericls

ericls commented Oct 5, 2022

Glad you find it useful!

That was basically a rewrite of the execution context; I'm glad graphql-core provides an API to override it. By the way, are you in Ottawa, Canada?

@felixmeziere

Thanks a lot @jkimbo !!!

@AlexCLeduc
Author

AlexCLeduc commented Oct 5, 2022

@ericls yes I'm in that Ottawa haha. Glad to hear fellowapp is still using promises, makes me feel less crazy 😄

If anyone is interested in the generator syntactic sugar I described in the OP, it's quite simple to implement. Not having to rewrite it was a motivation to keep using the old promise library; it's now its own package.

@AlexCLeduc
Author

When I opened this issue, I was unaware the execution context was capable of adding promise support. I'm happy enough with @ericls's solution, so we can close this.
