Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PRO] Connection based query splitting #4917

Closed
crpahl opened this issue Apr 17, 2024 · 3 comments
Closed

[PRO] Connection based query splitting #4917

crpahl opened this issue Apr 17, 2024 · 3 comments

Comments

@crpahl
Copy link

crpahl commented Apr 17, 2024

Is your feature request related to a problem? Please describe.

We have a public API that is rate limited. This rate limiting is painful for reporting based apps that want to query large amounts of data. They are guaranteed to be rate limited if their query has two nested connection fields. They will have to paginate until they gather all of the data.

Ultimately, this means our API is very friendly for feature focused apps that are executing mutations and dealing with small payloads, but unfriendly for reporting based apps. To allow apps to make arbitrarily large queries, we have created a system that will split queries with multiple connections into individual queries. We then query each nested query individually. For example, this query:

query MyQuery {
  invoices {
    nodes {
      id
      invoiceNet
      invoiceNumber
    }
  }
  jobs {
    nodes {
      id
    }
  }
}

is split like so:

{:base=>"query MyQuery",
 :nested=>
  [{:path=>".MyQuery",
    :paginated=>
     "query MyQuery($__appPlatformCursor: String!) {\n  invoices(after: $__appPlatformCursor, first: 50) {\n    nodes {\n      id\n      invoiceNet\n      invoiceNumber\n      __typename\n    }\n    __typename\n    __appPlatformPageInfo: pageInfo {\n      hasNextPage\n      endCursor\n    }\n  }\n}",
    :unrolled=>[]},
   {:path=>".MyQuery",
    :paginated=>
     "query MyQuery($__appPlatformCursor: String!) {\n  jobs(after: $__appPlatformCursor, first: 50) {\n    nodes {\n      id\n      __typename\n    }\n    __typename\n    __appPlatformPageInfo: pageInfo {\n      hasNextPage\n      endCursor\n    }\n  }\n}",
    :unrolled=>[]}]}

We then execute each of the separated, paginated queries and collect the results in a JSON file which the app can query for when it is completed. This will work for an arbitrarily large query that I can give you more examples of if needed!

This is a working system that we currently have. The challenge we're facing is that this has broken on the last two GraphQL Ruby upgrades we have done. The breakage is very understandable as you wouldn't be expecting us to use parts of the GraphQL::Language API that we need to here. For example, it breaks on this when attempting to upgrade our current version because of a change in this PR:

modified_node = node.merge_selection(
  {
    name: "__typename",
  }
)
Failure/Error:
 modified_node = node.merge_selection(
   {
     name: "__typename",
   }
 )

ArgumentError:
 wrong number of arguments (given 1, expected 0)

Describe the solution you'd like

It would be amazing if this was possible directly through GraphQL Ruby. For example:

YourSchema.split(query)

Which would split an arbitrarily large query with N number of nested connections similar to the way I shared above. We could then use that and not have to worry about future breakages 😄

Describe alternatives you've considered

Nothing currently. We're considering dropping support for this feature as it's become a challenge for GraphQL Ruby upgrades with only a few people here knowing how to fix it when those breakages happen.

Additional context

Thank you for taking the time to read this!

@rmosolgo
Copy link
Owner

rmosolgo commented Apr 17, 2024

Hey, thanks for the detailed writeup. I definitely agree it'd be good to find a way to make this more stable for you.

That said ... if I re-wrote the parser from scratch and the only thing that broke was a change from an options hash to "real" keyword arguments, I'd say we're doing pretty good!

Some precedent here is definition_slice, added to this gem for GraphQL-client: #241. Also related, graphql-client's InsertTypename visitor: https://github.com/github-community-projects/graphql-client/blob/8f8918657a2634610f1f1238fdefa440d6a39fec/lib/graphql/client/query_typename.rb#L39

Since I don't know of anyone manipulating queries in exactly this way, my first choice would be to understand what existing Nodes APIs you use and properly test them so that I don't accidentally break them in the future (at least, not without due heads-up in the changelog).

Alternatively, if you want to open a PR or share a gist with your implementation of .split and a couple of example inputs and outputs, I'm open to considering adding it here and maintaining it along with the rest of the gem. Even if it's not widely used, I doubt it would be much of a maintenance burden since the Nodes APIs are (usually 😅 ) stable. (I am planning to redo initialize sometime, #4854 )

Let me know what you think.

@crpahl
Copy link
Author

crpahl commented Apr 17, 2024

That said ... if I re-wrote the parser from scratch and the only thing that broke was a change from an options hash to "real" keyword arguments, I'd say we're doing pretty good!

Yeah, absolutely 😄 I can start by opening a PR that adds our functionality directly to GraphQL Ruby and go from there! That will at least give you a clearer picture of the Node APIs we're currently using.

@rmosolgo
Copy link
Owner

Thanks again for sharing that code. I added some tests for the accidentally-broken methods in 1c3af64, so I think that's all we need for now. Please let me know if you run into any more trouble with AST nodes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants