Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse & cache queries #205

Merged
merged 23 commits into from
Nov 29, 2021
Merged

Parse & cache queries #205

merged 23 commits into from
Nov 29, 2021

Conversation

cecton
Copy link
Contributor

@cecton cecton commented Nov 26, 2021

Changes

  • Removal of feature flag post-processing
  • Introduction of new object QueryCache in core crate, shamelessly copy-pasting the logic from CachingQueryPlanner (will be refactored in a future PR)
  • Move Query to its own file (query.rs)
  • Add CLI argument --query-cache-limit
  • Query parsing in parallel to plan execution (because it's in a spawn_blocking)
  • Regrets (refactor in a future PR)

Task list from #106

  • fragments can be defined in the supergraph (this is the most common case) so fragments definition should be obtained from there too, not only in the query
  • we should check the operation name, to apply the correct selection set (a query can contain multiple operations)

Superseeds #106
Closes #100
Related to #166

I don't store any object coming from apollo_parser here because they are
!Send. This prevents the whole object to be moved to another thread.
This is why I need to create my own objects (enum Selection) which won't
have this limitation.
We won't serialize or deserialize on object creation because this
process is costly. So Request's `query` field stays a simple String and
Query becomes a new object more aware of the actual content of the
query.
Maybe I should have refactor that caching first...
@cecton cecton mentioned this pull request Nov 26, 2021
6 tasks
@cecton
Copy link
Contributor Author

cecton commented Nov 26, 2021

we should check the operation name, to apply the correct selection set (a query can contain multiple operations)

@Geal I have a question on this one. Can you give me a scenario where an incorrect selection set is applied??

I'm asking because the query defines all its operations so the selection set comes from the query itself, I don't see how it can be the wrong one?

@Geal
Copy link
Contributor

Geal commented Nov 26, 2021

there can be multiple named queries in one request, and the operation field decides which one is used. I think the query planner is already validating this though, so maybe we can avoid checking it again?

@cecton cecton marked this pull request as ready for review November 26, 2021 17:39
Copy link
Contributor

@garypen garypen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. One small change I think.

I had intended to polish up the caching stuff in a later PR to make it a reusable component. I think it can be generic over keys/values and should be fairly simple to achieve.

I note in your PR comment that you talk about re-factoring this later, so I think it's fine to duplicate in this PR and then we can re-factor the caching component later.

apollo-router-core/src/query_planner/mod.rs Outdated Show resolved Hide resolved
apollo-router-core/src/query_planner/mod.rs Outdated Show resolved Hide resolved
schema: Arc<Schema>,
}

impl QueryCache {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have some basic unit tests for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose we delay that for the PR that will refactor the caching thing as this should be done only once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest then we just use a simple cache implementation without all the fancy stuff until that time.

Copy link
Contributor

@o0Ignition0o o0Ignition0o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good first step!

+1 for a more generic cache (with only one implementation)

I think we could also work on our router_factory builder to decrease the number of parameters create and recreate need, maybe as a followup ?

@cecton
Copy link
Contributor Author

cecton commented Nov 29, 2021

+1 for a more generic cache (with only one implementation)

Yes I agree. As I said I plan to do this in a later PR. Unless you think there is an argument for doing this here I will consider this out-of-scope for now.

Copy link
Contributor

@o0Ignition0o o0Ignition0o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, once we have follow up issues on having one generic cache implementation with tests, and possibly on the router_factory refactoring (router_refactoring? 😁 )

@cecton
Copy link
Contributor Author

cecton commented Nov 29, 2021

Routerifactorio

@Geal
Copy link
Contributor

Geal commented Nov 29, 2021

Routerifactorio

the router must grow

@BrynCooke
Copy link
Contributor

This got squashed in another comment, but the question still stands. Did we check for an off the shelf cache implementation? If not why not?

@Geal
Copy link
Contributor

Geal commented Nov 29, 2021

We went with lru as it's commonly used. I proposed https://github.com/al8n/stretto but it's a bit young

cf #42

@garypen
Copy link
Contributor

garypen commented Nov 29, 2021

This got squashed in another comment, but the question still stands. Did we check for an off the shelf cache implementation? If not why not?

We are using an off the shelf cache implementation, lru. We've written a small amount of code around it to optimise for our particular use case.

@BrynCooke
Copy link
Contributor

BrynCooke commented Nov 29, 2021

@cecton Has said this is going to be revisited, so I won't stand in the way of merging. Apparently it will get unit tested later.

I understand that this is a small wrapper around the commonly used LRUCache, But the fact remains that it is adding a bunch of async code, which honestly looks correct, but might not be.

I think the thing that surprises me is that there isn't a good high quality implementation that does what we want in the Rust ecosystem already, it's such a common thing to want to do.

@cecton cecton merged commit 884d3e0 into main Nov 29, 2021
@cecton cecton deleted the cecton-caching-queries branch November 29, 2021 19:40
tinnou pushed a commit to Netflix-Skunkworks/router that referenced this pull request Oct 16, 2023
added an `is_config` field to `BuildErrors` to provide more helpful
suggestions in Rover when a `supergraph.yaml` cannot be resolved.
Assumption here is that you will always have either config errors _or_
composition errors. It's unfortunate that we didn't specify those types
quite properly when this was first launched as we're stuck w/the
existing JSON structure for this kind of error (meaning we should have
had `CompositionErrors` and `ConfigErrors` instead of just
`BuildErrors`, but this is where we are).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

finish experimental response post-processing
5 participants