Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for PreparsedDocumentProvider #50

Closed
bergerandrew opened this issue Feb 9, 2021 · 20 comments
Closed

Add support for PreparsedDocumentProvider #50

bergerandrew opened this issue Feb 9, 2021 · 20 comments
Assignees

Comments

@bergerandrew
Copy link

I’ve started exploring an implementation https://github.com/bergerandrew/dgs-framework/tree/query-caching but since a PreparsedDocumentProvider can return existing entries without re-running query validation, it is incompatible with schema reloading or custom ReloadSchemaIndicator implementations.

Are there any plans to add support for query caching using PreparsedDocumentProvider?

@paulbakker
Copy link
Collaborator

This looks interesting. I don't think we've looked at PreparsedDocumentProvider before, but I'll look into that.

Since I don't know anything about PreparsedDocumentProvider yet, can you explain what the issues are with what you're trying to implement?

@bergerandrew
Copy link
Author

PreparsedDocumentProvider can be used to avoid parsing and validating the same query repeatedly. It also looks like graphql-java 16 uses it for persisted query support. The solution I have so far works as long as ReloadSchemaIndicator.reloadSchema() returns false or the default no-op PreparsedDocumentProvider is used while schema reloading is enabled.

The only issue I see with this approach would be if a schema change occurs in combination with using a ReloadSchemaIndicator that toggles the result of reloadSchema(). Since a PreparsedDocumentProvider can return cached documents without parsing and validating an ExecutionInput against the schema, it's possible to return a cached document without validating that a query is still valid for the updated schema.

It seems a bit of an unlikely edge case though since this issue would only occur if ReloadSchemaIndicator.reloadSchema() changes in a running application. But if it needs to be guarded against, the cache would need a way to be cleared if the schema is reloaded.

@paulbakker
Copy link
Collaborator

Thanks for the details! I’ll look into this in detail soon.

@golfingal72
Copy link

golfingal72 commented Mar 23, 2021

My understanding is that the PreparsedDocumentProvider avoids parsing the query over and over again... you can set it on a cache that you can control how long it lives:

        Cache<String, PreparsedDocumentEntry> cache =
                Caffeine.newBuilder().maximumSize(250)
                .expireAfterAccess(5, TimeUnit.MINUTES).recordStats().build();
        PreparsedDocumentProvider preparsedCache = new PreparsedDocumentProvider() {
            @Override
            public PreparsedDocumentEntry getDocument(
                    ExecutionInput executionInput,
                    Function<ExecutionInput, PreparsedDocumentEntry> computeFunction) {
                log.debug("Using Preparsed Document Provider from MY GRAPHQL");
                Function<String, PreparsedDocumentEntry> mapCompute =
                        key -> computeFunction.apply(executionInput);
                return cache.get(executionInput.getQuery(), mapCompute);
            }
        };

       graphQL = GraphQL.newGraphQL(graphQLSchema)
                .preparsedDocumentProvider(preparsedCache)
                .instrumentation(new GraphQLInstrumentation())
                .build();

I have found this dramatically improves query time execution, since in our application our query fields are hundreds if not thousands of bytes using complicated schema objects. I am evaluating using DGS and would really NEED this capability.

I see in the DGS codebase, you are rebuilding the GraphQL object for each execution.. the pre-parsed document cache, would have to be a shared component (bean) for it to have any effect, creating it each time from scratch would serve no purpose...

note: the caching really only works with query formulated like this:

query($cat: String, $fts: String, $offset: Int, $limit: Int) {
remoteProductSearch(cat:$cat, fts:$fts,
offset:$offset, limit: $limit) {
metadata {
.....
}
}
}
where variables are provided

@golfingal72
Copy link

golfingal72 commented Apr 2, 2021

Just as a note, your client query generators in the code-gen package do NOT create queries that could be cached, the queries created need to be of the format above where the variables are passed in. The parsed query can be cached because it does NOT have the real variables in it. The code-gen creates a query with the specific variables in the actual query.

Do you have an ETA on when this feature can be available?

@bergerandrew
Copy link
Author

The way I tried to implement this was by expecting a PreparsedDocumentProvider as a bean. However, since the ReloadSchemaIndicator is also configured as a bean, the schema has a chance to change on each execution.
What could happen with a ReloadSchemaIndicator is a client may send an old query that’s already represented in the cache but no longer valid for the new schema and cause an exception during execution instead of getting back a validation error.

For instance using graphql-dgs-example-java if hello(name: String): String is removed from the schema after first caching a query using this field:

2021-04-02 19:12:43.803 ERROR 20687 --- [nio-8080-exec-3] c.n.g.d.i.DefaultDgsQueryExecutor        : Encountered an exception while handling query {
    hello
}

graphql.AssertException: Unknown field 'hello'
	at graphql.Assert.assertTrue(Assert.java:77) ~[graphql-java-15.0.jar:na]
	at graphql.introspection.Introspection.getFieldDef(Introspection.java:593) ~[graphql-java-15.0.jar:na]
	at graphql.execution.ExecutionStrategy.getFieldDef(ExecutionStrategy.java:735) ~[graphql-java-15.0.jar:na]
	at graphql.execution.ExecutionStrategy.getFieldDef(ExecutionStrategy.java:722) ~[graphql-java-15.0.jar:na]
	at graphql.execution.ExecutionStrategy.resolveFieldWithInfo(ExecutionStrategy.java:195) ~[graphql-java-15.0.jar:na]
	at graphql.execution.AsyncExecutionStrategy.execute(AsyncExecutionStrategy.java:74) ~[graphql-java-15.0.jar:na]
	at graphql.execution.Execution.executeOperation(Execution.java:167) ~[graphql-java-15.0.jar:na]
	at graphql.execution.Execution.execute(Execution.java:108) ~[graphql-java-15.0.jar:na]
	at graphql.GraphQL.execute(GraphQL.java:598) ~[graphql-java-15.0.jar:na]
	at graphql.GraphQL.parseValidateAndExecute(GraphQL.java:529) ~[graphql-java-15.0.jar:na]
	at graphql.GraphQL.executeAsync(GraphQL.java:493) ~[graphql-java-15.0.jar:na]
	at graphql.GraphQL.execute(GraphQL.java:426) ~[graphql-java-15.0.jar:na]
	at com.netflix.graphql.dgs.internal.DefaultDgsQueryExecutor.execute(DefaultDgsQueryExecutor.kt:125) [main/:na]
	at com.netflix.graphql.dgs.internal.DefaultDgsQueryExecutor.execute(DefaultDgsQueryExecutor.kt:81) [main/:na]
	at com.netflix.graphql.dgs.mvc.DgsRestController$graphql$executionResult$1.invoke(DgsRestController.kt:139) [main/:na]
	at com.netflix.graphql.dgs.mvc.DgsRestController$graphql$executionResult$1.invoke(DgsRestController.kt:66) [main/:na]
	at com.netflix.graphql.dgs.internal.utils.TimeTracer.logTime(TimeTracer.kt:24) [main/:na]
	at com.netflix.graphql.dgs.mvc.DgsRestController.graphql(DgsRestController.kt:139) [main/:na]

Using the default no-op PreparsedDocumentProvider if schema reloading is true will make this scenario very unlikely. But to ensure that the cache doesn’t have any items that may no longer be valid, I think a few choices are either:

  1. Switch to the default no-op PreparsedDocumentProvider if schema reloading is turned on at any point
  2. Clear the cache used by the PreparsedDocumentProvider when ReloadSchemaIndicator.reloadSchema is toggled

I could spend some time to submit a PR for the first option that simply leaves it disabled and writes a warning out to the log since that should be a simple change to what I already have worked out. The other option looks a bit more complex since the state of schema reloading would need to be tracked and a way to clear a PreparsedDocumentProvider would need to be introduced.

Any other thoughts or suggestions are appreciated!

@jregistr
Copy link

jregistr commented Apr 6, 2021

👋 Hi, y'all. I'd like to contribute some thoughts to this. I have two preliminary takes on this.
I had the thought that modifying the default executor for each GraphQL Builder item we desire might be a bit much. Perhaps we can also let people do things like this by allowing them to provide a Consumer<GraphQL.Builder> and call graphQl.transform in the DefaultDgsExecutor.

DefaultDgsQueryExecutor (
...
private val graphqlTransformer: Consumer<GraphQL.Builder> = Consumer { }
)

The implementor can choose to use the private val reloadIndicator: ReloadSchemaIndicator as a dependency if they create a preparsedProvider; can drop its cache. At least that's my theory anyway.
I have a branch to experiment with this.


But on the topic of passing a PreparsedDocProvider into the DgsExecutor.
I took inspiration from the DgsSchemaProvider and made a PreparsedDocumentProvider factory.
It has a provider function in it that will be called by the default executor the same way it calls schemaProvider.schema()

fun provider(): PreparsedDocumentProvider

That would require that the executor class has a private val preparsedProvider and a factory class in its constructor.

defaultSchema: GraphQLSchema,
    defaultPreparsedProvider: PreparsedDocumentProvider,
    ...
    private val preparsedDocumentFactory: DgsPreparsedDocumentFactory,
    private val reloadIndicator: ReloadSchemaIndicator = ReloadSchemaIndicator { false }

This feels like too much to be doing just so we can make sure a reload is properly handled for a provider.
Initial branch experiment here

@golfingal72
Copy link

I like the idea of having the ability to override the Default Preparsed DocumentProvider with your own bean, if that is possible... if the ReloadSchema thing is in force, that is counter-intuitive to the PreParsed Document cache, so if it is reloading the schema, then by default, use the NoOp document provider? make it so it's either 1 OR the other??

@golfingal72
Copy link

Thanks everybody who is taking a look at this!!!

@aoudiamoncef
Copy link

Hi @jregistr

Any news about this integration ? I'm interested to contribute.

Thanks

@jregistr
Copy link

👋 Somewhat related... I'm doing some experimenting with adding APQ support currently.
Please see this discussion and This branch.

I'd greatly appreciate some feedback.

@golfingal72
Copy link

golfingal72 commented Aug 26, 2021

Hi everybody, so I was following up on progress on this ticket.

So the Graphql-java/graphql-java that has this fix is in V17.0+ release and I see no changes in the Netflix code to support this. The Graphql-java V17.+ release is a breaking change, and I attempted to bootstrap it into an existing DGS deploy. The breaking changes are to the GraphQLArgument because a new InputStateValue object got introduced. The Apollo-federation library schema printer code needs to be updated... that library is also dependent on graphql-java v 16.x, so it needs patched.. I suppose I could file a ticket to have them update their library as well (which I will do) ...graphql-java v17.0 seems to have performance improvements as well, which would be great to have

What is your plan for integration of the new version? After I manually fixed the Apollo-federation library, which was basically patching the schema printer which was easy because the graphql.SchemaPrinter.java had the right code, everything did work in a test playground, so didn't seem to be a breaking change, but would need testing.

I filed a request:
apollographql/federation-jvm#126

They already have done the work, it hasn't been merged or released:

apollographql/federation-jvm#122

Since I see Paul was commenting on the Apollo-federation ticket, it appears DGS is awaiting this as well!

@paulbakker
Copy link
Collaborator

Yes, we're waiting for the federation-lib release, which always seems to take a bit of time after a graphql-java release (folks are working on it though).

As soon as there is a release, we'll update the framework to the new version.

@golfingal72
Copy link

@paulbakker does that mean the new release will have some sort of support for the PreparsedDocumentProvider or will the new Apollo library inherently support the persisted query extension? I wasn't quite sure by reading this.

@paulbakker
Copy link
Collaborator

Sorry, my comment was just about the graphql-java upgrade, not specifically about PreparsedDocumentProvider. No updates on that front.

@paulbakker
Copy link
Collaborator

I finally added support for providing a PreparsedDocumentProvider as a bean. See PR #583

@paulbakker
Copy link
Collaborator

Merged in: PR #583

pushpagarwal added a commit to pushpagarwal/dgs-framework that referenced this issue Dec 14, 2021
…uration. (#1)

Please refer to Netflix#50. Support of PreparsedDocumentProvider was added through Netflix#583.
Above PR added it to DefaultDgsReactiveQueryExecutor but missed changes in DgsWebFluxAutoConfiguration.
In this PR, we are fixing that so that WebFlux application can define own implementation PreparsedDocumentProvider.
@pushpagarwal
Copy link
Contributor

PR #583 missed adding support of PreparsedDocumentProvider to Webflux Auto configuration.
As a result Webflux application can't use PreparsedDocumentProvider.
Raised PR #786 for fix.

@Cuica20
Copy link

Cuica20 commented Jan 28, 2022

Dont work the example:

`@Configuration
public class MyDgsConfiguration {

@Bean
public public PreparsedDocumentEntry getDocument(ExecutionInput executionInput,
                                                 Function<ExecutionInput, PreparsedDocumentEntry> parseAndValidateFunction) {
    return new CachingPreparsedDocumentProvider();
}

}
`

@pushpagarwal
Copy link
Contributor

Dont work the example:

@Configuration public class MyDgsConfiguration {

@Bean
public public PreparsedDocumentEntry getDocument(ExecutionInput executionInput,
                                                 Function<ExecutionInput, PreparsedDocumentEntry> parseAndValidateFunction) {
    return new CachingPreparsedDocumentProvider();
}


} 

Bean should be of type PreparsedDocumentProvider. Above code use PreparsedDocumentEntry.
For example refer to graphql-dgs-example-java/src/main/java/com/netflix/graphql/dgs/example/ExampleApp.java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants