Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Support for Automatic Persisted Queries (APQ) #759

Closed
zanettin opened this issue Jan 4, 2022 · 8 comments · Fixed by #794
Closed

RFC: Support for Automatic Persisted Queries (APQ) #759

zanettin opened this issue Jan 4, 2022 · 8 comments · Fixed by #794
Assignees
Labels
feature New feature or request

Comments

@zanettin
Copy link
Contributor

zanettin commented Jan 4, 2022

Description

We offer currently a persisting method thanks to @salmanm's PR. with this method we create the hashes on build time and they need to be available on the backend as well. with the automatic approach the client would hash the query on request and sends it to the server. if the server already has a cached response for this hash, it would return it. otherwise he'll send an error code after which the client has to resend the request using the query and the hash. illustrated here:
Bildschirmfoto 2022-01-04 um 10 48 55

Server Implementations with APG support

Suggested implementation

  • add a new option on request level to enable it
  • on request:
    • hash the query using sha256
    • enforce GET request
    • change request format to smth like. curl -g 'http://localhost:4000/graphql?extensions={"persistedQuery":{"version":1,"sha256Hash":"ecf4edb46db40b5132295c0291d62fb65d6759a9eedfa4d5d612dd5ec54a6b38"}}'
    • on success: nothing has to be changed
    • on error:
      • if error type PERSISTED_QUERY_NOT_FOUND: resend query using POST together with the hash
      • other errors are handled as they're currently handled

Questions

  • do we want to offer these option on the client?
  • how would we like to enable it to not conflict with the existing persisted option
@zanettin zanettin added the feature New feature or request label Jan 4, 2022
@simoneb
Copy link
Member

simoneb commented Jan 4, 2022

Hi @zanettin, before getting into the details of this request there's one thing I would like to clarify. You mention

if the server already has a cached response for this hash, it would return it

Are we talking about persisted queries as a caching mechanism or am I not understanding this right?

@zanettin
Copy link
Contributor Author

zanettin commented Jan 5, 2022

Hi @simoneb

Sorry that I was not clear about that point. It's not caching related but eliminates the step of persisting the queries at build time. As visualized on the sequence diagram above the client sends always just an sha256 hash to the graphql server. if the server knows the hash, he translates it to the real query and executes it. if the server is not aware of that hash since he never had to answer a request with that hash, he returns an error to the client. the client then has to send the query alongside with the sha256 hash. the server will execute the query and if it was successful he'll stores the hash/query map on any (shared) cache.

the cache could look like this:

{ 
   "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae":"query { foo }",
   "fcde2b2edba56bf408601fb721fe9b5c338d10ee429ea04fae5511b68fbf8fb9":"query { bar }",
   "baa5a0964d3320fbc0c6a922140453c8513ea24ab8fd0577034804a967248096":"query { baz }",
}

so on any upcoming requests with the hash baa5a0964d3320fbc0c6a922140453c8513ea24ab8fd0577034804a967248096 the server finds an entry and can execute the query query { baz }. so the result is not cached at all.

pros and cons

pro

  • no persistence needed on build time
  • the server does not need access to the query map or has to be fed with a map on startup

cons

  • not that easy to protect against malicious queries since all queries will be executed on the server and not just the ones that have been "whitelisted" based on the provided query map. imo not smth the client lib has to care about but the server

thanks for your reply and considering adding that feature 🙏

@simoneb
Copy link
Member

simoneb commented Jan 5, 2022

Thanks for clarifying. Can you explain what is the benefit of doing so? Persisted queries are (to the best of my understanding) a security mechanism which prevents clients from sending queries outside of those already known by the server.

What you're proposing basically bypasses this mechanism, so I'm not sure I understand what is the value.

@zanettin
Copy link
Contributor Author

zanettin commented Jan 5, 2022

The security aspect is one thing, yes. The others are sending less data over the wire (on the request) as well as make it easier or even possible to cache it on a CDN. Another thing is that with regularly persisted query (on build time) the client has to load the map as well to get the ID needed for the request. This slows down the client load time and increase the bundle size.

So regarding your question about the value: as developer you don't need to configure babel plugins to create your query map and upload them to your graphql server implementation. this happens all on request base.

i saw in the readme that you compare the lib with apollo in many cases and try to motivate developers to switch to this lib so this could be another selling point. i am not sure how the graphql server implementations will evolve over time and if this will be the new standard or not. at least for the above mentioned implementation it is already in place for those who want to use it 👍

@simoneb
Copy link
Member

simoneb commented Jan 6, 2022

Sounds good. As a follow-up to this issue we should consider supporting this protocol in mercurius as well.

@nuragic nuragic self-assigned this Jan 10, 2022
@nuragic
Copy link
Member

nuragic commented Jan 11, 2022

Hi @zanettin, thanks for the proposal and for your initial analysis!

Regarding other server implementations supporting persisted queries:

  • mercurius has support for both build time and run time (AKA automatic), and it's basically matching the Apollo Server implementation
  • Relay also has support but it adopts a different "protocol" let say, e.g. it uses md5 instead of sha256 for hashing, it doesn't do automatic/runtime by default

This entire mechanism is something that would need to be implemented on a middleware (like Apollo does with links) because of the custom logic* you need to configure depending on which GraphQL server you're using; however graphql-hooks still doesn't have support for middleware and I don't know if we want to add that.

*For "custom logic" I mean:

  • the hashing algorithm you want to use (sha256? md5? other?)
  • the error code that tells the client to re-send both hash AND the query (which custom code? HTTP status code?)
  • the follow up query "format" (do we need the extensions query param? other format?)

Regarding the current persisted implementation, i.e. babel plugin that compiles the code and sets a flag on each useQuery call: I don't think we can extend it to add support for this feature; It also seems that is not compatible with Apollo Server right now, because it's not using the same request format. So I believe that if we want to be aligned with Apollo and have both build time and run time (by default) persisted queries we would need to re-implement the entire persisted support from scratch using middleware (i.e. major release). I'll need to start a discussion with the team, I'll keep you informed anyway.

@nuragic
Copy link
Member

nuragic commented Jan 11, 2022

I started a public discussion on the repo about adding middleware support: #761

@nuragic
Copy link
Member

nuragic commented Jan 19, 2022

Once middleware support will ship, this feature can be worked out, i.e. blocked by #769.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants