Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apollo Server 2.0 - Caching #1163

Merged
merged 15 commits into from
Jun 15, 2018
Merged

Apollo Server 2.0 - Caching #1163

merged 15 commits into from
Jun 15, 2018

Conversation

martijnwalraven
Copy link
Contributor

This PR adds a first version of an apollo-server-caching package that will be used by the new Data Source API. It uses jest for testing, which means that its tests don't yet run as part of CI.

In addition to the new package, this PR attempts to fix types for fetch and related classes like Request and Response. Previously, the lib entry in tsconfig.json contained dom, which brings in a lot of types that don't make sense in server environments. I tried using webworker instead, but that also doesn't accurately represent the actual environments we'll be running in. In addition, the types we do use from those declarations contained various omissions and mistakes.

To fix this, I've added a top-level types directory with slimmed down declarations for fetch and related types, as well as the WHATWG URL and URLSearchParams (which are used in Data Sources). These are all available in global scope on Cloudflare Workers and fly.io. On Node, this requires polyfills that for now I've included in apollo-server-caching. We may want to move these to apollo-server-core or extract them into a separate package. I'm not sure what the best approach is, so suggestions are very welcome.

Copy link
Contributor

@evans evans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start! We could probably merge now, so that we have an api that people can start playing with. Without some documentation, I hesitate to hit the button. I'll attempt to put together some thing for your morning.

My biggest worry right now is that the caching does not occur cross-request, since a new cache is created during each request. Also I didn't see a place for the user to pass in their own cache, which will soon be possible in #1169. Completely understand the hesitancy, since the api's have been changing so much

export abstract class RESTDataSource<TContext = any> {
abstract baseURL: string;

httpCache!: HTTPCache;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally like variable scopes explicit, public in this case.

const httpCache = new HTTPCache();

for (const dataSource of Object.values(dataSources)) {
dataSource.httpCache = httpCache;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This explicit coupling of data sources to an httpCache is worrisome. Maybe a lifecycle method, such as willReceiveCache or willReceiveContext might be a bit better. In addition to the httpCache, the primitive cache should be an argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option would be to provide them as arguments to the dataSources api. Also I'm worried that these methods are called for each request, so with the current setup, it doesn't appear that there would be any caching, since the HttpCache is created each time


export type Params = { [name: string]: any };

export abstract class RESTDataSource<TContext = any> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! We'll eventually create an interface to share between DataSources. Starting with a concrete use case is definitely the way to go.

@evans
Copy link
Contributor

evans commented Jun 15, 2018

Starting to write some tests. It might be nice to have the context value be sources instead of dataSources, since I got confused by the camel case and there wasn't a warning

@evans
Copy link
Contributor

evans commented Jun 15, 2018

It appears that we require that a rest response be parsable json. If a rest endpoint returns a string, then we should treat it as such

Copy link
Member

@glasser glasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks nice!

That said, the examples so far don't really make use of the fact that each DataSource is attached to an incoming request (created anew) — not clear to me that it gives you anything you don't get just from import { id } from './datasources. Presumably there are plans here, like integrating DataLoader, tracing, auth/context, etc? Would be nice to see at least one of those things existing in practice before 2.0 release.

@@ -121,6 +122,10 @@ export class ApolloServerBase {
delete requestOptions.persistedQueries;
}

if (!requestOptions.cache) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should stick to the philosophy that all extra features can be turned off with false.

@@ -0,0 +1,114 @@
import fetch, { Request, Response, Headers } from 'node-fetch';

Object.assign(global, { fetch, Request, Response, Headers });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be conditional, or ideally not actually global? I think it would be cleaner to have code explicitly import these symbols from apollo-server-env. I know I would want to be able to find (while reading say RESTDataSource) where that URL and Request symbol come from...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been going back and forth on this, so definitely open to changing it. The reason for making them global was that this is how environments other than Node (browser, Cloudflare, fly.io) expose them.

We also have existing code (including apollo-engine-reporting) relying on the types being global (because it compiles with the dom lib), so this would require those to also import from apollo-server-env.

@@ -0,0 +1,47 @@
import { URL, URLSearchParams } from 'url';

Object.assign(global, { URL, URLSearchParams });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be conditional, or ideally not actually global?

"dependencies": {
"apollo-server-env": "2.0.0-beta.7",
"http-cache-semantics": "^4.0.0",
"lru-cache": "^4.1.3"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using quick-lru for PQs — unless there's a compelling reason we should try to use the same one for both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lru-cache supports max size based on the size of the items in it (not just the number of items), so that seemed important for a response cache.

);
}

private async fetch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this might as well be protected if you wanted to be able to call it with a calculated method

response: Response,
policy: CachePolicy,
): Promise<Response> {
if (!response.headers.has('Cache-Control') || !policy.storable())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This first clause deserves a comment. Specifically, this means we are not caching the "cacheable by default" response types like 200 based on a heuristic based on its last-modified. Seems reasonable but worth documenting.

It also means we don't support the somewhat older "expires" header, which maybe was an oversight?

}
const response = await this.httpCache.fetch(request, init);
if (response.ok) {
return response.json();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So several times while reading this code I've been confused by a fact about async functions that perhaps I should just internalize more strongly: you can return a Promise explicitly from them, or return an expression that is calling another async function, without having to explicitly await, because nested Promises are unwrapped.

As far as I understand, in an async function f1, calling another async function f2 like return f2() and return await f2() are identical. But I definitely felt odd reading this function and thinking "ok, you have to await response.text() but not response.json()? that's weird!".

Would it be reasonable to consistently use await when calling nested async functions (assuming you aren't actually trying to manipulate the Promise of course) or should I just learn the language?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mental model for this is that you're always returning a promise from an async function, and with that returning the result of another async function doesn't seem surprising. Also not entirely sure if the generated code with an explicit await is identical, although I'd expect it to be optimized away. We may want to see what conventions other people have adopted.

}
const response = await this.httpCache.fetch(request, init);
if (response.ok) {
return response.json();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also @evans has a much more concrete point: we should be able to use this method to fetch non-JSON results as well.


const cacheKey = cacheKeyFor(request);

const body = await response.text();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines assume that you are always getting text/UTF8 and not say some binary protocol.

quick-lru or keyv (I forget which one) uses https://www.npmjs.com/package/json-buffer which is nice I think.


if (policy.satisfiesWithoutRevalidation(policyRequestFrom(request))) {
const headers = policy.responseHeaders();
return new Response(body, { status: policy._status, headers });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of depending on this internal detail, just save status separately in our cache alongside policy and body. (There's another case of it being used below; I'm pretty sure that revalidatedPolicy._status is always equal to policy._status so you can just use the same saved value.)

willSendRequest?(request: Request): void;
public willSendRequest?(request: Request): void;

public willReceiveCache(httpCache: HTTPCache) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, not sure I understand the benefits of this. It will actually open up users to mistakes when they override these methods and forget to call super (or set the properties themselves).

If the idea is that we'd like to have a lifecycle method that's called when the datasource is fully initialized, we should probably define it separately and call it when both cache and context have been set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! We don't want users to mess with those "lifecycles", then we should probably have a ready or willReceiveRequest lifecycle for after the httpCache and context are set.


public willSendRequest?(request: Request): void;

public willReceiveCache(httpCache: HTTPCache) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glasser pointed out that these names aren't the best, since the data source is really just receiving the cache, so something like receiveCache would probably be better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced we need these at all, but if we do I like the consistency with React's willReceiveProps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

want to make sure #1163 (comment) doesn't get buried

@evans evans added this to the Release 2.0 milestone Jun 15, 2018
@evans evans merged commit 685d339 into version-2 Jun 15, 2018
@evans evans mentioned this pull request Jun 21, 2018
@abernix abernix deleted the server-2.0/caching branch August 10, 2018 16:47
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants