
fix: cache request cloning and request parsing #1914

Merged
merged 13 commits into main from fix-request-listener-warning on Jan 5, 2024

Conversation

mattcosta7
Contributor

@mattcosta7 mattcosta7 commented Dec 7, 2023

fixes #1911
closes #1910

I originally based this on the tests in #1910, but set the base to main so we can land it easily

This pull request focuses on improving performance by introducing caching mechanisms and removing unnecessary functionality. The changes involve GraphQLHandler.ts, RequestHandler.ts, SetupServerApi.ts, setupServer.ts, and several test files. The most significant changes are the introduction of caching for parsed GraphQL requests and cloned requests, the removal of the onRequest method from the SetupServerApi class, and the removal of the setMaxListeners functionality and its related test cases.

Performance improvements:

  • src/core/handlers/GraphQLHandler.ts: Introduced a cache for parsed GraphQL requests to avoid parsing the same request multiple times. This cache is implemented as a WeakMap where the keys are the request objects and the values are the parsed requests.
  • src/core/handlers/RequestHandler.ts: Introduced a cache for cloned requests to avoid cloning the same request multiple times. This cache is also implemented as a WeakMap where the keys are the original request objects and the values are the cloned requests.
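The WeakMap-based caching described above can be sketched roughly like this (a simplified illustration with invented names, not the actual MSW source). Because the cache is keyed by the original Request object, its entries are garbage-collected together with the request:

```typescript
// Hypothetical sketch: N handlers inspecting the same request share a single
// clone instead of each producing their own.
const clonedRequestCache = new WeakMap<Request, Request>()

function getCachedClone(request: Request): Request {
  if (!clonedRequestCache.has(request)) {
    // First access for this request: create and remember the clone.
    clonedRequestCache.set(request, request.clone())
  }
  // Subsequent accesses reuse the same clone.
  return clonedRequestCache.get(request)!
}
```

Every handler that previously called `request.clone()` directly would instead go through such an accessor, turning O(handlers) clones into one clone per request.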

Codebase simplification:

  • src/node/SetupServerApi.ts: Removed the onRequest method, which was not doing anything and was only meant to be overridden by subclasses. [1] [2]
  • src/node/setupServer.ts: Removed the setMaxListeners functionality from the SetupServerApi class, which was used to suppress memory leak warnings in Node.js. This functionality was removed because it is no longer necessary. Since we're being more memory efficient with request clones, we don't need to worry about the number of aborts

Test changes:

  • test/node/regressions/many-request-handlers-jsdom.test.ts, test/node/regressions/many-request-handlers.test.ts: Updated tests to reflect the removal of the setMaxListeners functionality and the introduction of the request cloning cache. The tests now expect each request to be cloned only once, regardless of the number of request handlers, and no longer expect setMaxListeners to be called.

@mattcosta7 mattcosta7 changed the title from "Fix request listener warning" to "fix: request listener warning (also perf win)" Dec 7, 2023
Comment on lines 207 to 216
```ts
if (!RequestHandler.#mainRequestRefCache.has(args.request)) {
  RequestHandler.#mainRequestRefCache.set(
    args.request,
    args.request.clone(),
  )
}
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
const mainRequestRef = RequestHandler.#mainRequestRefCache.get(
  args.request,
)!
```
Contributor Author

@mattcosta7 mattcosta7 Dec 7, 2023


Not the ideal way to do this; I'm only showing how we can remove the max listener warnings by removing almost all cloning. This is also a big perf win.

I think we would actually want to generate this clone before looping over the handlers.

If we implemented another style of resolution, like a trie-based path match and a non-linear match for GraphQL query names, I think we could do this here without issue, since we would only need to run the matching handlers rather than testing every handler.
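The trie idea mentioned here can be sketched as follows. This is a hypothetical illustration with invented names and shapes; MSW does not implement this, and the comment only floats it as a possible future resolution strategy:

```typescript
// Index handlers by URL path segment so lookup visits only the matching
// branch, instead of testing every registered handler linearly.
type TrieNode = { children: Map<string, TrieNode>; handlers: string[] }

const createNode = (): TrieNode => ({ children: new Map(), handlers: [] })

function insert(root: TrieNode, path: string, handlerId: string): void {
  let node = root
  for (const segment of path.split('/').filter(Boolean)) {
    if (!node.children.has(segment)) {
      node.children.set(segment, createNode())
    }
    node = node.children.get(segment)!
  }
  node.handlers.push(handlerId)
}

function match(root: TrieNode, path: string): string[] {
  let node = root
  for (const segment of path.split('/').filter(Boolean)) {
    const next = node.children.get(segment)
    // No branch for this segment: no handler can match.
    if (!next) return []
    node = next
  }
  return node.handlers
}
```

With such an index, the expensive per-handler work (cloning, body parsing) would only ever run for the handlers returned by `match`.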

```ts
return undefined
},
if (!GraphQLHandler.#parseGraphQLRequestCache.has(args.request)) {
  GraphQLHandler.#parseGraphQLRequestCache.set(
```
Contributor Author

Not the best place to do this, since we don't pass this along to onUnhandledRequest well, but it gets rid of the O(n) cloning at least.

Member

I don't think onUnhandledRequest reads the request body to require a clone, does it? Maybe we shouldn't concern ourselves about it. If the consumer wants to read the request/response body, they can clone it in the onUnhandledRequest callback. Unless there's a timing issue, of course, and that clone should've been made prior to that callback being called.

Contributor Author

> I don't think onUnhandledRequest reads the request body to require a clone, does it?

Yep! It does this to determine which handlers might have matched:

```ts
const parsedGraphQLQuery = await parseGraphQLRequest(request).catch(
  () => null,
)
```

It's not a big deal, as this is only one extra clone per request cycle, but we will potentially have already done this parsing once, so it's also possibly a wasted clone. A cached parseGraphQLRequest would avoid that.
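The cached parseGraphQLRequest being discussed can be sketched like this (a hypothetical illustration; the stand-in parser below is invented and much simpler than the real one). Caching the parse promise, rather than the result, also means concurrent callers share a single in-flight parse:

```typescript
type ParsedGraphQLRequest = { query: string } | null

const parseCache = new WeakMap<Request, Promise<ParsedGraphQLRequest>>()

// Stand-in for the real parser: it clones the request (so the original body
// stream stays readable) and parses the clone's body.
async function parseGraphQLRequest(
  request: Request,
): Promise<ParsedGraphQLRequest> {
  const body = await request.clone().json().catch(() => null)
  return body && typeof body.query === 'string' ? { query: body.query } : null
}

function parseGraphQLRequestCached(
  request: Request,
): Promise<ParsedGraphQLRequest> {
  if (!parseCache.has(request)) {
    parseCache.set(request, parseGraphQLRequest(request))
  }
  return parseCache.get(request)!
}
```

Both callsites (the per-handler check and the onUnhandledRequest fallback) would then pay for at most one clone and one parse per request.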

```ts
import { SetupServer } from './glossary'
import { isNodeExceptionLike } from './utils/isNodeExceptionLike'

class SetupServerApi extends BaseSetupServerApi implements SetupServer {
```
Contributor Author
we no longer need this hack, because we aren't exposing O(handlers * c) request clones

Member
I am very much happy to get rid of this.

```ts
console.error(error)
return undefined
},
if (!GraphQLHandler.#parseGraphQLRequestCache.has(args.request)) {
```
Member

@kettanaito kettanaito Jan 3, 2024

I think caching the request parsing result is something all request handlers would want to benefit from. Does it make sense to move this logic to the RequestHandler base class so the child classes don't have to manage this manually?

The cache can also be a static Map, achieving caching across individual request handler instances.
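The suggestion above could look roughly like this (a hypothetical sketch with invented names, not MSW's actual API). Since different handler classes may parse the same request differently, the sketch keys the static cache by request and then by handler class:

```typescript
abstract class RequestHandler<Parsed> {
  // Shared across all handler instances and subclasses.
  static #parseCache = new WeakMap<Request, Map<Function, unknown>>()

  protected abstract parse(request: Request): Promise<Parsed>

  // Child classes call this instead of parse(); the first call per request
  // (per handler class) populates the cache, later calls reuse it.
  protected async parseCached(request: Request): Promise<Parsed> {
    let perClass = RequestHandler.#parseCache.get(request)
    if (!perClass) {
      perClass = new Map()
      RequestHandler.#parseCache.set(request, perClass)
    }
    if (!perClass.has(this.constructor)) {
      perClass.set(this.constructor, await this.parse(request))
    }
    return perClass.get(this.constructor) as Parsed
  }
}

// Example subclass that counts how often its expensive parse actually runs.
class CountingHandler extends RequestHandler<string> {
  static parseCount = 0
  protected async parse(request: Request): Promise<string> {
    CountingHandler.parseCount++
    return new URL(request.url).pathname
  }
  run(request: Request): Promise<string> {
    return this.parseCached(request)
  }
}
```

With this shape, individual handlers stay unaware of the caching, which is the ergonomic benefit being discussed.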

Contributor Author

I think we should instead pass the request clone to the handlers from higher up the tree, but I wasn't sure if that's a reasonable option, since it changes the args to handler.run() slightly? This was mostly to show how many fewer clones can be made with a few changes

Member

What would be the main reason behind passing it from the higher loop? The simplified implementation (we can drop this static cache)?

I think it's a good approach to think of request handlers as standalone primitives. There may not be a loop iterating over them, and request handlers should still make sense. If the implementation cost is the only concern, I'm fine with it. We can move it to the RequestHandler base class as much as possible, making individual handlers unaware of the internal caching.

Contributor Author

Actually, a few things. I was thinking of the other cache point at first.

This GraphQL parsing has two callsites: once per handler, and once again when no handler is matched.

Ideally, we'd only parse once per request and not in both cases. It might be more reasonable to keep a version of parseGraphQLRequest internal to msw that caches results (we'd probably expose the version that doesn't cache, but could expose a caching version).

I'd like to have parsing be cached by default, but I haven't quite figured out the right API to make that ergonomic as a general feature for RequestHandler, because there are two separate types of request parsing happening: one that uses handler-specific info (the endpoint) and one that doesn't (the request body).

In the interim, handling something along these lines is probably good enough, and keeping these caches private should allow us some internal freedom to optimize/improve over time before exposing something more general?

Contributor Author

We could break parsing into parse-endpoint and parse-body calls, combining their outputs into parse, but this expands the surface area in a way we might not want to?

Only body parsing would likely be cached in that split.
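That split could be sketched like this (a hypothetical illustration with invented names; not a proposed MSW API). The endpoint check uses handler-specific info and is cheap, so it runs per handler uncached; body parsing is handler-agnostic and expensive, so only it is cached per request:

```typescript
const bodyCache = new WeakMap<Request, Promise<unknown>>()

// Handler-agnostic and expensive: cached per request.
function parseBodyCached(request: Request): Promise<unknown> {
  if (!bodyCache.has(request)) {
    bodyCache.set(request, request.clone().json().catch(() => null))
  }
  return bodyCache.get(request)!
}

// Handler-specific (uses the handler's path) and cheap: not cached.
function matchesEndpoint(request: Request, path: string): boolean {
  return new URL(request.url).pathname === path
}

// The combined parse a handler would actually call.
async function parse(request: Request, path: string) {
  return {
    matches: matchesEndpoint(request, path),
    body: await parseBodyCached(request),
  }
}
```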

Member

Agreed. Let's keep the changes surface to a minimum. This is already a grand improvement as it is.

Member

@kettanaito kettanaito left a comment

This is phenomenal work. I love it. Left a few comments to discuss whenever you've got a minute. Thank you!

@kettanaito kettanaito changed the title from "fix: request listener warning (also perf win)" to "fix: cache request cloning and request parsing" Jan 3, 2024
@mattcosta7 mattcosta7 marked this pull request as ready for review January 4, 2024 15:35
@mattcosta7
Contributor Author

@kettanaito let me know what you think about landing this (or some version of it) as is, and then considering how to be a bit more generic on the cache side. I'm still not sure how easily a shared cache will function, and this at least gives a significant benefit without exposing any new APIs.

@mattcosta7 mattcosta7 changed the base branch from repro-request-listener-warning to main January 4, 2024 21:53
@kettanaito
Member

@mattcosta7, let me clone this and look around; if I don't find anything to suggest, we will merge this as it is. This is a perfect beginning to addressing the problem.

@kettanaito
Member

kettanaito commented Jan 5, 2024

I'm getting a consistent test failure after I merged a few stylistic changes:

```
 FAIL  test/node/regressions/many-request-handlers-jsdom.test.ts > graphql handlers > does not print a memory leak warning for onUnhandledRequest
TypeError: The "emitter" argument must be an instance of EventEmitter or EventTarget. Received an instance of AbortSignal
 ❯ test/node/regressions/many-request-handlers-jsdom.test.ts:114:31
    112|
    113|   it('does not print a memory leak warning for onUnhandledRequest', async () => {
    114|     const unhandledResponse = await fetch(httpServer.http.url('/graphql'), {
       |                               ^
    115|       method: 'POST',
    116|       headers: {
```

I must have screwed something up. Looking into it.

Edit: Seems to be Node.js-related. Fails on v18.14, passes on v18.18.

Member

@kettanaito kettanaito left a comment

Simply fantastic. Let's have this released.

@kettanaito kettanaito merged commit a79a9d7 into main Jan 5, 2024
10 checks passed
@kettanaito kettanaito deleted the fix-request-listener-warning branch January 5, 2024 11:59
@kettanaito
Member

Released: v2.0.12 🎉

This has been released in v2.0.12!

Make sure to always update to the latest version (npm i msw@latest) to get the newest features and bug fixes.


Predictable release automation by @ossjs/release.


Successfully merging this pull request may close these issues.

MaxListenersExceededWarning in msw/node