
fix: cache request cloning and request parsing #1914

Merged
merged 13 commits into main from fix-request-listener-warning on Jan 5, 2024

Conversation

mattcosta7
Contributor

@mattcosta7 mattcosta7 commented Dec 7, 2023

fixes #1911
closes #1910

I originally based this on the tests in #1910, but set the base to main so we can land it easily

This pull request focuses on improving performance by introducing caching mechanisms and removing unnecessary functionality. The changes involve GraphQLHandler.ts, RequestHandler.ts, SetupServerApi.ts, setupServer.ts, and several test files. The most significant changes are the introduction of caching for parsed GraphQL requests and cloned requests, the removal of the onRequest method from the SetupServerApi class, and the removal of the setMaxListeners functionality and its related test cases.

Performance improvements:

  • src/core/handlers/GraphQLHandler.ts: Introduced a cache for parsed GraphQL requests to avoid parsing the same request multiple times. This cache is implemented as a WeakMap where the keys are the request objects and the values are the parsed requests.
  • src/core/handlers/RequestHandler.ts: Introduced a cache for cloned requests to avoid cloning the same request multiple times. This cache is also implemented as a WeakMap where the keys are the original request objects and the values are the cloned requests.
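The WeakMap-based caching described above can be sketched roughly like this (a simplified illustration with invented names, not the actual MSW source). Because the cache is keyed by the original Request object, its entries are garbage-collected together with the request:

```typescript
// Hypothetical sketch: N handlers inspecting the same request share a single
// clone instead of each producing their own.
const clonedRequestCache = new WeakMap<Request, Request>()

function getCachedClone(request: Request): Request {
  if (!clonedRequestCache.has(request)) {
    // First access for this request: create and remember the clone.
    clonedRequestCache.set(request, request.clone())
  }
  // Subsequent accesses reuse the same clone.
  return clonedRequestCache.get(request)!
}
```

Every handler that previously called `request.clone()` directly would instead go through such an accessor, turning O(handlers) clones into one clone per request.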

Codebase simplification:

  • src/node/SetupServerApi.ts: Removed the onRequest method, which was not doing anything and was only meant to be overridden by subclasses. [1] [2]
  • src/node/setupServer.ts: Removed the setMaxListeners functionality from the SetupServerApi class, which was used to suppress memory leak warnings in Node.js. This functionality was removed because it is no longer necessary. Since we're being more memory efficient with request clones, we don't need to worry about the number of aborts

Test changes:

  • test/node/regressions/many-request-handlers-jsdom.test.ts, test/node/regressions/many-request-handlers.test.ts: Updated tests to reflect the removal of the setMaxListeners functionality and the introduction of the request cloning cache. The tests now expect each request to be cloned only once, regardless of the number of request handlers, and no longer expect setMaxListeners to be called.

@mattcosta7 mattcosta7 changed the title from "Fix request listener warning" to "fix: request listener warning (also perf win)" Dec 7, 2023
Comment on lines 207 to 216
```ts
if (!RequestHandler.#mainRequestRefCache.has(args.request)) {
  RequestHandler.#mainRequestRefCache.set(
    args.request,
    args.request.clone(),
  )
}
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
const mainRequestRef = RequestHandler.#mainRequestRefCache.get(
  args.request,
)!
```
Contributor Author

@mattcosta7 mattcosta7 Dec 7, 2023


Not the ideal way to do this; I'm only showing how we can remove the max listener warnings by removing almost all cloning. This is also a big perf win.

I think we would actually want to generate this clone before looping over the handlers.

If we implemented another style of resolution, like a trie-based path match and a non-linear match for GraphQL query names, I think we could do this here without issue, since we would only need to run the matching handlers rather than testing every handler.
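The trie idea mentioned here can be sketched as follows. This is a hypothetical illustration with invented names and shapes; MSW does not implement this, and the comment only floats it as a possible future resolution strategy:

```typescript
// Index handlers by URL path segment so lookup visits only the matching
// branch, instead of testing every registered handler linearly.
type TrieNode = { children: Map<string, TrieNode>; handlers: string[] }

const createNode = (): TrieNode => ({ children: new Map(), handlers: [] })

function insert(root: TrieNode, path: string, handlerId: string): void {
  let node = root
  for (const segment of path.split('/').filter(Boolean)) {
    if (!node.children.has(segment)) {
      node.children.set(segment, createNode())
    }
    node = node.children.get(segment)!
  }
  node.handlers.push(handlerId)
}

function match(root: TrieNode, path: string): string[] {
  let node = root
  for (const segment of path.split('/').filter(Boolean)) {
    const next = node.children.get(segment)
    // No branch for this segment: no handler can match.
    if (!next) return []
    node = next
  }
  return node.handlers
}
```

With such an index, the expensive per-handler work (cloning, body parsing) would only ever run for the handlers returned by `match`.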

```ts
return undefined
},
if (!GraphQLHandler.#parseGraphQLRequestCache.has(args.request)) {
  GraphQLHandler.#parseGraphQLRequestCache.set(
```
Contributor Author

Not the best place to do this, since we don't pass this along to onUnhandledRequest well, but it gets rid of the O(n) cloning at least.

Member

I don't think onUnhandledRequest reads the request body to require a clone, does it? Maybe we shouldn't concern ourselves about it. If the consumer wants to read the request/response body, they can clone it in the onUnhandledRequest callback. Unless there's a timing issue, of course, and that clone should've been made prior to that callback being called.

Contributor Author

> I don't think onUnhandledRequest reads the request body to require a clone, does it?

Yep! It does this to determine which handlers might have matched:

```ts
const parsedGraphQLQuery = await parseGraphQLRequest(request).catch(
  () => null,
)
```

It's not a big deal, as this is only one extra clone per request cycle, but we will potentially have already done this parsing once, so it's also possibly a wasted clone. A cached parseGraphQLRequest would avoid that.
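The cached parseGraphQLRequest being discussed can be sketched like this (a hypothetical illustration; the stand-in parser below is invented and much simpler than the real one). Caching the parse promise, rather than the result, also means concurrent callers share a single in-flight parse:

```typescript
type ParsedGraphQLRequest = { query: string } | null

const parseCache = new WeakMap<Request, Promise<ParsedGraphQLRequest>>()

// Stand-in for the real parser: it clones the request (so the original body
// stream stays readable) and parses the clone's body.
async function parseGraphQLRequest(
  request: Request,
): Promise<ParsedGraphQLRequest> {
  const body = await request.clone().json().catch(() => null)
  return body && typeof body.query === 'string' ? { query: body.query } : null
}

function parseGraphQLRequestCached(
  request: Request,
): Promise<ParsedGraphQLRequest> {
  if (!parseCache.has(request)) {
    parseCache.set(request, parseGraphQLRequest(request))
  }
  return parseCache.get(request)!
}
```

Both callsites (the per-handler check and the onUnhandledRequest fallback) would then pay for at most one clone and one parse per request.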

```ts
import { SetupServer } from './glossary'
import { isNodeExceptionLike } from './utils/isNodeExceptionLike'

class SetupServerApi extends BaseSetupServerApi implements SetupServer {
```
Contributor Author
we no longer need this hack, because we aren't exposing O(handlers * c) request clones

Member
I am very much happy to get rid of this.

```ts
console.error(error)
return undefined
},
if (!GraphQLHandler.#parseGraphQLRequestCache.has(args.request)) {
```
Member

@kettanaito kettanaito Jan 3, 2024

I think caching the request parsing result is something all request handlers would want to benefit from. Does it make sense to move this logic to the RequestHandler base class so the child classes don't have to manage this manually?

The cache can also be a static Map, achieving caching across individual request handler instances.
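The suggestion above could look roughly like this (a hypothetical sketch with invented names, not MSW's actual API). Since different handler classes may parse the same request differently, the sketch keys the static cache by request and then by handler class:

```typescript
abstract class RequestHandler<Parsed> {
  // Shared across all handler instances and subclasses.
  static #parseCache = new WeakMap<Request, Map<Function, unknown>>()

  protected abstract parse(request: Request): Promise<Parsed>

  // Child classes call this instead of parse(); the first call per request
  // (per handler class) populates the cache, later calls reuse it.
  protected async parseCached(request: Request): Promise<Parsed> {
    let perClass = RequestHandler.#parseCache.get(request)
    if (!perClass) {
      perClass = new Map()
      RequestHandler.#parseCache.set(request, perClass)
    }
    if (!perClass.has(this.constructor)) {
      perClass.set(this.constructor, await this.parse(request))
    }
    return perClass.get(this.constructor) as Parsed
  }
}

// Example subclass that counts how often its expensive parse actually runs.
class CountingHandler extends RequestHandler<string> {
  static parseCount = 0
  protected async parse(request: Request): Promise<string> {
    CountingHandler.parseCount++
    return new URL(request.url).pathname
  }
  run(request: Request): Promise<string> {
    return this.parseCached(request)
  }
}
```

With this shape, individual handlers stay unaware of the caching, which is the ergonomic benefit being discussed.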

Contributor Author

I think we should instead pass the request clone to the handlers from higher up the tree, but I wasn't sure if that's a reasonable option, since it changes the args to handler.run() slightly? This was mostly to show how many fewer clones can be made with a few changes

Member

What would be the main reason behind passing it from the higher loop? The simplified implementation (we can drop this static cache)?

I think it's a good approach to think of request handlers as standalone primitives. There may not be a loop iterating over them, and request handlers should still make sense. If the implementation cost is the only concern, I'm fine with it. We can move it to the RequestHandler base class as much as possible, making individual handlers unaware of the internal caching.

Contributor Author

Actually, a few things. I was thinking of the other cache point at first.

This GraphQL parsing has two callsites: once per handler, and once again when no handler is matched.

Ideally, we'd only parse once per request and not in both cases. It might be more reasonable to keep a version of parseGraphQLRequest internal to msw that caches results (we'd probably expose the version that doesn't cache, but could expose a caching version).

I'd like to have parsing be cached by default, but I haven't quite figured out the right API to make that ergonomic as a general feature for RequestHandler, because there are two separate types of request parsing happening: one that uses handler-specific info (the endpoint) and one that doesn't (the request body).

In the interim, handling something along these lines is probably good enough, and keeping these caches private should allow us some internal freedom to optimize/improve over time before exposing something more general?

Contributor Author

We could break parsing into parse-endpoint and parse-body calls, combining their outputs into parse, but this expands the surface area in a way we might not want to?

Only body parsing would likely be cached in that split.
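That split could be sketched like this (a hypothetical illustration with invented names; not a proposed MSW API). The endpoint check uses handler-specific info and is cheap, so it runs per handler uncached; body parsing is handler-agnostic and expensive, so only it is cached per request:

```typescript
const bodyCache = new WeakMap<Request, Promise<unknown>>()

// Handler-agnostic and expensive: cached per request.
function parseBodyCached(request: Request): Promise<unknown> {
  if (!bodyCache.has(request)) {
    bodyCache.set(request, request.clone().json().catch(() => null))
  }
  return bodyCache.get(request)!
}

// Handler-specific (uses the handler's path) and cheap: not cached.
function matchesEndpoint(request: Request, path: string): boolean {
  return new URL(request.url).pathname === path
}

// The combined parse a handler would actually call.
async function parse(request: Request, path: string) {
  return {
    matches: matchesEndpoint(request, path),
    body: await parseBodyCached(request),
  }
}
```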

Member

Agreed. Let's keep the changes surface to a minimum. This is already a grand improvement as it is.

Member

@kettanaito kettanaito left a comment

This is phenomenal work. I love it. Left a few comments to discuss whenever you've got a minute. Thank you!

@kettanaito kettanaito changed the title from "fix: request listener warning (also perf win)" to "fix: cache request cloning and request parsing" Jan 3, 2024
@mattcosta7 mattcosta7 marked this pull request as ready for review January 4, 2024 15:35
@mattcosta7
Contributor Author

@kettanaito let me know what you think about landing this (or some version of it) as is, and then considering how to be a bit more generic on the cache side. I'm still not sure how easily a shared cache will function, and this at least gives a significant benefit without exposing any new APIs.

@mattcosta7 mattcosta7 changed the base branch from repro-request-listener-warning to main January 4, 2024 21:53
@kettanaito
Member

@mattcosta7, let me clone this and look around; if I don't find anything to suggest, we will merge this as it is. This is a perfect beginning to addressing the problem.

@kettanaito
Member

kettanaito commented Jan 5, 2024

I'm getting a consistent test failure after I merged a few stylistic changes:

```
 FAIL  test/node/regressions/many-request-handlers-jsdom.test.ts > graphql handlers > does not print a memory leak warning for onUnhandledRequest
TypeError: The "emitter" argument must be an instance of EventEmitter or EventTarget. Received an instance of AbortSignal
 ❯ test/node/regressions/many-request-handlers-jsdom.test.ts:114:31
    112|
    113|   it('does not print a memory leak warning for onUnhandledRequest', async () => {
    114|     const unhandledResponse = await fetch(httpServer.http.url('/graphql'), {
       |                               ^
    115|       method: 'POST',
    116|       headers: {
```

I must have screwed something up. Looking into it.

Edit: Seems to be Node.js-related. Fails on v18.14, passes on v18.18.

Member

@kettanaito kettanaito left a comment

Simply fantastic. Let's have this released.

@kettanaito kettanaito merged commit a79a9d7 into main Jan 5, 2024
10 checks passed
@kettanaito kettanaito deleted the fix-request-listener-warning branch January 5, 2024 11:59
@kettanaito
Member

Released: v2.0.12 🎉

This has been released in v2.0.12!

Make sure to always update to the latest version (npm i msw@latest) to get the newest features and bug fixes.


Predictable release automation by @ossjs/release.


Successfully merging this pull request may close these issues.

MaxListenersExceededWarning in msw/node