Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue with URL.searchParams.append for large numbers of params #51518

Closed
MattIPv4 opened this issue Jan 19, 2024 · 3 comments · Fixed by #51520
Closed

Performance issue with URL.searchParams.append for large numbers of params #51518

MattIPv4 opened this issue Jan 19, 2024 · 3 comments · Fixed by #51520
Labels
performance Issues and PRs related to the performance of Node.js.

Comments

@MattIPv4
Copy link
Member

MattIPv4 commented Jan 19, 2024

Version

21.6.0

Platform

Darwin [snip] 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64

Subsystem

No response

What steps will reproduce the bug?

Using URL.searchParams.append:

time node -e "const url = new URL('https://example.com'); console.log(new Date()); Array.from('a'.repeat(50000)).map(a => url.searchParams.append('test', a)); console.log(new Date()); console.log(url.toString().length); console.log(new Date());"

2024-01-19T01:31:47.867Z
2024-01-19T01:34:12.696Z
350020
2024-01-19T01:34:12.697Z
node -e   140.94s user 14.34s system 106% cpu 2:25.17 total

vs. using URLSearchParams.append:

time node -e "const params = new URLSearchParams(); console.log(new Date()); Array.from('a'.repeat(50000)).map(a => params.append('test', a)); console.log(new Date()); console.log(('https://example.com/?' + params.toString()).length); console.log(new Date());"

2024-01-19T01:32:25.111Z
2024-01-19T01:32:25.120Z
350020
2024-01-19T01:32:25.127Z
node -e   0.04s user 0.01s system 79% cpu 0.066 total

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior? Why is that the expected behavior?

No response

What do you see instead?

The URL.searchParams.append snippet with 50,000 params being append takes 2-3 minutes to run on an M1, which is a ridiculous amount of time. The equivalent snippet with URLSearchParams.append takes 50-100 milliseconds to run.

Additional information

I talked through this performance issue with Yagiz on Twitter and got to what appears to be the cause of the issue, every time URL.searchParams.append is called, the whole URL essentially gets stringified to update properties on URL itself: https://twitter.com/yagiznizipli/status/1748151781534650545

I believe it to be this call that is the cause:

node/lib/internal/url.js

Lines 478 to 480 in eb4432c

if (this.#context) {
this.#context.search = this.toString();
}

With #context being set here:

node/lib/internal/url.js

Lines 1023 to 1024 in eb4432c

this.#searchParams = new URLSearchParams(this.search);
setURLSearchParamsContext(this.#searchParams, this);

And the #context.search setter being here:

node/lib/internal/url.js

Lines 1012 to 1017 in eb4432c

set search(value) {
const href = bindingUrl.update(this.#context.href, updateActions.kSearch, StringPrototypeToWellFormed(`${value}`));
if (href) {
this.#updateContext(href);
}
}

I'm unsure whether the performance issue is with the toString call in URLSearchParams, or whether it is within the logic of the #context.search setter. I've not worked with Node.js source before, so not sure where to start on profiling that, or submitting a fix (hence opening this issue to track it for anyone to grab).

I suspect that an "easy" fix either way is to update URLSearchParams such that it doesn't write back to URL, and instead update the relevant bits of URL (href, search, etc.) to essentially call the logic that is currently in the #context.search setter on-demand as getters when needed, if URLSearchParams has been created for the URL.

@anonrig anonrig added the performance Issues and PRs related to the performance of Node.js. label Jan 19, 2024
@MattIPv4
Copy link
Member Author

(unrelated to Node.js, but bonus points to anyone that feels like surfacing a similar fix with browser engines, as they seem to suffer from the same bottleneck)

@MattIPv4
Copy link
Member Author

MattIPv4 commented Jan 19, 2024

As I understand it currently, URL stores its entire value in #context.href as a single string (and associated pointers for search etc.). When instantiated, URL does not have an associated URLSearchParams. If, and only if, a user accesses searchParams on URL, then a URLSearchParams is instantiated for the URL. However, URL still relies on its own #context.href value always, and so the data in URLSearchParams is essentially a duplicate, and any update to it needs to call the search setter on URL so that #context.href can be updated properly.

Thinking through this, I think the approach to fixing this might be:

  • Remove the URL #context from URLSearchParams, so it is a properly independent data store
  • Update the search + href getters in URL to interface with searchParams if it exists
  • Update the search + href setters in URL to interface with searchParams if it exists, and not pass any search to the internal #context.href, so it is just stored in searchParams (or, rather, update #updateContext to slice the returned href and not store the search index when searchParams exists)
  • Update searchParams so that when it instantiates URLSeachParams, it also removes any search from the internal #context.href, so it is just stored in searchParams (probably via the same #updateContext modification)

This does mean that the internal #context.href value of URL will not contain the search string once searchParams has been accessed, but I think that is probably the best thing to do so there is no risk of future confusion with the state of that being potentially out of sync with searchParams.

@benjamingr
Copy link
Member

@anonrig @nodejs/performance

nodejs-github-bot pushed a commit that referenced this issue Jan 24, 2024
PR-URL: #51520
Fixes: #51518
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MattIPv4 added a commit to MattIPv4/node that referenced this issue Jan 24, 2024
PR-URL: nodejs#51520
Fixes: nodejs#51518
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MattIPv4 added a commit to MattIPv4/node that referenced this issue Jan 24, 2024
PR-URL: nodejs#51520
Fixes: nodejs#51518
Backport-PR-URL: nodejs#51559
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
marco-ippolito pushed a commit to marco-ippolito/node that referenced this issue Feb 2, 2024
PR-URL: nodejs#51520
Fixes: nodejs#51518
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
rdw-msft pushed a commit to rdw-msft/node that referenced this issue Feb 9, 2024
PR-URL: nodejs#51520
Fixes: nodejs#51518
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
targos pushed a commit that referenced this issue Feb 15, 2024
PR-URL: #51520
Fixes: #51518
Backport-PR-URL: #51559
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
marco-ippolito pushed a commit to marco-ippolito/node that referenced this issue Feb 19, 2024
PR-URL: nodejs#51520
Fixes: nodejs#51518
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
richardlau pushed a commit that referenced this issue Mar 25, 2024
PR-URL: #51520
Fixes: #51518
Backport-PR-URL: #51559
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
richardlau pushed a commit that referenced this issue Mar 25, 2024
PR-URL: #51520
Fixes: #51518
Backport-PR-URL: #51559
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues and PRs related to the performance of Node.js.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants