feat(util): use fast-querystring for building url #1673

anonrig · 2022-09-29T21:44:27Z

Solves #1672.

Important Note: Even though none of the benchmarks focus on query-string building (can test it with just adding new Error('hello') inside buildURL function), the results of the benchmarks show significant degregation in ops/sec. My educated guess would be: Due to requiring & resolving a new package, and how cronometro creates an isolated v8 environment every time a benchmark is run, the require is never cached, and therefore the amount of time requiring the fast-querystring package is also included in the benchmark and as a result creates a false outcome.

anonrig · 2022-09-29T22:11:47Z

Destructuring fast-querystring and calling it using const { stringify } = require('fast-querystring') improved the benchmarks.

codecov-commenter · 2022-09-29T22:16:03Z

Codecov Report

Base: 94.74% // Head: 94.73% // Decreases project coverage by -0.01% ⚠️

Coverage data is based on head (f9971e7) compared to base (7c93a8d).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1673      +/-   ##
==========================================
- Coverage   94.74%   94.73%   -0.02%     
==========================================
  Files          53       53              
  Lines        4892     4877      -15     
==========================================
- Hits         4635     4620      -15     
  Misses        257      257

Impacted Files	Coverage Δ
lib/core/util.js	`97.90% <100.00%> (-0.20%)`	⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

test/client.js

lib/core/util.js

ronag · 2022-09-30T08:34:01Z

@mcollina wdyt? Another external dependency?

ronag

Do the changed tests indicate breaking changes?

kibertoad · 2022-09-30T14:14:02Z

@ronag Kinda the opposite of that, it is now less strict and accepts wider range of params. When queryParams were first implemented, I remember we already talked how we may potentially revise restrictions to be more lenient in the future, so I guess now's that moment :D

test/client.js

KhafraDev · 2022-09-30T17:05:58Z

There are very minor differences between undici's current implementation and fast-querystring. These include:

Using Object.keys instead of Object.entries
Using a traditional for loop instead of a for..of
Joining each part in a string, not an array.

Implementing these minor changes, the performance of the current impl. is faster/on par with fast-querystring.

diff --git a/lib/core/util.js b/lib/core/util.js
index e9a8384..d3fe468 100644
--- a/lib/core/util.js
+++ b/lib/core/util.js
@@ -44,9 +44,13 @@ function buildURL (url, queryParams) {
     throw new Error('Query params must be an object')
   }

-  const parts = []
-  for (let [key, val] of Object.entries(queryParams)) {
-    if (val === null || typeof val === 'undefined') {
+  let parts = ''
+  const keys = Object.keys(queryParams)
+  for (let i = 0; i < keys.length; i++) {
+    let key = keys[i] // eslint-disable-line prefer-const
+    let val = queryParams[key]
+
+    if (val == null) {
       continue
     }

@@ -54,18 +58,19 @@ function buildURL (url, queryParams) {
       val = [val]
     }

+    const isLast = i === keys.length - 1
+
     for (const v of val) {
       if (isObject(v)) {
         throw new Error('Passing object as a query param is not supported, please serialize to string up-front')
       }
-      parts.push(encode(key) + '=' + encode(v))
+      parts += encode(key) + '='
+      parts += encode(v) + (isLast ? '' : '&')
     }
   }

-  const serializedParams = parts.join('&')
-
-  if (serializedParams) {
-    url += '?' + serializedParams
+  if (parts) {
+    url += '?' + parts
   }

   return url

╔═════════════════╤═════════╤═══════════════════╤═══════════╗
║ Slower tests    │ Samples │            Result │ Tolerance ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ fastquerystring │   10000 │ 3486506.00 op/sec │ ± 11.53 % ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ Fastest test    │ Samples │            Result │ Tolerance ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ undici          │   10000 │ 3892565.20 op/sec │ ±  2.43 % ║
╚═════════════════╧═════════╧═══════════════════╧═══════════╝

However I also noticed that the benchmark never gives the same results between results. One run might have undici destroy fast-querystring, and another run will show the exact opposite. However the perf gains from the modified impl. destroy the current one in undici.

kibertoad · 2022-09-30T17:52:33Z

@KhafraDev Could you open an alternative PR with these changes?

anonrig · 2022-09-30T18:05:42Z

@KhafraDev Your code change does not reflect the truth, since it is not compliant with new URL and node:querystring. I recommend testing your change on the test suite of fast-querystring

The & character should only be appended to parts if i !== keys.length - 1. If you include an additional if check there, I think the result would be different. Incorrect line: parts += encode(v) + '&'
if (val == null) { check is invalid since, both new URL, node:querystring adds empty string '' to keys that have falsy values. You should not skip those key value pairs.
Your code does not care about types like BigInt, but fast-querystring does. It should include the following implementation: https://github.com/anonrig/fast-querystring/blob/main/lib/stringify.js#L5

Can you share with us the updated benchmarks after these changes? I'm curious how it will be different from the implementation of fast-querystring, and if there is a performance change, I'm happy to include the change in fast-querystring where lots of other developers can benefit from it.

In either scenario, I'm thankful for your comment/contribution.

anonrig · 2022-09-30T18:26:06Z

@KhafraDev Can you also share your input for the benchmarks? Here's fast-querystrings benchmark input: https://github.com/anonrig/fast-querystring/blob/main/benchmark/stringify.mjs#L11

KhafraDev · 2022-09-30T18:42:26Z

The idea I was going for was to match undici's current behavior, rather than mimicking fast-querystring's. It would take slightly more work in the future to add support for bigints (etc.) which seemed out of scope for the current pr (improving perf).

The & getting appended to the last value is a mistake on my part, I've amended it 👍.

Here's an update benchmark for the changes; as mentioned previously the benchmark results skew heavily between runs.

╔═════════════════╤═════════╤═══════════════════╤═══════════╗
║ Slower tests    │ Samples │            Result │ Tolerance ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ undici          │   10000 │ 3829950.21 op/sec │  ± 1.93 % ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ Fastest test    │ Samples │            Result │ Tolerance ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ fastquerystring │   10000 │ 3851338.34 op/sec │  ± 2.92 % ║
╚═════════════════╧═════════╧═══════════════════╧═══════════╝

edit: thought I should mention, but none of undici's benchmarks use a query - meaning the biggest perf gain was likely from cronometro itself and maybe how fast Object.entries is to Object.keys. When passing a query, fast-querystring is much faster than undici (where the biggest bottleneck comes from encodeURIComponent) when using a slightly more modified buildURL than above that handles elements that aren't an array.

╔═════════════════╤═════════╤═══════════════════╤═══════════╗
║ Slower tests    │ Samples │            Result │ Tolerance ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ undici          │   10000 │ 1429265.64 op/sec │ ±  1.41 % ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ Fastest test    │ Samples │            Result │ Tolerance ║
╟─────────────────┼─────────┼───────────────────┼───────────╢
║ fastquerystring │   10000 │ 1580701.78 op/sec │ ± 18.28 % ║
╚═════════════════╧═════════╧═══════════════════╧═══════════╝

Uzlopak · 2022-09-30T20:20:17Z

Just a remark: When you run your benchmarks you should imho set in your cpu governor all cpu cores to performance mode. Your benchmark tolerance is imho too high. It should be about 1-2%. Tolerance of 18 % indicates to me that your cpu cores are in on-demand mode and need to "spin up". So your benchmark is kind of tainted because e.g. your cpu starts with e.g. 800 Mhz and then goes up to 3 Ghz while benchmarking and that results in the bigger tolerance.

But that is only a remark and maybe your benchmarks are valid.

kibertoad · 2022-09-30T20:26:45Z

When passing a query, fast-querystring is much faster than undici

This is the main scenario that we should be comparing, TBH. And it feels like we should add it to our benchmark.

anonrig · 2022-09-30T20:28:35Z

This is the main scenario that we should be comparing, TBH. And it feels like we should add it to our benchmark.

You're correct @kibertoad Can you direct me to the correct place in the code to define a complex query-string that directly triggers buildURL instead of parseURL?

kibertoad · 2022-09-30T20:41:23Z

@anonrig Main place that calls buildURL is this.path = query ? util.buildURL(path, query) : path
This is invoked when you are passing query params like this

client.request({
      path: '/',
      method: 'GET',
      query: { /* some query params */ }
    }

and not like this:

client.request({
      path: '/?someQueryParams=xyz',
      method: 'GET',
    }

So we need to add cases like that into https://github.com/nodejs/undici/blob/main/benchmarks/benchmark.js

KhafraDev · 2022-09-30T20:55:20Z

Just a remark: When you run your benchmarks you should imho set in your cpu governor all cpu cores to performance mode. Your benchmark tolerance is imho too high. It should be about 1-2%. Tolerance of 18 % indicates to me that your cpu cores are in on-demand mode and need to "spin up". So your benchmark is kind of tainted because e.g. your cpu starts with e.g. 800 Mhz and then goes up to 3 Ghz while benchmarking and that results in the bigger tolerance.

But that is only a remark and maybe your benchmarks are valid.

Unfortunately stuck using wsl. However I don't think any benchmarks here are necessarily valid regardless.

Since the benchmarks do not ever call buildURL, performance should have been equal with and without moving to fast-querystring. Then there's the question of if buildURL is even a performance bottleneck - in the opening issue it mentioned it was "slow", but that was about it. Would switching to fast-querystring improve performance in a real world situation?

kibertoad · 2022-09-30T23:09:15Z

@KhafraDev considering that it is faster at some cases and slower at none, and also reduces complexity on undici side, isn't that a net positive?

KhafraDev · 2022-09-30T23:43:46Z

I'm personally for using as few third party dependencies as possible, especially in cases where it's feasible to make such an area in the code more performant without it being complex. I don't consider my modifications complex - it was done within maybe 5-10 minutes and is still understandable imo (feel free to disagree here).

However, I decided to run the benchmarks in fast-querystring and found that if the object being stringified only consists of strings and arrays, node's builtin querystring is actually on par with fast-querystring (the benchmarks vary with one never being 10% slower than the other). For undici's current behavior of only accepting strings and arrays, it seems like the better choice is to use querystring instead of adding another dependency.

const value = {
  frappucino: "muffin",
  goat: "scone",
  pond: "moose",
  foo: ["bar", "baz", "bal"],
  bool: `${true}`,
  bigIntKey: BigInt(100).toString(),
  numberKey: `${256}`,
};

Object modified from the fast-qs benchmark to only contain string and an array.

p.s. the modified buildURL came in third place in all attempts, but isn't that important to include.

kibertoad · 2022-10-01T08:53:23Z

@KhafraDev built-in querystring is deprecated (for reasons that I do not understand), should we be depending on that?

Uzlopak · 2022-10-01T09:23:46Z

@kibertoad
querystring is not deprecated. It is legacy.

kibertoad · 2022-10-01T09:28:02Z

True, which still means "discouraged to use":
"The querystring API is considered Legacy. While it is still maintained, new code should URLSearchParams API instead", according to docs.

mcollina · 2022-10-01T10:06:37Z

URLSearchParams is way too slow for use in this library. I'm ok to use querystring or fast-querystring.

kibertoad · 2022-10-01T10:13:50Z

personally I'm +1 on fast-querystring

ronag · 2022-10-01T10:16:51Z

If fast-querystring is only negligable faster then I don't see any reason to use it over querystring.

kibertoad · 2022-10-01T13:03:14Z

It seems that majority opinion is in favour of switching to querystring
@anonrig I take it it would be a drop-in replacement for this PR, as API is the same?

kibertoad · 2022-10-01T13:04:54Z

@KhafraDev what about booleans and numbers? undici accepts these just fine too, and they are quite common

KhafraDev · 2022-10-01T14:03:33Z

Same result. Both are nearly equally fast. I can post benchmark results if you'd like.

anonrig · 2022-10-02T16:21:20Z

Thank you for all the comments. I was on my way to Ireland and just had the time to review and read the responses.

From my perspective, here are the steps we need to take in order to improve undici query-string parsing.

Add query parameter to benchmark in a separate pull-request to make sure the benchmarks include the proper input. This will give us the ability to receive proper benchmark results for the future. (I've already opened a pull request for this change)
Open a separate pull-request, when step #1 is merged, to replace existing implementation with node's legacy querystring module, to support additional value types such as bigint, null, undefined etc. The reason of making this a different pull request is that, I actually want to see the ops/sec difference in the Github workflow, since a readme of fast-querystring does not mean anything, unless its justified with a real-world scenario. My only concern in this step is that, Node's recommendation of not using legacy modules.
Open a separate pull-request when querystring module is merged into the core, to make sure the performance impact of fast-querystring is justifiable to include into the undici library as an external dependency

Even though, I don't agree with some of the comments made into this pull request, I think that we are on the correct path to improve the existing implementation.

PS: I didn't want to update and replace this pull request with a node:querystring implementation because I believe that this is a really good example of good communication, engineering, and receiving feedback. (and I just want to preserve that)

feat(util): use fast-querystring for building url

de1d693

anonrig mentioned this pull request Sep 29, 2022

Improve buildURL performance #1672

Closed

perf: destruct fast-querystring for performance

f9971e7

anonrig force-pushed the perf/querystring branch from 5df7b90 to f9971e7 Compare September 29, 2022 22:09

kibertoad reviewed Sep 30, 2022

View reviewed changes

test/client.js Show resolved Hide resolved

kibertoad reviewed Sep 30, 2022

View reviewed changes

lib/core/util.js Show resolved Hide resolved

test: add happy path for array querystring

67adc17

ronag reviewed Sep 30, 2022

View reviewed changes

kibertoad reviewed Sep 30, 2022

View reviewed changes

test/client.js Show resolved Hide resolved

test: check function and objects on query-string

29ae855

ronag approved these changes Sep 30, 2022

View reviewed changes

anonrig mentioned this pull request Oct 2, 2022

perf: add query parameter to benchmark #1676

Merged

KhafraDev mentioned this pull request Oct 2, 2022

feat(util): use node:querystring for buildURL #1677

Merged

anonrig closed this Oct 3, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(util): use fast-querystring for building url #1673

feat(util): use fast-querystring for building url #1673

anonrig commented Sep 29, 2022 •

edited

anonrig commented Sep 29, 2022

codecov-commenter commented Sep 29, 2022 •

edited

ronag commented Sep 30, 2022

ronag left a comment

kibertoad commented Sep 30, 2022 •

edited

KhafraDev commented Sep 30, 2022 •

edited

kibertoad commented Sep 30, 2022

anonrig commented Sep 30, 2022 •

edited

anonrig commented Sep 30, 2022

KhafraDev commented Sep 30, 2022 •

edited

Uzlopak commented Sep 30, 2022

kibertoad commented Sep 30, 2022

anonrig commented Sep 30, 2022

kibertoad commented Sep 30, 2022

KhafraDev commented Sep 30, 2022

kibertoad commented Sep 30, 2022

KhafraDev commented Sep 30, 2022 •

edited

kibertoad commented Oct 1, 2022

Uzlopak commented Oct 1, 2022

kibertoad commented Oct 1, 2022 •

edited

mcollina commented Oct 1, 2022

kibertoad commented Oct 1, 2022

ronag commented Oct 1, 2022

kibertoad commented Oct 1, 2022

kibertoad commented Oct 1, 2022

KhafraDev commented Oct 1, 2022

anonrig commented Oct 2, 2022

feat(util): use fast-querystring for building url #1673

feat(util): use fast-querystring for building url #1673

Conversation

anonrig commented Sep 29, 2022 • edited

anonrig commented Sep 29, 2022

codecov-commenter commented Sep 29, 2022 • edited

Codecov Report

ronag commented Sep 30, 2022

ronag left a comment

Choose a reason for hiding this comment

kibertoad commented Sep 30, 2022 • edited

KhafraDev commented Sep 30, 2022 • edited

kibertoad commented Sep 30, 2022

anonrig commented Sep 30, 2022 • edited

anonrig commented Sep 30, 2022

KhafraDev commented Sep 30, 2022 • edited

Uzlopak commented Sep 30, 2022

kibertoad commented Sep 30, 2022

anonrig commented Sep 30, 2022

kibertoad commented Sep 30, 2022

KhafraDev commented Sep 30, 2022

kibertoad commented Sep 30, 2022

KhafraDev commented Sep 30, 2022 • edited

kibertoad commented Oct 1, 2022

Uzlopak commented Oct 1, 2022

kibertoad commented Oct 1, 2022 • edited

mcollina commented Oct 1, 2022

kibertoad commented Oct 1, 2022

ronag commented Oct 1, 2022

kibertoad commented Oct 1, 2022

kibertoad commented Oct 1, 2022

KhafraDev commented Oct 1, 2022

anonrig commented Oct 2, 2022

anonrig commented Sep 29, 2022 •

edited

codecov-commenter commented Sep 29, 2022 •

edited

kibertoad commented Sep 30, 2022 •

edited

KhafraDev commented Sep 30, 2022 •

edited

anonrig commented Sep 30, 2022 •

edited

KhafraDev commented Sep 30, 2022 •

edited

KhafraDev commented Sep 30, 2022 •

edited

kibertoad commented Oct 1, 2022 •

edited