Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poor URLPattern performance #19861

Open
kitsonk opened this issue Jul 18, 2023 · 11 comments
Open

Poor URLPattern performance #19861

kitsonk opened this issue Jul 18, 2023 · 11 comments
Labels
perf performance related

Comments

@kitsonk
Copy link
Contributor

kitsonk commented Jul 18, 2023

I was looking at the performance of my oak router versus my acorn router. Oak uses pathToRegex while acorn uses URLPattern. The implementation of URLPattern in Deno is super non-performant compared to pathToRegex and is making any routers that leverage URLPattern quite noticeably non-performant against other solutions.

By my calculations, URLPattern is ~150x slower than pathToRegex when testing/executing.

To reproduce:

benchmarks.test.ts

import { pathToRegexp } from "https://deno.land/x/path_to_regexp@v6.2.1/index.ts";

const urlPattern = new URLPattern("/", "http://localhost/");

Deno.bench({
  name: "URLPattern",
  fn() {
    if (urlPattern.test("http://localhost/")) {
      true;
    }
  },
});

const regexp = pathToRegexp("/");

Deno.bench({
  name: "pathToRegexp",
  fn() {
    if (regexp.test("/")) {
      true;
    }
  },
});

And then:

❯ deno bench benchmarks.test.ts
cpu: Apple M1 Pro
runtime: deno 1.35.1 (aarch64-apple-darwin)

file:///Users/kitsonk/github/acorn/benchmarks.test.ts
benchmark         time (avg)             (min … max)       p75       p99      p995
---------------------------------------------------- -----------------------------
URLPattern         2.33 µs/iter     (2.23 µs … 2.93 µs)   2.38 µs   2.93 µs   2.93 µs
pathToRegexp      14.23 ns/iter   (14.01 ns … 26.11 ns)   14.3 ns  16.23 ns  17.06 ns
@bartlomieju
Copy link
Member

Thanks for flagging it Kit, we actually discussed it last week. We'll look into optimizing it.

@bartlomieju bartlomieju added the perf performance related label Jul 18, 2023
@bartlomieju
Copy link
Member

Quick profile from V8 shows that op_urlpattern_process_match_input is one of the culprits - it is a "slow op" and additionally produces a lot of garbage that V8 has to clean up.

URLPattern.exec will suffer from the same problem.

@kitsonk
Copy link
Contributor Author

kitsonk commented Jul 18, 2023

Yeah, I noticed both .exec and .test have a similar performance profile delta between the two approaches.

@littledivy
Copy link
Member

image

@marvinhagemeister
Copy link
Contributor

Was looking at a couple of server boot Fresh traces and URLPattern shows up there pretty prominently. Whilst the matching itself is slow as demonstrated by the original comment, constructing URLPattern instances is equally slower than just winging it with a regex.

@bartlomieju
Copy link
Member

@littledivy is actively looking into improving performance of URLPattern.

@marvinhagemeister
Copy link
Contributor

For completeness: The pattern to regex parser I used can be found here https://github.com/denoland/fresh/compare/url-matcher#diff-7d5020e43e696993c4dee3edff778d85149f8165379a8a480824b53dfff7bbdeR1133-R1210 . Mostly used it as a starting point to see how much of a difference there is between a regex and URLPattern so it's probably not complete or spec complient.

@lucab
Copy link
Contributor

lucab commented Oct 11, 2023

On top of raw performance timing, after profiling some isolates in production we suspect that this is also causing a lot of memory pressure.
One possible culprit is the StringOrInit input argument:

  • it is a quite large enum (with unbalanced variant sizes) and has to go through untagged serialization/deserialization.
  • in most cases it is used to hold a simple URL String.
  • it normally appears in the hot-path of requests handling (possibly even in hot-loops).

@kitsonk
Copy link
Contributor Author

kitsonk commented Jan 26, 2024

Still an issue. While URLPattern did go from 166x slower to only 23x slower, it is still a challenge (as well has the highly variable p995). This really impacts any framework that attempts to do routing via URLPattern:

❯ deno task bench
Task bench deno bench
cpu: Apple M1 Pro
runtime: deno 1.40.1 (aarch64-apple-darwin)

file:///Users/kitsonk/github/acorn/routing.bench.ts
benchmark         time (avg)        iter/s             (min … max)       p75       p99      p995
------------------------------------------------------------------ -----------------------------
URLPattern       334.65 ns/iter   2,988,229.5   (287.25 ns … 5.19 µs) 307.73 ns 704.6 ns 5.19 µs
pathToRegexp      14.63 ns/iter  68,362,672.9   (14.04 ns … 56.73 ns) 14.82 ns 19.48 ns 21.65 ns

@crowdwave
Copy link

I'm using URL Pattern here and great performance would be much appreciated.

https://github.com/crowdwave/checkpoint401

@ngasull
Copy link

ngasull commented Aug 31, 2024

@kitsonk For a fairer comparison, we might want to restrict URLPattern to pathname testing:

// benchmarks.test.ts
const urlPattern = new URLPattern({ pathname: "/" });

This gave ~1.6x better results, making URLPattern ~14.3% slower than a plain RegExp.

Does the difference become insignificant for more complex path patterns?

// benchmarks.test.ts
const pattern =
  "/foo{/:date(\d{4}-\d{2}-\d{2})}?/bar{/:manydates(\d{4}-\d{2}-\d{2})}+";
const urlPattern = new URLPattern({ pathname: pattern });

const pathname = "/foo/2024-08-30/bar/2024-08-30/2024-08-30";
const url = "http://localhost" + pathname;

In this scenario URLPattern becomes ~11.7% slower than the plain RegExp, making their difference further less significant.

To what extent is its speed justified?

We can note that URLPattern does more work than a plain RegExp:

  • It parses a full URL
  • It matches on every part of an URL, including query string
  • Its match result format has a heavier impact on memory

❓ Is URLPattern's performance still that much of an issue?

Should framework routers actually rely on URLPattern to only route a pathname?

I feel like URLPattern is rather designed for on-demand URL matching rather than for app routing. Frameworks would rather benefit from an optimized pathname part of URLPattern, which is no web standard.

❓ Should we rather have a lib like path_to_regexp in @std designed for framework routers?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf performance related
Projects
None yet
Development

No branches or pull requests

7 participants