Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

consider exposing URLPatternList #30

Open
wanderview opened this issue Nov 19, 2020 · 22 comments · May be fixed by #166
Open

consider exposing URLPatternList #30

wanderview opened this issue Nov 19, 2020 · 22 comments · May be fixed by #166
Labels
addition/proposal New features or enhancements

Comments

@wanderview
Copy link
Member

After the last face-to-face meeting we changed the plan to make service worker scope take a pathname patter string instead of a full URLPattern object and an array of strings instead of a URLPatternList object.

Given this was the main use case for URLPatternList, should we just not expose URLPatternList? Are there other use cases for URLPatternList?

@jeffposnick @jakearchibald do you think developers would need URLPatternList vs simply just managing a list of URLPattern objects themselves?

@domenic
Copy link
Member

domenic commented Nov 19, 2020

FWIW, the use cases make sense to me in the abstract still (matching one of many URL patterns). But Jeff and Jake would be better positioned to tell us whether they actually come in practice.

@jeffposnick
Copy link

There are FetchEvent routing use cases where you'd want to match against multiple patterns. E.g., matching against multiple CDNs whose URLs are different enough that they can't be combined via wildcards.

So... assuming it's not much more "work" to expose URLPatternList, I think there is some potential value.

@wanderview
Copy link
Member Author

Seems like yes we should include it.

@annevk
Copy link
Member

annevk commented Mar 13, 2021

What does it offer beyond a list of patterns? So you can pass it to match rather than pass match an array?

@wanderview
Copy link
Member Author

Basically it would be a convenience to call test() or exec() only once. Also, there might be some future use where we could provide some default sorting for sites that want to use it. For example, #45 (reply in thread).

@wanderview
Copy link
Member Author

After further thought I'm going to postpone working on URLPatternList. I have limited bandwidth and want to focus on getting URLPattern out the door. Since the list offers limited value right now and is completely polyfillable I think we can safely exclude it from the MVP. We can add it later if there is developer demand.

@wanderview
Copy link
Member Author

Note, I opened a separate issue (#61) for ordering of URLPattern objects.

@wanderview
Copy link
Member Author

Developers have pointed out that popular routing systems offer optimized route lookup using data structures like radix trees, etc. For example, see find-my-way. This allows them to find the "best" matching route in a set of routes quickly.

This feels like a compelling feature for URLPatternList or similar container type. We can offer a comparator as discussed in #61 that user space can use to build something like this, but we might be able to do further optimizations if the container can access the intermediate representation directly.

@wanderview wanderview changed the title should we still expose URLPatternList? consider exposing URLPatternList Aug 11, 2021
@wanderview
Copy link
Member Author

It would be useful to know how devs feel about these possible features:

  • Ability to associate extra data with a pattern in the list that you can access on match. Alternative is to get a ref to pattern on match you can use to look up in your own map.
  • Ability to dynamically add to the list. If we have an optimized automatic sort, then we probably would not have normal array-like mutators.
  • Ability to dynamically remove from the list.
  • Ability to have duplicate equivalent patterns in the list. Perhaps with different group names and/or different associated data, etc.
  • Ability to iterate and access indexed entries vs just "find best match for this input".

@intrnl
Copy link

intrnl commented Sep 5, 2021

I'm speaking mostly in the perspective of app routing, but as URL routing is listed as one of the use cases of URLPattern, I feel that it's necessary for me to chime in.

Being able to provide metadata within the pattern list is necessary, as the pattern result itself wouldn't be enough for us to tell whether it should show view A or B.

I'm not sure about duplicate equivalent patterns under different groups though, I'm not sure if there's any common use case to justify adding this feature specifically?

The rest seems like icing on the cake, so whether or not it makes it in will not make a whole lot of difference, save for dynamically sorted patterns, which may be useful for one feature that's also worth considering...


Nested patterns makes it easy to group patterns under the same matcher alongside its metadata, which in the example below, is used to group views under the same layout.

let routes = new URLPatternList([
  {
    // nested routes requires wildcard
    pathname: '/users/*',
    metadata: { component: UsersLayout },
    children: [
      {
        pathname: '/',
        metadata: { component: UsersIndexView },
      },
      {
        pathname: '/@me',
        metadata: { component: UsersPersonalView },
      },
      {
        pathname: '/:username',
        metadata: { component: UsersView },
      },
    ],
  },
]);

The pattern result would contain concatenated groups, and a scope field, as an array containing the metadata of matched patterns.

let match;

if ((match = routes.exec('/users/intrnl'))) {
  match.scope;
  // [ { component: UsersLayout }, { component: UsersView } ]

  match.pathname.groups;
  // { username: 'intrnl' }
}

If parent and children pattern contains a group of the same name, it might be worth throwing early?

This would only be useful for app routers specifically, but it's still something that's worth considering, as app routers would try to reimplement something like this on their own.


I believe what I've said would be enough for app routing libraries to justify using URLPatternList for path matching, but it might be worth contacting the authors of React Router and Vue Router to see what they'd say about it. ^^

@posva
Copy link

posva commented Sep 5, 2021

I still haven't been able to give this a test but plan on doing by the end of October.

For Vue router, I would handle the nesting myself and pass a flat array to URLPatternList so I can handle the normalization of the route records.
Adding metadata would be useful, I would have used a Map or something similar to store the component to render based on a url

@posva
Copy link

posva commented Sep 5, 2021

Basically it would be a convenience to call test() or exec() only once. Also, there might be some future use where we could provide some default sorting for sites that want to use it. For example

An important note on this: it's not a convenience call. It allows to differentiate urls that can be matched by multiple patterns (routes). For example /:page and /about would both match for the URL /about but we definitely want to display /about (like in https://paths.esm.dev/?p=AAMeJbiAwQEcDKbAgAIErHtguALZBQBA&t=/about#)

About the list:

  1. Ability to associate extra data with a pattern in the list that you can access on match. Alternative is to get a ref to pattern on match you can use to look up in your own map.
  2. Ability to dynamically add to the list. If we have an optimized automatic sort, then we probably would not have normal array-like mutators.
  3. Ability to dynamically remove from the list.
  4. Ability to have duplicate equivalent patterns in the list. Perhaps with different group names and/or different associated data, etc.
  5. Ability to iterate and access indexed entries vs just "find best match for this input".
  1. Very useful but as you mentioned, doable in userland
  2. Necessary because often routes are added to the router at runtime, while the user is navigating through the application to lighten the size of the app
  3. Necessary for applications that build dynamic interfaces and therefore allow the users to define routes they can visit. If patterns can be added dynamically, they should also be able to be removed.
  4. I would personally warn against a pattern that is equivalent to an existing one. To me, this means they both have the same specificity like /:p vs /:p2 and only one of them will match at runtime, making the one added last useless (it would never match) until the first one is removed.
  5. This is also very useful to provide inspection on existing routes and sometimes build sitemaps or nav menus

Regarding radix trees, if I remember correctly, they can't handle the specificity of repeatable and optional params correctly:

  • /users/:id? -> /users, /users/name
  • /:a+ -> /a, /a/b, /a/b/c, etc
  • /:a* -> /, /a, /a/b, /a/b/c, etc

@mcollina
Copy link

mcollina commented Sep 9, 2021

One of the my biggest concerns is that the evaluation algorithm of multiple routes should not be "iterate over all of them and test them all". This is not going to scale up well for many routes and larger applications.

A solution to that problem is a radix tree, there are likely other algorithms and data structures that could make that happen and be performant.

@wanderview
Copy link
Member Author

I don't think we will have to iterate all entries in the list. At a minimum we can always at least optimize "prefix match" patterns that begin with fixed text. That fixed text can be in a data structure like a radix tree to reduce the set of possible patterns that can match.

@mcollina
Copy link

mcollina commented Sep 9, 2021

I think that could optimize quite a few cases. I would still recommend a prefix trie with the known limitation of the repeatable and optional params. In other terms it might be possible to:

  1. have a "fast" path for simple/normal routing patterns
  2. have a "slow" path for /:a+ and /:a* patterns

Anything that matches the fast path is prioritized over the slow path.

@posva
Copy link

posva commented Sep 9, 2021

One of the my biggest concerns is that the evaluation algorithm of multiple routes should not be "iterate over all of them and test them all". This is not going to scale up well for many routes and larger applications.

It's true it doesn't scale well and that a radix tree performs better with very large sets of routes. As a consumer of the API, that's an implementation detail and in practice, for client side routing, the perf difference in a list or a radix tree is, by experience, not significant enough (and sometimes even completely negligeable) while on the server it is critical to the scalability of the request handling.

What is critical to me, as a consumer of the API to build client side routing, is supporting repeatable params.

@wanderview
Copy link
Member Author

What is critical to me, as a consumer of the API to build client side routing, is supporting repeatable params.

Can you tell me more about repeatable params? Is this things like /:id/:id that matches /123/abc so you need two different values referenced under a single :id name? (This might belong in a separate issue.)

@posva
Copy link

posva commented Sep 10, 2021

@wanderview
Copy link
Member Author

Oh. Parameters with a repeating modifier. Got it. Yes, that is supported in patterns today and will be supported by any container. I was afraid you were asking for something like #96. Thanks for clarifying.

@lucacasonato
Copy link
Member

lucacasonato commented Apr 5, 2022

I ran a little benchmark today, and you can really see a stark difference between a hand-written, and a "naive" URLPattern based router:

deno bench --unstable https://gist.githubusercontent.com/lucacasonato/b3906b6ac333b58b5c1c3bbdf6d75b9a/raw/56bdebfe8d85eac965598d469b2fba63390b015a/urlpattern_bench.ts

Code at https://gist.githubusercontent.com/lucacasonato/b3906b6ac333b58b5c1c3bbdf6d75b9a/raw/56bdebfe8d85eac965598d469b2fba63390b015a/urlpattern_bench.ts

The handwritten router is nearly 7x faster than the URLPattern based one. URLPatternList should be able to close the gap between the hand written, and the "naive" implementation significantly because it can optimize internally using a tree based router.

I am happy to put in the spec work for this. We just need to figure out if we want to spec a "fast" tree based implementation of URLPatternList, or a "naive" one. If we make sure the implementation is not observable, we can spec the "naive" implementation as it is simpler, and then let implementers use a tree based implementation as an optimization.

I'll comment in this issue soon (as time permits) with a more concrete proposal for how to proceed with URLPatternList.

@mcollina
Copy link

mcollina commented Apr 5, 2022

@lucacasonato I'll be very happy to review whatever you came up with!

@wanderview
Copy link
Member Author

I am happy to put in the spec work for this. We just need to figure out if we want to spec a "fast" tree based implementation of URLPatternList, or a "naive" one. If we make sure the implementation is not observable, we can spec the "naive" implementation as it is simpler, and then let implementers use a tree based implementation as an optimization.

I would recommend spec'ing the naive approach and allow non-observable optimizations. Thank you for looking into this!

@lucacasonato lucacasonato linked a pull request May 3, 2022 that will close this issue
4 tasks
@domenic domenic added the addition/proposal New features or enhancements label Oct 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addition/proposal New features or enhancements
Development

Successfully merging a pull request may close this issue.

8 participants