Declarative routing #1373

Open
jakearchibald opened this issue Dec 4, 2018 · 74 comments

Comments

@jakearchibald
Contributor

@jakearchibald jakearchibald commented Dec 4, 2018

Here are the requirements I'm working towards:

  • Be able to bypass the service worker for particular requests.
  • Speed up simple offline-first, online-first routes by avoiding service worker startup time.
  • Be polyfillable – do not introduce things that cannot already be done in a fetch event.
  • Be extensible – consider what future additions to the API might look like.
  • Avoid state on the registration if possible – prefer state on the service worker itself.

I'm going to start with static routes, and provide additional ideas in follow-up posts.

The aim is to allow the developer to declaratively express a series of steps the browser should perform in an attempt to get a response.

The rest of this post is superseded by the second draft

Creating a route

WebIDL

// Install currently uses a plain ExtendableEvent, so we'd need something specific
partial interface ServiceWorkerInstallEvent {
  attribute ServiceWorkerRouter router;
}

[Exposed=ServiceWorker]
interface ServiceWorkerRouter {
  void add(ServiceWorkerRouterItem... items);
}

[Exposed=ServiceWorker]
interface ServiceWorkerRouterItem {}

JavaScript

addEventListener('install', (event) => {
  event.router.add(...items);
  event.router.add(...otherItems);
});

The browser will consider routes in the order declared, and will consider route items in the order they're given.

Route items

Route items fall into two categories:

  • Conditions – These determine if additional items should be considered.
  • Sources – A place to attempt to get a response from.

Sources

WebIDL

[Exposed=ServiceWorker, Constructor(optional RouterSourceNetworkOptions options)]
interface RouterSourceNetwork : ServiceWorkerRouterItem {}

dictionary RouterSourceNetworkOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceCacheOptions options)]
interface RouterSourceCache : ServiceWorkerRouterItem {}

dictionary RouterSourceCacheOptions : MultiCacheQueryOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceFetchEventOptions options)]
interface RouterSourceFetchEvent : ServiceWorkerRouterItem {}

dictionary RouterSourceFetchEventOptions {
  DOMString id = '';
}

These interfaces don't currently have attributes, but they could have attributes that reflect the options/defaults passed into the constructor.

Conditions

WebIDL

[Exposed=ServiceWorker, Constructor(ByteString method)]
interface RouterIfMethod : ServiceWorkerRouterItem {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURL : ServiceWorkerRouterItem {}

dictionary RouterIfURLOptions {
  boolean ignoreSearch = false;
}

[Exposed=ServiceWorker, Constructor(USVString url)]
interface RouterIfURLPrefix : ServiceWorkerRouterItem {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURLSuffix : ServiceWorkerRouterItem {}

[Exposed=ServiceWorker, Constructor(optional RouterIfDateOptions options)]
interface RouterIfDate : ServiceWorkerRouterItem {}

dictionary RouterIfDateOptions {
  // These should accept Date objects too, but I'm not sure how to do that in WebIDL.
  unsigned long long from = 0;
  // I think Infinity is an invalid value here, but you get the point.
  unsigned long long to = Infinity;
}

[Exposed=ServiceWorker, Constructor(optional RouterIfRequestOptions options)]
interface RouterIfRequest : ServiceWorkerRouterItem {}

dictionary RouterIfRequestOptions {
  RequestDestination destination;
  RequestMode mode;
  RequestCredentials credentials;
  RequestCache cache;
  RequestRedirect redirect;
}

Again, these interfaces don't have attributes, but they could reflect the options/defaults passed into the constructor.

Shortcuts

GET requests are the most common type of request to provide specific routing for.

WebIDL

partial interface ServiceWorkerRouter {
  void get(ServiceWorkerRouterItem... items);
}

Where the JavaScript implementation is roughly:

router.get = function(...items) {
  router.add(new RouterIfMethod('GET'), ...items);
};

We may also consider treating strings as URL matchers.

  • router.add('/foo/') === router.add(new RouterIfURL('/foo/')).
  • router.add('/foo/*') === router.add(new RouterIfURLPrefix('/foo/')).
  • router.add('*.png') === router.add(new RouterIfURLSuffix('.png')).

Examples

Bypassing the service worker for particular resources

JavaScript

// Go straight to the network after 25 hrs.
router.add(
  new RouterIfDate({ from: Date.now() + 1000 * 60 * 60 * 25 }),
  new RouterSourceNetwork(),
);

// Go straight to the network for all same-origin URLs starting '/videos/'.
router.add(
  new RouterIfURLPrefix('/videos/'),
  new RouterSourceNetwork(),
);

Offline-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/avatars/'.
  new RouterIfURLPrefix('/avatars/'),
  // Try to get a match for the request from the cache.
  new RouterSourceCache(),
  // Otherwise, try to fetch the request from the network.
  new RouterSourceNetwork(),
  // Otherwise, try to get a match for the request from the cache for '/avatars/fallback.png'.
  new RouterSourceCache({ request: '/avatars/fallback.png' }),
);

Online-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/articles/'.
  new RouterIfURLPrefix('/articles/'),
  // Try to fetch the request from the network.
  new RouterSourceNetwork(),
  // Otherwise, try to match the request in the cache.
  new RouterSourceCache(),
  // Otherwise, if the request destination is 'document'.
  new RouterIfRequest({ destination: 'document' }),
  // Try to match '/articles/offline' in the cache.
  new RouterSourceCache({ request: '/articles/offline' }),
);

Processing

This is very rough prose, but hopefully it explains the order of things.

A service worker has routes. The routes do not belong to the registration, so a new empty service worker will have no defined routes, even if the previous service worker defined many.

A route has items.

To create a new route containing items

  1. If the service worker is not "installing", throw. Routes must be created before the service worker has installed.
  2. Create a new route with items, and append it to routes.

Handling a fetch

These steps will come before handling navigation preload, meaning no preload will be made if a route handles the request.

request is the request being made.

  1. Let routerCallbackId be the empty string.
  2. RouterLoop: For each route of this service worker's routes:
    1. For each item of route's items:
      1. If item is a RouterIfMethod, then:
        1. If item's method does not equal request's method, then break.
      2. Otherwise, if item is a RouterIfURL, then:
        1. If item's url does not equal request's url, then break.
      3. Etc etc for other conditions.
      4. Otherwise, if item is a RouterSourceNetwork, then:
        1. Let networkRequest be item's request.
        2. If networkRequest is null, then set networkRequest to request.
        3. Let response be the result of fetching networkRequest.
        4. If response is not an error, return response.
      5. Otherwise, if item is a RouterSourceCache, then:
        1. Let cacheRequest be item's request.
        2. If cacheRequest is null, then set cacheRequest to request.
        3. Let response be the result of looking for a match for cacheRequest in the cache, passing in item's options.
        4. If response is not null, return response.
      6. Otherwise, if item is a RouterSourceFetchEvent, then:
        1. Set routerCallbackId to item's id.
        2. Break RouterLoop.
  3. Call the fetch event as usual, but with routerCallbackId as one of the event properties.

Extensibility

I can imagine things like:

  • RouterOr(...conditionalItems) – True if any of the conditional items are true.
  • RouterNot(condition) – Inverts a condition.
  • RouterIfResponse(options) – Right now, a response is returned immediately once one is found. However, the route could continue, skipping sources, but processing conditions. This condition could check the response and break the route if it doesn't match. Along with a way to discard any selected response, you could discard responses that didn't have an ok status.
  • RouterCacheResponse(cacheName) – If a response has been found, add it to a cache.
  • RouterCloneRequest() – It feels like RouterSourceNetwork would consume requests, so if you need to do additional processing, this could clone the request.

But these could arrive much later. Some of the things in the main proposal may also be considered "v2".

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Dec 4, 2018

Here's an alternative model suggested by @wanderview, which is focused on allowing developers to toggle the fetch event:

partial interface ServiceWorkerGlobalScope {
  attribute ServiceWorkerEventSubscriptions eventSubscriptions;
}

interface ServiceWorkerEventSubscriptions {
  Array<DOMString> get();
  void add(DOMString eventType);
  void remove(DOMString eventType);
}

Where get returns the set of event types to handle.

add adds to the set.

remove removes from the set.

This means the developer would be able to expire their fetch handler after some amount of time (I believe this is Facebook's use-case).

const deployTimestamp = 1543938484103;
const oneDay = 1000 * 60 * 60 * 24;

addEventListener('fetch', (event) => {
  if (Date.now() - deployTimestamp > oneDay) {
    eventSubscriptions.remove('fetch');
    return;
  }

  // …
});

This is a much simpler feature, but it doesn't allow skipping the service worker by route, or going straight to the cache by route.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Dec 4, 2018

One more suggestion (again from @wanderview) would be to add a new CSP rule (or something similar) that defines which url-prefixes a service worker would intercept, if any.

This means the setting would be page-by-page. Navigations wouldn't be able to opt out of the service worker, but presumably navigation preload would solve a lot of performance issues there.

@jakearchibald
Copy link
Contributor Author

@jakearchibald jakearchibald commented Dec 4, 2018

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Dec 5, 2018

There's a gotcha with the static routes: If you put them at top level, they'll execute without issue, but next time the service worker is started (after activating), they'll fail. Also, if the service worker starts multiple times before activation, you'll get duplicate routes.

Might make more sense to throw unless the routes are being added during the install event, so they'd always fail top-level.

Given that, it might make sense to put the API on the install event. Edit: I've moved it to the install event.

@annevk
Member

@annevk annevk commented Dec 5, 2018

How does this relate to https://github.com/domenic/import-maps? Intuitively it feels like these should be the same thing.

@wanderview
Member

@wanderview wanderview commented Dec 5, 2018

Just to clarify, the two suggestions I had were to support a subset of the use cases where the site might want to stop consulting the fetch handler until an update can occur. Some sites are blocked on these use cases right now and I wondered if there was a way we could unblock them without conflicting with any future possible static routes spec.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Dec 5, 2018

@annevk I think they're different enough, and can be used together.

Import maps map URLs with the import: scheme to one or more built-in modules or HTTP fetches. Those HTTP fetches then go through the service worker.

Then, the service worker's fetch event (or its static routes) can decide how to conduct those fetches.

@jeffposnick
Contributor

@jeffposnick jeffposnick commented Dec 14, 2018

But these could arrive much later. Some of the things in the main proposal may also be considered "v2".

I can see developers eagerly adopting this approach if browsers started to implement it. Being able to string together complex routing rules with network/cache interactions without the overhead of starting up a service worker or pulling in extra runtime code (except for the polyfill scenario...) would be a nice performance win.

I have something of a meta-question about the longer term, v2+, plans though. There are some common concerns that production web apps need to think about, like cache maintenance/expiration policies for runtime caches. Today, that type of activity is likely to happen inside of a fetch handler. If we move to a model where you can either get

a) fast routing/response generation, but no fetch handler or
b) the ability to run code to deal with housekeeping, but incur the overhead of a fetch handler

developers might feel conflicted about that tradeoff.

One approach could be to create a new type of event that's fired after a response has been generated by the native router, like routingcomplete, that allows developers to run their own bookkeeping code outside of the critical response generation flow.

Looking at it from a slightly different perspective, this proposal ends up creating a native implementation of a subset of things that have up until now been accomplished using service worker runtime libraries. Cache-expiration also falls into the category of things that developers commonly accomplish via service worker runtime libraries. The feeling I got from #863 is that there hasn't been an appetite for natively implementing that as part of the Cache Storage API. If this routing proposal moves forward, does that change the equation around natively implementing higher-level functionality like cache expiration?

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Dec 15, 2018

That's a good point. I had this in earlier drafts but didn't think we needed it yet. Should be relatively simple to add.

@n8schloss

@n8schloss n8schloss commented Dec 20, 2018

This is awesome! @jakearchibald, I really like the proposal you outlined in the first comment, it solves the two big issues that we're seeing 😃

  1. There's a good amount of added overhead we are measuring for fetching user generated non-cached resources when a fetch event is enabled.

  2. After a period of time the items cached in the service worker and served via the fetch event are invalid, so we end up skipping the cache and fetching from the network. In that case we pay the large cost of starting the service worker and get no benefit.

@jakearchibald, @wanderview's solution that you outlined in the second comment above solves issue 2 but not issue 1. Like @wanderview said in his comment, issue 2 impacts us more right now than issue 1. So with the first proposal, as long as the spec is written in such a way that vendors can quickly implement the expiry time condition without having to block on implementing the RouterIfURL conditions, then I think this is really really great!

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Dec 20, 2018

@n8schloss thanks for the feedback! It feels like IfURL is pretty fundamental to other developers, but the prefix/suffix stuff could probably wait.

@jatindersmann, @aliams, @youennf, @asutherland, @mattto, @wanderview: How do you feel about this implementation-wise?

@aaronsn

@aaronsn aaronsn commented Jan 2, 2019

@jakearchibald - This is really great! I like the flexibility of the proposal. I’m interested in understanding better if this will extend to more complex scenarios in the future.

  • Would it be possible to have something like RouterIfTimeout which activates that route but cancels if the headers aren't received within a certain timeout? This would allow for preferring the network but falling back to cache if the network takes too long.
  • Can this API handle racing sources rather than doing them in sequence?
  • I think RouterIfResponse is an important feature, to allow for custom handling for error responses. However, things start to get more complicated with options that apply after the request has been made. For example, say you want to prefer network if the response is a 200, then try cache, then use the server response anyway if cached response is missing. Could you accomplish that with add(new RouterSourceNetwork(), new RouterIfResponse()); add(new RouterSourceCache(), new RouterSourceNetwork()) without it issuing multiple network requests? Or, say you want to do the same thing but if either response isn’t a 200 then use custom logic to determine which to use. Would the fetch event be able to access the responses that were already requested via the routes?
  • Avoiding controlling certain sub-scopes is something I'd like to see for service workers, but I'm not sure if this API can/should provide this support. I can sort of see how you could do this with a route like add(new RouterIfURLPrefix(), new RouterSourceNetwork()), but you'd have to know which sub-resources will be requested. Could there be a RouterSource that causes the service worker to not control that client? Eg add(new RouterIfURLPrefix(), new RouterSourceUnclaimClient()). Or would that be too much of an abuse of the API?

Also one thought about implementation - if a request is handled via a static route, would the service worker still be started up in the background even though it isn't needed for the request? On the one hand, this would ensure that the service worker is started for future resource requests. On the other hand, if an entire page load (main resource plus sub resources) can be handled by static routes, then it's nice to avoid the performance cost of starting up the service worker.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 3, 2019

@aaronsn

Would it be possible to have something like RouterIfTimeout which activates that route but cancels if the headers aren't received within a certain timeout? This would allow for preferring the network but falling back to cache if the network takes too long.

I think I'd make this an option to RouterSourceNetwork, as it's the only one that would benefit from a timeout right now.
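
As a rough illustration of what such an option would need to do, here is how the same behaviour might be written by hand inside a fetch handler today. This is only a sketch: the function name networkWithTimeout and the injectable fetchImpl parameter are made up for the example, and no timeout option is defined in the draft.

```javascript
// Resolves with the network response, or null if the response headers have
// not arrived within timeoutMs. `fetchImpl` is injectable so the sketch can
// be exercised without a real network; in a service worker it would simply
// be the global fetch.
async function networkWithTimeout(request, timeoutMs, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // The fetch promise settles once headers are received, so the timer
    // only covers the wait for headers, not the body download.
    return await fetchImpl(request, { signal: controller.signal });
  } catch (err) {
    // Aborted or failed: return null so the next source can be tried.
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

A null result here plays the same role as a cache miss: the caller moves on to the next source.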

Can this API handle racing sources rather than doing them in sequence?

Something like:

router.get(
  new RouterSourceAny([
    ...sources
  ])
);

This could also have a timeout option.
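
In fetch-event terms, racing could look something like the sketch below. It assumes each source can be modelled as an async function that resolves to a response, or to null when it has nothing. raceSources is a hypothetical name, and RouterSourceAny itself is only an idea at this point.

```javascript
// Resolves with the first source to produce a response, or null if every
// source comes up empty. Sources are async functions standing in for the
// proposed RouterSource objects.
async function raceSources(sources, request) {
  const attempts = sources.map(async (source) => {
    const response = await source(request);
    // Turn "nothing found" into a rejection so Promise.any skips it.
    if (response === null) throw new Error('no response');
    return response;
  });
  try {
    return await Promise.any(attempts);
  } catch {
    // Promise.any rejects with an AggregateError when all attempts fail.
    return null;
  }
}
```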

I think RouterIfResponse is an important feature

I think everything you mention here is possible, although we might be hitting edge cases that are easier to read as logic in a fetch event, and uncommon enough that optimising them via a router doesn't bring much of a benefit.

Avoiding controlling certain sub-scopes is something I'd like to see for service workers, but I'm not sure if this API can/should provide this support

I don't think it should. Only things you can currently do in a fetch event are in scope. To do this, we either want a way to exclude routes as part of the call to serviceWorker.register, or a way to do it via a fetch event (after which we could look at adding something to the router).

if a request is handled via a static route, would the service worker still be started up in the background even though it isn't needed for the request? On the one hand, this would ensure that the service worker is started for future resource requests. On the other hand, if an entire page load (main resource plus sub resources) can be handled by static routes, then it's nice to avoid the performance cost of starting up the service worker.

Interesting! The spec is deliberately loose when it comes to when the service worker is started, and how long it stays alive for. Either behaviour would be spec compatible.

If a service worker is started, and not needed, it shouldn't affect page performance as nothing's blocked on it.

@wanderview
Member

@wanderview wanderview commented Jan 3, 2019

It might also be useful to describe the default routes you get when a service worker is installed. Either route to FetchEvent or nowhere, depending on whether there is a fetch handler.

@wanderview: How do you feel about this implementation-wise?

Personal opinion that does not represent any actual implementation priorities:

I guess I'm most interested in how something like this could be incrementally approached. For example, if we started with:

  1. ServiceWorkerRouter.add()
  2. RouterIfDate
  3. RouterSourceNetwork with default options

Or replace (2) with RouterIfURL. This minimal initial set might unblock certain use cases. We could then layer additional items later. I'm not sure how people would feel about having a partial "router" in the platform that doesn't provide a full routing capability.

To me the RouterIfDate is more compelling at the moment because we don't have a good alternative solution for avoiding service worker startup costs in an expired state. RouterIfURL is more "router-like" but it seems oriented at carving out exceptions for certain subresources which feels like less of a problem since typically service worker startup is not necessary for subresources.

I imagine, though, there is going to be a tension between how many of these options and extensions to implement vs using javascript in FetchEvent. For example, the list of logical combinations in the "extensibility" section seemed like perhaps something that should just be done in js. I'm not sure we want to implement and maintain a complex DSL when we can achieve the same thing with js.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 3, 2019

One thing that isn't clear to me yet:

router.add(
  new RouterIfURLPrefix('/avatars/'),
  new RouterSourceCache(),
);

router.add(
  new RouterIfURLSuffix('.jpg'),
  new RouterSourceNetwork(),
);

If /avatars/foo.jpg is requested, but it isn't in the cache, what happens? Does the request fall through to the next route?
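
For what it's worth, reading the first-draft processing steps literally, the request does fall through. The simulation below is a simplified, synchronous model of those steps. The item shapes ({ if } and { source }) and the function name are invented for the example.

```javascript
// Simplified model of the first-draft steps. Each route is an array of
// items: { if: predicate } for a condition, { source: fn } for a source,
// where fn returns a response or null.
function firstDraftHandle(routes, url) {
  for (const items of routes) {
    for (const item of items) {
      if (item.if) {
        if (!item.if(url)) break; // condition failed: give up on this route
      } else {
        const response = item.source(url);
        if (response !== null) return response;
      }
    }
  }
  return 'fetch event'; // no route produced a response
}

const emptyCache = () => null;
const network = (url) => `network response for ${url}`;

const routes = [
  [{ if: (url) => url.startsWith('/avatars/') }, { source: emptyCache }],
  [{ if: (url) => url.endsWith('.jpg') }, { source: network }],
];

// The cache miss in the first route lets the second route handle the request.
const result = firstDraftHandle(routes, '/avatars/foo.jpg');
```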

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 4, 2019

Having slept on it, I think it's important that a single route is selected based on conditions. I'll work on a new draft that uses a router.add(conditions, sources) pattern.

This matches other routers like Express, where "continue to other routes" is opt-in.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 4, 2019

Ok, here's a second draft:

Creating a route

WebIDL

// Install currently uses a plain ExtendableEvent, so we'd need something specific
partial interface ServiceWorkerInstallEvent {
  attribute ServiceWorkerRouter router;
}

[Exposed=ServiceWorker]
interface ServiceWorkerRouter {
  void add(
    (RouterCondition or sequence<RouterCondition>) conditions,
    (RouterSource or sequence<RouterSource>) sources,
  );
}

[Exposed=ServiceWorker]
interface RouterSource {}

[Exposed=ServiceWorker]
interface RouterCondition {}

JavaScript

addEventListener('install', (event) => {
  event.router.add(conditions, sources);
  event.router.add(otherConditions, otherSources);
});

The browser will consider routes in the order declared, and if all conditions match, each source will be tried in turn.

Conditions

These determine if a particular static route should be used rather than dispatching a fetch event.

WebIDL

[Exposed=ServiceWorker, Constructor(ByteString method)]
interface RouterIfMethod : RouterCondition {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURL : RouterCondition {}

dictionary RouterIfURLOptions {
  boolean ignoreSearch = false;
}

[Exposed=ServiceWorker, Constructor(USVString url)]
interface RouterIfURLStarts : RouterCondition {}

[Exposed=ServiceWorker, Constructor(USVString url, optional RouterIfURLOptions options)]
interface RouterIfURLEnds : RouterCondition {}

[Exposed=ServiceWorker, Constructor(optional RouterIfDateOptions options)]
interface RouterIfDate : RouterCondition {}

dictionary RouterIfDateOptions {
  // These should accept Date objects too, but I'm not sure how to do that in WebIDL.
  unsigned long long from = 0;
  // I think Infinity is an invalid value here, but you get the point.
  unsigned long long to = Infinity;
}

[Exposed=ServiceWorker, Constructor(optional RouterIfRequestOptions options)]
interface RouterIfRequest : RouterCondition {}

dictionary RouterIfRequestOptions {
  RequestDestination destination;
  RequestMode mode;
  RequestCredentials credentials;
  RequestCache cache;
  RequestRedirect redirect;
}

Again, these interfaces don't have attributes, but they could reflect the options/defaults passed into the constructor.

Sources

These determine where the route should try to get a response from.

WebIDL

[Exposed=ServiceWorker, Constructor(optional RouterSourceNetworkOptions options)]
interface RouterSourceNetwork : RouterSource {}

dictionary RouterSourceNetworkOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
  // Reject responses that do not have an ok status.
  boolean requireOkStatus;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceCacheOptions options)]
interface RouterSourceCache : RouterSource {}

dictionary RouterSourceCacheOptions : MultiCacheQueryOptions {
  // A specific request can be provided, otherwise the current request is used.
  Request request;
}

[Exposed=ServiceWorker, Constructor(optional RouterSourceFetchEventOptions options)]
interface RouterSourceFetchEvent : RouterSource {}

dictionary RouterSourceFetchEventOptions {
  DOMString id = '';
}

These interfaces don't currently have attributes, but they could have attributes that reflect the options/defaults passed into the constructor.

Shortcuts

GET requests are the most common type of request to provide specific routing for.

WebIDL

partial interface ServiceWorkerRouter {
  void get(/* same as add */);
}

Where the JavaScript implementation is roughly:

router.get = function(conditions, sources) {
  if (conditions instanceof RouterCondition) {
    conditions = [conditions];
  }
  router.add([new RouterIfMethod('GET'), ...conditions], sources);
};

We may also consider treating strings as URL matchers.

  • router.add('/foo/', sources) === router.add(new RouterIfURL('/foo/'), sources).
  • router.add('/foo/*', sources) === router.add(new RouterIfURLStarts('/foo/'), sources).
  • router.add('*.png', sources) === router.add(new RouterIfURLEnds('.png'), sources).
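
A rough sketch of how the string shorthand could be normalised, assuming the three forms listed above are the only ones supported. conditionFromString and the plain-object return shape are made up for illustration; the draft doesn't define either.

```javascript
// Hypothetical helper: maps a shorthand string to a description of the
// condition it would stand for. Plain objects stand in for the proposed
// RouterIfURL / RouterIfURLStarts / RouterIfURLEnds classes.
function conditionFromString(pattern) {
  if (pattern.length > 1 && pattern.startsWith('*')) {
    return { type: 'RouterIfURLEnds', value: pattern.slice(1) };
  }
  if (pattern.length > 1 && pattern.endsWith('*')) {
    return { type: 'RouterIfURLStarts', value: pattern.slice(0, -1) };
  }
  // No leading or trailing wildcard: treat as an exact URL match.
  return { type: 'RouterIfURL', value: pattern };
}
```

Wildcards anywhere other than the very start or very end are not handled here; whether they should be is an open question.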

Examples

Bypassing the service worker for particular resources

JavaScript

// Go straight to the network after 25 hrs.
router.add(
  new RouterIfDate({ from: Date.now() + 1000 * 60 * 60 * 25 }),
  new RouterSourceNetwork(),
);

// Go straight to the network for all same-origin URLs starting '/videos/'.
router.add(
  new RouterIfURLStarts('/videos/'),
  new RouterSourceNetwork(),
);
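
The date condition is presumably a simple range check against the time of the request. A sketch, assuming from and to are inclusive millisecond timestamps (the draft doesn't say whether the bounds are inclusive). The function name is hypothetical.

```javascript
// Hypothetical evaluation of RouterIfDate({ from, to }).
// Defaults mirror the draft's dictionary: from = 0, to = Infinity.
function dateConditionMatches({ from = 0, to = Infinity } = {}, now = Date.now()) {
  return now >= from && now <= to;
}

const installTime = 1546300800000; // arbitrary example timestamp
const expiry = installTime + 1000 * 60 * 60 * 25;

// Before expiry the condition fails, so the fetch event is used as normal.
const beforeExpiry = dateConditionMatches({ from: expiry }, installTime);
// After expiry the condition matches, so the route goes to the network.
const afterExpiry = dateConditionMatches({ from: expiry }, expiry + 1);
```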

Offline-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/avatars/'.
  new RouterIfURLStarts('/avatars/'),
  [
    // Try to get a match for the request from the cache.
    new RouterSourceCache(),
    // Otherwise, try to fetch the request from the network.
    new RouterSourceNetwork(),
    // Otherwise, try to get a match for the request from the cache for '/avatars/fallback.png'.
    new RouterSourceCache({ request: '/avatars/fallback.png' }),
  ],
);

Online-first

JavaScript

router.get(
  // If the URL is same-origin and starts '/articles/'.
  new RouterIfURLStarts('/articles/'),
  [
    // Try to fetch the request from the network.
    new RouterSourceNetwork(),
    // Otherwise, try to match the request in the cache.
    new RouterSourceCache(),
    // Otherwise, try to match '/articles/offline' in the cache.
    new RouterSourceCache({ request: '/articles/offline' }),
  ],
);

Processing

This is very rough prose, but hopefully it explains the order of things.

A service worker has routes. The routes do not belong to the registration, so a new empty service worker will have no defined routes, even if the previous service worker defined many.

A route has conditions and sources.

To create a new route containing conditions and sources

  1. If the service worker is not "installing", throw. Routes must be created before the service worker has installed.
  2. Create a new route with conditions and sources, and append it to routes.
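
The creation steps can be modelled roughly as follows. RouteList is a stand-in invented for this sketch, not the proposed interface, and the draft doesn't say which exception type is thrown, so a plain Error is used.

```javascript
// Minimal model of the "create a route" steps. `state` stands in for the
// service worker's lifecycle state.
class RouteList {
  constructor() {
    this.state = 'installing';
    this.routes = [];
  }

  add(conditions, sources) {
    if (this.state !== 'installing') {
      throw new Error('Routes can only be added while installing');
    }
    // Accept either a single item or an array, as the WebIDL union allows.
    this.routes.push({
      conditions: [conditions].flat(),
      sources: [sources].flat(),
    });
  }
}
```

Because the routes live on the object created for each new service worker, a fresh worker starts with an empty list regardless of what the previous one declared.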

Handling a fetch

These steps will come before handling navigation preload, meaning no preload will be made if a route handles the request.

request is the request being made.

  1. RouterLoop: For each route of this service worker's routes:
    1. For each condition of route's conditions:
      1. If condition is a RouterIfMethod, then:
        1. If condition's method does not equal request's method, then continue RouterLoop.
      2. Otherwise, if condition is a RouterIfURL, then:
        1. If condition's url does not equal request's url, then continue RouterLoop.
      3. Etc etc for other conditions.
    2. For each source of route's sources:
      1. If source is a RouterSourceNetwork, then:
        1. Let networkRequest be source's request.
        2. If networkRequest is null, then set networkRequest to request.
        3. Let response be the result of fetching networkRequest.
        4. If response is not an error, return response.
      2. Otherwise, if source is a RouterSourceCache, then:
        1. Let cacheRequest be source's request.
        2. If cacheRequest is null, then set cacheRequest to request.
        3. Let response be the result of looking for a match for cacheRequest in the cache, passing in source's options.
        4. If response is not null, return response.
      3. Otherwise, if source is a RouterSourceFetchEvent, then:
        1. Set routerCallbackId to source's id.
        2. Call the fetch event as usual, but with source's id as one of the event properties.
        3. Return.
    3. Return a network error.
  2. Call the fetch event as usual.
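
The steps above can be sketched as a small simulation. This is a model under simplifying assumptions, not the proposed API: conditions are plain predicates, sources are async functions that resolve to a response or null, and FETCH_EVENT, handleFetch and dispatchFetchEvent are names made up for the example.

```javascript
// Stand-in for a RouterSourceFetchEvent item.
const FETCH_EVENT = Symbol('fetch-event');

// Simulation of the second-draft "Handling a fetch" steps.
async function handleFetch(routes, request, dispatchFetchEvent) {
  for (const route of routes) {
    // A route is selected only if every one of its conditions matches.
    if (!route.conditions.every((condition) => condition(request))) continue;

    for (const source of route.sources) {
      if (source === FETCH_EVENT) return dispatchFetchEvent(request);
      const response = await source(request);
      if (response !== null) return response;
    }
    // The route matched but no source produced a response.
    // Later routes are not consulted.
    return { type: 'error' };
  }
  // No route matched: dispatch the fetch event as usual.
  return dispatchFetchEvent(request);
}
```

The key difference from the first draft is the return after the sources loop: once a route's conditions match, that route alone decides the outcome.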

Extensibility

I can imagine things like:

  • RouterOr(...conditionalItems) – True if any of the conditional items are true.
  • RouterNot(condition) – Inverts a condition.
  • RouterFilterResponse(options) – Right now, a response is returned immediately once one is found. However, the route could continue, skipping sources, but processing filters. This could check the response and discard it if it doesn't match. An example would be discarding responses that don't have an ok status.
  • RouterCacheResponse(cacheName) – If a response has been found, add it to a cache.

But these could arrive much later. Some of the things in the main proposal may also be considered "v2".

@jeffposnick
Contributor

@jeffposnick jeffposnick commented Jan 4, 2019

Here are a couple of things that, based on what we've run into with Workbox's routing, tend to crop up in the real world. I wanted to draw them to your attention so that you could think about them earlier rather than later.

Cross-origin routing

Developers sometimes want to route cross-origin requests. And sometimes they don't. Coming up with a syntax for conditions that supports both scenarios can be difficult. It looks like the current proposal assumes that the URL prefix/suffix will only work for same-origin requests, so be prepared for folks asking for a cross-origin syntax at some point.

Workbox has a few different ways of specifying routing conditions, but the most common is RegExp-based, and we settled on the following behavior: if the RegExp matches the full URL starting with the first character [the 'h' in 'https://...'] then we assume that it can match cross-origin requests. If the RegExp matches, but the match starts on anything other than the first character in the full URL, then it will only trigger if it's a same-origin request.
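A minimal sketch of that rule, to make the two cases concrete. This is not Workbox's actual implementation: `matchesRegExpRoute` and its parameters are hypothetical names, and it assumes the RegExp has no `g` or `y` flag (so `lastIndex` state doesn't matter).

```javascript
// Hypothetical helper illustrating the rule described above.
// Assumption: `regExp` is not global/sticky, so exec() always starts at 0.
function matchesRegExpRoute(regExp, requestUrl, serviceWorkerOrigin) {
  const url = new URL(requestUrl);
  const result = regExp.exec(url.href);
  if (!result) return false;

  // A match beginning at the first character of the full URL
  // (the 'h' in 'https://...') is allowed to be cross-origin.
  if (result.index === 0) return true;

  // Any other match only counts for same-origin requests.
  return url.origin === serviceWorkerOrigin;
}
```

With a service worker on https://example.com, `/\.jpg$/` would match https://example.com/a.jpg but not https://cdn.example.net/a.jpg, whereas `/^https:\/\/cdn\.example\.net\//` would match the latter.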

I would imagine folks wanting to see at least a RouterIfURLOrigin condition that they could use to customize this behavior, and that might need to support wildcards to deal with CDN origins that don't have fixed naming conventions.

Non-200 OK responses sometimes need to be treated as errors

The proposal for RouterSourceNetwork currently reads If response is not an error, return response. I think some developers are going to find this too limiting. There are folks who will end up needing different behavior based on the status of the response, not just whether a NetworkError occurred.

I'm not sure what the cleanest approach is there in terms of your proposal—maybe adding in a way of setting a list of "acceptable" status codes, including letting folks opt-in or opt-out to status code 0 from an opaque response being considered a success?

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 7, 2019

I've renamed prefix/suffix to starts/ends to match str.startsWith.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 7, 2019

I wrote https://jakearchibald.com/2019/service-worker-declarative-router/ to seek wider feedback.

Also see my tweet about it for replies.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 7, 2019

@jeffposnick

Developers sometimes want to route cross-origin requests. And sometimes they don't.

router.add(
  [
    new RouterIfURLStarts('https://photos.example.com/'),
    new RouterIfURLEnds('.jpg', { ignoreSearch: true }),
  ]
  // …
);

The above would match URLs that start https://photos.example.com/ and end .jpg. It gets trickier if you want to match URLs on all other origins that end .jpg; you'd need RouterNot for that.

RegExp is tricky here as there's no spec for how it could outlive JavaScript. We could try and standardise globbing, but I worry that would take a big chunk of time.

Non-200 OK responses sometimes need to be treated as errors

I've added requireOkStatus as an option.

@nhoizey

@nhoizey nhoizey commented Jan 7, 2019

Hi @jakearchibald, this is really interesting!

Reading the post on your blog, I wondered how I could add the RouterSourceNetwork response to the cache, because it looks like the fetch event would not be fired. But I see here that you suggest adding RouterCacheResponse in a future "v2" version.

IMHO, this would be great in the "v1" to help build offline experiences without manually preloading.

But I understand there must be some priorities. 😅

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 7, 2019

Yeah, we want to avoid trying to do everything at once. In terms of fetching and caching, see https://jakearchibald.com/2019/service-worker-declarative-router/#routersourcefetchevent.

@WORMSS

@WORMSS WORMSS commented Jan 7, 2019

This is not going to be a popular view, but I am not a massive fan of the 'string' shortcut for the conditions. I understand you want it to be as close as possible to existing APIs like Express, but not everyone understands Express all the time.
I was really enjoying the very explicit class approach, where everything in the conditional list must evaluate to true, rather than this or thistoo or that.

I wondered if there could be some optimisation for that, behind the scenes. Since they all have to be true to be considered true, then the order they are executed could be arbitrary? (I am of course assuming they are all stateless operations).

So ones that are a little more expensive to calculate could be done last, and dirt-cheap ones could be done first, rather than in the order the service worker author wrote them. That way different browsers can optimise in their own way: what might be expensive for one may be cheap for another, and the developer doesn't need to know this when setting up the conditions.

For example,

router.add(
  [
    new RouterIfURLMatchesRegex('lets assume some horrible regex'),
    new RouterIfMethod('GET'),
  ]
  // …
);

I know RouterIfURLMatchesRegex doesn't exist, but if something like it were added to the conditions in the future, it would surely be far cheaper to check the method condition first, rather than evaluate the regex only to find the request was a POST anyway, so the regex wasn't even needed.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 7, 2019

@WORMSS

I am not a massive fan of the 'string' shortcut

I see what you're saying about the string thing. Also, it isn't all that similar to express. It might be better to drop that shortcut until something like globbing can be properly spec'd.

I wondered if there could be some optimisation for that, behind the scenes. Since they all have to be true to be considered true, then the order they are executed could be arbitrary?

In the case of multiple conditions, I'd spec them as a sequence, but if a browser decided to check them in a different order, or in parallel, there'd be no observable difference.
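As a rough illustration of why reordering is unobservable: if each condition is a pure predicate and all of them must pass, the result is the same whichever order they run in. In this sketch the `cost` field, the `test` functions and `matchesAll` are all hypothetical, invented only to show a "cheapest first" strategy; nothing like them is in the proposal.

```javascript
// Hypothetical condition objects. `cost` is an invented hint, not a real API.
const conditions = [
  {
    cost: 10, // pretend this regex is expensive
    test: (request) => /^\/articles\/.+\.jpg$/.test(new URL(request.url).pathname),
  },
  {
    cost: 1, // a method check is cheap
    test: (request) => request.method === 'GET',
  },
];

// Evaluate the cheapest conditions first. every() stops at the first
// failure, so a POST request never reaches the regex.
function matchesAll(conditionList, request) {
  const cheapestFirst = [...conditionList].sort((a, b) => a.cost - b.cost);
  return cheapestFirst.every((condition) => condition.test(request));
}
```

Because the predicates have no side effects, `matchesAll(conditions, request)` gives the same answer as checking them in the written order.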

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 7, 2019

Some feedback I've received offline:

The string shortcut might lead developers to think something like /articles/*.jpg would 'work'. Also, it wouldn't be backwards compatible to add support for that later. It might be better to hold off on that shortcut, and later explore standardising globbing, which would be useful in other parts of the platform.

When conditions and sources are sequences, conditions is an 'and', whereas sources is more like an 'or'. This is weird. It might be better to drop sequences here in favour of explicit grouping like RouterIfAny(...conditions), RouterIfAll(...conditions), RouterSourceFirst(...sources), RouterSourceRace(...sources) etc etc.

@yoavweiss

@yoavweiss yoavweiss commented Jan 7, 2019

My feedback:

  • I've seen cases where the request destination would have been extremely helpful as a filter.
  • I found the use of get() as a shortcut for GET requests confusing at first, as I expected it to be a way to get the set routes or something similar. Maybe it's just me, but if not, it might be worthwhile to rename.

@shortercode

@shortercode shortercode commented Jan 7, 2019

Really liking the concept of avoiding the overhead of starting the ServiceWorker where possible, but I feel like not adding an additional Source type that utilises a callback is a missed opportunity. It would effectively be the same behaviour as the fetch event fallback, but specific to the given conditions. For all other source types we would still avoid starting up the service worker.

Also, syntax-wise, I feel it reads better to have static methods on a RouterCondition/RouterSource class that instantiate the specific type, although I'm not sure how in keeping that is with other web APIs.

const { router } = e;

router.get(
  RouterCondition.startsWith("/avatars/"),
  [ RouterSource.cache(), RouterSource.network() ]
);

router.get(
  RouterCondition.fileExtension(".mp4"),
  RouterSource.network()
);

router.get(
  RouterCondition.URL("/"),
  RouterSource.cache("/shell.html")
);

router.add(
  RouterCondition.any(),
  RouterSource.custom(fetchEvent => {
    
  })
);

@tomayac

@tomayac tomayac commented Jan 7, 2019

@yoavweiss: I think the .get() is heavily inspired by Express.js’ routing: http://expressjs.com/en/guide/routing.html.

@domenic
Contributor

@domenic domenic commented Jan 7, 2019

I apologize that this is not a very substantial contribution, but I think I'm having an allergic reaction to the "Java-esque" (or Dart-esque) nested class constructors. I'd encourage thinking about what this design would look like if done in the opposite direction: a purely JSON format. For example, something like

router.route([
  {
    condition: {
      method: "GET",
      urlStartsWith: "/avatars"
    },
    destination: ["cache", "network"]
  },
  {
    condition: {
      method: "GET",
      urlPathEndsWith: ".mp4"
    },
    destination: ["network"]
  },
  {
    condition: {
      method: "GET",
      urlPath: "/"
    },
    destination: [{ cache: "/shell.html" }]
  }
]);

I don't think this extreme is right either (in particular the [{ cache: "/shell.html" }] seems weird and underdeveloped) but I think it'd be a valuable exercise to see what it would look like to have a purely declarative routing format expressed just in JS objects. Then, you could figure out where it would make sense to strategically convert some POJSOs into class instances, versus where the POJSOs are simpler.

Another way to think about this is, when are these types ever useful? If they're only ever consumed by the router system, and not manipulated, composed, or accessed by users, then perhaps it's better to just pass the data they encapsulate directly to the system. (That data is roughly a "type tag" plus their constructor arguments.)

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 9, 2019

After thinking about it, I think I'm going to go with the object based approach & @domenic's suggestion:

// Without options:
router.add(
  conditions,
  'cache',
);

// With options:
router.add(
  conditions,
  { type: 'cache', request: '/shell.html' },
);

// Multiple sources:
router.add(
  conditions,
  [
    'cache',
    'network',
    { type: 'cache', request: '/shell.html' },
  ],
);

I'm still not keen on having objects with a required property that determines the type of the object, but it seems better than the alternatives.
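To show how the three call shapes above (a string, an object with a type, or an array of either) could be handled uniformly, here is a minimal sketch that normalises them into an array of objects. `normalizeSources` is a hypothetical helper, not part of the proposal, and the validation it performs is an assumption.

```javascript
// Hypothetical helper: turn 'cache', { type: 'cache', ... } or an array of
// either into a uniform array of { type, ...options } objects.
function normalizeSources(sources) {
  const list = Array.isArray(sources) ? sources : [sources];

  return list.map((source) => {
    // A bare string is shorthand for a source with no options.
    if (typeof source === 'string') return { type: source };

    // An object must carry the required `type` property.
    if (source && typeof source.type === 'string') return { ...source };

    throw new TypeError('Each source must be a string or an object with a string "type"');
  });
}
```

For example, `normalizeSources('cache')` gives `[{ type: 'cache' }]`, so the router only ever has to deal with one shape internally.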

@WORMSS

@WORMSS WORMSS commented Jan 9, 2019

With the object-based approach, how do you specify that you want 'cache' or 'network' as a race?

RouterSourceFirst(...sources), RouterSourceRace(...sources) etc etc.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 9, 2019

@WORMSS

router.add(
  conditions,
  { type: 'race', sources: [...sources] },
);

The pattern feels equally extensible.

@bkardell

@bkardell bkardell commented Jan 9, 2019

fwiw, @domenic and others expressed much better what I briefly tried to say yesterday in some other venue (twitter maybe?) - I like this recent turn though quite a bit.

@Siilwyn

@Siilwyn Siilwyn commented Jan 10, 2019

The option explored in #1373 (comment) feels more straightforward and in this case better than using classes, would definitely prefer this version!

@jeremy-coleman

@jeremy-coleman jeremy-coleman commented Jan 15, 2019

I'd like to suggest a somewhat different approach that I feel addresses the same issue in a more intuitive manner, which is to provide a (virtual) file system to code against the assets as they will be on the client and in the shape/structure they will actually be in, i.e.:
[image]

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 15, 2019

@jeremy-coleman The web isn't compatible with a filesystem. It's compatible with a request/response store, which is what the cache API provides. However, this alone can't communicate when a request should go to the network. I guess I don't understand your proposal fully.

@jeremy-coleman

@jeremy-coleman jeremy-coleman commented Jan 16, 2019

I guess from my perspective, the cached files are a file system. Instead of (req res) => fetchHttpRoute => interceptEverything someLogic => if(cache) else(usenet) => next

Req res => fetchCacheRoute => if(!nocache) fetchHttp => next

If you explicitly write the API to query the cached files first, you can just drop the intercept altogether.

Similar to how you might conventionally write an image element as <img src="assets/icon.svg" /> because you know the location post-build, the second order of that idea would be to write the src as src=clientCache

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 16, 2019

@jeremy-coleman in your system, how do you express "For any URL path ending .jpg, try to fetch from the network, otherwise fall back to this generic image…"?

@jeremy-coleman

@jeremy-coleman jeremy-coleman commented Jan 17, 2019

It'd be the same for both online and offline. Assuming all routes point to the cache first: if it's an online asset, replace it with the network response when available. The same goes for offline, except the timeout for cached online data would be 0ms / fetch onSomeUserEvent, compared to something like a 24-hour reset for offline stuff. I think a harder thing to make understandable would be the difference between something like 'use last successful' and 'use constant fallback'. As for the stuff above on how to handle conditionals, Proxy.revoke() with your conditional checks on prop-key access could probably handle everything needed.

For both online and offline, from the examples above, I think something like this:

router.get(
  new RouterIfURLPrefix('/**/*.jpg/'), // Find some URL ending in .jpg.
  new RouterSourceCache(), // Use the static asset.
  // Maybe:
  new RouterSourceNetwork(), // Try to fetch the request from the network. This doesn't need to come first for online requests; just use the current value in the cache first and always.
  // Maybe: update the static asset with the request as the new default fallback.
);

Somewhat unrelated, but what really underpins my line of thinking is that I feel offline apps should basically be completely downloaded into some form of local storage on the first request, and the SW routes should support coding against the offline assets more so than online sources.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 20, 2019

@jeremy-coleman based on your example above, I don't understand the difference between your proposal and mine.

@asakusuma

@asakusuma asakusuma commented Jan 28, 2019

A bit late to the party but...

+1 for the "side effect operation" issue raised by @jeffposnick and @nhoizey

In addition to refreshing the cache based on a network source, firing analytics beacons is another feature that would be heavily used by LinkedIn. We have a lot of instrumentation to measure how the service worker is working/not working. The highest priority V2 feature for us would be support for a routingcomplete or similar event, to handle both of these cases.

@jakearchibald

I think I'd make this an option to RouterSourceNetwork, as it's the only one that would benefit from a timeout right now.

I think the timeout feature would be useful for RouterSourceFetch, as it might address #1292.

Timeout support would be really nice for creating a global fallback "catch all" handler.

router.get(
  new RouterIfURLStarts('/profile/*'),
  [new RouterSourceCache(), new RouterSourceNetwork({ timeout: 10000 })]
);

router.get('*', new RouterSourceCache('/oops.html'));

@asakusuma

@asakusuma asakusuma commented Jan 29, 2019

re: glob matching, one use case that may not be covered without regex is the "match on any route, but no files" use case.

So if you want to match /profile/123 or /profile/123/details, but not /profile/123/photo.jpg.

https://regexr.com/47c2t

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Jan 30, 2019

@n8schloss something I wanted to double check about this proposal: A service worker's routes live with the service worker. They can't be changed without shipping a new service worker. Does that work for you?

I ask because with navigation preload, you wanted to change things during the life of a service worker (the header value).

@ylafon
Member

@ylafon ylafon commented Feb 5, 2019

@asakusuma it depends on the exact scope you want to match... and exclude.
To take one of the easiest syntaxes for expressing and combining facts:

url: {
  startsWith: '/profile/123',
  ignoreSearch: true,
  and: {
    not: {
      url: { endsWith: '.jpg', ignoreSearch: true },
    },
  },
}

Being able to exclude and use boolean combinators is essential here to really express use-cases that are not the most simple ones.
Also, syntax-wise,

and: {  url: { startsWith: '/profile/123',  ignoreSearch: true},
        not: { url: { endsWith: '.jpg', ignoreSearch: true} }
     }

Looks better.
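To make the combinators concrete, here is a minimal sketch of how the second form could be evaluated. Everything in it is an assumption for illustration: `matches` and `matchesUrl` are hypothetical names, every key inside one object is treated as an implicit "and", and the string tests run against the URL path only (plus the query string unless `ignoreSearch` is set).

```javascript
// Hypothetical evaluator for condition objects using `and`, `not` and `url`.
function matches(condition, requestUrl) {
  const url = new URL(requestUrl);

  // Every key in a condition object must hold (implicit "and").
  return Object.entries(condition).every(([key, value]) => {
    switch (key) {
      case 'and':
        return matches(value, requestUrl);
      case 'not':
        return !matches(value, requestUrl);
      case 'url':
        return matchesUrl(value, url);
      default:
        throw new TypeError(`Unknown condition: ${key}`);
    }
  });
}

// Assumption: compare against the path, optionally including the query string.
function matchesUrl({ startsWith = '', endsWith = '', ignoreSearch = false }, url) {
  const target = url.pathname + (ignoreSearch ? '' : url.search);
  return target.startsWith(startsWith) && target.endsWith(endsWith);
}

const profileButNotJpg = {
  and: {
    url: { startsWith: '/profile/123', ignoreSearch: true },
    not: { url: { endsWith: '.jpg', ignoreSearch: true } },
  },
};
```

Under these assumptions, `matches(profileButNotJpg, 'https://example.com/profile/123/details')` is true, while the same call with `/profile/123/photo.jpg` is false.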

@n8schloss

@n8schloss n8schloss commented Feb 5, 2019

@n8schloss something I wanted to double check about this proposal: A service worker's routes live with the service worker. They can't be changed without shipping a new service worker. Does that work for you?

I ask because with navigation preload, you wanted to change things during the life of a service worker (the header value).

Yep! As long as there's the RouterIfDate options then our use case here will be met :)

@mgiuca

@mgiuca mgiuca commented Aug 1, 2019

Late to the party but I'd like to give a bit of feedback.

I love the general concept. Seems like this will solve a lot of problems (speed issues with spinning up SWs, and the added "scariness" of what if the fetch handler has a bug that makes my site non-updateable.) This came up as a potential solution to w3c/manifest#774.

I have some superficial criticism (API surface details).

  • I'd like to remove the startsWith and endsWith things in favour of just having a glob syntax. Fewer API calls, less verbose syntax, and more flexible.
  • ignoreSearch is a confusing name (why not ignoreQuery)? Alternatively, just embrace the glob and allow ?* at the end of the glob to mean "ignore the query" (not as an especially special case, but rather as a general rule, treat a '?' at the end of a path with nothing after it as the same as no '?').
  • The name router.get is confusing because a method called "get" implies it's going to return some information out of the router, not set up a new route. I'd rather just remove that method and just make you write router.add({method: 'GET'}) which is nice and obvious what it's going to do.

@WORMSS

@WORMSS WORMSS commented Aug 1, 2019

It's called ignoreSearch because that's what JavaScript calls it in window.location.

@jakearchibald
Contributor Author

@jakearchibald jakearchibald commented Aug 1, 2019

  • I'd like to remove the startsWith and endsWith things in favour of just having a glob syntax. Fewer API calls, less verbose syntax, and more flexible.

This has come up a few times in this issue. I'm not against adding it at some point, but it feels like a big contentious thing to standardise for v1.

  • ignoreSearch is a confusing name (why not ignoreQuery)?

As @WORMSS says, it's for consistency with the rest of the platform. url.search, cache.match(url, { ignoreSearch: true }) etc etc.
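For reference, a small sketch of where "search" shows up in the URL API, and what ignoring it could mean for an ends-with check. `endsWithIgnoringSearch` is a hypothetical helper for illustration, not something from the proposal.

```javascript
// The URL API exposes the query string as `search`, including the leading '?'.
const exampleUrl = new URL('https://example.com/photos/cat.jpg?size=large');
console.log(exampleUrl.search);   // '?size=large'
console.log(exampleUrl.pathname); // '/photos/cat.jpg'

// Hypothetical helper: compare the suffix against the URL with its
// query string (and fragment) dropped.
function endsWithIgnoringSearch(requestUrl, suffix) {
  const url = new URL(requestUrl);
  return (url.origin + url.pathname).endsWith(suffix);
}
```

Here `endsWithIgnoringSearch('https://example.com/photos/cat.jpg?size=large', '.jpg')` is true even though the full URL string ends in `size=large`.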

  • The name router.get is confusing because a method called "get" implies it's going to return some information out of the router,

Yeah, maybe. Although this is how most node routers seem to do it.

@mgiuca

@mgiuca mgiuca commented Aug 1, 2019

@WORMSS Yeah it's called that in the URL object too. It's called query inside the spec language but now that I re-read it, I realised that the word "query" never appears in the API interface itself. So "search" is fine.

@fallaciousreasoning

@fallaciousreasoning fallaciousreasoning commented Sep 16, 2019

Curious whether we could use @wanderview's URLPattern proposal instead of startsWith/endsWith.
