🚧 OUT OF DATE

Please note: This page's content is in an inconsistent state relative to the rest of the proposal. It is being rewritten (notice 2021-05-13).

FAQ

  • FAQ
    • Adapt all current questions
    • Why batch prefetching + potential extensions as v1, rather than a more comprehensive solution?

General

Q: How does this proposal relate to Web Packages (WICG/webpackage) and Bundled Responses (WPACK Working Group, wpack-wg/bundled-responses, https://github.com/wpack-wg/bundled-responses)?

A: This is the same effort, really, with a particular scope. In particular, this repository has a focus on same-origin static subresource loading, while preserving the semantics and integrity of URLs. The Google Chrome team (including Jeffrey Yasskin and Yoav Weiss) have been collaborating closely on this project. There are different concrete alternatives under discussion (especially in the details of subresource loading, and less so for the bundle format itself), but the idea is to gather more feedback (possibly including prototyping) to draw a shared conclusion.

Q: Why the name change, then?

A: To express the limited scope (separating from Signed Exchange, preserving URL semantics) and the fact that this format may be useful outside of the Web (e.g., in Node.js). Hopefully, these changes address the previous criticisms of Web Bundles.

Q: How does this proposal relate to Signed Exchange?

A: This proposal does not make any special allowances for Signed Exchange, and some coauthors personally oppose the promotion of Signed Exchange through bundling. There has been high-level discussion about a concept of "signed bundles" (which led these two proposals to be coupled at some point), but the overlap is as simple as this: if a bundle were signed, there would have to be some kind of section within the bundle to contain the signature for the bundle as a whole (rather than leaving signatures per-response).

Q: Weren't ad blockers and publishers opposed to Web Bundles? How do they feel about this proposal?

A: Robin Berjon from the New York Times said,

It's a useful approach to address the bundling mess we see in JS (and other similar issues), building on smart work from Jeffrey Yasskin and Yoav Weiss but without the bits that help Google take over the Web.

An analysis by Brave is in this issue comment.

Q: How far along is this proposal? Is it about to ship?

A: This proposal is very early. Although Chrome has a flagged experiment for unsigned Web Bundles based on this explainer, there is no specification or tests, and there are ongoing efforts to iterate on design and communicate with browser vendors, Web developers and other Web stakeholders before this proposal is ready to ship.

Q: How is this work funded? Are there any conflicts of interest?

A: Eye/O is funding Igalia's work on resource bundles, and Bloomberg had funded it previously. Many others have been collaborating, especially Yehuda Katz (Tilde), Pete Snyder (Brave) and several Google employees (inside and outside of the Chrome team). Google and Brave are also customers of Igalia, but are not funding work on this project.

Subresource loading

Q: Rather than add bundling into the platform, why not fix HTTP?

A: If we can figure out a way to do that which would obsolete bundlers, then that would be perfect! However, it's unclear how to reduce browsers' per-fetch overhead within HTTP (which has to do with security-driven process architecture), even if we developed a nicer way to share compression dictionaries among HTTP responses and encourage more widespread prefetching. Please file an issue if you have concrete ideas.

This question has been raised over the years in response to previous Web packaging proposals. During those prior discussions, folks who were actively working on HTTP/2 felt optimistic that HTTP/2 would ultimately obviate the need for a Web packaging standard. Since then, many of those folks have become more pessimistic about solving the problem of packaging entirely through HTTP. Jake Archibald wrote a post explaining some of the unexpected subtleties of HTTP/2 called HTTP/2 push is tougher than I thought.

Sharing compression dictionaries across multiple HTTP responses is a particularly challenging part. Although HTTPWG has developed various draft proposals in this area, concerns have been raised about privacy, security, implementation complexity, and the ability to use high-quality compression techniques on-line.

Q: Are Web developers expected to write out those <script type=bundlepreload> manifests, and create the resource bundles, themselves?

A: Not necessarily. This is mainly a job for bundlers to do (explainer). Hopefully, bundlers will take an application and output an appropriate resource bundle, to be interpreted by the server to send just the requested resources to the client. The bundler will also create a bundlepreload manifest, which can be pasted into the HTML inline.

Q: Should bundling be restricted to JavaScript, which is the case with the largest amount of resource blow-up?

A: JavaScript-only bundling is explored in JavaScript module fragments, but the current bundler ecosystem shows strong demand for bundling CSS, images, WebAssembly etc., and new non-JS module types further encourage the use of many small non-JS resources. Today, we see widespread usage of CSS in JavaScript strings, and other datatypes in base64 in JS strings (!). A JS-only bundle format may encourage these patterns to continue.

Among other problems, these approaches make the resources opaque to the browser, which limits how effectively browsers can apply optimizations to them. It is also impossible to target these individual resources using browser APIs such as CSP, or for the bundled resources to have their own headers.

The import maps proposal can also be used to map away hashes in script filenames. This can be useful for "cache busting" for JavaScript, but not for other resource types. However, in practice, similar techniques are needed for CSS, images, and other resource types, which a module-map-based approach has trouble solving.

Fetch maps could similarly be used for non-module subresources. Alternatively, import: URLs could be used to reference non-module assets while indirecting through an import map.

Q: Will support for non-JS assets make resource bundle loading too heavy-weight/slow?

A: Indeed, it may. This proposal works at the network fetch level, not the module map level. This means that, when executing a JavaScript module graph, some browser machinery needs to be engaged. Multiple browser maintainers have expressed concern about whether the fetch/network machinery can scale up to 10000+ JS modules. Although resource bundles will help save some of the overhead, they may not be enough. JavaScript-specific module fragments may be implementable with less overhead, as they work at the module map level.

It's my (Dan Ehrenberg's) hypothesis at this point that, for best performance, JS module fragments should be nested inside resource bundles. This way, the expressiveness of resource bundles can be combined with the low per-asset overhead of JS module fragments: most of the "blow-up" in terms of the number of assets today is JS modules, so it makes sense to have a specialized solution for that case, which can be contained inside the JS engine. The plan from here will be to develop prototype implementations (both in browsers and build tools) to validate this hypothesis before shipping.

Q: Why address resources within a bundle with URLs that look like they come from outside of the bundle, instead of something which more explicitly notes their source, like https://example.com/bundle.rbn#resource-within-bundle?

A: Although this is a possible approach, it would come with significant costs:

  • Web developers write application code with resources identified by paths, not fragments. The use of fragments would require that bundlers continue to virtualize paths, whereas this proposal aims to reduce the amount of virtualization that bundlers need to do.
  • Lots of code in different places likely depends on the fact that a network request to fetch a URL is not affected by the fragment; this particular scheme breaks that assumption.
    • One example of such code is the filter rules of content blockers, but this may be fixable; it just requires some investigation.
  • In such a scheme, there is no sense of an "underlying URL" to verify against, so it's unclear how to prevent low-cost, per-request rotation of URLs, which could pose a problem for content blockers.
  • Fragment syntax doesn't quite "nest". If you have a resource served from a bundle, identified by a fragment in the URL, then we'd need to develop some other syntax to put another fragment on top of it.

A different package: scheme has also been proposed for this purpose, which avoids the use of fragments, but causes even more issues because introducing a new scheme is expensive, and the authority of such URLs is unclear due to the presence of multiple origins.

Q: Why does the syntax for loading a resource bundle use a <script> tag?

A: It is important for browsers' preload scanners to be able to fetch resource bundles as appropriate, so a declarative syntax (here, through a <script> tag) is required. <script> is used rather than <link> due to concerns from WebKit about injection attacks. But more syntax, in addition to the <script> tag, may actually be needed.

Several use cases for resource bundle loading take place without the presence of the DOM, e.g., from a Worker. Therefore, a non-DOM-based JavaScript API is necessary, possibly something like window.bundlepreload({"source": "example.rbn", /* ... */}).

The Link: header and, even further, HTTP Early Hints have great potential to serve as mechanisms to initiate prefetching as soon as possible. It would be optimal for resource batch loading to also have a syntax usable in this form, even if it's unavailable as a <link> tag in HTML.

Q: Is it ideal to ship a manifest to clients? Wouldn't it be better to keep this information on the server?

A: It's complicated: there are three pieces of information that need to be brought together in order for the server to send the client the information that it needs:

  • The contents of the browser's HTTP cache (held in the browser)
  • The set of routes/components requested (held in the browser)
  • The set of resources needed for each route/component (held in the server)

The approach above ships a manifest to the client, which ends up standing in for the set of resources needed for each route/component. An alternative strategy would be to ship, to the server, both a digest of the relevant part of the browser's HTTP cache, as well as a representation of which routes/components are requested. This document explores techniques for sending a digest of the browser's HTTP cache to the server, and some advanced dynamic bundling solutions use a related technique.
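To make the three pieces of information concrete, here is a minimal sketch of how they combine into the set of resources that still has to be sent. The `RouteManifest` type, the `resourcesToSend` function, and all route and file names are hypothetical and not part of the proposal; the same computation could run in the browser (when the manifest is shipped to the client) or on the server (when a cache digest and the requested routes are shipped to it).

```typescript
// Hypothetical shape: route/component name -> resource paths it needs.
type RouteManifest = Record<string, string[]>;

// Combine the manifest, the requested routes, and the cached paths
// into the list of resources that are needed but not yet cached.
function resourcesToSend(
  manifest: RouteManifest,
  requestedRoutes: string[],
  cachedPaths: Set<string>,
): string[] {
  const needed = new Set<string>();
  for (const route of requestedRoutes) {
    // Unknown routes contribute nothing (an assumption of this sketch).
    for (const path of manifest[route] ?? []) {
      if (!cachedPaths.has(path)) {
        needed.add(path);
      }
    }
  }
  return Array.from(needed).sort();
}

// Illustrative data only.
const manifest: RouteManifest = {
  "/home": ["app.js", "home.js", "home.css"],
  "/admin": ["app.js", "admin.js"],
};
const toSend = resourcesToSend(manifest, ["/home", "/admin"], new Set(["app.js"]));
// toSend is ["admin.js", "home.css", "home.js"]: app.js is skipped because it is cached.
```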

Q: Could the manifest be delivered to the client incrementally, instead of all at once?

A: Such an approach makes sense if fetching some bundled resources exposes the possible need for even more bundled resources in the future, which couldn't have been triggered previously. For example, say there is a rarely loaded but very large "admin" pane, which has several tabs within it: when you first load the admin pane, you may have more manifest to load for those inner tabs, which isn't necessary on first page load. There are a couple of ways that this could be implemented:

  • Imperatively: There should probably be a JavaScript API for imperatively adding additional paths, each corresponding to bundled resources to be loaded. This API can be invoked explicitly after clicking on the admin pane, based on logic embedded in it.
  • Declaratively: If we find that this is a common pattern/need, then resource bundles could contain an additional section in them which is the section of paths that needs to be added for it, so that this can occur without running JavaScript.

Incremental manifest fetching is another advanced technique that could be included in a v2 proposal, or even initially if experimentation finds that it is needed for sufficient performance.

Q: How does this proposal relate to Sub-resource Integrity (SRI)?

A: Some thought has been put into various schemes to facilitate the adoption of SRI in conjunction with resource bundles. For many cases, the hashes will take up too much space to be sent to the client, and SRI adds deployment challenges (e.g., with upgrades). Efficient SRI approaches may be beneficial, and follow-up proposals can be explored in this repository or elsewhere. A previous draft had a closer relationship with SRI.

Q: Is there a way to load a bundle in a way that all network requests from inside of it are required to be served within the bundle?

A: Not in this proposal. Such a limitation would make the most sense if it were document-wide, but this proposal is about loading a resource bundle within a document (so, the HTML had to come from somewhere else, for one). A separate proposal could create a kind of iframe which is limited to be loading contents out of a particular bundle, perhaps with the bundle loading mechanism based on this document. It will be important to evaluate the privacy implications of such a proposal.

Q: What happens if the server sends the client more resources than it asked for?

A: Servers are permitted to do this, and all of the subresources will end up loading, though more slowly. For example, a server could simply always reply with all of the resources. Intelligent middleware could be responsible for filtering just the requested parts, making it easier to deploy resource bundle loading.
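The filtering middleware mentioned above could look roughly like the following sketch. It models a bundle as an in-memory map from path to body, which is an assumption made for illustration; the real bundle format and any server API are not defined here, and `BundleContents` and `subsetBundle` are hypothetical names.

```typescript
// Hypothetical in-memory model of a bundle: response path -> response body.
type BundleContents = Map<string, string>;

// Keep only the responses the client asked for. Requested paths that the
// bundle does not contain are silently skipped (an assumption of this sketch).
function subsetBundle(full: BundleContents, requested: string[]): BundleContents {
  const subset: BundleContents = new Map();
  for (const path of requested) {
    const body = full.get(path);
    if (body !== undefined) {
      subset.set(path, body);
    }
  }
  return subset;
}

// Illustrative data only.
const fullBundle: BundleContents = new Map([
  ["a.js", "export const a = 1;"],
  ["b.js", "export const b = 2;"],
  ["c.css", "body { margin: 0; }"],
]);
const subset = subsetBundle(fullBundle, ["a.js", "c.css", "missing.js"]);
// subset holds a.js and c.css; b.js was not requested and missing.js does not exist.
```

A server without such middleware can return `fullBundle` unchanged, which is the "always reply with all of the resources" case described above.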

There are a number of different possible valid designs for how a browser behaves when a response contains additional resources that were not requested:

  • The browser could discard all additional resources
  • The browser could cache everything and make it available to the application
  • The browser could cache everything, but only make each resource available after it has been referenced by the bundling API.
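The three candidate behaviors can be compared side by side with a small model. This is only a sketch: the policy names, the `Outcome` shape, and the `applyPolicy` function are hypothetical labels for the options listed above, not anything a browser implements.

```typescript
// Hypothetical labels for the three designs listed above.
type ExtraResourcePolicy = "discard" | "cache-and-expose" | "cache-until-referenced";

interface Outcome {
  cached: string[];    // paths stored by the browser
  available: string[]; // paths the application can use immediately
}

function applyPolicy(
  policy: ExtraResourcePolicy,
  requested: string[],
  received: string[],
): Outcome {
  const wanted = new Set(requested);
  // Received resources that were actually asked for.
  const asked = received.filter((path) => wanted.has(path));
  if (policy === "discard") {
    return { cached: asked, available: asked };
  }
  if (policy === "cache-and-expose") {
    return { cached: received, available: received };
  }
  // "cache-until-referenced": extras are stored but stay hidden until the
  // bundling API references them.
  return { cached: received, available: asked };
}

const outcome = applyPolicy("cache-until-referenced", ["a.js"], ["a.js", "b.js"]);
// outcome.cached is ["a.js", "b.js"]; outcome.available is ["a.js"].
```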

Q: Why are the requested resources in the bundle listed as a header parameter, rather than in the URL as a query parameter?

A: Both alternatives would be possible. In a way, this is a sort of bikeshed, which would be fine to iterate on. Layering-wise, it's a bit unusual for network protocols to add query parameters this way, but doing so may help with length limitations in headers (the limits may be higher for URLs) and with intermediaries which don't handle Vary: properly (whereas everyone keys off of the full URL).

Q: If resource bundle loading is restricted to being same-origin, does that mean they can't be loaded from a CDN?

A: It is fine to load a resource bundle from a CDN: that bundle will be representing URLs on the same origin as the bundle (as well as within the path limitation). For example, https://site.example can contain something like the following:

<!-- https://site.example/index.html -->
<script type=bundlepreload>
    {
        "source": "https://cdn.example/pack.rbn",
        "scope": "https://cdn.example/pack/",
        "paths": { /* ... */ }
    }
</script>

Because the source is the same origin as the scope, the resource bundle loading is permitted.
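The check described here can be sketched with the standard `URL` class. The function names are hypothetical, and treating the path limitation as a simple prefix match on the scope is an assumption of this sketch rather than a rule stated by the proposal.

```typescript
// Loading is permitted when the bundle's source and its scope share an origin.
function isLoadPermitted(source: string, scope: string): boolean {
  return new URL(source).origin === new URL(scope).origin;
}

// Assumption: a resource is covered by the bundle when it is on the scope's
// origin and its path starts with the scope's path.
function isInScope(resource: string, scope: string): boolean {
  const resourceUrl = new URL(resource);
  const scopeUrl = new URL(scope);
  return (
    resourceUrl.origin === scopeUrl.origin &&
    resourceUrl.pathname.startsWith(scopeUrl.pathname)
  );
}

// The example from the text: source and scope are both on cdn.example.
const permitted = isLoadPermitted("https://cdn.example/pack.rbn", "https://cdn.example/pack/");
// A scope on the embedding site's origin would not match the CDN-hosted bundle.
const rejected = isLoadPermitted("https://cdn.example/pack.rbn", "https://site.example/pack/");
```

Note that the page embedding the manifest (`https://site.example/index.html`) does not take part in the comparison; only the source and scope do.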

Q: I think it'd be great to use resource bundles to serve ads, which need to be personalized. Can the no-personalization requirement be relaxed?

A: Some people have interesting ideas for how some kind of loading involving resource bundles could improve the efficiency of ad loading and even reduce the privilege level of ads. At the same time, it's quite tricky to get these privacy and security issues right, and it's important that whatever is adopted remains compatible with various interventions that user agents want to make (such as content blocking). This is a big problem space and research area.

Ads seem to have somewhat different needs for loading compared to static subresource loading explained above. For example:

  • Ads often need to run at a lower privilege level than the surrounding page (e.g., in a cross-origin iframe), whereas subresource loading is often for fully privileged resources.
  • Ads often need to download their entirety, whereas subresource loading benefits from reusing things from the browser cache.
  • Ads are often negotiated for what to load differently for each user, whereas subresource loading generally uses broadly shared assets.

It's possible that some kind of other ad loading proposal could reuse the resource bundle format in some way, but the actual loading mechanism is likely to be quite different from what is described in this document. Ad loading with resource bundles would be a very separate project, outside the scope of this repository.

Q: How would WebExtensions (e.g., for content blocking) interact with resource bundle loading?

A: More design work is still needed here, but the general idea is: if the extension intercepts fetches today, it will be able to intercept fetches that will be served by the resource bundle, as well as the underlying fetches to the resource bundle itself. Resources which are explicitly included among the "paths" in the <script type=bundlepreload> manifest, and resources which come along for the ride when a chunk is fetched, are treated identically: Both would be intercepted by extensions. The only difference is that blocking the former will block fetching the chunk at all, whereas the latter will be something which has already downloaded.

However, this could be quite expensive for extensions which intercept fetches (e.g., content blockers), and it may be beneficial to introduce certain changes to the webRequest API to facilitate optimizations: For example, a new RequestFilter field could be added to distinguish requests served from bundles, to allow work to focus on the request for the bundle chunks, rather than the individual included requests. Safari's declarative Content Blocker API could be treated similarly.

Q: How would ServiceWorker interact with resource bundle loading?

A: More design work is needed still, but one possibility is: a ServiceWorker fetch event would be dispatched for each fetch, both of resources served by resource bundles, and the resource bundle chunk fetches themselves. Unfortunately, unlike WebExtensions, there doesn't seem to be an API (besides the path prefix) to filter which requests hit the ServiceWorker.

Q: Will the overhead of going to extensions and ServiceWorker make resource bundle loading too slow?

A: It's possible that these factors could cause significant overhead, if the number of resources is too great. A couple of ways to consider mitigating this overhead:

  • At the application level, greater use of JS module fragments could reduce the number of resources, and therefore reduce the overhead.
  • In extensions and ServiceWorker, a batch-based API could be added to handle fetches which are served from resource bundles.

Serving

Q: Is it really necessary for the server to dynamically generate responses? Is there any way to implement bundling on a static file server?

A: Fundamentally, the set of resources that a browser has in its cache is based on the path that the user took through the application. This means that there is a combinatorial explosion of possibilities for the optimal bundle to send to the client, and dynamic subsetting could provide the best loading performance.

This strategy is used in custom bundlers for some major sites. This proposal aims to bring these advanced loading techniques to a broader section of Web developers.

The strategy implemented today in bundlers like webpack and rollup is, instead, statically generated chunks which can be served from a static file server. With static chunking, there is a tradeoff between, on one hand, better cache usage and avoiding sending duplicate/unneeded resources (where smaller chunks are better), and on the other hand, compressibility and reduction of per-fetch overhead (where bigger chunks are better). Recent work has focused on finding an optimal middle point, but the ideal would be to cache at a small granularity but fetch/compress at a bigger granularity, as is possible with dynamic chunking in the context of native resource bundle loading.

If dynamic bundle generation is too expensive/difficult to deploy in practice in many cases (whether due to usability issues for Web developers or servers), resource bundle loading could be based on (either in a separate mode, or always) static chunking with each chunk served from a different URL: the cost in terms of runtime performance is a tradeoff with easier deployability. It may be that static chunking is enough in practice, if it only results in a reasonably small number of HTTP/2 fetches, and compression works relatively well with Brotli default compression dictionaries, for example.

Q: Will it be efficient to dynamically, optimally re-compress just the requested parts of the bundle?

A: In general, the hope is that there can be a high-quality-compressed version of the entire bundle produced and deployed to the server, and the server would be able to efficiently calculate dynamic subsets. However, the efficiency of compressing this subset is unclear, and depends on the compression algorithm used. More research is needed (cf. this blog post, this comment).

One idea raised to reduce the cost of re-compression for dynamic subsetting: use a different bundle URL per route/component, so that these can serve as different pre-compressed units, so the subsetting is more "dense". However, it is not clear how to handle the common case of fetching multiple routes/components at once (which is the source of the combinatorial explosion in the first place).

Q: How would the server know which kind of loading mode the client is asking for (e.g., in personalized ads vs non-personalized subresource loading)?

A: In general, the loading mode should be indicated by the HTML (here, the <script type=bundlepreload> tag). If the client knows how to interpret that, then it will send the appropriate request to the server, indicating what it wants. There are further possibilities to let the server know if more optional sections are added to the bundle format, such as an Accept-Bundle-Sections header describing what the client knows how to interpret (where no such header would indicate that only the index and responses sections are interpreted).
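A server could interpret such a header along the following lines. This is a sketch only: the comma-separated value syntax, the `interpretedSections` name, and the example section name `extra-section` are assumptions, since the text above only names the header and its default meaning.

```typescript
// Sections assumed when the header is absent, per the text above.
const DEFAULT_SECTIONS: string[] = ["index", "responses"];

// Assumption: the header value is a comma-separated list of section names.
function interpretedSections(headerValue: string | undefined): Set<string> {
  if (headerValue === undefined) {
    return new Set(DEFAULT_SECTIONS);
  }
  const names = headerValue
    .split(",")
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0);
  return new Set(names);
}

// No header: only the index and responses sections are interpreted.
const withoutHeader = interpretedSections(undefined);
// With a header: the client lists what it can interpret ("extra-section" is made up).
const withHeader = interpretedSections("index, responses, extra-section");
```

The server can then omit any optional section whose name is not in the returned set.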

Tools

Q: When should bundlers decide to break up different units into module fragments?

A: Yet another interesting design space to explore! The general idea is to start using module fragments rather than separate resources in the resource bundle when necessary, to avoid the excess overhead of separate resources. One neat solution would be to put each "package" (e.g., in npm) in a single JS file using module fragments, but this may not yet be efficient enough. We'll likely need to experiment with real implementations to figure out what the optimal point is.

Q: How can this feature be used when some browsers will support it and others will not?

A: Two options:

  • Graceful degradation: Because individual resources in a resource bundle must be served from the same URL with the same contents, sites will "just work" if resource bundles are simply turned off. However, performance will often not be good enough, for all the reasons developers use bundlers in the first place today.
  • Feature detection: Detect the lack of this feature and invoke a legacy-bundled fallback. The detection can be done by introspecting the DOM and checking how the bundlepreload manifest was parsed.
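The choice between the two options can be sketched as a small decision function. The proposal does not specify the detection mechanism beyond introspecting the DOM, so this sketch swaps in an injected `probe` callback instead; `chooseStrategy`, `LoadStrategy`, and the probe are hypothetical names.

```typescript
type LoadStrategy = "resource-bundle" | "legacy-bundle";

// `probe` reports whether a given <script type> value is understood by the
// environment. It is injected so that this sketch does not commit to any
// particular detection mechanism.
function chooseStrategy(probe?: (scriptType: string) => boolean): LoadStrategy {
  if (probe === undefined) {
    return "legacy-bundle";
  }
  try {
    return probe("bundlepreload") ? "resource-bundle" : "legacy-bundle";
  } catch {
    // A probe that throws is treated as "not supported".
    return "legacy-bundle";
  }
}

// Environment with no way to probe: fall back to the legacy bundle.
const fallback = chooseStrategy();
// Environment whose probe recognizes the type: use resource bundles.
const native = chooseStrategy((scriptType) => scriptType === "bundlepreload");
```

In a browser, the probe could be something like `(t) => HTMLScriptElement.supports(t)`; whether that method would ever report `bundlepreload` is an assumption, not something this proposal states.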
