NotRestoredReasons API Explainer

Authors:

Participate

Motivation

Browsers today offer an optimization for history navigation called back/forward cache (bfcache). It enables an instant loading experience when users go back to a page they have already visited.

Today pages can be blocked from entering bfcache or get evicted while in bfcache for different reasons, such as reasons required by spec and reasons specific to the browser implementation. Here is the full list of reasons that can be reported: spreadsheet.

Developers can measure the bfcache hit-rate on their site by using the persisted parameter of the pageshow handler together with PerformanceNavigationTiming.type ("back_forward"). However, there is no way for developers to tell which reasons are blocking their pages from being restored from bfcache in the wild, so they cannot know what actions to take to improve the hit-rate.
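The hit-rate measurement described above can be sketched as follows. This is a minimal illustration, not part of the proposal: `classifyPageShow` and its return labels are hypothetical names, and the sketch assumes the page was not restored from bfcache whenever `persisted` is false on a `back_forward` navigation.

```javascript
// Hypothetical helper: classify one page show for bfcache hit-rate reporting.
// `persisted` comes from the pageshow event; `navType` comes from
// PerformanceNavigationTiming.type.
function classifyPageShow(persisted, navType) {
  if (persisted) return "bfcache-hit"; // page was restored from bfcache
  if (navType === "back_forward") return "bfcache-miss"; // history navigation, full load
  return "not-history-navigation";
}

// Browser-only wiring; skipped when there is no window object (e.g. in Node.js).
if (typeof window !== "undefined") {
  window.addEventListener("pageshow", (event) => {
    const [nav] = performance.getEntriesByType("navigation");
    const outcome = classifyPageShow(event.persisted, nav ? nav.type : undefined);
    console.log("bfcache outcome:", outcome);
  });
}
```

Counting "bfcache-hit" against "bfcache-miss" gives the hit-rate, but says nothing about why a miss happened, which is the gap this explainer addresses.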

We would like to make it possible for sites to collect information on why bfcache was not used on a history navigation, so that they can act on each reason and make their pages bfcache compatible. As a first step, we will expose this information through the Performance Navigation Timing API.

The reported reasons can be ones that were present at the time of navigating away (i.e. the page did not enter bfcache), or ones that made the page ineligible while it was in bfcache (i.e. the page was evicted from bfcache).

Note that we are not going to expose information about cross-origin subframes, except for the information about whether or not they blocked bfcache.

Goals

  • Provide a way to gather data as to why a page is not served from bfcache on a history navigation.
  • Provide an easy way to debug a website and make it bfcache compatible.

Non-goals

  • Provide a way to disable bfcache.
  • Provide insights into cross-origin subframes.

Developer requirements

The goal is to equip developers with enough information to make their site bfcache compatible.

In order to debug the site, developers need to be able to identify which frame in the frame tree a piece of information applies to. This means they need to be given a tree structure and IDs that match the frame tree. The URL of each frame is helpful for knowing the state of the frame (but cannot be given for a cross-origin iframe).

They need to know whether the frame had NotRestoredReasons or not, and if so what reasons are present.

Exposing Not-restored reasons in Tree structure

We should report the not-restored reasons as a tree-structured JavaScript object representing the frame tree.

For same-origin frames, this should report

  1. HTML ID of the frame (e.g. "foo" when <iframe id="foo" src="...(URL)">)
  2. name attribute of the frame (e.g. “bar” when <iframe name="bar">)
  3. Location (URL) of the frame
  4. src of the frame
  5. NotRestoredReasons (can be empty)
  6. Child frames

For cross-origin frames, this should report

  1. HTML ID of the frame (e.g. "foo" when <iframe id="foo" src="...(URL)">)
  2. name attribute of the frame (e.g. "bar" when <iframe name="bar">); report only the original name, not the updated name
  3. src of the frame (not the current URL)

For cross-origin frames, we should not expose the information on what blocked bfcache to avoid cross-site information leaks. Instead, when any cross-origin iframe blocks bfcache, the main frame will report "masked" as a reason.
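The reporting rules above could be sketched like this. It is an illustration under stated assumptions rather than the browser's implementation: the input frame record (`origin`, `blockingReasons`, and so on) and the function names are hypothetical, `name` is assumed to already hold the original attribute value, and `reasons` is modeled as an array of strings for simplicity.

```javascript
// Hypothetical internal frame record:
// { origin, url, src, id, name, blockingReasons: string[], children: [...] }

// True if this frame or any frame below it has a blocking reason.
function subtreeBlocks(frame) {
  return frame.blockingReasons.length > 0 || frame.children.some(subtreeBlocks);
}

// True if any cross-origin subtree reachable through same-origin frames blocks.
function anyCrossOriginBlocks(frame, topOrigin) {
  return frame.children.some((child) =>
    child.origin !== topOrigin
      ? subtreeBlocks(child)
      : anyCrossOriginBlocks(child, topOrigin)
  );
}

// Build the report entry for one frame as seen from the top-level origin.
function buildEntry(frame, topOrigin) {
  if (frame.origin !== topOrigin) {
    // Cross-origin: expose only attributes the embedder already knows,
    // and mask the reasons and the whole subtree.
    return { url: "", src: frame.src, id: frame.id, name: frame.name, reasons: null, children: null };
  }
  return {
    url: frame.url,
    src: frame.src,
    id: frame.id,
    name: frame.name,
    reasons: [...frame.blockingReasons],
    children: frame.children.map((child) => buildEntry(child, topOrigin)),
  };
}

// Build the full report; the main frame gets "masked" if a cross-origin subtree blocked.
function buildReport(mainFrame) {
  const report = buildEntry(mainFrame, mainFrame.origin);
  if (anyCrossOriginBlocks(mainFrame, mainFrame.origin)) report.reasons.push("masked");
  return report;
}
```

Because masking is decided by comparing against the top-level origin, a same-origin frame nested inside a cross-origin frame is hidden together with the rest of that subtree.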

Examples

Example-1

{
  url: "a.com",
  src: "a.com",
  id: "x",
  name: "x",
  reasons: {},
  children: [
    { url: "a.com", src: "a.com", id: "y", name: "y", reasons: {}, children: [] },
    { url: "a.com", src: "a.com", id: "z", name: "z", reasons: {reason: "broadcastchannel"}, children: [] }
  ]
}
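A report shaped like the one above could be consumed as in the following sketch, which walks the tree and lists every frame that has reasons. The helper names are hypothetical. Because this explainer shows `reasons` both as an object with a `reason` key and as an array of strings, the sketch accepts either shape.

```javascript
// Normalize a `reasons` field to an array of strings.
// Accepts null, an array, {} or { reason: "..." }.
function reasonList(reasons) {
  if (reasons == null) return [];
  if (Array.isArray(reasons)) return reasons;
  return typeof reasons.reason === "string" ? [reasons.reason] : [];
}

// Depth-first walk that returns one record per frame with at least one reason.
function collectBlockingFrames(entry, path = []) {
  const here = [...path, entry.id || "(no id)"];
  const found = [];
  const reasons = reasonList(entry.reasons);
  if (reasons.length > 0) {
    found.push({ path: here.join(" > "), url: entry.url, reasons });
  }
  // Masked cross-origin entries have children === null, so fall back to [].
  for (const child of entry.children || []) {
    found.push(...collectBlockingFrames(child, here));
  }
  return found;
}
```

For the Example-1 tree, this would yield a single record for frame "z" at path "x > z" with the reason "broadcastchannel".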

Example-2 (cross-origin iframes)

If a cross-origin iframe is blocking, its reasons will be null, and the main frame will instead have "masked" as a reason.

{
  url: "a.com",
  src: "a.com",
  id: "x",
  name: "x",
  reasons: {reason: "masked"},
  children: [
    { url: "a.com", src: "a.com", id: "y", name: "y", reasons: {}, children: [] },
    /* for b.com */ { url: "", src: "b.com", id: "z", name: "z", reasons: null, children: null }
  ]
}

Example-3 (cross-origin subtree)

If a cross-origin iframe has a subtree under it, we mask the information of the subtree and report only the id, src, and name. This is true even when the subtree contains a same-origin subframe, as in the example below. When any of the cross-origin iframes is blocking, the main frame's reasons will include "masked".

{
  url: "a.com", /* a.com */
  src: "a.com",
  id: "x",
  name: "x",
  reasons: {reason: "masked"},
  children: [
    /* b.com and its subtree */ { url: "", src: "b.com", id: "y", name: "y", reasons: null, children: null }
  ]
}

Example-4 (multiple cross-origin iframes)

If multiple cross-origin iframes have blocking reasons, we randomly select one cross-origin iframe and report whether it blocked bfcache or not. For the selected frame, reasons reports "masked". Note that the main frame also reports "masked" as a reason because b.com is blocking. See the [Security and Privacy](https://github.com/rubberyuzu/bfcache-not-retored-reason/blob/main/NotRestoredReason.md#single-cross-origin-iframe-vs-many-cross-origin-iframes) section for more details.

{
  url: "a.com",
  src: "a.com",
  id: "x",
  name: "x",
  reasons: {reason: "masked"},
  children: [
    { url: "", src: "b.com", id: "b", name: "b", reasons: null, children: null },
    { url: "", src: "c.com", id: "c", name: "c", reasons: {reason:"masked"}, children: null },
    { url: "", src: "d.com", id: "d", name: "d", reasons: null, children: null }
  ]
}
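One reading of the selection rule in Example-4 is sketched below. It is an assumption-laden illustration, not the specified algorithm: the function name, the `blocked` flag on the input, and the injectable `randomInt` parameter are hypothetical, and `reasons` is modeled as an array of strings rather than the object shape used in the example.

```javascript
// frames: [{ src, id, name, blocked: boolean }], all cross-origin (hypothetical shape).
// randomInt(n) must return an integer in [0, n); it is injectable so tests are deterministic.
function reportCrossOriginFrames(
  frames,
  randomInt = (n) => Math.floor(Math.random() * n)
) {
  // Pick exactly one frame; only that frame may reveal its blocking bit.
  const selected = frames.length > 0 ? randomInt(frames.length) : -1;
  return frames.map((frame, index) => ({
    url: "",
    src: frame.src,
    id: frame.id,
    name: frame.name,
    // Non-selected frames always report null, whether or not they blocked.
    reasons: index === selected && frame.blocked ? ["masked"] : null,
    children: null,
  }));
}
```

With this shape, at most one cross-origin entry ever carries "masked", so a single report reveals at most one bit about cross-origin frames.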

Security and Privacy

Cross-origin iframes

We don’t want to leak cross-origin information. It is fine to expose things that the outer page already knows, i.e. the id="" and src="" attribute values (reference: Measure Memory API), but we certainly don’t want to expose the blocking reasons.

In order not to expose any new cross-origin information, when a cross-origin frame exists in the frame tree, this API will only report whether or not the cross-origin subtree blocked bfcache, and its frame attributes.

As explained in Example-3 in this explainer, when the frame tree contains a cross-origin subtree, we mask the subtree information; we will not show the specific reasons that blocked bfcache and will only report whether or not this subtree blocked bfcache.

NotRestoredReasons will be part of window.performance, and this is not accessible from cross-origin subframes. This is reported only to the top main frame.

Single cross-origin iframe vs many cross-origin iframes

When we expose whether or not a cross-origin iframe blocked bfcache, site authors could potentially infer the user's state. For example, when a page embeds an iframe of a social media site and the iframe's blocking status changes based on the user's logged-in state, site authors can use this information to tell whether the user is logged in.

We think exposing a single bit about whether or not a cross-origin iframe blocked bfcache is fine though. This information - whether or not a cross-origin subtree blocked bfcache - is not newly exposed. Site authors could already discover it by clearing all other bfcache blocking reasons and observing whether the page is restored from bfcache or not. So giving this bit does not give away new information, and it can be useful because site authors can work with the blocking sites' authors to remove the blockage.

However, when there are many cross-origin iframes, this API could give away many bits in one go. For example, a page could embed 20 different social media sites and tell which sites the user is logged in to, each bit possibly implying the user's state. This was technically possible to test before this API, but if we give away the information for all the frames, it becomes significantly easier for site authors to obtain.

In order to avoid this, we propose to expose only a single bit about cross-origin iframes; that is, if there are multiple cross-origin iframes, we randomly select one iframe and report whether or not it blocked bfcache.

See Example-4.

This way we can minimize cross-origin information leak.

Extension usage

If users have extensions installed and they caused bfcache to be blocked, exposing reasons can be tricky. There are two levels of new information exposure:

① Users have extensions installed and they are active on this page

② Specific extensions are active on this page

① is newly exposed, and maybe it’s okay. But we definitely don’t want to expose any signals to detect which extensions are installed and active (②). We could mask all the reasons related to extensions to say “Extensions blocked bfcache”, so that we don’t give any signal for ② (turning ② into ①). There are three possible cases of extensions:

a) Extensions executed script / had unload handlers and blocked bfcache

b) Extensions messaged the page and blocked bfcache

c) Extensions modified the page and as a result blocked bfcache

In case of a) and b), we can mask the specific information and just say “Extensions blocked BFCache”.

In case of c), too, we could say “Extensions blocked bfcache” instead of reporting the feature that the page started to use. For example, if an extension modified the page to use IndexedDB and that blocked bfcache, we would not report “Indexed DB usage” but only say “Extensions blocked bfcache”.

If exposing ① (the presence of extensions) is not okay at all, we can mask all of a), b) and c) as "Internal error". There are non-extension-related reasons that could go into this category, so this will not necessarily expose the presence of extensions.

After talking to the privacy team, we have decided to say “Extensions blocked bfcache” for all of the extension-related reasons.
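The decision above amounts to collapsing every extension-caused reason into one generic string. A minimal sketch, assuming a hypothetical internal record that flags whether a reason was caused by an extension (the `causedByExtension` field and the function name are illustrative only):

```javascript
const EXTENSION_REASON = "Extensions blocked bfcache";

// rawReasons: [{ text: string, causedByExtension: boolean }] (hypothetical shape).
// Returns the strings that would be reported, without duplicates.
function maskExtensionReasons(rawReasons) {
  const reported = [];
  for (const { text, causedByExtension } of rawReasons) {
    // Cases a), b) and c) all map to the same generic string.
    const value = causedByExtension ? EXTENSION_REASON : text;
    if (!reported.includes(value)) reported.push(value);
  }
  return reported;
}
```

Deduplicating matters here: reporting the generic string once per extension-caused reason would leak how many such reasons there were.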

Detailed design discussion

Only report blocking frames?

We could report only the blocking frames (and their parents), instead of reporting the whole tree every time including the non-blocking frames.

Specced reasons vs browser specific reasons

We should report reasons as strings. But we need to make sure that we differentiate between spec-mandated blocking reasons and browser-specific reasons.

We could add x- to the browser specific reasons to distinguish them.

// foo is browser-specific, bar is specced.
["x-foo", "bar"]
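If the `x-` prefix convention were adopted, a consumer could separate the two kinds of reasons as in this sketch. The function name `splitReasons` is hypothetical, and the reason strings are the placeholders from the snippet above.

```javascript
// Split reason strings into specced and browser-specific ("x-" prefixed) groups.
function splitReasons(reasons) {
  const specced = [];
  const browserSpecific = [];
  for (const reason of reasons) {
    (reason.startsWith("x-") ? browserSpecific : specced).push(reason);
  }
  return { specced, browserSpecific };
}
```

A site could then aggregate specced reasons across browsers while tracking browser-specific ones per user agent.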

API not available vs. non-history navigation

When the API is not available, notRestoredReasons will return undefined. When the navigation is not a history navigation, notRestoredReasons will return null.
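Calling code needs to tell these two cases apart from an actual report. A small sketch, where the function name and the returned labels are hypothetical and `navEntry` stands for a navigation timing entry:

```javascript
// Distinguish "API missing", "not a history navigation" and "reasons reported".
function describeNotRestoredReasons(navEntry) {
  const value = navEntry.notRestoredReasons;
  if (value === undefined) return "api-unavailable";
  if (value === null) return "not-a-history-navigation";
  return "reasons-reported";
}
```

Using strict comparisons is deliberate: a loose `== null` check would treat the two cases as one.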

How to expose the data

Performance Navigation Timing API

Performance Navigation Timing API tells you the type of navigation (BFCache, prerender). We could also extend this API to report the not-restored reasons.

window.addEventListener('pageshow', (event) => {
  if (!event.persisted) {
    const navEntries = performance.getEntriesByType('navigation');
    for (let i = 0; i < navEntries.length; i++) {
      console.log('Navigation entry:', i);
      const p = navEntries[i];
      // p.notRestoredReasons == [{url: "a.com", id: "x", reasons: {reason: "Broadcast channel"}, children: []}]
    }
  }
});

Considered alternatives

Reporting API

Reporting API lets you observe a deprecated feature usage / browser request intervention / crashes. We would like to have another category “bfcache” here.

Report-To:  {
              "max_age": 10886400,
              "endpoints": [{
                "url": "a.com"
              }]
            }
// -> [{url: "a.com", id: "x", reasons: {reason: "Broadcast channel"}, children: []}];

Pageshow API

The pageshow event is fired every time a page is shown, and its persisted parameter indicates whether the page was loaded normally or restored from bfcache.

We could extend the pageshow API by reporting the not-restored reasons when persisted == false (BFCache is not used).

But as per the WICG discussion, the Performance Navigation Timing API was preferred, and we are not going to implement this in the Pageshow API. Discussion meeting notes: https://docs.google.com/document/d/1GQpM8IvL4feXQ0oQdCQIPKhZZkMLNTYJQhBUntMxPkI/edit#heading=h.mo0swzgvknmp https://w3c.github.io/web-performance/meetings/2022/2022-03-31/index.html

window.addEventListener('pageshow', function(event) {
  if (!event.persisted) {
    console.log('BFCache was not used.');
    const reasons = event.notRestoredReasons;
    // [{url: "a.com", id: "x", blocked: true, reasons: ["Broadcast channel"], children: []}];
  }
});
  1. What information might this feature expose to Web sites or other parties, and for what purposes is that exposure necessary?

    Whether or not the website is blocking back/forward cache. (Though this information is already available using pageshow.persisted.)

  2. Do features in your specification expose the minimum amount of information necessary to enable their intended uses?

    Yes.

  3. How do the features in your specification deal with personal information, personally-identifiable information (PII), or information derived from them?

    N/A

  4. How do the features in your specification deal with sensitive information?

    N/A

  5. Do the features in your specification introduce new state for an origin that persists across browsing sessions?

    No.

  6. Do the features in your specification expose information about the underlying platform to origins?

    No.

  7. Does this specification allow an origin to send data to the underlying platform?

    No.

  8. Do features in this specification enable access to device sensors?

    No.

  9. Do features in this specification enable new script execution/loading mechanisms?

    No.

  10. Do features in this specification allow an origin to access other devices?

    No.

  11. Do features in this specification allow an origin some measure of control over a user agent's native UI?

    No.

  12. What temporary identifiers do the features in this specification create or expose to the web?

    None.

  13. How does this specification distinguish between behavior in first-party and third-party contexts?

    For first-party contexts, expose in full detail why the page was not restored from back/forward cache, including the blocking reasons. For third-party contexts, expose only whether or not the frame blocked back/forward cache.

  14. How do the features in this specification work in the context of a browser’s Private Browsing or Incognito mode?

    No difference.

  15. Does this specification have both "Security Considerations" and "Privacy Considerations" sections?

    It does now: Security and Privacy

  16. Do features in your specification enable origins to downgrade default security protections?

    No.

  17. How does your feature handle non-"fully active" documents?

    N/A

  18. What should this questionnaire have asked?

    Is it okay to **explicitly** expose whether or not cross-origin frames have blocked back/forward cache?

References