
Introducing new HTML elements that are pay-for-what-you-use #4697

domenic opened this issue Jun 13, 2019 · 20 comments

@domenic (Member) commented Jun 13, 2019

Recently, my team (within Chrome) has been working on initial explainers and explorations for new HTML elements: a virtual scroller, a toggle switch, and a toast. There are a couple of common issues we're attempting to tackle with these that need a central place to discuss, and the whatwg/html community is a good place to have those discussions. These issues are polyfillability and pay-for-what-you-use. They are somewhat connected, but I want to try using two separate issues to discuss them; we'll see if that works. This issue is about pay-for-what-you-use.

The problem: adding new features to the global namespace, including HTML elements, increases the weight of every web page. Each individual feature is not very costly, but together they create a tragedy of the commons. The fact that features baked into browsers are loaded for every page creates an attitude that more features belong in libraries, and fewer in the platform. But this leads to a fundamentally unergonomic platform, where you have to pull in large libraries (of inconsistent quality and accessibility) to accomplish basic UI patterns (such as, but clearly not limited to, virtual scrollers, toggle switches, and toasts). We're hoping that by finding a new way to make HTML elements (and other APIs) pay-for-what-you-use, we can break the web out of this paradigm.

Elsewhere on the platform, we're exploring a solution to this problem via built-in modules, imported via the JavaScript module system. For TC39 specs, the JavaScript standard library proposal is meant to power APIs such as Temporal. For web specs, the Web IDL modules infrastructure is meant to power specs such as KV storage. Overall, I think there is general interest from both the standards and browser-implementer communities in built-in modules as a solution for new JavaScript APIs.

This discussion is about how we can accomplish the same for HTML elements, not just JavaScript APIs. My opinion is that we can use the same solution as the rest of the platform. That is, you opt in to using an HTML element via the JavaScript module system, e.g.

<script type="module">
import { StdSwitchElement } from "std:switch";
document.body.append(new StdSwitchElement());
</script>

or

<script type="module" src="std:switch"></script>
<std-switch></std-switch>

(Note: see also the polyfillability discussion in #4696 regarding the std- prefix.)

Apart from the alignment-with-other-APIs question, the module system works well for this because it already has so much built in: polyfilling via import maps (modulo the #4696 discussions), lazy-loading via import(), etc. In the past, discussions around adding new systems for loading code, such as HTML imports, have gotten pushback specifically from vendors who prefer the JavaScript module system as the unifying way to load dependencies.
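For illustration, remapping the built-in specifier to a polyfill via an import map might look something like the following. (This is a hedged sketch: the exact fallback semantics were still being discussed in #4696, and the /polyfills/std-switch.js path is hypothetical.)

```html
<!-- Hypothetical import map: in a browser without a native std:switch,
     resolve the specifier to a polyfill module instead. -->
<script type="importmap">
{
  "imports": {
    "std:switch": "/polyfills/std-switch.js"
  }
}
</script>
<script type="module">
  import "std:switch"; // native where supported, polyfill otherwise
</script>
```

The appeal is that consuming code is identical in both cases; only the map changes.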

But, what do others think?

@domenic (Member, Author) commented Jun 13, 2019

I'll start off the discussion by transferring over from w3ctag/design-reviews#384 (comment), where @othermaciej discusses some potential objections to using built-in modules for new HTML elements, as opposed to script APIs. In particular, the ones that are separate from the polyfillability discussion (#4696) are:

  • Having to do an import per-element doesn't seem like it scales well.
  • It's weird for built-in HTML elements to pick up the quirk of custom elements that they start out as HTMLUnknownElement.

I'll give my initial impressions here, to start the dialogue, and @othermaciej (or others) can chime in.

I think the scaling should work reasonably well. An import per major API is a pattern we see in many JavaScript-module-based libraries today, including UI libraries. It also encourages finer-grained loading, so that e.g. you only load the switch control once you start rendering the settings page of your app, which needs it. If there are controls we think will always be used together (e.g., a <std-tabcontainer> plus <std-tab>), then they should be in the same module, but otherwise, explicitly stating what you plan to use seems pretty reasonable to me.

I agree that it's a bit weird for an element to start out as HTMLUnknownElement and then transition to something else. I don't see many other alternatives, though. If we allow dynamic loading of HTML elements to achieve the pay-for-what-you-use goal, I think we have two options: upgrade existing instances from HTMLUnknownElement, or don't upgrade them and leave them as HTMLUnknownElement. The former has precedent and infrastructure in browsers, whereas the latter introduces a strange state where some of the <std-switch>s are HTMLUnknownElement and others are StdSwitchElement. I guess a third option could be failing the import if any <std-switch>s exist on the page?
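As a sketch of the first option (upgrading existing instances), here is what the observable behavior might look like, reusing today's custom element upgrade semantics. Note that std:switch and StdSwitchElement are the proposed, not-yet-standard names, so this will not run in any current browser; it only illustrates the intended sequence.

```html
<std-switch></std-switch>
<script type="module">
  const el = document.querySelector("std-switch");
  // Before the module loads, `el` has none of the switch behavior
  // (per the discussion above, it would report as HTMLUnknownElement).
  import("std:switch").then(({ StdSwitchElement }) => {
    // Under the "upgrade" option, the pre-existing instance is
    // upgraded in place rather than left behind:
    console.assert(el instanceof StdSwitchElement);
  });
</script>
```

The third option (failing the import when instances already exist) would instead reject the promise returned by import() here.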

@zcorpan (Member) commented Jun 13, 2019

Have you considered lazy initialization of features? You'd get pay-for-what-you-use by just using the features you need, with the browser paying the cost of initializing those features when they are first used.

@domenic (Member, Author) commented Jun 13, 2019

@zcorpan, I think it depends on what you mean.

If you do blocking initialization, then I think that works transparently as you envision. This would be similar to a JavaScript API for which you synchronously load the implementation whenever someone first accesses the corresponding global property. This can cause some jank, though, unless everything's already loaded into memory anyway, which kind of defeats the point.

If you do async initialization, in order to get no-jank lazy-loading, then in my opinion having an explicit opt-in, and in particular an explicit point at which the feature is fully loaded, is quite helpful. E.g.

<script type="module" src="std:switch"></script>

<script type="module">
console.assert('on' in document.querySelector("std-switch")); // no longer HTMLUnknownElement
</script>

or

import "std:switch";

console.assert('on' in document.querySelector("std-switch")); // no longer HTMLUnknownElement

or

loadOptionsSheet.onclick = async () => {
  await import("std:switch");
  optionsSheet.innerHTML = `... <std-switch></std-switch> ...`;
};

@Jamesernator commented Jun 14, 2019

The one thing I am wary of is a Flash of Unstyled Content with builtin elements.

With user defined custom elements you can use your own logic to determine some initial DOM.

But with these std: elements, you would need to effectively reimplement a good chunk of them to prerender them. Worse, because some of them might have UA-specific styles, you might not even be able to prerender them.

Perhaps there could be a mechanism for blocking rendering of following elements until a module is loaded e.g.:

<script type="module" src="std:switch" render-blocking></script>

I'm not sure exactly what semantics this would need (maybe just hiding the rendering of elements after the current element until the module has loaded).
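A mitigation available today for custom elements, which could plausibly extend to std: elements if they go through the same definition machinery, is hiding not-yet-defined elements with the :defined pseudo-class. This is only a sketch under that assumption:

```css
/* Hide std-switch instances until their definition has loaded,
   avoiding a flash of unstyled/unbehaved content. Assumes :defined
   applies to these elements the same way it does to custom elements. */
std-switch:not(:defined) {
  visibility: hidden;
}
```

Unlike a render-blocking attribute, this keeps the rest of the page rendering and only suppresses the affected elements.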

@Yay295 (Contributor) commented Jun 14, 2019

I don't think JavaScript should have any say in what HTML objects are "imported" into your page. This seems like something that should be in <head>, like:

<head>
  <meta charset="utf-8">
  <title>HTML Import Example</title>
  <import>
    <switch>
    <toast>
  </import>
</head>

I do agree that this could be a good feature though. This could even help solve the other issue you just created: #4696. One issue mentioned there was that if you define a custom element with JavaScript, and then that element eventually becomes a standard native element, it could break the page. This wouldn't happen if the native element has to be imported first, because without the import it would continue working as a custom element.

@dbatiste commented Jun 14, 2019

@Yay295 : I think lazy-loading via dynamic imports would be worthwhile though.

I agree there is potential for collision if a future element becomes built-in, depending on how this works.

@kenchris commented Jun 14, 2019

For custom elements you can call https://developer.mozilla.org/en-US/docs/Web/API/CustomElementRegistry/whenDefined

Should this also work for the built-ins, even if they aren't implemented as custom elements internally in the browser?
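whenDefined() resolves a promise once a tag name gets a definition. As a DOM-free illustration of that pattern (an API-shape sketch, not the actual CustomElementRegistry implementation), the registry side looks roughly like this:

```javascript
// Minimal sketch of the whenDefined() pattern: define() records a
// constructor for a tag name and wakes up anyone awaiting it;
// whenDefined() returns a promise for that constructor.
class TinyRegistry {
  constructor() {
    this.definitions = new Map(); // tag name -> constructor
    this.pending = new Map();     // tag name -> { promise, resolve }
  }

  define(name, ctor) {
    this.definitions.set(name, ctor);
    const waiter = this.pending.get(name);
    if (waiter) waiter.resolve(ctor); // resolve outstanding whenDefined() promises
  }

  whenDefined(name) {
    // Already defined: resolve immediately.
    if (this.definitions.has(name)) {
      return Promise.resolve(this.definitions.get(name));
    }
    // Not yet defined: hand out a shared pending promise.
    let waiter = this.pending.get(name);
    if (!waiter) {
      let resolve;
      const promise = new Promise((r) => { resolve = r; });
      waiter = { promise, resolve };
      this.pending.set(name, waiter);
    }
    return waiter.promise;
  }
}
```

If std: elements flowed through the same machinery, `await customElements.whenDefined("std-switch")` would give exactly this kind of ready signal, whether the definition arrived natively or from a polyfill.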

@zcorpan (Member) commented Jun 14, 2019

@domenic, I envision sync initialization when used in script, however I suppose it can be hard to define what constitutes "use" in script. createElement and accessing the global are obvious for elements, but maybe there are more.

When an element is found by the parser, it can load the implementation as part of the tree builder, before inserting the element. This would block the parser but not the main thread (except maybe for document.write and innerHTML).

@domenic (Member, Author) commented Jun 14, 2019

@zcorpan yeah, blocking the main thread on innerHTML or createElement seems pretty bad to me. Especially if you're in a browser that doesn't implement the element, and so you're polyfilling using import maps (#4696 bleeding into this issue). Then you'd block on a network fetch, not just a potential disk fetch. Async initialization still seems necessary to me if we want pay-for-what-you-use.

@dcleao commented Jun 14, 2019

Declaring a certain XML namespace for a certain group of standard elements would support this well.

@zcorpan (Member) commented Jun 14, 2019

@domenic if you're loading a polyfill, can't that be done up front and async?

Otherwise, if you don't implement the new element, how would you know to fetch a polyfill from parsing the tag?

@domenic (Member, Author) commented Jun 14, 2019

@zcorpan I'm confused, I thought we were talking about your design where you lazily-load elements as they're encountered.

@kumarharsh commented Jun 15, 2019

What would the behaviour be when the user has disabled JavaScript? Would the elements be stuck in the HTMLUnknownElement form?

@zcorpan (Member) commented Jun 15, 2019

@domenic I think the loading of the implementation can differ between native and polyfill. The lazy implicit approach can make sense for native, but not for polyfill.

Since polyfills last a short time (a few years; cf. picturefill), and the native implementation lasts decades (at least), I think it makes sense to optimize the ergonomics of the native version.

@domenic (Member, Author) commented Jun 15, 2019

I guess that's where I disagree. Again it's #4696 bleeding into this. I think we should not have major behavior differences of that sort between the polyfill and the eventual native version, especially given our poor track record of achieving the level of all-modern-browser support for new elements that is necessary for web developers to use them confidently. (See the implementation timelines for details, dialog, input type=date, datalist, ...)

I'll also note that, although a network fetch for the polyfill is very bad, so is blocking innerHTML or createElement on the form of lazy-loading proposed in the OP, which could even be from the network, if browsers really want to slim down their binary size. Indeed, in many emerging markets we've found that network access is actually faster than disk access some significant percent of the time. (I can try to find more exact data if that would be helpful.)

@caridy commented Jun 15, 2019

@domenic I like the direction. Importing std:switch at any level should work well for us. Many of the other suggestions in this thread assume that you own/control the app, and that the app should be the one provisioning the standard components; that doesn't work for us at Salesforce. We definitely favor sticking to the regular import semantics rather than inventing something new for these elements.

@dbaron (Member) commented Jun 19, 2019

The opening comment here says:

The problem: adding new features to the global namespace, including HTML elements, increases the weight of every web page. Each individual feature is not very costly, but together they create a tragedy of the commons.

I think it would be useful to be clearer about what costs you're trying to avoid. Is it the cost of adding more names to the namespace of elements? Is it performance or memory costs of having more features in the browser (and if so, does the mechanism being proposed here actually help those)? Other things?

@domenic (Member, Author) commented Jun 28, 2019

I think it would be useful to be clearer about what costs you're trying to avoid. Is it the cost of adding more names to the namespace of elements? Is it performance or memory costs of having more features in the browser (and if so, does the mechanism being proposed here actually help those)? Other things?

Sorry for missing this, @dbaron. With this issue, I am targeting the per-page performance and memory costs of having more features in the browser. I would like to come up with a technological framework which allows implementations to only load what a given page will use, from disk into memory. At least in the case of high-level APIs, like the new HTML elements proposed here, which do not need to be intricately entangled into other parts of the system.

The idea is to overcome one of the technical arguments for "people should just do that in libraries". Namely, when the browser bakes something in, every page pays for it. Whereas when people pull in a library, only the page pulling in that library uses it. We'd like to get to a place where web standards also have the ability to provide features that are pay-for-what-you-use, like libraries currently do.

The JavaScript module system fits in here very naturally, as it creates an asynchronous load point, plus adds a lot of infrastructure that makes it easy to write code which only runs after that asynchronous load has completed. Our experimentation with built-in modules in Chrome (for the elements mentioned in the OP, plus KV Storage) has shown this to be pretty natural to integrate into the loading pipeline, and get these lazy-loading benefits.

You can imagine other technical mechanisms, e.g. some of the things @zcorpan and I are discussing above, where calling createElement('std-switch') or parsing HTML containing <std-switch> or accessing window.StdSwitchElement blocks the return from that API until the implementation is loaded from disk into memory. But I think an asynchronous mechanism will work better for page performance and responsiveness, and when I survey async code-loading mechanisms in the platform today, JS modules are a pretty good one.

@rniwa (Collaborator) commented Jun 28, 2019

With this issue, I am targeting the per-page performance and memory costs of having more features in the browser. I would like to come up with a technological framework which allows implementations to only load what a given page will use, from disk into memory. At least in the case of high-level APIs, like the new HTML elements proposed here, which do not need to be intricately entangled into other parts of the system.

This should already be the case in most operating systems like macOS and iOS, as long as the browser engine's binary is ordered so that the functions related to a single element are localized in the text segment of the engine's binary. The modern OS kernel is smart enough to pull in only the part of a binary that's being used, and as long as the text segment is not dirty, the kernel is smart enough to dynamically evict it from physical memory when there is sufficient memory pressure. Tools like vmmap and footprint should provide this information.

There should be virtually no benefit beyond the perhaps minuscule amount of memory used to have more constructor names visible on the global object. Even this shouldn't be an issue, because the list of builtin objects doesn't usually change from document to document, and JS engines like JSC can share the underlying data structures across multiple documents to save memory.

On the contrary, having this kind of dynamic module loading behavior for builtin elements would probably result in more dirty memory use, because each document may now load a different set of elements, and therefore requires a different underlying representation, at least in WebKit / JSC.

Now, there might be some runtime benefit in having fewer builtin elements, because that would basically mean the parser has to search for the matching builtin element in a smaller set. And even a hash table with amortized O(1) lookup time can be slowed down by cache and TLB misses if the table is sufficiently large. However, this kind of lookup is highly optimizable at compilation time and can be shared across documents. Because a typical web page tends to have many documents due to frames, about:blank navigations, etc., the amortized cost of such a hash table in each process tends to be negligibly small and highly performant.

Again, on the contrary, looking up an element name in a dynamically mutating hash set is a much more challenging problem. At least in WebKit's implementation, every custom element instantiation is way slower than builtin element instantiation due to the secondary lookup we have to do in the custom elements registry (~20% the last time I checked).

So I'd really love to see some empirical evidence for the runtime & memory cost of having more builtin elements being higher than the proposed dynamically loaded elements if you have any.

@domenic (Member, Author) commented Jun 28, 2019

Sure, I'll work on getting that for you.
