Idea: extract “static” component subtrees to HTML during build #17479

Closed
devongovett opened this issue Nov 28, 2019 · 10 comments

Comments

@devongovett
Contributor

devongovett commented Nov 28, 2019

This came up in a Twitter thread with @gaearon and @aweary, and we decided to move the discussion here. I’ll try to summarize the conversation so far below.

The basic question was whether a build tool could extract the static parts of a component tree ahead of time into HTML, and ship smaller JS to the client with only the dynamic parts. This could have benefits for code size and hydration performance for statically generated sites. It would likely have a smaller impact on dynamically generated server rendered sites, but it’s possible there are static parts that could be extracted there too (e.g. header, footer, article content, etc.).

There are a few potential ways to go about this, each with its own tradeoffs.

  1. Do something like what Ember and other template compilers do: generate some kind of IR from components and inject dynamic content into slots at runtime. This would likely require a lot of changes to React itself. @aweary seems to be working on some kind of compiler to do just that.
  2. Rewrite the JS with something like Prepack, similar to what @trueadm did here (though it sounded like the output was quite variable in size).
  3. Rewrite the component tree to generate a different but equivalent tree with the static parts hoisted out. This is similar to this Babel plugin, but taken much further to work at the whole-tree level instead of the component level. It would also remove the static parts from the JS altogether and generate static HTML, avoiding duplicate content in JS and unnecessary hydration cost. Some way to allow static HTML in the middle of a tree to be reused might be needed, but maybe compiling to multiple roots would work?

Obviously a lot more thought is needed here. As @gaearon noted, a solid definition of “static” will be important for this discussion. Mine is that it could be rendered to HTML and never updated by JS, but perhaps people have other ideas.

@zhangenming
Contributor

By the way, is Prepack still being maintained?
No code has been committed in the past year.

@devongovett
Contributor Author

devongovett commented Nov 28, 2019

A very simple example of input/output for discussion is below. Obviously it would be much more complex in practice.

This example is emulating some kind of static site generator that generates the full HTML page in React. There are static and dynamic parts (dependent on client state).

Home.js

import React from 'react';
import {Header} from './Header';

// Example of a page level component.
export function Home() {
  let [count, setCount] = React.useState(0);
  return (
    <html>
      <body>
        <Header />
        <h1>This is the homepage</h1>
        <p>Some dynamic content: {count}</p>
        <button onClick={() => setCount(c => c + 1)}>Increment</button>
      </body>
    </html>
  );
}

Header.js

import React from 'react';

export function Header() {
  return (
    <header>
      This is the header. It’s pretty static.
    </header>
  );
}

Generated HTML:

<html>
  <body>
    <script async src="bundle.js"></script> 
    <header>
      This is the header. It’s pretty static.
    </header>
    <h1>This is the homepage</h1>
    <div id="placeholder1"></div>
  </body>
</html>

Rough generated JS:

import React, {Fragment} from 'react';
import ReactDOM from 'react-dom';

function HomeGenerated() {
  let [count, setCount] = React.useState(0);
  return (
    <Fragment>
      <p>Some dynamic content: {count}</p>
      <button onClick={() => setCount(c => c + 1)}>Increment</button>
    </Fragment>
  );
}

// Hypothetical method that works like ReactDOM.render, but replaces the
// placeholder node instead of rendering into it.
ReactDOM.renderPlaceholder(<HomeGenerated />, document.getElementById('placeholder1'));

As you can see, the idea is to generate HTML for the static parts. Rather than rendering the dynamic parts as with normal SSR and hydrating them at runtime, a placeholder is rendered instead, which is replaced on the client side (possibly these could be combined too?). The generated JS includes a version of the original component with only the dynamic parts left.
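
The build-time split described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the element shape (`tag`, `children`, `dynamic`) and the `extractStatic` function are hypothetical, and how a real tool would decide what counts as dynamic is exactly the open question in this thread.

```javascript
// Hypothetical element shape: { tag, children, dynamic }, where children are
// strings or further elements. `dynamic: true` marks a subtree that depends
// on client state. None of this is a React API.
function extractStatic(root) {
  const dynamicParts = []; // subtrees that still need client-side JS

  function toHtml(node) {
    if (typeof node === 'string') return escapeHtml(node);
    if (node.dynamic) {
      // Emit a placeholder instead of the subtree and remember the subtree.
      const id = `placeholder${dynamicParts.length + 1}`;
      dynamicParts.push({ id, node });
      return `<div id="${id}"></div>`;
    }
    const inner = node.children.map(toHtml).join('');
    return `<${node.tag}>${inner}</${node.tag}>`;
  }

  return { html: toHtml(root), dynamicParts };
}

function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

const page = {
  tag: 'body',
  children: [
    { tag: 'header', children: ['This is the header.'] },
    { tag: 'h1', children: ['This is the homepage'] },
    { tag: 'p', dynamic: true, children: ['Some dynamic content'] },
  ],
};

const result = extractStatic(page);
console.log(result.html);
console.log(result.dynamicParts.map(part => part.id));
```

Unlike the example above, this sketch emits one placeholder per dynamic subtree rather than merging adjacent dynamic siblings into a single fragment; either choice seems workable.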

@gaearon
Collaborator

gaearon commented Nov 28, 2019

Can you flesh out what "static" means more? Can Header take props? Render something conditionally? Can it contain event handlers? Read context? Contain components that do that?

Also, can components above Header do that?

@devongovett
Contributor Author

I don't want to limit what is possible for users, so the answer to all of your questions should be "yes". What it optimizes to is the question.

I don't think this should happen around the original component boundaries, but the entire tree for a page/app. So, rather than asking whether Header accepts props, can have event handlers, etc., we should ask "what parts of the rendered DOM tree for this page could be represented as HTML that will never change?". The trick is figuring out how to render those parts ahead of time, and inject the dynamic parts without re-rendering the whole tree on the client.

Seems like two primitives might be needed:

  1. A way to replace/insert a node in HTML with dynamic client rendered node(s). This would allow "hydrating" dynamic parts of a tree without re-rendering the whole thing.
  2. A way to include an existing static HTML node in a React tree. This would allow including static content in the middle of a dynamic tree without re-rendering it on the client.

With those, a compiler could generate HTML and create a new component tree to render just the dynamic parts at runtime. Kinda like how we used to add dynamic parts with jQuery or similar (we didn't usually re-render the whole app again), but automatic.
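
To make the two primitives a bit more concrete, here is a minimal sketch on a toy tree of plain objects, so it runs without a browser. The names `replacePlaceholder`, `staticRef`, and `resolve` are made up for illustration; nothing like them exists in React today.

```javascript
// Toy stand-in for DOM nodes.
function el(tag, attrs = {}, children = []) {
  return { tag, attrs, children };
}

// Primitive 1 (hypothetical): swap a placeholder for client-rendered nodes,
// leaving every other node untouched.
function replacePlaceholder(root, id, newNodes) {
  for (let i = 0; i < root.children.length; i++) {
    const child = root.children[i];
    if (typeof child === 'string') continue;
    if (child.attrs.id === id) {
      root.children.splice(i, 1, ...newNodes);
      return true;
    }
    if (replacePlaceholder(child, id, newNodes)) return true;
  }
  return false;
}

// Primitive 2 (hypothetical): a marker in the client tree meaning
// "reuse the server-rendered node with this id as-is".
function staticRef(id) {
  return { staticRef: id };
}

function findById(root, id) {
  if (typeof root === 'string') return null;
  if (root.attrs.id === id) return root;
  for (const child of root.children) {
    const hit = findById(child, id);
    if (hit) return hit;
  }
  return null;
}

// Build the client tree, resolving markers to existing nodes rather than
// creating new ones.
function resolve(node, existingRoot) {
  if (typeof node === 'string') return node;
  if (node.staticRef) return findById(existingRoot, node.staticRef);
  return el(node.tag, node.attrs, node.children.map(c => resolve(c, existingRoot)));
}

const header = el('header', { id: 'static1' }, ['This is the header.']);
const serverTree = el('body', {}, [header, el('div', { id: 'placeholder1' })]);

// Primitive 1: only the placeholder is replaced.
replacePlaceholder(serverTree, 'placeholder1', [el('p', {}, ['count: 0'])]);

// Primitive 2: the client tree points at the existing header object.
const clientTree = resolve(
  el('body', {}, [staticRef('static1'), el('p', {}, ['count: 0'])]),
  serverTree
);
```

In both cases the header object is never recreated, which is the property the primitives are meant to guarantee for static content.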

I can try to write a more complicated example exploring these ideas further tomorrow.

@themre

themre commented Nov 28, 2019

@zhangenming no, Prepack is on hold for now.

@trueadm
Contributor

trueadm commented Nov 28, 2019

From experience working on this in Prepack: we did not see any real perf wins from separating out static and dynamic content in our real-world apps. In benchmarks the perf wins looked great, but the nature of React apps makes things more complex. Higher-order components, conditional early returns, spreading props, etc. make it hard to reason about all possible cases ahead of time.

You can probably still play around with Prepack from master using this guide: https://github.com/facebook/prepack/wiki/react-compiler

Note: we stopped working on this a while back and the project is on hold.

Furthermore, I believe a better solution would be as you mentioned in point 1:

Do something like what Ember and other template compilers do: generate some kind of IR from components and inject dynamic content into slots at runtime. This would likely require a lot of changes to React itself. @aweary seems to be working on some kind of compiler to do just that.

I also explored this earlier this year and was able to build a hacky compiler (using the Babylon parser) in my spare time that took a strict set of Flow-typed React components (function components only) and optimized them into static IR trees that contain the template, the control logic, and a bunch of compute functions that evaluate dynamic slots at runtime. It improved performance measurably, but also required an entirely new React reconciler and runtime to support the new IR format. It also didn't support Concurrent Mode, Suspense and a bunch of other important features, nor was it feature complete; I simply ran out of spare time to keep going on it. However, I did show the project to @aweary, so hopefully he was able to extract the ideas I had there.
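
As a rough illustration of the "template plus compute functions" shape described above, here is a toy sketch. The IR layout is invented for this example and is not the format of the compiler mentioned here or of any real one; control logic and HTML escaping are left out entirely.

```javascript
// Hypothetical IR: static template chunks interleaved with slot indices.
const counterIR = {
  template: ['<p>', 0, ': ', 1, '</p>'],
  // One compute function per slot, evaluated at runtime from state.
  compute: [
    state => state.label,
    state => String(state.count),
  ],
};

// Fill each slot by running its compute function; static chunks pass through.
function renderIR(ir, state) {
  return ir.template
    .map(part => (typeof part === 'number' ? ir.compute[part](state) : part))
    .join('');
}

// On update, only slots whose computed value changed need to be touched.
function changedSlots(ir, prevState, nextState) {
  const changed = [];
  ir.compute.forEach((fn, slot) => {
    if (fn(prevState) !== fn(nextState)) changed.push(slot);
  });
  return changed;
}

const before = { label: 'clicks', count: 1 };
const after = { label: 'clicks', count: 2 };

console.log(renderIR(counterIR, after));
console.log(changedSlots(counterIR, before, after));
```

The appeal is that the static chunks are known at compile time, so a runtime only has to re-evaluate slots. The cost, as noted above, is that the runtime consuming such an IR is no longer the existing reconciler.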

Ultimately though, I don't think this is the perfect approach either. I have some other ideas but I don't really want to dive into them just yet. There's a big risk that if I do, I'll get too excited by the prospect of them, and not get my current work done before the end of the half. :P

@devongovett
Contributor Author

What I'm thinking about is quite different than what Prepack does, and definitely much simpler than an IR compiler + new reconciler. I'm thinking of something much closer to traditional SSR + hydration, with some additional integration with the JS bundler. The two main differences from current SSR + hydration are:

  1. Don't re-render static content client side. Include a placeholder in the tree to reuse the existing server rendered DOM node.
  2. Exclude the JS code that generates the static nodes from the client JS bundle.

I think this could be implemented by integrating react-dom/server and a JS bundler. During rendering, React knows which components have state, effects, event handlers, etc. These components could be marked and returned in addition to the rendered HTML. The bundler could then exclude pure components that only render some HTML from props. The JSX could be rewritten to exclude these components and replaced with a placeholder referencing the HTML via some data attribute. During hydration, React could reuse the nodes pointed to by these placeholders rather than re-rendering.
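
A minimal sketch of the bundler side of that idea, assuming the server render could report per-component facts. The `renderLog` shape and `planClientBundle` are hypothetical; react-dom/server does not expose anything like this.

```javascript
// Hypothetical record of what a server render observed for each component.
const renderLog = [
  { name: 'Home', module: './Home.js', usesState: true, usesEffects: false, hasHandlers: true },
  { name: 'Header', module: './Header.js', usesState: false, usesEffects: false, hasHandlers: false },
  { name: 'Article', module: './Article.js', usesState: false, usesEffects: false, hasHandlers: false },
];

// A component is treated as dynamic if it could change after first render.
function isDynamic(entry) {
  return entry.usesState || entry.usesEffects || entry.hasHandlers;
}

// Decide which modules the client bundle still needs.
function planClientBundle(log) {
  const keep = new Set();
  const drop = new Set();
  for (const entry of log) {
    (isDynamic(entry) ? keep : drop).add(entry.module);
  }
  // If a module was dynamic in any instance, it has to stay.
  for (const mod of keep) drop.delete(mod);
  return { keep: [...keep].sort(), drop: [...drop].sort() };
}

const plan = planClientBundle(renderLog);
console.log(plan);
```

This only covers the bookkeeping. Rewriting the JSX of a kept component so that it no longer imports the dropped ones, and pointing placeholders at the server-rendered HTML, would be the harder part.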

The output I'd like to see is static HTML and some minimal JS to make it interactive. Anything non-interactive can be automatically excluded. This is closer to the days when we manually wrote HTML + JS with e.g. jQuery to make parts of a page interactive, but done automatically at build time from a single component tree.

Use cases

Tools like Gatsby and Next.js use React to build static websites, with each page prerendered to HTML and then hydrated at runtime. These are very common because they let you mix static and dynamic very easily and take advantage of React's component model. The downside is that a lot of work is duplicated at runtime that's already done at build time in order to hydrate the page and enable the interactive parts. Much of that might be unnecessary, because most of the content likely will not change until the user browses to another page. This results in large JS bundles duplicating the same content that's already in the HTML, longer time to interactive, etc.

Let's take a blog built with Gatsby as an example. It's mostly static content. There might be a few interactive elements at the top and bottom of a post, e.g. to change the theme from light to dark mode, leave comments, subscribe, etc. The main content of a post is mostly static though. Unfortunately, due to the way hydration works, you have to re-render the entire page as generated at build time in order to make the dynamic parts interactive. This requires all of the data used to generate the page to be loaded as a JSON file (including the entire text of the blog post, again), and all of the component code used to generate the page to be downloaded. All of this was already rendered just fine in the HTML, and in fact you could turn off JavaScript and the page would still render mostly fine with some interactive parts disabled. Most of the JS that's loaded is not really needed.

You might argue that the JS would still be needed to render a different blog post on some later navigation without reloading the whole page. This is true, but there are alternatives. One is to download the code for those later, for subsequent page navigations. Another is to simply download the pre-rendered HTML rather than the data for the subsequent post just like during the initial page load. Or you could not bother with client side routing at all during full page navigations, and opt to use a service worker to speed things up. I'm not sure that client rendering the entire page makes the most sense for full page content sites. It duplicates a lot of work that's already been done at build time, and results in heavier sites that may not be faster in reality vs what's possible via other means.

Perhaps React is the wrong tool for static sites. Maybe we should render some static HTML with a template language and inject the dynamic parts manually via a client side script. The popularity of tools like Gatsby and Next shows that people want to use the power of the React component model to generate their static sites though. The downside is the weight of these sites vs traditional sites generated by other SSG tools (e.g. Jekyll). I think we can do better.

@mohsen1

mohsen1 commented Dec 2, 2019

This is something I have been thinking about for a while now, precisely because of how Gatsby works. See my issue on Preact. There are a lot of "mostly static content" React websites in the wild that could benefit from this sort of optimization.

AFAIK React hydration "just works" if you take out the static contents. It should be easy to mark some components as "static" and build a quick compiler that simply removes those components from the JS bundle. I think it's worth implementing something like this in userland, ignoring ReactDOM.hydrate's client/server mismatch errors, to see the benefits on websites like Gatsby's own documentation.

@sebmarkbage
Collaborator

"Flight" is meant to address this problem space but in a different way. I think it's easiest just to show it when I land a bit more code.

@gaearon
Collaborator

gaearon commented Mar 24, 2021

I think we can close this now.

Server Components address this by separating parts that can be executed ahead of time (whether during build or on the server) from parts that are state-driven and need to be on the client. While they require some manual judgement about which component lives in which world, we think that is good because you have precise guarantees about what gets shipped to the client. A more automated compiler could still be possible on top, though.

Server Components indeed relate both to server rendering to HTML (which is a separate topic) and to bundling. Rendering them to HTML is a missing piece, but there’s ongoing work on that (#20970) in addition to the Server Components work itself, so that’s work in progress too.

Together these two projects address the problem of loading unnecessary code and doing unnecessary rendering computation on the client. There’s still some duplication of the content and data needed to hydrate the client. However, the data itself is more limited than in traditional approaches because it’s essentially an already prepared render output instead of raw response JSON. We also plan to layer on an additional optimization to reduce duplication of text content and attributes, so that the initial render reuses what’s in the HTML.
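
The difference between shipping raw response JSON and shipping prepared render output can be illustrated with a toy comparison. Everything here is an assumption made for the example: the data, the `renderArticle` function, and the nested-array encoding are invented and are not the actual Server Components wire format.

```javascript
// Hypothetical raw API response that a client-rendered page would download.
const rawResponse = {
  id: 42,
  title: 'Hello world',
  body: 'Article text goes here.',
  author: { id: 7, name: 'Ada', email: 'ada@example.com', bio: 'Writes things.' },
  revisions: [{ rev: 1, editor: 7 }, { rev: 2, editor: 7 }],
  internalFlags: { indexed: true, legacyImport: false },
};

// Runs ahead of time (build or server). Only what the UI actually shows ends
// up in the output, so unused fields never reach the client.
function renderArticle(post) {
  return [
    'article',
    ['h1', post.title],
    ['p', `By ${post.author.name}`],
    ['div', post.body],
  ];
}

const prepared = JSON.stringify(renderArticle(rawResponse));
const raw = JSON.stringify(rawResponse);

console.log(prepared.length, raw.length);
```

The text content (title, body) is still duplicated between the HTML and the prepared output in this sketch, which is the remaining duplication the comment above says a later optimization is meant to reduce.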

All of this is work in progress, but I think it addresses the original issue well enough that we can close this.

https://reactjs.org/server-components

@gaearon gaearon closed this as completed Mar 24, 2021