Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gatsby): lazy bundle page components in dev server #27884

Merged
merged 36 commits into from
Nov 17, 2020
Merged

Conversation

KyleAMathews
Copy link
Contributor

@KyleAMathews KyleAMathews commented Nov 6, 2020

This means we only compile the core runtime code initially — page components are only compiled when the user visits the page.

This can lead to time savings on compiling the dev bundle up to 80% — e.g. building the dev bundle for one test site w/ a large number of components dropped from 42 seconds to 5 seconds.

On gatsbyjs.com, the bundle compilation time dropped from 145 seconds to 27 seconds for a 81% decrease. Smaller sites won't see as a dramatic of difference — more around ~40-60%.

This means we only compile the core runtime code initially — page components are only compiled when the user visits the page.
@KyleAMathews KyleAMathews requested a review from a team as a code owner November 6, 2020 21:31
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Nov 6, 2020
@KyleAMathews KyleAMathews added type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement. and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Nov 6, 2020
Copy link
Contributor

@pvdz pvdz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(some comments, by no means a thorough review)

return Promise.resolve(lazyRequires.lazyComponents[chunkName])
} else {
console.log(`trigger compilation on the server for`, chunkName)
return new Promise(resolve => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This promise doesn't reject (errors would be lost) and the timeout timer does not call reject nor resolve, meaning I think it ends up in a zombie state once it hits that path, since the interval that would call resolve and is the only way to resolve the promise is stopped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing here to reject on I don't think. We're just telling the server to start compiling the page component and then... waiting. We're not doing anything per se.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything can crash. this.loadComponent can fail for any number of reasons 🤷

const componentChunkPromise = this.loadComponent(
componentChunkName,
pageData.path
).then(component => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when loadComponent throws an error? Or when the then callback does? I think this goes to most promises here. Is there a general uncaught rejection handler in place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. I know we've talked about adding a general error boundary for this stuff in React land but that work hasn't been done yet AFAIK. @wardpeet or @pieh might know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProdLoader's loadComponent method current implementation catches rejection internally and it resolve to null -

const loadComponent = chunkName =>
asyncRequires.components[chunkName]
? asyncRequires.components[chunkName]()
.then(preferDefault)
// loader will handle the case when component is null
.catch(() => null)
: Promise.resolve()
super(loadComponent, matchPaths)

There should be some null checks by whatever call loadComponent method, and I think behaviour should be similar as when user tries to navigate to a page that can't load page-data.json file -

if (data.notFound) {
// check if html file exist using HEAD request:
// if it does we should navigate to it instead of showing 404
return doFetch(rawPath, `HEAD`).then(req => {
if (req.status === 200) {
// page (.html file) actually exist (or we asked for 404 )
// returning page resources status as errored to trigger
// regular browser navigation to given page
return {
status: PageResourceStatus.Error,
}
}
// if HEAD request wasn't 200, return notFound result
// and show 404 page
return data
})
}

(above code is done for the most part as resource loading resilience attempt, to try to load the page in traditional way if resources are blocked or missing)

DevLoader (speaking about master branch, not this PR as I didn't follow changes yet) doesn't have anything like that but it also doesn't do any async work, because we don't split chunks in develop and everything load at once, so if "module loading" would cause errors it will likely break entire runtime at the time we import sync-requires (so

import syncRequires from "$virtual/sync-requires"
)?

packages/gatsby/src/bootstrap/requires-writer.ts Outdated Show resolved Hide resolved
packages/gatsby/src/redux/actions/public.js Outdated Show resolved Hide resolved
@pvdz
Copy link
Contributor

pvdz commented Nov 6, 2020

You had me at 80%! 🚀

@pvdz pvdz added topic: dev tooling topic: performance Related to runtime & build performance labels Nov 6, 2020
…robably has errors in their code or something bigger is wrong so let them just refresh, etc.
@KyleAMathews
Copy link
Contributor Author

Guessing the development_runtime failures are from the compilation not getting triggered under certain scenarios or hitting timeouts. Will dig in Monday.

Copy link
Contributor

@wardpeet wardpeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! I'm not the biggest fan of the redux dance, I rather have it all scoped inside a webpack plugin but that would be more work so let's go with this approach 💪 💯

What's the delay on page load?

packages/gatsby/src/bootstrap/requires-writer.ts Outdated Show resolved Hide resolved
packages/gatsby/cache-dir/dev-loader.js Outdated Show resolved Hide resolved
KyleAMathews and others added 2 commits November 9, 2020 12:01
Co-authored-by: Ward Peeters <ward@coding-tech.com>
Co-authored-by: Ward Peeters <ward@coding-tech.com>
@KyleAMathews
Copy link
Contributor Author

What's the delay on page load?

depends on how big the page component is but for simple sites it was very reasonable e.g. < 200ms

@KyleAMathews
Copy link
Contributor Author

KyleAMathews commented Nov 10, 2020

Ended up going on quite the journey today — pages with static queries weren't loading — turns out we don't know what the static queries for a page component are until we bundle it. So we have to first trigger bundling it, wait until it's written out (now including static queries), and then finally load the page-data.json & static queries in the browser.

56d5288ae5

We only discover static queries for pages after the component is compiled.

Which means when we lazily add page components, we initially don't know
what static queries should be loaded. To track this, we mark page-data.json
files as not bundled & only after the staticQueryHashes is written do we
accept the page as loaded.
@KyleAMathews
Copy link
Contributor Author

This is definitely weirder now so not sure if there's a better way to handle this. With the per-page querying work, it'd probably be easier to track this as we'll be tracking the state of each page cc @pieh @vladar

@pieh
Copy link
Contributor

pieh commented Nov 10, 2020

With the per-page querying work, it'd probably be easier to track this as we'll be tracking the state of each page

We can potentially trigger that in page-data express handler ( #27882 - that will be one of triggers for query on demand).

Tricky part however is that in legacy mode (non query on demand) page-data endpoints will be hit for preloading too and this would need additional work so we can distinguish between page-data fetches needed to render a page from those that preload data for other pages. Unless we would want to precompile those on hovers as well? Would definitely need some testing

Copy link
Contributor

@vladar vladar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a shallow review - mostly checked if there is something that leaks through the experimental flag. Couldn't find anything particularly standing out.

Only the additional loader.loadPage(window.location.pathname).then() in app.js is not quite clear.

packages/gatsby/cache-dir/app.js Outdated Show resolved Hide resolved
Comment on lines 14 to 32
// Tell the server the user wants to visit this page
// to trigger it compiling the page component's code.
const checkForBundle = () => {
// Check if the bundle is included and return.
if (process.env.NODE_ENV !== `test`) {
delete require.cache[
require.resolve(`$virtual/lazy-client-sync-requires`)
]
}

const lazyRequires = require(`$virtual/lazy-client-sync-requires`)
if (lazyRequires.lazyComponents[chunkName]) {
resolve(lazyRequires.lazyComponents[chunkName])
} else {
setTimeout(checkForBundle, 100)
}
}

checkForBundle()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to consider that requested chunkname actually would never show up in the bundle and this would never resolve or reject?

because of a lot of async execution here (including networking) there still might be the cases where template might no longer be there (was removed since initial request)

We did hit problem like this Vlad when working on query on demand in this test scenario:

describe("no 404", () => {
before(() => {
cy.task(`restoreAllBlockedResources`)
cy.task(`blockAssetsForPage`, {
pagePath: `/404`,
filter: `page-data`,
})
cy.task(`blockAssetsForPage`, {
pagePath: `/404.html`,
filter: `page-data`,
})
cy.task(`blockPageComponent`, {
path: `pages/404.js`,
})
})
after(() => {
cy.task(`restoreAllBlockedResources`)
})
it(`make sure 404 is NOT present`, () => {
cy.visit(`/______not_existing_page`).waitForRouteChange()
cy.findByText("Preview custom 404 page").click()
cy.findByText("A custom 404 page wasn't detected", {
exact: false,
}).should(`exist`)
cy.findByText(
"You just hit a route that does not exist... the sadness.",
{ exact: false }
).should(`not.exist`)
})

which removes the page and immediately tries to visit it so page is still in store ( we didn't receive event from chokidar yet to remove the page), then pages get removed while we do wait for page-data for it - I think this is quite similar case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's not blocking merge because this is experimental - but something to keep in mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah... actually timing out makes a lot of sense — we can check for say 5 seconds and then quit. Otherwise the runtime will just keep looping forever trying to load it which is wasteful at best and could slow things down.

pieh
pieh previously approved these changes Nov 17, 2020
Copy link
Contributor

@pieh pieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to get into master branch

it might have some edge cases due to timing/async, but all of this is behind experimental flag so I think it's good to go!

faster gatsby develop, here we come :)

Copy link
Contributor

@pieh pieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-approve after merging master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: performance Related to runtime & build performance type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants