New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v2] remove static-site-generator-webpack-plugin and render html files ourselves #4912

Merged
merged 1 commit into from Apr 19, 2018

Conversation

4 participants
@pieh
Contributor

pieh commented Apr 10, 2018

this gives us more control over the process

this also initially use better-queue to limit concurrent page renders (right now hardcoded to 20) and decrease memory footprint for larger sites

@wafflebot wafflebot bot assigned pieh Apr 10, 2018

@wafflebot wafflebot bot added the review label Apr 10, 2018

@pieh pieh changed the title from remove static-site-generator-webpack-plugin and render html files ourselves to [v2] remove static-site-generator-webpack-plugin and render html files ourselves Apr 10, 2018

@pieh pieh added this to To Do in Gatsby v2 Release via automation Apr 10, 2018

remove static-site-generator-webpack-plugin and render html files our…
…selves

this gives us more control over the process

Signed-off-by: Michal Piechowiak <misiek.piechowiak@gmail.com>
@KyleAMathews

This comment has been minimized.

Show comment
Hide comment
@KyleAMathews

KyleAMathews Apr 10, 2018

Contributor

Great! Will review in more detail later. Curious what kind of memory reductions you're seeing?

Contributor

KyleAMathews commented Apr 10, 2018

Great! Will review in more detail later. Curious what kind of memory reductions you're seeing?

module.exports = (htmlComponentRenderer, pages) =>
new Promise(resolve => {
const queue = new Queue(

This comment has been minimized.

@jquense

jquense Apr 10, 2018

Collaborator

p-queue may be a good option for promise support?

@jquense

jquense Apr 10, 2018

Collaborator

p-queue may be a good option for promise support?

This comment has been minimized.

@pieh

pieh Apr 10, 2018

Contributor

I will have a look at it tomorrow

@pieh

pieh Apr 10, 2018

Contributor

I will have a look at it tomorrow

This comment has been minimized.

@KyleAMathews

KyleAMathews Apr 10, 2018

Contributor

p-queue is nice but better-queue is better in that it gives us a lot more knobs and buttons to play with e.g. support for merging, filtering, etc. We're also using it a number of other places in core & plugins.

@KyleAMathews

KyleAMathews Apr 10, 2018

Contributor

p-queue is nice but better-queue is better in that it gives us a lot more knobs and buttons to play with e.g. support for merging, filtering, etc. We're also using it a number of other places in core & plugins.

This comment has been minimized.

@KyleAMathews

KyleAMathews Apr 10, 2018

Contributor

better-queue also gathers a lot of stats which would be interesting for analysis https://www.npmjs.com/package/better-queue#queue-statistics

@KyleAMathews

KyleAMathews Apr 10, 2018

Contributor

better-queue also gathers a lot of stats which would be interesting for analysis https://www.npmjs.com/package/better-queue#queue-statistics

This comment has been minimized.

@pieh

pieh Apr 16, 2018

Contributor

Another idea I came across today is to use http://bluebirdjs.com/docs/api/promise.map.html#promise.map . I don't think we actually need any of the knobs apart from concurrency setting (at least right now) and using that would get rid of some of the boilerplate code here and doesn't require to add additional dependency to gatsby

@pieh

pieh Apr 16, 2018

Contributor

Another idea I came across today is to use http://bluebirdjs.com/docs/api/promise.map.html#promise.map . I don't think we actually need any of the knobs apart from concurrency setting (at least right now) and using that would get rid of some of the boilerplate code here and doesn't require to add additional dependency to gatsby

This comment has been minimized.

@KyleAMathews

KyleAMathews Apr 16, 2018

Contributor

I'd rather have the optionality to tweak more things + I want to add an optional build stats report and the build-html queue stats would be really valuable thing to add.

@KyleAMathews

KyleAMathews Apr 16, 2018

Contributor

I'd rather have the optionality to tweak more things + I want to add an optional build stats report and the build-html queue stats would be really valuable thing to add.

This comment has been minimized.

@KyleAMathews

KyleAMathews Apr 17, 2018

Contributor

Oh also we're already using better-queue in core so this isn't a new dependency.

@KyleAMathews

KyleAMathews Apr 17, 2018

Contributor

Oh also we're already using better-queue in core so this isn't a new dependency.

This comment has been minimized.

@pieh

pieh Apr 17, 2018

Contributor

I meant p-queue as potential new dependency. But I guess this can just stay as is then. Unless you would like me to add html build stats in this PR?

@pieh

pieh Apr 17, 2018

Contributor

I meant p-queue as potential new dependency. But I guess this can just stay as is then. Unless you would like me to add html build stats in this PR?

This comment has been minimized.

@KyleAMathews

KyleAMathews Apr 17, 2018

Contributor

Oh right, sorry.

No the stats can come in a future PR — let's get this in so our first beta will be in great shape. We need to think some first about the design of the stats gathering.

@KyleAMathews

KyleAMathews Apr 17, 2018

Contributor

Oh right, sorry.

No the stats can come in a future PR — let's get this in so our first beta will be in great shape. We need to think some first about the design of the stats gathering.

// This function will fail on Windows with no further consequences.
}
return resolve(null, stats)
return renderHTML(require(outputFile), pages).then(() => {

This comment has been minimized.

@jquense

jquense Apr 10, 2018

Collaborator

will just requiring this always work? not sure how the various async imports or jsonp stuff will be handled...I guess this is what the plugin used to do tho

@jquense

jquense Apr 10, 2018

Collaborator

will just requiring this always work? not sure how the various async imports or jsonp stuff will be handled...I guess this is what the plugin used to do tho

This comment has been minimized.

@pieh

pieh Apr 10, 2018

Contributor

this should work, as gatsby is in control of this - outputFile is compiled .cache/static-entry.js which sync requires all the templates (and for now layouts) - so any dynamic imports that user may be using are below that and would be handled same as they are currently

@pieh

pieh Apr 10, 2018

Contributor

this should work, as gatsby is in control of this - outputFile is compiled .cache/static-entry.js which sync requires all the templates (and for now layouts) - so any dynamic imports that user may be using are below that and would be handled same as they are currently

@pieh

This comment has been minimized.

Show comment
Hide comment
@pieh

pieh Apr 10, 2018

Contributor

I will run tests to get exact numbers tomorrow - I have some old charts (comparisons I did for json-loader) but they don't really will show effect of this PR alone

Contributor

pieh commented Apr 10, 2018

I will run tests to get exact numbers tomorrow - I have some old charts (comparisons I did for json-loader) but they don't really will show effect of this PR alone

@KyleAMathews KyleAMathews moved this from To Do to In progress in Gatsby v2 Release Apr 11, 2018

@m-allanson m-allanson removed the review label Apr 13, 2018

@pieh

This comment has been minimized.

Show comment
Hide comment
@pieh

pieh Apr 15, 2018

Contributor

Performance comparison from building locally current v2 vs this branch using freecodecamp guide as benchmark site (2780 pages) - this is only for "Building static HTML for pages" step!

Note: With current v2 we need to run with increased memory limit using --max_old_space_size=7168 option (otherwise we run out of memory, value was just reused from freecodecamp guide setting) - for this to be fair comparison I've run this branch both with this flag set and without it:

Memory usage (peak) [MB]:

Run v2 --max_old_space_size this PR --max_old_space_size this PR
1 3244.9 1766.7 954.1
2 2992.5 1791.9 994.2
3 3387.2 1758.2 972.4
4 3168 1754.2 1011.7
5 3248 1749.8 988.1
Avg 3208.12 1764.16 984.1

This was actually suprising to me how "lazy" node / V8 is with releasing memory, but I guess garbage collection can be expensive.

Time spent [s]:

Run v2 --max_old_space_size this PR --max_old_space_size this PR
1 81.785 73.569 64.982
2 83.131 70.541 63.593
3 88.496 72.855 58.889
4 79.593 73.044 64.676
5 94.74 72.869 67.704
Avg 85.549 72.5756 63.9688

Now this is suprise for me that there is time improvement when not using --max_old_space_size flag. I wanted to post results yesterday, but found similiar times to be suspicious (when I ran tests yesterday I did a things a little different - first run v2 5 times, then this PR with flag 5 times and then this PR without flag 5 times and thought some system process in background maybe influenced results so much. So today I've run v2 -> this PR with flag -> this PR without flag and looped that to somewhat ensure that all builds been tested in similiar conditions.

Time improvment from v2 to this PR I think could be explained with v2 doing promise.all on all paths and almost 2800 workers ran in parallel being starved on resources slowing things down plus maybe some overhead from using webpack for handling output.

Just in case dump all time/memory metrics is in this spreadsheet https://docs.google.com/spreadsheets/d/1TyUc_J-0v2Te0XIcEM-V79Ut-_e1Wxky2rrVLPVY9FM/edit?usp=sharing - first sheet being summary same as in this comment and other sheet with detailed data

Contributor

pieh commented Apr 15, 2018

Performance comparison from building locally current v2 vs this branch using freecodecamp guide as benchmark site (2780 pages) - this is only for "Building static HTML for pages" step!

Note: With current v2 we need to run with increased memory limit using --max_old_space_size=7168 option (otherwise we run out of memory, value was just reused from freecodecamp guide setting) - for this to be fair comparison I've run this branch both with this flag set and without it:

Memory usage (peak) [MB]:

Run v2 --max_old_space_size this PR --max_old_space_size this PR
1 3244.9 1766.7 954.1
2 2992.5 1791.9 994.2
3 3387.2 1758.2 972.4
4 3168 1754.2 1011.7
5 3248 1749.8 988.1
Avg 3208.12 1764.16 984.1

This was actually suprising to me how "lazy" node / V8 is with releasing memory, but I guess garbage collection can be expensive.

Time spent [s]:

Run v2 --max_old_space_size this PR --max_old_space_size this PR
1 81.785 73.569 64.982
2 83.131 70.541 63.593
3 88.496 72.855 58.889
4 79.593 73.044 64.676
5 94.74 72.869 67.704
Avg 85.549 72.5756 63.9688

Now this is suprise for me that there is time improvement when not using --max_old_space_size flag. I wanted to post results yesterday, but found similiar times to be suspicious (when I ran tests yesterday I did a things a little different - first run v2 5 times, then this PR with flag 5 times and then this PR without flag 5 times and thought some system process in background maybe influenced results so much. So today I've run v2 -> this PR with flag -> this PR without flag and looped that to somewhat ensure that all builds been tested in similiar conditions.

Time improvment from v2 to this PR I think could be explained with v2 doing promise.all on all paths and almost 2800 workers ran in parallel being starved on resources slowing things down plus maybe some overhead from using webpack for handling output.

Just in case dump all time/memory metrics is in this spreadsheet https://docs.google.com/spreadsheets/d/1TyUc_J-0v2Te0XIcEM-V79Ut-_e1Wxky2rrVLPVY9FM/edit?usp=sharing - first sheet being summary same as in this comment and other sheet with detailed data

@pieh

This comment has been minimized.

Show comment
Hide comment
@pieh

pieh Apr 19, 2018

Contributor

Does this need any further work to be merged?

I wait for this to be merged before updating and resolving conflicts in #4715 (because this would add other conflict to be resolved anyway), so if there is any work needed here I would like to do this.

Contributor

pieh commented Apr 19, 2018

Does this need any further work to be merged?

I wait for this to be merged before updating and resolving conflicts in #4715 (because this would add other conflict to be resolved anyway), so if there is any work needed here I would like to do this.

@KyleAMathews

This comment has been minimized.

Show comment
Hide comment
@KyleAMathews

KyleAMathews Apr 19, 2018

Contributor

Looks good to me! Let's get it in! Thanks for your great work on this! Huge improvement 🎉

Contributor

KyleAMathews commented Apr 19, 2018

Looks good to me! Let's get it in! Thanks for your great work on this! Huge improvement 🎉

@KyleAMathews KyleAMathews merged commit 1266c93 into gatsbyjs:v2 Apr 19, 2018

1 of 4 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build failed
Details
deploy/netlify Deploy preview failed.
Details
DCO All commits have a DCO sign-off from the author

Gatsby v2 Release automation moved this from In progress to Done Apr 19, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment