Improve build time of navigation panel #956

Merged
merged 5 commits into from
Sep 12, 2022

Conversation

pdmosses

Fix #863.

The current Liquid code to generate the navigation panel involves the inefficient extraction of a list of pages from a list of page groups (identified by @captn3m0 in his original explanation of the performance issue).

The optimisation implemented by this PR generates navigation links directly from the list of page groups, thereby avoiding the extraction of a list of pages from it. The Liquid code is now a bit tedious, but I don't see a simpler solution.

The need for grouping pages arises because Jekyll doesn't provide a filter to sort a list of pages on the value of an arbitrary expression.
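
As a rough analogy only (this is Python, not the Liquid in this PR), the sketch below shows what sorting "on the value of an arbitrary expression" looks like where it is supported, next to the grouping that stands in for it. The page dictionaries and the `nav_key` helper are hypothetical, and placing numbers before strings is an assumption made for illustration.

```python
from itertools import groupby

# Hypothetical pages; "nav_order" is optional and may be a number or a string.
pages = [
    {"title": "Zebra", "nav_order": 2},
    {"title": "Apple"},
    {"title": "Mango", "nav_order": "b"},
    {"title": "Intro", "nav_order": 1},
]

def nav_key(page):
    """Sort key: nav_order if present, otherwise the title as the default."""
    value = page.get("nav_order", page["title"])
    # Assumption for this sketch: numeric values sort before string values.
    if isinstance(value, (int, float)):
        return (0, value, "")
    return (1, 0, str(value))

# Sorting on a computed expression is a single call here ...
sorted_pages = sorted(pages, key=nav_key)

# ... whereas the grouping workaround buckets pages by the computed value.
groups = [(key, list(items)) for key, items in groupby(sorted_pages, key=nav_key)]

titles = [page["title"] for page in sorted_pages]
```

With these sample pages every group holds a single page, which matches the "single-value groups" case that the profile was measured on.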

Using Jekyll v4.2.2 (macOS 12.5, M2 MacBook Air, 16 GB memory), building https://github.com/endoflife-date/endoflife.date using https://github.com/pdmosses/just-the-docs/blob/fix-nav-performance/_includes/nav.html produced the following profile extract:

| Filename                                      | Count |    Bytes |  Time |
+-----------------------------------------------+-------+----------+-------+
| just-the-docs-0.4.0.rc1/_layouts/default.html |   130 | 3792.04K | 5.160 |
| _includes/nav.html                            |   130 | 1405.20K | 4.054 |
| just-the-docs-0.4.0.rc1/_includes/head.html   |   130 |  617.82K | 0.495 |
| _layouts/product.html                         |   127 | 1014.38K | 0.413 |
| _includes/head_custom.html                    |   130 |  427.83K | 0.393 |
| assets/js/zzzz-search-data.json               |     1 |  149.31K | 0.050 |

@nathancarter has tried adding the new nav.html to a site with over 300 pages, and reported that it reduced the build time from more than 3 minutes to about 30 seconds.

Further optimisation of navigation might be possible (e.g., using Jekyll include caching), but the current optimisation should be sufficient for v0.4.0.

To check that this PR does not visibly change the navigation panel generated by v0.3.3:

  1. Clone https://github.com/just-the-docs/just-the-docs-tests.
  2. Update _config.yml and Gemfile to use this PR branch.
  3. Run bundle update.
  4. Inspect the rendering of the entire collection of navigation tests.

(Many of the differences reported in the GitHub visualisation of the changes are due to shifting much of the code 2 spaces to the left, in connection with moving the first ul element to be close to its first item.)

The possibility of using the `title` as a default value for `nav_order` requires page grouping
(because of the lack of support in Jekyll-Liquid for sorting by the value of an expression).
It appears that extracting a sorted list of pages from a sorted list of groups is quadratic
in the number of groups: it can take 35 seconds for 130 single-value groups.

This update to nav.html circumvents this source of inefficiency by generating the navigation
directly from the sorted page groups.

Building the same site with 130 pages now takes only about 6 seconds with Jekyll v4.2.2.
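
The commit message does not say why the extraction is quadratic. One plausible reading, assumed here and not confirmed by the PR, is that the page list is rebuilt by concatenating one group at a time, so every step copies everything collected so far. The Python sketch below counts those copies for 130 single-page groups and compares them with walking the groups directly; the function names are hypothetical.

```python
def flatten_by_concat(groups):
    """Rebuild the page list by repeated concatenation; returns (pages, copies)."""
    pages, copies = [], 0
    for group in groups:
        pages = pages + group   # allocates a new list on every iteration
        copies += len(pages)    # number of elements copied into that new list
    return pages, copies

def visit_directly(groups):
    """Walk the groups in place; returns (pages seen, visits)."""
    seen, visits = [], 0
    for group in groups:
        for page in group:
            seen.append(page)
            visits += 1
    return seen, visits

groups = [[f"page-{i}"] for i in range(130)]  # 130 single-page groups
flat, copies = flatten_by_concat(groups)
seen, visits = visit_directly(groups)
```

Both functions yield the same pages in the same order; only the amount of copying differs, growing with the square of the number of groups in the first case and linearly in the second.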

In the previous commit, distinguishing between numbers and strings didn't work.
This commit has been partially tested, locally.
@pdmosses changed the title from "Fix nav performance" to "Improve build time of navigation panel" on Sep 10, 2022

@mattxwang mattxwang left a comment

Overall I think this is good to merge. I've left a nit and a note for later, but I think neither should block this being merged.

I am slightly wary of this possibly introducing breaking changes to existing sites, though I appreciate the thoroughness with the tests repo. I think my plan is:

  1. merge this
  2. cut a new rc (not v0.4.0)
  3. let @captn3m0 and @nathancarter test our RC to confirm that it still resolves their problem (somewhat trivial)
  4. field any new bug reports we receive from the new nav
  5. publish a v0.4.0 by the end of this month 😊

mattxwang and others added 3 commits September 10, 2022 17:00
Improve alignment.

Co-authored-by: Matt Wang <matt@matthewwang.me>
@pdmosses

I am slightly wary of this possibly introducing breaking changes to existing sites, though I appreciate the thoroughness with the tests repo.

The current navigation tests check only for regression regarding previous bug fixes, so I don't think they should be regarded as "thorough".

Some of the changes made in v0.4.0.rc1 are bug fixes, which are deliberately not backwards compatible with v0.3.3.

However, this PR is intended to generate exactly the same nav panels as v0.4.0.rc1. After locally updating the just-the-docs-tests config to avoid the html compression, and copying the built sites for the two versions, I get the following from diff:

just-the-docs-tests: diff _site-0.4.0.rc1-3.8.7/index.html _site-fix-nav-performance-3.8.7/index.html          
28a29
>   <a class="skip-to-main" href="#main-content">Skip to main content</a>
178c179
<   <br>Version: v0.4.0.rc1
---
>   <br>Version: fix-nav-performance
just-the-docs-tests: 

That confirms that the nav links on the home page of just-the-docs-tests are indeed the same for v0.4.0.rc1 and my PR branch (modulo the inclusion of the "Skip to main content" link due to a PR already merged into main).

I think it's safe to conclude that the nav build optimisation in this PR doesn't change the hierarchy displayed in the nav panels. The nav panels on all the other pages are (hopefully) the same as on the home page, apart from unfolding and font details.

But also the breadcrumbs and the auto-TOCs might differ. A rigorous check would diff each pair of html files, which would require some scripting.
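
A minimal sketch of what such a script could look like, in Python with only the standard library. The `IGNORE` patterns and the `diff_sites` helper are hypothetical; the patterns are guesses based on the two lines that differ in the diff above, and would need adjusting to the real output.

```python
import difflib
import re
from pathlib import Path

# Hypothetical patterns for lines that are expected to differ between builds.
IGNORE = [re.compile(r"skip-to-main"), re.compile(r"<br>Version:")]

def significant_lines(path, ignore=IGNORE):
    """Return the lines of a file, dropping those that match an ignore pattern."""
    text = Path(path).read_text(encoding="utf-8")
    return [ln for ln in text.splitlines() if not any(p.search(ln) for p in ignore)]

def diff_sites(old_dir, new_dir, ignore=IGNORE):
    """Map each differing HTML file (relative path) to its unified diff lines."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    old = {p.relative_to(old_dir) for p in old_dir.rglob("*.html")}
    new = {p.relative_to(new_dir) for p in new_dir.rglob("*.html")}
    report = {}
    for rel in sorted(old | new):
        name = rel.as_posix()
        if rel not in old or rel not in new:
            report[name] = ["file present in only one site"]
            continue
        a = significant_lines(old_dir / rel, ignore)
        b = significant_lines(new_dir / rel, ignore)
        delta = list(difflib.unified_diff(a, b, "old/" + name, "new/" + name, lineterm=""))
        if delta:
            report[name] = delta
    return report
```

Calling `diff_sites("_site-0.4.0.rc1-3.8.7", "_site-fix-nav-performance-3.8.7")` on the two copied sites would return an empty dictionary if every pair of pages matches once the ignored lines are dropped.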

@mattxwang

Great, thanks @pdmosses - that explanation helps clarify things for me. Time to merge + release rc2!

@mattxwang mattxwang merged commit 457dce3 into just-the-docs:main Sep 12, 2022
@mattxwang

Changelog:

@captn3m0

captn3m0 commented Sep 12, 2022

Summary:

Summary of build times with different versions:

| Build           | Render Time (s) |
+-----------------+-----------------+
| 0.4.0.rc2       |           12.34 |
| 0.3.3           |            70.5 |
| custom nav.html |             6.9 |

0.4.0.rc2:

Build Process Summary:

| PHASE      |    TIME |
+------------+---------+
| RESET      |  0.0002 |
| READ       |  0.4496 |
| GENERATE   |  0.0008 |
| RENDER     | 12.3458 |
| CLEANUP    |  0.0615 |
| WRITE      |  0.3217 |
+------------+---------+
| TOTAL TIME | 13.1796 |

Site Render Stats:
| Filename                                                                                       | Count |     Bytes |   Time |
+------------------------------------------------------------------------------------------------+-------+-----------+--------+
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.4.0.rc2/_layouts/default.html                    |   138 |  4152.40K | 10.367 |
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.4.0.rc2/_includes/nav.html                       |   138 |  1587.77K |  7.358 |
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.4.0.rc2/_includes/head.html                      |   138 |   682.85K |  1.925 |
| _layouts/product.html                                                                          |   135 |  1071.85K |  0.847 |
| _includes/head_custom.html                                                                     |   138 |   481.11K |  0.715 |
+------------------------------------------------------------------------------------------------+-------+-----------+--------+
| TOTAL (for 34 files)                                                                           |  3461 | 13694.74K | 21.774 |

Custom nav.html

We are using this currently (which doesn't do a custom sort, only natural_sort):

Build Process Summary:

| PHASE      |   TIME |
+------------+--------+
| RESET      | 0.0002 |
| READ       | 0.3198 |
| GENERATE   | 0.0006 |
| RENDER     | 6.9113 |
| CLEANUP    | 0.0310 |
| WRITE      | 0.2163 |
+------------+--------+
| TOTAL TIME | 7.4792 |


Site Render Stats:

| Filename                                                                                       | Count |     Bytes |   Time |
+------------------------------------------------------------------------------------------------+-------+-----------+--------+
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.4.0.rc2/_layouts/default.html                    |   138 |  5718.78K |  5.211 |
| _includes/nav.html                                                                             |   138 |  3154.15K |  2.367 |
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.4.0.rc2/_includes/head.html                      |   138 |   682.85K |  1.776 |
| _layouts/product.html                                                                          |   135 |  1071.85K |  0.808 |
+------------------------------------------------------------------------------------------------+-------+-----------+--------+
| TOTAL (for 34 files)                                                                           |  3461 | 18393.89K | 11.419 |

0.3.3

| PHASE      |    TIME |
+------------+---------+
| RESET      |  0.0002 |
| READ       |  0.3153 |
| GENERATE   |  0.0006 |
| RENDER     | 70.5798 |
| CLEANUP    |  0.0349 |
| WRITE      |  0.2117 |
+------------+---------+
| TOTAL TIME | 71.1425 |

| Filename                                                                                  | Count |     Bytes |    Time |
+-------------------------------------------------------------------------------------------+-------+-----------+---------+
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.3.3/_layouts/default.html                   |   138 |  4464.92K |  68.496 |
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.3.3/_includes/nav.html                      |   138 |  1988.16K |  66.495 |
| vendor/bundle/ruby/3.0.0/gems/just-the-docs-0.3.3/_includes/head.html                     |   138 |   682.04K |   1.862 |
+-------------------------------------------------------------------------------------------+-------+-----------+---------+
| TOTAL (for 29 files)                                                                      |  2906 | 14638.41K | 138.964 |

So switching from our custom nav.html to the latest RC still slows us down by 5-6 seconds. However, that's still better than the baseline of 70ish seconds that we are at on the 0.3.3 release.

I don't see any regressions in the generated navigation HTML. Since our use case doesn't have titles starting with numbers, we're planning to stick with the custom nav.html for a slightly faster build.
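
For reference, the comparison can be recomputed from the RENDER rows of the three Build Process Summary tables above. This is only a small Python sketch of the arithmetic; the variable names are made up for illustration.

```python
# RENDER phase times in seconds, copied from the Build Process Summary tables.
render = {"0.3.3": 70.5798, "0.4.0.rc2": 12.3458, "custom nav.html": 6.9113}

baseline = render["0.3.3"]
# How many times faster each build renders compared with the 0.3.3 baseline.
speedup = {name: baseline / seconds for name, seconds in render.items()}
# Extra render time of the release candidate compared with the custom nav.html.
extra_vs_custom = render["0.4.0.rc2"] - render["custom nav.html"]

for name, factor in speedup.items():
    print(f"{name}: {render[name]:.2f}s render, {factor:.1f}x relative to 0.3.3")
print(f"0.4.0.rc2 renders {extra_vs_custom:.2f}s slower than the custom nav.html")
```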

@pdmosses

@captn3m0 thanks for your follow-up with the summary and the profiling results!

Presumably you're using Jekyll v4.2.2. As there are breaking changes from Jekyll v3 (some obvious, some non-obvious), I hope we'll soon be able to drop support for Jekyll v3. Perhaps that's easier now that GH Actions is the default build on GH Pages?

I wrote in a comment:

BTW, I considered adding a quick test for sites where the entire navigation is uniformly numeric or uniformly string-based, but users would probably get surprised by a dramatic increase in build times when adding a page that made it a bit non-uniform…

I might try doing that (in a fresh PR). I suspect that rather few sites use a mixture of numbers and strings.
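
A sketch of what such a quick uniformity test might check, written in Python as a stand-in for the Liquid it would really need. The helper names are hypothetical, and falling back to the title when `nav_order` is missing follows the default described in this PR.

```python
def sort_value(page):
    """The value a page is ordered by: nav_order, or the title as the default."""
    return page.get("nav_order", page["title"])

def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def is_uniform(pages):
    """True when every sort value is numeric, or every sort value is a string."""
    kinds = {is_number(sort_value(page)) for page in pages}
    return len(kinds) <= 1

numeric_site = [{"title": "A", "nav_order": 1}, {"title": "B", "nav_order": 2}]
string_site = [{"title": "A"}, {"title": "B", "nav_order": "b"}]
# Adding one page without a numeric nav_order makes the numeric site non-uniform.
mixed_site = numeric_site + [{"title": "C"}]
```

The `mixed_site` case is the surprise mentioned in the quoted comment: a single added page flips the result, and with it the site would drop back to the slower general path.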

@pdmosses

I wrote:

That confirms that the nav links on the home page of just-the-docs-tests are indeed the same for v0.4.0.rc1 and my PR branch (modulo the inclusion of the "Skip to main content" link due to a PR already merged into main).

I think it's safe to conclude that the nav build optimisation in this PR doesn't change the hierarchy displayed in the nav panels. The nav panels on all the other pages are (hopefully) the same as on the home page, apart from unfolding and font details.

But also the breadcrumbs and the auto-TOCs might differ. A rigorous check would diff each pair of html files, which would require some scripting.

@mattxwang, in fact I could have used diff -rq with the option to ignore lines that match the "Skip to main content" link. Unfortunately the subsequent merge of main into fix-nav-performance added the aria labels to the expanders, and they make every nav panel line differ in the site generated by the current version of the PR branch.

If it's possible to revert the commit that added the aria labels to the PR branch, that would allow me to regenerate the previous test site (so as to diff it with v0.4.0.rc1). I think it's f158628.

BTW, I was expecting my PR branch to be merged into main, rather than the other way round. Would it have helped if I'd based it on v0.4.0.rc1 instead of on main? Sorry for my ignorance re Git and the merge business!

@mattxwang

Unfortunately the subsequent merge of main into fix-nav-performance added the aria labels to the expanders, and they make every nav panel line differ in the site generated by the current version of the PR branch.

Understood. In the future, would it be possible to rebase to the last commit before this one was merged in, and then compare? I believe the history is linear, so you'd still be able to isolate the difference introduced in one commit.

If it's possible to revert the commit that added the aria labels to the PR branch, that would allow me to regenerate the previous test site (so as to diff it with v0.4.0.rc1). I think it's f158628.

Go for it! Feel free to force-push to this branch, etc. - especially since we've already merged it. Let me know if you need help with that!

BTW, I was expecting my PR branch to be merged into main, rather than the other way round. Would it have helped if I'd based it on v0.4.0.rc1 instead of on main? Sorry for my ignorance re Git and the merge business!

Oh, no - not a mistake on your end! I've enabled a setting for this repo that requires branches to be up-to-date with main before merging them; the intention is that we know it'll be a clean merge / issues with the merge will be caught with CI, etc. Do you think this is too confusing?

@pdmosses

Unfortunately the subsequent merge of main into fix-nav-performance added the aria labels to the expanders, and they make every nav panel line differ in the site generated by the current version of the PR branch.

Understood. In the future, would it be possible to rebase to the last commit before this one was merged in, and then compare? I believe the history is linear, so you'd still be able to isolate the difference introduced in one commit.

I can try – so long as I don't need to resolve conflicts in the process…

If it's possible to revert the commit that added the aria labels to the PR branch, that would allow me to regenerate the previous test site (so as to diff it with v0.4.0.rc1). I think it's f158628.

Go for it! Feel free to force-push to this branch, etc. - especially since we've already merged it. Let me know if you need help with that!

Ah, I've just recalled that jekyll-remote-theme can use tags as well as (pre-)releases! That could be a simpler solution.

BTW, I was expecting my PR branch to be merged into main, rather than the other way round. Would it have helped if I'd based it on v0.4.0.rc1 instead of on main? Sorry for my ignorance re Git and the merge business!

Oh, no - not a mistake on your end! I've enabled a setting for this repo that requires branches to be up-to-date with main before merging them; the intention is that we know it'll be a clean merge / issues with the merge will be caught with CI, etc. Do you think this is too confusing?

Not now that I know about it! 😄

@pdmosses pdmosses deleted the fix-nav-performance branch September 22, 2022 11:13
captn3m0 added a commit to endoflife-date/endoflife.date that referenced this pull request Feb 1, 2023
- See just-the-docs/just-the-docs#956
- This is still 7-8 seconds slower than our fast version, but this
  lets us stay closer to upstream
Development

Successfully merging this pull request may close these issues.

Performance issue with large number of files without collection