Optimizations for large documentation #1887

Closed
2 of 3 tasks
aimeos opened this issue Aug 27, 2020 · 43 comments
Labels
change request (Issue requests a new feature or improvement), resolved (Issue is resolved, yet unreleased if open)

Comments

@aimeos

aimeos commented Aug 27, 2020

I checked that...

  • ... the documentation does not mention anything about my idea
  • ... to my best knowledge, my idea wouldn't break something for other users
  • ... there are no open or closed issues that are related to my idea

Description

We are currently trying to optimize page render times of the generated docs. It largely depends on the content and we have a lot in our documentation, e.g.: https://aimeos.org/docs/2020.x/config/client-html/account-download/

We found out that the biggest problem is the navigation, which contains all navigation items on all pages. Google Lighthouse reports an excessive number of DOM nodes for each page (~1400) and an extremely long largest contentful paint:

image

The first thought was to reduce the number of navigation items, but that isn't easy: we can't merge more pages because most of them are already rather long, and the value/experience for the user would get worse.

We identified two things that could be optimized and have a big effect:

1.) The biggest problem is how Disqus is added
The Material theme uses document.write() to add its HTML code, but we can use the following in partials/integrations/disqus.html instead:

  <script>
    // Disqus reads this global when embed.js runs
    var disqus_config = function() {
      this.page.url = "{{ page.canonical_url }}";
      this.page.identifier = "{{ page.canonical_url | replace(config.site_url, '') }}";
    };

    // Inject the Disqus embed script into the page
    function loadDisqus() {
      var el = document.createElement("script");
      el.src = "https://{{ disqus }}.disqus.com/embed.js";
      el.setAttribute("data-timestamp", +new Date());
      document.body.appendChild(el);
    }

    if ("IntersectionObserver" in window) {
      // Load Disqus only once the comment section scrolls into view
      let observer = new IntersectionObserver(function(entries, observer) {
        for (let entry of entries) {
          if (entry.isIntersecting) {
            observer.unobserve(entry.target);
            loadDisqus();
          }
        }
      }, {
        threshold: 0.01
      });
      observer.observe(document.querySelector("#__comments"));
    } else {
      // Fallback for browsers without IntersectionObserver
      loadDisqus();
    }
  </script>

This postpones loading the Disqus embed script until the user scrolls to the comment section.

2.) Reduce the HTML tags per nested navigation
There are 65 span tags <span class="md-nav__icon md-icon"> on each page, each of which contains an <svg><path/></svg>. If we remove them and add a label.md-nav__link:after { content: url(caret.svg); } in CSS, we save ~200 nodes at once. Also, the file size shrinks because the icon is only defined once.
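As a rough check of the "~200 nodes" figure, the arithmetic can be sketched as follows. It assumes each icon contributes exactly three elements (the span, the svg, and the path) and nothing else, which is an assumption read off the markup quoted above.

```javascript
// Rough estimate of DOM elements removed by moving the nav icons to CSS.
// Assumption: each icon is exactly <span><svg><path/></svg></span>.
const iconCount = 65;        // icon spans per page, as reported above
const elementsPerIcon = 3;   // span + svg + path
const elementsSaved = iconCount * elementsPerIcon;
console.log(elementsSaved);  // 195
```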

Afterwards we get:
image

Use Cases

We have only somewhat more than 300 pages, which I consider medium-sized documentation. Nevertheless, the number of DOM nodes is currently more than 5x the number of navigation items, so the effect grows as more pages are added.

@squidfunk
Owner

Thanks for digging into this.

We found out that the biggest problem is the navigation, which contains all navigation items on all pages. Google Lighthouse reports an excessive number of DOM nodes for each page (~1400) and an extremely long largest contentful paint:

That's necessary for navigation on mobile devices. Navigation is only included once and used inside the drawer on mobile and the sidebar on desktop. It's not possible to drop any items here, even when using tabs, as we would lose mobile navigation. If you don't need mobile navigation, you might remove it using overrides as documented in the customization guide.

The biggest problem is how Disqus is added
The Material theme uses document.write() to add its HTML code, but we can use the following in partials/integrations/disqus.html instead.

I guess we could use IntersectionObserver, as support seems to be solid, but yes, we need to provide the fallback. Note, however, that your solution might leak memory (need to check) when used together with instant loading, as the observers are not unregistered upon page unload.
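The leak concern above could be addressed by disconnecting the observer explicitly. The following is a minimal sketch, not the theme's implementation: `lazyLoadOnce` and its injectable `ObserverCtor` parameter are hypothetical names, and the caller is assumed to invoke the returned cleanup function whenever the page content is replaced.

```javascript
// Sketch: run `load` once when `target` becomes visible, and return a
// cleanup function that releases the observer. `lazyLoadOnce` and the
// injectable `ObserverCtor` parameter are hypothetical, for illustration.
function lazyLoadOnce(target, load, ObserverCtor = globalThis.IntersectionObserver) {
  // Fallback: without IntersectionObserver, load immediately
  if (typeof ObserverCtor !== "function") {
    load();
    return () => {};
  }
  let done = false;
  const observer = new ObserverCtor((entries) => {
    if (!done && entries.some((entry) => entry.isIntersecting)) {
      done = true;
      observer.disconnect(); // stop observing once loading has started
      load();
    }
  }, { threshold: 0.01 });
  observer.observe(target);
  // Call this when the page is replaced, so the observer does not outlive it
  return () => observer.disconnect();
}
```

In a browser, this might be used as `const cleanup = lazyLoadOnce(document.querySelector("#__comments"), loadDisqus)`, where `loadDisqus` is whatever function injects the embed script.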

Reduce the HTML tags per nested navigation
There are 65 span tags <span class="md-nav__icon md-icon"> on each page, each of which contains an <svg><path/></svg>. If we remove them and add a label.md-nav__link:after { content: url(caret.svg); } in CSS, we save ~200 nodes at once. Also, the file size shrinks because the icon is only defined once.

We can think about moving this to CSS, yes.

@squidfunk
Owner

The latest commit moves the SVG icon definitions which are part of the navigation to CSS, as suggested. The number of nodes should now be greatly reduced for large documentation projects. I (currently) don't see the Disqus optimizations as being necessary, so we'll leave it as it is. Also, overriding Disqus is simple.

Reopening until released.

@squidfunk squidfunk reopened this Aug 28, 2020
@squidfunk squidfunk added the change request and resolved labels Aug 28, 2020
@squidfunk
Owner

Okay, one little change to the Disqus integration – it's now wrapped with an event listener to be executed on load, thus it shouldn't block the first paint anymore. Added in 96f58ed.

@squidfunk
Owner

Released as part of 5.5.10. Thanks for your effort digging into this topic!

@aimeos
Author

aimeos commented Aug 29, 2020

Okay, one little change to the Disqus integration – it's now wrapped with an event listener to be executed on load, thus it shouldn't block the first paint anymore. Added in 96f58ed.

Unfortunately, that's not the case. The results are still the same, with an extremely long largest contentful paint:
image

The problem is only solved if we use the intersection observer to load Disqus when the user scrolls down. The intersection observer also has the advantage that the Disqus threads wouldn't be loaded most of the time.

There's one further optimization for the future:
Because MkDocs includes the whole navigation tree in each page, a lot of HTML code is included. You've already reduced the number of DOM elements, but it also makes a difference how many characters each node contains. Thus, reducing the length of the class names and avoiding additional attributes where possible matters more the bigger the documentation gets:

image

@squidfunk
Owner

squidfunk commented Aug 29, 2020

Unfortunately, that's not the case. The results are still the same, with an extremely long largest contentful paint

Interesting, as my testing showed improvements, but Lighthouse scores will always depend on the actual, final HTML being used. Note that the IntersectionObserver approach only keeps Lighthouse from noticing the Disqus integration. The cost is the same if the page is loaded scrolled to the bottom, which renders the comments straight away. If you feel that the integration doesn't work for you, overriding it is plain and simple, as you've already proposed.

Because MkDocs includes the whole navigation tree in each page, a lot of HTML code is included. You've already reduced the number of DOM elements, but it also makes a difference how many characters each node contains. Thus, reducing the length of the class names and avoiding additional attributes where possible matters more the bigger the documentation gets

This isn't much of a problem when content is served with gzip anyway, which is an established best practice. It will make no difference in the compressed source delivered to the browser. The browser has to do a little more parsing and will need a little more time consuming the token stream, but that difference is negligible. I've actually written a CSS parser myself, and tokenization, which is the CSS parsing stage this would affect, is always a matter of a few milliseconds.

Yes, we could compress or shorten class names, but that would either come at the cost of worse maintainability if done directly in the SCSS code that is compiled to CSS, or result in unstable class names when done as a post-processing step. This would mean that it would be much harder for authors to override certain parts, as class names are less readable or might change in unexpected ways, increasing cost of maintenance downstream. Material for MkDocs strives to be a hackable and extendable theme and clear and concise CSS naming is an essential part of it.

@wilhelmer
Contributor

We found out that the biggest problem is the navigation, which contains all navigation items on all pages. Google Lighthouse reports an excessive number of DOM nodes for each page (~1400) and an extremely long largest contentful paint:

That's necessary for navigation on mobile devices. Navigation is only included once and used inside the drawer on mobile and the sidebar on desktop. It's not possible to drop any items here, even when using tabs, as we would lose mobile navigation. If you don't need mobile navigation, you might remove it using overrides as documented in the customization guide.

@squidfunk I still think it would be nice to be able to exclude all nav items from other tabs when navigation.tabs is enabled. At least make that an option. As for mobile navigation, why not display the tab level, but if you click on a tab, load the tab index page, closing the drawer? I could live with that.

I think navigation.tabs is mostly used for larger documentation projects, and in those projects (like ours), all the excess nav items really become a PITA. We have a lot of pages where the nav items make up 90% (!!) of the file size. It just doesn't scale well.

If you think that's an edge case, some help on how to get this done via customization would be much appreciated.

@squidfunk
Owner

squidfunk commented Aug 23, 2021

If you think that's an edge case, some help on how to get this done via customization would be much appreciated.

I'd really consider this as an edge case, as most documentation projects are not that large. However, I understand that the file size is causing you trouble. I would have thought that gzip compression should be quite efficient, as there are a lot of shared strings/prefixes, but I may be wrong.

So in general, one could replace the non-active tab in the drawer with a link to that tab page, as you said, and only render the navigation hierarchy for the active tab. This would need to be done in nav-item.html. If you take a look at the code, you realize why I'm not keen on making this more complex than it already is.

I'm pretty sure that it could be done, but some research is definitely necessary. I'm unsure whether this could be an option that goes into master or will remain a customization. IMHO, this can only be decided after implementing it. If you need help, ping me at martin.donath@squidfunk.com.

@flanciskinho
Contributor

I think that this is an edge case.

Have you tried techniques on your web server like using compression when sending documents, adding an expiration time for the content, or enabling the cache?

@aimeos
Author

aimeos commented Aug 23, 2021

The main problem is the huge number of HTML nodes from the full navigation menu, which can slow down rendering on smartphones. It takes several seconds until the page becomes usable.

@wilhelmer
Contributor

IMHO, it's just 90% unnecessary data that shouldn't be there, regardless of whether any web server techniques can work around it. We also deliver the documentation as a download for local use, so every byte matters.

@squidfunk
Owner

The main problem is the huge number of HTML nodes from the full navigation menu, which can slow down rendering on smartphones. It takes several seconds until the page becomes usable.

I'd be curious to see a reproducible case with some benchmarks (before + after trimming down the navigation structure). This would yield the necessary baseline on how much improvement could be expected, and if it's worth going down that path.

@wilhelmer
Contributor

wilhelmer commented Aug 23, 2021

The more I think about it, the more I like the idea of removing the nav items from other tabs. Right now (as you pointed out), with navigation.tabs enabled, the template switches to "tabbed mode" on desktop, but the mobile view remains the same, which is inconsistent. Users who only visit the site on mobile won't notice any difference, despite the setting.

If we had a real "tabbed mode" in both desktop and mobile – maybe with more visual cues in the mobile version, not just with replaced links – that would bring a consistent experience for all users AND dramatically reduce the file/DOM size on large doc projects.

@squidfunk
Owner

squidfunk commented Aug 23, 2021

but the mobile view remains the same, which is inconsistent.

I don't agree with this point. While it might be possible to show the tabs on all screen sizes (by making them scrollable, which would make it harder to discover navigation), you're effectively breaking navigation into multiple UI elements on mobile. My personal experience is that users have learned that navigation on mobile hides entirely behind the hamburger menu. Breaking navigation into multiple components might in fact be less intuitive. This is the main reason navigation is entirely contained in the drawer on mobile.

If we had a real "tabbed mode" in both desktop and mobile – maybe with more visual cues in the mobile version, not just with replaced links – that would bring a consistent experience for all users AND dramatically reduce the file/DOM size on large doc projects.

Yes, larger docs projects might benefit from this. However, this would mean a significant change in behavior + a serious refactoring for which I currently can't afford the time, given the current funding situation. I have too many other things on my plate, so if somebody really really wants this, there are two options:

  1. A user-provided PR that is completely backward compatible with all existing features (and Insiders features) and that doesn't introduce too much additional complexity, because, in the end, it is me who has to maintain it. As a starter, this discussion contains a checklist of what has to be considered when implementing features for Material for MkDocs.
  2. I implement this as part of a freelance gig, so I can set aside enough time to pull this off

I'm hoping for your understanding. This project is very, very complex already. When adding more features, especially regarding navigation, many things have to be considered. Nonetheless, a benchmark, which clearly shows the upside of this refactoring (given gzip compression) is a mandatory requirement that has yet to be provided.

@wilhelmer
Contributor

wilhelmer commented Aug 24, 2021

I don't think we can argue whether it's inconsistent (it is, unless you rename the option to navigation.desktop.tabs), only if the behavior is inconsequential. But these are just words, I get your point. And I think you're right, in most cases, users will benefit more from the non-breaking mobile navigation than they do from the reduced file/DOM size.

Anyway, I came up with a fairly simple solution that only requires customizing nav-item.html. I'd be happy to hear your thoughts. This is not intended as a general solution, but as a personal customization. (Also, thanks for your mail.)

Step 1: Nav items should only be generated for the active main navigation item and its nested item list

We change this:

  <!-- Main navigation item with nested items -->
  {% if nav_item.children %}

To this:

  <!-- Main navigation item with nested items -->
  {% if nav_item.children and (nav_item.active or level > 1 ) %}

That's it! Does all the magic. (I hope.)

EDIT: If this should only be applied when navigation.tabs is enabled, use:

  <!-- Main navigation item with nested items -->
  {% if nav_item.children and ("navigation.tabs" not in features or (nav_item.active or level > 1 )) %}
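To make the boolean logic of the condition above easier to follow, here is a restatement in JavaScript, for illustration only. The function name `rendersNestedItems` and its field names are hypothetical; `tabsEnabled` stands in for `"navigation.tabs" in features`, and top-level items are assumed to have `level` 1.

```javascript
// JavaScript restatement of the Jinja condition above, for illustration only.
// Assumption: top-level navigation items have level 1.
function rendersNestedItems({ hasChildren, active, level, tabsEnabled }) {
  return hasChildren && (!tabsEnabled || active || level > 1);
}

// Top-level section in an inactive tab: nested items are skipped
console.log(rendersNestedItems({ hasChildren: true, active: false, level: 1, tabsEnabled: true }));  // false
// Top-level section in the active tab: nested items are rendered
console.log(rendersNestedItems({ hasChildren: true, active: true, level: 1, tabsEnabled: true }));   // true
// Without navigation.tabs, every section with children is rendered as before
console.log(rendersNestedItems({ hasChildren: true, active: false, level: 1, tabsEnabled: false })); // true
```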

Step 2: The main level of the mobile view should behave like tabs

To achieve this, we can borrow some code from tabs-item.html. We change this:

  <!-- Main navigation item -->
  {% else %}
    <li class="{{ class }}">
      <a href="{{ nav_item.url | url }}" class="md-nav__link">
        {{ nav_item.title }}
      </a>
    </li>
  {% endif %}

To this:

  <!-- Main navigation item with nested items -->
  {% elif nav_item.children %}
    {% set title = title | d(nav_item.title) %}
    {% set nav_item = nav_item.children | first %}

    <!-- Recurse, if the first item has further nested items -->
    {% if nav_item.children %}
      {% include "partials/nav-item.html" %}

    <!-- Render item -->
    {% else %}
      <li class="{{ class }}">
        <a href="{{ nav_item.url | url }}" class="md-nav__link">
          {{ title }}
        </a>
      </li>
    {% endif %}

  <!-- Main navigation item -->
  {% else %}
    <li class="{{ class }}">
      <a href="{{ nav_item.url | url }}" class="md-nav__link">
        {{ nav_item.title }}
      </a>
    </li>
  {% endif %}

EDIT: If this should only be applied when navigation.tabs is enabled, change the first lines to:

  <!-- Main navigation item with nested items -->
  {% elif nav_item.children and "navigation.tabs" in features %}

@squidfunk
Owner

Thanks for providing your solution. I'm still waiting for somebody to provide some real-world data that proves that dropping the navigation items from the DOM really benefits loading times on slower connections and/or devices.

@wilhelmer
Contributor

File size difference in our project (yes, I know, doesn't say much about the loading time, but maybe still helpful)

Before:
788 HTML files, 151 MB (159,291,985 bytes)

After:
788 HTML files, 62.8 MB (65,954,416 bytes)

-> ~41.6 % of original size
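The ratio can be recomputed from the byte counts reported above, as in the sketch below. The byte counts give roughly 41.4 %, while the rounded megabyte figures (62.8 / 151) give the ~41.6 % quoted above; either way, a little under 60 % of the output was navigation markup that could be dropped.

```javascript
// Size of the output after vs. before, using the byte counts reported above.
const bytesBefore = 159291985;
const bytesAfter = 65954416;
const remaining = bytesAfter / bytesBefore; // fraction of the original size
const saved = 1 - remaining;                // fraction removed
console.log(`${(remaining * 100).toFixed(1)}% of original size`);
console.log(`${(saved * 100).toFixed(1)}% saved`);
```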

@squidfunk
Owner

Optimizations for large documentation projects are now on the Insiders roadmap, called "navigation pruning":
https://squidfunk.github.io/mkdocs-material/insiders/#12000-piri-piri

@malohr

malohr commented Dec 22, 2021

@squidfunk

I think this is not an edge case and makes mkdocs unusable for bigger projects. We've got over 4k md files and now have an 8GB build, which is definitely anything but great. The build times are extremely long (approx. 15-20 min), and if I enable the minify plugin, it can go up to an hour. I guess we really need some optimization, otherwise it won't work properly for bigger projects. Still, thanks for all the effort you put in there :) I really hope this gets fixed.

@wilhelmer
Contributor

@malohr Did you try my solution?

@squidfunk
Owner

@malohr – I'm actively working on bringing down the size of the final output and think I'll have a working prototype ready soon. Regarding build times – I think this should be raised over at the MkDocs repository.

@malohr

malohr commented Dec 22, 2021

@wilhelmer Haven’t tried it but will definitely give this a shot until there is an official release that will fix it, thank you :)

@squidfunk Awesome, really appreciate all your work. This is just the missing piece to make the project perfect for big docs.

@zilom75

zilom75 commented Jan 18, 2022

@wilhelmer
I tried it! :)
I run my mkdocs site in a docker container as an Azure web app. The image was getting too big and it had problems being unpacked in Azure.
With this fix, my docker image went from 1.89GB to 789MB.
I'm more worried about storage sizes in cloud, but the performance improvement in runtime is welcome as well.

Thanks! :)

@squidfunk
Owner

If any of you have public repositories with large documentation, it would be of great help for me in seeing what we can optimize, so feel free to post links.

@wilhelmer
Contributor

wilhelmer commented Mar 1, 2022

@squidfunk Here's a project with a large topic: large-topic.zip

Yes I know, the index.md content is horrible, it's auto generated. Still: Why does it take so long to render index.html?

If I remove the assets and load the pure HTML file, it's loading fast. So it must be something about Material. It would be great if you could take a look at this.

EDIT: When using the default MkDocs template, it's also loading fast.

@squidfunk
Owner

@wilhelmer please open a new issue, as this issue has become too high-level.

@squidfunk
Owner

I've got a working solution for navigation pruning ready to be tested 😊 Does somebody have a public Open Source project with a huge navigation structure that might benefit from those optimizations? I'd like to run some numbers before releasing.

@aimeos
Author

aimeos commented May 24, 2022

Our docs are not as large as the ones from @wilhelmer, but you can see if there's an improvement:
https://github.com/aimeos/aimeos-docs

@squidfunk
Owner

squidfunk commented May 24, 2022

Thanks for providing the link to your docs. I've tested with the new navigation.prune feature, and the total size of the documentation reduced from 70MB to 47MB, so –33% by pruning navigation nodes alone. Lighthouse shows that on an average page, DOM nodes could be reduced from 1,268 to 247 (–81%). I'll wrap these findings up in a blog article tomorrow.

The feature should also be ready to be released tomorrow.

@wilhelmer
Contributor

Whoa! I just tested navigation.prune and compared it to my custom optimizations given above. The results look great:

No optimizations:
675 HTML files, 149 MB

Custom optimizations (see above):
675 HTML files, 72.4 MB

navigation.prune:
675 HTML files, 37.7 MB

@squidfunk
Owner

Yeah! Love it! Definitely need to write a blog article about this, as it might get lost in the vast amount of options that we offer.

@wilhelmer
Contributor

However ... the downside of your implementation is that it actually changes the UX. You can't just look around and expand different sections of the nav anymore while staying on the same page. Expanding a section will always cause a page reload.

I'm not sure this is worth the extra gain in file size. I think I'd rather go with my 72 MB and not sacrifice the UX for it.

No offense though - you know I absolutely admire your work, and maybe that's the best way to go for most users. I'm just not sure whether it's right for my project.

@squidfunk
Owner

Jup, the current solution is really the maximum we can save without sacrificing navigation layout. My motivation for the navigation pruning feature was the main use case being "download for offline use", where a page load doesn't matter at all, because loading from the file system is instant anyway.

@wilhelmer
Contributor

I'd argue that even when browsing locally, the page load affects the UX – the screen flickers, and the content changes as you are effectively moving to a new page. It just doesn't feel as smooth as browsing through the nav tree without any reloads.

My point is that you can still save a lot without any noticeable change if navigation.tabs is enabled. Because then, you can simply cut off all navigation items within the non-active tabs, which you can't access anyway (as discussed above).

So maybe navigation.prune could behave differently depending on whether navigation.tabs is enabled or not. But then again, you run into problems with the mobile drawer. Might be too complex.

@glenn-jocher

glenn-jocher commented Feb 4, 2024

@wilhelmer @squidfunk stumbled upon this issue today after a poor lightspeed test of our Mkdocs page showing the same excessive DOM elements:

Screenshot 2024-02-04 at 16 50 00

We have about 300 docs pages and lightspeed is showing 1500 DOM elements. I inspected the page in Chrome and was surprised to see that the entire navigation is apparently loaded into the HTML of every single docs page, even for pages that I would think are incapable of displaying those navigation elements.

I really like the navigation.sections option, but we need to improve our page load speeds and this seems like an area that is suboptimal. This example page at https://docs.ultralytics.com/help/ only appears to show a maximum of 22 possible navigation links in the top and left bars, but despite that all 300 links are loaded in the HTML. I'm using navigation.prune also, my YAML is at https://github.com/ultralytics/ultralytics/blob/main/mkdocs.yml.

EDIT: Lastly I should say that this means if I add new pages in other remote regions of my documentation, this /help page will continue to get larger and slower, which runs quite contrary to what I'd expect.

Screenshot 2024-02-04 at 16 53 57

@squidfunk
Owner

squidfunk commented Feb 5, 2024

if I add new pages in other remote regions of my documentation, this /help page will continue to get larger and slower, which runs quite contrary to what I'd expect.

@glenn-jocher I'm not sure I understand.

navigation.prune should only include links that are actually reachable from the current page. Thus, you might either have found a bug in navigation pruning, or there's a misconception we need to clarify. Could you please provide a minimal reproduction, so we have something to discuss? 1,400 nodes is a sign that pruning is not active (see #1887 (comment)), so it might be an error on your or on our side. 300 pages should not be a problem at all.

Also, when providing a minimal reproduction, please make absolutely sure to remove any customizations, because the error might hide in them. We need to know if stock Material for MkDocs exhibits this problem.
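To check the node count mentioned above without running Lighthouse, the elements of a page can be counted directly. This is a minimal sketch with a hypothetical helper name, `countElements`; it counts elements only, so the number may differ somewhat from what a given audit tool reports.

```javascript
// Count an element and all of its descendants. Works on anything that exposes
// an iterable `children` collection, such as DOM elements in a browser.
// `countElements` is a hypothetical helper name.
function countElements(root) {
  let total = 1;
  for (const child of root.children) {
    total += countElements(child);
  }
  return total;
}

// In a browser console: countElements(document.documentElement)
```

Running this on the same page with and without navigation.prune enabled gives a quick indication of whether pruning is active.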

@glenn-jocher

glenn-jocher commented Feb 9, 2024

@squidfunk thanks for your patience. I've isolated a minimum reproducible example in a new mkdocs_reproduce branch in https://github.com/ultralytics/ultralytics/tree/mkdocs-reproduce that has all plugins and extras stripped away.

Reproduce

# Clone the repo
git clone https://github.com/ultralytics/ultralytics

# Change directory and checkout the branch
cd ultralytics
git checkout mkdocs_reproduce

# Install mkdocs-material
pip install "mkdocs-material"

# Serve the docs
mkdocs serve

Then when I go to e.g. the help page at http://127.0.0.1:8000/help/ and inspect the HTML, I see relative links to all ~300 pages, I think because of the navigation, even though visually I only see about 20 pages available to navigate to (10 on the top bar, 10 on the left). I believe this is causing the Pagespeed "Excessive DOM Size" errors on our Docs.

Screenshot 2024-02-09 at 01 13 26 Screenshot 2024-02-09 at 01 14 03

@squidfunk
Owner

squidfunk commented Feb 9, 2024

Thanks for the reproduction. Indeed, pruning does not work correctly, as there's a bug when the following three feature flags are all enabled at once:

  • navigation.prune
  • navigation.sections
  • navigation.tabs

It works correctly if the last two features are not enabled, or if either one is enabled, but not if both are enabled. This should now be fixed with 292d563, where I also expanded the kind of cryptic boolean condition to make it more explicit. My testing shows the issue is now fixed, and pruning works as expected. Can you confirm, @glenn-jocher?

For debugging, it's good to add the following CSS to see pruning in action:

.md-nav__item--section {
  background: green !important;
}
.md-nav__item--pruned {
  background: red !important;
}

Edit: I checked your example again: the DOM node count is down from 1,598 to 266 (-83%) on the help page.

@glenn-jocher

@squidfunk the update works! Also our Docs without any plugins build 3x faster now, from 3s to 1s. Everything feels a little faster while navigating too. Nice work on the fix.

Please release a pip update when possible, thank you!

@squidfunk
Owner

Released as part of 9.5.9. For all other navigation pruning related bugs, please open new issues.

@glenn-jocher

Perfect, thank you!

@vincentkelleher

Fantastic work @squidfunk, thanks for your work on this subject 👍

I ran into the GitLab Pages size limit, and the navigation.prune feature saved the day by cutting our documentation from ~870MB to ~165MB 👏

@squidfunk
Owner

Wow, 80% reduction! That's awesome!
