Add information about lazy typesetting
dpvc committed Jun 16, 2021
1 parent 192e1f0 commit 8c55381
Showing 2 changed files with 51 additions and 0 deletions.
1 change: 1 addition & 0 deletions index.rst
@@ -64,6 +64,7 @@ need it.
:titlesonly:

Output Formats <output/index>
Lazy Typesetting <output/lazy>
Line Breaking <output/linebreaks>
Font Support <output/fonts>
Browser Support <output/browser>
50 changes: 50 additions & 0 deletions output/lazy.rst
@@ -0,0 +1,50 @@
.. _lazy-typesetting:

################
Lazy Typesetting
################

MathJax offers an extension that is designed to improve the
performance of pages with large numbers of equations. It implements a
"lazy typesetting" approach that only processes an expression when it
comes into view. This means that expressions will not be typeset when
they are not visible, and your readers will not have to wait for the
entire document to typeset, speeding up their initial view of the
page. Furthermore, any expressions that are never seen will not be
typeset, saving the processing time that would normally have been
spent on those expressions.

This also helps with the situation where you may link to a particular
location in your page (via a URL with a hash); typesetting the
material above that point can cause the browser to change the scroll
position, and so the user may not end up at the proper location in the
page. With the lazy extension, the material above that point is not
typeset until the user scrolls upwards, and so there is no position
change.

To use the lazy typesetting extension, simply add it to your
configuration as follows:

.. code-block:: javascript

   MathJax = {
     loader: {load: ['ui/lazy']}
   };

This will adjust the typesetting pipeline to implement the
lazy-typesetting functionality.

Lazy typesetting works best with SVG output, but changes to the way
the CommonHTML output handles its stylesheet updates make the CHTML
output nearly as fast. With TeX input, the lazy extension makes sure
that previous expressions are processed by TeX (though not output to
the page) so that any macro definitions or automatic equation numbers
are in place when the visible expressions are processed. Currently,
documents that contain ``\ref`` or ``\eqref`` links may not yet work
properly, since target equations may not have been typeset, and so the
link location may not be marked in the document. In particular,
forward references are unlikely to work, and backward references will
work only if the target expression has already been typeset. We hope
to improve this situation in a future release.


14 comments on commit 8c55381

@zhuoranh21

This also helps with the situation where you may link to a particular location in your page (via a URL with a hash); typesetting the material above that point can cause the browser to change the scroll position, and so the user may not end up at the proper location in the page. With the lazy extension, the material above that point is not typeset until the user scrolls upwards, and so there is no position change.

This fails when the hash location is close to the bottom. Currently I'm trying to fix it by refreshing the hash location using some js code like:
window.setTimeout(function(){ var hash_id = window.location.hash.substring(1); document.getElementById(hash_id).scrollIntoView(); }, 1000);
which is not a great solution because: a) waiting a second can look slow to hasty users and b) there's still no guarantee that the formulas in view will finish typesetting in poor Internet connection situations. What I need is a promise from mathjax in lazy mode that tells me when the formulas in view have finished typesetting. The startup: pageReady() configuration doesn't work in lazy mode.

@dpvc (Member, Author) commented on 8c55381 Jan 9, 2022

Here is a configuration that I think will do what you want:

MathJax = {
  loader: {
    load: ['ui/lazy'],
  },
  startup: {
    ready() {
      //
      // Do the regular startup.
      //
      MathJax.startup.defaultReady();
      //
      // Check for a hash in the URL
      //
      if (location.hash) {
        //
        // Hijack the lazyProcessSet() method so that we can scroll
        //    to the hash position after the first batch of expressions
        //    are typeset
        //
        const MathDocument = MathJax.startup.document;
        const PROCESSSET = MathDocument.lazyProcessSet;
        const callback = () => {
          MathDocument.lazyProcessSet = PROCESSSET;  // restore initial function 
          MathDocument.lazyHandleSet();              // typeset the math
          MathDocument.lazyPromise.then(() => {      // after the typesetting is done
            if (!userScroll) {                       //   scroll hash into view if user hasn't scrolled
              const hash = document.getElementById(location.hash.slice(1));
              if (hash) hash.scrollIntoView();
            }
            window.removeEventListener('scroll', scrollHandler);    // clear the scroll handler
          });
        };
        MathDocument.lazyProcessSet = (window.requestIdleCallback ?
          () => window.requestIdleCallback(callback) :
          () => setTimeout(callback, 10)
        );
        //
        // Check whether the user has scrolled the window before typesetting is complete
        //
        let userScroll = false;
        let initialScroll = false;
        const scrollHandler = () => {
          if (initialScroll) {
            userScroll = true;         // this is the user scrolling the window by hand
            window.removeEventListener('scroll', scrollHandler);
          } else {
            initialScroll = true;      // this is the initial scroll due to the hash itself
          }
        };
        window.addEventListener('scroll', scrollHandler);
      }
    }
  }
};

It modifies the lazy processing so that, after the first block of math is processed, it scrolls to the hash location. It also checks whether the user has scrolled the page by hand during the typesetting, and does not scroll to the hash location in that case.

I think this will improve your situation, but please let us know whether that is the case, so we can consider whether to incorporate this idea into the lazy typesetter in the future.
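
The scroll-detection part of the configuration above can be looked at in isolation. The following is a minimal sketch, not part of MathJax: `createScrollTracker` and `createFakeTarget` are hypothetical names introduced here for illustration. It assumes, as the configuration does, that the first `scroll` event comes from the browser's own jump to the hash and that any later one comes from the user.

```javascript
// Hypothetical helper (not a MathJax API): reports whether the user has
// scrolled by hand. The first scroll event is assumed to be the browser's
// own jump to the hash location; the second is treated as the user's.
function createScrollTracker(target) {
  let sawInitialScroll = false;
  let userScrolled = false;
  const handler = () => {
    if (sawInitialScroll) {
      userScrolled = true;                            // second event: the user scrolled
      target.removeEventListener('scroll', handler);  // nothing more to learn
    } else {
      sawInitialScroll = true;                        // first event: the hash jump
    }
  };
  target.addEventListener('scroll', handler);
  return {
    userScrolled: () => userScrolled,
    stop: () => target.removeEventListener('scroll', handler),
  };
}

// Hypothetical stand-in for `window`, so the logic can be exercised
// outside a browser.
function createFakeTarget() {
  const listeners = new Set();
  return {
    addEventListener: (type, fn) => listeners.add(fn),
    removeEventListener: (type, fn) => listeners.delete(fn),
    fire: () => [...listeners].forEach((fn) => fn()),
    count: () => listeners.size,
  };
}
```

In a page, `createScrollTracker(window)` would be created before the hash scroll happens, and `tracker.userScrolled()` would be consulted once typesetting is done. If a browser fires no scroll event for the hash jump, the first user scroll would be miscounted as the initial one, so the assumption is worth testing in each target browser.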

@zhuoranh21

According to my testing, your configuration works when the user jumps from another page to a late hash location (close to the bottom) on this page, but fails if the user jumps from an early hash location on this page to the late hash location. Also, when the user presses Ctrl+End, the browser doesn't scroll to the true end but goes to the scroll position that used to be the end but no longer is, because of formula typesetting. I understand that the lazy functionality works optimally if the user just reads sequentially from the beginning to the end of the same page. In other cases, it needs more testing and improvements. I hope to see a better version of the lazy typesetter in a future release.

@dpvc (Member, Author) commented on 8c55381 Jan 10, 2022

OK, here is an updated version that will handle hash links that are internal to the page.

MathJax = {
  loader: {
    load: ['ui/lazy'],
  },
  startup: {
    ready() {
      //
      // Do the regular startup.
      //
      MathJax.startup.defaultReady();
      //
      // Check whether the user has scrolled the window before typesetting is complete
      //
      let userScroll = false;
      let initialScroll = false;
      const scrollHandler = () => {
        if (initialScroll) {
          userScroll = true;         // this is the user scrolling the window by hand
          window.removeEventListener('scroll', scrollHandler);
        } else {
          initialScroll = true;      // this is the initial scroll due to a hash URL
        }
      };
      //
      // Set up lazy initial promise by hijacking the lazyProcessSet() 
      //   method for the first time it is called.
      //
      const MathDocument = MathJax.startup.document;
      const PROCESSSET = MathDocument.lazyProcessSet;
      const setLazyProcessSet = () => {
        MathDocument.lazyProcessSet = (window.requestIdleCallback ?
          () => window.requestIdleCallback(lazyCheckHash) :
          () => setTimeout(lazyCheckHash, 10)
        );
        userScroll = initialScroll = false;
        window.addEventListener('scroll', scrollHandler);
      };
      //
      // Function to move hash location to top after typesetting
      //
      const lazyCheckHash = () => {
        MathDocument.lazyProcessSet = PROCESSSET;  // restore initial function 
        MathDocument.lazyHandleSet();              // typeset the math
        MathDocument.lazyPromise.then(() => {      // after the typesetting is done
          if (!userScroll) {                       //   scroll hash into view if user hasn't scrolled
            const hash = document.getElementById(location.hash.slice(1));
            if (hash) hash.scrollIntoView();
          }
          window.removeEventListener('scroll', scrollHandler);    // clear the scroll handler
        });
      };
      //
      // Handle hash changes and initial hash URL processing
      //
      window.addEventListener('hashchange', setLazyProcessSet, false);
      if (location.hash) setLazyProcessSet();
    }
  }
};

This doesn't handle your CTRL+END issue. That would require a global keyup handler, but I'm not including one here because keybindings are not consistent across operating systems (e.g., CTRL+END on Windows, FN+RIGHTARROW or CMD+DOWNARROW on macOS, and I'm not sure what corresponds to that on mobile devices). You should be able to put together a key handler that calls setLazyProcessSet() at the appropriate time if you want that functionality.
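
A key handler of the kind described above might look roughly like the following sketch. The names `isJumpToEndKey` and `installJumpToEndHandler` are hypothetical, the list of key bindings is an assumption based on the ones named in this comment and is not exhaustive, and `setLazyProcessSet` is the function from the configuration above, so this code would need to live inside the same `ready()` function to see it.

```javascript
// Hypothetical predicate: does this keyboard event look like a
// "jump to the end of the page" shortcut? The bindings are assumptions
// and would need checking on each target platform.
function isJumpToEndKey(event) {
  if (event.key === 'End') return true;                        // End or CTRL+END (FN+RIGHTARROW is assumed to report End)
  if (event.key === 'ArrowDown' && event.metaKey) return true; // CMD+DOWNARROW
  return false;
}

// Hypothetical wiring: calls `onJumpToEnd` (for example, setLazyProcessSet)
// whenever a matching key event arrives on `target` (for example, window).
// The event type defaults to 'keyup' as suggested above; whether 'keydown'
// works better in a given browser is something to test.
function installJumpToEndHandler(target, onJumpToEnd, type = 'keyup') {
  const handler = (event) => {
    if (isJumpToEndKey(event)) onJumpToEnd();
  };
  target.addEventListener(type, handler);
  return () => target.removeEventListener(type, handler);   // call to uninstall
}
```

Inside `ready()`, this would be installed with `installJumpToEndHandler(window, setLazyProcessSet)`. Note that `lazyCheckHash` scrolls to the hash anchor rather than to the bottom of the page, so a real CTRL+END handler would likely need its own scroll target.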

@zhuoranh21

Thanks. The hard part is communicating and synchronising with mathjax. There are not enough sample codes in the documentation to show how this is done, except asking an insider like you for help. My tentative solution to the Ctrl+End problem is to do something in the onscroll event. Every time the bottom is hit (<=1px), I use window.setTimeout(..., 500) to scroll the window to 2px from the bottom. Hopefully the 2px breaks the cascade of sticking the user to the bottom.

@zhuoranh21

One more thing: I found a situation where scrollIntoView needs to be called twice: once immediately and once delayed. I went to the last hash location of my long document with many formulas, then scrolled away by pressing Ctrl+Home to the beginning of the document, leaving the hash location of the URL unchanged, and then pressed F5 to refresh the page. After pressing F5, all formulas except those in the first view are no longer typeset. The first scrollIntoView call brings the last hash location into view, but then formulas below that hash location get typeset, and the second scrollIntoView call becomes necessary in order to scroll to the correct hash location (either to the new bottom or putting the hash location at the top of the screen, depending on how many formulas there are below that hash location). Hopefully this finding helps with your design of the new version of the lazy typesetter.

@dpvc (Member, Author) commented on 8c55381 Jan 11, 2022

The hard part is communicating and synchronising with mathjax. There are not enough sample codes in the documentation to show how this is done

Yes, the documentation needs improvement. Since the lazy typesetting extension is relatively new (new in v3.2), there is very little about it.

My tentative solution to the Ctrl+End problem is to do something in the onscroll event

I think that will be problematic. Aside from onscroll events being expensive (there can be a LOT of them), I think this will cause scrolling to the bottom when you don't want it to. For example, if the user scrolls by hand and ends up at the bottom, and then math is typeset, you would not want to scroll to the bottom at that point (the user has scrolled the top line to where they want it, and the fact that the page gets longer because of math shouldn't change that). Or if there is an anchor position such that it is at the top of the window when the bottom of the page is at the bottom of the window, and you link to that hash location and math typesetting makes that section longer, you would still want to have the hash anchor at the top of the page, not scroll to the bottom.

The real issue is that there is no event for "scroll to bottom" as there is for "scroll to hash position", and so you don't have any real way to tell that you are scrolling to the bottom of the page. You only have key presses (CTRL+END), and those key bindings are OS specific. I've certainly visited sites where scrolling to the bottom has caused images to be loaded that enlarge the page, so that you are no longer at the bottom. I don't think that is any big deal. I suspect you are going to have to let that one go.

I found a situation where scrollIntoView needs to be called twice.

I am not able to reproduce the problem in MacOS in Firefox, Chrome, or Safari. In fact, this was one of the situations I tested. What OS and browser (and their versions) are you using?

Note that the initial scroll to the hash location should be performed by the browser, then MathJax should typeset the math that has been exposed, then the code above should do the scrollIntoView() to move the hash anchor to the top again. It is possible that some browsers handle this situation differently, however, and I suppose there could be timing issues with how scripts are loaded versus when the screen is updated. I would not expect that the expressions at the top of the window are typeset before the scrolling takes place. That suggests that the browser is not moving to the hash location itself. If you don't include the code we have been working on above, does it move to the hash location?

Can you provide a link to an example page that exhibits the problem?

@zhuoranh21

I've certainly visited sites where scrolling to the bottom have caused images to be loaded that enlarge the page and so you are no longer at the bottom. I don't think that is any big deal. I suspect you are going to have to let that one go.

Agreed. There are js codes that load more items when the user scrolls close to the bottom to extend the page.

I am not able to reproduce the problem in MacOS in Firefox, Chrome, or Safari. In fact, this was one of the situations I tested. What OS and browser (and their versions) are you using?

I'm using Windows and have tried different browsers. Using your code, the behavior is different. I clicked to the last hash, pressed Ctrl+Home and F5. The scroll position remains at the beginning (not going to the last hash), which is acceptable, while in my treatment I did the scrollIntoView twice to get to the last hash again. When I then clicked on a link on the page to get to the last hash, I found that your code works fine using Firefox and Google Chrome, bringing the scroller to the correct hash position. But in Microsoft Edge or 360 browsers, the scroller goes to the incorrect location probably because there was no hash change. Hopefully you can see the problem yourself using Microsoft Edge or 360 browsers.

@dpvc (Member, Author) commented on 8c55381 Jan 11, 2022

What version of windows are you using, and what version of Edge? (These are important, which is why I asked for that information earlier.)

I have tried my test file in Windows 11 with Edge 97.0.1072.55, and it works (the position remains at the top, as is the case with Chrome, which makes sense since Edge now uses Chromium as its display engine). If you are using an old version of Edge that is before the change from Trident to Chromium, that could be the problem. I'm not willing to install the 360 browser on my machine, so I won't be testing that. But are you using "compatibility mode" (which is Trident based)?

@zhuoranh21

I'm using Windows 10 and Edge 92.0.902.78, older than your version. Staying at the top after clicking #last_hash, pressing Ctrl+Home, and F5 is OK. But then clicking #last_hash should go to the right place, which was not the case using my version of Edge and 360. In 360, I tested only the speed mode (the formulas do not typeset in compatibility mode).

@dpvc (Member, Author) commented on 8c55381 Jan 12, 2022

OK, I misunderstood your earlier post. I can reproduce the result as you describe, and will look into it.

@dpvc (Member, Author) commented on 8c55381 Jan 12, 2022

OK, it turns out that the problem is that the hashchange event doesn't occur when you click the link if the hash value is already set to that value in the URL, as it is when you reload the page. I could not find a better way to monitor the hash links. One possible solution is to add some JavaScript code to override the actual links themselves (so you can tell when one is pressed), but that introduces other issues. The other approach is to force the page to move to the hash link explicitly when the page is loaded. This is a change to the browser's standard behavior, but seems a reasonable action, and requires less hacking about with the links. The code below implements this idea, and should work for Edge as well as other browsers.

MathJax = {
  options: {
    compileError(doc, math, err) {console.log(err); return doc.compileError(math, err)},
    typesetError(doc, math, err) {console.log(err); return doc.typesetError(math, err)}
  },
  loader: {
    load: ['ui/lazy'],
  },
  startup: {
    ready() {
      //
      // Do the regular startup.
      //
      MathJax.startup.defaultReady();
      //
      // Check whether the user has scrolled the window before typesetting is complete
      //
      let userScroll = false;
      let initialScroll = false;
      const scrollHandler = () => {
        if (initialScroll) {
          userScroll = true;         // this is the user scrolling the window by hand
          window.removeEventListener('scroll', scrollHandler);
        } else {
          initialScroll = true;      // this is the initial scroll due to a hash URL
        }
      };
      //
      // Scroll hash into view
      //
      const scrollToHash = () => {
        const hash = document.getElementById(location.hash.slice(1));
        if (hash) hash.scrollIntoView();
      };
      //
      // Set up lazy initial promise by hijacking the lazyProcessSet() 
      //   method for the first time it is called.
      //
      const MathDocument = MathJax.startup.document;
      const PROCESSSET = MathDocument.lazyProcessSet;
      const setLazyProcessSet = () => {
        MathDocument.lazyProcessSet = (window.requestIdleCallback ?
          () => window.requestIdleCallback(lazyCheckHash) :
          () => setTimeout(lazyCheckHash, 10)
        );
        userScroll = initialScroll = false;
        window.addEventListener('scroll', scrollHandler);
      };
      //
      // Function to move hash location to top after typesetting
      //
      const lazyCheckHash = () => {
        MathDocument.lazyProcessSet = PROCESSSET;  // restore initial function 
        MathDocument.lazyHandleSet();              // typeset the math
        MathDocument.lazyPromise.then(() => {      // after the typesetting is done
          if (!userScroll) scrollToHash();         //   scroll hash into view if user hasn't scrolled
          window.removeEventListener('scroll', scrollHandler);    // clear the scroll handler
        });
      };
      //
      // Handle hash changes and initial hash URL processing
      //
      window.addEventListener('hashchange', setLazyProcessSet, false);
      if (location.hash) {
        setTimeout(() => {
          setLazyProcessSet();
          scrollToHash();          // force scrolling into view since hashchange event doesn't
                                   //   fire if hash is already in place when we click link to hash.
        },0);                      // setTimeout() needed for Edge in order to get the scroll to happen after
                                   //   the browser scrolls to the original location on a reloaded page.
      }
    }
  }
};
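
For comparison, the first alternative mentioned above (watching the links themselves) could start from a check like the one below. This is only a sketch of the detection step, with hypothetical names `isSameHashLink` and `watchSameHashClicks`; it does not address the other issues that overriding links introduces. It relies on the standard `URL` class to resolve the link against the current address.

```javascript
// Hypothetical helper: true when following `href` from `currentUrl` would
// leave both the page and its hash unchanged, which is the case in which
// no hashchange event fires.
function isSameHashLink(href, currentUrl) {
  const target = new URL(href, currentUrl);   // resolve relative links such as "#last"
  const current = new URL(currentUrl);
  if (target.hash === '') return false;       // not a hash link at all
  const withoutHash = (url) => url.href.slice(0, url.href.length - url.hash.length);
  return target.hash === current.hash && withoutHash(target) === withoutHash(current);
}

// Hypothetical browser wiring: calls `onSameHash` when a link to the
// already-active hash is clicked. `doc` would be `document` and `loc`
// would be `location`.
function watchSameHashClicks(doc, loc, onSameHash) {
  doc.addEventListener('click', (event) => {
    const link = event.target.closest ? event.target.closest('a[href]') : null;
    if (link && isSameHashLink(link.getAttribute('href'), loc.href)) onSameHash();
  });
}
```

For example, `isSameHashLink('#last', 'https://example.org/doc.html#last')` is true, while the same link evaluated against `https://example.org/doc.html#first` is false, since that click would produce a normal hashchange event.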

@zhuoranh21

Yes, it was really the "no hash change" issue that I guessed. Difficult problem because different browsers treated it differently. I chose to do the second choice and found that scrollIntoView needed to be called twice. I see two scrollToHash calls in your code, so I think your code now does the same but reacts more quickly (mine had to wait 0.5 sec after the first scrollIntoView call to ensure the formulas have finished typesetting before the second scrollIntoView call). Good, I'll use your code now!

@zhuoranh21

I realize the forced hash location on refresh is sometimes not wanted. For example, at the beginning of every section there's a hash location. When I go to the middle of a section and click a link and then click go back, the standard browser behavior is to return to the scroll position from where I left the page. Now since there are formulas, the browser-recorded scroll position may no longer be correct. And returning to the beginning of the section may not be what the user wants, either. The user simply wants to go back to the position where he clicked the link (undo the click).

Another situation: I read sequentially from the beginning to the middle, and press F5. The standard browser behavior is to stay in place. The lazy typesetter without the patch code may scroll to an unwanted position, because the formulas I've already read are no longer typeset and the page length changes. With the patch, the browser scrolls to the beginning of the page (or to some hash location).

I think there is an ultimate cure for all the problems we've been discussing so far (including F5, Ctrl+End, and so on). Is it possible to do a quick "preprocessing" of all displaystyle formulas throughout the page before the window.onload event, so that each one occupies the space required to hold the fully typeset formula? The lazy mode would then apply only to the true typesetting step. The preprocessing would avoid all height changes and the resulting disruption of scroll positions.
