Write mathematics as HTML script tags? #611

Closed
rbeezer opened this issue Jul 3, 2017 · 26 comments

Comments

@rbeezer
Collaborator

rbeezer commented Jul 3, 2017

With paragraph bust-up in place at #515 we could consider writing math/LaTeX/mathjax as script elements:

http://docs.mathjax.org/en/latest/advanced/model.html#mathjax-script-tags

This might greatly simplify cross-references to equations (HTML id rather than MathJax LaTeX \label{} mechanism). Maybe it would remove a step from page loading and give a speed-up? Identifying mathematics with proper elements might have other benefits?

Not sure I see a downside, other than some ugliness to accommodate IE quirks. Benefits may be marginal.

Reactions?

@davidfarmer
Contributor

davidfarmer commented Jul 6, 2017 via email

@davidfarmer
Contributor

Okay, I found an example of a referenced numbered equation. It looks like there is
a \label in the HTML source, but it is ignored because the HTML has a hard-coded
\tag, and the reference also has that tag hard-coded.

Since there is no way to replicate the numbering in the PDF unless the numbers are
directly written into the HTML, I don't get what the \label is doing, or how an HTML id
could be of any use.

@rbeezer
Collaborator Author

rbeezer commented Jul 6, 2017 via email

@rbeezer
Collaborator Author

rbeezer commented Jul 6, 2017 via email

@rbeezer rbeezer mentioned this issue Nov 26, 2018
@rbeezer
Collaborator Author

rbeezer commented May 21, 2021

My reaction is that it is crazy to consider doing this in the
next year.

It has been four years now.

@Alex-Jordan
Contributor

Alex-Jordan commented May 21, 2021

In the CAT example I saw earlier today, the page used MathJax 2.

My weak understanding is that MJ2 finds all the \(...\) and turns them in the DOM into <script type="math/tex"> tags. My total speculation is that the CAT is looking for these script tags to understand where math starts and stops.

My equally weak understanding of MathJax3 is that it does not work this way. It does not build <script type="math/tex"> tags from \(...\). If all of the above is right, the CAT may have an issue with MJ3.

However, MJ3 can be configured to look for <script type="math/tex"> tags. (Either in addition to \(...\), or instead of.) If PreTeXt HTML exclusively used <script type="math/tex"> tags for math, it may be a good thing for the CAT. That is, if my weak understandings and assumptions are anywhere close to correct.

A bonus effect of using <script type="math/tex"> tags is that you can directly write < characters inside them without the web browser thinking that a new tag is starting. So from the perspective of an author using the CAT who is oblivious to the need for \lt, it's another reason to use <script type="math/tex"> tags. (Although for translation to PTX, you would still need to turn < into \lt . Or at least into &lt;.)
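
To make the point about `<` concrete, here is a minimal sketch. The helper names `toScriptTag` and `texForPreTeXt` are hypothetical and belong to neither PreTeXt, the CAT, nor MathJax. It shows the same author-typed TeX going to HTML untouched inside a script element, and to PreTeXt source with `<` converted to `\lt`.

```javascript
// Hypothetical helpers for illustration only.

// Wrap raw inline TeX in a script element.
// Inside <script>, a bare "<" is not parsed as the start of a tag.
function toScriptTag(tex) {
  if (/<\/script/i.test(tex)) {
    // The one sequence that would end the script element early.
    throw new Error('TeX must not contain "</script"');
  }
  return `<script type="math/tex">${tex}</script>`;
}

// Prepare the same TeX for PreTeXt source, where "<" is XML syntax.
// The trailing space keeps the macro name from running into a following letter.
function texForPreTeXt(tex) {
  return tex.replace(/</g, '\\lt ');
}
```

With these definitions, `toScriptTag('a<b')` leaves the `<` as typed, while `texForPreTeXt('a<b')` yields `a\lt b`.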

@davidfarmer
Contributor

The CAT does not use the typeset inline math. For display math the HTML
has a wrapper which is used. So, this issue is not relevant to the CAT.

Also, the CAT is fine with "<" in the input source, because that is intercepted
and converted. (Is, or will be in a few days).

I am not seeing the benefit of this idea.

@rbeezer
Collaborator Author

rbeezer commented May 22, 2021

I am not seeing the benefit of this idea.

100% unambiguous. Math/LaTeX delimited by HTML/XHTML/XML syntax. Not a convenient syntax for people authoring one-off web pages outside of PreTeXt.

Suppose somebody "accidentally" authors \(foo\) inside a paragraph. It'll get rendered as LaTeX in HTML output. Except if you try the experiment, it won't happen as I just said, because we scan all text nodes and any instance of \( becomes \[unicode-no-width-space-here](. We could forgo that kludge.
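
The kludge described above can be sketched as follows. PreTeXt's conversion is written in XSLT, so this JavaScript is only an illustration of the substitution, and the function name `defuseInlineDelimiter` is hypothetical.

```javascript
// Break up "\(" in ordinary text so MathJax no longer sees an opening
// delimiter. U+200B is the zero-width space.
const ZERO_WIDTH_SPACE = '\u200B';

function defuseInlineDelimiter(text) {
  // Insert the invisible character between the backslash and the parenthesis.
  return text.replace(/\\\(/g, '\\' + ZERO_WIDTH_SPACE + '(');
}
```

The text looks unchanged to a reader, but the two-character sequence MathJax scans for is no longer present. If math were marked only by script elements, this substitution would not be needed.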

Alex is getting this script version back from WW servers. I had to put this in bare HTML so an author's interactive could have our MJ run over some LaTeX in a JavaScript widget showing slopes (which is broken now with MJ3, but I know how to fix).

@Alex-Jordan
Contributor

Well, I was wrong about the CAT then.

I think it's a good thing if we really don't care how human readable the HTML is. Rob assures me we do not. But maybe you feel differently?

Next consideration is how bad do we want to prevent adventurous hackers from doing something off book? Like I just tried this:

            <p>
                A hack: <m>x+y\) equals \(z</m>
            </p>

And the result looks fine, like:
[screenshot of the rendered output]

Moving to script tags would partly kill such attempts, since to do the same thing you would use closing/opening span tags, which then would ruin your PDF output.

@Alex-Jordan
Contributor

Actually that hack is worse than I thought. At present we can do:

            <p>
                A hack: <m>\) Anything can go here, and no tomfoolery will be prevented,
                because we allow text nodes inside math to pass through unaltered.   \(</m>
            </p>

@davidfarmer
Contributor

If the only change to inline math is replacing \( and \) by opening and closing script tags, then that is fine with me and I don't see any headaches for CSS or the CAT.

The MathJax setup for the page should change so that \( is not interpreted as an opening math tag. Would that eliminate the need to do \[unicode-no-width-space-here]( ?

Will someone be allowed to define \( as a macro for \left(?

Are there any changes to display math? If div.displaymath is still the wrapper, and any changes are inside that div, that is okay with me.

@Alex-Jordan
Contributor

Alex-Jordan commented May 22, 2021

Would that eliminate the need to do \[unicode-no-width-space-here]( ?

Yes.

Will someone be allowed to define \( as a macro for \left(?

No. Print LaTeX would still recognize and use \( for its math opening delimiter.

Are there any changes to display math?

It would also replace \[ with the same script tag, with another attribute mode=display.

@davidfarmer
Contributor

davidfarmer commented May 22, 2021 via email

@rbeezer
Collaborator Author

rbeezer commented May 22, 2021

never outputs backslash square bracket, instead \begin{equation*}

I was about to say that I think that is the way it is now!

@Alex-Jordan: If MJ is configured to only look for the script element, will it still "see" \begin{equation}, \begin{align}, etc?

@Alex-Jordan
Contributor

With MathJax, it will process \begin{xxx}...\end{xxx} whether you are in math mode or not. For example, it will process \begin{equation*}...\end{equation*} whether you are inside math mode or not. It will process \begin{matrix}...\end{matrix} whether you are inside math mode or not.

For this, you would put the script tag with mode="display" around the \begin{equation*}...\end{equation*}.

@davidfarmer
Contributor

For inline math, this is actually a good change which helps the CAT.

As much as possible, I prefer data over code. What I mean is that data
describes an object, and general-purpose code converts the data into
the chosen representation of that object. Need a new object?
Just describe its data, with no need to change the code.

Almost everything is xml, so its opening and closing tags are described by
a tagName, attributes, and attribute values. But not math. Inline math is
not xml, so now there is no tagName, but you need two new fields instead:
openingTag and closingTag. The code has to check if there is a tagName,
and if not, then use the opening and closing tags.

Switching to <script type="math/tex"> addresses that special case
for inline math.
That is good, and I will switch to that as the inline HTML math wrapper.

But what about display math? If I still need to supply
\begin{equation*}...\end{equation*}, or align for multiline,
then I can't remove the code which handles the special case
of no xml tagName.
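
The data-versus-code distinction above can be sketched as follows. The field names `tagName`, `openingTag`, and `closingTag` come from the comment; the `descriptions` table and the `wrap` function are hypothetical and are not the CAT's actual code.

```javascript
// Hypothetical element descriptions, for illustration only.
const descriptions = {
  paragraph: { tagName: 'p', attributes: {} },
  // Inline math with LaTeX delimiters has no tagName, so it needs two extra fields.
  inlineMathDelimited: { openingTag: '\\(', closingTag: '\\)' },
  // With a script element, inline math is described like any other element.
  inlineMathScript: { tagName: 'script', attributes: { type: 'math/tex' } },
};

function wrap(desc, content) {
  if (desc.tagName) {
    const attrs = Object.entries(desc.attributes || {})
      .map(([name, value]) => ` ${name}="${value}"`)
      .join('');
    return `<${desc.tagName}${attrs}>${content}</${desc.tagName}>`;
  }
  // The special case: no tagName, fall back to explicit delimiters.
  return desc.openingTag + content + desc.closingTag;
}
```

If every object could be described with a `tagName`, the final branch of `wrap` could be deleted. As the comment notes, display math that still carries \begin{...} and \end{...} keeps that branch alive.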

@Alex-Jordan
Contributor

If you have

<script type="math/tex" mode="display">\begin{equation*}\frac{x}{2}+y=z\end{equation*}</script>

and you drop the script tag and have \begin{equation*}\frac{x}{2}+y=z\end{equation*} stored in some variable, you still have valid MathJax inline math. It would be invalid in regular LaTeX, but \begin{equation*}\frac{x}{2}+y=z\end{equation*} is valid in inline math mode when using MathJax. So perhaps viewing the \begin{equation*}...\end{equation*} as a display math delimiter is not how to look at it. Instead there is a thing that MathJax does where it looks for these delimiters, and if there are no other containing math delimiters then it infers you want display mode.

@rbeezer
Collaborator Author

rbeezer commented May 23, 2021

Right. Except PTX HTML output had a div.displaymath wrapping it.

Not sure I'm tracking the CAT scenario, but div.displaymath could certainly contain more information via additional attributes - like the originating PTX element (md, mdn, me, men) or the resulting LaTeX environment (equation, align*, gather). Would that help?

@davidfarmer
Contributor

The user of the CAT only types the contents of the display math,
not the begin and end tags.

Wrapping the content in div.displaymath and/or a script tag is the
expected behavior.
It is the LaTeX-style begin and end tags I would like to avoid.
But if those have to be there, then it is not actually that much of a hassle
for me to leave the code as-is. And I would not be surprised if down the
road other things need separate beginning and ending tags.

@rbeezer
Collaborator Author

rbeezer commented May 23, 2021

Maybe way off-base, but is the following a solution? Maybe not a good solution, but a demonstration that I am understanding.

<div class="displaymath" latex-env="equation">
x^2+y^2=25
</div>

and then PTX JS gets this before MathJax does and injects the script tag for MJ, and the \begin{equation} to make the right LaTeX, then MJ gets what it needs/wants?
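
A minimal sketch of that injection step, written as a pure string transformation so it can be read without a browser. The function name `expandDisplayMath` is hypothetical. In a page, `env` would be read from the div's latex-env attribute and `content` from its text. The script tag follows the form used earlier in this thread; MathJax 2 itself spells display mode as `type="math/tex; mode=display"`.

```javascript
// Hypothetical transformation for illustration only.
function expandDisplayMath(env, content) {
  // Supply the LaTeX environment the author did not have to type.
  const tex = `\\begin{${env}}\n${content.trim()}\n\\end{${env}}`;
  // Wrap the result in the script element MathJax would be configured to find.
  return `<script type="math/tex" mode="display">${tex}</script>`;
}
```

The stored HTML then carries only the content and the environment name, and the LaTeX-style begin and end tags exist only in what MathJax receives.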

@Alex-Jordan
Contributor

Will the CAT be delving into mrow level markup? In other words, will CAT users write \\, or will they use the CAT to move to a new row?

@davidfarmer
Contributor

davidfarmer commented May 23, 2021 via email

@Alex-Jordan
Contributor

Here are the MathJax instructions for using script tags in MJ3:
http://docs.mathjax.org/en/latest/upgrading/v2.html?highlight=findScript#changes-in-the-mathjax-api#math-script-example

As I read the configuration code, it seems that it is not important to literally use mode='display'. That attribute in the script tag could be mode='gather', mode='align', or mode='alignat' to keep track of the type, and in that configuration the variable display just needs to look for any of them. Or mode='display' if it is not multiline. Maybe starred variants to track numbered or not.

So a rough outline is: based on the value of @mode in the script tag, the CAT should know to expect \begin{gather}...\end{gather}, \begin{align}...\end{align}, \begin{equation}...\end{equation}, etc., and trim away exactly what it expects to find there, while keeping track of the flavor based on the explicit @mode value.
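
The trimming step in that outline might look like the sketch below. The mode-to-environment table and the function name `stripEnvironment` are assumptions made for illustration; in particular, mapping `display` to `equation*` is a guess, since the thread leaves the numbered and unnumbered spelling open.

```javascript
// Hypothetical mapping from an @mode value to the environment expected inside.
const ENVIRONMENT_FOR_MODE = new Map([
  ['display', 'equation*'], // assumption: plain display means unnumbered
  ['equation', 'equation'],
  ['gather', 'gather'],
  ['gather*', 'gather*'],
  ['align', 'align'],
  ['align*', 'align*'],
]);

// Remove exactly the \begin{env}...\end{env} pair that the mode predicts.
// Returns null when the mode is unknown or the wrapper is not the expected one.
function stripEnvironment(mode, tex) {
  const env = ENVIRONMENT_FOR_MODE.get(mode);
  if (env === undefined) return null;
  const begin = `\\begin{${env}}`;
  const end = `\\end{${env}}`;
  const trimmed = tex.trim();
  if (!trimmed.startsWith(begin) || !trimmed.endsWith(end)) return null;
  return trimmed.slice(begin.length, trimmed.length - end.length).trim();
}
```

Returning null on a mismatch, rather than guessing, reflects the "trim away exactly what it expects" wording: the mode value is treated as the source of truth and the LaTeX wrapper is only checked against it.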

In the other direction, it could take what an author writes and infer:

  • single line/multiline
  • numbered/not numbered
  • multiline gather/multiline align

based on what the user typed in. The only thing I can't think of an automated inference for is alignat versus align. But if the CAT infers (for example) multiline gather, not numbered, it could build the script tag with @mode set accordingly. It could write PTX output that only had the real row by row content that the user typed. And it could insert the \begin{gather*}...\end{gather*} inside the script tag for HTML.
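
A sketch of the inference in this direction follows. All three heuristics are assumptions made for illustration: the thread does not say how the CAT would decide, and treating a \label as the sign of a numbered equation is a guess. The function names are hypothetical, and the `mode` attribute follows the form used earlier in this thread.

```javascript
// Hypothetical heuristics for illustration only.
function inferEnvironment(content) {
  const multiline = content.includes('\\\\');     // a row break "\\"
  const aligned = /(^|[^\\])&/.test(content);     // an unescaped alignment "&"
  const numbered = content.includes('\\label{');  // assumption: a label implies a number
  const base = !multiline ? 'equation' : aligned ? 'align' : 'gather';
  return numbered ? base : base + '*';
}

// Build the HTML form: @mode records the flavor, and the matching
// \begin{...}...\end{...} pair is inserted around what the author typed.
function buildDisplayScript(content) {
  const env = inferEnvironment(content);
  return `<script type="math/tex" mode="${env}">\\begin{${env}}${content}\\end{${env}}</script>`;
}
```

As noted above, nothing here can tell alignat from align. The row-break test is also crude: a single equation containing a matrix with `\\` would be misread as multiline.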

If I understood right, the redundancy of setting @mode and also writing \begin{gather*}...\end{gather*} is less than ideal.

@davidfarmer
Contributor

I looked at the way MJ3 replaces finding the standard math delimiters by
finding a script tag. It is not as simple as in MJ2.

If we decide to make the switch, it should be done when we have time to focus on it.

@rbeezer
Collaborator Author

rbeezer commented May 24, 2021

Not imminent. ;-)

@rbeezer
Collaborator Author

rbeezer commented Oct 30, 2021

MathJax 3 allows for marking elements it will ignore, and marking elements it will process. There still need to be LaTeX delimiters, but since we know where math is, we can isolate the interpretation of these delimiters. So a different solution to the proposal here.
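
A minimal sketch of such a configuration, using the MathJax 3 document options `ignoreHtmlClass` and `processHtmlClass`. The class names `ignore-math` and `process-math` and the variable name `mathJaxConfig` are assumptions for illustration and are not taken from the commit referenced below.

```javascript
// Sketch of a MathJax 3 configuration object; class names are assumed.
const mathJaxConfig = {
  tex: {
    // Delimiters are still needed, but they only matter inside marked elements.
    inlineMath: [['\\(', '\\)']],
  },
  options: {
    // Elements with this class, and their contents, are skipped...
    ignoreHtmlClass: 'ignore-math',
    // ...except inside elements with this class, which are typeset.
    processHtmlClass: 'process-math',
  },
};

// In a page, the object would be assigned before MathJax loads:
//   window.MathJax = mathJaxConfig;
```

With the ignore class on an outer element such as the body, and the process class on a wrapper around each piece of math, a stray \( in ordinary text is never examined.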

Guts of the change at d5ef68f

@rbeezer rbeezer closed this as completed Oct 30, 2021