Version 3 generates unexpected output with tex4ht and mathjax #530

Closed
rlkamalapurkar opened this issue Aug 15, 2021 · 22 comments

@rlkamalapurkar

Compiling the file test.tex:

\documentclass{article}
\usepackage{siunitx}
\begin{document}
	The speed is $v = 3$ \si[per-mode=symbol]{\meter\per\second}.
\end{document}

using make4ht test.tex "xhtml,mathjax" puts a part of the siunitx source code in the output test.html

<body>
<!-- l. 4 --><p class='noindent'>The speed is \(v = 3\) \(\relax \exp_args:NV \__siunitx_print_math_auxiii:n \l__siunitx_print_tmp_tl \).
</p>
</body> 

The document compiles correctly with make4ht if I use siunitx-v2 instead of version 3.0.24.

@josephwright josephwright self-assigned this Aug 15, 2021
@josephwright
Owner

I'll take a look. I can see where that line comes from in siunitx but I'm not sure why it's not being expanded by tex4ht.

@josephwright
Owner

At the cost of losing font control, one could use

\ExplSyntaxOn
        \cs_gset_protected:Npn \__siunitx_print_math_auxii:n #1
          {
            \tl_set:Nn \l__siunitx_print_tmp_tl {#1}
            \exp_args:NNnx \tl_replace_all:Nnn \l__siunitx_print_tmp_tl
              { ^ } { \token_to_str:N ^ }
            \exp_args:NV \ensuremath \l__siunitx_print_tmp_tl
          }
\ExplSyntaxOff

(after \begin{document}) for the present.

I think I will need to discuss this with the tex4ht developers: I can see a bit of what is happening, but I'm not sure if that is the best fix.
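
For reference, this is roughly how the workaround might sit in the test file from the report. It is only a sketch that combines the two snippets quoted above; it redefines siunitx v3 internals (\__siunitx_print_math_auxii:n and \l__siunitx_print_tmp_tl), so it may stop working in later releases.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Temporary workaround, placed after \begin{document} as suggested above.
% It redefines an internal siunitx v3 function and may break in later releases.
\ExplSyntaxOn
\cs_gset_protected:Npn \__siunitx_print_math_auxii:n #1
  {
    \tl_set:Nn \l__siunitx_print_tmp_tl {#1}
    % siunitx inserts ^ with catcode 7; swap it for a plain character
    \exp_args:NNnx \tl_replace_all:Nnn \l__siunitx_print_tmp_tl
      { ^ } { \token_to_str:N ^ }
    \exp_args:NV \ensuremath \l__siunitx_print_tmp_tl
  }
\ExplSyntaxOff
The speed is $v = 3$ \si[per-mode=symbol]{\meter\per\second}.
\end{document}
```

Compiling with make4ht test.tex "xhtml,mathjax" as in the report should exercise the redefinition; font control is lost, as noted.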

@michal-h21

In the MathJax mode, all content of math environments is passed unexpanded to the HTML output, and it is left to MathJax to render it. So it is important that content in \( ... \) contains only macros that MathJax knows.

@josephwright
Owner

@michal-h21 I'd worked that out :) I remember now that in v2 I just gave up with units and forced text mode. That's really sub-optimal from a semantic point of view, but my hack above means we give up with font control: also suboptimal. As we are in a typesetting context, I'd not done stuff by expansion, but I probably could arrange to have a 'clean' math mode setup before applying \ensuremath.

I'll need to ponder this a bit: some of the search-and-replace gets awkward, etc.

@michal-h21

Your code works pretty well when I save it as siunitx.4ht:

\ExplSyntaxOn
\AtBeginDocument{%
\cs_gset_protected:Npn \__siunitx_print_math_auxii:n #1
  {
    \tl_set:Nn \l__siunitx_print_tmp_tl {#1}
    \exp_args:NNnx \tl_replace_all:Nnn \l__siunitx_print_tmp_tl
      { ^ } { \token_to_str:N ^ }
    \exp_args:NV \ensuremath \l__siunitx_print_tmp_tl
}
}
\ExplSyntaxOff

\Hinput{siunitx}
\endinput

The only issue is that it still writes the \relax command to the output HTML, and MathJax doesn't support that. It is possible to pass a dummy configuration for this command to MathJax, but it would be better to omit it from the output anyway.

@josephwright
Owner

I've got a few ideas about how to approach this. The best is if there is a way to know we are in MathJax mode: @michal-h21 is there a flag? If so, I could arrange that \SI, etc., given in text mode 'convert' to math mode, and that will then mean that material is passed to MathJax as-is. Failing that, I think I can arrange to move where the internal \ensuremath sits such that it's not an issue.

My hack is very much that: it relies on siunitx internals, so it could go into siunitx itself, but I'd rather it didn't go into any other packages. On the \relax, that's just from \ensuremath: one could arrange to avoid that, as with e-TeX we don't require the 'defensive' code.

@michal-h21

For example, \ifdefined\fixmathjaxtoc is true only in MathJax mode. It is definitely best to do any fixes in siunitx.4ht. Note that it is executed before \AtBeginDocument; it seems that siunitx defines some macros at that moment?

Maybe it would be easier to output the detokenized math content with \( and \) strings around it? That would avoid any involvement of LaTeX math mode, which is not useful in this case anyway.

@rlkamalapurkar
Author

@josephwright, you said

If so, I could arrange that \SI, etc., given in text mode 'convert' to math mode, and that will then mean that material is passed to MathJax as-is

This is exactly what siunitx does if the \si command is inside math mode in the TeX source code:

\begin{document}
	The speed is $v = 3 \si[per-mode=symbol]{\meter\per\second}$.
\end{document}

gives

<body>
<!-- l. 4 --><p class='noindent'>The speed is \(v = 3 \si [per-mode=symbol]{\meter \per \second }\).
</p> 
</body> 

That does not help, though, because as of now MathJax does not support \si without external packages.

I think leaving it up to MathJax would be ideal, especially if the external package could be integrated with MathJax. It is beyond my coding abilities to do that myself, though!

I just filed this bug report because I was using \si in text mode to generate my site and it stopped working when I updated siunitx to version 3.

@michal-h21

@rlkamalapurkar you can configure MathJax in TeX4ht, so you can try to integrate it with the MathJax Siunitx package.

@josephwright
Owner

@michal-h21 As my hack touches an internal function, it's really not suitable for anything outside of siunitx itself. I'm pondering if you need an API here or whether I can fix nicely at my end.

(I guess @FrankMittelbach might have wider comments on the entire business of patches for tex4ht)

@michal-h21

michal-h21 commented Aug 16, 2021

@josephwright I've come up with this solution:

\ExplSyntaxOn
\ifdefined\fixmathjaxtoc
\AtBeginDocument{%
\cs_gset_protected:Npn \__siunitx_print_math_auxii:n #1
  {
    \tl_set:Nn \l__siunitx_print_tmp_tl {#1}
    \exp_args:NNnx \tl_replace_all:Nnn \l__siunitx_print_tmp_tl
      { ^ } { \token_to_str:N ^ }
    % escape special HTML characters
    \regex_replace_all:nnN { \x{26} } { &amp; } \l__siunitx_print_tmp_tl
    \regex_replace_all:nnN { \x{3C} } { &lt; } \l__siunitx_print_tmp_tl
    \regex_replace_all:nnN { \x{3E} } { &gt; } \l__siunitx_print_tmp_tl
    \HCode{\detokenize{\(} \tl_to_str:N \l__siunitx_print_tmp_tl \detokenize{\)}}
}
}
\fi
\ExplSyntaxOff

\Hinput{siunitx}
\endinput

It reuses some code that TeX4ht uses in the MathJax mode to replace <, > and &, as these characters would cause invalid HTML. I am not sure about the \HCode line; it is not in the expl3 style, but it does the trick :)

I can put this into siunitx.4ht and add it to the TeX4ht sources.

@josephwright
Owner

I've updated the code to do a better job here, but I still need to think about how best to expose 'extra search and replace' to tex4ht. I'm still wondering a bit about this: I guess I expected it to be handled 'last minute'.

@michal-h21

@josephwright I've already put the code from my previous post into the TeX4ht sources. Will it need a modification for the new siunitx code?

@josephwright
Owner

@michal-h21 I've still only got an internal interface, so you'll want something like

\tl_if_exist:NTF \l__siunitx_print_math_html_tl
  {
    \tl_put_right:Nn \l__siunitx_print_math_html_tl
      {
        & { &amp; }
        < { &lt; }
        > { &gt; }
      }
  }
  {
    \cs_gset_protected:Npn \__siunitx_print_math_auxii:n #1
      {    
        \tl_set:Nn \l__siunitx_print_tmp_tl {#1}
        \exp_args:NNnx \tl_replace_all:Nnn \l__siunitx_print_tmp_tl
          { ^ } { \token_to_str:N ^ }
        % escape special HTML characters
        \regex_replace_all:nnN { \x{26} } { &amp; } \l__siunitx_print_tmp_tl
        \regex_replace_all:nnN { \x{3C} } { &lt; } \l__siunitx_print_tmp_tl
        \regex_replace_all:nnN { \x{3E} } { &gt; } \l__siunitx_print_tmp_tl
        \HCode{\detokenize{\(} \tl_to_str:N \l__siunitx_print_tmp_tl \detokenize{\)}}
    }
  }

I'm still trying to work out a proper, public, interface. What's confusing me is I don't follow why you need to filter out &, < and > at the siunitx end, as they must show up in general math mode material anyway - don't you make them math-active?

@michal-h21

Thanks. We need to escape &, < and > because they would otherwise end up directly in the HTML code. As they are special HTML characters, that would result in rendering errors.

What does this code do?:

\tl_put_right:Nn \l__siunitx_print_math_html_tl
      {
        & { &amp; }
        < { &lt; }
        > { &gt; }
      }

@josephwright
Owner

Actually, you might need

\tl_put_right:Nx \l__siunitx_print_math_html_tl
  {
    & { \token_to_str:N & amp ; }
    < { \token_to_str:N & lt ; }
    > { \token_to_str:N & gt ; }
  }

What this does is add to the internal token list (macro) \l__siunitx_print_math_html_tl, which is then used in a search-and-replace of the tokens to be passed to math mode.
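
To make the idea of a list of search-and-replace pairs concrete, here is a small stand-alone sketch of how such a token list could be applied. All names (the demo module, \l_demo_pairs_tl, \l_demo_tmp_tl, \demo_apply_pairs:) are hypothetical and this is not the siunitx implementation; it only illustrates the general expl3 pattern of walking a list two items at a time.

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
% Hypothetical names (module "demo"); none of this is siunitx code.
\tl_new:N \l_demo_pairs_tl
\tl_new:N \l_demo_tmp_tl
% Store "search { replacement }" pairs. The x-type expansion turns & into
% a plain character so it can later be written out literally.
\tl_set:Nx \l_demo_pairs_tl
  {
    < { \token_to_str:N & lt ; }
    > { \token_to_str:N & gt ; }
  }
% Walk the pair list two items at a time, replacing each search token.
\cs_new_protected:Npn \demo_apply_pairs:
  {
    \exp_after:wN \__demo_apply:nn \l_demo_pairs_tl
      \q_recursion_tail \q_recursion_tail \q_recursion_stop
  }
\cs_new_protected:Npn \__demo_apply:nn #1#2
  {
    \quark_if_recursion_tail_stop:n {#1}
    \tl_replace_all:Nnn \l_demo_tmp_tl {#1} {#2}
    \__demo_apply:nn
  }
% Apply the pairs to some sample material and write the result to the log.
\tl_set:Nn \l_demo_tmp_tl { a < b > c }
\demo_apply_pairs:
\tl_log:N \l_demo_tmp_tl
\ExplSyntaxOff
\end{document}
```

Adding further pairs with \tl_put_right:Nx, as in the snippet above, would extend the replacements without touching the loop.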

The reason I've not provided this as a public interface is that I'd imagine you need to handle a simple

$ a < b $

and the 'obvious' way to me is something like

\mathcode`\<="8000\relax
\begingroup
  \catcode`\<=\active
  \xdef<{\string&lt;}
\endgroup

which would then apply to the output from siunitx without any special handling. That won't work for ^ from siunitx as I explicitly set it as catcode-7, which is why I have to 'tidy up' that one case.
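
By the same pattern, a companion definition for > would presumably look like the sketch below. Whether TeX4ht's MathJax mode would actually honour math-active characters in this way is an assumption here, not something confirmed in this thread. The & character is left out on purpose: as far as I understand, the math-active mechanism only applies to ordinary (catcode 11 or 12) characters, and & is an alignment character.

```latex
% Sketch only: make > math-active and let it expand to the HTML entity.
\mathcode`\>="8000\relax
\begingroup
  \catcode`\>=\active
  % \string& yields a plain & character, so the definition is "&gt;"
  \xdef>{\string&gt;}
\endgroup
```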

@ebin-dev

ebin-dev commented Apr 8, 2023

Hi there,

siunitx produces unexpected output if \num{} is used in test.tex above (but only inside a math environment):

The speed is $v = \num{3.14159}$ \si[per-mode=symbol]{\meter\per\second}.

it gives

<!-- l. 4 --><p class='noindent'>The speed is \(v = \num {3.14159}\) \(\mathrm {m}/\mathrm {s}\).

\num is unknown to the browser and shows up in red. Could this be solved, or is there another solution?

Thanks for your excellent work !

@michal-h21

You need to configure MathJax to support siunitx. See this guide on how to pass MathJax configuration from TeX4ht. There seems to be a MathJax extension for siunitx, but it carries a deprecation warning, so I am not sure how well it works.

In your case the following configuration file can be used to support just the \num command:

\Preamble{xhtml} 
\catcode`\#=11 
\Configure{MathJaxConfig}{{ 
    tex: { 
      tags: "ams", 
      \detokenize{% 
      macros: { 
        num: ["#1",1],
      } 
  } 
} 
}} 
\catcode`\#=6 
\begin{document} 
\EndPreamble

@ebin-dev

ebin-dev commented Apr 11, 2023

@michal-h21 : the configuration indeed solves the issue with \num{} inside a math environment - thank you very much!

However, siunitx units still do not display correctly within any math environment, and unfortunately I could not figure out how to modify the configuration so that they would. The MathJax extension is deprecated; it appears to partially handle \si{} but not \unit{}.

Would you have a suggestion for making the following text display correctly in HTML? This would probably also be helpful for others :-).

test.tex:
The values are \begin{equation}v = \num{3.14159}\,\si{\meter\per\second}\end{equation} and \[ \lambda = \num{2,71828}\,\unit{\centi\meter} \]

it gives:

<p class='noindent'>The values are \begin {equation} v = \num {3.14159}\,\si {\meter \per \second } \end {equation}<a id='x1-2r1'></a> and \[ \lambda = \num {2,71828}\,\unit {\centi \meter } \]
</p>

"\si \meter \per \second" and "\unit \centi \meter" are still displayed in red by the browser ...

@michal-h21

I guess the best solution would be if someone upgraded the MathJax extension. Otherwise, you would need to pass suitable definitions for all siunitx commands, which would be complicated. You can also try the MathML + MathJax combination. You would avoid the issues with unknown macros, but it is possible that you would run into other issues. You can try it using

 $ make4ht test.tex "mathml,mathjax"
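
As an illustration of what passing definitions by hand might look like, the \num configuration from earlier in the thread could be extended to cover just the commands in the example above. This is a rough sketch, not a siunitx implementation: the macro bodies are simple stand-ins chosen for illustration, optional arguments such as [per-mode=symbol] are not handled, and no number formatting or unit spacing is applied.

```latex
% Assumption: simple stand-in macros for the few commands used in test.tex;
% every other siunitx unit or prefix would need its own entry.
\Preamble{xhtml}
\catcode`\#=11
\Configure{MathJaxConfig}{{
    tex: {
      tags: "ams",
      \detokenize{%
      macros: {
        num:    ["#1", 1],
        si:     ["#1", 1],
        unit:   ["#1", 1],
        meter:  "\\mathrm{m}",
        second: "\\mathrm{s}",
        centi:  "\\mathrm{c}",
        per:    "/"
      }
  }
}
}}
\catcode`\#=6
\begin{document}
\EndPreamble
```

Saved as a configuration file and passed to make4ht in the same way as the earlier \num configuration, this is intended to let MathJax render \si{\meter\per\second} as m/s and \unit{\centi\meter} as cm.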

@limefrogyank

I guess the best solution would be if someone upgraded the MathJax extension.

I'm working on a JavaScript port of this tool that will work in the latest version of MathJax. It's not public yet, but it's coming very soon and will be open/free/etc. Maybe next month. Hopefully, before the end of the year.

@josephwright
Owner

Looks to me like this is fixed from the TeX4ht end, so I will close here.
