url package is tagged as Formula if the math code is loaded #5

u-fischer · 2023-07-24T14:25:19Z

\url uses math mode internally, and this is grabbed and tagged as Formula. The problem is that url does set \m@th, but only inside the math, at the end:

\def\Url@FormatString{%
 \UrlFont \Url@MathSetup 
 $\fam\z@ \textfont\z@\font
 \expandafter\UrlLeft\Url@String\UrlRight
 \m@th$% <--------------  
}%

This can be corrected by moving \m@th into \Url@MathSetup, but I wonder if a dedicated command to mark "fake math" is needed, and if the code should detect \m@th inside dollars?

\DocumentMetadata{uncompress,testphase={phase-III,math}}
\documentclass{article}
\usepackage{url}
\begin{document}

\url{https://www.latex-project.org} %tagged as formula
\makeatletter
\AddToHook{cmd/Url@MathSetup/before}{\m@th}
\makeatother

\url{https://www.latex-project.org} %ok only text.

$a=b$
\end{document}


car222222 · 2023-07-24T14:41:26Z

@davidcarlisle once commented that putting \m@th at the end of the math is quite common.

Please can you tell me where to find the code you are using to "grab math stuff" : file, location and branch. Thanks.
Then I can perhaps look into how we might detect such cases.

car222222 · 2023-07-24T14:44:41Z

Is it obvious that the use of mathmode is a necessary, or wise, method by which to format a URL?

FrankMittelbach · 2023-07-24T14:53:50Z

> @davidcarlisle once commented that putting \m@th at the end of the math is quite common.

> Please can you tell me where to find the code you are using to "grab math stuff" : file, location and branch. Thanks. Then I can perhaps look into how we might detect such cases.

develop branch latex-lab/latex-lab-math.dtx (or something like that)

FrankMittelbach · 2023-07-24T14:55:22Z

> Is it obvious that the use of mathmode is a necessary, or wise, method by which to format a URL?

It may be historic, and Donald's style of coding; you can do some tricks if you pretend you are in math.

davidcarlisle · 2023-07-24T14:56:20Z

> Is it obvious that the use of mathmode is a necessary, or wise, method by which to format a URL?

it's not for math as such but for \mathcode"8000 math-active characters, which could, perhaps, be replaced by \scantokens and real active characters these days but url.sty math mode processing has a very long history.
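For readers unfamiliar with math-active characters, here is a minimal sketch of the mechanism. It is not the code from url.sty, whose definitions differ; the choice of "/" and the penalty value 200 are assumptions made purely for illustration.

```latex
\documentclass{article}
\begin{document}
\begingroup
  % Define the *active* version of "/" without changing its catcode:
  % \lowercase turns the active "~" into an active "/".
  \begingroup \lccode`\~=`\/ \lowercase{\endgroup \def~}{\mathchar"013D \penalty 200 }%
  % Mathcode "8000 makes "/" act like that active character, but only in math mode.
  \mathcode`\/="8000
  In text a/b is unaffected, but in $a/b$ the slash runs the active definition
  (the usual math slash followed by a possible break point).
\endgroup
\end{document}
```

Because the character keeps its ordinary catcode, nothing changes outside math mode, which is presumably part of the appeal of this approach for URL formatting.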

u-fischer · 2023-07-24T15:04:40Z

> Is it obvious that the use of mathmode is a necessary, or wise, method by which to format a URL?

I hope to convince someone (@josephwright) at some time to write a replacement; the current implementation, for example, can't handle Unicode properly. But for now we have to take what is there.

josephwright · 2023-07-24T15:06:12Z

@u-fischer OK, I'll add it to the to-do list. Could we start a formal spec list somewhere so I don't start on the wrong path?

josephwright · 2023-07-25T08:36:53Z

I think Donald is using math mode to get fine control of line breaking: not sure if a non-math mode solution is really viable as a result.

car222222 · 2023-07-25T08:54:07Z

Yet another strange way to make use of TeX's abilities!

FrankMittelbach · 2023-07-25T09:40:16Z

Sure, we all came up with tricks like that to make things work in limited space (this is why I said using math mode for tricks, because it was a bit more than the math-active characters). @josephwright I'm not sure that it would be that hard conceptually by parsing through the url token by token. On the other hand, it might be simpler to just accept that mmode is sometimes misused for its technical possibilities, and all we would need to do for this is to have a reasonably simple flag to ensure that no tagging happens (which could be as simple as requiring \m@th as the first token after $ or some dedicated \NotMath).

josephwright · 2023-07-25T09:42:28Z

@FrankMittelbach I think on reflection you are right. My feeling is we really should look to 'fix' these uses but we also need to at least try to 'handle' them. So yes, some form of flag to say 'not maths' is a good idea, but we should also try over time to re-implement the 'abuses' so we don't need math mode: I suspect a 'not maths' flag will have edge cases where it fails.

u-fischer · 2023-07-25T09:57:44Z

> So yes, some form of flag to say 'not maths' is a good idea

Yes, a flag to say "not math" is a good idea, and also a flag "this is math", to override the \m@th detection. Perhaps the private boolean currently used for the \m@th detection should be made public?

car222222 · 2023-07-25T10:29:10Z

I just checked and: @josephwright did fix the \m@th detector, using \tl_if_in:nnF, so that it should now be effective anywhere within the grabbed math.

So there must be some other reason why it grabs the math in this case.

FrankMittelbach · 2023-07-25T10:29:41Z

Why would you need to override the \m@th detection? If the mechanism is that \m@th has to be the first token after $ then a simple correction is to use \relax\m@th for existing code. Of course you could give this \relax a name such as \IsMath :-)
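A minimal sketch of the "first token after $" rule described here, assuming the math content has already been grabbed as a token list. The names \sketch_classify:n and \SketchClassify are hypothetical and not part of latex-lab; only \tl_if_head_eq_meaning:nNTF is an existing expl3 function.

```latex
\documentclass{article}
\makeatletter
\ExplSyntaxOn
% Hypothetical classifier: "not math" only if the very first token is \m@th
\cs_new_protected:Npn \sketch_classify:n #1
  {
    \tl_if_head_eq_meaning:nNTF {#1} \m@th
      { not~math } { math }
  }
\NewDocumentCommand \SketchClassify { m } { \sketch_classify:n {#1} }
\ExplSyntaxOff
\makeatother
\begin{document}
\makeatletter
\SketchClassify{\m@th c=d}\par       % first token is \m@th: "not math"
\SketchClassify{\relax\m@th c=d}\par % \relax comes first: "math"
\SketchClassify{e=f \m@th}           % \m@th at the end: "math"
\makeatother
\end{document}
```

Under such a rule, existing code that wants zero \mathsurround but is real math would only need the extra \relax (or a named equivalent) in front of \m@th.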

FrankMittelbach · 2023-07-25T10:30:37Z

> I just checked and: @josephwright did fix the \m@th detector, using \tl_if_in:nnF so that it should now be effective anywhere within any math.

> So there must be some other reason why it grabs the math in this case.

Interesting ... some grouping that interferes, perhaps?

car222222 · 2023-07-25T10:38:49Z

Not sure: I did not check the details of what \tl_if_in:nnF actually does if the tl contains brace-groups:-).
I just assumed that it recurses into them.
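On the question of brace groups: as far as I can tell, \tl_if_in:nnTF searches by delimited-argument matching, so it only sees tokens at the top level of the token list and does not recurse into groups. A small test document along these lines should show the difference (the sample contents are made up for illustration):

```latex
\documentclass{article}
\begin{document}
\makeatletter
\ExplSyntaxOn
% \m@th at the top level of the token list: reported as found
\tl_if_in:nnTF { e=f \m@th } { \m@th } { found } { not~found }
\par
% \m@th hidden inside a brace group: not seen by the search
\tl_if_in:nnTF { \hbox { $ \m@th g=h $ } } { \m@th } { found } { not~found }
\ExplSyntaxOff
\makeatother
\end{document}
```

The same limitation applies to a \m@th that only appears after macro expansion, since the n-type argument is searched unexpanded.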

u-fischer · 2023-07-25T10:46:35Z

> I just checked and: @josephwright did fix the \m@th detector, using \tl_if_in:nnF, so that it should now be effective anywhere within the grabbed math.

> So there must be some other reason why it grabs the math in this case.

If I do the following the first three are tagged as math, only the last one is normal text:

\DocumentMetadata{uncompress,testphase={phase-III,math}}
\documentclass{article}
\begin{document}
\makeatletter
$ a=b$

$\m@th c=d $

$e=f \m@th$

{\m@th $g=h$ }
\end{document}

josephwright · 2023-07-25T10:54:00Z

@u-fischer We had some back-and-forward about \m@th, with the result being at the time we decided it only meant 'not maths' if it came immediately before the $. That was because it also shows up in 'real' maths (amsmath, etc.).

FrankMittelbach · 2023-07-25T11:06:02Z

I'm surprised you say "before" and also surprised if that is what amsmath does, because it means you have to add an extra (unnecessary) group to keep the change local, which you get for free if you put it inside the dollars. So I always thought we implemented $\m@th as the legacy indicator.

However, seeing how it is used in amsmath, I guess you are right and we should not look for \m@th inside, because quite often it is in fact used there to produce math, but with \mathsurround forced to zero. So that does in fact mean we need some flag (probably outside of the $...$) that signals whether the next $...$ is or is not math.

josephwright · 2023-07-25T11:20:53Z

As a reminder, we find in amsmath for example both

\def\@mathmeasure#1#2#3{\setbox#1\hbox{\frozen@everymath\@emptytoks
    \m@th$#2#3$}}

and

\def\plainroot@#1\of#2{\setbox\rootbox\hbox{%
 $\m@th\scriptscriptstyle{#1}$}%
\def\r@@t#1#2{\setboxz@h{$\m@th#1\sqrtsign{#2}$}%
 \dimen@\ht\z@\advance\dimen@-\dp\z@
 \setbox\@ne\hbox{$\m@th#1\mskip\uproot@ mu$}%
 \advance\dimen@ by1.667\wd\@ne
 \mkern-\leftroot@ mu\mkern5mu\raise.6\dimen@\copy\rootbox
 \mkern-10mu\mkern\leftroot@ mu\boxz@}

car222222 · 2023-07-25T11:28:58Z

@josephwright wrote:
We had some back-and-forward about \m@th, with the result being
at the time we decided it only meant 'not maths' if it came immediately before the $.

Maybe we did, but then later it got changed, as I just explained, to not grab any math that contains it. I did actually check
the code for this.

Also, which $: the first or the last? Anyway, that is certainly not what the current implementation does.

That was because it also shows up in 'real' maths (amsmath, etc.).

I once believed that too, but I have not yet found any examples of this, at least not in amsmath.

Where it does occur is in connection with math that is itself contained in "faketext", such as when an alignment or an hbox is used within math to format some purely mathematical construct,
with no real text involved.

Some time ago I wrote an essay to describe a reasonably robust method of distinguishing (within math) between such faketext and real text within the math. This should be implemented so that real math within real text should be grabbed in cases where the current setup does not grab it.

I wonder where that file is now?

car222222 · 2023-07-25T11:34:18Z

For emphasis:
I believe that the convention that "it must occur immediately before a $" was never on the table.
I am reasonably sure that at one stage the code checked only whether it came immediately after the opening $ or $$ (or
before the math, of course).

car222222 · 2023-07-25T11:38:51Z

@FrankMittelbach wrote:

However, seeing how it is used in amsmath . . . because quite
often it is in fact used there to produce math but with
mathsurround forced to zero.

Please can you point to any example of this that is not immediately inside "faketext", such as an hbox or vbox+array. I really need to find any such examples if they exist.

car222222 · 2023-07-25T11:43:18Z

@josephwright All of those are examples of "faketext", as I explained.

car222222 · 2023-07-25T11:46:56Z

Also, of course, if the \m@th is buried in a definition like these examples, then it will not be found by just scanning the top-level math contents.

u-fischer · 2023-07-25T13:06:04Z

> @u-fischer We had some back-and-forward about \m@th, with the result being at the time we decided it only meant 'not maths' if it came immediately before the $.

Well actually that is not true. After looking at the emails and code I think @car222222 is quite right. Math is not processed if \m@th is detected.

\DocumentMetadata{uncompress,testphase={phase-III,math}}
\documentclass{article}

\ExplSyntaxOn
\math_processor:n{XXXX}
\ExplSyntaxOff
\begin{document}
\makeatletter
$ a=b$

$\m@th c=d $

$e=f \m@th$

{\m@th $g=h$ }
\end{document}

gives

The problem is only with the tagging code, which sits outside the processor in \__math_grab_dollar:w and so is applied unconditionally (unless \__math_grab_dollar:w is itself not executed when the boolean is false):

\cs_new_protected:Npn \__math_grab_dollar:w % $
  #1 $
  {
    \tl_if_blank:nF {#1}
      {
        \__math_process:nn { math } {#1} % $
        \tagmcend %end P-chunk, in code: \tag_mc_end_push:
        \@kernel@math@begin
        #1 $
        \@kernel@math@end
        \tagmcbegin{}  % restart P-chunk (whatsits in pdftex)
      }
  }

car222222 · 2023-07-25T14:11:58Z

Well, that is all Frank's (or maybe Ulrike's?) code, which has probably not been checked. I have not seen it before today.

Logically, it would seem reasonable to me that the tagging
should be done (or at least set up, or not) by the processor:
then it would not get done if the math did contain a \m@th.
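The idea can be illustrated with a self-contained toy that does not use the latex-lab internals at all. Everything named sketch or Sketch below is hypothetical, the visible "[Formula begin]" and "[Formula end]" markers merely stand in for the real tagging calls, and the \m@th test is only the simple top-level check discussed earlier in this thread.

```latex
\documentclass{article}
\makeatletter
\ExplSyntaxOn
% Hypothetical flag: true when the grabbed content contains \m@th
\bool_new:N \l_sketch_fake_math_bool
% Stand-in for the processor: only records whether \m@th occurs at top level
\cs_new_protected:Npn \sketch_process:n #1
  {
    \tl_if_in:nnTF {#1} { \m@th }
      { \bool_set_true:N \l_sketch_fake_math_bool }
      { \bool_set_false:N \l_sketch_fake_math_bool }
  }
% Stand-in for the grabber: the "tagging" (visible markers) depends on the flag
\cs_new_protected:Npn \sketch_math:n #1
  {
    \sketch_process:n {#1}
    \bool_if:NTF \l_sketch_fake_math_bool
      { $ #1 $ }
      { [Formula~begin] $ #1 $ [Formula~end] }
  }
\NewDocumentCommand \SketchMath { m } { \sketch_math:n {#1} }
\ExplSyntaxOff
\makeatother
\begin{document}
\makeatletter
\SketchMath{a=b}\par     % wrapped in the Formula markers
\SketchMath{e=f \m@th}   % typeset without markers
\makeatother
\end{document}
```

The point of the design is simply that the decision made by the processor and the tagging action are tied to the same flag, so they cannot disagree.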

FrankMittelbach changed the title ~~url is tagged as Formula if the math code is loaded~~ url package is tagged as Formula if the math code is loaded Jul 24, 2023

FrankMittelbach added the "currently incompatible package or class" label (package or class that doesn't work with current version of tagging code) Jul 24, 2023

u-fischer added the "fixed in release" label (issue is fixed and will be deployed in the next release of package or kernel) Sep 14, 2023

u-fischer closed this as completed Nov 5, 2023
