Incompatibility between Latin and unicode-math #394

Closed
Doc73 opened this issue Feb 27, 2020 · 22 comments

Doc73 commented Feb 27, 2020

Dear Devs,
I don't know exactly when this issue arose, but I can confidently say it was not present on August 1, 2019, when I last compiled a project of mine that I picked up again today after several months.

This MWE works fine:

% !TeX program = xelatex
% !TeX encoding = UTF-8
% !TeX spellcheck = it_IT

\documentclass{book}
\usepackage{fontspec}
\usepackage{unicode-math}
\usepackage{polyglossia}
\setmainlanguage[babelshorthands=true]{italian}
	\PolyglossiaSetup{italian}{indentfirst=false}
%\setotherlanguage[variant=modern]{latin}
%\setotherlanguage[]{english}

\begin{document}
Quel ramo del lago di Como, che volge a mezzogiorno, tra due catene non interrotte di monti, 
tutto a seni e a golfi, a seconda dello sporgere e del rientrare di quelli, vien, quasi a un tratto, 
a ristringersi, e a prender corso e figura di fiume, tra un promontorio a destra, e un'ampia costiera 
dall'altra parte; e il ponte, che ivi congiunge le due rive, par che renda ancor più sensibile 
all'occhio questa trasformazione, e segni il punto in cui il lago cessa, e l'Adda rincomincia, 
per ripigliar poi nome di lago dove le rive, allontanandosi di nuovo, lascian l'acqua distendersi 
e rallentarsi in nuovi golfi e in nuovi seni.
\end{document}

It also works if I comment out unicode-math and enable Latin:

%\usepackage{unicode-math}
...
\setotherlanguage[variant=modern]{latin}

However, the Latin option cannot coexist with the unicode-math package; compilation fails if I have

\usepackage{unicode-math}
...
\setotherlanguage[variant=modern]{latin}

There are no issues with English.

The problem seems to be related to the character ' (U+0027): if I replace it with the character ’ (U+2019), my document compiles fine.
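For readers who want to reproduce the report, the pieces above can be combined into one compact file. This is only a sketch assembled from the MWE in this comment; the shortened body text is an assumption made for brevity, and the failure is expected only with the package versions current in February 2020.

```latex
% !TeX program = xelatex
% !TeX encoding = UTF-8
\documentclass{book}
\usepackage{fontspec}
\usepackage{unicode-math}
\usepackage{polyglossia}
\setmainlanguage[babelshorthands=true]{italian}
\setotherlanguage[variant=modern]{latin}

\begin{document}
% U+0027 (typewriter apostrophe): reported to break compilation
% when both unicode-math and Latin are loaded.
un'ampia costiera

% U+2019 (typographic apostrophe): reported to compile fine.
% Swap the line above for this one to compare.
% un’ampia costiera
\end{document}
```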

I hope I was clear! :-)

Many thanks in advance,
Domenico

Doc73 changed the title from "Incompatibility between Latin and unicode-math package" to "Incompatibility between Latin and unicode-math" on Feb 27, 2020
Collaborator

jspitz commented Feb 28, 2020

For the record, latin triggers the following LaTeX errors with unicode-math:

! Missing $ inserted.
<inserted text> 
                $
l.17 ...fiume, tra un promontorio a destra, e un'a
                                                  mpia costiera

and

! Missing $ inserted.
<inserted text> 
                $
l.22 \end{document}

The problem seems to emerge from gloss-latin making ' active, which seems to clash with unicode-math. @wehro?

Collaborator

jspitz commented Feb 28, 2020

Respective unicode-math ticket: latex3/unicode-math#462

Collaborator

jspitz commented Feb 28, 2020

A workaround mentioned in the unicode-math ticket is \shorthandoff{'} immediately after \begin{document}.

@wehro I suppose we need to make sure shorthands are really only activated in the respective language and if babelshorthands is true.
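A minimal sketch of the `\shorthandoff{'}` workaround, based on the MWE from the original report (the shortened body text is an assumption, and this has not been retested against later releases). Note that it switches off the `'` shorthand for the whole document, and on versions where `'` is not a shorthand character the command may raise an error, so it is best treated as a temporary measure.

```latex
% !TeX program = xelatex
\documentclass{book}
\usepackage{fontspec}
\usepackage{unicode-math}
\usepackage{polyglossia}
\setmainlanguage[babelshorthands=true]{italian}
\setotherlanguage[variant=modern]{latin}

\begin{document}
% Workaround from the unicode-math ticket: deactivate the ' shorthand
% right after \begin{document}, so ' is an ordinary character again.
\shorthandoff{'}
un'ampia costiera
\end{document}
```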

Author

Doc73 commented Feb 28, 2020

@jspitz
Unfortunately, I can no longer test this workaround in my book, because I changed all U+0027 characters to U+2019.

Contributor

wehro commented Mar 2, 2020

A fix is in preparation. Please have some patience.

Collaborator

jspitz commented Mar 2, 2020

Sure, we're not in a hurry.

wehro added a commit to wehro/polyglossia that referenced this issue Mar 8, 2020
wehro added a commit to wehro/polyglossia that referenced this issue Mar 8, 2020
Contributor

wehro commented Mar 8, 2020

I have a solution which works as long as no primes appear within a math formula within a Latin language section (#397).
The prime detection mechanism of unicode-math does not work if the acute is active. This also concerns babel languages with active acute (cf. catalan).

Compare the output of the following document with and without the activeacute option (look at the primes):

\documentclass{article}
\usepackage{unicode-math}
\usepackage[activeacute,catalan,english]{babel}

\begin{document}
\textbf{English:} $f'$ $f''$

\selectlanguage{catalan}
\textbf{Catalan:} 'a $f'$ $f''$
\end{document}

And look at the following document after the merge of #397:

\documentclass{article}
\usepackage{unicode-math}
\usepackage{polyglossia}
\setmainlanguage{german}
\setotherlanguage[babelshorthands=true]{latin}

\begin{document}
b'a
$f'$
$f''$
$f'''$

\begin{latin}
b'a
$f'$
$f''$
$f'''$
\end{latin}

b'a
$f'$
$f''$
$f'''$
\end{document}

A similar problem occurs here, but only within the Latin language environment.

jspitz added a commit that referenced this issue Mar 9, 2020
Fix incompatiblity between Latin and unicode-math (#394), but only partly
Collaborator

jspitz commented Mar 9, 2020

PR merged. @wspr do you have an idea about the remaining problem?

Contributor

wspr commented Mar 9, 2020

@jspitz Sorry, I'm not able to spend much time on this at the moment. What would unicode-math need to add for this to work most easily? I could add a "text mode" command to unicode-math like the following:

\documentclass{article}
\usepackage{unicode-math}
\ExplSyntaxOn
\cs_set:Nn \um_prime_text_char: {!}
\ExplSyntaxOff
\begin{document}
\catcode`\'=\active
b'a
$f'$
\end{document}
  • Would this help?
  • What would the best name for this command be?

I almost think that definitions for active catcodes are something that needs kernel support to avoid conflicts...

(Ping back for latex3/unicode-math#462)

Collaborator

jspitz commented Mar 9, 2020

@wspr I don't know if that's possible, but the best solution seems to be that unicode-math activates whatever it does with the prime character only in math mode and restores the previous definition/catcode when math mode is left.

Contributor

wspr commented Mar 9, 2020

@jspitz That's not really possible at the moment — I'm not currently hooking deep enough into the way that LaTeX2e handles maths to make that happen. The solution I'm outlining above is essentially:

\documentclass{article}
\def\ActiveTick{\ifmmode \MathTick\else \TextTick\fi}
\def\MathTick{^\prime}% dummy definition but you get the idea
\def\TextTick{\char`\'}% redo this for babel/polyglossia as needed
\catcode`\'=\active
\let'\ActiveTick
\begin{document}
T'
$M'$
\end{document}

but it needs coordination on where that original definition is set up and what we call the internal components. This could even be a (tiny) third-party package, or even the 2e kernel itself...

Collaborator

jspitz commented Mar 9, 2020

I see. This could work for us.

Collaborator

jspitz commented Mar 9, 2020

On the other hand, I now learned that we have this in gloss-latin.ldf:

    \shorthandon {'}
    \bbl@activate {'}
    \declare@shorthand {latin} {'}
      {
        \mode_if_math:TF
          {
            \active@math@prime % defined in "latex.ltx"
            % This definition is differing from the primes of the unicode-math package.
            % TO DO: Make sure that the appearance of primes is the same as with the
            % unicode-math package if this package is loaded.
          }
          {
            \polyglossia_latin_put_acute:N
          }
      }

So if unicode-math provided a command that is equal to what the active prime normally does, we could also call that in the \mode_if_math:T part.
If unicode-math simply redef'ed \active@math@prime we wouldn't even need to do anything at all on the polyglossia side.

@davidcarlisle

@wspr the \ifmmode test is already present in the babel/polyglossia shorthand, if it can be coordinated.

The Latin prime ends up being

\latin@sh@'@ ->\mode_if_math:TF {\active@math@prime }{\polyglossia_latin_put_ac

So \active@math@prime is more or less your suggested \MathTick.

@davidcarlisle

@jspitz was 4 seconds quicker:-)

Contributor

wehro commented Mar 10, 2020

So if unicode-math provided a command that is equal to what the active prime normally does, we could also call that in the \mode_if_math:T part.

This command already exists: line 3406 of unicode-math-xetex.sty is
\cs_set_eq:NN ' \__um_scan_sup_prime:

It is possible to replace \active@math@prime by \__um_scan_sup_prime: in gloss-latin.ldf if unicode-math is loaded. But it does not work if there are two or more subsequent primes. As in the babel-catalan example above, unicode-math complains about a “double superscript”. That means that \__um_scan_sup_prime: does not recognize the following prime if it is set active by means of babel.
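As background on the "double superscript" complaint, here is a simplified illustration in plain LaTeX. It does not use unicode-math internals; it only shows what TeX sees when each prime becomes its own superscript instead of being collected together with the primes that follow it.

```latex
\documentclass{article}
\begin{document}
% All primes collected into ONE superscript: this is what f'' has to become.
$f^{\prime\prime}$

% Each prime emitted as its OWN superscript on the same base:
% TeX stops with "! Double superscript." Uncomment to reproduce.
% $f^{\prime}^{\prime}$
\end{document}
```

This is why a prime handler has to look ahead and recognise the next prime before it closes the superscript.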

Collaborator

jspitz commented Mar 11, 2020

Yes, but I figure it's possible to define a command that is able to do such (self-)scanning.

Contributor

wehro commented Mar 11, 2020

If I replace \peek_meaning_remove:NTF ' in line 3243 of unicode-math-xetex.sty with \peek_charcode_remove:NTF ' and replace \active@math@prime with \__um_scan_sup_prime: in gloss-latin.ldf, I get correct results with the example code for Latin above.
The \peek_charcode_remove:NTF command might be the solution, but I don't know if it has side effects.
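The difference between the two peek tests can be sketched in isolation. In the example below, `\PeekByMeaning` and `\PeekByCharcode` are hypothetical demo commands (they are not part of unicode-math or polyglossia), and the active definition of `'` is a stand-in for what a shorthand would install. It uses the non-removing variants `\peek_meaning:NTF` and `\peek_charcode:NTF` so that the peeked character is still typeset.

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical demo commands. The test token ' has catcode 12 ("other")
% at the time these definitions are read.
\cs_new_protected:Npn \PeekByMeaning
  { \peek_meaning:NTF ' { [meaning:~match] } { [meaning:~no~match] } }
\cs_new_protected:Npn \PeekByCharcode
  { \peek_charcode:NTF ' { [charcode:~match] } { [charcode:~no~match] } }
\ExplSyntaxOff

\begin{document}
% Case 1: the following ' is an ordinary character token.
\PeekByMeaning' \PeekByCharcode'

% Case 2: the following ' is active (stand-in for a shorthand definition).
\catcode`\'=\active
\def'{\char"27 }
\PeekByMeaning' \PeekByCharcode'
\end{document}
```

According to the documented semantics, the meaning test compares what the next token currently means, so it should match only in case 1, whereas the charcode test compares character codes and should match in both cases. That would explain why the charcode variant still recognises a following prime once babel or polyglossia has made `'` active.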

Collaborator

jspitz commented Mar 12, 2020

I still think it would be best if unicode-math redefined \active@math@prime so that it is able to accumulate primes properly. But I suppose that's @wspr's call.

Collaborator

jspitz commented Mar 20, 2020

@wspr taking up @wehro's idea, the following fixes the issue for me with XeTeX without having to change polyglossia:

--- /tmp/meld-tmp7x7cbuxl
+++ /home/juergen/texmf/tex/latex/testen/unicode-math-xetex.sty
@@ -3240,7 +3240,7 @@
 \cs_new:Nn \__um_scanprime_collect:N
  {
   \int_incr:N \l__um_primecount_int
-  \peek_meaning_remove:NTF '
+  \peek_charcode_remove:NTF '
    { \__um_scanprime_collect:N #1 }
    {
     \peek_meaning_remove:NTF \__um_scan_prime:
@@ -3403,6 +3403,7 @@
   \char_set_catcode_active:n {"2037}
   \cs_gset:Nn \__um_define_prime_chars:
    {
+    \cs_set_eq:NN \active@math@prime  \__um_scan_sup_prime:
     \cs_set_eq:NN '        \__um_scan_sup_prime:
     \cs_set_eq:NN ^^^^2032 \__um_scan_sup_prime:
     \cs_set_eq:NN ^^^^2033 \__um_scan_sup_dprime:

I don't know whether \peek_meaning_remove can simply be replaced by \peek_charcode_remove. If not, it also works to add the new test, like this:

--- /tmp/meld-tmp7x7cbuxl
+++ /home/juergen/texmf/tex/latex/testen/unicode-math-xetex.sty
@@ -3243,9 +3243,12 @@
   \peek_meaning_remove:NTF '
    { \__um_scanprime_collect:N #1 }
    {
-    \peek_meaning_remove:NTF \__um_scan_prime:
-     { \__um_scanprime_collect:N #1 }
-     {
+    \peek_charcode_remove:NTF '
+    { \__um_scanprime_collect:N #1 }
+    {
+     \peek_meaning_remove:NTF \__um_scan_prime:
+      { \__um_scanprime_collect:N #1 }
+      {
       \peek_meaning_remove:NTF ^^^^2032
        { \__um_scanprime_collect:N #1 }
        {
@@ -3295,6 +3298,7 @@
        }
      }
    }
+  }
  }
 \cs_new:Npn \__um_scan_backprime:
  {
@@ -3403,6 +3407,7 @@
   \char_set_catcode_active:n {"2037}
   \cs_gset:Nn \__um_define_prime_chars:
    {
+    \cs_set_eq:NN \active@math@prime  \__um_scan_sup_prime:
     \cs_set_eq:NN '        \__um_scan_sup_prime:
     \cs_set_eq:NN ^^^^2032 \__um_scan_sup_prime:
     \cs_set_eq:NN ^^^^2033 \__um_scan_sup_dprime:

The fix in unicode-math-luatex.sty is analogous.

Contributor

wspr commented Mar 20, 2020 via email

Collaborator

jspitz commented Mar 20, 2020

Crazy as crazy can be, indeed.
Thanks. Marking this fixed here, as there's nothing left to do on the polyglossia side.

jspitz added and then removed the label "FIXED IN DEV" (This bug is fixed for the next release) on Mar 20, 2020
jspitz closed this as completed on Mar 25, 2020