Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempting to fix the curly brackets bug #166

Merged
merged 11 commits into from
Nov 4, 2023
Merged

Conversation

GinoGiotto
Copy link
Contributor

This is my attempt to fix issue #129. The solution I proposed in #129 (comment) is actually not very practical because latex translations get merged togheter pretty early and this makes it difficult to take advantage of that information. So I changed my mind and implemented @david-a-wheeler proposal instead #129 (comment). I took inspiration from an analogous functionality already present for HTML:

metamath-exe/src/mminou.c

Lines 549 to 588 in ae35c1b

// HTML mode
// The HTML mode is intended not to break inside quoted HTML tag
// strings. All HTML output should be called with this mode.
// Since we don't parse HTML this method is not perfect. Only double
// quotes are inspected, so all HTML strings with spaces must be
// surrounded by double quotes. If text quotes surround a tag
// with a quoted string, this code will not work and must be
// enhanced.
// Whenever we are inside of a quote, we change a space to ASCII 3 to
// prevent matching it. The reverse is done in the print2() function,
// where all ASCII 3's are converted back to space.
// Note added 20-Oct-02: tidy.exe breaks HREF quotes with new line.
// Check HTML spec - do we really need this code?
j = (long)strlen(multiLine);
// Do a bug check to make sure no real ASCII 3's are ever printed
for (i = 0; i < j; i++) {
if (multiLine[i] == QUOTED_SPACE) bug(1514); // Should never be the case
}
if (breakMatch1[0] == '\"') {
breakMatch1[0] = ' '; // Change to a space (the real break character)
// Scan string for quoted strings
quoteMode = 0;
// Don't put line breaks in anything inside
// _double_ quotes that follows an = sign, such as TITLE="abc def". Ignore
// _single_ quotes (which could be apostrophes). The HTML output code
// must be (and so far is) written to conform to this.
i = 0;
while (multiLine[i]) {
if (multiLine[i] == '"' && i > 0) {
if (!quoteMode && multiLine[i - 1] == '=')
quoteMode = 1;
else
quoteMode = 0;
}
if (multiLine[i] == ' ' && quoteMode)
multiLine[i] = QUOTED_SPACE;
i++;
}
} // if (breakMatch1[0] == '\"')

With this change the generated TeX of my minimal example seems to work as intended:

% This LaTeX file was created by Metamath on 22-Oct-2023 9:40 PM.
\documentclass{article}
\usepackage{amssymb} % amssymb must be loaded before phonetic
\usepackage{phonetic} % for \riota
\usepackage{mathrsfs} % for \mathscr
\usepackage{mathtools} % loads package amsmath
\usepackage{amsthm} % amsthm must be loaded after amsmath
\usepackage{accents} % accents should be loaded after mathtools
\theoremstyle{plain}
\newtheorem{theorem}{Theorem}[section]
\newtheorem{definition}[theorem]{Definition}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{axiom}{Axiom}
\allowdisplaybreaks[1] % Allow page breaks in {align}
\usepackage[plainpages=false,pdfpagelabels]{hyperref}
\hypersetup{colorlinks} % Get rid of boxes around links
\begin{document}

\begin{proof}
\begin{align}
  1 &&  & \text{ A A A A A A A A A A } \text{ A A A A A A A A A A }
    \notag \\ && & \qquad \text{ A A A A A A A A A A }
    \notag \\ && & \qquad \text{ A A A A A A A A A A }
\text{ A A A A A A A A A A }&(\mbox{\ref{eq:ax-1}})\notag
\end{align}
\end{proof}

\end{document}

I'm somewhat surprised by how simple this solution looks, and this makes me wonder whether it might be a false friend due to my own ignorance. So it most likely needs to be overhauled by an expert eye.

src/mminou.c Outdated Show resolved Hide resolved
src/mmwtex.c Outdated
// k counts the scope level we are in.
k = 0;
n = (long)strlen(texFull);
for (j = 1; j < n; j++) { // We don't need to check texFull[0].
Copy link
Contributor Author

@GinoGiotto GinoGiotto Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I wasn't sure what apporach would be best. I defined the texFull string as the line that is feeded to the printLongLine function. The checking cicle starts at the second character of texFull to avoid checking the unorthodox texFull[-1] != '\\'. The first character is texFull[0] = " " so we don't need to check it since it's fixed.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

htmRef, NULL),
" \\notag \\\\ && & \\qquad ", // Continuation line prefix
" ");
texFull = cat(" ", // texFull[0] should not be a "{" character.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I added a comment to point out that texFull[0] = " " should not be a "{" character, since it wouldn't be checked by the subsequent counting algorithm.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, and here it obviously will never be "{", since it is a blank space " ".

src/mmwtex.c Outdated
Comment on lines 3127 to 3147
// To avoid generating incorrect TeX, line breaking is forbidden inside
// scopes of curly braces. However breaking between '\{' and '\}' is
// allowed.
// The spaces that should not be matched with 'breakMatch' are
// temporarily changed to ASCII 3 before the 'printLongLine' procedure
// is called. The procedure rewraps the 'line' argument matching the
// unchanged spaces only, thus ensuring bad breaks will be avoided.
// The reverse is done in the 'print2()' function, where all ASCII 3's
// are converted back to spaces.
// k counts the scope level we are in.
k = 0;
n = (long)strlen(texFull);
for (j = 1; j < n; j++) { // We don't need to check texFull[0].
// We enter a non "\{" scope.
if (texFull[j] == '{' && texFull[j - 1] != '\\') k++;
// We escape a non "\}" scope.
if (texFull[j] == '}' && texFull[j - 1] != '\\') k--;
// If k > 0 then we are inside a scope.
if (texFull[j] == ' ' && k > 0) texFull[j] = QUOTED_SPACE;
}
printLongLine(texFull, " \\notag \\\\ && & \\qquad ", /* Continuation line prefix */ " ");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a lot of duplication with the previous part.
Then again the duplicated code was already there before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I unduplicated the brace count and the printLongLine call in a new commit.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@tirix
Copy link

tirix commented Oct 27, 2023

Maybe one (last?) remark would be to integrate your minimal example into an additional test in the tests directory, with an additional issue129.in, issue129.mm and issue129.expected.

Unfortunately I don't have the rights to merge, I guess @digama0 does.

@GinoGiotto
Copy link
Contributor Author

My only question is whether issue129.tex should be included as well, since in this case that's the file we are interested verifying. But run_test.sh doesn't seem to be designed to check the generated tex tho.

@tirix
Copy link

tirix commented Oct 31, 2023

Oh right, of course I have not thought so far.
That would have been a nice to have, I'm sorry I feel I have made you add a quite useless test.

@digama0
Copy link
Member

digama0 commented Nov 1, 2023

There is another test which has to check the result of an auxiliary file, and uses a trick calling cat inside metamath to get it to be validated. See the underscores test.

@GinoGiotto
Copy link
Contributor Author

I don't understand what's wrong. On my computer both attempts pass. Maybe it doesn't like backslashes?

@tirix
Copy link

tirix commented Nov 2, 2023

Based on this SO answer, line 113 of run_tests.sh, I would try replacing the cat "$result" with printf '%s' "$result" so that the backslashes are not interpreted.

@GinoGiotto GinoGiotto marked this pull request as draft November 3, 2023 20:49
@GinoGiotto
Copy link
Contributor Author

GinoGiotto commented Nov 4, 2023

(Accidentally pressed the close button)

Thanks for making it work, should be good to merge now.

I wonder if there is any difference between using more or cat in issue129.in

@GinoGiotto GinoGiotto closed this Nov 4, 2023
@GinoGiotto GinoGiotto reopened this Nov 4, 2023
@GinoGiotto GinoGiotto marked this pull request as ready for review November 4, 2023 08:37
@GinoGiotto
Copy link
Contributor Author

Btw sorry @tirix I superficially read your suggestion and made a mess.

@tirix
Copy link

tirix commented Nov 4, 2023

The final one looks good!
I already approved, I think only @digama0 can merge now.

@digama0 digama0 merged commit 8fedf16 into metamath:master Nov 4, 2023
4 checks passed
@GinoGiotto GinoGiotto deleted the latex branch November 4, 2023 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants