Improve LaTeXing of strings #11498

novoselt · 2011-06-16T03:26:26Z

See the attached PDF for an example of a class that has no _latex_. All commands are repeated twice, one with Sage-4.7 and the second one with the patch applied. In both cases typeset checkbox on the top of the worksheet was tuned on.

Also, I invite you to try

RubiksCube()

in the notebook with typeset mode on before and after the patch - such situations are precisely the ones that I wanted to solve (they are so bad, that printout does not reflect them accurately).

It is possible that user classes will not have customized latex method and there are some standard classes that use their string representation. If this representation includes multiple lines, the result in most cases is not very good and sometimes quite horrible, making it unpleasant to use the notebook in typeset mode. The patch aims to fix this situation and clean up latex module along the way.

The major change is in sage.latex.str_function. Also sage.latex.JSMath.eval is simplified a bit and does not alter the code almost at all anymore. The rest are mostly doctest adjustments and some code clean-up and simplifications which happened while I was going through it to understand how it works.

Apply:

attachment: trac_11498_LaTeX.patch

CC: @kcrisman @jhpalmieri

Component: user interface

Keywords: sd31

Author: Andrey Novoseltsev

Reviewer: John Palmieri

Merged: sage-4.7.2.alpha0

Issue created by migration from https://trac.sagemath.org/ticket/11498

The text was updated successfully, but these errors were encountered:

novoselt · 2011-06-16T03:28:30Z

comment:1

Attachment: String_LaTeXing_--_Sage.pdf.gz

I still need to work on these failures:

	sage -t -long devel/sage-main/sage/graphs/generic_graph.py # 1 doctests failed
	sage -t -long devel/sage-main/sage/combinat/free_module.py # 13 doctests failed
	sage -t -long devel/sage-main/sage/modular/hecke/algebra.py # 1 doctests failed
	sage -t -long devel/sage-main/doc/de/tutorial/latex.rst # 1 doctests failed
	sage -t -long devel/sage-main/sage/sets/set.py # 1 doctests failed
	sage -t -long devel/sage-main/sage/misc/html.py # 2 doctests failed
	sage -t -long devel/sage-main/doc/en/tutorial/latex.rst # 1 doctests failed

novoselt · 2011-06-26T18:59:59Z

comment:3

By the way, the issue of this ticket was discussed here:

http://groups.google.com/group/sage-devel/browse_thread/thread/9aee79aaa40c8390/e5d55c0b800775b1

So John is cc-ed due to his explicit request, hint-hint ;-)

jhpalmieri · 2011-06-29T21:10:41Z

comment:4

Overall, this looks pretty good, but it could use some work. Here are some comments.

in html.py, the comments "We work around a limitation of jsmath (it can't typeset '\texttt')::" should be deleted, since those doctests don't involve texttt anymore. (Do we even need to do the replacement s.replace('\\texttt','\\hbox') now, or can that code be deleted?)
in latex.py, for bool_function and None_function, "\mathrm{%s}" is preferable to "{\rm %s}". Actually, I think we can just return "\mathrm{%s}" always, regardless of notebook vs. command line distinctions. Do you know why we had two versions for bool_function before? It looks okay to me without the mbox...
a few typos and rewordings:

@@ -239,7 +239,7 @@ def str_function(x):
     If ``x`` contains only digits with, possibly, a single decimal point and/or
     a sign in front, it is considered to be its own representation. Otherwise
     each line of ``x`` is wrapped in a ``\verb`` command and these lines are
-    assembled in a left-justified array. This gives to complicated string the
+    assembled in a left-justified array. This gives to complicated strings the
     closest look to their "terminal representation".
    
     .. warning:: Such wrappers **cannot** be used as arguments of LaTeX
@@ -289,7 +289,7 @@ def str_function(x):
     # There is a bug in verb-space treatment in jsMath...
     spacer = "\\phantom{%s}"
     # \phantom{\verb!%s!} is more accurate and it works, but it is not a valid
-    # LaTeX and may cause problems, so let's live with the above variant untill
+    # LaTeX and may cause problems, so let's live with the above variant until
     # spaces are properly treated in jsMath/MathJax and we don't need to worry.
     lines = []
     for line in x.split("\n"):

another comment about the old code: I think that the output from dict_function looks better with no extra space after the colon, so I would suggest the following, but take a look yourself and see what you think:

@@ -334,12 +334,10 @@ def dict_function(x):
                \left[\sin\left(z^{2}\right), \frac{1}{2} \, y\right]\right\}
     """
     return "".join([r"\left\{",
-                    ", ".join(r"%s :\: %s" % (latex(key), latex(value))
+                    ", ".join(r"%s : %s" % (latex(key), latex(value))
                               for key, value in x.iteritems()),
                     r"\right\}"])

Also, in JSMath().eval, if x is already a LaTeX expression, then x=str(x) doesn't do anything, and neither does x=latex(x), so I think we can simplify it more:

@@ -1680,11 +1681,7 @@ class JSMath:
             sage: JSMath().eval(type(3), mode='inline')
             <html>...\verb|&lt;type|\phantom{x}\verb|'sage.rings.integer.Integer'&gt;|</span></html>
         """
-        # If x is already a LaTeX expression, i.e. the output of latex(blah),
-        # we will treat it as a string, so that we can see the code itself.
-        if isinstance(x, LatexExpr):
-            x = str(x)
-        # Now get a regular LaTeX representation of x...
+        # Get a regular LaTeX representation of x...

Finally, the output in one of the doctests,

            <html>...\verb|&lt;type|\phantom{x}\verb|'sage.rings.integer.Integer'&gt;|</span></html>

looks like bad latex: the "<" inside of the \verb environment shouldn't be typeset correctly, but it actually seems to work. I wonder why...

When we switch to MathJax (#9774), we'll have to redo some of this.

novoselt · 2011-06-30T03:14:15Z

comment:5

Thanks for the comments!

I don't know why there are differences for different modes, this is the first time I am working with these modules. I agree that the fewer are differences the better and will also remove the extra space.

Here is the reason for extra string conversion:

sage: s = "asdf fs"
sage: l = latex(s)
sage: print l
sage: print latex(l)
sage: print latex(str(l))
\verb|asdf|\phantom{x}\verb|fs|
\verb|asdf|\phantom{x}\verb|fs|
\verb"\verb|asdf|\phantom{x}\verb|fs|"

As I understand, the idea is that when user types latex(x) in the notebook, then (s)he wants to see the actual LaTeX code. So it will be wrapped the second time and displayed.

Regarding < - it is not produced in normal LaTeX but can happen only in jsMath. For some reason it parses < even inside verb and the old solution was to insert extra spaces (which somehow prevents it from this mistake), but this alters the output and looks especially bad on typesetting things like >=. Replacing it with HTML code seems to work, I don't know why and I don't really care - once it is fixed upstream, we will see a doctest break and can remove this workaround. (Hopefully we can also get rid of phantoms eventually, but in MathJax the problem was still present two weeks ago. I don't think it is possible to write a doctest that will detect fixing the spacing issue upstream.)

novoselt · 2011-07-05T19:44:20Z

Updated version.

jhpalmieri · 2011-07-06T02:36:24Z

comment:6

Attachment: trac_11498_improve_string_LaTeXing.patch.gz

I'm mostly happy with this. Here's a slightly different version (a full patch plus a patch showing the difference between the old one and the new one); I think this does a better job of preserving the original behavior with respect to combinatorial free modules. If you're happy with my changes, feel free to mark this as "positive review".

jhpalmieri · 2011-07-06T02:36:24Z

Reviewer: John Palmieri

jhpalmieri · 2011-07-06T02:36:47Z

Attachment: trac_11498_LaTeX.patch.gz

new version; apply only this patch

jhpalmieri · 2011-07-06T02:37:17Z

Attachment: trac_11498_delta.patch.gz

for reference only; difference between old patch and new one

novoselt · 2011-07-06T02:58:29Z

comment:7

Hi John, would you mind explaining to me how does your code work? I am not very familiar with regular expressions.

Also, how did you notice the new patch? It kind of annoys me that trac does not send messages when there are new patches attached to the tickets, so I usually leave some comment to trigger it. This time I was waiting since I didn't yet run tests on the whole library to see if anything got broke ;-)

jhpalmieri · 2011-07-06T04:02:22Z

comment:8

I happened to take a look at the ticket, so I saw the new patch; mine is based on that one.

The documentation for regular expressions in Python is here. In the code

if s.find('\\verb') != -1: 
    import re 
    s = re.sub("\\\\verb(.)(.*?)\\1", "\\2", s) 
    s = s.replace("\\phantom{x}", " ")

the second and third lines are the ones involving regular expressions. Line 3 is a search-and-replace command: it searches the argument s for a pattern \verb followed by a single character (that's the .), and putting that character in parentheses allows you to refer to it later as \1. Then there are any number of characters (that's the .*?, also enclosed in parentheses, so you can refer to it later as \2), followed by whatever the first character was (that's the \\1). It replaces all of this with the string in the second set of parentheses, \2.

There is one subtlety in this: the standard way to refer to any string of characters is .*, but the problem here is that in a string like

\verb|a|, \verb|b|

if we used .*, then the string \2 would match "a|, \verb|b" -- everything between the first "|" and the last "|": by default, regular expression matches use the longest match possible. Using .*? instead of .* tells it to match the shortest string possible instead, so it does two replacements: first \2 matches "a", and then it matches "b", turning this into

a, b

novoselt · 2011-07-06T04:38:46Z

comment:9

Thank you!

All tests pass, positive review!

jdemeyer · 2011-08-01T16:41:28Z

Merged: sage-4.7.2.alpha0

jdemeyer · 2011-12-15T10:18:54Z

comment:14

See #12156 for a related ticket.

novoselt added this to the sage-4.7.1 milestone Jun 16, 2011

novoselt added c: user interface labels Jun 16, 2011

novoselt assigned jasongrout, qed777 and williamstein Jun 16, 2011

This comment has been minimized.

Sign in to view

novoselt added the s: needs review label Jun 16, 2011

novoselt added s: positive review and removed s: needs review labels Jul 6, 2011

This comment has been minimized.

Sign in to view

jdemeyer modified the milestones: sage-4.7.1, sage-4.7.2 Jul 6, 2011

jdemeyer removed the s: positive review label Aug 1, 2011

jdemeyer closed this as completed Aug 1, 2011

This was referenced Sep 23, 2012

Remove jsMath <> workaround #13455

Closed

Fix LaTeXing of strings #14382

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve LaTeXing of strings #11498

Improve LaTeXing of strings #11498

novoselt commented Jun 16, 2011

novoselt commented Jun 16, 2011

This comment has been minimized.

novoselt commented Jun 26, 2011

jhpalmieri commented Jun 29, 2011

novoselt commented Jun 30, 2011

novoselt commented Jul 5, 2011

jhpalmieri commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

novoselt commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

novoselt commented Jul 6, 2011

This comment has been minimized.

jdemeyer commented Aug 1, 2011

jdemeyer commented Dec 15, 2011

Improve LaTeXing of strings #11498

Improve LaTeXing of strings #11498

Comments

novoselt commented Jun 16, 2011

novoselt commented Jun 16, 2011

This comment has been minimized.

novoselt commented Jun 26, 2011

jhpalmieri commented Jun 29, 2011

novoselt commented Jun 30, 2011

novoselt commented Jul 5, 2011

jhpalmieri commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

novoselt commented Jul 6, 2011

jhpalmieri commented Jul 6, 2011

novoselt commented Jul 6, 2011

This comment has been minimized.

jdemeyer commented Aug 1, 2011

jdemeyer commented Dec 15, 2011