nbconvert: fixed latex characters not escaped properly in nbconvert #3951

Merged
merged 13 commits into from Aug 8, 2013

Conversation

jdfreder
Member

@jdfreder jdfreder commented Aug 8, 2013

No description provided.

@jdfreder
Member Author

jdfreder commented Aug 8, 2013

@ellisonbg this fixes the output for the IPython.ipynb notebook you pointed me to.

@ivanov
Member

ivanov commented Aug 8, 2013

Travis still failing on 2.6 and 2.7

FAIL: escape_latex test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python2.7/local/lib/python2.7/site-packages/IPython/testing/_paramtestpy2.py", line 54, in run_parametric
    next(testgen)
  File "/home/travis/virtualenv/python2.7/local/lib/python2.7/site-packages/IPython/nbconvert/filters/tests/test_latex.py", line 37, in test_escape_latex
    yield self._try_escape_latex(test[0], test[1])
  File "/home/travis/virtualenv/python2.7/local/lib/python2.7/site-packages/IPython/nbconvert/filters/tests/test_latex.py", line 42, in _try_escape_latex
    self.assertEqual(escape_latex(test), result)
AssertionError: 'How are \\{\\textbackslash\\}you doing today?' != 'How are \\textbackslashyou doing today?'
    "'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'" = '%s != %s' % (safe_repr('How are \\{\\textbackslash\\}you doing today?'), safe_repr('How are \\textbackslashyou doing today?'))
    "'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'" = self._formatMessage("'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'", "'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'")
>>  raise self.failureException("'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'")

@fperez
Member

fperez commented Aug 8, 2013

This gives me three failures here (linux, python 2.7):

======================================================================
FAIL: escape_latex test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fperez/usr/lib/python2.7/site-packages/IPython/testing/_paramtestpy2.py", line 54, in run_parametric
    next(testgen)
  File "/home/fperez/usr/lib/python2.7/site-packages/IPython/nbconvert/filters/tests/test_latex.py", line 37, in test_escape_latex
    yield self._try_escape_latex(test[0], test[1])
  File "/home/fperez/usr/lib/python2.7/site-packages/IPython/nbconvert/filters/tests/test_latex.py", line 42, in _try_escape_latex
    self.assertEqual(escape_latex(test), result)
AssertionError: 'How are \\{\\textbackslash\\}you doing today?' != 'How are \\textbackslashyou doing today?'
    "'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'" = '%s != %s' % (safe_repr('How are \\{\\textbackslash\\}you doing today?'), safe_repr('How are \\textbackslashyou doing today?'))
    "'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'" = self._formatMessage("'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'", "'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'")
>>  raise self.failureException("'How are \\\\{\\\\textbackslash\\\\}you doing today?' != 'How are \\\\textbackslashyou doing today?'")


======================================================================
FAIL: Generate PDFs with graphics if notebooks have spaces in the name?
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fperez/usr/lib/python2.7/site-packages/IPython/nbconvert/tests/test_nbconvertapp.py", line 91, in test_filename_spaces
    assert os.path.isfile('notebook with spaces.pdf')
AssertionError: 
    assert <module 'os' from '/usr/lib/python2.7/os.pyc'>.path.isfile('notebook with spaces.tex')
    assert <module 'os' from '/usr/lib/python2.7/os.pyc'>.path.isdir('notebook with spaces_files')
>>  assert <module 'os' from '/usr/lib/python2.7/os.pyc'>.path.isfile('notebook with spaces.pdf')

======================================================================
FAIL: Do post processors work?
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fperez/usr/lib/python2.7/site-packages/IPython/nbconvert/tests/test_nbconvertapp.py", line 103, in test_post_processor
    assert os.path.isfile('notebook1.pdf')
AssertionError: 
    assert <module 'os' from '/usr/lib/python2.7/os.pyc'>.path.isfile('notebook1.tex')
>>  assert <module 'os' from '/usr/lib/python2.7/os.pyc'>.path.isfile('notebook1.pdf')

----------------------------------------------------------------------

@minrk
Member

minrk commented Aug 8, 2013

We've been looking at it on HipChat, and have a new, simpler regex-free approach that should work better.

@@ -22,7 +22,7 @@
 #Latex substitutions for escaping latex.
 LATEX_SUBS = (
     (re.compile('\033\[[0-9;]+m'),''), # handle console escapes
-    (re.compile(r'\\'), r'\\textbackslash'),
+    (re.compile(r'\\'), r'{\\textbackslash}'),
Member

Removing this change makes all tests pass for me (only one test was failing with this PR: FAIL: escape_latex test).
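The failure can be reproduced in isolation. This is a minimal sketch, assuming a shortened, hypothetical substitution list (`SEQUENTIAL_SUBS`, not the full table from the PR) and an assumed input string. Because the patterns run one after another, the braces introduced by the backslash rule are themselves escaped by the later brace rules.

```python
import re

# Hypothetical, shortened substitution list for illustration only.
# Order matters: each pattern sees the output of the previous one.
SEQUENTIAL_SUBS = (
    (re.compile(r'\\'), r'{\\textbackslash}'),  # backslash first
    (re.compile(r'\{'), r'\\{'),                # then opening braces
    (re.compile(r'\}'), r'\\}'),                # then closing braces
)


def escape_sequential(text):
    """Apply each regex substitution in turn to the running result."""
    for pattern, replacement in SEQUENTIAL_SUBS:
        text = pattern.sub(replacement, text)
    return text


escaped = escape_sequential('How are \\you doing today?')
print(escaped)  # How are \{\textbackslash\}you doing today?
```

The printed string matches the left-hand side of the AssertionError in the tracebacks: the braces meant to delimit `\textbackslash` end up escaped as literal braces.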

@fperez
Member

fperez commented Aug 8, 2013

Mmh, now I'm seeing the last two of those three above also on master... Not good.

It seems the issue is the latex tikz package, which I don't have on this box. Do you know if it's necessary?

! LaTeX Error: File `tikz.sty' not found.

I installed the Ubuntu pgf package and those two spurious failures are now gone; the other one is still there.

@fperez
Member

fperez commented Aug 8, 2013

Ah @minrk, ok. So this one will get closed eventually, I take it?

@ivanov
Member

ivanov commented Aug 8, 2013

@jdfreder maybe include the notebook Brian pointed you toward (or the relevant portion of it) into the test suite - not sure what portion of it is causing things to fail

@minrk
Member

minrk commented Aug 8, 2013

Yup, @jdfreder had to run, but I think he plans to get to it tonight or tomorrow morning. Or I can do it in the morning if he doesn't get to it. The new approach is very simple.

(re.compile(r'"'), r"''"),
(re.compile(r'\.\.\.+'), r'\\ldots'),
)
# Latex substitutions for escaping latex.
Member

still need to apply the first and last of these, right?

Member Author

I don't think we can apply the first without a regular expression. We already make that substitution in ansi.strip_ansi, so I'll just call that here.

Member Author

The last one also needed extra logic, since it has to apply a multi-character replacement.
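A per-character lookup cannot see a run of dots, which is why the ellipsis rule needs its own pass. This is a minimal sketch, assuming a hypothetical helper `replace_ellipses` that reuses the `\.\.\.+` pattern quoted in the diff above.

```python
import re

# Pattern taken from the diff: three or more consecutive dots.
ELLIPSIS_RE = re.compile(r'\.\.\.+')


def replace_ellipses(text):
    r"""Replace each run of three or more dots with a single \ldots."""
    return ELLIPSIS_RE.sub(r'\\ldots', text)


print(replace_ellipses('wait... what.....'))  # wait\ldots what\ldots
```

Runs of fewer than three dots are left untouched, so ordinary sentence endings and `..` path segments are not affected.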

@jdfreder
Member Author

jdfreder commented Aug 8, 2013

@minrk updated

@minrk
Member

minrk commented Aug 8, 2013

I made a PR against your branch last night - should have both fixes.

@jdfreder
Member Author

jdfreder commented Aug 8, 2013

Sorry I didn't catch that; I rushed over here first thing in the morning.

@jdfreder
Member Author

jdfreder commented Aug 8, 2013

I'll go take a peek

This reverts commit 69adeb1.
This allows me to merge min's code to give him credit.
@jdfreder
Member Author

jdfreder commented Aug 8, 2013

@minrk I reverted my fix and merged yours so you could get credit 😁

@minrk
Member

minrk commented Aug 8, 2013

You didn't have to do that - it was just putting the regex replacements back in that mattered. But thanks :)

@jdfreder
Member Author

jdfreder commented Aug 8, 2013

Any reason to do the ansi sub using a regex instead of the ansi.strip_ansi filter/function?
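For reference, here is a stand-alone sketch of what the regex-based ANSI substitution amounts to. The function name `strip_ansi_sketch` is hypothetical; it is not IPython's `ansi.strip_ansi`, and only the pattern is taken from the "handle console escapes" entry quoted in this thread.

```python
import re

# Same pattern as the "handle console escapes" entry:
# ESC, '[', one or more digits or semicolons, then 'm'.
ANSI_COLOR_RE = re.compile('\033\\[[0-9;]+m')


def strip_ansi_sketch(text):
    """Remove ANSI color escape sequences such as ESC[0;31m."""
    return ANSI_COLOR_RE.sub('', text)


colored = '\033[0;31mError:\033[0m division by zero'
print(strip_ansi_sketch(colored))  # Error: division by zero
```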

    return_text = text
    for pattern, replacement in LATEX_SUBS:
        return_text = pattern.sub(replacement, return_text)
    return return_text
Member

remove square brackets here

(re.compile(r'\^'), r'\^{}'),
(re.compile(r'"'), r"''"),
LATEX_RE_SUBS = (
(re.compile('\033\[[0-9;]+m'), ''), # handle console escapes
Member

remove this one - it's redundant

-    for pattern, replacement in LATEX_SUBS:
-        return_text = pattern.sub(replacement, return_text)
-    return return_text
+    text = ''.join([LATEX_SUBS.get(c, c) for c in text])
Member

Square brackets not needed; a generator expression will work by itself.
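This is a minimal sketch of the dict-lookup approach under review, assuming a shortened, hypothetical mapping (`CHAR_SUBS`; the PR's full table is not shown in this thread). Each input character is looked up exactly once, so replacement text is never re-escaped, and `str.join` accepts the generator expression directly.

```python
# Hypothetical subset of a character-to-LaTeX mapping, for illustration only.
CHAR_SUBS = {
    '\\': r'\textbackslash{}',
    '{': r'\{',
    '}': r'\}',
    '%': r'\%',
    '_': r'\_',
}


def escape_chars(text):
    """Escape each character independently in a single pass."""
    # No square brackets: join() consumes the generator expression as-is.
    return ''.join(CHAR_SUBS.get(c, c) for c in text)


print(escape_chars('50% of {a_b}\\c'))  # 50\% of \{a\_b\}\textbackslash{}c
```

Unlike the sequential regex loop, the braces in `\textbackslash{}` survive here because the output is never scanned again.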


def escape_latex(text):
    """
    Escape characters that may conflict with latex.

    Remove ansi codes and escape characters that may conflict with latex.
Member

update this to reflect that it doesn't remove ansi codes

@@ -443,7 +443,7 @@ Note: For best display, use latex syntax highlighting. =))
 ((* macro custom_verbatim(text) -*))
 \begin{alltt}
 ((*- if resources.sphinx.centeroutput *))\begin{center} ((* endif -*))
-((( text | wrap_text(wrap_size) )))
+((( text | wrap_text(wrap_size) | escape_latex )))
Member

this should not be wrapped - it's already in a verbatim environment, wrapping messes that up.

Member Author

This was added here as a safeguard. Without it, the output of a long sequence of characters runs outside of the table and messes things up. Are you sure you want me to remove it? Removing it would prevent users from outputting a long base64 string, a long byte string, etc. LaTeX isn't smart enough to break long words.
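To illustrate the concern, here is a sketch using Python's standard `textwrap` module as a stand-in for the `wrap_text` filter (that the filter behaves the same way is an assumption). With the default `break_long_words=True`, a single unbroken token is split to fit the width, which LaTeX would not do on its own.

```python
import textwrap

# A long unbroken token, similar to a base64 payload.
long_token = 'A' * 25

# Default behaviour: words longer than the width are split across lines.
wrapped = textwrap.fill(long_token, width=10)
print(wrapped)

# With break_long_words=False the token is left intact and overflows.
unwrapped = textwrap.fill(long_token, width=10, break_long_words=False)
print(unwrapped)
```

The first call yields three lines of at most 10 characters; the second returns the 25-character token unchanged, which is the overflow case described above.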

Member

hm, I guess not. It's definitely doing the wrong thing, even with relatively short lines. Since it fixes a real issue, we can look at it more carefully at a later point. Shall I go ahead and merge this now, then?

Member Author

👍

This reverts commit 0b94505.
@jdfreder
Member Author

jdfreder commented Aug 8, 2013

👍

minrk added a commit that referenced this pull request Aug 8, 2013
nbconvert: fixed latex characters not escaped properly in nbconvert

use simple dict lookup process instead of sequential regular expressions that confuse each other.
@minrk minrk merged commit 87855e3 into ipython:master Aug 8, 2013
@jakobgager jakobgager mentioned this pull request Aug 8, 2013
7 tasks
@jdfreder jdfreder deleted the custom_verbate_esc_tex branch March 10, 2014 18:42
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014