This repository has been archived by the owner on Jan 13, 2024. It is now read-only.

replace unicode characters in latex produced by nbconvert
sdpython committed Aug 30, 2015
1 parent 6063f92 commit c0a262e
Showing 3 changed files with 42 additions and 8 deletions.
18 changes: 18 additions & 0 deletions _doc/sphinxdoc/source/blog/2015/2015-08-30_unicode_notebook.rst
@@ -0,0 +1,18 @@


.. index:: babel, sphinx, issue

.. blogpost::
    :title: Why do I see inverted question marks in a notebook converted into PDF?
    :keywords: latex
    :date: 2015-08-30
    :categories: sphinx, notebook

    The function :func:`process_notebooks <pyquickhelper.helpgen.process_notebooks.process_notebooks>`
    still uses the executable
    `pdflatex <https://en.wikipedia.org/w/index.php?title=PdfTeX&redirect=no>`_
    and not
    `xetex <https://en.wikipedia.org/wiki/XeTeX>`_
    which can handle inline unicode characters.
    That is why the function
    :func:`post_process_latex <pyquickhelper.helpgen.post_process.post_process_latex>`
    replaces them with *¿*.
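To see which characters of a notebook would end up as *¿*, the text can be scanned before conversion. The following is only a minimal sketch: the helper name `find_replaced_characters` is hypothetical and not part of pyquickhelper, and the threshold of 255 is taken from the `ord(c) >= 255` test in this commit.

```python
import unicodedata


def find_replaced_characters(text, threshold=255):
    """Map each character with a code point >= threshold to its Unicode name.

    Hypothetical helper: the threshold mirrors the test used in this commit.
    """
    found = {}
    for c in text:
        if ord(c) >= threshold and c not in found:
            # unicodedata.name accepts a default for unnamed code points.
            found[c] = unicodedata.name(c, "UNKNOWN")
    return found


report = find_replaced_characters("prix: 10 \u20ac, \u4e2d")
for char, name in report.items():
    print("U+{0:04X} {1}".format(ord(char), name))
```

Accented Latin-1 letters such as *é* (code point 233) fall below the threshold and are not reported.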
4 changes: 2 additions & 2 deletions _unittests/ut_helpgen/test_notebooks_bug_utf8.py
@@ -63,9 +63,9 @@ def test_notebook_utf8(self):
 fLOG(_)
 assert os.path.exists(_[0])

-with open(os.path.join(temp, "seance4_projection_population_correction.tex"), "r", encoding="utf8") as f:
+with open(os.path.join(temp, "simple_example.tex"), "r", encoding="utf8") as f:
     content = f.read()
-exp = "seance4_projection_population_correction_50_0.pdf"
+exp = "textquestiondown"
 if exp not in content:
     raise Exception(content)

28 changes: 22 additions & 6 deletions src/pyquickhelper/helpgen/post_process.py
@@ -414,10 +414,32 @@ def post_process_latex(st, doall, info=None):
 .. versionchanged:: 1.2
     remove ascii character in *[0..31]* in each line, replace them by space.
+.. index:: chinese characters, latex, unicode
+@warning Unicode characters, Chinese characters in particular, are an issue because the latex compiler
+prompts on those if the necessary packages are not installed.
+`pdflatex <https://en.wikipedia.org/w/index.php?title=PdfTeX&redirect=no>`_
+does not accept inline Chinese
+characters, `xetex <https://en.wikipedia.org/wiki/XeTeX>`_
+should be used instead:
+see `How to input Traditional Chinese in pdfLaTeX <http://tex.stackexchange.com/questions/200449/how-to-input-traditional-chinese-in-pdflatex>`_.
+Until this is implemented, unicode characters are unfortunately replaced
+in this function.
 @todo Check latex is properly converted in HTML files
 """
 fLOG(" ** enter post_process_latex", doall, "%post_process_latex" in st)

+def clean_unicode(c):
+    if ord(c) >= 255:
+        return "\\textquestiondown"
+    else:
+        return c
+
+lines = st.split("\n")
+st = "\n".join("".join(map(clean_unicode, line)) for line in lines)
+
 # we count the number of times we have \$ (which is unexpected unless the
 # currency is used.
 dollar = st.split("\\$")
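The replacement step added in this hunk can be tried in isolation. What follows is a minimal sketch, not the code from the commit: the function name `replace_wide_chars` is hypothetical, and the macro is written as `\textquestiondown{}` (an assumption that departs from the commit) so that a letter following the replaced character is not absorbed into the control word by LaTeX.

```python
def replace_wide_chars(text, threshold=255, macro="\\textquestiondown{}"):
    """Replace every character whose code point is >= threshold by a LaTeX macro.

    Hypothetical helper: the threshold mirrors ``ord(c) >= 255`` from the
    commit, the trailing ``{}`` in the macro is an addition of this sketch.
    """
    return "".join(macro if ord(c) >= threshold else c for c in text)


sample = "caf\u00e9 \u4e2d\u6587 ok"
print(replace_wide_chars(sample))
```

Characters below the threshold, such as *é*, pass through unchanged, and the unit test in this commit would still find the substring `textquestiondown` in the output.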
@@ -512,12 +534,6 @@ def post_process_latex(st, doall, info=None):
 raise HelpGenException(
     "unable to add new instructions usepackage in file {0}".format(info))

-if "\\usepackage[utf8]" in st:
-    st = st.replace(
-        "\\usepackage[utf8]{inputenc}", "\\usepackage{silence}\\usepackage{ucs}\\usepackage[utf8x]{inputenc}")
-    st = st.replace(
-        "\\DeclareUnicodeCharacter{00A0}{\\nobreakspace}", "%\\DeclareUnicodeCharacter{00A0}{\\nobreakspace}")
-
 # SVG does not work unless it is converted (nbconvert should handle that
 # case)
 reg = re.compile("([\\\\]includegraphics[{].*?[.]svg[}])")