Frame label does not accept Chinese characters (UnicodeDecodeError: 'ascii' codec can't decode byte) #11

Closed
ssendeavour opened this issue Oct 20, 2013 · 3 comments
@ssendeavour

A frame label cannot contain Chinese characters. It gives the following error:

*** Error while highlighting:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 25: ordinal not in range(128)
(file "/usr/lib/python2.7/codecs.py", line 351, in write)
(./sample.out.pyg)
Runaway argument?
commandchars={}
! File ended while scanning use of \FV@GetKeyValues.

\par
l.33 \end{minted}

? x
No pages of output.
Transcript written on sample.log.

Chinese characters elsewhere in the document work as expected, with no errors.

In the Pygments documentation I read that there is an encoding option. I ran pygmentize from the command line with -P encoding=UTF8 plus the options that minted passes to pygmentize (as found in sample.log), but it doesn't work. It seems that this is Pygments' fault, but I am not sure.

My environment:
Ubuntu 12.04 64bit
TeX Live 2013
XeTeX 3.1415926-2.5-0.9999.3-2013060708 (TeX Live 2013)
python 2.7.3

Sample code to reproduce the error follows. You will need to install a Chinese font and the LaTeX package xeCJK to compile it. I compile it with this command:

xelatex -shell-escape -8bit sample.tex

\documentclass[a4paper,12pt]{article}
\usepackage{xltxtra}
\usepackage{minted}
% better handle Chinese, set Chinese font
\usepackage{xeCJK}

% this is a free Chinese font, available in Ubuntu as ttf-wqy-microhei
\setCJKmainfont{WenQuanYi Micro Hei}

\begin{document}

code sample,示例代码

\begin{itemize}

    \item with Verbatim
        \begin{Verbatim}[label=main函数, frame=lines, tabsize=4]
int main(void)
{
    return 0;
    // some Chinese 中文
}
        \end{Verbatim}

    \item with minted
        %\begin{minted}[frame=lines, label=main, tabsize=4]{cpp}
        \begin{minted}[frame=lines, label=main函数]{cpp}
int main(void)
{
    return 0;
    // some Chinese 中文
}
        \end{minted}

    \end{itemize}

\end{document}
@gpoore
Owner

gpoore commented Oct 20, 2013

The fact that you're getting a Python error means that things are going wrong in Pygments. That may or may not indicate that this is a Pygments issue, though.

minted is running pygmentize, and somewhere between LaTeX and Pygments there are character encoding issues. The first thing you need to try is downloading the latest minted.sty (v2.0alpha2), and using encoding=utf8 (or whatever is appropriate for your file) as the first optional argument for the minted environment. (You can just put the new minted.sty in the same directory as your document for testing purposes.)

If that doesn't work, there may be system encoding issues. Can you type/paste these Chinese characters into the terminal? What is your terminal's encoding? (One way to get that is to open a terminal, start Python, and then type import sys; sys.stdout.encoding.)

It's possible that even if all that is fine, pygmentize won't accept anything but ASCII from the command line. This Pygments issue suggests that that may indeed be the case. If that's the cause of the problem, then this is a Pygments issue.

If possible, you could also try all of this with Python 3.2+.

One workaround is to put your label in a macro, say \newcommand{\mylabel}{main函数}, and then in the minted optional argument use label=\mylabel.
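
As a minimal sketch, the macro workaround could look like the following when applied to the sample from the original report. The preamble and the font name are copied from that sample and are assumptions about the reader's setup; \mylabel is just the illustrative macro name suggested above.

```latex
\documentclass[a4paper,12pt]{article}
\usepackage{xltxtra}
\usepackage{minted}
\usepackage{xeCJK}
% Font taken from the original sample; substitute any installed CJK font.
\setCJKmainfont{WenQuanYi Micro Hei}

% The label text lives in a macro, and the minted optional
% argument refers to it by name instead of containing the
% Chinese characters directly.
\newcommand{\mylabel}{main函数}

\begin{document}

\begin{minted}[frame=lines, label=\mylabel]{cpp}
int main(void)
{
    return 0;
    // some Chinese 中文
}
\end{minted}

\end{document}
```

This would be compiled the same way as the original sample, with xelatex -shell-escape -8bit sample.tex.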

@ssendeavour
Author

Thanks for your quick response.

I have solved my problem with your suggestion.

  1. I first tried the latest minted.sty (v2.0alpha2) with encoding=utf8 as the first optional argument; it does not work.
  2. My terminal's encoding is UTF8, and the environment variable LANG=en_US.UTF8. I can copy/paste Chinese into it.
  3. Your workaround works.
  4. Using Python 3 works, but encoding=utf8 has to be given as the first optional argument to the minted environment. I removed the python-pygments package installed from the Ubuntu repository, installed python3, downloaded Pygments-1.6.tar.gz from pypi.python.org, extracted it, and installed it with sudo python3 setup.py install. This way python remains linked to python2.7, and pygmentize runs under python3 (the first line of /usr/local/bin/pygmentize is set to #!/usr/bin/python3).

The latest minted.sty (v2.0alpha2) is also required. minted 1.7 does not work; it gives this error:

! Package keyval Error: encoding undefined.

See the keyval package documentation for explanation.
Type H for immediate help.
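
For reference, the configuration from item 4 could be sketched as below. It assumes minted.sty v2.0alpha2 sits next to the document and that pygmentize runs under Python 3, as described above; the preamble and font name are those of the original sample.

```latex
\documentclass[a4paper,12pt]{article}
\usepackage{xltxtra}
\usepackage{minted}  % assumed to be v2.0alpha2; minted 1.7 rejects the encoding key
\usepackage{xeCJK}
\setCJKmainfont{WenQuanYi Micro Hei}

\begin{document}

% encoding=utf8 is given as the first optional argument,
% and the Chinese label is written directly in the options.
\begin{minted}[encoding=utf8, frame=lines, label=main函数]{cpp}
int main(void)
{
    return 0;
    // some Chinese 中文
}
\end{minted}

\end{document}
```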

@gpoore
Owner

gpoore commented Oct 21, 2013

Thanks for the nice summary. It looks like the Pygments issue I referenced is responsible for the problem under Python 2.7.

minted 1.7 won't work because it doesn't support the encoding option; only the latest versions of minted support it.

I am closing this issue since the only existing issue is on the Pygments side. I will also add a note in the documentation for the next release about this use case.

@gpoore gpoore closed this as completed Oct 21, 2013