encoding issues for plots #172

Closed
romunov opened this Issue Mar 14, 2012 · 8 comments

romunov commented Mar 14, 2012

I've read the existing issues about text encoding in plots, and none of the solutions worked for me (my documents are encoded in UTF-8). If I set the correct pdf.options, I can get knit to finish, but the resulting plot has garbled text where localized characters are supposed to appear. I (we) would really appreciate it if you could put this on your priority list.

Owner

yihui commented Mar 14, 2012

I guess this is not closely related to knitr per se; it is a pdf() device problem (r-help or Stack Overflow may be better places to ask). Anyway, how did you set up your pdf.options()? I think the family argument is critical here.

romunov commented Mar 14, 2012

I use the encoding "CP1250", the setting that resolved the same problem for Sweave (see http://stackoverflow.com/questions/3434349/sweave-not-printing-localized-characters). I've tried different font families with no success.
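Since the open question is whether pdf() or knitr is at fault, one quick check is to draw the same characters straight to a pdf() device outside knitr. The following is only a minimal sketch: the output file, the title string, and the choice of "CP1250" are illustrative assumptions, and the result still depends on the locale and fonts of the machine.

```r
# Draw a plot title containing Slovene characters directly on a PDF device,
# bypassing knitr. Unicode escapes keep this script ASCII-only, so the
# encoding of the script file itself cannot interfere with the test.
out <- tempfile(fileext = ".pdf")   # hypothetical output location

# s, c, z with caron, written as escapes
title_text <- "\u0161\u010d\u017e - \u017e\u010d\u0161"

pdf(out, encoding = "CP1250")       # same encoding as passed to pdf.options()
plot(runif(100), main = title_text)
invisible(dev.off())

out  # open this file to inspect how the characters were rendered
```

If the characters are already wrong in this file, the cause is more likely the device settings than knitr.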

Owner

yihui commented Mar 14, 2012

Encoding problems are usually hard to deal with on Windows, where different locales use different encodings. Linux usually sticks with UTF-8, so it is much easier to work there. I just tried the CP1250 encoding under Ubuntu and it worked for me. On Windows, the key is to set options(encoding = 'UTF-8') before you call knit(); with that, I do not need any special settings like pdf.options(encoding = "CP1250") to get the correct PDF figure. You must make sure that your document is really encoded in UTF-8, though.

For character encodings, you can try:

knit(..., encoding = 'UTF-8')  # if the input document is encoded in UTF-8
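The advice above depends on the input file really being UTF-8, which can be checked before knitting. This is a rough sketch rather than part of knitr: `is_valid_utf8_file()` is a hypothetical helper name, and it relies on base R's `validUTF8()` (available since R 3.3.0). Note that a pure-ASCII file also passes, because ASCII is a subset of UTF-8.

```r
# Hypothetical helper: TRUE if the bytes of `path` form valid UTF-8.
is_valid_utf8_file <- function(path) {
  # Read the raw bytes so that no connection re-encoding takes place.
  bytes <- readBin(path, what = "raw", n = file.size(path))
  # Embedded NUL bytes suggest UTF-16/UTF-32, and rawToChar() would fail.
  if (any(bytes == as.raw(0))) return(FALSE)
  validUTF8(rawToChar(bytes))
}

# Usage: two tiny files holding the letter c-with-caron in each encoding.
utf8_file   <- tempfile()
cp1250_file <- tempfile()
writeBin(as.raw(c(0xc4, 0x8d, 0x0a)), utf8_file)    # UTF-8 bytes
writeBin(as.raw(c(0xe8, 0x0a)),       cp1250_file)  # CP1250 byte

is_valid_utf8_file(utf8_file)
is_valid_utf8_file(cp1250_file)
```

A file saved as CP1250 by the editor fails this check as soon as it contains one of the accented characters, which is the case where setting the encoding to UTF-8 would read it incorrectly.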

romunov commented Mar 14, 2012

You are right, it does work if I set the encoding to UTF-8 before calling knit(). However, there seems to be some overlap between č and other characters. Can you replicate this? This is the script I'm using.

\documentclass[a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[slovene]{babel}
\SweaveOpts{out.width = "\\textwidth"}

\title{\texttt{knitr} ima z vključevanjem šumnikov \\ v risbe še vedno probleme}
\author{Roman Luštrik}
\date{\today}

\begin{document}

\maketitle

% setup chunk for knitr
<<echo=FALSE>>=
options(width = 60)
pdf.options(encoding = "CP1250")
#options(encoding="native.enc")
@

<<>>=
x <- runif(100)
mean(x)
y <- runif(100)
mean(y)
cor(x, y)
@

Povprečje spremenljivke \texttt{x} je \Sexpr{mean(x)}, \texttt{y} pa
\Sexpr{mean(y)}.

<<fig=TRUE>>=
par(mfrow = c(2,2))
plot(x, main = "mal je že še stlačeno")
#plot(x, pch = 4)
abline(h = mean(x))
hist(x, col = "light blue")

#plot(y, main = "ščćž - žćčš")
plot(y)
abline(h = mean(y))
hist(y, col = "light blue")
@

\newpage
<<>>=
sessionInfo()
@

\end{document}

Owner

yihui commented Mar 15, 2012

I do not see any overlap on my Windows machine, but I do see č become c. Here is a screenshot: [screenshot not included]

romunov commented Mar 15, 2012

This is probably due to your different locale. Well, encoding sure keeps us busy. Fun. :)

yihui closed this in b7e6964 Apr 22, 2012

Contributor

klmr commented Mar 23, 2015

Since this only happens in knitr and not when creating the plots directly inside R, how is this not a bug in knitr? Wouldn’t it be better if pdf.options were set by knitr internally rather than having to be set by the user?

yihui added a commit that referenced this issue Oct 12, 2016

dawidh15 commented Apr 5, 2018

Here is a workaround:

If you have a source file, like source.R, save it in the system default encoding. Then fix any characters that were damaged by the encoding conversion.

Then, when you call the functions in source.R to plot the data from your R Sweave or R Markdown files (even if those files are encoded in UTF-8), the characters will display correctly.

This worked for me after I had tried, without success, different encoding files via the chunk option dev.args = list(encoding = <file>.enc).
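For reference, the chunk option mentioned above (which did not help in this case) could be set globally roughly as follows. This is only a sketch: "CP1250.enc" is an assumed example of an encoding file shipped with grDevices, and the block does nothing if knitr is not installed.

```r
# Arguments that knitr forwards to the graphics device of each chunk.
# "CP1250.enc" is an illustrative choice; use the file for your locale.
dev_args <- list(encoding = "CP1250.enc")

if (requireNamespace("knitr", quietly = TRUE)) {
  # Apply to all following chunks that use the pdf device.
  knitr::opts_chunk$set(dev = "pdf", dev.args = dev_args)
}
```

The same option can be given in a single chunk header instead, which limits the change to that one figure.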
