[R Sweave] Cyrillic characters in plots #436

Closed
artemklevtsov opened this Issue Dec 12, 2012 · 7 comments

artemklevtsov commented Dec 12, 2012

Hi.

When I create a PDF, Cyrillic characters in plots are not rendered correctly. A simple example:

\documentclass[russian]{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[russian]{babel}

\begin{document}

<<"setup", include=FALSE>>=
options(width=65, digits=2)
pdf.options(encoding = "Cyrillic")
@
<<"pie-plot">>=
x <- gl(3, 4, 50, labels=c("Высшее", "Среднее", "Ср.-тех."))
pie(table(x), main="Образование")
@

\end{document}

When I try to generate the PDF from RStudio, I get the following output:

> grDevices::pdf.options(useDingbats = FALSE); require(knitr); opts_knit$set(concordance = TRUE); knit('111.Rnw')
Loading required package: knitr


processing file: 111.Rnw
  |>>>>>>>>>>>>>                                                    |  20%
  ordinary text without R code

  |>>>>>>>>>>>>>>>>>>>>>>>>>>                                       |  40%
label: setup (with options) 
List of 1
 $ include: logi FALSE

  |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>                          |  60%
  ordinary text without R code

  |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>             |  80%
label: pie-plot
Warning in replayPlot(x) : font width unknown for character 0xb2
Warning in replayPlot(x) : font metrics unknown for character 0xb2
Warning in replayPlot(x) : font metrics unknown for character 0xeb
Warning in replayPlot(x) : font metrics unknown for character 0xe1
Warning in replayPlot(x) : font metrics unknown for character 0xe8
Warning in replayPlot(x) : font metrics unknown for character 0xd5
Warning in replayPlot(x) : font metrics unknown for character 0xd5
Warning in replayPlot(x) : font width unknown for character 0xc1
Warning in replayPlot(x) : font metrics unknown for character 0xc1
Warning in replayPlot(x) : font metrics unknown for character 0xe0
Warning in replayPlot(x) : font metrics unknown for character 0xd5
Warning in replayPlot(x) : font metrics unknown for character 0xd4
Warning in replayPlot(x) : font metrics unknown for character 0xdd
Warning in replayPlot(x) : font metrics unknown for character 0xd5
Warning in replayPlot(x) : font metrics unknown for character 0xd5
Warning in replayPlot(x) : font width unknown for character 0xc1
Warning in replayPlot(x) : font metrics unknown for character 0xc1
Warning in replayPlot(x) : font metrics unknown for character 0xe0
Warning in replayPlot(x) : font metrics unknown for character 0xe2
Warning in replayPlot(x) : font metrics unknown for character 0xd5
Warning in replayPlot(x) : font metrics unknown for character 0xe5
Warning in replayPlot(x) : font width unknown for character 0xbe
Warning in replayPlot(x) : font width unknown for character 0x9e
Warning in replayPlot(x) : font width unknown for character 0xe9
Warning in replayPlot(x) : font width unknown for character 0x1
  |>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>| 100%
  ordinary text without R code


output file: /tmp/111.tex

[1] "111.tex"
> 
> 
Running pdflatex on 111.tex...completed

Created PDF: /tmp/111.pdf

Issues: 1 badbox

For this post, I launched RStudio with the command LANG=C rstudio. My default sessionInfo():

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=ru_RU.UTF-8       LC_NUMERIC=C               LC_TIME=ru_RU.UTF-8       
 [4] LC_COLLATE=C               LC_MONETARY=ru_RU.UTF-8    LC_MESSAGES=ru_RU.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=ru_RU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] Rcpp_0.10.1  tools_2.15.2
> packageVersion("knitr")
[1] ‘0.9’

The Rnw file is in UTF-8 encoding. My system locale is:

locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE=C
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=

Entering these commands directly in the RStudio console gives correct results:

x <- gl(3, 4, 50, labels=c("Высшее", "Среднее", "Ср.-тех."))
pie(table(x), main="Образование")

Thanks.

Results: [two attached images]

Owner

yihui commented Dec 12, 2012

Search for "Encoding" and read that section on this page: http://yihui.name/knitr/demo/graphics/

I guess the encoding should be CP1251. Please let me know if it works.
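
For reference, the encoding names accepted by pdf() correspond to .enc files in the enc directory of R's grDevices package (see ?postscript). A minimal sketch for listing them and checking for Cyrillic candidates, assuming a standard R installation; the variable names are only illustrative, and the presence of an encoding file does not guarantee that the PDF fonts contain the glyphs:

```r
# List the encoding files that ship with grDevices; pdf(encoding = ...)
# looks names up in this directory (the ".enc" suffix may be omitted).
enc_dir <- system.file("enc", package = "grDevices")
enc_files <- list.files(enc_dir, pattern = "\\.enc$")
print(enc_files)

# Encodings that cover Russian text, as named in ?postscript
cyrillic_candidates <- c("CP1251.enc", "KOI8-R.enc", "Cyrillic.enc")
print(cyrillic_candidates %in% enc_files)
```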

artemklevtsov commented Dec 13, 2012

A Windows encoding under Linux? OK, I tried that and it does not work.
A workaround is to use dev="png" ("cairo_pdf" also seems to work), but I still get some warnings (I use warning=FALSE to suppress them).
[two attached images]

artemklevtsov commented Dec 13, 2012

I tried the following code with various encodings, without success.

pdf("/tmp/graph.pdf", encoding="CP1251")
x <- gl(3, 4, 50, labels=c("Высшее", "Среднее", "Ср.-тех."))
pie(table(x), main="Образование")
dev.off()

So it seems this is not a knitr bug.
However,

cairo_pdf("/tmp/graph.pdf")
x <- gl(3, 4, 50, labels=c("Высшее", "Среднее", "Ср.-тех."))
pie(table(x), main="Образование")
dev.off()

works fine.

Why does knitr still try to process these characters with the PDF device when I set dev="png"?

Owner

yihui commented Dec 29, 2012

Because knitr records plots using the pdf() device, then redraws them using the device specified in the chunk option.

Since cairo_pdf() works better, you can try this:

<<setup, include=FALSE>>=
options(device = function(file, width = 7, height = 7, ...) {
  cairo_pdf(tempfile(), width = width, height = height, ...)
})
@

<<cyrillic, dev='cairo_pdf'>>=
x <- gl(3, 4, 50, labels=c("Высшее", "Среднее", "Ср.-тех."))
pie(table(x), main="Образование")
@
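
The record-then-redraw behaviour described in this comment can be sketched with base grDevices functions alone. This is only an illustration of the mechanism, not knitr's actual code: the second pdf() call is a stand-in for whatever device the chunk option names, and the plot uses an ASCII title so that it runs without font warnings.

```r
# Step 1: draw on a null PDF device and record the plot.
pdf(file = NULL)                      # no file is written
dev.control(displaylist = "enable")   # needed so recordPlot() can capture the plot
plot(1:10, main = "recorded on pdf(NULL)")
recorded <- recordPlot()
invisible(dev.off())

# Step 2: replay the recorded plot on the output device.
out <- tempfile(fileext = ".pdf")
pdf(out)                              # stand-in for the chunk's dev, e.g. png() or cairo_pdf()
replayPlot(recorded)
invisible(dev.off())

print(file.exists(out))
```

Because the text is first drawn on the recording device, characters that the recording device cannot handle may produce warnings regardless of the final output device, which is presumably why replacing the recording device via options(device = ...) helps.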

yihui closed this in bffba93 on Jan 2, 2013

gasyoun commented Jan 20, 2014

[attached image: r-russian-issues]

Sorry for intruding - but what does the "grouping" of Russian letters mean in a plot?

Owner

yihui commented Jan 20, 2014

@gasyoun Sorry I have no idea.

artemklevtsov commented May 25, 2014

@gasyoun This should work:

<<include = FALSE>>=
options(device = "cairo_pdf")
@

<<echo=FALSE>>=
# pass the data frame as the first argument; plot(data = cars, ...) fails because x is missing
plot(cars, xlab = "Скорость", ylab = "Дистанция")
@

Note: you should set the device option before the chunk that builds the plot.
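
As a possible complement, the dev chunk option used earlier in this thread can be set once for all chunks with knitr's opts_chunk$set(), instead of repeating dev='cairo_pdf' per chunk. A minimal sketch, assuming knitr is installed and R was built with cairo support; it would go in a setup chunk near the top of the Rnw file:

```r
library(knitr)

# cairo_pdf() is only usable if this R build has cairo support
if (!capabilities("cairo")) {
  warning("This R build lacks cairo support; cairo_pdf will not work.")
}

# Use cairo_pdf as the output device for every subsequent chunk
opts_chunk$set(dev = "cairo_pdf")
```

This changes only the output device; the options(device = ...) setting shown above is still what replaces the device used for recording.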

abnova added a commit to abnova/diss-floss that referenced this issue Sep 8, 2014

Some improvements: 1) fixed non-appearing of plots in PDF report (don't use fig.align - use fig.caption instead: rstudio/rmarkdown#148); 2) added knitr setting to use Cairo PDF device (for potential use in the future): yihui/knitr#436; 3) added absolute path for generated figures via fig.path.