Caching does not work with R-devel when cache path contains special characters (R-4.0.0) #1840

Closed
2 tasks done
jarauh opened this issue Apr 28, 2020 · 14 comments

Comments


jarauh commented Apr 28, 2020

Minimal example:

---
title: 'Minimal example'
author: jr
output:
  html_document:
    self_contained: yes
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  cache.path = "K\u00e4sch/utf8",
  echo = FALSE, warning = FALSE, message = FALSE
)
```

```{r block, cache = TRUE}
df <- mtcars
```

```{r output}
knitr::kable(df)
```

```{r results='asis'}
sessionInfo()
```

In R-4.0 I get:

Quitting from lines 17-18 (utf8-cache.Rmd)
Error in lazyLoadDBinsertVariable(vars[i], from, datafile, ascii, compress,  :
  cannot open file 'Käsch/utf8block_e0c2c6fd51407138c4350b68ac183fa5.rdb': No such file or directory
Calls: <Anonymous> ... <Anonymous> -> <Anonymous> -> lazyLoadDBinsertVariable
Execution halted

In R-3.6 things work fine.

(I'm working on a computer that is not connected to the internet, so I can't check the development version of knitr.)

By filing an issue to this repo, I promise that

  • I have fully read the issue guide at https://yihui.org/issue/.
  • [/] I have provided the necessary information about my issue.
    • If I'm asking a question, I have already asked it on Stack Overflow or RStudio Community, waited for at least 24 hours, and included a link to my question there.
    • If I'm filing a bug report, I have included a minimal, self-contained, and reproducible example, and have also included xfun::session_info('knitr'). I have upgraded all my R packages to their latest versions. I do not currently have access to newer RStudio and the development version: remotes::install_github('yihui/knitr').
    • If I have posted the same issue elsewhere, I have also mentioned it in this issue.
  • I have learned the Github Markdown syntax, and formatted my issue correctly.

I understand that my issue may be closed if I don't fulfill my promises.

> xfun::session_info('knitr')
R version 4.0.0 (2020-04-24)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 14393), RStudio 1.2.5033

Locale:
  LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
  LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
  LC_TIME=German_Germany.1252

Package version:
  evaluate_0.14   glue_1.4.0      graphics_4.0.0  grDevices_4.0.0
  highr_0.8       knitr_1.28      magrittr_1.5    markdown_1.1
  methods_4.0.0   mime_0.9        stats_4.0.0     stringi_1.4.6
  stringr_1.4.0   tools_4.0.0     utils_4.0.0     xfun_0.13
  yaml_2.2.1

Owner

yihui commented Jun 22, 2020

Since the same thing worked with R 3.6.3 but not 4.0, it is most likely an issue with R 4.0 rather than knitr (there were some NEWS items about file.exists() in 4.0.0, and I'm not sure whether they are relevant to your issue). I just reported a minimal reproducible example to https://stat.ethz.ch/pipermail/r-devel/2020-June/079666.html. I'll follow up once the cause of this issue is clearer. Thanks!

Owner

yihui commented Jun 22, 2020

By R 3.6, which exact version did you mean? I tested 3.6.3 and 4.0.1, and your example failed with both versions of R.

Owner

yihui commented Jun 24, 2020

I tested with R 3.6.3 and 4.0.1 again, but this time I changed my system locale to German (from the Control Panel -> Region and Language):

[Screenshot: Region and Language settings]

Both 3.6.3 and 4.0.1 worked fine. Previously they failed because I was using a Chinese locale.

Author

jarauh commented Jun 24, 2020

Thank you for investigating. It now works for me, too. Encoding issues like this are tedious to debug. The version of RStudio might also play a role (so it might have been an RStudio bug). I will therefore close the issue. Thank you again!

@jarauh jarauh closed this as completed Jun 24, 2020
Author

jarauh commented Jun 24, 2020

Sorry, I was too quick. It still does not work with RStudio 1.3.959 and R-4.0.0, neither when I let RStudio knit nor when I run the knit command in the console. I also checked the "current language for non-Unicode programs" setting, and it is German.

So I am reopening the issue (but I completely understand if you do not find the time to investigate). I might try to debug it myself to see where the problematic conversion happens.

I also tried cache.path = iconv(to = "latin1", "K\u00e4sch/utf8"), but that did not help.

@jarauh jarauh reopened this Jun 24, 2020
Owner

yihui commented Jun 24, 2020

I reported this issue to R core: https://stat.ethz.ch/pipermail/r-devel/2020-June/079694.html. As I mentioned above, I later found that the problem was not reproducible with the German locale. Could you run the code below and report the output?

owd = setwd(tempdir())
z = 'K\u00e4sch.txt'
file.create(z)
list.files()
file.exists(list.files())
setwd(owd)

Author

jarauh commented Jun 24, 2020

That seems to work:

> owd = setwd(tempdir())
> z = 'K\u00e4sch.txt'
> file.create(z)
[1] TRUE
> list.files()
 [1] "Käsch.txt"                                       
 [2] "rs-graphics-7b97ca4d-12fd-4cdc-a629-fb0f754d3c9d"
> file.exists(list.files())
 [1] TRUE TRUE
> setwd(owd)

Note that in my case, the non-ASCII character is in the directory name, not the file name.
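
A variant of the check above that puts the non-ASCII character in the directory name could look like the sketch below. The file name `utf8block.txt` is made up for illustration, and the result is expected to depend on the R version, platform and locale, so no output is shown.

```r
# Same idea as the check above, but the non-ASCII character is in the
# directory name rather than in the file name.
owd <- setwd(tempdir())

d <- "K\u00e4sch"                        # directory name from the report
dir.create(d, showWarnings = FALSE)

f <- file.path(d, "utf8block.txt")       # hypothetical file name
created <- file.create(f)                # TRUE if the file could be created

print(list.files(d, full.names = TRUE))  # how R lists the file
exists_after <- file.exists(f)           # can R find it again by name?
print(c(created = created, exists = exists_after))

# Clean up and restore the working directory.
unlink(d, recursive = TRUE)
setwd(owd)
```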

Author

jarauh commented Jun 24, 2020

I figured out some more things:

  1. In cache$save, the line path = cache_path(hash) switches the encoding: no matter what the encoding of hash is, the encoding of path is UTF-8.
    If I debug and run path <- iconv(from = "UTF-8", path) right after that line, caching works. If I don't, it fails.

  2. The actual error occurs in tools::makeLazyLoadDB. This function accesses the files through internal functions, e.g. .Internal(lazyLoadDBinsertValue(...)), so the files are not accessed through ordinary base R functions.

So, in any case, it is not a knitr problem, I presume.
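
To test the second point without knitr, one could write and read a lazy-load database directly in a directory with a non-ASCII name. This is only a sketch: `makeLazyLoadDB` is called with `:::` on the assumption that it is an unexported internal of the tools package whose arguments may differ between R versions, and the file base `utf8block` is a stand-in for the hashed name knitr generates.

```r
# Write a lazy-load database into a directory with a non-ASCII name and
# read it back, bypassing knitr entirely.
cache_dir <- file.path(tempdir(), "K\u00e4sch")
dir.create(cache_dir, showWarnings = FALSE)
filebase <- file.path(cache_dir, "utf8block")  # hypothetical; knitr adds a hash

# Environment holding the object to be cached.
src <- new.env()
assign("df", mtcars, envir = src)

result <- tryCatch({
  # Assumed internal API: creates <filebase>.rdb and <filebase>.rdx.
  tools:::makeLazyLoadDB(src, filebase, variables = "df")
  dest <- new.env()
  lazyLoad(filebase, envir = dest)
  if (identical(dest$df, mtcars)) "round trip ok" else "loaded object differs"
}, error = function(e) paste("failed:", conditionMessage(e)))

print(result)
unlink(cache_dir, recursive = TRUE)
```

If this fails while the same code with an ASCII-only directory name succeeds, that would point at R's file handling rather than at knitr.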

Author

jarauh commented Jun 24, 2020

For the record, I'm comparing R-3.6.3 and R-4.0.0. I cannot try R-4.0.1 or R-4.0.2 right now.

Author

jarauh commented Jun 24, 2020

Got it:

> p <- "Föö/Bär"
> Encoding(p)
[1] "latin1"
> Encoding(dirname(p))
[1] "UTF-8"
> Encoding(basename(p))
[1] "UTF-8"

Should that happen?
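
Since the encoding of a string literal depends on the locale of the session, a locale-independent version of this check can build the latin1 string explicitly. The sketch below only prints the declared encodings. What `dirname()` and `basename()` report is expected to vary with the R version and platform, so no output is shown.

```r
# Start from a UTF-8 string written with escapes, so the source file
# encoding does not matter.
p_utf8 <- "F\u00f6\u00f6/B\u00e4r"

# Convert to latin1; iconv() marks the result as latin1 by default.
p_latin1 <- iconv(p_utf8, from = "UTF-8", to = "latin1")

# Declared encoding of the input and of the derived path components.
encodings <- c(
  input    = Encoding(p_latin1),
  dirname  = Encoding(dirname(p_latin1)),
  basename = Encoding(basename(p_latin1))
)
print(encodings)
```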

Author

jarauh commented Jun 24, 2020

To sum up, I guess that there are two problems:

  1. In my setup (but apparently not in yours?), tools::makeLazyLoadDB accesses files in a way that makes it sensitive to the encoding of the filename.
  2. On the other hand, I cannot work around that by specifying the filename in the "right" encoding, since dirname and basename reencode everything to UTF-8.
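
Until the R-level behaviour is sorted out, one way to sidestep both problems is to keep the cache path ASCII-only. The helper below is hypothetical (it is not part of knitr) and only checks a path before it is passed to `cache.path`.

```r
# Hypothetical helper: TRUE if every character of the path is plain ASCII.
is_ascii_path <- function(path) {
  stopifnot(is.character(path), length(path) == 1L)
  # Convert to UTF-8 first so the code points are well defined.
  all(utf8ToInt(enc2utf8(path)) < 128L)
}

is_ascii_path("cache/utf8")
is_ascii_path("K\u00e4sch/utf8")
```

A setup chunk could then call `stopifnot(is_ascii_path(...))` on the chosen path before handing it to `knitr::opts_chunk$set(cache.path = ...)`.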

Author

jarauh commented Jul 3, 2020

With R-4.0.2, everything works again. Thanks again for your help.

@jarauh jarauh closed this as completed Jul 3, 2020
Owner

yihui commented Jul 3, 2020

I saw your report at https://stat.ethz.ch/pipermail/r-devel/2020-June/079732.html but didn't read it. Anyway, it's good to know the problem has been fixed in the latest version of R (and again, it is important to test the latest version of software). Thanks!


github-actions bot commented Jan 6, 2021

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 6, 2021