Always read and write in UTF-8 #649

Merged
merged 3 commits into from Aug 23, 2017

Conversation

hadley
Member

@hadley hadley commented Aug 17, 2017

  • Helpers read_lines() and write_lines() always read and write UTF-8
  • readLines() and writeLines() throw errors to prevent accidental re-use in the future
  • Warn if package encoding is not UTF-8

Fixes #564. Fixes #592
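
To make the third bullet concrete, here is a minimal sketch of how a package's declared encoding could be checked. The helper name `check_utf8_encoding()` and the wording of the warning are hypothetical and are not taken from this PR; the case-insensitive comparison is also an assumption. Only `read.dcf()` from base R is relied on.

```r
# Hypothetical helper, not roxygen2's own code: warn when DESCRIPTION does
# not declare `Encoding: UTF-8`.
check_utf8_encoding <- function(pkg_dir) {
  # read.dcf() returns a character matrix; a missing field comes back as NA
  desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Encoding")
  encoding <- unname(desc[1, "Encoding"])

  # Assumption: compare case-insensitively, so "utf-8" is accepted too
  ok <- !is.na(encoding) && identical(toupper(encoding), "UTF-8")
  if (!ok) {
    warning("DESCRIPTION should declare `Encoding: UTF-8`", call. = FALSE)
  }
  invisible(ok)
}

# Usage on a throwaway package directory that declares the encoding
pkg <- tempfile("demo-pkg")
dir.create(pkg)
writeLines(
  c("Package: demo", "Version: 0.0.1", "Encoding: UTF-8"),
  file.path(pkg, "DESCRIPTION")
)
check_utf8_encoding(pkg)
```

The check returns its result invisibly, so a caller can either ignore it or branch on it.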

@hadley
Member Author

hadley commented Aug 18, 2017

@krlmlr do you happen to remember why the useBytes = TRUE in writeLines(contents, path, useBytes = TRUE) is necessary on Windows?

@gaborcsardi
Member

@hadley otherwise the text is re-encoded to the native encoding. I think it is necessary on all platforms that are not utf-8 and not byte-oriented.

@krlmlr
Member

krlmlr commented Aug 18, 2017

I have code for reading UTF-8 files in my dep-free utf8 package, which I'll release very soon.

@hadley
Member Author

hadley commented Aug 18, 2017

@gaborcsardi even though I'm writing to a utf-8 connection??!

@krlmlr
Member

krlmlr commented Aug 18, 2017

Yes, R does an involuntary roundtrip via the native encoding.
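
The exchange above can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: the helper names `read_lines_utf8()` and `write_lines_utf8()` are hypothetical. The idea is to convert the text to UTF-8 explicitly and pass `useBytes = TRUE`, so that `writeLines()` emits those bytes unchanged instead of re-encoding them to the native encoding. Shadowing the base functions, as the PR description mentions, makes unqualified calls fail loudly.

```r
# All helper names below are hypothetical; this is a sketch, not the PR's code.
write_lines_utf8 <- function(text, path) {
  con <- file(path, open = "wb")  # binary mode: no newline translation
  on.exit(close(con), add = TRUE)
  # enc2utf8() converts to UTF-8; useBytes = TRUE writes the bytes as they are
  base::writeLines(enc2utf8(text), con, sep = "\n", useBytes = TRUE)
  invisible(path)
}

read_lines_utf8 <- function(path) {
  # encoding = "UTF-8" marks the strings as UTF-8 rather than re-encoding them
  base::readLines(path, encoding = "UTF-8", warn = FALSE)
}

# Shadow the base functions so that accidental unqualified use is an error
readLines <- function(...) stop("use read_lines_utf8() instead", call. = FALSE)
writeLines <- function(...) stop("use write_lines_utf8() instead", call. = FALSE)

# Round trip with a non-ASCII string
path <- tempfile(fileext = ".txt")
text <- c("ascii", "caf\u00e9")
write_lines_utf8(text, path)
back <- read_lines_utf8(path)
```

The helpers call `base::readLines()` and `base::writeLines()` with an explicit namespace, which is what lets them keep working once the unqualified names are shadowed.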

@hadley
Member Author

hadley commented Aug 18, 2017

Still fails on Windows with useBytes. I'll diagnose in my Windows VM.

hadley and others added 3 commits August 23, 2017 12:56
* Helpers read_lines and write_lines do the right thing
* readLines() and writeLines() throw errors to prevent accidental re-use in the future
* Warn if package encoding is not utf-8

Fixes #564. Fixes #592
@hadley hadley merged commit 5cbceb1 into master Aug 23, 2017
@hadley hadley deleted the utf8 branch August 23, 2017 17:58
dschlaep added a commit to DrylandEcology/rSFSW2 that referenced this pull request Mar 11, 2019
- since r-lib/roxygen2#649 (`roxygen2` >= v6.1.0) warns if a `DESCRIPTION` file does not explicitly define its encoding as UTF-8
- this commit adds encoding information to the `DESCRIPTION` file even if it may be at odds with `Writing R Extensions` (https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues), but see r-lib/roxygen2#774

- at least, this works for me locally and eliminates the `roxygen2` warning "roxygen2 requires Encoding: UTF-8"