Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plumber() and Plumber$new() will read source file as UTF-8 encoded #312

Closed
wants to merge 3 commits into from

Conversation

@shrektan
Copy link
Contributor

shrektan commented Oct 5, 2018

Related to #296

DESCRIPTION

R uses the encoding "unknown" or "native" when reading text files by default. However, the fact that the default encoding varies (not UTF-8) on Windows, leads to a big headache for our multiple-bytes charset users on Windows, which is we have to store the source file as the native encoding in order to use plumber.

Thanks to the great work of RStudio (especially, @yihui), the common practice of handling this issue is to read the file as UTF-8 encoding by default(may provide a param "encoding" allowing users to change) and this PR is going to address this.

WHAT THIS PR DOES?

  • add the new param encoding for plumber() and Plumber$new()
  • enable the UTF-8 as the default

As mentioned in #296, there's another issue that the function input should be marked as UTF-8, which I would like file another PR after this being merged.

Thanks.

@CLAassistant

This comment has been minimized.

Copy link

CLAassistant commented Oct 5, 2018

CLA assistant check
All committers have signed the CLA.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Oct 5, 2018

Codecov Report

❗️ No coverage uploaded for pull request base (master@9327541). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #312   +/-   ##
=========================================
  Coverage          ?   90.04%           
=========================================
  Files             ?       25           
  Lines             ?     1145           
  Branches          ?        0           
=========================================
  Hits              ?     1031           
  Misses            ?      114           
  Partials          ?        0
Impacted Files Coverage Δ
R/plumber.R 84.86% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9327541...6c8908f. Read the comment docs.

@yihui

This comment has been minimized.

Copy link
Member

yihui commented Oct 5, 2018

YES! Use UTF-8, or we can't be friends.

Perhaps in knitr 2.0 and rmarkdown 2.0, I'll just remove the encoding arguments, and force UTF-8 like what Pandoc has been doing for years, which is something I have envied for long.

@schloerke

This comment has been minimized.

Copy link
Collaborator

schloerke commented Oct 5, 2018

@yihui Do you think it's a good idea remove the argument and just force UTF-8?

@shrektan

This comment has been minimized.

Copy link
Contributor Author

shrektan commented Oct 5, 2018

At least, it’s a good idea to me, actually. :P

@yihui

This comment has been minimized.

Copy link
Member

yihui commented Oct 5, 2018

@schloerke Yes. That is what I have been doing in bookdown (since 2016) and blogdown (2017). In future packages, I'll always do this. The encoding argument brings enormous pain to developers, whereas it only bugs users a tiny bit when they are asked to save a text file in UTF-8 (they can set this as the default in RStudio).

@schloerke

This comment has been minimized.

Copy link
Collaborator

schloerke commented Oct 5, 2018

@shrektan Mind removing it from the args and adding it as a global package variable w/ comment to this PR? (Feel free to copy from below)

# Hard code UTF-8 file encoding
# Removes encoding headache at minor cost of setting encoding in editor
# https://github.com/trestletech/plumber/pull/312
utf8Encoding <- "UTF-8"


plumb <- function(file, dir = "."){
# ...
x <- source(entrypoint, encoding = utf8Encoding)
# etc...
}
@shrektan shrektan force-pushed the shrektan:add-param-encoding branch from 798ca94 to 731704e Oct 5, 2018
@shrektan

This comment has been minimized.

Copy link
Contributor Author

shrektan commented Oct 5, 2018

@schloerke I have removed it.

R/plumber.R Outdated Show resolved Hide resolved
R/plumber.R Outdated Show resolved Hide resolved
R/plumber.R Outdated Show resolved Hide resolved
Copy link
Collaborator

schloerke left a comment

See comments. Should be last round of review

Thank you for the test file!

@shrektan shrektan changed the title add the encoding param for plumber() and Plumber$new() plumber() and Plumber$new() will read source file as UTF-8 encoded Oct 12, 2018
…contain

non-ASCII characters
@shrektan shrektan force-pushed the shrektan:add-param-encoding branch from 127f042 to 1b19e3a Oct 12, 2018
@shrektan

This comment has been minimized.

Copy link
Contributor Author

shrektan commented Oct 12, 2018

@schloerke I've made the changes you requested and add a NEWS file (I made a forced-push in order to have a clean history).

Can it be merged? It's very inconvenient to maintain/edit the file as encoding other than UTF-8. Moreover, it's consistent with shiny, which forces the UTF-8 encoding in the very early stage.

shiny 0.10.1

Added Unicode support for Windows. Shiny apps running on Windows must use the UTF-8 encoding for ui.R and server.R (also the optional global.R) if they contain non-ASCII characters. See this article for details and examples: http://shiny.rstudio.com/gallery/unicode-characters.html (#516)

In addition, there's a minor issue about using utf8Encoding to represent the literal string "UTF-8" (see the quote below). Although I made the change, I don't support it because it's not a "magic" value in R. As you can see in other packages like shiny or in pkgload, they almost always are written directly.

@shrektan Mind removing it from the args and adding it as a global package variable w/ comment to this PR? (Feel free to copy from below)

# Hard code UTF-8 file encoding
# Removes encoding headache at minor cost of setting encoding in editor
# https://github.com/trestletech/plumber/pull/312
utf8Encoding <- "UTF-8"


plumb <- function(file, dir = "."){
# ...
x <- source(entrypoint, encoding = utf8Encoding)
# etc...
}
@schloerke schloerke requested a review from jcheng5 Oct 19, 2018
@schloerke schloerke removed the request for review from jcheng5 Oct 26, 2018
@schloerke schloerke added the QA label Oct 26, 2018
schloerke added 2 commits Oct 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.