Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uft8 source file support #328

Merged
merged 25 commits into from
Nov 6, 2018
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
1b19e3a
The source files used in plumber must use the UTF-8 encoding if they …
shrektan Oct 12, 2018
0b82e79
comments
schloerke Oct 29, 2018
6c8908f
Merge branch 'master' into add-param-encoding
schloerke Oct 29, 2018
5d60c8a
Merge branch 'add-param-encoding' of https://github.com/shrektan/plum…
schloerke Oct 29, 2018
8578fb0
display appveyor local
schloerke Oct 29, 2018
139f4fd
try in install script
schloerke Oct 29, 2018
d3d1326
chinese only
schloerke Oct 29, 2018
e13c0a4
top level scripts only
schloerke Oct 29, 2018
98cda72
no cache
schloerke Oct 29, 2018
89b85f0
use testthat::test_path when providing relative test files
schloerke Oct 31, 2018
b9014fc
add sourceUTF8 and helper functions from shiny
schloerke Nov 1, 2018
509562b
remove appveyor file as locale / utf8 can be tested on travis
schloerke Nov 1, 2018
12b0d45
add second PR to news item
schloerke Nov 1, 2018
c84d5ec
add note for why sourcing is done
schloerke Nov 2, 2018
1026715
add note about keeping the source location when parsing
schloerke Nov 2, 2018
fb41fa1
merged master
schloerke Nov 5, 2018
cde0d3d
Evalauate all file expression blocks only once, not once or twice
schloerke Nov 5, 2018
aa2ed73
activateBlock -> evaluateBlock
schloerke Nov 5, 2018
9a3a40f
add comments about LC_ALL and use charToRaw to test UTF-8 strings
schloerke Nov 5, 2018
503bd82
minimize travis crand and github pkg downloads
schloerke Nov 5, 2018
8fba37f
remove travis comments
schloerke Nov 5, 2018
370d17a
added news item about sourcing code only once
schloerke Nov 5, 2018
d8d0dd0
bold **must use**
schloerke Nov 5, 2018
f79b685
add default appveyor
schloerke Nov 6, 2018
b1fb950
Merge branch 'master' into shrektan-add-param-encoding
schloerke Nov 6, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -9,4 +9,3 @@ docs
scripts
^revdep$
^cran-comments\.md$
^appveyor\.yml$
4 changes: 3 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ Roxygen: list(markdown = TRUE)
Authors@R: c(
person(family="Trestle Technology, LLC", role="aut", email="cran@trestletech.com"),
person("Jeff", "Allen", role="cre", email="cran@trestletech.com"),
person("Barret", "Schloerke", role="ctb", email="barret@rstudio.com"),
person("Frans", "van Dunné", role="ctb", email="frans@ixpantia.com"),
person("Sebastiaan", "Vandewoude", role="ctb", email="sebastiaanvandewoude@gmail.com"),
person(family="SmartBear Software", role=c("ctb", "cph"), comment="swagger-ui"))
Expand Down Expand Up @@ -65,4 +66,5 @@ Collate:
'serializer.R'
'session-cookie.R'
'swagger.R'
RoxygenNote: 6.0.1
'utf8.R'
RoxygenNote: 6.1.0
6 changes: 5 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,15 @@ plumber 0.4.7
--------------------------------------------------------------------------------
* Add support for swagger for mounted routers (@bradleyhd, [#274](https://github.com/trestletech/plumber/issues/274)).
* BUGFIX: A multiline POST body is now collapsed to a single line ([#270](https://github.com/trestletech/plumber/issues/270)).
* The source files used in plumber must use the UTF-8 encoding if they contain
non-ASCII characters (@shrektan, [#312](https://github.com/trestletech/plumber/pull/312),
[#328](https://github.com/trestletech/plumber/pull/328)).
* SECURITY: Wrap `jsonlite::fromJSON` to ensure that `jsonlite` never reads
input as a remote address (such as a file path or URL) and attempts to parse
that. The only known way to exploit this behavior in plumber unless an
API were using encrypted cookies and an attacker knew the encryption key in
order to craft arbitrary cookies.
order to craft arbitrary cookies. ([#325](https://github.com/trestletech/plumber/pull/325))


plumber 0.4.6
--------------------------------------------------------------------------------
Expand Down
15 changes: 8 additions & 7 deletions R/plumber.R
Original file line number Diff line number Diff line change
Expand Up @@ -50,10 +50,10 @@ plumb <- function(file, dir="."){
on.exit(setwd(old))

# Expect that entrypoint will provide us with the router
x <- source(entrypoint)
# Do not 'poison' the global env. Using a local environment
# sourceUTF8 returns the (visible) value object. No need to call source()$value()
pr <- sourceUTF8(entrypoint, environment())

# source returns a list with value and visible elements, we want the (visible) value object.
pr <- x$value
if (!inherits(pr, "plumber")){
stop("entrypoint.R must return a runnable Plumber router.")
}
Expand Down Expand Up @@ -177,10 +177,11 @@ plumber <- R6Class(
private$notFoundHandler <- default404Handler

if (!is.null(file)){
private$lines <- readLines(file)
private$parsed <- parse(file, keep.source=TRUE)

source(file, local=private$envir, echo=FALSE, keep.source=TRUE)
private$lines <- readUTF8(file)
private$parsed <- parseUTF8(file)
# populate the local envir with parameters defined within the file
# https://github.com/trestletech/plumber/issues/329
sourceUTF8(file, private$envir)
schloerke marked this conversation as resolved.
Show resolved Hide resolved

for (i in 1:length(private$parsed)){
e <- private$parsed[i]
Expand Down
81 changes: 81 additions & 0 deletions R/utf8.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
# Shiny has been tried and tested on different encodings
# adoption from https://github.com/rstudio/shiny/blob/6ede0194c6e52d18be3ec13514defd8e0f22a3c1/R/utils.R#L1459-L1532

isWindows <- function() .Platform$OS.type == 'windows'

# assume file is encoded in UTF-8, but warn against BOM
checkEncoding <- function(file) {
# skip *nix because its locale is normally UTF-8 based (e.g. en_US.UTF-8), and
# *nix users have to make a conscious effort to save a file with an encoding
# that is not UTF-8; if they choose to do so, we cannot do much about it
# except sitting back and seeing them punished after they choose to escape a
# world of consistency (falling back to getOption('encoding') will not help
# because native.enc is also normally UTF-8 based on *nix)
if (!isWindows()) return('UTF-8')
size <- file.info(file)[, 'size']
if (is.na(size)) stop('Cannot access the file ', file)
# BOM is 3 bytes, so if the file contains BOM, it must be at least 3 bytes
if (size < 3L) return('UTF-8')

# check if there is a BOM character: this is also skipped on *nix, because R
# on *nix simply ignores this meaningless character if present, but it hurts
# on Windows
if (identical(charToRaw(readChar(file, 3L, TRUE)), charToRaw('\UFEFF'))) {
warning('You should not include the Byte Order Mark (BOM) in ', file, '. ',
'Please re-save it in UTF-8 without BOM. See ',
'http://shiny.rstudio.com/articles/unicode.html for more info.')
return('UTF-8-BOM')
}
x <- readChar(file, size, useBytes = TRUE)
if (is.na(iconv(x, 'UTF-8', 'UTF-8'))) {
warning('The input file ', file, ' does not seem to be encoded in UTF8')
}
'UTF-8'
}

# read a file using UTF-8 and (on Windows) convert to native encoding if possible
readUTF8 <- function(file) {
enc <- checkEncoding(file)
file <- base::file(file, encoding = enc)
on.exit(close(file), add = TRUE)
x <- enc2utf8(readLines(file, warn = FALSE))
tryNativeEncoding(x)
}

# if the UTF-8 string can be represented in the native encoding, use native encoding
tryNativeEncoding <- function(string) {
if (!isWindows()) return(string)
string2 <- enc2native(string)
if (identical(enc2utf8(string2), string)) string2 else string
}

# similarly, try to source() a file with UTF-8
sourceUTF8 <- function(file, envir = globalenv()) {
exprs <- parseUTF8(file)

eval(exprs, envir)
}


parseUTF8 <- function(file) {
lines <- readUTF8(file)
enc <- if (any(Encoding(lines) == 'UTF-8')) 'UTF-8' else 'unknown'
src <- srcfilecopy(file, lines, isFile = TRUE) # source reference info

# oddly, parse(file) does not work when file contains multibyte chars that
# **can** be encoded natively on Windows (might be a bug in base R); we
# rewrite the source code in a natively encoded temp file and parse it in this
# case (the source reference is still pointed to the original file, though)
if (isWindows() && enc == 'unknown') {
file <- tempfile(); on.exit(unlink(file), add = TRUE)
writeLines(lines, file)
}

# keep the source locations within the file while parsing
exprs <- try(parse(file, keep.source = TRUE, srcfile = src, encoding = enc))
if (inherits(exprs, "try-error")) {
stop("Error sourcing ", file)
}

exprs
}
1 change: 0 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
# plumber

[![Build Status](https://travis-ci.org/trestletech/plumber.svg?branch=master)](https://travis-ci.org/trestletech/plumber)
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/trestletech/plumber?branch=master&svg=true)](https://ci.appveyor.com/project/trestletech/plumber)
[![](https://www.r-pkg.org/badges/version/plumber)](https://www.r-pkg.org/pkg/plumber)
[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/plumber?color=brightgreen)](https://www.r-pkg.org/pkg/plumber)
[![codecov](https://codecov.io/gh/trestletech/plumber/branch/master/graph/badge.svg)](https://codecov.io/gh/trestletech/plumber)
Expand Down
74 changes: 0 additions & 74 deletions appveyor.yml

This file was deleted.

4 changes: 4 additions & 0 deletions tests/testthat/files/multibytes.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
#* @get /echo
function() {
"中文消息"
}
2 changes: 1 addition & 1 deletion tests/testthat/test-content-type.R
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ test_that("contentType works in files", {

res <- PlumberResponse$new()

r <- plumber$new("files/content-type.R")
r <- plumber$new(test_path("files/content-type.R"))
val <- r$serve(make_req("GET", "/"), res)
expect_equal(val$headers$`Content-Type`, "text/plain")
})
Expand Down
3 changes: 1 addition & 2 deletions tests/testthat/test-deprecated.R
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,8 @@ test_that("addGlobalProcessor continues to work", {

test_that("addAssets continues to work", {
pr <- plumber$new()
expect_warning(pr$addAssets("./files/static", "/public"))
expect_warning(pr$addAssets(test_path("./files/static"), "/public"))
res <- PlumberResponse$new()
val <- pr$route(make_req("GET", "/public/test.txt"), res)
expect_true(inherits(val, "PlumberResponse"))
})

14 changes: 7 additions & 7 deletions tests/testthat/test-filters.R
Original file line number Diff line number Diff line change
@@ -1,40 +1,40 @@
context("filters")

test_that("Filters work", {
r <- plumber$new("files/filters.R")
r <- plumber$new(test_path("files/filters.R"))
expect_equal(length(r$filters), 4+2) #4 for post, query string, cookie, and shared secret filters

expect_equal(r$filters[[5]]$name, "something")
expect_equal(r$filters[[6]]$name, "nospace")
})

test_that("Filters can update req$args", {
r <- plumber$new("files/filters.R")
r <- plumber$new(test_path("files/filters.R"))

req <- make_req("GET", "/")
res <- PlumberResponse$new()
expect_equal(r$serve(req, res)$body, jsonlite::toJSON(23))
})

test_that("Redundant filters fail", {
expect_error(plumber$new("files/filter-redundant.R"), regexp="Multiple @filters")
expect_error(plumber$new(test_path("files/filter-redundant.R")), regexp="Multiple @filters")
})

test_that("Empty filters fail", {
expect_error(plumber$new("files/filter-empty.R"), regexp="No @filter name specified")
expect_error(plumber$new(test_path("files/filter-empty.R")), regexp="No @filter name specified")
})

test_that("Filter and path fails", {
expect_error(plumber$new("files/filterpath.R"), regexp="can only be")
expect_error(plumber$new(test_path("files/filterpath.R")), regexp="can only be")
})

test_that("Filter and assets fails", {
expect_error(plumber$new("files/filterasset.R"), regexp="can only be")
expect_error(plumber$new(test_path("files/filterasset.R")), regexp="can only be")
})

test_that("Terminal filters indeed terminate", {
res <- PlumberResponse$new()
r <- plumber$new("files/terminal-filter.R")
r <- plumber$new(test_path("files/terminal-filter.R"))
expect_equal(r$route(make_req("GET", "/"), res), 1)
})

Expand Down
2 changes: 1 addition & 1 deletion tests/testthat/test-image.R
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
context("Images")

test_that("Images are properly rendered", {
r <- plumber$new("files/image.R")
r <- plumber$new(test_path("files/image.R"))

resp <- r$serve(make_req("GET", "/png"), PlumberResponse$new())
expect_equal(resp$status, 200)
Expand Down
2 changes: 1 addition & 1 deletion tests/testthat/test-include.R
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
context("Includes")

test_that("Includes work", {
r <- plumber$new("files/includes.R")
r <- plumber$new(test_path("files/includes.R"))

# When running, we setwd to the file's dir. Simulate that here.
cwd <- getwd()
Expand Down
2 changes: 1 addition & 1 deletion tests/testthat/test-injection.R
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
context("Injection")

test_that("Injected arguments on req$args get passed on.", {
r <- plumber$new("files/filter-inject.R")
r <- plumber$new(test_path("files/filter-inject.R"))

res <- PlumberResponse$new()
expect_equal(r$serve(make_req("GET", "/"), res)$body, jsonlite::toJSON(13))
Expand Down
39 changes: 39 additions & 0 deletions tests/testthat/test-multibytes.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
context("multibytes source file")

with_locale <- function(locale, locale_val, expr) {
prev_locale_val <- Sys.getlocale(locale)
Sys.setlocale(locale, locale_val)
on.exit({
Sys.setlocale(locale, prev_locale_val)
})
force(expr)
}

expect_utf8 <- function(x) {
expect_equal(Encoding(x), "UTF-8")
}

test_that("support files with multibytes", {

with_locale("LC_CTYPE", "C", {
schloerke marked this conversation as resolved.
Show resolved Hide resolved
with_locale("LC_COLLATE", "C", {

r <- plumber$new(test_path("files/multibytes.R"))
req <- make_req("GET", "/echo")
res <- PlumberResponse$new()
out <- r$serve(req, res)$body

# ?Quotes
# "Unicode escapes can be used to enter Unicode characters not in the current locale's
# charset (when the string will be stored internally in UTF-8)."
expected <- jsonlite::toJSON("\u4e2d\u6587\u6d88\u606f")

expect_identical(Sys.getlocale("LC_CTYPE"), "C")
expect_identical(Sys.getlocale("LC_COLLATE"), "C")
expect_utf8(expected)
expect_utf8(out)
expect_identical(out, expected)
schloerke marked this conversation as resolved.
Show resolved Hide resolved

})
})
})
2 changes: 1 addition & 1 deletion tests/testthat/test-path-subst.R
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ test_that("path regex's are created properly", {
})

test_that("integration of path parsing works", {
r <- plumber$new("files/path-params.R")
r <- plumber$new(test_path("files/path-params.R"))

expect_equal(r$route(make_req("GET", "/car/13"), PlumberResponse$new()), "13")
expect_equal(r$route(make_req("GET", "/car/int/13"), PlumberResponse$new()), 13)
Expand Down
Loading