I’m running RestRserve in a Linux container based on the rocker/r-ver:3.6.3 image. When the multipart body grows above a certain size, RestRserve returns an empty response. This does not happen when I run RestRserve directly on my Windows machine.
This is a minimal example:
Dockerfile
FROM rocker/r-ver:3.6.3
RUN install2.r --error \
    RestRserve \
    readr
COPY / /
EXPOSE 80
ENTRYPOINT ["Rscript", "restrserve.R"]
restrserve.R
library(RestRserve)
library(readr)
app = Application$new()
app$logger$set_log_level("trace")
app$add_post(
  path = "/echo",
  FUN = function(request, response) {
    app$logger$info(msg = "Reading dta", fileName = c("dta"))
    dt <- read_csv(request$get_file("dta"))
    app$logger$info(msg = "Data read successful", fileName = c("dta"))
    response$set_body(format_csv(dt))
  }
)
backend = BackendRserve$new()
backend$start(app, http_port = 80)
I build the image and start the container, mapping the container's port 80 to the host's port 5017 (port 80 is not allowed on Windows). Then I submit a POST request to the endpoint.
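For reference, the build-and-run step could look roughly like the sketch below. The image tag `restrserve-echo` is a hypothetical placeholder, and the commands assume they are run from the directory that contains the Dockerfile and restrserve.R; only the 5017-to-80 port mapping comes from the description above.

```shell
# Hypothetical image tag; any name works.
IMAGE=restrserve-echo

# Build the image from the Dockerfile in the current directory.
docker build -t "$IMAGE" .

# Run the container, publishing container port 80 on host port 5017.
# --rm removes the container when it stops.
docker run --rm -p 5017:80 "$IMAGE"
```

With the container in the foreground like this, the RestRserve log lines are printed directly to the terminal.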
library(httr) # provides POST() and upload_file()
library(readr)
# Generate data
n <- 20000000 # observations
dta <- data.frame(
  numVar = round(rnorm(n), 3),
  charVar = c("foo", "bar")[(runif(n) < 0.5) + 1]
)
# Write to file
tmp <- tempfile()
write_csv(dta, tmp)
utils:::format.object_size(file.info(tmp)$size, "auto")
#> [1] "198.1 Mb"
# POST request with file
time <- system.time({
  rs <- POST(
    url = "http://127.0.0.1:5017/echo",
    body = list(dta = upload_file(tmp)),
    encode = "multipart"
  )
})
On my machine, when n=10000000 (99.1 Mb), everything works fine and the container log looks like this:
{"timestamp":"2020-06-08 13:24:13.828893","level":"DEBUG","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","request":{"method":"POST","path":"/echo","parameters_query":{},"parameters_path":[],"headers":{"content-type":"multipart/form-data; boundary=------------------------7909616e2142a7b4","host":"127.0.0.1:5017","user-agent":"libcurl/7.64.1 r-curl/4.3 httr/1.4.1","content-length":"103879300","accept-encoding":["deflate","gzip"],"accept":["application/json","text/xml","application/xml","*/*"]}}}}
{"timestamp":"2020-06-08 13:24:13.829847","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","middleware":"EncodeDecodeMiddleware","message":"call process_request middleware"}}
{"timestamp":"2020-06-08 13:24:13.849487","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","message":"try to match requested path '/echo'"}}
{"timestamp":"2020-06-08 13:24:13.850235","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","message":"requested path matched"}}
{"timestamp":"2020-06-08 13:24:13.851160","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","message":"call handler '1'"}}
{"timestamp":"2020-06-08 13:24:13.852220","level":"INFO","name":"Application","pid":19,"msg":"Reading dta","fileName":"dta"}
Parsed with column specification:
cols(
numVar = col_double(),
charVar = col_character()
)
{"timestamp":"2020-06-08 13:24:18.141534","level":"INFO","name":"Application","pid":19,"msg":"Data read successful","fileName":"dta"}
{"timestamp":"2020-06-08 13:24:21.769933","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","middleware":"EncodeDecodeMiddleware","message":"call process_response middleware"}}
{"timestamp":"2020-06-08 13:24:22.279853","level":"DEBUG","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","response":{"status_code":200,"headers":{"Server":"RestRserve/0.2.2"}}}}
But when the data size is above a certain threshold (on my machine n=20000000, 198.1 Mb), the endpoint returns an empty response and the log looks like this:
{"timestamp":"2020-06-08 13:28:03.809148","level":"DEBUG","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","request":{"method":"POST","path":"/echo","parameters_query":{},"parameters_path":[],"headers":{"content-type":"multipart/form-data; boundary=------------------------355f8f515b2f0e05","host":"127.0.0.1:5017","user-agent":"libcurl/7.64.1 r-curl/4.3 httr/1.4.1","content-length":"207762542","accept-encoding":["deflate","gzip"],"accept":["application/json","text/xml","application/xml","*/*"]}}}}
{"timestamp":"2020-06-08 13:28:03.820926","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","middleware":"EncodeDecodeMiddleware","message":"call process_request middleware"}}
{"timestamp":"2020-06-08 13:28:03.877614","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","message":"try to match requested path '/echo'"}}
{"timestamp":"2020-06-08 13:28:03.912894","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","message":"requested path matched"}}
{"timestamp":"2020-06-08 13:28:03.913440","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","message":"call handler '1'"}}
{"timestamp":"2020-06-08 13:28:03.913912","level":"INFO","name":"Application","pid":19,"msg":"Reading dta","fileName":"dta"}
That is, the call seems to get stuck while parsing the multipart body: the log stops after "Reading dta" and never reaches "Data read successful".
The container should have enough memory available. Any idea how to debug this? Is there some inherent limit I should be aware of? I would like to post files with a maximum size of about 2 GB to the endpoint.