Skip to content

Empty reply from server when size of multipart body above certain threshold #150

@rplati

Description

@rplati

I’m running RestRserve in a linux container starting from image rocker/r-ver:3.6.3. I’ve noticed that when the size of the multipart body grows above a certain threshold, RestRserve returns an empty response. This does not happen when I run RestRserve directly on my Windows machine.

This is a minimal example: Dockerfile

FROM rocker/r-ver:3.6.3

RUN install2.r --error \
  RestRserve \
  readr
  
COPY / /
EXPOSE 80
ENTRYPOINT ["Rscript", "restrserve.R"]

restrserve.R

library(RestRserve)
library(readr)

app = Application$new()
app$logger$set_log_level("trace")

app$add_post(
  path = "/echo",
  FUN = function(request, response) {
    app$logger$info(msg="Reading dta", fileName = c("dta"))
    dt <- read_csv(request$get_file("dta"))
    app$logger$info(msg="Data read successful", fileName = c("dta"))

    response$set_body(format_csv(dt))
  }
)

backend = BackendRserve$new()
backend$start(app, http_port = 80)

I build the image and start the container mapping tha container's port 80 to the host's port 5017 (port 80 not allowed on windows). Then I submit a post request to the endpoint.

library(readr)

# Generate data
n<-20000000      # observations
dta<-data.frame(
  numVar=round(rnorm(n),3),
  charVar=c("foo", "bar")[(runif(n)<0.5)+1]
  )

# Write to file
tmp <- tempfile()
write_csv(dta, tmp)
utils:::format.object_size(file.info(tmp)$size, "auto")
#> [1] "198.1 Mb"

# POST request with file
time<-system.time({
  rs <- POST(
    url = "http://127.0.0.1:5017/echo",
    body = list(dta = upload_file(tmp)),
    encode = "multipart"
  )
})

On my machine, when n=10000000 (99.1 Mb) everything works fine and the container log looks like this.

{"timestamp":"2020-06-08 13:24:13.828893","level":"DEBUG","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","request":{"method":"POST","path":"/echo","parameters_query":{},"parameters_path":[],"headers":{"content-type":"multipart/form-data; boundary=------------------------7909616e2142a7b4","host":"127.0.0.1:5017","user-agent":"libcurl/7.64.1 r-curl/4.3 httr/1.4.1","content-length":"103879300","accept-encoding":["deflate","gzip"],"accept":["application/json","text/xml","application/xml","*/*"]}}}}
{"timestamp":"2020-06-08 13:24:13.829847","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","middleware":"EncodeDecodeMiddleware","message":"call process_request middleware"}}
{"timestamp":"2020-06-08 13:24:13.849487","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","message":"try to match requested path '/echo'"}}
{"timestamp":"2020-06-08 13:24:13.850235","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","message":"requested path matched"}}
{"timestamp":"2020-06-08 13:24:13.851160","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","message":"call handler '1'"}}
{"timestamp":"2020-06-08 13:24:13.852220","level":"INFO","name":"Application","pid":19,"msg":"Reading dta","fileName":"dta"}
Parsed with column specification:
cols(
  numVar = col_double(),
  charVar = col_character()
)
{"timestamp":"2020-06-08 13:24:18.141534","level":"INFO","name":"Application","pid":19,"msg":"Data read successful","fileName":"dta"}
{"timestamp":"2020-06-08 13:24:21.769933","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","middleware":"EncodeDecodeMiddleware","message":"call process_response middleware"}}
{"timestamp":"2020-06-08 13:24:22.279853","level":"DEBUG","name":"Application","pid":19,"msg":"","context":{"request_id":"587be48a-a98b-11ea-8a0a-0242ac110002","response":{"status_code":200,"headers":{"Server":"RestRserve/0.2.2"}}}}

But when the data size is above a certain threshold (on my machine n=20000000, 198.1 Mb) the endpoint returns an empty response and the log looks like this

{"timestamp":"2020-06-08 13:28:03.809148","level":"DEBUG","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","request":{"method":"POST","path":"/echo","parameters_query":{},"parameters_path":[],"headers":{"content-type":"multipart/form-data; boundary=------------------------355f8f515b2f0e05","host":"127.0.0.1:5017","user-agent":"libcurl/7.64.1 r-curl/4.3 httr/1.4.1","content-length":"207762542","accept-encoding":["deflate","gzip"],"accept":["application/json","text/xml","application/xml","*/*"]}}}}
{"timestamp":"2020-06-08 13:28:03.820926","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","middleware":"EncodeDecodeMiddleware","message":"call process_request middleware"}}
{"timestamp":"2020-06-08 13:28:03.877614","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","message":"try to match requested path '/echo'"}}
{"timestamp":"2020-06-08 13:28:03.912894","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","message":"requested path matched"}}
{"timestamp":"2020-06-08 13:28:03.913440","level":"TRACE","name":"Application","pid":19,"msg":"","context":{"request_id":"e18f5a04-a98b-11ea-8a0a-0242ac110002","message":"call handler '1'"}}
{"timestamp":"2020-06-08 13:28:03.913912","level":"INFO","name":"Application","pid":19,"msg":"Reading dta","fileName":"dta"}

i.e. the call seems to get stuck at parsing the multipart body.
The container should have enough memory available. Any idea how to debug this? Is there some inherent limit I should be aware of? I would like to post files to the endpoint with max size of about 2 GB.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions