Using parquet parser seems to fail #854
Comments
What do you get with R's arrow::read_parquet?
dt <- arrow::read_parquet('payload.snappy.parquet')
That works as expected:
> dt <- arrow::read_parquet('/opt/ml/payload.snappy.parquet')
> class(dt)
# [1] "tbl_df" "tbl" "data.frame"
And then using
After some digging I think I found a fix. This seems to be a bug(?) in
parser_pq <- function(...) {
function(value, ...) {
tmp <- tempfile()
on.exit({
if (file.exists(tmp)) file.remove(tmp)
}, add = TRUE)
readr::write_file(value, tmp)
arrow::read_parquet(tmp)
}
}
register_parser("pq", parser_pq, fixed = "application/parquet")
#' Parse input and return prediction from model
#' @post /invocations
#' @parser pq
function(req) {
setDT(req$argsBody)
print(req$argsBody)
}
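The pattern in the parser above (write the raw request body to a temporary file, then hand the path to a reader) can be sketched without plumber or arrow. This is only a minimal illustration: the helper name read_raw_via_tempfile and the stand-in reader read_all_bytes are hypothetical, and base R's writeBin() is used here in place of readr::write_file() so the snippet has no dependencies.

```r
# Hypothetical helper (not part of plumber): write a raw vector to a temp file,
# pass the file path to `read_fn`, and remove the file when the function exits.
read_raw_via_tempfile <- function(value, read_fn) {
  stopifnot(is.raw(value), is.function(read_fn))
  tmp <- tempfile()
  on.exit(if (file.exists(tmp)) file.remove(tmp), add = TRUE)
  writeBin(value, tmp)  # binary-safe: bytes are written unchanged, including 0x00
  read_fn(tmp)
}

# Stand-in reader for illustration only; the parser above uses arrow::read_parquet here.
read_all_bytes <- function(path) readBin(path, what = "raw", n = file.size(path))

# "PAR1" followed by two arbitrary binary bytes, one of them a nul byte.
body <- as.raw(c(0x50, 0x41, 0x52, 0x31, 0x00, 0xff))
round_trip <- read_raw_via_tempfile(body, read_all_bytes)
identical(round_trip, body)
```

The point of the sketch is that the body is never converted to a character string on its way to disk, so nul bytes in the payload do not matter.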
I have not seen this type of error before with I am ok with
I can't reproduce this consistently; closing.
System details
Output of sessioninfo::session_info():
Example application or steps to reproduce the problem
This is an attempt to set up a container for use with SageMaker batch transforms. Based on the SageMaker examples, the payload will be sent as a POST request like:
curl -v -X POST --data-binary @payload.snappy.parquet -H "Content-Type: application/vnd.apache.parquet" http://localhost:8080/invocations --output payload.snappy.parquet.out
Then the plumber.R file looks something like this.
Describe the problem in detail
Based on #661, I was expecting the above to print a data.table object with the same structure as the parquet file, which has 5 rows and 12 columns and opens fine locally:
I looked at following some other tips on parsers for other content types, but kept running into a version of this embedded nul in string error when trying to process the binary data. Many of those examples are based on sending data using forms, which I don't think I can use here.
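For context, the embedded nul in string error is what base R raises when binary data containing 0x00 bytes is converted to a character string. The snippet below is a small sketch using a made-up six-byte payload (not the real parquet file) to show the error and a binary-safe alternative. Whether this is the exact code path that failed inside the parser is an assumption.

```r
# Made-up payload for illustration: the bytes "PAR1" followed by a nul byte and 0x01.
payload <- as.raw(c(0x50, 0x41, 0x52, 0x31, 0x00, 0x01))

# Converting raw bytes that contain 0x00 into a string fails in base R.
msg <- tryCatch(
  rawToChar(payload),
  error = function(e) conditionMessage(e)
)
print(msg)

# Writing and reading the same bytes in binary mode works, because no
# string conversion takes place.
tmp <- tempfile(fileext = ".bin")
writeBin(payload, tmp)
restored <- readBin(tmp, what = "raw", n = file.size(tmp))
unlink(tmp)
identical(restored, payload)
```

Any step that treats the request body as text before it reaches the parquet reader would be expected to hit this error, which is why keeping the body as a raw vector end to end matters.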