Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-9869: [R] Implement full S3FileSystem/S3Options constructor #8197

Closed
wants to merge 13 commits into from

Conversation

nealrichardson
Copy link
Member

@nealrichardson nealrichardson commented Sep 15, 2020

Also adds tests that run on minio, which I've tested locally, added documentation for how to run them, and added them to the macOS ("autobrew") nightly test run (because the Linux builds still don't have aws-sdk-cpp support).

@github-actions
Copy link

@nealrichardson nealrichardson force-pushed the r-s3-options branch 2 times, most recently from 5864fbf to 425923d Compare September 17, 2020 18:46
@nealrichardson
Copy link
Member Author

@github-actions crossbow submit homebrew-r-autobrew

@github-actions
Copy link

Revision: 425923d

Submitted crossbow builds: ursa-labs/crossbow @ actions-545

Task Status
homebrew-r-autobrew TravisCI

@nealrichardson
Copy link
Member Author

@github-actions crossbow submit homebrew-r-autobrew

@github-actions
Copy link

Revision: 2b89af1

Submitted crossbow builds: ursa-labs/crossbow @ actions-546

Task Status
homebrew-r-autobrew TravisCI

@nealrichardson
Copy link
Member Author

@github-actions crossbow submit homebrew-r-autobrew

@github-actions
Copy link

Revision: 51b93df

Submitted crossbow builds: ursa-labs/crossbow @ actions-547

Task Status
homebrew-r-autobrew TravisCI

@nealrichardson nealrichardson marked this pull request as ready for review September 17, 2020 21:34
/// If non-empty, override region with a connect string such as "localhost:9000"
SEXP endpoint_override = options["endpoint_override"];
if (!Rf_isNull(endpoint_override)) {
s3_opts.endpoint_override = cpp11::as_cpp<std::string>(endpoint_override);
Copy link
Member

@pitrou pitrou Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse get_optional_string here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I don't want to set it if the user doesn't provide it (i.e. if it is NULL). Or is empty string "" the same thing? s3fs.h doesn't give these fields default values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty string is implicitly the default value.

#' @param ... extra parameters passed to `read_feather()`.
#'
#' @return A `data.frame` if `as_data_frame` is `TRUE` (the default), or an
#' Arrow [Table] otherwise
#' @seealso [read_feather()] for writing IPC files. [RecordBatchReader] for a
#' lower-level interface.
#' @export
read_ipc_stream <- function(file, as_data_frame = TRUE, ...) {
read_ipc_stream <- function(file, as_data_frame = TRUE, filesystem = NULL, ...) {
if (!inherits(file, "InputStream")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of this happening. Does this really need file= and filesystem. Can this be something like:

read_ipc_stream(filesystem$open(file))

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, it's spelled filesystem$OpenInputStream(file) if it's for reading, or filesystem$OpenOutputStream(file) for writing. I resisted adding the extra argument, but I thought it was less bad than requiring users to learn those.

But we could think of other solutions. One possibility is having some sort of FileLocator object that is filesystem + path; we've started exploring that in places in the C++ library, and I've been looking into similar things for ARROW-9870. Maybe we could revisit this question in that PR (next week-ish)?

@nealrichardson
Copy link
Member Author

I'm going to merge this and we can follow up with UX questions in the next PR.

@nealrichardson nealrichardson deleted the r-s3-options branch September 25, 2020 18:14
emkornfield pushed a commit to emkornfield/arrow that referenced this pull request Oct 16, 2020
Also adds tests that run on minio, which I've tested locally, added documentation for how to run them, and added them to the macOS ("autobrew") nightly test run (because the Linux builds still don't have aws-sdk-cpp support).

Closes apache#8197 from nealrichardson/r-s3-options

Authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
Also adds tests that run on minio, which I've tested locally, added documentation for how to run them, and added them to the macOS ("autobrew") nightly test run (because the Linux builds still don't have aws-sdk-cpp support).

Closes apache#8197 from nealrichardson/r-s3-options

Authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants