Final doc/news tweaks
nealrichardson committed Sep 23, 2020
1 parent d5b6dde commit 1e14dae
Showing 5 changed files with 16 additions and 9 deletions.
2 changes: 1 addition & 1 deletion r/NEWS.md
@@ -28,7 +28,7 @@
## AWS S3 support

* S3 support is now enabled in binary macOS and Windows (Rtools40 only, i.e. R >= 4.0) packages. To enable it on Linux, you will need to build and install `aws-sdk-cpp` from source, then set the environment variable `EXTRA_CMAKE_FLAGS="-DARROW_S3=ON -DAWSSDK_SOURCE=SYSTEM"` prior to building the R package (with bundled C++ build, not with Arrow system libraries) from source.
-* File readers and writers (`read_parquet()`, `write_feather()`, et al.) now accept an `s3://` URI as the source or destination file, as do `open_dataset()` and `write_dataset()`. See `vignette("fs", package = "arrow")` for details.
+* File readers and writers (`read_parquet()`, `write_feather()`, et al.), as well as `open_dataset()` and `write_dataset()`, allow you to access resources on S3 (or on file systems that emulate S3) either by providing an `s3://` URI or by passing an additional `filesystem` argument. See `vignette("fs", package = "arrow")` for details.
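
As a rough illustration of the two access styles the revised entry describes, the sketch below defines one helper that reads a Parquet file through an `s3://` URI and one that passes a `filesystem` argument instead. The bucket name comes from the vignette's `ursa-labs-taxi-data` example; the object key `2019/06/data.parquet` and the region `us-east-2` are assumptions for illustration. Whether either call succeeds depends on your credentials and region configuration.

```r
# Sketch only: both helpers need the arrow package built with S3 support
# and network access, so they are defined here but not called.

# Assumed object key and region; adjust these for your own bucket.
taxi_key <- "ursa-labs-taxi-data/2019/06/data.parquet"
taxi_region <- "us-east-2"

# Style 1: pass an s3:// URI directly to the reader.
read_taxi_by_uri <- function() {
  arrow::read_parquet(paste0("s3://", taxi_key))
}

# Style 2: create the file system once and pass it via `filesystem`,
# giving the path as "bucket/key" without the URI scheme.
read_taxi_by_filesystem <- function() {
  fs <- arrow::S3FileSystem$create(region = taxi_region)
  arrow::read_parquet(taxi_key, filesystem = fs)
}
```

The second style is the one to reach for when an option cannot be expressed in the URI.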

## Computation

6 changes: 4 additions & 2 deletions r/R/filesystem.R
@@ -138,15 +138,17 @@ FileSelector$create <- function(base_dir, allow_not_found = FALSE, recursive = F
#' AWS configuration set at the environment level.
#' - `session_token`: optional string for authentication along with
#' `access_key` and `secret_key`
-#' - `role_arn`: string AWS Role ARN. If provided instead of `access_key` and
+#' - `role_arn`: string AWS ARN of an AccessRole. If provided instead of `access_key` and
#' `secret_key`, temporary credentials will be fetched by assuming this role.
#' - `session_name`: optional string identifier for the assumed role session.
#' - `external_id`: optional unique string identifier that might be required
#' when you assume a role in another account.
#' - `load_frequency`: integer, frequency (in seconds) with which temporary
#' credentials from an assumed role session will be refreshed. Default is
#' 900 (i.e. 15 minutes)
-#' - `region`: AWS region to connect to (default "us-east-1")
+#' - `region`: AWS region to connect to. If omitted, the AWS library will
+#' provide a sensible default based on client configuration, falling back
+#' to "us-east-1" if no other alternatives are found.
#' - `endpoint_override`: If non-empty, override region with a connect string
#' such as "localhost:9000". This is useful for connecting to file systems
#' that emulate S3.
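
A minimal sketch of how the `region` and `endpoint_override` options documented above might be combined when connecting to an S3-compatible server. The credentials are placeholders, and `scheme = "http"` is an option not shown in this hunk that is assumed here to suit a local server without TLS; `localhost:9000` is the connect string from the documentation text.

```r
# Sketch only: requires arrow with S3 support and a reachable S3-compatible
# server, so the constructor is wrapped in a function and not called here.

# Placeholder credentials and an assumed plain-HTTP local endpoint.
emulator_opts <- list(
  access_key = "my-access-key",
  secret_key = "my-secret-key",
  scheme = "http",                      # assumed: no TLS on the local server
  endpoint_override = "localhost:9000"  # connect string from the docs above
)

connect_emulator <- function(opts = emulator_opts) {
  # `region` is deliberately left out, so the default described in the
  # documentation above applies.
  do.call(arrow::S3FileSystem$create, opts)
}
```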
6 changes: 4 additions & 2 deletions r/man/FileSystem.Rd

Generated file; diff not rendered.

2 changes: 1 addition & 1 deletion r/src/filesystem.cpp
@@ -274,7 +274,7 @@ std::shared_ptr<fs::S3FileSystem> fs___S3FileSystem__create(
}

// Now handle the rest of the options
-/// AWS region to connect to (default "us-east-1")
+/// AWS region to connect to (default determined by AWS SDK)
if (region != "") {
s3_opts.region = region;
}
9 changes: 6 additions & 3 deletions r/vignettes/fs.Rmd
@@ -47,8 +47,8 @@ the cost of reading the data over the network should be much lower.

Another way to connect to S3 is to create an `S3FileSystem` object once and pass
that to the read/write functions. This may be a convenience when dealing with
-long URIs, and it's necessary for some options that aren't supported in the URI
-format.
+long URIs, and it's necessary for some options and authentication methods
+that aren't supported in the URI format.

In the previous example, this would look like:

@@ -82,7 +82,7 @@ bucket <- SubTreeFileSystem$create("s3://ursa-labs-taxi-data")

## Authentication

-To access private S3 buckets, you need two secret parameters:
+To access private S3 buckets, you typically need two secret parameters:
an `access_key`, which is like a user id,
and `secret_key`, like a token.
There are a few options for passing these credentials:
@@ -95,6 +95,9 @@ There are a few options for passing these credentials:

4. Define them in a `~/.aws/credentials` file, according to the [AWS documentation](https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/credentials.html).

+You can also use an [AccessRole](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html)
+for temporary access by passing the `role_arn` identifier to `S3FileSystem$create()`.
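
A minimal sketch of the role-based option, assuming the `role_arn`, `session_name`, and `load_frequency` arguments documented for `S3FileSystem$create()`. The ARN and session name below are hypothetical placeholders, not real identifiers.

```r
# Sketch only: assuming a role needs arrow with S3 support plus AWS
# permissions, so the constructor is wrapped in a function and not called.

# Hypothetical role ARN and session name; replace with real values.
role_opts <- list(
  role_arn = "arn:aws:iam::123456789012:role/example-read-only",
  session_name = "arrow-r-session",
  load_frequency = 900L  # refresh temporary credentials every 15 minutes
)

connect_with_role <- function(opts = role_opts) {
  # No access_key or secret_key is passed: the role supplies temporary
  # credentials instead.
  do.call(arrow::S3FileSystem$create, opts)
}
```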

## File systems that emulate S3

The `S3FileSystem` machinery enables you to work with any file system that