[R] reading data from S3 #30183

Closed
asfimport opened this issue Nov 9, 2021 · 3 comments

Comments

@asfimport

I am trying to read data directly from S3. My pipeline runs behind a proxy, so I have set the environment variables HTTP_PROXY and HTTPS_PROXY. I have also set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. I constructed the following URIs and bucket object for reading:

s3_uri <- paste0("s3://", accessKeyId, ":", URLencode(secretAccessKey, reserved = TRUE), "@", bucketName, "/", path)
s3_uri_short <- paste0("s3://", bucketName, "/", path)
bucketS3 <- arrow::s3_bucket(bucket = bucketName, access_key = accessKeyId,
                             secret_key = URLencode(secretAccessKey, reserved = TRUE))

But none of them worked with arrow::read_parquet:

df <- arrow::read_parquet(s3_uri)
df <- arrow::read_parquet(s3_uri_short) 
df <- arrow::read_parquet(bucketS3$path(x = path))

Error

 IOError: When reading information for key AWS Error [code 15]: No response body. with address : 

How can I fix the problem, or at least get more informative output? Thanks for your attention.

I am aware of what looks like the same issue, https://issues.apache.org/jira/browse/ARROW-12126, but I can't reopen it.

 

Reporter: Mikhail Tolmachev

PRs and other links:

Note: This issue was originally created as ARROW-14640. Please see the migration documentation for further details.

@asfimport

Nicola Crane / @thisisnic:
Hi [~2umxal], I need to confirm, but I think the issue is that the proxy settings need to be passed in to the call to arrow::s3_bucket(), and this has not been implemented yet. The error message you're seeing is the same one that is returned when the user supplies incorrect credentials, but I think here it's caused by the proxy settings not being set. I'll dig into this more and get back to you. Thanks for reporting this!

@asfimport

Dewey Dunnington / @paleolimbot:
As Nic noted, this is possible in Python, but we haven't implemented it in R yet. The relevant C++ definitions are S3ProxyOptions and S3Options. Working on this now!

@asfimport

Jonathan Keane / @jonkeane:
Issue resolved by pull request 11691
#11691
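
For readers landing here with the same error, the following is a minimal sketch of how the proxy might be passed explicitly once a version of arrow containing the fix is installed. It assumes that s3_bucket() forwards a proxy_options argument (a proxy URI string) to the underlying S3 filesystem; the thread itself does not show the final R interface, so check the documentation of your installed version. proxy_from_env() and read_parquet_via_proxy() are hypothetical helper names used only for illustration, not part of arrow.

```r
# Hypothetical helper: pick a proxy URI from the environment variables
# mentioned in the report. Returns NULL when neither variable is set.
proxy_from_env <- function() {
  proxy <- Sys.getenv("HTTPS_PROXY", unset = NA)
  if (is.na(proxy) || !nzchar(proxy)) {
    proxy <- Sys.getenv("HTTP_PROXY", unset = NA)
  }
  if (is.na(proxy) || !nzchar(proxy)) NULL else proxy
}

# Hypothetical wrapper: open the bucket through the proxy and read one file.
# ASSUMPTION: arrow::s3_bucket() accepts `proxy_options` in your arrow version.
read_parquet_via_proxy <- function(bucket_name, path, access_key, secret_key,
                                   proxy = proxy_from_env()) {
  args <- list(
    bucket = bucket_name,
    access_key = access_key,
    # Passed as-is here; to my understanding URL-encoding is only needed
    # when credentials are embedded in an s3:// URI.
    secret_key = secret_key
  )
  # Only add the argument when a proxy was found, so the library default
  # is left untouched otherwise.
  if (!is.null(proxy)) {
    args$proxy_options <- proxy
  }
  bucket <- do.call(arrow::s3_bucket, args)
  arrow::read_parquet(bucket$path(path))
}

# Usage, with the variable names from the original report:
# df <- read_parquet_via_proxy(bucketName, path, accessKeyId, secretAccessKey)
```

Passing the proxy explicitly rather than relying on HTTP_PROXY and HTTPS_PROXY follows from the diagnosis above: the environment variables alone did not reach the S3 client.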
