[C++] Writing to AWS S3 fails when using aws sso (13.0.0) #38013

Open
jayceslesar opened this issue Oct 4, 2023 · 6 comments


jayceslesar commented Oct 4, 2023

Describe the bug, including details regarding any error messages, version, and platform.

My org recently switched to AWS SSO (aws sso) to configure all of our access, and after the switch we noticed that we can no longer write to S3 with these credentials. When AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID are set explicitly from our previous IAM user accounts, the problem does not occur. For reference, after authenticating with aws sso login and attempting to write some data, I get the following error:

  File "pyarrow/_dataset.pyx", line 3844, in pyarrow._dataset._filesystemdataset_write
  File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: When testing for existence of bucket 'my-s3-bucket': AWS Error ACCESS_DENIED during HeadBucket operation: No response body.

Note also that eval $(aws configure export-credentials --format env) does not seem to work either: it exports the AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID from the SSO login, and with those set the same error is raised.

I have confirmed that this is not an issue in 12.0.1, only 13.0.0

Component(s)

Parquet

@jayceslesar jayceslesar changed the title Writing to AWS S3 fails when using aws sso Writing to AWS S3 fails when using aws sso (13.0.0) Oct 4, 2023
@kou kou changed the title Writing to AWS S3 fails when using aws sso (13.0.0) [C++] Writing to AWS S3 fails when using aws sso (13.0.0) Oct 4, 2023

rdbisme commented Oct 16, 2023

Wrapping the call around fsspec should work. But it's potentially slower and more memory hungry.

jayceslesar (Author) commented

> Wrapping the call around fsspec should work. But it's potentially slower and more memory hungry.

@rdbisme can you share what you mean with an example?


rdbisme commented Oct 19, 2023

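import fsspec  # the s3:// protocol also requires s3fs to be installed

# df is an existing pandas DataFrame
s3_path = "s3://my-bucket/path"
with fsspec.open(s3_path, "wb") as file_stream:
    df.to_parquet(file_stream)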

should work (didn't test, but I think it does). @jayceslesar let me know!

jayceslesar (Author) commented

Thanks @rdbisme! It also looks like explicitly passing in an initialized s3fs.S3FileSystem instance works as well.
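A minimal sketch of that variant, assuming s3fs and pyarrow are installed and that s3fs can resolve the SSO session through its own (botocore-based) credential lookup. The helper names `strip_scheme` and `write_with_s3fs` are hypothetical and not part of either library, and the exact call used in the original report is not shown in this thread, so `pyarrow.dataset.write_dataset` is an assumption based on the traceback.

```python
def strip_scheme(uri: str) -> str:
    """Return 'bucket/key' from 's3://bucket/key' (the fsspec-style path form)."""
    prefix = "s3://"
    return uri[len(prefix):] if uri.startswith(prefix) else uri


def write_with_s3fs(table, uri: str) -> None:
    """Write a pyarrow Table as a Parquet dataset through s3fs
    instead of Arrow's native C++ S3 filesystem."""
    # Imported lazily so strip_scheme stays usable without these packages.
    import pyarrow.dataset as ds
    import s3fs

    # Credentials are resolved by s3fs, not by Arrow's own S3 client.
    fs = s3fs.S3FileSystem()
    ds.write_dataset(table, strip_scheme(uri), format="parquet", filesystem=fs)
```

Calling `write_with_s3fs(table, "s3://my-bucket/path")` after `aws sso login` would then route the write through s3fs. As noted earlier in the thread, this path may be slower than the native filesystem.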

drauschenbach commented

C++ hack:

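$ eval $(aws configure export-credentials --format env)

#include <cstdlib>
#include <string>
#include <arrow/filesystem/s3fs.h>
// std::getenv returns nullptr for an unset variable, so fall back to "" before building a std::string.
auto env = [](const char* name) { const char* v = std::getenv(name); return std::string(v ? v : ""); };
std::string access_key = env("AWS_ACCESS_KEY_ID");
std::string secret_key = env("AWS_SECRET_ACCESS_KEY");
std::string session_token = env("AWS_SESSION_TOKEN");

arrow::fs::S3Options options;
options.ConfigureAccessKey(access_key, secret_key, session_token);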
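For Python users, a rough equivalent of this workaround is to hand the temporary SSO credentials, including the session token, to pyarrow.fs.S3FileSystem directly. The sketch below assumes AWS CLI v2 is on PATH and that `aws configure export-credentials --format process` prints a JSON document with AccessKeyId, SecretAccessKey and SessionToken keys. The helper names are hypothetical, and nobody in this thread has confirmed that this avoids the 13.0.0 failure.

```python
import json
import subprocess


def parse_exported_credentials(raw: str) -> dict:
    """Map the CLI's JSON credential document to S3FileSystem keyword arguments."""
    doc = json.loads(raw)
    return {
        "access_key": doc["AccessKeyId"],
        "secret_key": doc["SecretAccessKey"],
        # SSO credentials are temporary, so the session token must
        # accompany the key pair.
        "session_token": doc.get("SessionToken"),
    }


def s3_filesystem_from_sso(region=None):
    """Build an Arrow S3 filesystem from the current SSO session
    (requires a prior `aws sso login`)."""
    from pyarrow import fs  # imported lazily; only needed when connecting

    raw = subprocess.run(
        ["aws", "configure", "export-credentials", "--format", "process"],
        check=True, capture_output=True, text=True,
    ).stdout
    return fs.S3FileSystem(region=region, **parse_exported_credentials(raw))
```

The returned object can be passed as the filesystem argument of the dataset write call. Because the credentials are temporary, the filesystem has to be rebuilt once the SSO session expires.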

pitrou (Member) commented May 2, 2024

As a starting point, someone should post in that issue 1) what they are doing precisely 2) how it can/should be made to work. Currently the discussion seems to assume Arrow developers are S3 experts, which they are not.
