Describe the bug
Using named profiles along with credential_process appears to be incompatible with the DefaultAWSCredentialsProviderChain class.
Expected Behavior
Named AWS profiles should work with credential_process.
Current Behavior
While using the PyArrow library, I found that authenticating with credential_process fails only when a named AWS profile is used. PyArrow simply calls DefaultAWSCredentialsProviderChain, so the issue most likely lies in the AWS SDK itself.
In short, suppose that I have a config file as follows:
[default]
region = us-east-1
credential_process = /path/to/get-creds.sh
[dev]
region = us-east-1
credential_process = /path/to/get-creds.sh
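For reference, the AWS documentation for credential_process says the command must print a JSON object to stdout with `Version` set to 1 plus `AccessKeyId` and `SecretAccessKey`, and optionally `SessionToken` and `Expiration`. The contents of get-creds.sh are not shown in this report, so the following is only a sketch of a sanity check for its output; the helper name `check_credential_process_output` is hypothetical.

```python
import json

# Keys the credential_process payload is documented to require.
REQUIRED_KEYS = ("Version", "AccessKeyId", "SecretAccessKey")


def check_credential_process_output(stdout_text):
    """Return a list of problems found in credential_process output.

    An empty list means the payload has the documented required fields.
    """
    try:
        payload = json.loads(stdout_text)
    except json.JSONDecodeError as exc:
        return [f"output is not valid JSON: {exc}"]
    if not isinstance(payload, dict):
        return ["output must be a JSON object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
    if "Version" in payload and payload["Version"] != 1:
        problems.append(f"unsupported Version: {payload['Version']!r}")
    return problems
```

Running /path/to/get-creds.sh by hand and passing its stdout to this function would rule out a malformed payload as the cause.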
Both of these commands work, which confirms that both profiles have access to S3:
aws s3 ls s3://bucket/path --profile default
aws s3 ls s3://bucket/path --profile dev
Since PyArrow uses DefaultAWSCredentialsProviderChain under the hood, we should be able to set the AWS_PROFILE environment variable to control which profile is used. Let's use a simple script to test the behavior:
# script.py
import pyarrow.dataset as ds
dataset = ds.dataset("s3://bucket/path/") # authenticates with `DefaultAWSCredentialsProviderChain`
Now, here's the strangeness:
# Check that no environment variables are set in this shell:
env | grep AWS
# This works:
python ./script.py
# So does this:
AWS_PROFILE=default python ./script.py
# But this fails:
AWS_PROFILE=dev python ./script.py
The error is:
Traceback (most recent call last):
  File "./script.py", line 3, in <module>
    dataset = ds.dataset("s3://bucket/path/")
  File "/home/antonstv/miniconda3/envs/pdna/lib/python3.8/site-packages/pyarrow/dataset.py", line 763, in dataset
    return _filesystem_dataset(source, **kwargs)
  File "/home/antonstv/miniconda3/envs/pdna/lib/python3.8/site-packages/pyarrow/dataset.py", line 446, in _filesystem_dataset
    fs, paths_or_selector = _ensure_single_source(source, filesystem)
  File "/home/antonstv/miniconda3/envs/pdna/lib/python3.8/site-packages/pyarrow/dataset.py", line 413, in _ensure_single_source
    file_info = filesystem.get_file_info(path)
  File "pyarrow/_fs.pyx", line 571, in pyarrow._fs.FileSystem.get_file_info
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: When getting information for key 'path' in bucket 'bucket': AWS Error ACCESS_DENIED during HeadObject operation: No response body.
I've also confirmed that the above script works if, instead of credential_process, I use static credentials generated by the same credential process.
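The static-credentials check can also be done without editing the config file, by running the credential process manually and handing the result to PyArrow. This is a sketch of a workaround, not a fix: it assumes the script prints the JSON payload that credential_process is documented to produce, and the helper names `s3_kwargs_from_payload` and `open_dataset` are hypothetical. `pyarrow.fs.S3FileSystem` accepts `access_key`, `secret_key`, `session_token`, and `region`.

```python
import json
import shlex
import subprocess


def s3_kwargs_from_payload(payload, region="us-east-1"):
    """Map a credential_process JSON payload to S3FileSystem keyword arguments."""
    return {
        "access_key": payload["AccessKeyId"],
        "secret_key": payload["SecretAccessKey"],
        # SessionToken is optional in the payload; None means "no token".
        "session_token": payload.get("SessionToken"),
        "region": region,
    }


def open_dataset(path, command="/path/to/get-creds.sh"):
    """Run the credential process and open the dataset with explicit credentials."""
    # Imported lazily so the helper above can be used without PyArrow installed.
    import pyarrow.dataset as ds
    from pyarrow import fs

    result = subprocess.run(
        shlex.split(command), capture_output=True, text=True, check=True
    )
    s3 = fs.S3FileSystem(**s3_kwargs_from_payload(json.loads(result.stdout)))
    # With an explicit filesystem, the path is given without the "s3://" scheme.
    return ds.dataset(path, filesystem=s3)
```

Calling `open_dataset("bucket/path/")` bypasses the default provider chain entirely. Credentials obtained this way are not refreshed automatically, so this only suits short-lived scripts.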
Reproduction Steps
See above.
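The three invocations above can be condensed into one harness so that each run starts from the same clean environment. This is a minimal sketch: the names `env_for_profile` and `run_all` are hypothetical, and removing every `AWS_`-prefixed variable before setting AWS_PROFILE is an assumption intended to mirror the `env | grep AWS` check.

```python
import os
import subprocess
import sys

# None means "leave AWS_PROFILE unset", matching the first invocation.
PROFILES = (None, "default", "dev")


def env_for_profile(profile, base_env=None):
    """Copy the environment, drop every AWS_* variable, then set AWS_PROFILE."""
    source = os.environ if base_env is None else base_env
    env = {k: v for k, v in source.items() if not k.startswith("AWS_")}
    if profile is not None:
        env["AWS_PROFILE"] = profile
    return env


def run_all(script="./script.py"):
    """Run the script once per profile and return each run's exit code."""
    results = {}
    for profile in PROFILES:
        proc = subprocess.run(
            [sys.executable, script],
            env=env_for_profile(profile),
            capture_output=True,
            text=True,
        )
        results[profile] = proc.returncode
    return results
```

Calling `run_all()` from the directory containing script.py returns a mapping from profile to exit code. Based on the behavior described above, a nonzero exit code would be expected only for `"dev"`.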
Possible Solution
No response
Additional Information/Context
No response
AWS CPP SDK version used
I'm unsure; pyarrow is 12.0.1
Compiler and Version used
I'm unsure
Operating System and version
I'm unsure