The AWS Access Key Id you provided does not exist in our records #85

Closed
mnunes opened this issue Jun 6, 2017 · 3 comments

mnunes commented Jun 6, 2017

I'm trying to run the very first example given on your main page: https://github.com/datasciencebr/serenata-toolbox

from serenata_toolbox.datasets import Datasets
datasets = Datasets('/tmp/serenata-data/')

# now lets see what datasets are available
for dataset in datasets.remote.all:
    print(dataset)  # and you'll see a long list of datasets!

I changed the datasets folder to an existing folder on my computer, and creating the Datasets object works. However, when I try to run the loop above, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/marcus/Documents/Research/Big Data/Serenata de Amor/serenata-toolbox/serenata_toolbox/datasets/remote.py", line 74, in all
    response =  self.s3.list_objects(Bucket=self.bucket)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 557, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the ListObjects operation: The AWS Access Key Id you provided does not exist in our records.

It seems I need an AWS Access Key. However, I am using the config.ini file you provided:

[Amazon]
Bucket: serenata-de-amor-data
AccessKey: YOUR_ACCESS_KEY
Region: sa-east-1
SecretKey: YOUR_SECRET_KEY

Since you said "If you don't plan to upload anything to S3 please don't bother about keys and secrets in this file.", I didn't look for these credentials.

Anyway, how can I download data to my computer using this toolbox? What am I doing wrong? I am not very familiar with Python, so maybe this is a very simple question.

@jtemporal
Collaborator

Hi @mnunes, your question tells me there is a flaw in the documentation, so thanks for that (I'll open an issue so we can fix it later). Now, about your questions:

from serenata_toolbox.datasets import Datasets
datasets = Datasets('/tmp/serenata-data/')

# now lets see what datasets are available
for dataset in datasets.remote.all:
    print(dataset)  # and you'll see a long list of datasets!

The code above only prints the names of all the files we have backed up in our S3 bucket. Listing them requires valid AWS keys, so without a key you won't be able to list them.
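
As a rough illustration of the point about keys, here is a minimal sketch (not part of serenata-toolbox) that checks whether a config.ini like the one quoted earlier in this thread still holds the placeholder values. The function name `has_real_credentials` and the `PLACEHOLDERS` set are hypothetical, and it assumes the file keeps the `[Amazon]` section with `AccessKey` and `SecretKey` entries shown in the question. If it returns False, listing remote files will most likely fail with the InvalidAccessKeyId error above.

```python
import configparser

# Placeholder values shipped in the sample config.ini quoted in this thread.
PLACEHOLDERS = {"YOUR_ACCESS_KEY", "YOUR_SECRET_KEY"}


def has_real_credentials(config_text):
    """Return True only if both keys in the [Amazon] section are filled in.

    Hypothetical helper for illustration; serenata-toolbox does not provide it.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section("Amazon"):
        return False
    access_key = parser.get("Amazon", "AccessKey", fallback="").strip()
    secret_key = parser.get("Amazon", "SecretKey", fallback="").strip()
    if not access_key or not secret_key:
        return False
    # Reject the untouched template values.
    return not ({access_key, secret_key} & PLACEHOLDERS)


SAMPLE = """
[Amazon]
Bucket: serenata-de-amor-data
AccessKey: YOUR_ACCESS_KEY
Region: sa-east-1
SecretKey: YOUR_SECRET_KEY
"""

print(has_real_credentials(SAMPLE))  # False: the template keys are still there
```

To check an actual file, its contents can be read with `open('config.ini').read()` and passed to the same function.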

You can download the data by using either:

latest = list(datasets.downloader.LATEST)
datasets.downloader.download(latest)

or, if you know what file you want to download you can use:

datasets.downloader.download('2016-12-06-reimbursements.xz')

You can check out a list of the most recent files here.

Could you try either of these and let me know here if it worked?

jtemporal mentioned this issue Jun 6, 2017
Author

mnunes commented Jun 6, 2017

Yep, it worked, @jtemporal . Thanks.

@jtemporal
Collaborator

@mnunes awesome!! I'm closing this one then ;)
