wr.athena.create_athena_bucket() sometimes does not create the S3 query-results/staging bucket #735

Closed
cfregly opened this issue Jun 4, 2021 · 5 comments · Fixed by #738
Labels: bug (Something isn't working), minor release (Will be addressed in the next minor release), ready to release

@cfregly

cfregly commented Jun 4, 2021

version

awswrangler==2.7.0

code

wr.athena.create_athena_bucket()

expected bucket

s3://aws-athena-query-results-ACCOUNT-REGION/

No bucket is created, and subsequent calls to wr.athena.read_sql_query() fail with a "The specified bucket does not exist: NoSuchBucket" exception.

For now, we are replacing the create_athena_bucket() call with an equivalent boto3 S3 create_bucket API call.
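A minimal sketch of what such a replacement could look like, assuming the bucket name follows the aws-athena-query-results-ACCOUNT-REGION pattern shown above. The helper names are hypothetical, and the S3 client is passed in rather than created here, so the snippet does not depend on credentials. The us-east-1 branch is needed because S3 rejects a LocationConstraint of us-east-1.

```python
from typing import Any, Dict


def athena_bucket_name(account_id: str, region: str) -> str:
    """Build the results bucket name using the pattern quoted in this issue."""
    return f"aws-athena-query-results-{account_id}-{region}"


def create_bucket_kwargs(bucket: str, region: str) -> Dict[str, Any]:
    """Arguments for S3 create_bucket; us-east-1 must not send a LocationConstraint."""
    kwargs: Dict[str, Any] = {"Bucket": bucket}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_athena_results_bucket(s3_client: Any, account_id: str, region: str) -> str:
    """Create the bucket through the given S3 client and return its s3:// path."""
    bucket = athena_bucket_name(account_id, region)
    s3_client.create_bucket(**create_bucket_kwargs(bucket, region))
    return f"s3://{bucket}/"
```

With boto3, the client would typically come from `boto3.client("s3", region_name=region)` and the account ID from `boto3.client("sts").get_caller_identity()["Account"]`.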

cfregly added the bug label Jun 4, 2021
@cfregly
Author

cfregly commented Jun 4, 2021

Note that this is difficult to reproduce, but it happens when we run large workshops with hundreds or thousands of users, each running in a separate AWS account (not a single shared account).

@cfregly
Author

cfregly commented Jun 4, 2021

The specific ask is to verify that the bucket has been created before returning from that call. Perhaps an exponential backoff retry mechanism would be useful here? https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html#adaptive-retry-mode
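To make the ask concrete, here is a rough sketch of verifying the bucket with exponential backoff. The function names, attempt count, and delays are illustrative assumptions, not anything Wrangler ships. It polls explicitly with head_bucket rather than relying on the client retry modes described in the linked guide.

```python
import time
from typing import Any, Callable


def wait_until_true(
    check: Callable[[], bool],
    max_attempts: int = 6,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() with exponentially growing delays; return True once it passes."""
    for attempt in range(max_attempts):
        if check():
            return True
        if attempt < max_attempts - 1:
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return False


def bucket_exists(s3_client: Any, bucket: str) -> bool:
    """Return True if head_bucket succeeds for the given bucket."""
    try:
        s3_client.head_bucket(Bucket=bucket)
        return True
    except Exception:
        # Broad on purpose in this sketch; real code would narrow this
        # to botocore.exceptions.ClientError and inspect the error code.
        return False
```

A caller could then run `wait_until_true(lambda: bucket_exists(s3_client, bucket))` right after the create call and raise if it returns False.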

@jaidisido
Contributor

jaidisido commented Jun 7, 2021

There are two difficulties here: 1) the issue is not easy to reproduce, as it seems to be a rare occurrence, and 2) the API call throws no error when the operation is unsuccessful.

One potential workaround is to leverage the wait_until_exists method of an S3 resource (see #738). It would poll every 5 seconds and check whether the bucket has been created. This way you can at least capture the error when the waiter call fails:

botocore.exceptions.WaiterError: Waiter BucketExists failed: Max attempts exceeded. Previously accepted state: Matched expected HTTP status code: 404
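For illustration, the same wait can be sketched with the client-level bucket_exists waiter, which is what the resource's wait_until_exists calls under the hood. The wrapper name is hypothetical. The defaults mirror the documented waiter behaviour of polling head_bucket every 5 seconds for up to 20 attempts. This is not necessarily how #738 implements it.

```python
from typing import Any


def wait_for_bucket(
    s3_client: Any, bucket: str, delay: int = 5, max_attempts: int = 20
) -> None:
    """Block until the bucket exists; the boto3 waiter raises WaiterError otherwise."""
    waiter = s3_client.get_waiter("bucket_exists")
    waiter.wait(
        Bucket=bucket,
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
```

With a real client from `boto3.client("s3")`, a bucket that never appears produces the `botocore.exceptions.WaiterError` shown above.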

@cfregly
Author

cfregly commented Jun 7, 2021

Nice, yes, this seems reasonable. Are you suggesting that you would put this within the create_athena_bucket() call? Or are you suggesting that we should do this ourselves as callers of this API?

@jaidisido
Contributor

Wrangler will call the wait_until_exists method. However, it will be up to the user to handle any error that ensues.
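One way a caller might handle that error is to retry the whole call a few times and re-raise if it keeps failing. This is only a sketch: the helper name and retry count are assumptions, and the callable and exception type are passed in so the snippet stays independent of awswrangler and botocore.

```python
from typing import Callable, Type


def create_with_retries(
    create_bucket: Callable[[], str],
    error_type: Type[Exception],
    max_tries: int = 3,
) -> str:
    """Call create_bucket(), retrying on error_type and re-raising after max_tries."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return create_bucket()
        except error_type:
            if attempt == max_tries:
                raise
    raise AssertionError("unreachable")  # the loop always returns or raises
```

In a workshop notebook this might be invoked as `create_with_retries(wr.athena.create_athena_bucket, WaiterError)`, with `WaiterError` imported from `botocore.exceptions`.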

jaidisido added the minor release and ready to release labels Jun 11, 2021
jaidisido linked a pull request Jun 11, 2021 that will close this issue
jaidisido added this to the 2.9.0 milestone Jun 11, 2021