This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Caching of s3:// public resources fail without credentials #2929

Closed
OyvindTafjord opened this issue Jun 6, 2019 · 4 comments · Fixed by #2939

Comments

@OyvindTafjord
Contributor

Caching a resource with an s3:// URL (like s3://allennlp/models/nlvr-erm-model-2018-12-18.tar.gz) fails without appropriate AWS credentials, even though the resource is public and fetching it through the equivalent https:// URL (like https://allennlp.s3.amazonaws.com/models/nlvr-erm-model-2018-12-18.tar.gz) works.

The stack trace points to the s3_etag function as at least one culprit. This can be a problem if, say, a model is accidentally published with such references inside: it will work fine locally, but will then fail on systems without the right credentials.
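
As a stopgap, the s3:// reference can be rewritten to the equivalent https:// form before caching. The helper below is only a sketch: the name s3_to_https is hypothetical (not part of allennlp), and it assumes the bucket is publicly readable and reachable at the bucket.s3.amazonaws.com host shown in the URLs above.

```python
from urllib.parse import urlparse


def s3_to_https(url: str) -> str:
    """Rewrite s3://bucket/key as https://bucket.s3.amazonaws.com/key.

    Hypothetical helper for illustration; assumes a publicly readable bucket.
    """
    parsed = urlparse(url)
    # Require the s3 scheme, a bucket name, and a non-empty object key.
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid s3:// URL: {url}")
    bucket, key = parsed.netloc, parsed.path.lstrip("/")
    return f"https://{bucket}.s3.amazonaws.com/{key}"
```

The result can then be passed to cached_path like any other https:// URL, which avoids the boto3 code path entirely.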

With credentials:

>>> from allennlp.common.file_utils import cached_path
>>> cached_path("https://allennlp.s3.amazonaws.com/models/nlvr-erm-model-2018-12-18.tar.gz")
'/Users/tafjord/.allennlp/cache/8929ad3fd9879baf5ed801677043e6f872eeca71138a8110464083d7d1078a85.7c013018695dc0f4edca076a710c3b47e1245492a2a11afb8351a75eb1cd760a'
>>> cached_path("s3://allennlp/models/nlvr-erm-model-2018-12-18.tar.gz")
'/Users/tafjord/.allennlp/cache/ab5daf7104f2249fe550c00ea91fd2c7fb5abdc7dc25b95c1240eb26135df722.7c013018695dc0f4edca076a710c3b47e1245492a2a11afb8351a75eb1cd760a'

Without credentials:

>>> from allennlp.common.file_utils import cached_path
>>> cached_path("https://allennlp.s3.amazonaws.com/models/nlvr-erm-model-2018-12-18.tar.gz")
'/home/oyvindt/.allennlp/cache/8929ad3fd9879baf5ed801677043e6f872eeca71138a8110464083d7d1078a85.7c013018695dc0f4edca076a710c3b47e1245492a2a11afb8351a75eb1cd760a'
>>> cached_path("s3://allennlp/models/nlvr-erm-model-2018-12-18.tar.gz")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/oyvindt/allennlp/allennlp/common/file_utils.py", line 97, in cached_path
    return get_from_cache(url_or_filename, cache_dir)
  File "/home/oyvindt/allennlp/allennlp/common/file_utils.py", line 183, in get_from_cache
    etag = s3_etag(url)
  File "/home/oyvindt/allennlp/allennlp/common/file_utils.py", line 131, in wrapper
    return func(url, *args, **kwargs)
  File "/home/oyvindt/allennlp/allennlp/common/file_utils.py", line 147, in s3_etag
    return s3_object.e_tag
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/boto3/resources/factory.py", line 339, in property_loader
    self.load()
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/boto3/resources/factory.py", line 505, in do_action
    response = action(self, *args, **kwargs)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(**params)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/client.py", line 648, in _make_api_call
    operation_model, request_dict, request_context)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/client.py", line 667, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/endpoint.py", line 116, in create_request
    operation_name=operation_model.name)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/signers.py", line 157, in sign
    auth.add_auth(request)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/auth.py", line 425, in add_auth
    super(S3SigV4Auth, self).add_auth(request)
  File "/home/oyvindt/miniconda3/envs/allennlp/lib/python3.6/site-packages/botocore/auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials

@kl2806
Contributor

kl2806 commented Jun 7, 2019

Did you run into this with one of our models? Or with your own code?

@OyvindTafjord
Contributor Author

I ran into it on a (non-published) model shared with me, and things worked fine locally, but then broke when running in beaker.

@epwalsh
Member

epwalsh commented Jun 10, 2019

I just reproduced this exception as well. Maybe we could add a check for credentials, and if no credentials are found, we fall back to unsigned requests: https://stackoverflow.com/questions/34865927/can-i-use-boto3-anonymously
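
A rough sketch of that fallback, assuming boto3 and botocore are installed. The helper names get_s3_resource and split_s3_path are illustrative, and s3_etag here is a simplified stand-in for the function in the traceback, not the actual code in file_utils.py.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_path(url: str) -> Tuple[str, str]:
    """Split s3://bucket/key into (bucket, key). Illustrative helper."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"bad s3 path: {url}")
    return parsed.netloc, parsed.path.lstrip("/")


def get_s3_resource():
    """Return an S3 resource, unsigned when no credentials can be found."""
    # Imported here so split_s3_path stays usable without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.client import Config

    session = boto3.Session()
    if session.get_credentials() is None:
        # Nothing found in the credential chain: skip request signing,
        # which is enough for publicly readable objects.
        return session.resource("s3", config=Config(signature_version=UNSIGNED))
    return session.resource("s3")


def s3_etag(url: str) -> str:
    """Look up the ETag of an S3 object (simplified stand-in)."""
    bucket_name, key = split_s3_path(url)
    s3_object = get_s3_resource().Object(bucket_name, key)
    return s3_object.e_tag
```

With credentials present the behavior is unchanged, so private buckets keep working; only the no-credentials case switches to anonymous access.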

@matt-gardner
Contributor

@epwalsh, that sounds like a reasonable solution to me. PR welcome.
