Exceptions coming from boto3/botocore when running boto3.client('sts') too many times simultaneously #1592

Open
moalexmonk opened this issue Jun 12, 2018 · 18 comments
Labels
enhancement This issue requests an improvement to a current feature. feature-request This issue requests a feature. p2 This is a standard priority issue sts

Comments

@moalexmonk
Contributor

import boto3, threading
for i in range(50):
    threading.Thread(target=lambda: boto3.client('sts')).start()

Running this, tested on my Windows 10 machine (boto3 version 1.5.31, botocore version 1.8.45) and also on an Amazon Linux EC2 instance, produces a bunch of exceptions like this:

Exception in thread Thread-20:
Traceback (most recent call last):
  File "C:\Program Files (x86)\Python35-32\lib\threading.py", line 914, in _bootstrap_inner
    self.run()
  File "C:\Program Files (x86)\Python35-32\lib\threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "<stdin>", line 2, in <lambda>
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\boto3\__init__.py", line 83, in client
    return _get_default_session().client(*args, **kwargs)
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\boto3\session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\botocore\session.py", line 850, in create_client
    credentials = self.get_credentials()
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\botocore\session.py", line 474, in get_credentials
    'credential_provider').load_credentials()
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\botocore\session.py", line 926, in get_component
    del self._deferred[name]
KeyError: 'credential_provider'

or:

Exception in thread Thread-24:
Traceback (most recent call last):
  File "C:\Program Files (x86)\Python35-32\lib\threading.py", line 914, in _bootstrap_inner
    self.run()
  File "C:\Program Files (x86)\Python35-32\lib\threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "<stdin>", line 2, in <lambda>
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\boto3\__init__.py", line 83, in client
    return _get_default_session().client(*args, **kwargs)
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\boto3\session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\botocore\session.py", line 851, in create_client
    endpoint_resolver = self.get_component('endpoint_resolver')
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\botocore\session.py", line 726, in get_component
    return self._components.get_component(name)
  File "C:\Users\alexander.monk\AppData\Roaming\Python\Python35\site-packages\botocore\session.py", line 926, in get_component
    del self._deferred[name]
KeyError: 'endpoint_resolver'

Normally there seem to be several credential_provider errors followed by several endpoint_resolver errors.
The chance of getting these exceptions seems to increase with the number of threads.

moalexmonk added a commit to moalexmonk/botocore that referenced this issue Jun 13, 2018
By locking around multiple operations on self._deferred

Fixes boto/boto3#1592
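
To illustrate the idea of locking around the operations on a deferred-component dict, here is a minimal sketch. The `LazyComponents` class and the `hammer` helper are hypothetical stand-ins written for this illustration; they only mimic the check-then-delete pattern visible in the tracebacks above and are not botocore's actual code or the actual patch.

```python
import threading


class LazyComponents:
    """Hypothetical lazy component registry (illustration only, not botocore)."""

    def __init__(self):
        self._components = {}
        self._deferred = {}
        # RLock so a factory may itself look up another component.
        self._lock = threading.RLock()

    def lazy_register(self, name, factory):
        with self._lock:
            self._deferred[name] = factory

    def get_component(self, name):
        # The membership check, the factory call, and the delete all happen
        # under one lock, so no second thread can reach `del` after the
        # first thread has already removed the key.
        with self._lock:
            if name in self._deferred:
                self._components[name] = self._deferred[name]()
                del self._deferred[name]
            return self._components[name]


def hammer(registry, name, n_threads=50):
    """Call get_component from many threads and return any exceptions raised."""
    errors = []

    def worker():
        try:
            registry.get_component(name)
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Without the lock, two threads can both pass the `name in self._deferred` check, and the slower one then fails on `del` with a KeyError, which matches the symptom reported here.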
@moalexmonk
Contributor Author

Workaround for this in the meantime: put a boto3.client('sts') call immediately before the for loop.

@jamesls
Member

jamesls commented Jun 19, 2018

You're correct about this behavior. Historically we've said that session methods aren't thread safe, but once you've created a client or resource, we guarantee that those calls are thread safe (http://boto3.readthedocs.io/en/latest/guide/resources.html#multithreading-multiprocessing).

While we can investigate changing that stance, I suspect it will take more work than just the component logic, though that's a good start and one of the most common contention points in sessions.

@jamesls jamesls added the enhancement This issue requests an improvement to a current feature. label Jun 19, 2018
@moalexmonk
Contributor Author

So it looks like another way around this is boto3.Session().client (or boto3.session.Session() as in that linked page) instead of just boto3.client.

@zgoda-mobica

So finally sessions and resources can be shared across threads or not? Immediately after I start using them as described in the docs I hit KeyError: 'credential_provider' or similar. My threads do upload only so I guess I'm on safe side but I keep finding statements contradicting the docs.

@rkiyanchuk

There is also an outstanding unanswered question in botocore about thread safety. It would be great to know what the correct approach actually is.

@johnnytardin

I'm having the same problem. I'm calling boto3.client('sts') in each thread I create and I hit KeyError: 'endpoint_resolver'.

@azec-pdx

So finally sessions and resources can be shared across threads or not? Immediately after I start using them as described in the docs I hit KeyError: 'credential_provider' or similar. My threads do upload only so I guess I'm on safe side but I keep finding statements contradicting the docs.

@zgoda-mobica , Same issue here. I had ThreadPoolExecutor and each thread was invoking boto3.client('lambda').invoke() without using Session() which throws KeyError: 'credential_provider' error.

So now using following in thread invoked method....

session = boto3.session.Session()
lambda_client = session.client('lambda')
lambda_response = lambda_client.invoke(...)
....

and it seems to be ok.

@rohnigam

So finally sessions and resources can be shared across threads or not? Immediately after I start using them as described in the docs I hit KeyError: 'credential_provider' or similar. My threads do upload only so I guess I'm on safe side but I keep finding statements contradicting the docs.

@zgoda-mobica , Same issue here. I had ThreadPoolExecutor and each thread was invoking boto3.client('lambda').invoke() without using Session() which throws KeyError: 'credential_provider' error.

So now using following in thread invoked method....

session = boto3.session.Session()
lambda_client = session.client('lambda')
lambda_response = lambda_client.invoke(...)
....

and it seems to be ok.

This solves it, but makes the overall threaded push a bit slow.
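
One way to reduce that slowdown might be to build the session once per thread and reuse it, rather than once per call. The sketch below assumes that approach; `get_thread_session` and `_new_boto3_session` are hypothetical helper names, and the `factory` parameter exists only so the caching logic can be exercised without AWS access.

```python
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


def _new_boto3_session():
    # Imported lazily so this module loads even where boto3 is not installed.
    import boto3
    return boto3.session.Session()


def get_thread_session(factory=_new_boto3_session):
    """Return the calling thread's session, creating it on first use."""
    session = getattr(_local, "session", None)
    if session is None:
        session = factory()
        _local.session = session
    return session
```

Inside a worker, `get_thread_session().client('lambda')` would then pay the session setup cost once per thread. With a ThreadPoolExecutor the worker threads are reused, so the cost is bounded by the pool size rather than by the number of tasks.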

mhucka added a commit to caltechlibrary/handprint that referenced this issue Jan 7, 2020
It seems that the boto3 library is not threadsafe.  The solution
discussed in GitHub issues such as

  boto/botocore#1246
  boto/boto3#1592

and the documentation at

  https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html?highlight=multithreading#multithreading-multiprocessing

suggest a very simple change that seems to make things work.
shaftoe added a commit to shaftoe/api-l3x-in that referenced this issue Jun 22, 2020
@hannes-ucsc

hannes-ucsc commented Nov 17, 2020

The documentation says that it is "recommended" to create a resource per thread but the example code below that recommendation creates a resource AND a session per thread. IOW, it is ambiguous if sessions can be shared between threads or not. Further down, the documentation states that resources are NOT thread-safe. That would imply that it's not merely "recommended" to create a resource per thread, but rather that it's a requirement to do so.

I'm confused.

@andrewdunai

The documentation says that it is "recommended" to create a resource per thread but the example code below that recommendation creates a resource AND a session per thread. IOW, it is ambiguous if sessions can be shared between threads or not. Further down, the documentation states that resources are NOT thread-safe. That would imply that it's not merely "recommended" to create a resource per thread, but rather that it's a requirement to do so.

I'm confused.

I second this. There doesn't seem to be a generic best practice for this case.

rix0rrr added a commit to hedyorg/hedy that referenced this issue Jul 1, 2021
The 'boto3.client' method is not thread safe: boto/boto3#1592

Put a lock around initializing an S3 client, so the two threads that
might do this in parallel don't stomp on each other.
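
A lock around client initialization, as this commit message describes, could look roughly like the sketch below. The names `_client_lock` and `create_client_locked` are hypothetical and not taken from the referenced commit; the factory is passed in so the helper works with `boto3.client` or any other callable.

```python
import threading

# One process-wide lock serializes every client construction.
_client_lock = threading.Lock()


def create_client_locked(factory, *args, **kwargs):
    """Call factory(*args, **kwargs) while holding the lock."""
    with _client_lock:
        return factory(*args, **kwargs)
```

A thread would call `create_client_locked(boto3.client, 's3')` instead of `boto3.client('s3')`. This only helps if every code path that creates clients from the default session goes through the same lock.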
fpereiro added a commit to hedyorg/hedy that referenced this issue Jul 1, 2021

Co-authored-by: fpereiro <fpereiro@gmail.com>
@tim-finnigan
Contributor

The documentation on multithreading was updated in this PR: #2848.

Here are links to the multithreading documentation for Clients, Resources, and Sessions. Does that help clarify things?

@tim-finnigan tim-finnigan added the feature-request This issue requests a feature. label Apr 6, 2022
@tim-finnigan tim-finnigan added the response-requested Waiting on additional information or feedback. label Apr 6, 2022
@hannes-ucsc

Looking at just the PR: not really. It says:

clients are generally thread-safe.

and further down:

Similar to Resource objects, Session objects are not thread safe

Correct me if I am wrong, but clients are obtained from sessions. This would imply that if I have two threads, each one needs their own session, but that a client obtained from either of the two sessions can be shared by both threads. That seems odd and deserves an explanation or a correction. Without either, I'd still be confused as a user of boto3.

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Apr 7, 2022
ravi-mosaicml added a commit to mosaicml/composer that referenced this issue Jul 5, 2022
`boto3` sessions are not thread safe. When used in the object store logger with `use_procs: False`, the default session was shared across threads, which caused us to run into boto/boto3#1592. To fix, this PR creates a new session within each `S3ObjectStore` instance.

Closes https://mosaicml.atlassian.net/browse/CO-651
ravi-mosaicml added a commit to mosaicml/composer that referenced this issue Jul 16, 2022
@aBurmeseDev aBurmeseDev added the p2 This is a standard priority issue label Nov 10, 2022
@Gatsby-Lee

This happened to me as well.
Mine is super simple. It's just one thread.
Even with one thread, boto3.client("cloudwatch") failed.

What can be the workaround?
Retrying?

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.9/site-packages/common_pylib/mongosrc_etl_maint_lib.py", line 51, in run_post_maint_ping_metric
    post_maint_ping_metric(etl_job_name)
  File "/usr/local/lib/python3.9/site-packages/common_pylib/mongosrc_etl_maint_lib.py", line 18, in post_maint_ping_metric
    boto3_cloudwatch_client = boto3_cloudwatch_client or boto3.client("cloudwatch")
  File "/usr/local/lib/python3.9/site-packages/boto3/__init__.py", line 92, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/boto3/session.py", line 299, in client
    return self._session.create_client(
  File "/usr/local/lib/python3.9/site-packages/botocore/session.py", line 951, in create_client
    credentials = self.get_credentials()
  File "/usr/local/lib/python3.9/site-packages/botocore/session.py", line 507, in get_credentials
    self._credentials = self._components.get_component(
  File "/usr/local/lib/python3.9/site-packages/botocore/session.py", line 1112, in get_component
    del self._deferred[name]
KeyError: 'credential_provider'

@Gatsby-Lee

I documented the issue based on my own understanding.
https://medium.com/@life-is-short-so-enjoy-it/aws-boto3-misunderstanding-about-thread-safe-a7261d7391fd

@nateprewitt
Contributor

Thanks for linking the blog post, @Gatsby-Lee! That's correct: clients are safe to share between threads, but they must be instantiated in the primary thread. We'd be happy to accept a PR updating the client documentation to be clearer if you have ideas. Otherwise, I'll pass this along to our doc writers to make adjustments.

@Gatsby-Lee

@nateprewitt
hi, thank you for your comment.
I have a question for you.

What is the reason that the "DEFAULT_SESSION" was introduced in the first place?

If it causes more problems than it solves, why don't we update the boto3.client code like this?
Basically, this code returns a new client from a new session on every call.
( Too many connections? )

def client(*args, **kwargs):
    """
    Create a low-level service client by name using the default session.
    See :py:meth:`boto3.session.Session.client`.
    """
    return Session().client(*args, **kwargs)

@nateprewitt
Contributor

What is the reason that the "DEFAULT_SESSION" was introduced in the first place?

Creating a Session is a very resource-intensive process, as it has to load and deserialize several models before it can create clients. If this were required every time a client was created, we'd see an unacceptable performance impact. Customers rarely need individual sessions, so we avoid this cost by creating a single instance by default for reuse. If this behavior is undesired, customers are free to instantiate their own Session as you're showing.

With the code you've proposed above, it will have the same performance as creating one client currently. However, we see creating a client take ~15x longer once we start creating many clients. This is because we are currently amortizing that Session cost over each client creation by reusing a single DEFAULT_SESSION instance.

Performance Impact Example

import timeit
import boto3

CLIENT_COUNT = 1_000

global_time = timeit.timeit(
    'boto3.client("s3")',
    setup='import boto3',
    number=CLIENT_COUNT
)
print(f"Average time per client with Global Session: {global_time/CLIENT_COUNT} sec")

unique_time = timeit.timeit(
    'boto3.session.Session().client("s3")',
    setup='import boto3',
    number=CLIENT_COUNT
)
print(f"Average time per client with Unique Session: {unique_time/CLIENT_COUNT} sec")

We'll see results over 1000 runs come out as seen below:

$ python perf_test.py
Average time per client with Global Session: 0.001492504167 sec
Average time per client with Unique Session: 0.022752786 sec

As you can see, this quickly becomes cost-prohibitive for time-sensitive applications. This is why the current documentation instructs users to pass clients to threads rather than create them in the thread.

@Gatsby-Lee

@nateprewitt
Thank you very much for the detailed explanation.

I think the decision can be very opinionated.
I can think of two approaches:

  1. Keep a shared session inside boto3, the way it is now.
  2. Delegate the session caching responsibility to the user.

I guess there is no right or wrong answer.

And since the existing way of creating clients is used almost everywhere, it is not simple to change.

So, the mitigations can be:

  1. Create the boto3 client in advance, before handing it to a thread.
  2. Use boto3.session.Session() if the boto3 client has to be created in a thread.
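
The first mitigation, creating the client up front and handing it to the workers, might be sketched as follows. `run_with_shared_client` is a hypothetical helper; the client is passed in as a parameter so it is always constructed by the caller before any worker thread starts.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_shared_client(client, work, items, max_workers=8):
    """Apply work(client, item) to every item, sharing one pre-built client.

    Results come back in the same order as `items`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda item: work(client, item), items))
```

In the main thread one would build `client = boto3.client('s3')` first and then call, for example, `run_with_shared_client(client, lambda c, key: c.head_object(Bucket='my-bucket', Key=key), keys)`, where the bucket name and `keys` are placeholders.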

Thank you!!

dbaty added a commit to pass-culture/pass-culture-main that referenced this issue Mar 28, 2023
… connector

`boto3.client()` is not thread-safe. We sometimes get the following
error:

    KeyError: 'default_config_resolver'
      File "pcapi/tasks/ubble_tasks.py", line 23, in store_id_pictures_task
        ubble_subscription_api.archive_ubble_user_id_pictures(payload.identification_id)
      [...]
      File "pcapi/connectors/beneficiaries/outscale.py", line 13, in upload_file
        client = boto3.client(
      File "__init__.py", line 92, in client
        return _get_default_session().client(*args, **kwargs)
      [...]
      File "botocore/session.py", line 1009, in get_component
        del self._deferred[name]

To work around that, we now use a dedicated session. And we keep one
for each thread to lessen the overhead of the session initialization.

References:
 - discussion of the issue and possible workarounds: boto/boto3#1592
 - similar fix in Sentry client code: getsentry/sentry@f58ca5e
dbaty added a commit to pass-culture/pass-culture-main that referenced this issue Apr 4, 2023