You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
the credentials retrieved by IAMCredentialsProvider are set to expire() only 1s before the expiry time returned by the Instance metadata service
this is not enough time for the CachedSupplier to recover from any failure as on a failure it sets a jitter on a retry to ComparableUtils.minimum(Duration.ofMillis(exponentialBackoffMillis), Duration.ofSeconds(10)). That is: up to 9 seconds after the credentials expire.
It's notable that ContainerCredentialsProvider has more resilience in
credentials are set to expire 15 minutes before the credentials become invalid
it also retries 5 times with no sleep on a request failure.
Expected Behavior
IAMCredentialsProvider to always issue valid credentials. This should include asynchronous refreshing of credentials when enabled, far enough in advance of their expiry that recovery attempts can be repeated.
Current Behavior
invalid credentials were returned when requesting credentials to sign a request.
Reproduction Steps
see the hadoop bug report. The deployment was a single EC2 node running many services, long enough for the EC2 credentials to expire and need refreshing.
Possible Solution
I really don't see what we can do in the hadoop codebase to recover from this. We could consider extending our own IAMInstanceCredentialsProvider to wrap the retry failures with our own sleep/retry, but as it'd take > 10 seconds, API calls awaiting signing (e.g. S3 Express CreateSession) will time out.
I'd propose copying ContainerCredentialsProvider, if not with the retries then at least the expiry time many minutes ahead of actual credential expiry
Additional Information/Context
No response
AWS Java SDK version used
2.24.6
JDK version used
an openjdk java 8 build
Operating System and version
linux
The text was updated successfully, but these errors were encountered:
Is there an approximate timeline for this? I don't want to have to reimplement our own refresh thread code -not because it is hard, but because maintenance and testing with fault injection are the pain points.
Describe the bug
The full details and analysis are in HADOOP-19181. IAMCredentialsProvider throttle failures
ComparableUtils.minimum(Duration.ofMillis(exponentialBackoffMillis), Duration.ofSeconds(10))
. That is: up to 9 seconds after the credentials expire.It's notable that
ContainerCredentialsProvider
has more resilience inExpected Behavior
IAMCredentialsProvider
to always issue valid credentials. This should include asynchronous refreshing of credentials when enabled, far enough in advance of their expiry that recovery attempts can be repeated.Current Behavior
invalid credentials were returned when requesting credentials to sign a request.
Reproduction Steps
see the hadoop bug report. The deployment was a single EC2 node running many services, long enough for the EC2 credentials to expire and need refreshing.
Possible Solution
I really don't see what we can do in the hadoop codebase to recover from this. We could consider extending our own
IAMInstanceCredentialsProvider
to wrap the retry failures with our own sleep/retry, but as it'd take > 10 seconds, API calls awaiting signing (e.g. S3 Express CreateSession) will time out.I'd propose copying ContainerCredentialsProvider, if not with the retries then at least the expiry time many minutes ahead of actual credential expiry
Additional Information/Context
No response
AWS Java SDK version used
2.24.6
JDK version used
an openjdk java 8 build
Operating System and version
linux
The text was updated successfully, but these errors were encountered: