[BUG] Increasing DefaultAzureCredentialOptions MaxRetries dramatically slows down startup of app #33124
Comments
Thank you for your feedback. Tagging and routing to the team member best able to assist.
//cc: @schaabs
Because of the US holiday season, please expect delayed responses. There are two important things that I notice about your snippet and the error message:
If you are to use managed credential authorization, I'd suggest referencing the troubleshooting guide provided in the error message to help investigate.
If you are attempting to use managed identity authorization, this would seem to indicate that you are not giving the credential enough time to authenticate with your host's MI endpoint.
Hi @jsquire, We do require the ManagedIdentityCredential, so that's not an option. My main concern is WHY the retry count/timeout modification would make so much difference when running locally (and managed identity is perfectly accessible). It's not failing, but the sheer act of modifying MaxRetries seems to make a large difference during startup. I did remove our modifications and will let them default for now to see if that prevents the (very) intermittent server crashes, but I would love an explanation of how increasing MaxRetries has such an impact, if possible.
Can you help me understand your statement that "managed identity is perfectly accessible"? The error message and stack trace clearly indicate that requests to the managed identity endpoint are timing out:
It appears that the MI endpoint is intermittently not responding quickly enough to meet your timeout value. As mentioned, you're overriding the default and allowing about 10% of the recommended value. I'd suggest removing your override to the network timeout and testing with the default value of 100 seconds to see if that alleviates the issue.
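For illustration, the suggestion above might look like the following. This is a minimal sketch, not the poster's actual code: the `CredentialFactory` class and `Create` method are hypothetical names, `isDeployed` mirrors the flag in the issue's snippet, and the only intended change is that `Retry.NetworkTimeout` (along with the other retry overrides) is no longer set, so the library default applies.

```csharp
using Azure.Identity;

// Hypothetical wrapper; the class and method names are illustrative only.
public static class CredentialFactory
{
    public static DefaultAzureCredential Create(bool isDeployed)
    {
        var options = new DefaultAzureCredentialOptions
        {
            // Same exclusions as the snippet in the issue.
            ExcludeInteractiveBrowserCredential = isDeployed,
            ExcludeVisualStudioCodeCredential = isDeployed,
            ExcludeVisualStudioCredential = isDeployed,
            ExcludeSharedTokenCacheCredential = isDeployed,
            ExcludeAzureCliCredential = isDeployed,
            ExcludeManagedIdentityCredential = false
        };

        // No Retry overrides here: NetworkTimeout, MaxRetries and MaxDelay
        // keep their defaults (100 seconds for the network timeout, per the
        // comment above).
        return new DefaultAzureCredential(options);
    }
}
```

If a longer retry budget is still wanted after testing with the defaults, `options.Retry.MaxRetries` can be raised separately from the timeout so that the effect of each setting can be observed on its own.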
Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!
Library name and version
Azure.Identity 1.7.0
Describe the bug
We've seen our Azure App Service crash twice now (about 30 days apart), and the only error message (pasted below) says that ManagedIdentityCredential authentication failed: retry failed after 3 tries.
I updated the options passed to our DefaultAzureCredential (see below), raising MaxRetries to 6 (from 3), and the time our app took to reach OnModelCreating surged from 38 to 54 seconds. With MaxRetries at 9, startup takes 103 seconds. These numbers are pretty consistent. There were no Key Vault errors, and this has happened every time since I updated the parameters. If I decrease the parameters, the time goes back down.
But since my KeyVault IS accessible why would changing the MaxRetries make any difference?
========== Message from Azure App Service when our app crashed today (this is why I'm trying to up the retries)
Application: w3wp.exe
CoreCLR Version: 6.0.1122.52304
.NET Version: 6.0.11
Description: The process was terminated due to an unhandled exception.
Exception Info: Azure.Identity.AuthenticationFailedException: ManagedIdentityCredential authentication failed: Retry failed after 3 tries. Retry settings can be adjusted in ClientOptions.Retry. (The operation was cancelled because it exceeded the configured timeout of 0:00:05. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.) (The operation was cancelled because it exceeded the configured timeout of 0:00:05. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.) (The operation was cancelled because it exceeded the configured timeout of 0:00:05. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.)
See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/managedidentitycredential/troubleshoot
---> System.AggregateException: Retry failed after 3 tries. Retry settings can be adjusted in ClientOptions.Retry. (The operation was cancelled because it exceeded the configured timeout of 0:00:05. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.) (The operation was cancelled because it exceeded the configured timeout of 0:00:05. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.) (The operation was cancelled because it exceeded the configured timeout of 0:00:05. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.)
---> System.Threading.Tasks.TaskCanceledException: The operation was cancelled because it exceeded the configured timeout of 0:00:05. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.
---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.
--- End of inner exception stack trace ---
Expected behavior
Updating MaxRetries from 3 to 6 in the code below shouldn't cause any delay UNLESS the key vault is unreachable.
return new DefaultAzureCredential(
    new DefaultAzureCredentialOptions
    {
        // Prevent deployed instances from trying things that don't work and generally take too long
        ExcludeInteractiveBrowserCredential = isDeployed,
        ExcludeVisualStudioCodeCredential = isDeployed,
        ExcludeVisualStudioCredential = isDeployed,
        ExcludeSharedTokenCacheCredential = isDeployed,
        ExcludeAzureCliCredential = isDeployed,
        ExcludeManagedIdentityCredential = false,
        Retry =
        {
            MaxRetries = 6,
            NetworkTimeout = TimeSpan.FromSeconds(10),
            MaxDelay = TimeSpan.FromSeconds(10)
        },
        ...
Actual behavior
Our app startup time went from 36 to 54 seconds when MaxRetries was increased from 3 to 6. With 9 retries, startup time was > 100 seconds.
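As a rough illustration of how the retry count can affect elapsed time when an endpoint does not answer, here is a minimal sketch of the worst-case arithmetic. It assumes one initial attempt plus MaxRetries retries, that every attempt runs to the full NetworkTimeout, and it ignores the backoff delay between attempts (which would only add to the total). The `RetryBudget` class and `WorstCaseTimeoutTotal` method are hypothetical helpers, not part of Azure.Identity, and this does not establish that this is what happened in the runs measured above.

```csharp
using System;

// Hypothetical helper for illustration; not part of any Azure library.
public static class RetryBudget
{
    // Upper bound on the time spent in attempts that all hit the timeout.
    // Assumption: total attempts = 1 initial attempt + maxRetries retries.
    // Backoff delays between attempts are deliberately left out.
    public static TimeSpan WorstCaseTimeoutTotal(int maxRetries, TimeSpan networkTimeout)
    {
        int attempts = maxRetries + 1;
        return TimeSpan.FromTicks(networkTimeout.Ticks * attempts);
    }

    public static void Main()
    {
        // 10 seconds is the NetworkTimeout used in the snippet above.
        TimeSpan timeout = TimeSpan.FromSeconds(10);
        foreach (int retries in new[] { 3, 6, 9 })
        {
            TimeSpan bound = WorstCaseTimeoutTotal(retries, timeout);
            Console.WriteLine($"MaxRetries={retries}: up to {bound.TotalSeconds} s spent in timeouts");
        }
    }
}
```

Under these assumptions the bound grows linearly with MaxRetries, so a request that never succeeds costs more wall-clock time with every additional retry allowed, even though nothing is reported as failing until the last attempt.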
Reproduction Steps
Adjusted the MaxRetries as described above and observed startup timing.
Environment
Currently running locally on Win 10 laptop with 32GB memory, VS 2022 (64bit) v 17.2.2
.NET SDK (reflecting any global.json):
Version: 6.0.300
Commit: 8473146e7d
Runtime Environment:
OS Name: Windows
OS Version: 10.0.19044
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\6.0.300\
global.json file:
Not found
Host:
Version: 6.0.9
Architecture: x64
Commit: 163a63591c
.NET SDKs installed:
2.2.110 [C:\Program Files\dotnet\sdk]
3.1.301 [C:\Program Files\dotnet\sdk]
5.0.407 [C:\Program Files\dotnet\sdk]
6.0.300 [C:\Program Files\dotnet\sdk]
.NET runtimes installed:
Microsoft.AspNetCore.All 2.1.30 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.All 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.30 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 3.1.24 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 3.1.25 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 5.0.16 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 6.0.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 2.1.30 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 3.1.24 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 3.1.25 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 5.0.16 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 6.0.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 6.0.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.WindowsDesktop.App 3.1.24 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 3.1.25 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 5.0.16 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
Microsoft.WindowsDesktop.App 6.0.5 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]