Initialization: Fixes the SDK to retry if the initialization fails #3027

Merged: 10 commits, Feb 14, 2022

j82w (Contributor) commented Feb 11, 2022

Pull Request Template

Description

The SDK has an initialization task that gets the account information and other info. If this task failed with a DocumentClientException, it was never recreated. This is a problem because if a 408 or some other transient error is thrown, the SDK will always return the cached failure and will never actually retry the request.
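To illustrate the failure mode described above, here is a minimal sketch. It is not the SDK's code: it uses Python's asyncio in place of .NET tasks, and `CachedInitClient`, `TransientError`, and `flaky_init` are hypothetical names made up for the example. It only shows that a task created once and cached keeps re-raising its original exception.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical stand-in for a transient failure such as a 408."""


class CachedInitClient:
    """Hypothetical client that creates its initialization task only once."""

    def __init__(self, init_func):
        self._init_func = init_func
        self._init_task = None

    async def ensure_valid(self):
        # The task is created on first use and never replaced, so a failed
        # task re-raises the same stored exception on every later await.
        if self._init_task is None:
            self._init_task = asyncio.ensure_future(self._init_func())
        return await self._init_task


async def demo_cached_failure():
    calls = 0

    async def flaky_init():
        # Fails on the first attempt only; a retry would succeed.
        nonlocal calls
        calls += 1
        if calls == 1:
            raise TransientError("408 on first attempt")
        return "account-info"

    client = CachedInitClient(flaky_init)
    outcomes = []
    for _ in range(3):
        try:
            outcomes.append(await client.ensure_valid())
        except TransientError:
            outcomes.append("failed")
    return calls, outcomes
```

Running `asyncio.run(demo_cached_failure())` in this sketch invokes `flaky_init` only once, and all three calls report the failure even though a second attempt would have succeeded.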

Solution:
A new initialization function factory was created, which allows the initialization task to be recreated. EnsureValidClientAsync now calls a new thread-safe method that returns the existing task, or creates a new one if the existing one failed.
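The get-or-create approach described in the solution can be sketched roughly as follows. This is an assumption-laden illustration, not the actual change in this PR: it uses Python's asyncio rather than .NET, an `asyncio.Lock` stands in for whatever thread-safety mechanism the SDK uses, and `RetryingInitClient`, `_get_or_create_init_task`, and `flaky_init` are hypothetical names.

```python
import asyncio


class RetryingInitClient:
    """Hypothetical client that rebuilds a failed initialization task."""

    def __init__(self, init_factory):
        # init_factory is a callable that returns a fresh coroutine each time.
        self._init_factory = init_factory
        self._init_task = None
        self._lock = asyncio.Lock()

    async def _get_or_create_init_task(self):
        # The lock makes concurrent callers agree on a single task.
        async with self._lock:
            task = self._init_task
            failed = (
                task is not None
                and task.done()
                and (task.cancelled() or task.exception() is not None)
            )
            # Create a task if none exists yet or the cached one failed;
            # in-flight and successful tasks are reused as-is.
            if task is None or failed:
                task = asyncio.ensure_future(self._init_factory())
                self._init_task = task
            return task

    async def ensure_valid(self):
        task = await self._get_or_create_init_task()
        return await task


async def demo_retry():
    calls = 0

    async def flaky_init():
        # Fails on the first attempt only, like a transient 408.
        nonlocal calls
        calls += 1
        if calls == 1:
            raise TimeoutError("408 on first attempt")
        return "account-info"

    client = RetryingInitClient(flaky_init)
    outcomes = []
    for _ in range(3):
        try:
            outcomes.append(await client.ensure_valid())
        except TimeoutError:
            outcomes.append("failed")
    return calls, outcomes
```

In this sketch the first call still surfaces the failure to the caller, the second call triggers a new initialization attempt, and the third reuses the successful task without running the factory again.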

Type of change

Please delete options that are not relevant.

  • [] Bug fix (non-breaking change which fixes an issue)
  • [] New feature (non-breaking change which adds functionality)
  • [] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [] This change requires a documentation update

Closing issues

To automatically close an issue: closes #2990

j82w added the bug label Feb 11, 2022
j82w self-assigned this Feb 11, 2022
j82w merged commit 83c2c73 into master Feb 14, 2022
j82w deleted the users/jawilley/bug/failedInitTask branch February 14, 2022 17:05

johngallardo (Member) commented

Thanks for addressing this issue. I have found that this can also cause problems for pseudo-transient errors like permissions. E.g., if you construct a CosmosClient before setting SQL RBAC permissions for the identity accessing CosmosDB, the failed authorization exception can be returned to clients until a fresh CosmosClient object is created. These manifest as persistent exceptions that contain something like:

One or more errors occurred. (Request blocked by Auth jgalla-cosmosdb-test : Request is blocked because principal [<redacted>] does not have required RBAC permissions to perform action [Microsoft.DocumentDB/databaseAccounts/readMetadata] on resource [/]. Learn more: https://aka.ms/cosmos-native-rbac.This could be because the user's group memberships were not present in the AAD token.

Labels: bug
Linked issue: SDK gets stuck if failed on initialization
Participants: 4