Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(DataStore): endless retry of mutation request when server responds with 401 error code (#3511) #3512

Merged
merged 3 commits into from
Feb 26, 2024

Conversation

MuniekMg
Copy link
Contributor

@MuniekMg MuniekMg commented Feb 10, 2024

Issue #

#3511

Description

When server respond with 401 error on request mutation, DataStore will try to retry this request endlessly.

General Checklist

  • Added new tests to cover change, if needed
  • Build succeeds with all target using Swift Package Manager
  • All unit tests pass
  • All integration tests pass
  • PR title conforms to conventional commit style
  • New or updated tests include Given When Then inline code documentation and are named accordingly testThing_condition_expectation()

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@MuniekMg MuniekMg requested a review from a team as a code owner February 10, 2024 17:47
@@ -320,7 +320,7 @@ class SyncMutationToCloudOperation: AsynchronousOperation {

/// - Warning: Must be invoked from a locking context
private func shouldRetryWithDifferentAuthType() -> RequestRetryAdvice {
let shouldRetry = (authTypesIterator?.count ?? 0) > 0
let shouldRetry = authTypesIterator?.hasNext == true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was causing this bug #3511

Copy link
Member

@lawmicha lawmicha Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, this makes sense. Just to clarify, the bug was that count is always returning for > 0 when there are multi auth rules defined on the model, so shouldRetry always returned true. The retry true advice caused the request to be replayed with the lowest priority auth over and over again since the response was 401, is that right?

Could you provide your schema or the just the model type that has the multi-auth rules on it, redacted if needed, that you saw this issue with? Just want to run through a real scenario here, with the context of auth type ordering used

Copy link
Contributor Author

@MuniekMg MuniekMg Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is current situation in my project:

  • We maintain various versions of "the same" model (e.g., House and HouseV2)

  • House auth types:

model.authRules = [
  rule(allow: .owner, ownerField: "owner", identityClaim: "cognito:username", provider: .userPools, operations: [.create, .update, .delete, .read])
]
  • HouseV2 auth types:
model.authRules = [
  rule(allow: .custom, provider: .function, operations: [.create, .update, .delete, .read])
]
  • „authorizationType": "AWS_LAMBDA" for awsAPIPlugin in amplifyconfiguration.json

  • authModeStrategy: .multiAuth in AWSDataStorePlugin configuration in order to use model's @auth rule for each model.

#3511 Bug flow:
1. The user has no permissions to mutate model HouseV2 with the ID "abc."
2. The user attempts to modify model HouseV2 with the ID "abc."
3. DataStore sends a mutation request with authType = .function.
4. Our lambda returns error 401.
5. DataStore asks shouldRetryWithDifferentAuthType() and returns true because authTypesIterator.count > 0.
6. DataStore retries the mutation request with the next authType from authTypesIterator.next(), which is nil.
7. Since authType == nil, DataStore fall-back to the authorizationType from amplifyconfiguration.json, which is authType = .function (AWS_LAMBDA) again.
8. Return to step 4."

There is a simple solution in our scenario: remove model.authRules from HouseV2, so DataStore will use the default authType from amplifyconfiguration.json, and shouldRetryWithDifferentAuthType() will return false in case of a 401 error. (Because authTypesIterator.count == 0)

But what if there is a need for other authentication rules, e.g. with an OIDC provider, which can also return a 401 error, causing the same endless loop?
I mean, the first authentication attempt with OIDC will return a 401, and then falling back to the authorizationType from amplifyconfiguration.json (AWS_LAMBDA) will also result in a 401. And this is the reason for this PR.

But there is one big difference between my implementation and the current one. My implementation will not fall-back to the default authType from amplifyconfiguration.json, which I believe is correct and in accordance with the documentation. However, many existing projects may rely on this behavior, and we should maintain it as is. Therefore, the implementation of shouldRetryWithDifferentAuthType() should resemble this:

    /**
    - Parameters:
      - request: failed request
    - Warning: Must be invoked from a locking context
   */
    private func shouldRetryWithDifferentAuthType(request: GraphQLRequest<MutationSync<AnyModel>>) -> RequestRetryAdvice {
        let failedAuthType = (request.options?.pluginOptions as? AWSPluginOptions)?.authType
        
        // if failedAuthType is nil, it means that this request just failed with the lowest priority auth
        // which is „fall-back to default authType from amplifyconfiguration.json”
        let shouldRetry = failedAuthType != nil
        
        return RequestRetryAdvice(shouldRetry: shouldRetry, retryInterval: .milliseconds(0))
    }

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the details, let me reiterate some of the points you've made, making sure we're on the same page

  • Your DataStore is set up with multi-auth support is enabled by setting DataStore's auth mode strategy configuration to .multiAuth. This means DataStore will attempt to use different auth types to send the request, in an order that is documented here
  • The default auth type on the API is the value set in the configuration, amplifyconfiguration.json, for the APIPlugin this is set to AWS Lambda. This is the authorizationType of the APIPlugin (configuration file format) that is used when no authType is provided at runtime. DataStore calls Amplify.API without the authType paramter will this default auth mode.

The problem in the current implementation is that the default auth type on the API is used repeatedly due to the two pieces of logic in SyncMutationToCloudOperation

  1. authTypesIterator?.count ?? 0) > 0 will always return true for multi-auth scenarios since count is initialized to non-zero in the AWSAuthorizationTypeIterator's initializer here. In your case, the count is 1, House has UserPool provider auth rule and and HouseV2 has Function provider auth rule.
  2. When retry advice is determined to be true, retry attempt will retrieve the next authType authTypesIterator?.next() here. This causes the infinite loop you described in the Bug flow above. authTypesIterator?.next() is nil so it goes into line 255 to retry using the default auth type, fail with 401, and retry advice is once again true because of the non-zero count.

This PR in this change peeks for the next auth type rather than check the count, with let shoulRetry = authTypesIterator?.hasNext == true. This will eventually return false on the latest attempt when the iterator's position is past the last element.

My implementation will not fall-back to the default authType from amplifyconfiguration.json, which I believe is correct and in accordance with the documentation. However, many existing projects may rely on this behavior, and we should maintain it as is.

Let me think through this in next comment

Copy link
Member

@lawmicha lawmicha Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In multi-auth scenarios where models are defined with multiple (1 or more) auth rules in the schema, the default auth type on the API is far disconnected from those rules. Like in your scenario, the House model uses User Pool auth but the library eventually attempts to send the request with the default auth type defined as Function/AWS_LAMBDA. The request will always fail for any other auth type other than the ones defined in the auth rules. On the other hand, If the default auth type is the same as one of the model’s auth rules, say the default auth mode is set to Function/AWS_LAMBDA which matches the HouseV2 model’s auth rule then it would be a repeat of a previous request that was already attempted, an attempt that was already made when iterating over the auth types from authTypesInterator.

Scenario 1. Default auth type on the API does not match one of the auth rules of the model. The request using default auth type is made unnecessarily since it will always fail. This is the scneario with the House model. The mutation event will be retried indefinitely, which blocks other MutationEvents from being processed.

Scenario 2. Default auth type on the API matches one of the auth rules of the model. The request using the default auth type may be a repeat of the previous request (requests that were made with authType parameter, an auth type retrieved from the authTypesIterator). The request is repeated unnecessarily. This is the case for HouseV2 model. In this case, it's not only repeated unncessarily, but current implementation repeats it indefinitely. The logic for Unauthorized should be to finish and process the response, by sending the error response back to DataStore's errorHandler. The MutationEvent should be then deleted, and will not be attempted to be synced at a later time.

Not making the request with default auth type (this PR) or making the request at most once (proposal in your comment w.r.t. let shouldRetry = failedAuthType != nil) both will unblock the MutationEvents queue.

However, there's a Scenario 3 where the default auth type on the API matches one of the auth rules of the model. The authTypesIterator is instantiated without the particular auth type because the user is not signed in. An auth type from an auth rule is not added to the authTypeIterator here. If we make the request with the default auth type, which is the one that was not included in the authTypeIterator, while the user is not signed in, then the request will fail with user not signed in, and the mutation event will be retried indefinitely according to this logic. The request with this auth type shouldn't have been made at all. This also blocks other MutationEvents from being processed.

Not making the request will unblock the MutationEvents queue, while making it at most once will go down the route of retrying indefinitely as well based the logic that handles .signedOut state here.

I believe this PR to stop using the default auth type when multi-auth is enabled is the only way to break out of the infinite retry for Scenario 3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the detailed explanation. You are absolutely right, I didn't consider the Scenario 3.

@lawmicha
Copy link
Member

…s with 401 error code (aws-amplify#3511)

Co-authored-by: Michael Law <1365977+lawmicha@users.noreply.github.com>
@lawmicha lawmicha merged commit 7ae55d1 into aws-amplify:main Feb 26, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants