Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When client exhausts retry attempts for any API call, it does not make retries for subsequent API calls #4754

Closed
3 tasks done
trivikr opened this issue May 24, 2023 · 4 comments
Assignees
Labels
bug This issue is a bug. p0 This issue is the highest priority queued This issues is on the AWS team's backlog

Comments

@trivikr
Copy link
Member

trivikr commented May 24, 2023

Checkboxes for prior research

Describe the bug

When client exhausts retry attempts for any API call, it does not make retries for subsequent API calls

SDK version number

All >=v3.229.0, used v3.335.0 in testing

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

All, used v16.20.0 in testing

Reproduction Steps

Here is a repro which adds custom Handler which returns a Timeout for every call, and a retryStrategy with just 100ms delay between retries.

import { Kinesis } from "@aws-sdk/client-kinesis"; // v3.335.0
import { NodeHttp2Handler } from "@aws-sdk/node-http-handler";
import { ConfiguredRetryStrategy } from "@aws-sdk/util-retry";

// Simlulate a timeout error.
class NodeHttp2HandlerReturnsTimeout extends NodeHttp2Handler {
  async handle(request, options) {
    const timeoutError = new Error("Request timed out");
    timeoutError.name = "TimeoutError";
    throw timeoutError;
  }
}

const client = new Kinesis({
  region: "us-west-2",
  requestHandler: new NodeHttp2HandlerReturnsTimeout(),
  retryStrategy: new ConfiguredRetryStrategy(
    2, // max attempts.
    () => 100 // delay only for 100ms between retries to speed up client side rate limiting.
  ),
});

let calls = 0;
let attempts = 0;
while (true) {
  try {
    await client.listStreams({});
  } catch (error) {
    const currentCallAttempts = error.$metadata.attempts;
    
    calls += 1;
    attempts += currentCallAttempts;
    console.log(`Total: ${calls} calls, ${attempts} attempts`);

    if (error.$metadata.attempts === 1) {
      // No more retries can be made.
      break;
    }
  }
}

To log attempts, the console.log was added in node_modules/@aws-sdk/util-retry/dist-cjs/StandardRetryStrategy.js

// ...
    shouldRetry(tokenToRenew, errorInfo, maxAttempts) {
        const attempts = tokenToRenew.getRetryCount();
        console.log({ attempts, maxAttempts, errorInfo });
        return (attempts < maxAttempts &&
            tokenToRenew.hasRetryTokens(errorInfo.errorType) &&
            this.isRetryableError(errorInfo.errorType));
    }
// ...

Observed Behavior

When the listStreams API call is made the second time, the previous value of attempts, i.e 2 is used, and no retries are made.

{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 1 calls, 3 attempts
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 2 calls, 4 attempts

Expected Behavior

The API calls are repeatedly made till retry tokens are actually exhausted.

{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 1 calls, 3 attempts
{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 2 calls, 6 attempts
{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 3 calls, 9 attempts
// ... and so on

Possible Solution

Needs deep dive, but it looks like StandardRetryStrategy should return new retryToken when acquire is called.

   }
 
   public async acquireInitialRetryToken(retryTokenScope: string): Promise<StandardRetryToken> {
-    return this.retryToken;
+    return getDefaultRetryToken(INITIAL_RETRY_TOKENS, DEFAULT_RETRY_DELAY_BASE);
   }
 
   public async refreshRetryTokenForRetry(

Additional Information/Context

This issue was discovered while working on #4753

@trivikr trivikr added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. p0 This issue is the highest priority queued This issues is on the AWS team's backlog and removed needs-triage This issue or PR still needs to be triaged. labels May 24, 2023
@trivikr trivikr changed the title When SDK exhausts retry tokens any API call, it does not make any subsequent retries for other calls When client exhausts retry attempts any API call, it does not make any retries for subsequent API calls May 24, 2023
@trivikr trivikr changed the title When client exhausts retry attempts any API call, it does not make any retries for subsequent API calls When client exhausts retry attempts for any API call, it does not make retries for subsequent API calls May 24, 2023
@trivikr
Copy link
Member Author

trivikr commented May 24, 2023

One workaround is to use legacy retry strategy

import { Kinesis } from "@aws-sdk/client-kinesis"; // >=v3.229.0
import { StandardRetryStrategy } from "@aws-sdk/middleware-retry";

const client = new Kinesis({
  region: "us-west-2",
  retryStrategy: new StandardRetryStrategy(),
});
Verification code and output
import { StandardRetryStrategy } from "@aws-sdk/middleware-retry";

// ...
const client = new Kinesis({
  region: "us-west-2",
  requestHandler: new NodeHttp2HandlerReturnsTimeout(),
  retryStrategy: new StandardRetryStrategy(),
});
// ...

It's deprecated, but attempts retries for subsequent calls

Total: 1 calls, 3 attempts
Total: 2 calls, 6 attempts
Total: 3 calls, 9 attempts
Total: 4 calls, 12 attempts
Total: 5 calls, 15 attempts
Total: 6 calls, 18 attempts
// ... and so on.

@trivikr
Copy link
Member Author

trivikr commented May 24, 2023

It's not recommended, but alternatively, you can also downgrade your client to <3.229.0.

import { Kinesis } from "@aws-sdk/client-kinesis"; // v3.226.0

const client = new Kinesis({ region: "us-west-2" });
Verification code and output
import { Kinesis } from "@aws-sdk/client-kinesis"; // v3.226.0
import { NodeHttp2Handler } from "@aws-sdk/node-http-handler";

// Simlulate a timeout error.
class NodeHttp2HandlerReturnsTimeout extends NodeHttp2Handler {
  async handle(request, options) {
    const timeoutError = new Error("Request timed out");
    timeoutError.name = "TimeoutError";
    throw timeoutError;
  }
}

const client = new Kinesis({
  region: "us-west-2",
  requestHandler: new NodeHttp2HandlerReturnsTimeout(),
});

let calls = 0;
let attempts = 0;
while (true) {
  try {
    await client.listStreams({});
  } catch (error) {
    const currentCallAttempts = error.$metadata.attempts;

    calls += 1;
    attempts += currentCallAttempts;
    console.log(`Total: ${calls} calls, ${attempts} attempts`);

    if (error.$metadata.attempts === 1) {
      // No more retries can be made.
      break;
    }
  }
}
Total: 1 calls, 3 attempts
Total: 2 calls, 6 attempts
Total: 3 calls, 9 attempts
Total: 4 calls, 12 attempts
Total: 5 calls, 15 attempts
// ...

@kuhe
Copy link
Contributor

kuhe commented May 30, 2023

@kuhe kuhe closed this as completed May 30, 2023
@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug This issue is a bug. p0 This issue is the highest priority queued This issues is on the AWS team's backlog
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants