Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[app-configuration] ThrottlingRetryPolicy can retry infinitely #15576

Closed
richardpark-msft opened this issue Jun 5, 2021 · 4 comments · Fixed by #15721
Closed

[app-configuration] ThrottlingRetryPolicy can retry infinitely #15576

richardpark-msft opened this issue Jun 5, 2021 · 4 comments · Fixed by #15721
Assignees
Labels
App Configuration Azure.ApplicationModel.Configuration Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team
Milestone

Comments

@richardpark-msft
Copy link
Member

Problem

The throttlingRetryPolicy in AppConfig has the potential to retry for an extended period if the service continues returning "retry after" headers on subsequent calls.

Here's the snippet of code that handles the "retry after" retries:

  public async sendRequest(httpRequest: WebResource): Promise<HttpOperationResponse> {
    return this._nextPolicy.sendRequest(httpRequest.clone()).catch((err) => {
        // other code elided....
        return delay(delayInMs).then((_: any) => this.sendRequest(httpRequest.clone()));

Full source here: link

Some possible improvements:

  • We should check the abortSignal on the request to see if the user has requested us to stop
  • We should log to indicate why we are retrying (the exponentialRetryPolicy does). So even with debug logging on it's not apparent what's going on.
  • Call this._nextPolicy.sendRequest, rather than this.sendRequest. This might cooperate more nicely with other policies that also handle retries (exponentialRetryPolicy). There would have to be some adjustments since the RestError+429 combo is not retryable, which was part of the reason we needed our own copy of throttlingRetryPolicy.

CC: @MaryanneNjeri who owns the original issue in filed in the appconfig repo

References:

  • Issue where we added the throttlingRetryPolicy: link
@richardpark-msft richardpark-msft added customer-reported Issues that are reported by GitHub users external to the Azure organization. Client This issue points to a problem in the data-plane of the library. App Configuration Azure.ApplicationModel.Configuration labels Jun 5, 2021
@ghost ghost added the needs-team-attention This issue needs attention from Azure service team or SDK team label Jun 5, 2021
@richardpark-msft
Copy link
Member Author

@ramya-rao-a , @xirzec , @chradek , @bterlson : I've never been quite sure what the right interaction between various retry methods should be. I'd like all of your architectural input on this.

@chradek
Copy link
Contributor

chradek commented Jun 7, 2021

I think we should honor the max retry attempt setting across all retry policies. So if max retries was set to 3, and we encountered expired creds 2 times, we'd only allow the throttling retry policy to be invoked max 2 times. (for 4 total attempts.)

Given we have multiple retry policies today, this would mean we'd need some state that was shareable across policies to keep track of the number of retries that have been encountered. I'd prefer a single retry policy that you could configure (possibly by giving it functions that each check if a request is retriable) so the retryCount is kept solely in that single policy, but otherwise storing that state in WebResource is probably the way to go.

@ramya-rao-a
Copy link
Contributor

This is worth cross checking with our friends from other languages to ensure we are treating retries similarly across the board, especially the throttling retry policy we have in Azure Core

@jeremymeng Can you reach out and confirm?

@xirzec
Copy link
Member

xirzec commented Jun 7, 2021

Given we have multiple retry policies today, this would mean we'd need some state that was shareable across policies to keep track of the number of retries that have been encountered. I'd prefer a single retry policy that you could configure (possibly by giving it functions that each check if a request is retriable) so the retryCount is kept solely in that single policy, but otherwise storing that state in WebResource is probably the way to go.

I also want to merge our policies. .NET at least only has one policy and it seems much simpler that way.

@ghost ghost closed this as completed in #15721 Jun 17, 2021
ghost pushed a commit that referenced this issue Jun 17, 2021
…nal (#15721)

## Description
### Issue
 - #15576
 - Azure/AppConfiguration#503

### Changelog
- High request rate would result in throttling. SDK would retry on the failed requests based on the service suggested time from the `retry-after-ms` header in the error response. If there are too many parallel requests, retries for all of them may also result in a high request rate entering into a state which might seem like the application is hanging forever.
  - [#15721](#15721) allows the user-provided abortSignal to be taken into account to abort the requests sooner.
  - More resources - [App Configuration | Throttling](https://docs.microsoft.com/en-us/azure/azure-app-configuration/rest-api-throttling) and [App Configuration | Requests Quota](https://docs.microsoft.com/en-us/azure/azure-app-configuration/faq#which-app-configuration-tier-should-i-use)

Fixes #15576
@HarshaNalluru HarshaNalluru added this to the [2021] July milestone Jun 17, 2021
sadasant pushed a commit to sadasant/azure-sdk-for-js that referenced this issue Jun 21, 2021
…nal (Azure#15721)

## Description
### Issue
 - Azure#15576
 - Azure/AppConfiguration#503

### Changelog
- High request rate would result in throttling. SDK would retry on the failed requests based on the service suggested time from the `retry-after-ms` header in the error response. If there are too many parallel requests, retries for all of them may also result in a high request rate entering into a state which might seem like the application is hanging forever.
  - [Azure#15721](Azure#15721) allows the user-provided abortSignal to be taken into account to abort the requests sooner.
  - More resources - [App Configuration | Throttling](https://docs.microsoft.com/en-us/azure/azure-app-configuration/rest-api-throttling) and [App Configuration | Requests Quota](https://docs.microsoft.com/en-us/azure/azure-app-configuration/faq#which-app-configuration-tier-should-i-use)

Fixes Azure#15576
@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2023
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
App Configuration Azure.ApplicationModel.Configuration Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team
Projects
None yet
5 participants