Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 429 retry and logging in BundleHandler #2400

Merged
merged 2 commits into from
Dec 16, 2021

Conversation

feordin
Copy link
Contributor

@feordin feordin commented Dec 14, 2021

Description

We sometimes encounter 429 errors when processing a bundle. If we receive a 429 at the BundleHandler layer, we abort processing of the bundle and skip the remaining resources. Here we add an additional retry (in addition to the retry present in the data store layer) that will execute one time per resource which encounters a 429. Also some logging is added to help us identify this occurrence.

Related issues

Addresses [issue AB#87196].

Testing

Unit was updated.

FHIR Team Checklist

  • Update the title of the PR to be succinct and less than 50 characters
  • Add a milestone to the PR for the sprint that it is merged (i.e. add S47)
  • Tag the PR with the type of update: Bug, Dependencies, Enhancement, or New-Feature
  • Tag the PR with Azure API for FHIR if this will release to the Azure API for FHIR managed service (CosmosDB or common code related to service)
  • Tag the PR with Azure Healthcare APIs if this will release to the Azure Healthcare APIs managed service (Sql server or common code related to service)
  • CI is green before merge
  • Review squash-merge requirements

Semver Change (docs)

Patch|Skip|Feature|Breaking (reason)

@feordin feordin added Enhancement Enhancement on existing functionality. Azure API for FHIR Label denotes that the issue or PR is relevant to the Azure API for FHIR labels Dec 14, 2021
@feordin feordin added this to the S78 milestone Dec 14, 2021
@feordin feordin requested a review from a team as a code owner December 14, 2021 22:35
@@ -391,6 +391,25 @@ private async Task<EntryComponent> ExecuteRequests(Hl7.Fhir.Model.Bundle respons

await request.Handler.Invoke(httpContext);

// we will retry a 429 one time per request in the bundle
if (httpContext.Response.StatusCode == (int)HttpStatusCode.TooManyRequests
&& _bundleType == BundleType.Batch)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we care if it is a Batch request? Couldn't we do the same thing for a Transaction request?

var retryAfterValues = httpContext.Response.Headers.GetCommaSeparatedValues("Retry-After");
if (retryAfterValues != StringValues.Empty)
{
if (int.TryParse(retryAfterValues[0], out var retryHeaderValue))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be collapsed into the if statement above it so we only need one?

@@ -286,7 +286,7 @@ public async Task GivenABundle_WhenProcessed_CertainResponseHeadersArePropagated
}

[Fact]
public async Task GivenABundle_WhenOneRequestProducesA429_SubsequentRequestAreSkipped()
public async Task GivenABundle_WhenOneRequestProducesA429_429IsRetriedThenSubsequentRequestAreSkipped()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding a test that has a retry succeed to show the happy retry path?

{
retryDelay = retryHeaderValue;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth to retry in absence of "Retry-After" header?

Feels that delay 2 milliseconds wouldn't do much to solve problem...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried that 2 seconds might be too long :). I used 2 seconds as a fall back, because we have similar logic in the CosmosFhirStore code. I would still like to try, given that we did receive a 429.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task.delay?view=net-6.0

reates a cancellable task that completes after a specified number of milliseconds.

It's not a seconds...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank for you for catching it Ivan! Sorry I missed the "milliseconds" in your previous comment. I merged this too quickly! I will put up a new PR to fix it.

@feordin feordin merged commit 627cc1c into main Dec 16, 2021
@feordin feordin deleted the personal/jaerwin/partial-429-updates branch December 16, 2021 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Azure API for FHIR Label denotes that the issue or PR is relevant to the Azure API for FHIR Enhancement Enhancement on existing functionality.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants