Add a retry mechanism for ReadOne calls in rm.createResource #81

Merged: 1 commit, Apr 15, 2022

Conversation

a-hilaly
Member

Issue #, if available:

Description of changes:

In rare cases, and with certain AWS APIs, calling `ReadOne` right
after an `rm.Create` can return a `NotFound` error. We want to retry
`rm.ReadOne` in the hope of receiving a correct response.

This patch adds a backoff/retry mechanism around the `ReadOne` call in
`rm.createResource`.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.

Contributor

@vijtrip2 vijtrip2 left a comment

Can you please also add some unit tests for this by mocking the ReadOne call and asserting that the right number of ReadOne calls are made?
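
A rough sketch of the kind of call-count check meant here, assuming a hand-written mock rather than the repository's actual mock types. The names `reader`, `mockReader`, `readWithRetry`, and `checkCalls` are hypothetical, and the retry loop is a simplified stand-in for the code under test.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("not found")

// reader is a hypothetical, narrowed-down interface for this sketch.
type reader interface {
	ReadOne() (string, error)
}

// mockReader returns errNotFound for the first `failures` calls, then
// succeeds. It counts every call so a test can assert on the total.
type mockReader struct {
	failures int
	calls    int
}

func (m *mockReader) ReadOne() (string, error) {
	m.calls++
	if m.calls <= m.failures {
		return "", errNotFound
	}
	return "observed", nil
}

// readWithRetry is a simplified stand-in for the code under test:
// it retries on errNotFound, up to maxAttempts calls in total.
func readWithRetry(r reader, maxAttempts int) (string, error) {
	var err error
	for i := 0; i < maxAttempts; i++ {
		var res string
		res, err = r.ReadOne()
		if err == nil || !errors.Is(err, errNotFound) {
			return res, err
		}
	}
	return "", err
}

// checkCalls runs one scenario and returns an error if the error state
// or the number of ReadOne calls differs from what is expected.
func checkCalls(failures, maxAttempts, wantCalls int, wantErr bool) error {
	m := &mockReader{failures: failures}
	_, err := readWithRetry(m, maxAttempts)
	if (err != nil) != wantErr {
		return fmt.Errorf("failures=%d: unexpected error state: %v", failures, err)
	}
	if m.calls != wantCalls {
		return fmt.Errorf("failures=%d: got %d ReadOne calls, want %d", failures, m.calls, wantCalls)
	}
	return nil
}

func main() {
	cases := []struct {
		failures, maxAttempts, wantCalls int
		wantErr                          bool
	}{
		{0, 3, 1, false}, // found immediately: exactly one call
		{2, 3, 3, false}, // found on the last allowed attempt
		{5, 3, 3, true},  // never found: stops at maxAttempts
	}
	for _, c := range cases {
		fmt.Println(checkCalls(c.failures, c.maxAttempts, c.wantCalls, c.wantErr))
	}
}
```

In a real test file the same table would sit inside a `testing` function, with `t.Errorf` in place of the returned errors.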


Not against this solution, since it leaves the reconciler behavior as is for services that do not suffer from this ReadOne (after Create) behavior. However, I would love to hear your thoughts on the following:

a) Do you think it is worth making this retry/backoff behavior configurable for service teams in generator.yaml? (With default values when not specified.)

b) Why was the existing solution of requeueing and retrying the complete reconciler loop not sufficient? (Just add this to the PR description.)

@a-hilaly
Member Author

a-hilaly commented Apr 1, 2022

Awaiting soak test results with ECR.
/hold

@ack-bot ack-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 1, 2022
@a-hilaly
Member Author

a-hilaly commented Apr 1, 2022

a) Do you think it is worth making this retry/backoff behavior configurable for service teams in generator.yaml? (With default values when not specified.)

@vijtrip2 Yes, I do believe it can be useful for at least some of the controllers. Were you thinking about adding a flag to make this retry/backoff configurable?

@a-hilaly a-hilaly force-pushed the retry-readone branch 2 times, most recently from 1201035 to d87e878 Compare April 1, 2022 18:02
Collaborator

@jaypipes jaypipes left a comment

Thanks @a-hilaly! Good start on this :) Left some comments about changes I'd like to see.

@a-hilaly a-hilaly force-pushed the retry-readone branch 2 times, most recently from 9de3d5f to 4e07dce Compare April 15, 2022 16:14
@vijtrip2
Contributor

/lgtm

@a-hilaly
Member Author

/unhold

@ack-bot ack-bot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Apr 15, 2022
@ack-bot
Collaborator

ack-bot commented Apr 15, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: A-Hilaly, vijtrip2


@ack-bot ack-bot merged commit 360e72a into aws-controllers-k8s:main Apr 15, 2022
Labels
approved lgtm Indicates that a PR is ready to be merged.
5 participants