-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add machinery for tracking external resource create time #280
Conversation
Per crossplane-contrib/provider-aws#802 some external APIs (including some AWS APIs) appear to experience some delay between the time a new resource is successfully created and the time at which that resource appears in queries. This commit adds a new 'crossplane.io/external-create-time' annotation and a new 'ResourcePending' field in the Observation struct. Together these can be used by an Observe method to allow for a small grace period before it determines that a resource does not exist. For example: ```go func Observe(ctx context.Context, mg resource.Managed) (managed.ExternalObservation, error) { err := api.Get(AsInput(mg)) if IsNotFound(err) { if t := meta.GetExternalCreateTime(); t != nil && time.Since(t.Time) < 1 * time.Minute { // We're in the grace period - wait a bit longer for the resource to appear. return managed.ExternalObservation{ResourcePending: true}, nil } // The resource does not exist. return managed.ExternalObservation{ResourceExists: false}, nil } if err != nil { return managed.ExternalObservation{}, err } return managed.ExternalObervation{ResourceExists: true} } func Create(ctx context.Context, mg resource.Managed) (managed.ExternalCreation, error) { _ := api.Create(AsInput(mg)) meta.SetExternalCreateTime() return managed.ExternalCreation{ExternalCreateTimeSet: true}, nil } ``` Signed-off-by: Nic Cope <negz@rk0n.org>
Per crossplane-contrib#802 there seems to be some lag between when some EC2 networking resources (RouteTables, InternetGateways) are created and when they actually show up in queries. This commit leverages crossplane/crossplane-runtime#280 to allow for this. Signed-off-by: Nic Cope <negz@rk0n.org>
Per crossplane-contrib#802 there seems to be some lag between when some EC2 networking resources (RouteTables, InternetGateways) are created and when they actually show up in queries. This commit leverages crossplane/crossplane-runtime#280 to allow for this. Signed-off-by: Nic Cope <negz@rk0n.org>
This commit moves the responsibility of setting the external-create-time annotation into the managed.Reconciler. As part of this we now persist annotations after every successful create. Signed-off-by: Nic Cope <negz@rk0n.org>
@@ -631,6 +650,12 @@ func (r *Reconciler) Reconcile(_ context.Context, req reconcile.Request) (reconc | |||
return reconcile.Result{Requeue: true}, errors.Wrap(r.client.Status().Update(ctx, managed), errUpdateManagedStatus) | |||
} | |||
|
|||
if observation.ResourcePending { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to calculate this here instead of expecting controller to return? I wonder if we can say ignore ResourceExists: false if the creation time annotation is set and X seconds delay has not passed yet
where X is given in Setup
function of the controller. I guess the downside is that retries will have some delay (mostly 1-2s maybe?) but it'd let us not have to worry about eventually consistent APIs one by one when people open issues about leaking.
// Create implementations are advised not to alter status, but | ||
// we may revisit this in future. | ||
a := managed.GetAnnotations() | ||
if err := retry.OnError(retry.DefaultRetry, resource.IsAPIError, func() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we run this update with retries only in case external name assigned because we lose the status when we do this operation and there are cases where information is only available as result of creation output. One possible solution could be to save the creation time before we call Create
though that would require two updates with retries, the one after create being inside conditional, which increases the complexity here.
I think this issue affect almost all AWS services due to a generated ID in external-name, so I think keeping the code duplication/complexity per controller low would be nice. So I wonder if it would be possible with an API more like: // SetupRouteTable adds a controller that reconciles RouteTables.
func SetupRouteTable(mgr ctrl.Manager, l logging.Logger, rl workqueue.RateLimiter, poll time.Duration) error {
name := managed.ControllerName(v1beta1.RouteTableGroupKind)
return ctrl.NewControllerManagedBy(mgr).
Named(name).
WithOptions(controller.Options{
RateLimiter: ratelimiter.NewDefaultManagedRateLimiter(rl),
}).
For(&v1beta1.RouteTable{}).
...
+ managed.WithRecreateGraceDelay(3*time.Minute),
managed.WithLogger(l.WithValues("controller", name)),
...
} Now I have not thought through how it would look from the runtime side, I just feel that it would be hard to review all controllers if the per-controller code is similar to provider-aws#805:
Isn't there almost enough information in the current observe code to know we should wait, if we just do the small API addition suggested above?
|
Closing in favor of #283. |
Description of your changes
Adds plumbing to fix at least one variant of crossplane-contrib/provider-aws#802.
Per crossplane-contrib/provider-aws#802 some external APIs (including some AWS APIs) appear to experience some delay between the time a new resource is successfully created and the time at which that resource appears in queries.
This commit adds a new 'crossplane.io/external-create-time' annotation and a new 'ResourcePending' field in the Observation struct. Together these can be used by an Observe method to allow for a small grace period before it determines that a resource does not exist. For example:
I have:
make reviewable test
to ensure this PR is ready for review.How has this code been tested
Per #279 I've updated the AWS
RouteTable
andInternetGateway
controllers to use the above pattern and run the script in https://gist.github.com/negz/e1f2e74f18802d15440214a1a1abc981 for 12+ hours. I did not observe any leaked route tables or internet gateways during my testing, whereas without this fix I reliably see 2-3 leaked resources after 2-3 hours.