Frequent reconciliations while waiting for external resource deletion #305
Comments
We're seeing similar behavior with AWS RDS instances and EKS clusters as well, where deletion takes ~3-10 minutes. In Jet providers, we have callback mechanisms for async calls, so even if we don't requeue after the deletion call, we will get an event once it's finished. For native ones, however, there is no way to know whether the deletion has completed unless you run I think using |
I think in the jet providers this could be in part because we're missing a layer of rate limiting. Most Crossplane providers use a rate limiter that applies exponential backoff from 1 to 60 seconds per individual object, and is further constrained by a global token bucket rate limiter that tries to keep overall requeues at 1 per second (allowing bursts of up to 10). You can see that at https://github.com/crossplane/provider-gcp/blob/cf9303bdf/pkg/controller/database/cloudsql.go#L77. It seems like the jet providers are only using the global rate limiter, not the managed-resource-scoped one, per:
Note that the global rate limiter is never wrapped with a That said, I would expect the global rate limiter alone to level out to reconciling once per second (across all managed resources) with a maximum burst of 10 reconciles per second. Or put otherwise, seeing ~10 reconciles for roughly the first second makes sense to me but after that I'd expect it to happen less frequently. I would expect that native providers - and jet providers if they started using the Note though that I'm in the (slow) process of changing this all per crossplane/crossplane#2595. |
Then what about consuming crossplane-runtime v1.6 in terrajet-based providers and adding the |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as |
What happened?
While trying to delete a Managed Resource that I created with provider-jet-azure, I observed that the resource's controller reconciles very often. To see whether this behavior is normal, I debugged provider-jet-azure and observed that the Reconcile function of crossplane-runtime sets the Requeue value of the Result object to true after deletion is requested.
crossplane-runtime/pkg/reconciler/managed/reconciler.go
Line 735 in 21928d2
When I noticed this, I expected to see the same behavior in native providers, so I debugged provider-azure as well. I observed the same frequent reconciliation there. However, after the second or third reconcile, the SDK's get call no longer finds the resource in the cloud, so the reconcile is not repeated many times. This is therefore not a problem for native providers. In jet providers, however, deletion takes a long time because of the terraform binary, so we observe 5-10 reconciles per second for a few minutes.
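The difference between the two cases can be illustrated with a small model. This is only a sketch under stated assumptions: `result`, `external`, `reconcile`, and `countReconciles` are hypothetical names, `result` merely imitates the Requeue field of the Result object mentioned above, and the external resource is simulated rather than backed by a real cloud API or the terraform binary.

```go
package main

import "fmt"

// result (hypothetical) imitates the Requeue field of the Result object.
type result struct{ Requeue bool }

// external (hypothetical) stands in for a cloud resource whose deletion is
// only visible after a number of further observations.
type external struct {
	deleteRequested bool
	observesLeft    int // observations that still find the resource after deletion is requested
}

// exists reports whether the simulated resource is still found.
func (e *external) exists() bool {
	if !e.deleteRequested {
		return true
	}
	if e.observesLeft > 0 {
		e.observesLeft--
		return true
	}
	return false
}

// reconcile models one pass over a managed resource that is being deleted.
func reconcile(e *external) result {
	if !e.exists() {
		return result{Requeue: false} // resource is gone, nothing left to do
	}
	e.deleteRequested = true
	return result{Requeue: true} // deletion requested (or pending), check again
}

// countReconciles runs reconcile until it stops asking for a requeue and
// returns how many passes were needed.
func countReconciles(e *external) int {
	n := 1
	for reconcile(e).Requeue {
		n++
	}
	return n
}

func main() {
	// Assumed figures for illustration only: a fast deletion that disappears
	// after two more observations, and a slow one that stays visible far longer.
	fast := &external{observesLeft: 2}
	slow := &external{observesLeft: 1000}
	fmt.Println("fast deletion reconciles:", countReconciles(fast))
	fmt.Println("slow deletion reconciles:", countReconciles(slow))
}
```

In this model the number of passes grows with how long the resource stays visible, so without a per-object delay between requeues a slow deletion translates directly into many reconciles.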
How can we reproduce it?
First, create an MR in provider-jet-azure. After the external resource is provisioned and the MR is ready, delete the MR. Then check the provider-jet-azure logs to see the frequent reconciliations.
What environment did it happen in?
Crossplane version: crossplane-runtime v0.15.1-0.20211004150827-579c1833b513