Description
Consider the case where Reconcile calls an external API to notify it of some event and may receive an error. I can handle these errors explicitly within Reconcile to avoid some unnecessary requeues for terminal errors, but I may want to configure a default/backstop policy which requeues as normal (i.e. according to the rate limiter), but stops at a specified failureCount. In other words, make it easier to configure the controller to prevent endless requeueing of all "non-terminal" errors, and therefore prevent spamming the external API by default.
From what I can tell, the current way to get this "per-item requeue limit" is to roll a custom Controller implementation and call c.queue.Forget(item) within processNextWorkItem if the failureCount exceeds some hardcoded number. I see this logic in some of the client-go examples.
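For readers who have not seen that pattern, here is a minimal, dependency-free sketch of it. The failureTracker type is a hypothetical stand-in for the NumRequeues / AddRateLimited / Forget part of a client-go rate-limiting workqueue, and maxRetries and the helper names are assumptions chosen for illustration, not taken from any particular example.

```go
package main

import "errors"

// failureTracker is a hypothetical stand-in for the failure-counting part of
// a rate-limiting workqueue. It only tracks counts; it does not queue or delay.
type failureTracker struct {
	failures map[string]int
}

func newFailureTracker() *failureTracker {
	return &failureTracker{failures: map[string]int{}}
}

// NumRequeues reports how many rate-limited requeues the item has had.
func (t *failureTracker) NumRequeues(item string) int { return t.failures[item] }

// AddRateLimited records one more requeue for the item.
func (t *failureTracker) AddRateLimited(item string) { t.failures[item]++ }

// Forget clears the failure history for the item.
func (t *failureTracker) Forget(item string) { delete(t.failures, item) }

// maxRetries is the hardcoded per-item limit (an assumed value).
const maxRetries = 5

// handleErr requeues a failed item until maxRetries is reached, then drops it.
// It returns true when the item was requeued.
func handleErr(q *failureTracker, item string, err error) bool {
	if err == nil {
		q.Forget(item) // success: reset the failure count
		return false
	}
	if q.NumRequeues(item) < maxRetries {
		q.AddRateLimited(item)
		return true
	}
	q.Forget(item) // limit reached: give up on this item
	return false
}

// attemptsUntilDropped processes an item that always fails and returns how
// many times it was processed before being dropped.
func attemptsUntilDropped(q *failureTracker, item string) int {
	errAlways := errors.New("external API unavailable")
	attempts := 0
	for {
		attempts++
		if !handleErr(q, item, errAlways) {
			return attempts
		}
	}
}
```

In this model an item that always fails is processed maxRetries + 1 times (one initial attempt plus maxRetries requeues) before it is forgotten.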
Adding this to the controller builder options seems natural, since we already have RateLimiter and this is semi-related. The PerItemRequeueLimit option could default to 0, which we would treat as "infinity". Any int setting greater than zero would impose the limit: we check NumRequeues and, if the limit is exceeded, emit metrics and bypass the requeue.
log.V(5).Info("Reconciling")
result, err := c.Reconcile(ctx, req)
switch {
case err != nil:
	if errors.Is(err, reconcile.TerminalError(nil)) {
		// Terminal errors are never requeued.
		ctrlmetrics.TerminalReconcileErrors.WithLabelValues(c.Name).Inc()
	} else if c.PerItemRequeueLimit != 0 && c.Queue.NumRequeues(req) > c.PerItemRequeueLimit {
		// Proposed: limit exceeded, so record a metric and skip the requeue.
		ctrlmetrics.ExceededPerItemRequeueLimitReconcileErrors.WithLabelValues(c.Name).Inc()
	} else {
		// Otherwise requeue according to the rate limiter, as before.
		c.Queue.AddWithOpts(priorityqueue.AddOpts{RateLimited: true, Priority: priority}, req)
	}
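To make the limit semantics concrete, here is a small stdlib-only model of the branch above. It is a sketch under assumptions: requeueLimiter and its fields are hypothetical names, and it assumes NumRequeues goes up by one on every rate-limited add, which is how I understand the per-item failure counter to behave.

```go
package main

// requeueLimiter is a hypothetical model of the proposed option; it is not
// controller-runtime code.
type requeueLimiter struct {
	perItemRequeueLimit int            // 0 means "no limit"
	numRequeues         map[string]int // stand-in for Queue.NumRequeues
	exceeded            int            // stand-in for the proposed metric
}

func newRequeueLimiter(limit int) *requeueLimiter {
	return &requeueLimiter{
		perItemRequeueLimit: limit,
		numRequeues:         map[string]int{},
	}
}

// onError models the non-terminal error branch. It returns true when the
// item is requeued and false when the limit causes it to be dropped.
func (l *requeueLimiter) onError(item string) bool {
	if l.perItemRequeueLimit != 0 && l.numRequeues[item] > l.perItemRequeueLimit {
		l.exceeded++ // would be a metric increment in the real controller
		return false
	}
	l.numRequeues[item]++ // assumed effect of a rate-limited add
	return true
}

// requeuesBeforeDrop counts how often an always-failing item is requeued.
// maxAttempts bounds the loop so the unlimited case (limit 0) terminates.
func requeuesBeforeDrop(limit, maxAttempts int) int {
	l := newRequeueLimiter(limit)
	requeues := 0
	for i := 0; i < maxAttempts; i++ {
		if !l.onError("item") {
			break
		}
		requeues++
	}
	return requeues
}
```

Under this model, requeuesBeforeDrop(3, 100) returns 4: with the > comparison an item is requeued limit + 1 times before being dropped. If the intent is exactly limit requeues, >= may be the comparison to use. With a limit of 0 the item is requeued on every attempt, matching the "infinity" default.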