Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase or Decrease polling rate for instance in a final state ready or failed. #53

Merged
merged 13 commits into from
Jul 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions api/v1alpha1/constants.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,10 @@ const (
AnnotationMaxRetries = "service-operator.cf.cs.sap.com/max-retries"
// annotation to hold the reconciliation timeout value
AnnotationReconcileTimeout = "service-operator.cf.cs.sap.com/timeout-on-reconcile"
// annotation to increase or decrease the requeue interval at which the operator polls the status of CR after final state ready.
AnnotationPollingIntervalReady = "service-operator.cf.cs.sap.com/polling-interval-ready"
// annotation to increase or decrease the requeue interval at which the operator polls the status of CR after final state failed.
AnnotationPollingIntervalFail = "service-operator.cf.cs.sap.com/polling-interval-fail"
// annotation to adopt orphan CF resources. If set to 'adopt', the operator will adopt orphan CF resource.
// Ex. "service-operator.cf.cs.sap.com/adopt-cf-resources"="adopt"
AnnotationAdoptCFResources = "service-operator.cf.cs.sap.com/adopt-cf-resources"
Expand Down
79 changes: 79 additions & 0 deletions internal/controllers/reconciler_helpers.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
/*
SPDX-FileCopyrightText: 2024 SAP SE or an SAP affiliate company and cf-service-operator contributors
SPDX-License-Identifier: Apache-2.0
*/

/*
Package controllers contains the implementation of various helper functions used by the reconciler.
*/

package controllers

import (
"strconv"
"time"

"github.com/go-logr/logr"
cfv1alpha1 "github.com/sap/cf-service-operator/api/v1alpha1"
ctrl "sigs.k8s.io/controller-runtime"
)

// setMaxRetries sets the maximum number of retries for a service instance based on the value provided in the annotations
// or uses the default value if the annotation is not set or is invalid.
// TODO: Make it Generic so applies to Space and ServiceBinding.
// TODO: Add a test for this function.
func setMaxRetries(serviceInstance *cfv1alpha1.ServiceInstance, log logr.Logger) {
// Default to an infinite number of retries
serviceInstance.Status.MaxRetries = serviceInstanceDefaultMaxRetries

// Use max retries from annotation
maxRetriesStr, found := serviceInstance.GetAnnotations()[cfv1alpha1.AnnotationMaxRetries]
if found {
maxRetries, err := strconv.Atoi(maxRetriesStr)
if err != nil {
log.V(1).Info("Invalid max retries annotation value, using default", "AnnotationMaxRetries", maxRetriesStr)
} else {
serviceInstance.Status.MaxRetries = maxRetries
}
}
}

// getReconcileTimeout reads the reconcile timeout from the annotation on the service instance
// or - if the annotation is not set - uses the default value serviceInstanceDefaultRequeueTimeout
// or else returns the reconcile timeout as a time duration
// TODO: Make it Generic so applies to Space and ServiceBinding.
// TODO: Add a test for this function.
func getReconcileTimeout(serviceInstance *cfv1alpha1.ServiceInstance) time.Duration {
// Use reconcile timeout from annotation, use default if annotation is missing or not parsable
reconcileTimeoutStr, ok := serviceInstance.GetAnnotations()[cfv1alpha1.AnnotationReconcileTimeout]
if !ok {
return serviceInstanceDefaultReconcileInterval
}
reconcileTimeout, err := time.ParseDuration(reconcileTimeoutStr)
if err != nil {
return serviceInstanceDefaultReconcileInterval
}
return reconcileTimeout
}

// getPollingInterval retrieves the polling interval from the annotaion on the service instance
// or - in case the annotation is not set or invalid - returns either the defaultDurationStr or an empty ctrl.Result{}.
// Otherwise, it returns a ctrl.Result with the RequeueAfter field set in the annotation.
func getPollingInterval(annotations map[string]string, defaultDurationStr, annotationName string) ctrl.Result {
pollingIntervalStr, ok := annotations[annotationName]
if ok {
pollingInterval, err := time.ParseDuration(pollingIntervalStr)
if err == nil {
return ctrl.Result{RequeueAfter: pollingInterval}
}
}

// If the polling interval is not set, return the default duration
defaultDuration, err := time.ParseDuration(defaultDurationStr)
if err != nil {
// If the default duration is not parsable, return an empty result
return ctrl.Result{}
}

return ctrl.Result{RequeueAfter: defaultDuration}
}
97 changes: 97 additions & 0 deletions internal/controllers/reconciler_helpers_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
/*
SPDX-FileCopyrightText: 2024 SAP SE or an SAP affiliate company and cf-service-operator contributors
SPDX-License-Identifier: Apache-2.0
*/

package controllers

import (
"testing"
"time"

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
cfv1alpha1 "github.com/sap/cf-service-operator/api/v1alpha1"
ctrl "sigs.k8s.io/controller-runtime"
)

func TestReconcilerHelpers(t *testing.T) {
RegisterFailHandler(Fail)
// RunSpecs(t, "Reconciler Helpers Suite")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to activate this test?

}

var _ = Describe("Create a instance with the polling interval annotation | GetPollingInterval", func() {

It("It should return a RequeueAfter of 10 Seconds time duration", func() {
serviceInstance := &cfv1alpha1.ServiceInstance{
ObjectMeta: metav1.ObjectMeta{
Annotations: map[string]string{
cfv1alpha1.AnnotationPollingIntervalReady: "10s",
},
},
}

result := getPollingInterval(serviceInstance.GetAnnotations(), "100m", cfv1alpha1.AnnotationPollingIntervalReady)
Expect(result.RequeueAfter).To(Equal(10 * time.Second))
})

It("It should return a RequeueAfter of 2 Minutes time duration", func() {
serviceInstance := &cfv1alpha1.ServiceInstance{
ObjectMeta: metav1.ObjectMeta{
Annotations: map[string]string{
cfv1alpha1.AnnotationPollingIntervalFail: "2m",
},
},
}

result := getPollingInterval(serviceInstance.GetAnnotations(), "100m", cfv1alpha1.AnnotationPollingIntervalFail)
Expect(result.RequeueAfter).To(Equal(2 * time.Minute))
})
})

var _ = Describe("Create a ServiceBinding without the polling interval annotation | GetPollingInterval", func() {
It("Should return a ctrl.Result with RequeueAfter of default duration", func() {
defaultDurationStr := "100m"

serviceInstance := &cfv1alpha1.ServiceBinding{
ObjectMeta: metav1.ObjectMeta{
Annotations: map[string]string{},
},
}

result := getPollingInterval(serviceInstance.GetAnnotations(), defaultDurationStr, cfv1alpha1.AnnotationPollingIntervalReady)
Expect(result).To(Equal(ctrl.Result{RequeueAfter: 100 * time.Minute}))
})
})

var _ = Describe("Create a Space instance with an invalid polling interval annotation | GetPollingInterval", func() {
It("Should return a ctrl.Result with RequeueAfter of default duration", func() {
defaultDurationStr := "10h"

serviceInstance := &cfv1alpha1.Space{
ObjectMeta: metav1.ObjectMeta{
Annotations: map[string]string{
cfv1alpha1.AnnotationPollingIntervalReady: "invalid",
},
},
}

result := getPollingInterval(serviceInstance.GetAnnotations(), defaultDurationStr, cfv1alpha1.AnnotationPollingIntervalReady)
Expect(result).To(Equal(ctrl.Result{RequeueAfter: 10 * time.Hour}))
})
})

var _ = Describe("Create a Space instance without annotations and empty defaul time duration| GetPollingInterval", func() {
It("Should return an empty ctrl.Result", func() {
defaultDurationStr := ""

space := &cfv1alpha1.Space{
ObjectMeta: metav1.ObjectMeta{},
}

result := getPollingInterval(space.GetAnnotations(), defaultDurationStr, cfv1alpha1.AnnotationPollingIntervalReady)
Expect(result).To(Equal(ctrl.Result{}))
})
})
2 changes: 1 addition & 1 deletion internal/controllers/servicebinding_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -355,7 +355,7 @@ func (r *ServiceBindingReconciler) Reconcile(ctx context.Context, req ctrl.Reque
return ctrl.Result{RequeueAfter: 10 * time.Minute}, nil
}
// TODO: apply some increasing period, depending on the age of the last update
return ctrl.Result{RequeueAfter: 10 * time.Minute}, nil
return getPollingInterval(serviceBinding.GetAnnotations(), "10m", cfv1alpha1.AnnotationPollingIntervalReady), nil
case facade.BindingStateCreatedFailed, facade.BindingStateDeleteFailed:
serviceBinding.SetReadyCondition(cfv1alpha1.ConditionFalse, string(cfbinding.State), cfbinding.StateDescription)
// TODO: apply some increasing period, depending on the age of the last update
Expand Down
47 changes: 3 additions & 44 deletions internal/controllers/serviceinstance_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,6 @@ import (
"context"
"fmt"
"math"
"strconv"
"time"

"github.com/go-logr/logr"
Expand Down Expand Up @@ -42,7 +41,6 @@ const (

// Default values while waiting for ServiceInstance creation (state Progressing)
serviceInstanceDefaultReconcileInterval = 1 * time.Second
//serviceInstanceDefaultMaxReconcileInterval = 10 * time.Minute

// Default values for error cases during ServiceInstance creation
serviceInstanceDefaultMaxRetries = math.MaxInt32 // infinite number of retries
Expand Down Expand Up @@ -111,7 +109,7 @@ func (r *ServiceInstanceReconciler) Reconcile(ctx context.Context, req ctrl.Requ
// Set a first status (and requeue, because the status update itself will not trigger another reconciliation because of the event filter set)
if ready := serviceInstance.GetReadyCondition(); ready == nil {
serviceInstance.SetReadyCondition(cfv1alpha1.ConditionUnknown, serviceInstanceReadyConditionReasonNew, "First seen")
SetMaxRetries(serviceInstance, log)
setMaxRetries(serviceInstance, log)
return ctrl.Result{Requeue: true}, nil
}

Expand Down Expand Up @@ -385,8 +383,7 @@ func (r *ServiceInstanceReconciler) Reconcile(ctx context.Context, req ctrl.Requ
case facade.InstanceStateReady:
serviceInstance.SetReadyCondition(cfv1alpha1.ConditionTrue, string(cfinstance.State), cfinstance.StateDescription)
serviceInstance.Status.RetryCounter = 0 // Reset the retry counter
// TODO: apply some increasing period, depending on the age of the last update
return ctrl.Result{RequeueAfter: 10 * time.Minute}, nil
return getPollingInterval(serviceInstance.GetAnnotations(), "10m",cfv1alpha1.AnnotationPollingIntervalReady), nil
case facade.InstanceStateCreatedFailed, facade.InstanceStateUpdateFailed, facade.InstanceStateDeleteFailed:
// Check if the retry counter exceeds the maximum allowed retries.
// Check if the maximum retry limit is exceeded.
Expand Down Expand Up @@ -444,43 +441,6 @@ func (r *ServiceInstanceReconciler) SetupWithManager(mgr ctrl.Manager) error {
Complete(r)
}

// IncrementRetryCounterAndCheckRetryLimit increments the retry counter for a ServiceInstance and checks if the number of retries has exceeded the maximum allowed retries.
// The maximum retries is configured per ServiceInstance via the annotation, AnnotationMaxRetries. If not specified,
// a default value is used.
// This function updates the ServiceInstance's Condition and State to indicate a failure when the retry limit is reached.
// Returns:A boolean indicating whether the retry limit has been reached.
func SetMaxRetries(serviceInstance *cfv1alpha1.ServiceInstance, log logr.Logger) {
// Default to an infinite number number of retries
serviceInstance.Status.MaxRetries = serviceInstanceDefaultMaxRetries

// Use max retries from annotation
maxRetriesStr, found := serviceInstance.GetAnnotations()[cfv1alpha1.AnnotationMaxRetries]
if found {
maxRetries, err := strconv.Atoi(maxRetriesStr)
if err != nil {
log.V(1).Info("Invalid max retries annotation value, using default", "AnnotationMaxRetries", maxRetriesStr)
} else {
serviceInstance.Status.MaxRetries = maxRetries
}
}
}

// function to read/get reconcile timeout annotation from the service instance "AnnotationReconcileTimeout = "service-operator.cf.cs.sap.com/timeout-on-reconcile" "
// if the annotation is not set, the default value is used serviceInstanceDefaultRequeueTimeout
// else returns the reconcile timeout as a time duration
func getReconcileTimeout(serviceInstance *cfv1alpha1.ServiceInstance) time.Duration {
// Use reconcile timeout from annotation, use default if annotation is missing or not parsable
reconcileTimeoutStr, ok := serviceInstance.GetAnnotations()[cfv1alpha1.AnnotationReconcileTimeout]
if !ok {
return serviceInstanceDefaultReconcileInterval
}
reconcileTimeout, err := time.ParseDuration(reconcileTimeoutStr)
if err != nil {
return serviceInstanceDefaultReconcileInterval
}
return reconcileTimeout
}

// HandleError sets conditions and the context to handle the error.
// Special handling for retryable errros:
// - retry after certain time interval
Expand All @@ -499,9 +459,8 @@ func (r *ServiceInstanceReconciler) HandleError(ctx context.Context, serviceInst
if serviceInstance.Status.MaxRetries != serviceInstanceDefaultMaxRetries && serviceInstance.Status.RetryCounter >= serviceInstance.Status.MaxRetries {
// Update the instance's status to reflect the failure due to too many retries.
serviceInstance.SetReadyCondition(cfv1alpha1.ConditionFalse, "MaximumRetriesExceeded", "The service instance has failed due to too many retries.")
return ctrl.Result{}, nil // finish reconcile loop
return getPollingInterval(serviceInstance.GetAnnotations(), "",cfv1alpha1.AnnotationPollingIntervalFail), nil // finish reconcile loop
}

// double the requeue interval
condition := serviceInstance.GetReadyCondition()
requeueAfter := 1 * time.Second
Expand Down
2 changes: 1 addition & 1 deletion internal/controllers/space_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -240,7 +240,7 @@ func (r *SpaceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (resu

log.V(1).Info("healthcheck successful")
space.SetReadyCondition(cfv1alpha1.ConditionTrue, spaceReadyConditionReasonSuccess, "Success")
return ctrl.Result{RequeueAfter: 60 * time.Second}, nil
return getPollingInterval(space.GetAnnotations(), "60s",cfv1alpha1.AnnotationPollingIntervalReady), nil
} else if len(serviceInstanceList.Items) > 0 {
space.SetReadyCondition(cfv1alpha1.ConditionUnknown, spaceReadyConditionReasonDeletionBlocked, "Waiting for deletion of depending service instances")
// TODO: apply some increasing period, depending on the age of the last update
Expand Down
58 changes: 58 additions & 0 deletions website/content/en/docs/tutorials/annotations.md
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mentioning the default values/behavior is helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some changes to the document.
This includes information about the default values and behavior under Default Requeue After Interval.

Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
---
title: "Annotations"
linkTitle: "Annotations"
weight: 20
type: "docs"
description: >
How to control and optimize the CF Service Operator behavior via annotations.
---

## Annotation Polling Interval Ready

The AnnotationPollingIntervalReady annotation is used to specify the duration of the requeue after interval at which the operator polls the status of a Custom Resource after final state ready. It is possible to apply this annotations to Space, ServiceInstance and ServiceBiding CRs.

By using this annotation, the code allows for flexible configuration of the polling interval, making it easier to adjust the readiness checking frequency based on specific requirements or conditions.

The value of the annotation is a string representing a duration, such as "100m" or "5h".

Usage:

```yaml
apiVersion: cf.cs.sap.com/v1alpha1
kind: ServiceInstance
metadata:
annotations:
service-operator.cf.cs.sap.com/polling-interval-ready: "3h"
```

In the example above the custom resource will be reconcile every three hours after reaching the state Ready.

**Default Requeue After Interval**

If the annotation AnnotationPollingIntervalReady is not set, the interval duration will be set to 10 minutes by default.

### Annotation Polling Interval Fail

The AnnotationPollingIntervalFail annotation is used to specify the duration of the requeue interval at which the operator polls the status of a Custom Resource after the final states Creation Failed and Deletion Failed. Currently it is possible to apply this annotations to ServiceInstance custom resource only.
RalfHammer marked this conversation as resolved.
Show resolved Hide resolved

By using this annotation, the code allows for flexible configuration of the polling interval, making it easier to adjust the re-queue frequency after the failure based on specific requirements or conditions.

The value of the annotation is a string representing a duration, such as "20s" or "10m".

Usage:

```yaml
apiVersion: cf.cs.sap.com/v1alpha1
kind: ServiceInstance
metadata:
annotations:
service-operator.cf.cs.sap.com/polling-interval-fail: "5m"
```

In the example above the custom resource will be reconcile every five minutes after reaching the final state Failed.

**Default Requeue After Interval**

If the annotation AnnotationPollingIntervalFail is not set, there won't be an immediate requeue. This means the resource will not be re-reconciled right away. The operator will consider the custom resource to be in a stable state, at least for now.

That means there is no default time duration for it, and it will return an empty result, ctrl.Result{}.