Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support PVC expansion 馃ぉ #606

Merged
merged 6 commits into from
Feb 18, 2021
Merged

Support PVC expansion 馃ぉ #606

merged 6 commits into from
Feb 18, 2021

Conversation

ChunyiLyu
Copy link
Contributor

@ChunyiLyu ChunyiLyu commented Feb 16, 2021

Note to reviewers: remember to look at the commits in this PR and consider if they can be squashed

Summary Of Changes

  1. update default PVC 'persistence' storage capacity in statefulSetBuilder
  2. reconcilePVC() check if the default PVC 'persistence' storage capacity has updated; if yes:
    1. delete current statefulSet with delete propagation policy set to "Orphan"
    2. update PVCs manually
    3. recreate statefulSet (statefulSet will be automatically created by the CreateOrUpdate() call later as a result of its deletion in step 1; does not need to be done in reconcilePVC() explicitly)
  3. tested on the system test level (unfortunately testEnv does not create any PVC). The system test makes sure that PVC storage capacity is updated and the new capacity is reflected in the rabbitmq pod, while making sure that the pod was not recreated in this process. I've set 2min timeout for both the PVC and the pod storage capacity update because the speed of such operations seem to be dependent on the k8s cluster load
  4. The PVC controller does not allow shrinking. If users try to shrink the size in RabbitmqCluster CRD, the operator will accept the input but Reconcile() will fail at updating PVCs.
  5. Volume expansion system test is skipped in github action. KinD does not support that.

This feature does not work for any additional PVC that's added via statefulSet override If the default PVC 'persistence' is updated via override, expansion will work only for PVC 'persistence', but not for any additional PVCs. I intentionally left added PVCs out of this feature. We can add the support in the future if necessary.

Additional Context

Local Testing

Unit, integration, and system tests all passing

- kinD does not support volume expansion
@mkuratczyk
Copy link
Collaborator

Great stuff! I managed to expand my volume. :)

I noticed two small issues:

  1. Twice, I saw an error in the logs when deleting the cluster. At least once it was after simply running kubectl rabbitmq create foo; kubectl rabbitmq delete foo. I see two issues here - this error shouldn't happen and when it does, it's not clear whether UID in object meta was empty or not interpolated into the error message:
2021-02-17T08:54:16.762Z	ERROR	controller-runtime.manager.controller.rabbitmqcluster	Reconciler error	{"reconciler group": "rabbitmq.com", "reconciler kind": "RabbitmqCluster", "name": "foo", "namespace": "default", "error": "Operation cannot be fulfilled on rabbitmqclusters.rabbitmq.com \"foo\": StorageError: invalid object, Code: 4, Key: /registry/rabbitmq.com/rabbitmqclusters/default/foo, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 0ca15b9b-4015-4e1f-8006-e647bb8f0566, UID in object meta: "}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.2/pkg/internal/controller/controller.go:301
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.2/pkg/internal/controller/controller.go:252
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.2/pkg/internal/controller/controller.go:215
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1
	/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
	/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
	/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext
	/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.UntilWithContext
	/go/pkg/mod/k8s.io/apimachinery@v0.20.2/pkg/util/wait/wait.go:99
  1. Incorrect string interpolation in the reconciliation message:
2021-02-17T08:33:10.210Z	INFO	controller-runtime.manager.controller.rabbitmqcluster	updating storage capacity from {{10737418240 0} {<nil>} 10Gi BinarySI} to {{16106127360 0} {<nil>} 15Gi BinarySI}	{"reconciler group": "rabbitmq.com", "reconciler kind": "RabbitmqCluster", "name": "foo", "namespace": "default"}

@ChunyiLyu
Copy link
Contributor Author

ChunyiLyu commented Feb 17, 2021

@mkuratczyk thanks for speedy feedback! 馃檹

  1. is fixed. 2. I can reproduce the same error on main branch. I created a separate bug report to look into it.

ReconcileSuccess status.Condidition and publish
events

- correct logging on storage capacity
@Zerpet Zerpet self-requested a review February 17, 2021 11:38
controllers/rabbitmqcluster_controller.go Outdated Show resolved Hide resolved
controllers/reconcile_persistence.go Outdated Show resolved Hide resolved
controllers/reconcile_persistence.go Show resolved Hide resolved
system_tests/system_tests.go Outdated Show resolved Hide resolved
@ChunyiLyu ChunyiLyu merged commit 44521b4 into main Feb 18, 2021
@ChunyiLyu ChunyiLyu deleted the persistence-resize branch February 18, 2021 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants