Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed owner reference from etcd statefulsets #48

Merged

Conversation

georgekuruvillak
Copy link
Contributor

@georgekuruvillak georgekuruvillak commented Apr 23, 2020

What this PR does / why we need it:
This PR removes the dependance on ownerReference in etcd statefulset. It has been observerd that ownerReference in statefulset prevents VPA from recommending values for vertical scaling.
Which issue(s) this PR fixes:
Fixes #47
Special notes for your reviewer:

Release note:

Removed owner reference from etcd StatefulSet so that HVPA can recommend resource recommendations. VPA does not support StatefulSet having `ownerReferences` set to another top-level controller .

@georgekuruvillak georgekuruvillak requested a review from a team as a code owner April 23, 2020 06:14
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 23, 2020
@georgekuruvillak georgekuruvillak changed the title Removed owner reference from etcd statefulsets [Do-Not-Merge] Removed owner reference from etcd statefulsets Apr 23, 2020
@gardener-robot-ci-3 gardener-robot-ci-3 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Apr 23, 2020
@georgekuruvillak georgekuruvillak changed the title [Do-Not-Merge] Removed owner reference from etcd statefulsets Removed owner reference from etcd statefulsets Apr 23, 2020
Copy link
Collaborator

@amshuman-kr amshuman-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Can we add the test cases to cover at least the following edge cases?

  1. Reconciliation of new etcd resource deployment without any existing statefulsets.
  2. Reconciliation of new etcd resource deployment with adoption of existing statefulset.
  3. Reconciliation of existing etcd resource with statefulset with ownerReference.
  4. Reconciliation of deletion of etcd resource with statefulset with ownerReference.
  5. Reconciliation of deletion of etcd resource with statefulset without ownerReference but with owned-by annotations.
  6. Reconciliation of deletion of etcd resource with statefulset without ownerReference and without owned-by annotations.

Also, can you please add the reasoning to the release notes?

controllers/etcd_controller.go Show resolved Hide resolved
Copy link

@swapnilgm swapnilgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you for PR. Please address the comments.

controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/etcd_controller.go Show resolved Hide resolved
controllers/etcd_controller.go Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 23, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 24, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 24, 2020
Copy link

@swapnilgm swapnilgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, i thought of refactoring controller-ref-manager.go as separate task. But since we are touching part of it, let's at-least refactor and reusae the code that is being touched here.

controllers/etcd_controller.go Outdated Show resolved Hide resolved
charts/etcd/templates/etcd-bootstrap-configmap.yaml Outdated Show resolved Hide resolved
charts/etcd/templates/etcd-service.yaml Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
controllers/suite_test.go Show resolved Hide resolved
controllers/etcd_controller_test.go Show resolved Hide resolved
@georgekuruvillak georgekuruvillak force-pushed the remove_ownerref_from_sts branch 2 times, most recently from ce80ff5 to f64df55 Compare April 25, 2020 09:18
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 27, 2020
Copy link

@swapnilgm swapnilgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanx for the test. I have tested the shoot creation/update with Gardener as well. Looks fine.

@ggaurav10 Will you please test against VPA/HVPA? Post that we can merge the PR and release.

@ggaurav10
Copy link

@georgekuruvillak and I have already tested working from VPA/HVPA side, and it worked as expected.

Copy link
Collaborator

@amshuman-kr amshuman-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the test cases! Suggested some changes.

Apart from the suggestions below, I see quite a bit of code copied with variations in the test cases. Would it make sense to clean them up? Table tests could help.

api/v1alpha1/etcd_types.go Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Outdated Show resolved Hide resolved
controllers/controller_ref_manager.go Show resolved Hide resolved
controllers/controller_ref_manager.go Show resolved Hide resolved
time.Sleep(2 * time.Second)
err = c.Delete(context.TODO(), instance)
Expect(err).NotTo(HaveOccurred())
err = retryTillStatefulSetRemoved(c, &sts)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Eventually instead.

Expect(err).NotTo(HaveOccurred())
err = retryTillStatefulSetRemoved(c, &sts)
Expect(err).NotTo(HaveOccurred())
err = retryTillEtcdRemoved(c, instance)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Eventually instead.

})

func getStatefulset(c client.Client, instance *druidv1alpha1.Etcd, ss *appsv1.StatefulSet) {
func retryTillPodsDeleted(c client.Client, etcd *druidv1alpha1.Etcd) (*corev1.Pod, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Eventually instead.

return pod, err
}

func retryTillEtcdRemoved(c client.Client, etcd *druidv1alpha1.Etcd) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Eventually instead.

controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Apr 28, 2020
@georgekuruvillak
Copy link
Contributor Author

@amshuman-kr I have changed the tests to table tests. PTAL

Copy link
Collaborator

@amshuman-kr amshuman-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for refactoring the tests. Could you please add a couple of test cases (one each for creation and deletion scenario) which are small variations on the existing cases. I think they would strengthen druid against some common use-cases.

case WithOwnerReference:
ss = createStatefulsetWithOwnerReference(instance)
Expect(len(ss.OwnerReferences)).ShouldNot(BeZero())
default:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a case for pre-existing StatefulSet with annotations? I know this won't happen when we make the next release. But it would be good to cover the case which would happen for every reconciliation after that. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pondering upon how the test would be like, what would be the expected behavior of the test. That the annotations still persist after the reconcile?

controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
controllers/etcd_controller_test.go Outdated Show resolved Hide resolved
Eventually(func() error { return statefulSetRemoved(c, &sts) }, timeout, pollingInterval).Should(BeNil())
Eventually(func() error { return etcdRemoved(c, instance) }, timeout, pollingInterval).Should(BeNil())
},
Entry("when statefulset with ownerReference and without owner annotations, druid should adopt and delete statefulset", "foo61", WithOwnerReference, false),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a case of pre-existing StatefulSet with no ownerReference and no annotations in which case the it should not be removed?

Copy link
Contributor Author

@georgekuruvillak georgekuruvillak Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, if there is no owner reference/ owner annotations in the sts, the druid will adopt it if the selector match. This is because, in the pre druid scenario, we didnt have both and druid went ahead adopting it. Should we not continue the behavior? I have written the test for the same in the following line below.

Copy link
Collaborator

@amshuman-kr amshuman-kr Apr 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it won't adopt if the etcd being deleted, no? Then the StatefulSet would be left untouched. A negative case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please make the change for the negative case, I mentioned above? I did not mean the adoption+deletion case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on offline conversation. We can take it forward to another PR since this requires reconcile call to be brought in into the test func.

Copy link
Collaborator

@amshuman-kr amshuman-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the test cases! The one open case can be addressed in another PR.

@georgekuruvillak georgekuruvillak merged commit 6838160 into gardener:master Apr 28, 2020
@georgekuruvillak georgekuruvillak deleted the remove_ownerref_from_sts branch April 29, 2020 07:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] HVPA not able to scale etcd.
7 participants