
Set Pod Management to 'Parallel' and disallow cluster scale down entirely #621

Merged (2 commits, Mar 3, 2021)

Conversation

@ChunyiLyu (Contributor) commented on Mar 1, 2021

This closes #298

Note to reviewers: remember to look at the commits in this PR and consider if they can be squashed

Summary Of Changes

  • Set podManagementPolicy to Parallel (see the sketch after this list)

  • Use cluster_formation.randomized_startup_delay_range to avoid a race condition between nodes during initial cluster formation

  • This change solves the problem of failing to restart the cluster when all pods are deleted and recreated at once. With podManagementPolicy set to 'OrderedReady', pod 0 is the first one to be recreated, but it may not have been the last node to shut down; since a restarting RabbitMQ node waits for the last node that stopped, pod 0 can block on a peer that the StatefulSet won't create until pod 0 is Ready, deadlocking the restart. This problem was first reported in community slack

  • Disallow cluster scale down entirely. After talking to @mkuratczyk @MirahImage @yaronp68, we agreed that the cluster operator should prevent people from scaling down because it's not a properly supported and tested operation. Once Reconcile() detects a scale down request, it errors, publishes events, and sets ReconcileSuccess to false (see the sketch under the second commit message below)

  • PodManagementPolicy is immutable. For existing clusters, the operator won't be able to update the policy in place. Users would need to manually delete the statefulSet with cascading=false first, and then the operator can recreate the statefulSet with the correct settings. This needs to be mentioned in the release notes
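
For reference, here's a minimal Go sketch of what the two changes in the first bullets amount to. The function names and the delay range values are illustrative assumptions, not the operator's actual builder code:

```go
package main

import (
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
)

// applyPodManagementPolicy flips the StatefulSet to Parallel, so all pods
// are created and deleted at once instead of waiting on pod 0 to be Ready.
func applyPodManagementPolicy(sts *appsv1.StatefulSet) {
	sts.Spec.PodManagementPolicy = appsv1.ParallelPodManagement
}

// startupDelayConf sketches the rabbitmq.conf lines that stagger node boot
// to avoid the peer-discovery race during initial cluster formation. The
// min/max values here are assumptions, not necessarily what this PR uses.
func startupDelayConf() string {
	return "cluster_formation.randomized_startup_delay_range.min = 0\n" +
		"cluster_formation.randomized_startup_delay_range.max = 60\n"
}

func main() {
	sts := &appsv1.StatefulSet{}
	applyPodManagementPolicy(sts)
	fmt.Println(sts.Spec.PodManagementPolicy) // Parallel
	fmt.Print(startupDelayConf())
}
```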

Additional Context

  • A cleaner and more elegant solution would have been CRD-level validation on spec.replicas to prevent people from updating it to a lower number. However, that would involve a webhook, which adds another component for us to maintain (see the sketch after this list). The current solution is easier to achieve, and considering we want to support scale down in the future, it's an OK temporary fix.

  • controller-gen was updated in a previous PR, but the crd tag was not updated. I've included the change in this PR since it's a one-liner.
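
For context, a hypothetical sketch of what the webhook alternative could look like, assuming a controller-runtime validating webhook on the existing RabbitmqCluster API type (with spec.replicas as a *int32); none of this is part of this PR:

```go
package v1beta1

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/util/validation/field"
)

// Hypothetical: reject any decrease of spec.replicas at admission time,
// before the object is ever persisted.
func (r *RabbitmqCluster) ValidateUpdate(old runtime.Object) error {
	oldCluster, ok := old.(*RabbitmqCluster)
	if !ok {
		return apierrors.NewBadRequest(fmt.Sprintf("expected a RabbitmqCluster but got %T", old))
	}
	if r.Spec.Replicas != nil && oldCluster.Spec.Replicas != nil &&
		*r.Spec.Replicas < *oldCluster.Spec.Replicas {
		return apierrors.NewForbidden(
			schema.GroupResource{Group: "rabbitmq.com", Resource: "rabbitmqclusters"},
			r.Name,
			field.Forbidden(field.NewPath("spec", "replicas"), "cluster scale down is not supported"))
	}
	return nil
}
```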

- use cluster_formation.randomized_startup_delay_range
to avoid race condition between nodes during initial
cluster formation
- this change solves the problem of failing to restart
the cluster when all pods are deleted and recreated at
once; with podManagementPolicy set to 'OrderedReady',
pod 0 is also the first one getting recreated, but it
may not be the last node to shut down.
@mkuratczyk (Collaborator) left a comment:
Thanks!

- prevent cluster scale down from happening by checking
current number of replicas vs desired number of replicas
after running statefulSetBuilder.Update()
- return errors, write logs, publish events, and set ReconcileSuccess
to false if a scale down request is detected
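
For illustration, a minimal sketch of the guard this commit describes. The type, field, and helper names below (Reconciler, Log, Recorder, the event reason) are assumptions, not necessarily the operator's real identifiers:

```go
package controllers

import (
	"errors"
	"fmt"

	"github.com/go-logr/logr"
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
)

type Reconciler struct {
	Log      logr.Logger
	Recorder record.EventRecorder
}

// checkForScaleDown runs after statefulSetBuilder.Update() has produced the
// desired StatefulSet: if the desired replica count is lower than the current
// one, log it, publish a warning event on the RabbitmqCluster, and return an
// error so the caller can set ReconcileSuccess to false and stop reconciling.
func (r *Reconciler) checkForScaleDown(cluster runtime.Object, current, desired *appsv1.StatefulSet) error {
	currentReplicas := int32(1) // Kubernetes defaults a nil replicas field to 1
	if current.Spec.Replicas != nil {
		currentReplicas = *current.Spec.Replicas
	}
	desiredReplicas := int32(1)
	if desired.Spec.Replicas != nil {
		desiredReplicas = *desired.Spec.Replicas
	}
	if desiredReplicas < currentReplicas {
		msg := fmt.Sprintf("cluster scale down from %d to %d replicas is not supported",
			currentReplicas, desiredReplicas)
		err := errors.New("unsupported operation")
		r.Log.Error(err, msg)
		r.Recorder.Event(cluster, corev1.EventTypeWarning, "FailedReconcile", msg)
		return fmt.Errorf("%s: %w", msg, err)
	}
	return nil
}
```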
Merging this pull request may close: Look into setting podManagementPolicy to Parallel