Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PodDisruptionBudget to chart #2078

Merged
merged 2 commits into from
Jul 3, 2024
Merged

Conversation

csp33
Copy link
Contributor

@csp33 csp33 commented Jul 1, 2024

Purpose of this PR

When running the operator in production, it would be nice to have a PDB to ensure HA and prevent service disruption.

Proposed changes:

  • Allow deploying a PDB within the chart

Change Category

Indicate the type of change by marking the applicable boxes:

  • Bugfix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that could affect existing functionality)
  • Documentation update

Rationale

Here's a blog post that summarizes the main advantages of PDBs

Checklist

Before submitting your PR, please review the following:

  • I have conducted a self-review of my own code.
  • I have updated documentation accordingly.
  • I have added tests that prove my changes are effective or that my feature works.
  • Existing unit tests pass locally with my changes.

Additional Notes

x

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Copy link
Contributor

@vara-bonthu vara-bonthu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @csp33 !

/approve

@ChenYi015
Copy link
Contributor

@csp33 What's the minimum k8s version needed to support this feature?

Copy link
Contributor

@ChenYi015 ChenYi015 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I have left some comments.

Comment on lines 12 to 16
# Whether to deploy a PodDisruptionBudget. Make sure you deploy at least 2 replicas before enabling this.
pdb:
enabled: false
minAvailable: 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Could you rename pdb to podDisruptionBudget, and use podDisruptionBudget.enable in order to be consistent with other options like webhook.enable, metrics.enable and podMonitor.enable.
  • You can use # -- to add comments, and it will be rendered in chart README.
  • Could you place this option after podMonitor section since it is a less common configuration.
Suggested change
# Whether to deploy a PodDisruptionBudget. Make sure you deploy at least 2 replicas before enabling this.
pdb:
enabled: false
minAvailable: 1
podDisruptionBudget:
# -- Specifies whether to enable pod disruption budget.
# Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/).
enable: false
# -- The number of pods that must be available.
# Require `replicaCount` to be greater than 1.
minAvailable: 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 1 to 13
{{- if $.Values.pdb.enabled }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.fullname" . }}-pdb
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
minAvailable: {{ $.Values.pdb.minAvailable }}
{{- end }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cloud you rename this file to poddisruptionbudget.yaml and the test file to poddisruptionbudget_test.yaml respectively.

Besides, you can add a condition to test whether replica count is greater than 1, since this is needed by the feature.

Suggested change
{{- if $.Values.pdb.enabled }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.fullname" . }}-pdb
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
minAvailable: {{ $.Values.pdb.minAvailable }}
{{- end }}
{{- if .Values.podDisruptionBudget.enable }}
{{- if (gt .ReplicaCount 1) }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.fullname" . }}-pdb
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
minAvailable: {{ .Values.podDisruptionBudget.minAvailable }}
{{- else }}
{{- fail "replicaCount must be greater than 1 to enable pod disruption budget" }}
{{- end }}
{{- end }}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@csp33 csp33 force-pushed the feature/pdb branch 2 times, most recently from 275f9d0 to d87839d Compare July 3, 2024 06:47
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
@csp33
Copy link
Contributor Author

csp33 commented Jul 3, 2024

@csp33 What's the minimum k8s version needed to support this feature?

1.21
https://kubernetes.io/docs/tasks/run-application/configure-pdb/

Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ChenYi015, peter-mcclonski, vara-bonthu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ChenYi015
Copy link
Contributor

/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Jul 3, 2024
@google-oss-prow google-oss-prow bot merged commit b8c9013 into kubeflow:master Jul 3, 2024
2 checks passed
@csp33 csp33 deleted the feature/pdb branch July 3, 2024 07:46
YanivKunda pushed a commit to YanivKunda/spark-operator that referenced this pull request Jul 4, 2024
* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Yaniv Kunda <ykunda@akamai.com>
jbhalodia-slack pushed a commit to jbhalodia-slack/spark-operator that referenced this pull request Jul 22, 2024
* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
jacobsalway pushed a commit to jacobsalway/spark-operator that referenced this pull request Jul 25, 2024
* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
google-oss-prow bot pushed a commit that referenced this pull request Jul 26, 2024
* Update README and documentation (#2047)

* Update docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove docs and update README

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add PodDisruptionBudget to chart (#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README after master merger

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
ChenYi015 pushed a commit to ChenYi015/spark-operator that referenced this pull request Aug 1, 2024
* Update README and documentation (kubeflow#2047)

* Update docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove docs and update README

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add PodDisruptionBudget to chart (kubeflow#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README after master merger

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
(cherry picked from commit 4108f54)
google-oss-prow bot pushed a commit that referenced this pull request Aug 1, 2024
* Update helm docs (#2081)

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
(cherry picked from commit eca3fc8)

* Update the process to build api-docs, generate CRD manifests and code (#2046)

* Update .gitignore

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update .dockerignore

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update Makefile

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update the process to generate api docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update the workflow to generate api docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Use controller-gen to generate CRD and deep copy related methods

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update helm chart CRDs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update workflow for building spark operator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update README.md

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 779ea3d)

* Add topologySpreadConstraints (#2091)

* Update README and documentation (#2047)

* Update docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove docs and update README

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add PodDisruptionBudget to chart (#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README after master merger

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
(cherry picked from commit 4108f54)

* Use controller-runtime to reconsturct spark operator (#2072)

* Use controller-runtime to reconstruct spark operator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update helm charts

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update examples

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 0dc641b)

---------

Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Co-authored-by: jbhalodia-slack <jbhalodia@salesforce.com>
YanivKunda pushed a commit to YanivKunda/spark-operator that referenced this pull request Aug 5, 2024
* Update README and documentation (kubeflow#2047)

* Update docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove docs and update README

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add PodDisruptionBudget to chart (kubeflow#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README after master merger

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
sigmarkarl pushed a commit to spotinst/spark-on-k8s-operator that referenced this pull request Aug 7, 2024
* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
sigmarkarl pushed a commit to spotinst/spark-on-k8s-operator that referenced this pull request Aug 7, 2024
* Update README and documentation (kubeflow#2047)

* Update docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove docs and update README

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add PodDisruptionBudget to chart (kubeflow#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README after master merger

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants