ci: convert k8s charts deployment -> statefulset #3642

Merged: conorsch merged 1 commit into main from statefulset-deployments on Jan 22, 2024

conorsch

Updates the Helm charts used for testnet deployments to use a StatefulSet [0], rather than a Deployment [1], to represent a Penumbra fullnode/validator. The goal is to make the best use of the k8s API for our workloads, which are stateful in the sense that they require attached storage and cannot maintain their identity without it.

We also benefit from ordered rollouts, meaning that future minor version bumps will be applied sequentially, and paused if any node fails to become ready. This will ensure more predictable behavior as we move toward chain upgrades.
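For readers less familiar with StatefulSets, the ordered rollout behavior comes from fields like the ones in the minimal sketch below. This is not the chart from this PR: the resource names, image, probe port, and storage size are all placeholders chosen for illustration.

```yaml
# Hypothetical sketch: names, image, port, and sizes are placeholders,
# not values taken from the Penumbra charts.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-fullnode
spec:
  serviceName: example-fullnode        # headless Service that gives each pod a stable DNS name
  replicas: 3
  podManagementPolicy: OrderedReady    # the default: pods are created one at a time, in ordinal order
  updateStrategy:
    type: RollingUpdate                # the default: pods are replaced one at a time, highest ordinal first
  selector:
    matchLabels:
      app: example-fullnode
  template:
    metadata:
      labels:
        app: example-fullnode
    spec:
      containers:
        - name: pd
          image: example.invalid/pd:placeholder
          readinessProbe:              # the rollout waits on this before moving to the next pod
            tcpSocket:
              port: 8080               # assumed port, for illustration only
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:                # one PersistentVolumeClaim per pod, kept across restarts
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

During a rolling update the controller waits for each updated pod to be Running and Ready before it touches the next one, so a pod that never becomes ready halts the rollout.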

When performing a chain upgrade, the manual steps taken by a human operator are now significantly simpler. In addition to the conversion to StatefulSets, the relevant charts now have a new feature called "maintenanceMode", defaulting to false, which places nodes in a suspended state so that a human operator can run `pd migrate`. This mode encapsulates a number of finicky manual steps: overriding the command to be "sleep infinity" for both pd and cometbft, altering the securityContext to run as the root user for volume permissions, and then undoing all of that in reverse order.
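A toggle like the maintenance mode described above could be expressed in a Helm template roughly as follows. This is a sketch under assumptions, not the template from this PR: only the `maintenanceMode` value name comes from the description, while the container layout and the `.Values.image.*` keys are hypothetical.

```yaml
# Hypothetical fragment of a StatefulSet pod template in a Helm chart.
# Assumes values.yaml contains `maintenanceMode: false` by default.
      containers:
        - name: pd
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"  # assumed value keys
          {{- if .Values.maintenanceMode }}
          # Suspend the node instead of starting it, and run as root so an
          # operator can work with files on the attached volume.
          command: ["sleep", "infinity"]
          securityContext:
            runAsUser: 0
          {{- end }}
        - name: cometbft
          image: "{{ .Values.cometbft.image }}"                            # assumed value key
          {{- if .Values.maintenanceMode }}
          command: ["sleep", "infinity"]
          securityContext:
            runAsUser: 0
          {{- end }}
```

With a template shaped like this, an operator would presumably set the value to true (for example via `helm upgrade` with `--set maintenanceMode=true`), run `pd migrate` inside the suspended pod, and then set the value back to false so the containers return to their normal commands.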

[0] https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
[1] https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

@conorsch force-pushed the statefulset-deployments branch from 8ca2c7e to 1423b35 on January 22, 2024 22:52
@conorsch marked this pull request as ready for review on January 22, 2024 22:58
@conorsch merged commit 1d688c2 into main on Jan 22, 2024
7 checks passed
@conorsch deleted the statefulset-deployments branch on January 22, 2024 23:12