
Make Zenith SSHD more robust #364

Merged

merged 5 commits into main on May 17, 2024
Conversation

assumptionsandg
Contributor

@assumptionsandg assumptionsandg commented Apr 24, 2024

  • Set SSHD replicas to 3 by default
  • Use topology spread constraints to spread those 3 pods over the workers as evenly as possible (i.e. maxSkew: 1)
  • Use a pod disruption budget with maxUnavailable: 1 to ensure that only one SSHD pod is evicted at a time
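The three bullet points above could be sketched in Kubernetes manifests roughly as follows. This is illustrative only: the resource names, labels, and image are assumptions, not taken from the actual chart.

```yaml
# Sketch only: names, labels and image are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zenith-sshd
spec:
  replicas: 3                       # 3 replicas by default
  selector:
    matchLabels:
      app.kubernetes.io/name: zenith-sshd
  template:
    metadata:
      labels:
        app.kubernetes.io/name: zenith-sshd
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                # spread pods over nodes as evenly as possible
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: zenith-sshd
      containers:
        - name: sshd
          image: zenith-sshd:latest  # hypothetical image reference
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: zenith-sshd
spec:
  maxUnavailable: 1                 # evict at most one SSHD pod at a time
  selector:
    matchLabels:
      app.kubernetes.io/name: zenith-sshd
```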

@assumptionsandg assumptionsandg force-pushed the feature/pdb-zenith branch 2 times, most recently from 1a71ad2 to 2edb4bc on April 24, 2024 14:28
@mkjpryor mkjpryor changed the title Add PodDisruptionBudget to Zenith SSHD Make Zenith SSHD more robust Apr 25, 2024
@mkjpryor
Member

@assumptionsandg There are more things than this that we need to do to make Zenith SSHD properly robust. I have changed the title and added some items in the description to reflect this.

@assumptionsandg assumptionsandg marked this pull request as ready for review May 1, 2024 11:28
@assumptionsandg assumptionsandg requested a review from a team as a code owner May 1, 2024 11:28
@mkjpryor
Member

mkjpryor commented May 17, 2024

Topology spread should work in an AIO (all-in-one) deployment, and will make sure that the pods are scheduled on different nodes by preference on an HA cluster, so I would prefer to have it back...

The reason for using topology spread rather than anti-affinity is specifically that you specify a maxSkew between nodes. So if there is one node and 3 replicas, all 3 should go on the same node, because there is no other node to compare skew against.

If it wasn't working, we should work out why.
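One possible culprit worth checking (my assumption, not confirmed anywhere in this thread): the `whenUnsatisfiable` mode of the constraint. With `DoNotSchedule` the constraint is hard, and pods can stay Pending if no placement satisfies the skew or the node lacks the `topologyKey` label; `ScheduleAnyway` treats the skew only as a scheduling preference.

```yaml
# Sketch: the two whenUnsatisfiable modes; labels are illustrative assumptions.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    # DoNotSchedule makes the constraint hard: pods stay Pending if the
    # skew cannot be satisfied or the node is missing the topologyKey label.
    # ScheduleAnyway makes it a soft preference instead.
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: zenith-sshd  # assumed label
```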

@mkjpryor
Member

@JohnGarbutt Why 2 and not 3 for the replicas? I have a slight preference for 3 as it will result in one per node in our default HA setup.

@JohnGarbutt
Member

It was failing on the single-node deployment, so I was mostly suggesting dialling a few things back to see if that got it working. Now that we have something working, we could always look at adding maxSkew back in... It looked like it was failing to create the first pod, but was OK if you already had one pod running, if I understand what we found out correctly?

@mkjpryor mkjpryor changed the title Make Zenith SSHD more robust Use pod disruption budget for Zenith SSHD May 17, 2024
@mkjpryor mkjpryor changed the title Use pod disruption budget for Zenith SSHD Make Zenith SSHD more robust May 17, 2024
Member

@mkjpryor mkjpryor left a comment


LGTM

@mkjpryor mkjpryor merged commit 0a79612 into main May 17, 2024
10 checks passed