Calico-typha pods - allow Cluster Autoscaler to evict #2235
Description
Type: bug fix / feature
On our GKE cluster I was looking into why the Cluster Autoscaler (CA) wasn't scaling down some nodes, and the calico-typha pods look like one likely reason. Once I deleted one of the typha pods from a node and let it reschedule onto a different node in the cluster, the node I was working on became a valid candidate for scale down.
Unfortunately, because we run on GKE I can't see the CA logs to confirm this 100%, but it does seem the most likely cause. Since the pods have local node storage attached (at least they do in GKE), I added the safe-to-evict annotation as well as a PDB; a sketch of both changes is below.
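For reference, a minimal sketch of the two changes, assuming the typha Deployment lives in kube-system and is selected by the `k8s-app: calico-typha` label (names and labels may differ from this repo's actual manifests):

```yaml
# Excerpt of the calico-typha Deployment's pod template: the annotation
# tells the Cluster Autoscaler the pod is safe to evict even though it
# uses local storage, so its node can still be considered for scale down.
spec:
  template:
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
---
# PodDisruptionBudget keeping typha available while evictions happen
# during scale down. Namespace and label selector here are assumptions.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: calico-typha
  namespace: kube-system
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: calico-typha
```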
This might not be right or useful, as I'm mostly going by what ships in GKE. GKE also runs pods for typha autoscaling (calico-typha-vertical-autoscaler and calico-typha-horizontal-autoscaler) which probably need the same settings applied.
Perhaps this is something I should feed back directly to GKE rather than via this PR?
Todos
Release Note