Ability to tune the number of typha replicas #1295
Maybe it'd not only be interesting to tune the replica count, but also to disable typha entirely. Adding this capability would make the operator an alternative for the on-prem "50 nodes or less" deployment variant that currently relies on a separate manifest. The end result of that installation variant is currently not covered by the operator, and adding this option would make it possible to use the operator for basically all documented deployment variants.
@bodgit The operator will auto-scale typha, so if only 1 or 2 nodes are present the appropriate number of typhas will be used. Also, if you deploy a large number of nodes, typha will be scaled up as necessary by the operator. Typha will not be scaled to fewer than 3 replicas if there are at least 3 nodes available, as we consider 3 to be the minimum for high availability purposes.

@Omar007 The operator is deploying what we believe to be best practice, and that includes typha with all installation sizes. The operator is a great option for the on-prem 50 node or less deployment. The recommendation that typha should be deployed on clusters with more than 50 nodes should not be taken as a recommendation that it should not be deployed with fewer than 50 nodes.
@tmjd Ok, fair enough. Does that then mean that part of the documentation is basically outdated/deprecated? Small sidenote: it looks like the scale count doesn't take the node type into account and will count both master/control-plane nodes as well as worker nodes. Is that something that should be accounted for?
@tmjd Thanks for the explanation. Is the scaling algorithm documented somewhere, i.e. how many nodes do I need to have before a fourth typha replica is added, etc.?
@Omar007 I don't think the documentation is outdated. It is perfectly fine to deploy without typha for fewer than 50 nodes. Any host that will be running calico/node should be counted for the purposes of typha scaling.

@bodgit The only documentation is in the code 😀:
operator/pkg/common/autoscale.go, Line 17 in 4c523e7
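For readers who don't want to dig through the source, here is a rough sketch of the step function described in this thread. Note that `expectedTyphaReplicas` is a hypothetical name and the 200-node threshold is an assumption pieced together from the comments above (1 or 2 nodes get that many replicas, 3 is the HA minimum from 3 nodes onward, and a fourth replica appears somewhere around 200+ nodes); the authoritative logic lives in `operator/pkg/common/autoscale.go`.

```go
package main

import "fmt"

// expectedTyphaReplicas is a hypothetical sketch of the scaling step
// function described in this thread, NOT the operator's actual code.
// The 200-node step is an assumption based on the comments above.
func expectedTyphaReplicas(nodes int) int {
	switch {
	case nodes <= 0:
		return 0
	case nodes < 3:
		// 1 node -> 1 replica, 2 nodes -> 2 replicas.
		return nodes
	default:
		// 3 is treated as the minimum for HA once 3+ nodes exist;
		// per the thread, a fourth replica appears around 200+ nodes.
		return 3 + nodes/200
	}
}

func main() {
	for _, n := range []int{1, 2, 3, 50, 200, 500} {
		fmt.Printf("nodes=%d -> typha replicas=%d\n", n, expectedTyphaReplicas(n))
	}
}
```

Under these assumed thresholds, a 50-node cluster still runs 3 typha replicas, which matches the behaviour described later in this thread.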
@tmjd Maybe that was a bit strongly worded. What I meant was more along the lines of: unlike how it is presented in the documentation, it's not the preferred/recommended deployment method (anymore?). As such, when I looked at using Calico based purely on the docs (layout, wording, etc.), they suggested to me that:
And when trying to use the operator for the <50 nodes use-case, which the docs seemingly present as preferred, I came upon this issue while trying to figure out how to deploy without typha to match that documented use-case. In reality it seems more like:
Which is fine by me, just not at all clear to me until you elaborated here ;) This information helps a lot in deciding on the deployment strategy, because it basically flattens the whole selection of options down to just one for me: use the operator, which will deploy Calico with typha, the preferred and best-practice deployment.
@tmjd A pointer to the code is fine, that comment block explains it perfectly. I see I will have to get into the realms of 200+ nodes before a fourth typha Pod appears; I think that will be a nice problem to have!
I ran into this issue today. When using cluster-autoscaler, once you scale to a number of nodes that increases the typha replicas, the cluster-autoscaler will never be able to scale the nodes back down to a smaller size that would decrease the number of typha replicas. For example, here the cluster-autoscaler is unable to evict typha pods to scale below 3 nodes:
I believe this would work if the tigera-operator supported the annotation. I can understand why running > 2 typha replicas is recommended for HA during updates, but in smaller non-prod clusters this can be undesirable.
@itmustbejj Can I suggest opening a new issue, as I think adding
I'm going to close this Issue as the original request is not something we're going to expose. |
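For context on what was declined, the knob the original request asked for would presumably have been a new field on the Installation resource, alongside the existing registry and imagePullSecrets fields. The sketch below is purely hypothetical: the field name `TyphaReplicas`, the simplified spec, and the field types are all invented for illustration, and the real Installation API in the tigera/operator repo does not expose this.

```go
package main

import "fmt"

// InstallationSpec is a heavily simplified, hypothetical stand-in for the
// operator's real Installation API. TyphaReplicas is the invented field this
// issue asked for; the maintainers declined to expose it.
type InstallationSpec struct {
	Registry         string   `json:"registry,omitempty"`
	ImagePullSecrets []string `json:"imagePullSecrets,omitempty"` // simplified; the real type differs
	TyphaReplicas    *int32   `json:"typhaReplicas,omitempty"`    // nil would mean "let the operator autoscale"
}

func main() {
	one := int32(1)
	spec := InstallationSpec{TyphaReplicas: &one}
	fmt.Println("requested typha replicas:", *spec.TyphaReplicas)
}
```

A pointer type is used so that leaving the field unset (nil) stays distinguishable from explicitly requesting zero replicas, which is the usual convention for optional numeric fields in Kubernetes APIs.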
Expected Behavior
I want to provision a reasonable number of small test EKS clusters that don't necessarily require a large number of nodes but do need to be fully functional, i.e. calico, cluster-autoscaler, etc. The typha deployment seems to be set to 3 replicas which means each cluster has a minimum of 3 nodes regardless of how utilised they are.
Current Behavior
The typha deployment is set for 3 replicas which means as soon as it's deployed, the autoscaler kicks in and increases the nodes to 3 to satisfy the deployment and will never scale down again. I can edit the deployment once deployed but it seems the change is overwritten again within a few minutes. Do I need 3 for consensus reasons or can I get away with fewer?
Possible Solution
I've made use of the registry and imagePullSecrets fields on the Installation resource as my EKS clusters are entirely private (thanks for those!). Would adding another field here be a potential solution?

Your Environment