Add HPA to nginx-ingress-controller deployment #78
Conversation
@@ -95,7 +95,7 @@ spec:
   exec:
     command:
     - sleep
-    - "15"
+    - "60"
15 seconds seemed short to me. I saw other charts with HPA use 60 seconds, so I adopted that.
cool, this makes it more graceful
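For context, the `sleep` above sits in the controller deployment's `preStop` lifecycle hook; a minimal sketch of the surrounding block (structure assumed from the diff context, not copied from the chart):

```yaml
# Sketch of the preStop hook the diff above modifies (indentation assumed).
# The sleep gives load balancers time to deregister the pod before nginx exits,
# so in-flight connections can drain gracefully.
lifecycle:
  preStop:
    exec:
      command:
      - sleep
      - "60"
```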
name: {{ .Values.controller.name }}
minReplicas: 2
# This is the maximum number of nodes we allow currently. Having more controllers than nodes should not happen.
maxReplicas: 100
To be hyper-safe. We should probably set this lower, but I don't know of a sensible value.
Not sure if this should be capped at the number of nodes, as @calvix suggested, or maybe at a multiple of the maximum number of nodes. For now I guess let's use 100, and we should somehow test this.
@@ -15,6 +15,7 @@ spec:
   template:
     metadata:
       annotations:
+        cluster-autoscaler.kubernetes.io/safe-to-evict: 'true'
Sneaky: I forgot to add this annotation to the default-backend.
I tested this on a tenant in gauss which has autoscaling. It seemed to work fine, and the count was adjusted from 6 controllers down to 2.
- type: Resource
  resource:
    name: memory
    targetAverageUtilization: 50
Where did this value come from?
I checked some other repos and found the same value here: https://gitlab.com/charts/gitlab/blob/master/charts/nginx/values.yaml#L155-156
Also, @puja108 checked with some contacts, who also recommended 50%. We should see how this works out, though.
Manuel, the maintainer of the IC, said (paraphrased): "nginx above 50% is a strong sign of high load because of the way nginx works; you might waste a bit of resources, but it's safer that way."
IMO it will still most probably use fewer resources than what we have now.
For CPU that makes sense to me, but for memory I'm not so sure.
hmm, right, definitely needs to be tuned separately, and tested
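As a hedged illustration of tuning the two targets separately, the metrics block could look like this (the memory threshold of 80 is purely illustrative, not a tested recommendation; `targetAverageUtilization` matches the `autoscaling/v2beta1` style this chart uses):

```yaml
# Sketch only: separate, independently tunable CPU and memory targets.
metrics:
- type: Resource
  resource:
    name: cpu
    targetAverageUtilization: 50    # nginx above ~50% CPU is a strong sign of load
- type: Resource
  resource:
    name: memory
    targetAverageUtilization: 80    # illustrative: memory is largely static buffers,
                                    # so a higher threshold may avoid needless scaling
```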
Yeah, as Vaclav said: 50% CPU utilization, but not memory.
I am just a little bit scared of how this will behave on real workloads, as 50% memory can be reached very easily on clusters with many ingress definitions, and will probably cause scaling to tons of pods just because of nginx's many buffers. Can we at least somehow limit the number of pods to the number of nodes? That would be the most sane solution for now. We could use similar pod anti-affinity as we do for KVM pods. HPA for critical components should probably be a completely separate story and not sneaked into other stories, as the behavior is hard to predict from testing clusters.
This is a good shout: when each IC instance must create some static resources for each ingress definition, it could be that HPA goes crazy no matter what the load is, once there are just enough ingresses.
Another factor contributing to IC resource usage is the number of CPU cores on the machine. When the IC runs on a big machine with the default config, the number of workers equals the number of cores, and in some cases this contributes a lot to static memory usage. For some customers with 128 or 256 cores per node, there has been specific configuration to reduce the number of workers, but debugging this is occasionally a bit hairy depending on the overall situation. Just my 2 cents.
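The anti-affinity idea mentioned above could be sketched like this (pod label is an assumption; hard anti-affinity on the hostname topology key effectively caps the IC pod count at the number of schedulable nodes):

```yaml
# Hypothetical sketch: at most one IC pod per node, which indirectly
# limits HPA scale-out to the cluster's node count.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          k8s-app: nginx-ingress-controller   # assumed pod label
      topologyKey: kubernetes.io/hostname
```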
apiVersion: apps/v1
kind: Deployment
name: {{ .Values.controller.name }}
minReplicas: 2
I'm thinking we might want to make this configurable, and even if not, I'd suggest changing it to a templated value with a default in values.yaml. Same for max, and the two utilization values below (I'd do them separately).
Use case for making min user-configurable: some customers that run web-scale loads know they never want to go below x ICs, at least at certain times of year, if not always.
Agree the templates should have values with sensible defaults.
Making this user-configurable makes sense to me, since customers may need to tune the HPA settings on production clusters.
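A minimal sketch of the templating, assuming an `autoscaling` section in values.yaml (the names and layout are assumptions, not the final chart structure):

```yaml
# values.yaml (assumed layout)
autoscaling:
  minReplicas: 2
  maxReplicas: 100

# templates/hpa.yaml (sketch: hard-coded numbers replaced by values)
minReplicas: {{ .Values.autoscaling.minReplicas }}
maxReplicas: {{ .Values.autoscaling.maxReplicas }}
```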
Generally LGTM, besides needing more templated values in there.
We can add user configurability of min (maybe even the max and utilization values) in a 2nd PR. /cc @giantswarm/team-batman WDYT
Main thing before we roll this out: we need to test scenarios under load. Maybe @giantswarm/sig-customer can help or has ideas?
Direction is good, and making the IC more dynamic is a big improvement. But yes, it needs careful testing before rollout.
apiVersion: apps/v1
kind: Deployment
name: {{ .Values.controller.name }}
minReplicas: 2
> Agree the templates should have values with sensible defaults. Making this user-configurable makes sense to me, since customers may need to tune the HPA settings on production clusters.

We also talked in product about needing to be super careful here because of the nature of our customers. We'll be rolling out cluster-autoscaling even without this, so no rush.
@@ -16,6 +16,12 @@ configmap:
   server-tokens: "false"
   worker-processes: "4"
 
+autoscaling:
+  minReplicas: 2
+  maxReplicas: 15
Why did you go so low now? The thing is, this default will be rolled out to all customers. I'm pretty sure a lot of our customers go way higher than 15 (I've seen 64).
I want to check on Asgard.
I talked to Vaclav and he recommended lower. Going to test this a bit; none of these values are set in stone yet.
> Why did you go so low now? The thing is, this default will be rolled out to all customers. I'm pretty sure a lot of our customers go way higher than 15 (I've seen 64).

I'm genuinely interested in the load metrics when someone wants to run even 15 nginxes, not to mention 64. I know one customer can have a total of 64, but that's more of a mistake than good practice. A single nginx can drive tons of load given that the hardware has enough resources, and AFAIK we don't have a single tenant with hundreds of nodes that would justify such a high number of instances. Please educate me 🙂
Oh, one more thing here: was it the number of worker processes for a single nginx IC, or the number of IC pods? Ping @puja108
I'm asking because the default config is num_workers == number of CPU cores on the node, which also often leads to wrong configuration when running on very big instance types (but this can be overridden with IC annotations).
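For reference, this chart already pins the worker count via the controller ConfigMap rather than relying on the num_workers == cores default; a sketch of the relevant values.yaml fragment (mirroring the diff shown earlier in this thread):

```yaml
# Sketch: fixing the nginx IC worker count via its ConfigMap values,
# so static memory usage does not scale with the node's core count.
configmap:
  worker-processes: "4"   # value taken from this chart's existing values.yaml
```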
IC pods, and partly that is not because nginx is the bottleneck but the EC2 instance networking (which also depends on the machine type). I think GitLab had 11, so this might be OK, but as Tuomas is saying, it depends on the machine it runs on. Let's just play around and we'll see with time. Maybe @giantswarm/sig-customer then needs to talk to customers and clarify that they might not need that many.
I added the HPA template flag. So by default there is no HPA; we can enable it when tests show good results and we are confident. For now, it can be merged without affecting the current state.
LGTM, just a query on whether we can remove `global` from the templating.
@@ -0,0 +1,27 @@
+{{- if .Values.global.hpa.enabled }}
I don't think we need to use `global`. That was hacky and due to the k8scc migration, as we needed it to install the temp resources.
We still need to simplify the templates, but since the IC migration is disabled for all providers, I think it can be removed.
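A sketch of the simplified gate, with `global` dropped as suggested (the top-level flag name is an assumption):

```yaml
# Sketch: same HPA template gated on a top-level flag instead of .Values.global
{{- if .Values.hpa.enabled }}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .Values.controller.name }}
# ... rest of the HPA spec unchanged ...
{{- end }}
```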
OK, I'll do an extra section.
@pipo02mix Being extra paranoid: what about bumping the release channel to
Tested in ginger. Version bumped and flag moved to a top-level field. Merging.
Out of this discussion: https://gigantic.slack.com/archives/C3C7ZQXC1/p1549018125001100
The basic idea is to add an HPA in order to get a better nginx-ingress setup for autoscaling.
towards https://github.com/giantswarm/giantswarm/pull/2206