Autoscale of hosts and containers based on thresholds on the metrics #3893
Comments
+1
+1
9 similar comments
👍
+1
+1
+1
+1
+1
👍
+1
+1
Thanks for opening this request, LRancez. Any ideas on how this could best be implemented in Rancher? Could this be a catalog entry, or should it be part of the core? Currently Rancher doesn't store any tokens or credentials related to cloud providers. I did some small tests with the websockets the Rancher API exposes, using Digital Ocean. It's pretty straightforward creating and destroying hosts using the API. I'll have another look at this.
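A minimal sketch of what creating a host through the Rancher v1.6 HTTP API could look like, per the comment above. The endpoint path, the `digitaloceanConfig` field names, and the `DO_TOKEN` environment variable are assumptions — check your Rancher version's API docs before relying on any of them.

```python
# Hypothetical sketch: create a Digital Ocean host via the Rancher v1.6 API.
# Endpoint, payload schema, and env var names are assumptions, not verified.
import base64
import json
import os
import urllib.request


def build_host_payload(hostname, size="1gb", region="ams3"):
    """Build the JSON body for a host-create request (pure, testable)."""
    return {
        "hostname": hostname,
        "digitaloceanConfig": {
            "accessToken": os.environ.get("DO_TOKEN", ""),  # assumed env var
            "size": size,
            "region": region,
        },
    }


def create_host(rancher_url, project_id, payload, api_key, api_secret):
    """POST the payload to Rancher with basic auth; returns parsed JSON."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    req = urllib.request.Request(
        f"{rancher_url}/v2-beta/projects/{project_id}/hosts",  # assumed path
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Destroying a host would be the mirror image: a `deactivate` action followed by a `DELETE` on the host resource, again subject to the same API-version caveats.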
Hi @boedy, thanks for the response. As for the credentials, I personally like and trust the way containers handle internal connections and only expose the things that need to be exposed. So maybe simply adding a small DB as part of the stack is more than enough for this.
+1
1 similar comment
+1
Please use the Add Reactions feature to "+1". This way not everyone gets notified. Thanks :)
I just realized that I didn't mention it, but the ability to scale up and down on a predefined time schedule, in parallel with the metrics-based mechanism, would also be very useful.
The best feature. \o/ Also, rebalancing containers between nodes is important. As a helpful example, the closest thing I found to these features is:
Did you see this @alena1108 @vincent99?
+1
3 similar comments
+1
+1
+1
Release v1.2.0-pre1 has experimental support for Kubernetes 1.3, so does that mean we get autoscaling out of the box if we spin up a Rancher Kubernetes environment? I'll have a go when I get some time and report back.
Kubernetes Horizontal Pod Autoscaling doesn't work either, due to #5578
+1
@xaka we're mainly scaling a cluster up automatically when we load so many stacks that the CPU comes under pressure. This happens rarely, since we've tuned each cluster instance type to accommodate the number of containers we're using. We don't scale hundreds of the same container out, so it's either one container per cluster instance (global) or just a single container that gets deployed somewhere onto the cluster. Right now it's just a much easier way of adding/removing hosts to a cluster automatically, and it saves on cost since they're all Spot Instances.

For an application that needs more dedicated resources, we create an application-specific cluster, since Rancher labels currently don't have a way for a host to ONLY run containers with a specific label. This way, if SpotInst sees high CPU usage it auto-scales the application cluster, and since the stack is set up globally, it brings up more hosts/containers to support it. It's like a modified AWS auto-scaling group, but instead of using an AMI for the application it uses containers. There are probably much better ways to do this, and as we progress with our use of Rancher and see how others are doing things, I'm sure it will change.
How far along are we on this? Have we implemented some parameters for automated scaling of containers in Rancher Cattle?
+1.
+1 has this been added yet?
+1
1 similar comment
+1
I'm using Prometheus + Grafana. I set a webhook on Rancher to scale up my webserver, and Grafana sends the webhooks according to the CPU value.
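A hedged sketch of the setup described above: classic Grafana alerting POSTs a JSON notification that includes a `state` field, and Rancher v1.6 webhook receivers are plain POST URLs. The webhook URLs below are placeholders, and the payload shape assumes classic (pre-unified) Grafana alerting.

```python
# Sketch of a Grafana-alert -> Rancher-webhook relay. URLs are placeholders;
# the "state" field matches classic Grafana alert notifications.
import urllib.request

SCALE_UP_URL = "https://rancher.example.com/v1-webhooks/endpoint?key=<scale-up-token>"
SCALE_DOWN_URL = "https://rancher.example.com/v1-webhooks/endpoint?key=<scale-down-token>"


def choose_webhook(alert_payload):
    """Pick which Rancher webhook to fire for a Grafana alert (pure, testable)."""
    state = alert_payload.get("state")
    if state == "alerting":
        return SCALE_UP_URL
    if state == "ok":
        return SCALE_DOWN_URL
    return None  # "no_data", "paused", etc.: do nothing


def fire(url):
    """POST an empty JSON body to the chosen Rancher webhook receiver."""
    req = urllib.request.Request(url, data=b"{}", method="POST")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

In practice Grafana can POST straight to the Rancher webhook URL with no relay at all; the relay only earns its keep if you want to route "alerting" and "ok" states to different scale-up/scale-down receivers, as sketched here.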
+1
3 similar comments
+1
+1
+1
@hugodopradofernandes sounds good, I need to try that! +1 for integration of Prometheus + Grafana in a similar way out of the box!
+1
1 similar comment
+1
+1. Also, I would mention that one of the most underrated metrics for autoscaling is response time, if you're dealing with a typical web app. I don't care (that much!) if my cluster is running at 97% CPU usage if the response time is staying within healthy limits. Similarly, if the response time spikes every time the CPU goes above 25%, then you still want to scale up, even though 25% CPU wouldn't be the scale-up point in most situations. In some situations it makes sense to scale based on what actually matters, the speed of your app, rather than a secondary symptom like CPU. Just my 2c.
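The response-time policy argued for above can be sketched as a tiny pure function: scale on p95 latency rather than CPU, with a dead band so the service doesn't flap. The thresholds and replica bounds are illustrative assumptions, not from any Rancher API.

```python
# Illustrative latency-driven scaling policy. All numbers are assumptions.
def desired_scale(current, p95_latency_ms,
                  target_ms=200, headroom_ms=50,
                  min_replicas=2, max_replicas=20):
    """Return the new replica count for an observed p95 latency."""
    if p95_latency_ms > target_ms:
        desired = current + 1          # too slow: add a replica
    elif p95_latency_ms < target_ms - headroom_ms:
        desired = current - 1          # comfortably fast: remove one
    else:
        desired = current              # inside the healthy band: hold
    return max(min_replicas, min(max_replicas, desired))
```

The `headroom_ms` dead band is the key design choice: without it, a service hovering near the target latency would scale up and down on every evaluation.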
+1
4 similar comments
+1
+1
+1
+1
Prometheus metrics + a webhook trigger based on them should work. But it would be nice to have that functionality out of the box! A webhook for pod autoscaling, plus a webhook for EC2 instances / hosts / worker nodes autoscaling.
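A sketch of the "Prometheus metrics + webhook" idea above: query the Prometheus HTTP API (`/api/v1/query`) for a CPU expression and feed the result to whichever webhook you trigger. The query endpoint and the `[timestamp, "value"]` result shape are real Prometheus API behavior; the PromQL expression and threshold would be your own.

```python
# Query Prometheus for an instant value and extract the scalar result.
# The PromQL expression is an illustrative example.
import json
import urllib.parse
import urllib.request


def extract_scalar(prom_response):
    """Pull the float value out of an instant-query vector result (pure)."""
    result = prom_response["data"]["result"]
    if not result:
        return None
    # Each sample's "value" is a [timestamp, "string-value"] pair.
    return float(result[0]["value"][1])


def query_cpu(prom_url,
              expr='avg(rate(node_cpu_seconds_total{mode!="idle"}[5m]))'):
    """Run an instant query against the Prometheus HTTP API."""
    url = prom_url + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_scalar(json.load(resp))
```

From there, comparing the returned value against a threshold and POSTing to a Rancher webhook (as in the earlier comment) closes the loop.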
+1
The webhook seems like a nice idea and could be a valid workaround for this missing feature. @rancherdev, this issue has been open for 2 years, and I'm notified about a +1 at least twice a week.
This is obviously never going to happen; it's been ignored for 2 years. It's sad, but this is why Rancher is falling out of people's comparison lists when looking at container platforms. It was nice knowing you, Rancher.
@michael-henderson
Since there are a lot of people wanting this, I built a little side project to act as the missing autoscale functionality for Rancher v1.6: https://autoscale.co. I had already implemented autoscaling with Rancher for my own project Codemason, and figured I should spin it off as a separate service for anyone else who might need it. Hope it helps!
With the release of Rancher 2.0, development on v1.6 is limited to critical bug fixes and security patches.
This enhancement request is based on:
https://forums.rancher.com/t/rancher-host-autoscaling/1098
The idea is that we could configure some thresholds on the metrics to scale up or scale down. This could be applied to the containers, and also to the hosts, using the cloud providers' APIs for host creation.
We should probably be able to configure a minimum number of containers and/or hosts when autoscaling is enabled, so it doesn't kill everything under a very low workload. Also a maximum, so that memory leaks or internal errors don't make everything grow indefinitely.
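The min/max guardrails described above amount to a small clamp: never scale below the configured floor (so low load can't kill everything) and never above the ceiling (so a leak can't grow the fleet indefinitely). The default bounds here are illustrative.

```python
# Guardrails for any autoscaling decision: clamp the requested count
# between a configured floor and ceiling. Defaults are illustrative.
def clamp_scale(requested, min_count=1, max_count=10):
    """Return the requested scale, bounded to [min_count, max_count]."""
    return max(min_count, min(max_count, requested))
```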
Thinking about the current structure of Rancher, I imagine that host autoscaling could live at the environment level, with configuration and monitoring thresholds that apply to the host metrics. Note that autoscaling the hosts does not necessarily mean autoscaling the containers on them. This should probably be combined with container autoscaling, to move or scale the containers around to balance the workload.
Container autoscaling could simply live at the service level, monitoring thresholds of the containers that have it enabled.
The flow that I imagine is:
I personally believe that this feature will be extremely useful for cloud environments. What do you think?
Best,