[EKS] [request]: Automatic autoscaling for CoreDNS addon #1458
We do some configuration in the CoreDNS ConfigMap to avoid DNS resolution for intra-cluster queries and to do some caching. While migrating to EKS and using the EKS-managed CoreDNS add-on, we would also like to configure these options, preferably using eksctl.
@sanusatyadarshi There's an existing issue (#1275) for supporting customisation of the CoreDNS configuration.
Related: #1679
We ran into an issue today where CoreDNS was not auto-scaled properly, and it was the cause of a partial cluster outage.
Hey, any update on this one? Advanced configuration for CoreDNS is now supported, which is very nice, but autoscaling (with CPA, for example) is not exposed as a flag in the advanced configuration the way kOps enables it.
In our project we have also had an incident, and for now we scale it manually, but this should be temporary. Could you please update us?
You do not need to scale it manually; you can use https://github.com/kubernetes-sigs/cluster-proportional-autoscaler, which is commonly used to autoscale CoreDNS.
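For reference, here is a minimal sketch of running the cluster-proportional-autoscaler against the `coredns` Deployment, adapted from the upstream dns-horizontal-autoscaler example. The image tag, ConfigMap name, and parameter values are illustrative assumptions to adjust for your cluster, and the ServiceAccount/RBAC objects are omitted:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dns-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: dns-autoscaler
  template:
    metadata:
      labels:
        k8s-app: dns-autoscaler
    spec:
      # assumes a dns-autoscaler ServiceAccount with RBAC to watch nodes
      # and patch the target Deployment exists
      serviceAccountName: dns-autoscaler
      containers:
      - name: autoscaler
        image: registry.k8s.io/cpa/cluster-proportional-autoscaler:v1.8.9  # tag is an assumption
        command:
        - /cluster-proportional-autoscaler
        - --namespace=kube-system
        - --configmap=dns-autoscaler
        - --target=Deployment/coredns
        - --default-params={"linear":{"coresPerReplica":256,"nodesPerReplica":16,"min":2}}
        - --logtostderr=true
        - --v=2
```

The autoscaler watches the cluster's node and core counts and patches the replica count of the target Deployment accordingly; the linear parameters can be tuned live via the named ConfigMap.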
It would be very nice to have cluster-proportional-autoscaler available as an EKS addon to solve this problem.
Make sure to set it.
My team also experienced a significant outage of the cluster because CoreDNS was overloaded.
If you don't need Kubernetes-internal DNS in your containers (some workloads don't), you can add a DNS server as a sidecar. This way every container has its own DNS server which proxies queries upstream. By doing this you can skip CoreDNS completely - DNS queries just work at all times. I understand that this is not something that everybody can do, but it's just a thought. I wrote a special DNS proxy for this purpose: https://github.com/matti/harderdns - it would not be too hard to also add CoreDNS as an upstream, in which case harderdns would proxy the request to CoreDNS and retry it without causing the client to fail. It's also possible to run this in the same container like this:
and then launch it in the background in your entrypoint.
Piggybacking on @jwenz723. I have seen CoreDNS get overwhelmed with requests under default settings a number of times as well. Kubernetes has a documented solution using the cluster-proportional-autoscaler (Kubernetes dns-autoscaler). Could there be a configuration option for the EKS CoreDNS add-on to toggle the autoscaler on or off? It would be off by default. Using the CoreDNS suggestions here, you could come up with a default setting and allow customers to override the defaults through the same configuration values.
Thank you all for patiently waiting. [1] https://docs.aws.amazon.com/eks/latest/userguide/coredns-autoscaling.html
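Per the doc in [1], autoscaling is enabled through the add-on's optional configuration values. A minimal sketch (field names follow the AWS docs; the replica bounds are illustrative assumptions):

```json
{
  "autoScaling": {
    "enabled": true,
    "minReplicas": 2,
    "maxReplicas": 10
  }
}
```

These values can be applied when updating the add-on, e.g. `aws eks update-addon --addon-name coredns --configuration-values file://values.json`.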
@sjastis, from the doc you shared:
how many nodes or cores does it need to trigger a CoreDNS scale-up?
Our default algorithm scales per "coresPerReplica": 256, "nodesPerReplica": 16. However, the idea is to abstract this away from users and evolve it based on internal heuristics.
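Those defaults follow the cluster-proportional-autoscaler's linear mode: the replica count is the larger of the cores-based and nodes-based ratios, rounded up. A sketch of the calculation (assuming no additional min/max clamping beyond a floor of one replica):

```python
import math

def coredns_replicas(cores: int, nodes: int,
                     cores_per_replica: int = 256,
                     nodes_per_replica: int = 16) -> int:
    """Linear-mode replica count: the larger of the two ratios, rounded up."""
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    return max(1, by_cores, by_nodes)

# e.g. a 100-node cluster with 400 total cores:
# max(ceil(400/256) = 2, ceil(100/16) = 7) -> 7 replicas
```

With these defaults, the node count dominates on typical clusters: a scale-up is triggered roughly every 16 nodes (or every 256 cores, whichever comes first).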
@sjastis I've gone ahead and tried the feature out in one of our smaller EKS clusters (EKS 1.28 with addon version v1.10.1-eksbuild.11). My configuration values look like:
I've set conflict resolution to overwrite, and the changes seem to be applied successfully. However, I don't see any new resources related to the autoscaler created, and my replica count doesn't scale up from 2. What should I be seeing in the cluster (if anything)? Also, everything I can see related to this implementation suggests that a cluster-proportional autoscaler is being used to do the autoscaling. As per coredns/coredns#5915 (comment), can we have the option to do the autoscaling using an HPA?
@kenny-monster At EKS, the primary scale limit of CoreDNS is the 1024 PPS limit from CoreDNS to the Amazon-provided DNS servers; that's the reason we chose to horizontally scale CoreDNS based on overall cluster size. We chose not to use HPA at this time due to its requirement of Metrics Server (which is not enabled in EKS clusters by default). Our implementation is separate from the upstream CPA, and we have designed the API to hide the implementation details, so based on usage and feedback we'll evolve our implementation to consider more metrics in the future (e.g. if Metrics Server is installed, leverage metrics such as CPU/memory as a feedback loop).
Ah. Yep, looks like that's the reason. Sorry for not reading more carefully. I'll be keeping an eye on this feature.
@M00nF1sh FYI - 1.30 is missing from the compatibility table.
Community Note
Tell us about your request
Currently, the AWS-managed add-on for CoreDNS simply deploys a Deployment with 2 pods, with no autoscaling of the Deployment as the cluster scales. This results in users encountering problems when their cluster scales to the point where these two pods are no longer able to cope with the level of DNS traffic. Given that kOps-deployed clusters provide this autoscaling by default, this is likely to confuse a number of users migrating from kOps to EKS.
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Provide DNS in our clusters that scales as our clusters do by default.
Are you currently working around this issue?
By deleting the AWS managed deploy of CoreDNS and deploying our own with an autoscaler via a helm chart on cluster creation.
Additional context
N/A
Attachments
N/A