[addons/CA] Add support for specifying resources and metrics #10281
Hi @dntosas. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
79851ef to ba32976
/ok-to-test
ba32976 to ec2b3e2
ec2b3e2 to 0b98e56
@@ -134,6 +134,12 @@ spec:
metadata:
labels:
app: cluster-autoscaler
{{ if .Metrics }}
These annotations are not much in use anymore and we tend not to set them. Instead, we expect users to create pod/service monitors themselves.
There are cases like CoreDNS where these annotations seem useful. Do you suggest dropping them completely from here or migrating to a ServiceMonitor? I am cool with both options ^^
Note that since Metrics is a pointer, you need to use WithDefaultBool.
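The pointer pitfall behind this comment can be sketched in plain Go. In `text/template`, a non-nil pointer is truthy even when it points to `false`, so `{{ if .Metrics }}` cannot tell an explicit `false` apart from `true`. The `withDefaultBool` helper below is a hypothetical stand-in written for this example, not the kops implementation; the `spec` type and both template strings are likewise illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// spec is a hypothetical stand-in for the addon spec; only the pointer field matters here.
type spec struct {
	Metrics *bool
}

// withDefaultBool dereferences v, falling back to def when v is nil.
// Illustrative helper only, not the kops function of the same name.
func withDefaultBool(v *bool, def bool) bool {
	if v == nil {
		return def
	}
	return *v
}

// render executes tmplText against a spec carrying the given Metrics value.
func render(tmplText string, metrics *bool) string {
	funcs := template.FuncMap{"WithDefaultBool": withDefaultBool}
	t := template.Must(template.New("t").Funcs(funcs).Parse(tmplText))
	var sb strings.Builder
	if err := t.Execute(&sb, spec{Metrics: metrics}); err != nil {
		panic(err)
	}
	return sb.String()
}

const (
	// naive tests the pointer itself, so any non-nil pointer counts as true.
	naive = `{{ if .Metrics }}on{{ else }}off{{ end }}`
	// guarded tests the pointed-to value, defaulting to false when unset.
	guarded = `{{ if WithDefaultBool .Metrics false }}on{{ else }}off{{ end }}`
)

func boolPtr(b bool) *bool { return &b }

func main() {
	for _, m := range []*bool{nil, boolPtr(false), boolPtr(true)} {
		fmt.Println(render(naive, m), render(guarded, m))
	}
}
```

With a pointer to `false`, the naive template renders `on` while the guarded one renders `off`; for `nil` and for a pointer to `true` the two agree.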
I am not sure we should set any limits. It is rather dangerous to have CAS crashing due to OOM. Maybe it is better to remove limits altogether.
0b98e56 to 6f09f45
Will do that ^ Two more thoughts to consider:
Sorry for the many questions 😋
Bumping patch versions is good. I didn't do that as I suspect more patch versions will be released before kops 1.20, but feel free to add it to this PR. I don't think making replicas configurable is that useful, but if someone makes a good case for it, I will certainly consider it.
@dntosas do you have time to look at wrapping this one up?
Sure ^^ I let it go stale because I didn't understand whether we wanted to close this PR as not needed.
6f09f45 to 48be140
48be140 to d1db788
@dntosas: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
d1db788 to ac11e07
Sorry for the confusion. My intention was to only provide feedback on improving this.
@@ -134,6 +134,12 @@ spec:
metadata:
labels:
app: cluster-autoscaler
{{ if WithDefaultBool .Metrics false }}
If the only thing this flag does is add the annotation, it is not that useful. You could just add the annotations without the flag. Typically such a flag would be used for enabling/disabling CAS serving metrics in the first place.
removed flag also ^^
@@ -83,6 +83,9 @@ func (b *ClusterAutoscalerOptionsBuilder) BuildOptions(o interface{}) error {
if cas.NewPodScaleUpDelay == nil {
cas.NewPodScaleUpDelay = fi.String("0s")
}
if cas.Metrics == nil {
You already default this one to false in the manifest, so you do not need this bit.
removed it ^^
dc0cc6b to 8b7fb57
docs/cluster_spec.md
Outdated
skipNodesWithSystemPods: true
cpuRequest: "100m"
memoryRequest: "300Mi"
metrics: true
If you remove this one, we should be good to go
- Resources

We enable users to set their desired capacity for cluster-autoscaler addon. There are edge cases, especially in big clusters, where autoscaler needs to reconcile a large number of objects thus may need increased memory or increased cpu to avoid saturation.

- Metrics

Cluster autoscaler provides valuable insights for monitoring capacity allocation and scheduling aspects of a cluster. In this commit, we add proper annotation on deployment to enable Prometheus scrape metrics.

We also bump patch version of container images.

Signed-off-by: dntosas <ntosas@gmail.com>
8b7fb57 to 56fe4ba
Cool. Thanks
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dntosas, olemarkus The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
In this PR, we make some additions to the Cluster Autoscaler addon spec.
Resources
We enable users to set their desired capacity for the cluster-autoscaler addon.
There are edge cases, especially in big clusters, where the autoscaler needs
to reconcile a large number of objects and thus may need more memory to
avoid OOM kills or more CPU to avoid saturation.
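As a rough illustration of the resources part, the sketch below fills in default requests when the user leaves them unset and deliberately sets no limits, in line with the review feedback about OOM-related crashes. The `autoscalerSpec` type, the `applyResourceDefaults` function, and the default values (borrowed from the `docs/cluster_spec.md` example in this PR) are assumptions made for the example, not the kops code.

```go
package main

import "fmt"

// autoscalerSpec is a hypothetical, trimmed-down stand-in for the addon config.
type autoscalerSpec struct {
	CPURequest    *string
	MemoryRequest *string
}

// strPtr returns a pointer to s, similar in spirit to fi.String in the diff above.
func strPtr(s string) *string { return &s }

// applyResourceDefaults fills unset requests with assumed defaults and
// leaves user-provided values untouched. No limits are set on purpose.
func applyResourceDefaults(s *autoscalerSpec) {
	if s.CPURequest == nil {
		s.CPURequest = strPtr("100m") // assumed default
	}
	if s.MemoryRequest == nil {
		s.MemoryRequest = strPtr("300Mi") // assumed default
	}
}

func main() {
	// A large cluster overrides memory only; CPU falls back to the default.
	s := &autoscalerSpec{MemoryRequest: strPtr("1Gi")}
	applyResourceDefaults(s)
	fmt.Printf("cpu=%s memory=%s\n", *s.CPURequest, *s.MemoryRequest)
}
```

Keeping the fields as pointers lets the defaulting step distinguish "not set" from an explicit value, which is the same pattern the options builder uses for `NewPodScaleUpDelay`.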
Metrics
Cluster autoscaler provides valuable insights for monitoring the capacity
allocation and scheduling aspects of a cluster. In this commit, we
enable users to add the proper annotations on the deployment so that
Prometheus can scrape metrics.
Signed-off-by: dntosas <ntosas@gmail.com>