Skip to content
This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Add ability to configure flags for cluster-autoscaler-de.yaml [Small] #1422

Merged
merged 1 commit into from
Aug 3, 2018

Conversation

tyrannasaurusbanks
Copy link
Contributor

@tyrannasaurusbanks tyrannasaurusbanks commented Jul 20, 2018

What

Exposing cluster autoscaler options via cluster.yaml

Why

I'd like to be able to configure the cluster-autoscaler to make it more sensitive.
We have a lot of clusters of different shapes/sizes/responsibilities. One cluster in particular runs only 1 app (don’t ask :sadpanda:) which I believe makes it incompatible with the cluster-autoscaler default settings. We recently incurred the following outage:

  • cluster-autoscaler scaled down the cluster to give just enough nodes to accommodate the 15 replicas of our app.
  • team deployed new app version
  • rolling update strategy of maxSurge=20% caused 3 new replica's to be added
  • these 3 new replica's couldn't fit in the cluster
  • cluster-autoscaler kicked in and started to provision new nodes, but new resources weren’t available in time and created a backlog in the scheduling of the new pods
  • deploy eventually failed, i got pinged.

How

I’ve added an Options map to the Autoscaler struct which will be templated out in the cluster-autoscaler-deployment.yaml. This allows us (hotelscom) to specifically add the options we need but also give others the chance to override any they see fit.

Notes

I’ve chosen to retain the opinionated kube-aws direction by leaving the current default values hardcoded in the deployment.yaml, instead of setting them as defaults in the map.
I’ve tested that these values can still be overridden if duplicated in the ‘options’ map - it appears the libraries used by the autoscaler give precedence to whatever appears/is-resolved last.

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jul 20, 2018
@tyrannasaurusbanks tyrannasaurusbanks changed the title Add ability to configure flags for cluster-autoscaler-de.yaml Add ability to configure flags for cluster-autoscaler-de.yaml [Small] Jul 20, 2018
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jul 20, 2018
@codecov-io
Copy link

codecov-io commented Jul 20, 2018

Codecov Report

Merging #1422 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1422      +/-   ##
==========================================
+ Coverage   37.57%   37.58%   +0.01%     
==========================================
  Files          74       74              
  Lines        4546     4547       +1     
==========================================
+ Hits         1708     1709       +1     
  Misses       2603     2603              
  Partials      235      235
Impacted Files Coverage Δ
core/controlplane/config/config.go 63.62% <100%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2dea8dd...9708330. Read the comment docs.

Copy link
Contributor

@mumoshu mumoshu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall! Thanks for the improvement. Hopefully we could find a way to extend this feature to other deployments managed by kube-aws in a consistent manner

# Sensible defaults are already configured in the controller-cloud-config but if you wish to override them simply
# add them here and they'll take presedence.
#options:
# flagName: value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nit but would flag-name would better represent that it must be the flag name minus --?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - makes sense. changing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sry this had been in my merge queue for too long! Just merged. Thanks again for your contribution

@mumoshu mumoshu added this to the v0.11.0 milestone Jul 23, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants