Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to provide custom CoreDNS tolerations and affinity #12234

Merged

Conversation

hierynomus
Copy link
Contributor

No description provided.

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Aug 31, 2021
@k8s-ci-robot
Copy link
Contributor

Welcome @hierynomus!

It looks like this is your first PR to kubernetes/kops 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kops has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 31, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @hierynomus. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hierynomus
Copy link
Contributor Author

CLA has been signed.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Aug 31, 2021
@hierynomus
Copy link
Contributor Author

@mikesplain @olemarkus Is there something more needed for this PR? Or you just haven't gotten around to it yet?

@olemarkus
Copy link
Member

Hi, sorry it is on my list. Have you tested the PR locally? I'd be surprised if this actually works.

Further, offering this kind of customization on a core addon is unlikely to be added. We'd at least have to understand the usecase. Kops will not really work on a cluster where there aren't worker nodes workloads cannot be scheduled on by default.

@hierynomus
Copy link
Contributor Author

Hi Ole, No problem!
Yes, we've tested this on our own environment using a source-built kops executable in our provisioning pipeline. The change itself does what it needs to do.

The usecase for us is that our cluster is configured in such a way that by default we want everything in our k8s clusters to run on AWS spot nodes. But CoreDNS needs to be scheduled on the master and on-demand nodes only.

With this change the default behavior is unchanged, but it is now allowed to specify custom tolerations and affinity such as this (taken from our kops cluster definition):

  kubeDNS:
    provider: CoreDNS
    tolerations:
      - effect: NoSchedule
        operator: Exists
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kops.k8s.io/instancegroup
              operator: In
              values:
              - master
              - ondemand-nodes
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: k8s-app
                operator: In
                values:
                - kube-dns
            topologyKey: kubernetes.io/hostname
          weight: 100

@hierynomus
Copy link
Contributor Author

@olemarkus Is the usecase that I described clear?

@olemarkus
Copy link
Member

So this is essentially to avoid coredns from running on spot instances? CoreDNS should work fine on spot instances. Even more so if the NTH addon is enabled. I would specifically avoid running anything on master nodes that absolutely does not have to run there.

@hierynomus
Copy link
Contributor Author

I think it's more generic, that in the case of setting up a kubernetes cluster with multiple instance-groups, that you would want to be able to control where certain pods land. We see the same at our customers who want to target specific workloads to specific instance-groups.

@olemarkus
Copy link
Member

Brought this up in office hours today and we decided this feature looks good.

/ok-to-test
/assign

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 16, 2021
@hierynomus
Copy link
Contributor Author

/retest

@hierynomus
Copy link
Contributor Author

@olemarkus What is still needed to proceed with this PR? I see that there are a few CI jobs which require an approval before they're run (build-*-amd64)

Other than that, I think it needs a code-review/approval?

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 20, 2021
@hierynomus
Copy link
Contributor Author

I took the liberty to create a test for the change this PR adds: 238670b
It can be a bit tricky for first-time contributors to create these, so I thought this may save some time.
Can you cherry that commit into this PR?

It would also be nice if you could squash some of the commits.

Thanks a lot! That would've indeed cost me considerably more time. I've added your commit and squashed them all into one.

@hierynomus
Copy link
Contributor Author

/retest

@olemarkus
Copy link
Member

You need to run ./hack/update-expected.sh to fix those failures

@hierynomus
Copy link
Contributor Author

You need to run ./hack/update-expected.sh to fix those failures

That's indeed what I tried locally before doing the retest, but it didn't show any changes after that command. I'll re-run it

@hierynomus
Copy link
Contributor Author

@olemarkus If I just urn ./hack/update-expected, it executes a bunch of tests, but at the end there is no difference to any of the files. Am I missing something in how to execute this?

@olemarkus
Copy link
Member

Not sure if you noticed, but hack/update-expected.sh fails because of the following error:

bootstrapchannelbuilder_test.go:174: error from BootstrapChannelBuilder Build: error reading manifest addons/coredns.addons.k8s.io/k8s-1.12.yaml: error opening resource: error executing resource template "addons/coredns.addons.k8s.io/k8s-1.12.yaml": error executing template "addons/coredns.addons.k8s.io/k8s-1.12.yaml": template: addons/coredns.addons.k8s.io/k8s-1.12.yaml:121:32: executing "addons/coredns.addons.k8s.io/k8s-1.12.yaml" at <8>: wrong type for value; expected string; got []v1.Toleration

@hierynomus
Copy link
Contributor Author

Missed that in the build logs, weird thing is that doesn't happen locally. Does that behave differently when run on the CI env?

@olemarkus
Copy link
Member

It should happen to you locally as well. Do you see this command exiting successfully?

@hierynomus hierynomus force-pushed the coredns-affinity-tolerations branch 3 times, most recently from 4e37f0b to 56ac813 Compare September 28, 2021 14:34
@hierynomus
Copy link
Contributor Author

Ok, after merging in master, I had the same error locally. The template now uses a newly added ToYAML TemplateFunction to render the v1.Tolerations and v1.Affinity objects as yaml.

Signed-off-by: Jeroen van Erp <jeroen@hierynomus.com>
@hierynomus
Copy link
Contributor Author

@olemarkus Any idea about why that test fails? Is it something I changed, or is it flaky somehow?

@olemarkus
Copy link
Member

Looks like a flake

/retest

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 28, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: olemarkus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 28, 2021
@hierynomus
Copy link
Contributor Author

Thanks @olemarkus! Two tests now failing on that cni-kuberouter test, looks like flakes indeed. /retest

@hierynomus
Copy link
Contributor Author

/test pull-kops-e2e-cni-kuberouter

@k8s-ci-robot k8s-ci-robot merged commit 8f91247 into kubernetes:master Sep 28, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 28, 2021
@hierynomus
Copy link
Contributor Author

Thanks for the guidance @olemarkus!

@hierynomus hierynomus deleted the coredns-affinity-tolerations branch September 29, 2021 07:17
k8s-ci-robot added a commit that referenced this pull request Sep 30, 2021
…234-origin-release-1.22

Automated cherry pick of #12234: Add ability to provide custom CoreDNS Tolerations and
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/addons area/api cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants