
Best method to permanently modify kube-dns configuration? #1089

Closed
pedrobizzotto opened this issue Dec 22, 2017 · 9 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@pedrobizzotto

pedrobizzotto commented Dec 22, 2017

Hello all,

kube-aws 0.9.8 user here

To solve some issues we encountered with name resolution, I applied two modifications to kube-dns: a custom configmap pointing our most used external domains at the VPC DNS endpoint, and a change to the dnsmasq options so that it no longer caches negative responses.

The problem I'm having is that whenever I replace a controller instance, or make any change that results in a new controller instance coming up, both the configmap and the deployment are reset to the default values. I've traced this to the script /opt/bin/install-kube-system, which applies the deployment and configmap files present in /srv/kubernetes/manifest every time a controller instance is created.

Is there a way to make the changes permanent without modifying the cloud-init file for the controller?

Thanks,
Pedro S. Bizzotto

@mumoshu
Contributor

mumoshu commented Dec 26, 2017

@pedrobizzotto Hi, thanks for trying kube-aws!

Is there a way to make the changes permanent without modifying the cloud-init file for the controller?

Unfortunately, no - would you like to make it a feature request?
It would also help to know the specific reason you don't want to modify cloud-config-controller to persist the kube-dns configmap content!

@pedrobizzotto
Author

@mumoshu Hello, thanks for the answer.
The environment I'm working on has an AD server configured as the DNS in the DHCP options for the VPC.
We have issues where that DNS server returns empty responses for some queries, especially non-authoritative ones, and those empty responses get cached by the dnsmasq container of the kube-dns pod.
My workaround was to add a custom configmap pointing the most used external domains at the VPC DNS endpoint, and to change the dnsmasq options so it does not cache negative responses.
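
For reference, this is roughly the shape of what I applied. The domain, the resolver IP and the container index below are placeholders/assumptions, not the exact values from our environment:

# Stub-domain entries forwarding our internal zone to the VPC/AD resolver
# (domain and IP below are placeholders).
kubectl -n kube-system apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"corp.example.local": ["10.0.0.2"]}
  upstreamNameservers: |
    ["10.0.0.2"]
EOF

# On the dnsmasq side, append --no-negcache to the dnsmasq container's args so
# negative (empty) answers are not cached. This assumes dnsmasq is the second
# container in the kube-dns deployment; adjust the index if yours differs.
kubectl -n kube-system patch deployment kube-dns --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/1/args/-", "value": "--no-negcache"}]'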

The reason I want to make the changes without altering cloud-config-controller is that any modification there requires replacing the controller instances before it takes effect. In a dev/QA setup that isn't an issue, but in a production setup it is more serious.

I also just had a case where the deployment and configmap were reset even though we apparently had not replaced any of the controller instances; I'll try to gather more info on this.

Thanks again!

@pedrobizzotto
Author

@mumoshu, hello again,
It seems the systemd unit install-kube-system runs on every reboot of the controller.
Here is a snippet of the output of journalctl -u install-kube-system:

-- Logs begin at Tue 2017-12-19 18:17:04 UTC, end at Thu 2017-12-28 12:19:14 UTC. --
Dec 19 18:17:52 ip-xxx-xxx-xxx-xxx.somedomain.local systemd[1]: Starting install-kube-system.service...
Dec 19 18:17:52 ip-xxx-xxx-xxx-xxx.somedomain.local bash[1041]: activating
...snip...
Dec 19 18:21:02 ip-xxx-xxx-xxx-xxx.somedomain.local retry[2292]: rolebinding "heapster-nanny" created
Dec 19 18:21:02 ip-xxx-xxx-xxx-xxx.somedomain.local systemd[1]: Started install-kube-system.service.
-- Reboot --
Dec 22 15:02:18 ip-xxx-xxx-xxx-xxx.somedomain.local systemd[1]: Starting install-kube-system.service...
Dec 22 15:02:18 ip-xxx-xxx-xxx-xxx.somedomain.local bash[1042]: activating
...snip...
Dec 22 15:02:51 ip-xxx-xxx-xxx-xxx.somedomain.local retry[2257]: configmap "kube-dns" configured
...snip...
Dec 22 15:02:53 ip-xxx-xxx-xxx-xxx.somedomain.local retry[2257]: deployment "kube-dns" configured

Is this the intended behavior?

Thanks!

@mumoshu
Contributor

mumoshu commented Apr 25, 2018

@pedrobizzotto Hi! Sorry for the late reply.
Yes, it is definitely the expected behavior, but I'm not satisfied with it either.
May I ask for your ideas?

Like:

  • kube-aws update updates just the controller nodes; kube-system components are left untouched.
    • You run kubectl apply ... yourself afterwards (see the sketch after this list).
  • A CloudFormation custom resource to literally run kubectl apply from outside the controller nodes?
    • Sounds a bit overkill.
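
For the first option, the manual step afterwards would look roughly like this; the file names are just placeholders for wherever you keep your customized manifests:

# After the stack update finishes, re-apply your own customized manifests.
kube-aws update
kubectl -n kube-system apply -f my-kube-dns-configmap.yaml    # placeholder file name
kubectl -n kube-system apply -f my-kube-dns-deployment.yaml   # placeholder file name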

@pedrobizzotto
Author

Hello, thanks for the response,

I don't know yet how hard it would be to implement, but maybe the script called by the systemd unit should check for the existence of the components and only apply them if they are missing?
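
Just to sketch the idea (this is not the actual install-kube-system code, and the manifest file names are made up):

# Create kube-system components only when they don't exist yet;
# leave anything that already exists untouched.
if ! kubectl -n kube-system get configmap kube-dns >/dev/null 2>&1; then
  kubectl -n kube-system apply -f /srv/kubernetes/manifest/kube-dns-configmap.yaml
fi
if ! kubectl -n kube-system get deployment kube-dns >/dev/null 2>&1; then
  kubectl -n kube-system apply -f /srv/kubernetes/manifest/kube-dns-deployment.yaml
fi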

This can, of course, leave the components in a non-optimal state, for example if you mess up the configuration of the deployments after the cluster has booted, but I think that's something you can fix manually later with kubectl apply.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 23, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 23, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
