Support internal DNS names when exporting kubecfg #800

Closed
jschneiderhan opened this issue Nov 3, 2016 · 27 comments · Fixed by #9732
Labels
area/DNS, lifecycle/frozen (indicates that an issue or PR should not be auto-closed due to staleness), P1

Comments

@jschneiderhan

When I created my cluster, I specified the --admin-access flag to the CIDR of a management subnet in my VPC. I connect to the management subnet using a VPN server, which gives me network access to the machines in my cluster.

When I run a kops export kubecfg, it populates my ~/.kube/config using the public DNS entry for my cluster instead, which isn't reachable from my workstation. If I change the server for the cluster in my ~/.kube/config file to the "internal" DNS name that was created for the cluster, everything works. Unfortunately, every time I update the cluster, the ~/.kube/config file resets the server attribute to the public DNS name.

Would it make sense to have a flag to support using the internal hostname in ~/.kube/config?

If so, I'd be happy to give implementing it a shot, although I may need a few hints on implementation ideas.

If not, is there a recommended way to set up kubectl when accessing the cluster over a private network? I'd be happy to submit a PR with some documentation updates if someone wants to help me explain the approach.

@justinsb
Member

justinsb commented Nov 4, 2016

Interesting! Can you explain a little bit more about your setup?

  • Is there any way we can know the internal name?
  • How are you doing this? Is this something we should recommend for people so they can use private hosted zones?

@jschneiderhan
Author

A bit more information about my setup:

I have a few VPCs, all of which existed before I spun up my k8s cluster. One is a management network (172.16.0.0/16) which has an OpenVPN server and a few other monitoring servers in it. The other VPCs are just for different environments (dev/staging/production). Most of our servers are in private subnets without public IPs, with most internet traffic coming through ELBs in public subnets.

The management network has VPC peering connections with the other VPCs, and the OpenVPN server sets up routes to the other VPCs for clients. So when we have a VPN connection established, we can connect to the machines from our workstations via the local IP. It works pretty well, and has been so much easier to manage than our previous solution of using jump boxes. If an engineer leaves or their account is compromised or something, we can cut network access off at the VPN server immediately, before disabling all their other accounts/credentials/other loose ends.

None of that has anything to do with k8s, but I figured it may be useful background info.

When I went to create my staging k8s cluster, my first thought was that it would be great to put all of the masters and nodes into private subnets without public IPs, so all of us engineers could just access the cluster using local IPs over the VPN connection. That would keep all of our k8s instances from being routable over the internet, and the only incoming internet traffic would be through ELBs in public subnets created by k8s services.

That isn't quite possible yet, but I know people are putting awesome work into it (#428)! I figured the default approach of putting the k8s instances in public subnets with public IPs and hostnames was fine for the time being, especially if we could limit access to the k8s API to just local traffic from our management network. I don't really want the k8s API directly accessible over the internet, if possible. It turned out that kops has the --admin-access flag for just that purpose, so I used it when creating my cluster.

This is the command that I used to create my cluster (I changed a few identifiers):

kops create cluster --admin-access 172.16.0.0/16 \
                    --cloud aws \
                    --master-zones us-east-1a,us-east-1b,us-east-1c \
                    --network-cidr 172.20.0.0/16 \
                    --node-count 3 \
                    --vpc vpc-xxxxxx \
                    --zones us-east-1a,us-east-1b,us-east-1c \
                    --name k8s.example.com \
                    --ssh-public-key ~/.ssh/example.pub

Everything came up flawlessly. I actually couldn't believe it. I had just spent the past two weeks working through "kubernetes the hard way" setting everything up manually and tweaking things, then dealing with the fallout of my tweaks, so the fact that a tool did all of that in a matter of minutes was amazing.

Anyway, the only hiccup I had was that when I exported my kubectl config and tried running a command, I couldn't connect to the API. Upon closer inspection, I noticed that my ~/.kube/config file was using api.k8s.example.com as the server address. api.k8s.example.com resolves to the public IP addresses of the master servers. The k8s api is not accessible via the public IP addresses because the --admin-access flag adds a security group rule to limit access to 172.16.0.0/16.

In Route53 I saw that there is a corresponding DNS entry set up for api.internal.k8s.example.com, which resolves to the private IP addresses of the master servers. When I updated my ~/.kube/config file to use the internal DNS entry, everything worked. Now all k8s API traffic is limited to VPN traffic and not accessible via the open internet.

So all of that works, but my ~/.kube/config keeps getting reset to the public DNS entry each time I use kops to update my cluster, so I have to go in and edit it by hand each time. No big deal, but it would be great if it were supported by kops.
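In case it helps anyone else hitting this, the by-hand fix doesn't require editing the file directly; something like the following should work (assuming the cluster entry in your kubeconfig is named k8s.example.com):

kubectl config set-cluster k8s.example.com --server=https://api.internal.k8s.example.com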

I'm not sure how kops can get the internal name when generating the config, but it must be available somewhere, since it's creating it in the first place! I'll poke around a bit and see if I can't figure it out. I do know that if I run kops edit cluster k8s.example.com there is an attribute for masterPublicName: api.k8s.example.com, but no corresponding private attribute. Perhaps an implementation idea would be to store the internal name alongside wherever that public name is being stored (S3 presumably), then provide a flag during kops export kubecfg to specify whether to use the public (default) or private hostname?
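To make the idea concrete, roughly what I'm picturing (the --internal flag here is hypothetical, purely to illustrate):

kops export kubecfg k8s.example.com              # today: server is https://api.k8s.example.com
kops export kubecfg k8s.example.com --internal   # hypothetical: server would be https://api.internal.k8s.example.com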

@gladiatr72

@jschneiderhan, hey there. AWS uses the split-horizon DNS model: if you resolve one of their generated host names from within a VPC's address space you'll get VPC IP addresses; outside of their networks you are served the EIP of the instance.

@jschneiderhan
Author

@gladiatr72 thanks - I've noticed that before but didn't know the formal name. I'm not using AWS-generated host names here, though. I'm using the kops-created DNS names api.k8s.example.com and api.internal.k8s.example.com, which are A records to the public and internal IP addresses of the machines. Those two resolve to the same values from both outside and inside the VPC.

Either way, I'm running my kubectl from my workstation outside the VPC, so my DNS resolution is going to happen from outside the VPC.

@shrabok-surge

@jschneiderhan,

When you connect to your VPN, does it add the AWS internal DNS server (172.16.0.2 in your case) to your host's resolvers?
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/hosted-zones-private.html (See Custom DNS Servers)

@jschneiderhan
Author

@shrabok-surge no - I don't have my VPN server changing any DNS settings client-side. I'm not familiar with private hosted zones so I'll read up on them. Maybe this is a non-issue and is just me not being experienced enough with the networking side of things.

If there is a way to have DNS requests for api.k8s.example.com return the values currently set for api.internal.k8s.example.com, then I can keep api.k8s.example.com in my ~/.kube/config file. I have some reading to do, but it sounds like perhaps I need to set up a private zone and configure OpenVPN clients to route specific domains through to the private zone.
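If I go that route, I think the OpenVPN side is mostly just pushing the VPC resolver (and optionally a search domain) to clients. Based on the OpenVPN docs it would be something like this in the server config, with 172.16.0.2 being the VPC DNS in my case (untested by me, and Linux clients may need an up/down script to actually apply the pushed DNS settings):

push "dhcp-option DNS 172.16.0.2"
push "dhcp-option DOMAIN k8s.example.com"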

@shrabok-surge

Ah, I missed that you have a different namespace for your internal DNS.
As per @gladiatr72's statement, AWS uses the split-horizon DNS model. Usually you will have the same zone name in both:

api.k8s.example.com (public zone)
api.k8s.example.com (private zone)

When you connect to your VPN, you inject the AWS DNS server (172.16.0.2 in your case) and you will automatically resolve the internal entries for api.k8s.example.com rather than the external DNS zone entries.
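For reference, creating the private half of that split-horizon setup and associating it with the VPC can be done with the AWS CLI, roughly like this (the VPC ID is a placeholder and --caller-reference just needs to be a unique string):

aws route53 create-hosted-zone \
    --name k8s.example.com \
    --vpc VPCRegion=us-east-1,VPCId=vpc-xxxxxx \
    --caller-reference k8s-private-zone-1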

In your case, have you tried specifying the internal DNS zone (api.internal.example.com) as the DNS zone parameter on cluster creation?

--dns-zone=api.k8s.example.com

Not sure if it will work and might require a name change (--name=api.k8s.example.com)

Hopefully that provides some help.

@jschneiderhan
Author

jschneiderhan commented Nov 9, 2016

OK, so I think I understand what is being recommended: have k8s.example.com resolve to public IP addresses from the public internet, and to internal IP addresses from inside the VPC and from clients set to use the private hosted zone for DNS resolution.

That would work, but it would then require me to manually manage the DNS entry in the private hosted zone, since I don't think kops will currently do that for me. kops does manage the api.internal.k8s.example.com entry, though, which is why I thought just providing a flag to use that when exporting ~/.kube/config would do the trick.

With the private topology recently being merged, perhaps I'm just trying to put a square peg in a round hole. The main thing is that I don't want my k8s API accessible over the internet - I want the master security group to only allow traffic from my management subnet, and have kubectl use the internal IP addresses. I assume that when using the private topology, this is the case. I'll have to test that out.

If you think this is the case, and maybe I'm trying to do something that the public topology just isn't intended to do, I'm happy to close this issue.

Thank you everyone who has chimed in so far!

@shrabok-surge

shrabok-surge commented Nov 9, 2016

I think we are in the same boat, @jschneiderhan. Currently we are looking to deploy using kops, and we require an internal-only deployment which relies on internal DNS, private subnets, and existing routes with NAT gateways. From what I can tell, some of the private networking stuff is being added to the master branch, but only part of the solution is there. The DNS management built into the kubernetes/kops deployment also gets confused if you have two zones with the same domain. I'm not sure if there is an option to specify the Hosted Zone ID to leverage this, but you have to have an existing VPC already created before the kops deployment. It's a bit of a chicken-and-egg scenario for the current kops deployment from what I can tell.
But I believe we are waiting for the same thing.

@justinsb justinsb added this to the 1.5.0 milestone Dec 28, 2016
@justinsb justinsb added the P1 label Jan 17, 2017
@nckturner
Contributor

We have a similar need - we want the kops-created subnets to be public, that is, to have an internet gateway, but with API and SSH access restricted to an internal IP range. Everything works great when api.internal.clustername.com is used in our ~/.kube/config. The annoying thing is that it gets overwritten every time a cluster update is performed.

Is there any reason not to do what @jschneiderhan suggested and have an option in the kops cluster YAML that specifies which endpoint should be used to populate the ~/.kube/config server scalar?

I'm thinking something like this:

apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: cluster.company.com
spec:
  masterPublicName: api.cluster.company.com
  masterPrivateName: api.internal.cluster.company.com
  masterCfgEndpoint: private

This would populate your ~/.kube/config with the private DNS entry.

I noticed that there is already an api section that allows you to choose between DNS and a load balancer when selecting how your masters are exposed, but this does not seem to control what actually gets populated in the config. We may want kops to create both public and private DNS entries or load balancers, but only choose one of those as the default way to access the cluster.
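For reference, that api section currently looks something like the following (field names as I understand them from the kops docs; I haven't verified whether an internal load balancer fully covers this use case):

spec:
  api:
    dns: {}

versus

spec:
  api:
    loadBalancer:
      type: Internal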

@rdtr
Contributor

rdtr commented Feb 17, 2017

Just for your info -

I'm using Pritunl as a VPN server, and I changed the Pritunl server-side setting so that VPN clients resolve DNS through the VPN (I specified xxx.xxx.0.2, AWS's private name server; the default is 8.8.8.8). Then I associated the managing VPC with the kops hosted zone, and now I'm able to access the cluster from my laptop using the default kubeconfig.

I believe OpenVPN is more flexible than Pritunl, so there may be some way to resolve the issue by tweaking the DNS server settings. That being said, I also think the improvement mentioned above sounds nice-to-have :)

@jkinkead
Contributor

Same issue here. It looks like the dns-controller won't register the external DNS entry if there's no external IP address for it to point to.

I've resolved this by updating the api.clustername entry to a CNAME pointing to api.internal.clustername. This seems likely to be robust to restarts & config regeneration.
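For anyone who wants to script that workaround, the record change is a single UPSERT with the AWS CLI, roughly like this (hosted zone ID and TTL are placeholders, and the names are my shorthand for the real FQDNs):

aws route53 change-resource-record-sets \
    --hosted-zone-id ZXXXXXXXXXXXX \
    --change-batch '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"api.clustername","Type":"CNAME","TTL":60,"ResourceRecords":[{"Value":"api.internal.clustername"}]}}]}'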

@jschneiderhan
Author

FWIW we've recently recreated our clusters using a private topology and this is no longer an issue for us. The ~/.kube/config is set up using internal DNS entries (by making use of an AWS private hosted zone), and we updated our OpenVPN clients to route DNS through the VPN connection. Works perfectly.
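For anyone else going this route, the relevant additions to my original create command were the topology and DNS flags, roughly as below (--networking weave is just an example CNI choice here, not necessarily what we used; the private topology requires picking one, and the other flags are the same as in my earlier command):

kops create cluster --topology private \
                    --dns private \
                    --networking weave \
                    --admin-access 172.16.0.0/16 \
                    --name k8s.example.com
# plus the same --cloud/--master-zones/--zones/--vpc/--ssh-public-key flags as before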

@jkinkead
Contributor

We've played around with both topologies. It's unfortunate how sparse the documentation of them is - we seem to want some features of both types of topologies, and it's quite hard to tell what each provides by default, and what can be overridden.

Specifically, we want:

  • private IPs ONLY for all nodes & master
  • internet access from nodes via a NAT
  • full network access to all nodes & master from corp network

We achieved this by building a new VPC on EC2, with subnets configured manually to accept the traffic we want, and route through a self-managed NAT. If you point kops to the existing subnet as a public topology, it doesn't care that there aren't public IPs for the nodes - except for the DNS registration of the API.
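If it helps anyone, pointing kops at pre-existing subnets is done in the cluster spec; as far as I can tell the subnet entries look roughly like this (the ID is a placeholder and I haven't double-checked every field name):

  subnets:
  - name: us-east-1a
    zone: us-east-1a
    type: Public
    id: subnet-xxxxxxxx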

The problem with a private topology is that it creates bastion nodes, which we don't want or need (and it also wants those to be open to the internet, which again we don't want or need).

@jschneiderhan
Author

@jkinkead if you leave out the --bastion argument when creating the cluster, I don't think kops creates a bastion server (the default is o.Bastion = false). It didn't for us last time we created a cluster (I'm not sure what version we were on). Out of the box the private topology gave us the first and second of your bullet points. The third we solved using OpenVPN.

@chrislovecnm
Contributor

The answer above is spot on. Can we close this issue now or do we have a feature request still?

@bcorijn
Contributor

bcorijn commented Apr 25, 2017

@chrislovecnm for me personally this is still a useful feature. I don't mind (and actually prefer) having my nodes public, but I would prefer the API access to be as private as possible, which includes the DNS records.

@jschneiderhan
Author

@chrislovecnm I, personally, no longer need this feature as we've moved to using a private topology. Looks like some others still think it would be useful, though.

@jkinkead
Contributor

The --topology=private still wants a Utility subnet, which we don't need. I'm not sure exactly what it's used for. I honestly can't tell what private and public even mean with an existing subnet; the documentation is quite sparse.

@jschneiderhan
Author

@jkinkead I know the NAT Gateways are placed in the Utility subnets so that they have public IPs associated with them.

@jkinkead
Contributor

Our existing subnets already have NAT gateways set up, so we don't want that. :)

@chrislovecnm
Contributor

@justinsb is this supported now?

@chrislovecnm
Contributor

@geojaz / @justinsb is this supported now?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 30, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 29, 2018
@bcorijn
Contributor

bcorijn commented Jan 29, 2018

/remove-lifecycle rotten

As mentioned above, I would still appreciate having this feature.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 29, 2018
@chrislovecnm
Contributor

/lifecycle frozen
