Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use '= gethostname();' instead of "<hostname>" #174

Merged
merged 1 commit into from
Aug 1, 2018

Conversation

andyliuliming
Copy link
Contributor

this is the default dhcp config in ubuntu now.
and it works in both azure and other place.

@cfdreddbot
Copy link

Hey andyliuliming!

Thanks for submitting this pull request! I'm here to inform the recipients of the pull request that you and the commit authors have already signed the CLA.

…place. and this is the default dhcp config in ubuntu now.
@svrc
Copy link

svrc commented Jul 11, 2018

@andyliuliming Hi Andy, @cppforlife , @mxplusb , @dpb587 , @Amit-PivotalLabs and I spoke about this today, one question, for the gethostname() setting , the implication is that this sets the hostname in DHCP in azure and makes the worker node hostname resolvable in azure DNS?

So, yes, we are sending very old config syntax dating back to Lucid Ubuntu in dhclient.conf here with the <hostname> syntax. In Trusty and onwards, the DHCP never is actually given the hostname, which looks like a bug .... but this behaviour (wherein the hostname isn't actually sent as the syntax changed to gethostname()) is what's actually desired and assumed for most BOSH deploys.

We're thinking the hostname is typically the bosh agent ID (a GUID) which was never originally intended to be exposed via DNS. Dynamic hostname setting via DHCP seems like it can lead to a security problem where other VMs could request/overwrite existing hostnames. Do you know if Azure secures this? I'm vaguely remembering that Azure will register this under myhostname.<azure supplied unique VM ID>.internal.cloudapp.net but wanted to get your confirmation that this sort of hostname can't be hijacked by other hosts.

We'd also like to testing the bosh agent behavior on a Xenial stemcell, if it's still using the old Lucid-era Ubuntu syntax for dhclient.conf / if this fix applies to that as well.

One other idea was to try to set the CFCR azure kubelet --hostname-override to a BOSH DNS name for the worker node via spec.address . This way we can get the predictable hostname. If I recall I had problems with this because the K8s Azure cloud provider has no way of being forced to recognize the BOSH DNS name as the node ID.

@andyliuliming
Copy link
Contributor Author

@svrc-pivotal Hi Stuart,
yes, the azure distribute the search domain

cat /etc/resolv.conf
nameserver 168.63.129.16
search u32u0ul0g4pu5mcsvblskhrmgg.hx.internal.cloudapp.net
The search domain are different in the different virtual networks. And it's protected by the azure network infrastructure.

I tested the change on the trusty and the xenial stemcells on azure, with the patch, the cfcr deployment would be successful, and all the conformance tests passed.

And yes, use the spec ip as the hostname-override does not work.

@andyliuliming
Copy link
Contributor Author

tracker story here

@andyliuliming
Copy link
Contributor Author

@mfine30 @jfmyers9 ping :)

@jrussett jrussett changed the base branch from master to develop July 26, 2018 20:43
@jrussett
Copy link
Contributor

Hey @andyliuliming, we brought these changes in, compiled a new agent and xenial stemcell and tried a few simple deployments on GCP. Unfortunately, we weren't able to successfully deploy, and instead saw that the instances would time out during deployment. During further investigation, we were not even able to SSH onto the machines and investigate logs or the state of the system.

We are wondering, what steps did you take to test this on Trusty/Xenial on Azure?

Thanks,
@jrussett and @cdutra

@andyliuliming
Copy link
Contributor Author

@jrussett
thanks for the efforts to test it on the gcp.
on Azure I use https://github.com/andyliuliming/bosh-linux-stemcell-builder/commits/dhcp_client_fix
to test the CFCR deployment. the cfcr deployment manifests are here: https://github.com/virtualcloudfoundry/cf-azure-deployment/tree/master/cfcr
and it works.
since the deployment on GCP failed. I will pull the freshest commits and do the test on Azure again, will also add the bosh deployment and cf deployment in.

@andyliuliming
Copy link
Contributor Author

andyliuliming commented Jul 28, 2018

@andyliuliming
Copy link
Contributor Author

rethinking about the behaviour change, to save the impact to other cloud providers, will change to add one option to send the dhcp with default off. will update the PR later. thanks.

@andyliuliming
Copy link
Contributor Author

andyliuliming commented Jul 28, 2018

@jrussett @cdutra thanks for your patience.
I resent the pr, and add one option whether we should send the real hostname in dhcp request.
so other platform are not impacted at all, will turn on this on the Azure specific stemcell.

stemcell's PR is here: cloudfoundry/bosh-linux-stemcell-builder#58

@luan
Copy link
Contributor

luan commented Jul 31, 2018

Hi @andyliuliming,

After reviewing your updated PR, we don't think that making this work only for
Azure is what we want to do. So we ended up trying this again on Google Cloud
(with the hostname sending enabled) and it actually ended up working out of the
box.

We're not sure if our initial test had some other issue or if the original PR
had a different problem, but it seems like simply having gethostname() in the
dhclient.conf is functional on GCP as well.

Could you please re-submit your original change (where we unconditionally have
gethostname() in the config) for us to try and merge that? If that ends up not
working we can work together to investigate what we'd have to do to make it work
across all infrastructure.

Thanks,

@luan and @jaresty, CF BOSH Team

@andyliuliming
Copy link
Contributor Author

@luan done. re-submit the original change :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants