Offer Version of Boxes without Hardcoded DNS? #35
@lberk I'd be happy to merge in improvements. My philosophy is that the default values I specify in the various scripts should be used in the absence of a user defined value. The trick will be figuring out a mechanism which makes it easy to override those values, and then ensuring the box config defers appropriately. Ideally someone can override the values using the Vagrantfile, and we add documentation to the roboxes.org website which shows people how to override features. Like I said, I haven't been able to test every combination, so I admittedly deviate from this ideal.

As for the DNS servers specifically, I'm on the fence about whether DHCP values qualify as "user provided input" ... since in many cases DHCP values are outside the user's control. As I mentioned, I went the current route because I needed to ensure the … That said, I could loosen the rules for the …

Thoughts?
I keep coming back to the idea that this should be configurable via the Vagrantfile, using something like: https://www.vagrantup.com/docs/networking/public_network.html#dhcp

If it isn't, then perhaps the solution is to mimic the feature using an inline provisioning script, via the embedded Vagrantfile? The script could use environment variables, or Vagrantfile values, to configure things like the DNS servers during the box provisioning stage... If someone disables the provisioning step, or doesn't provide values, then the defaults would be applied...

On my long term to do list is writing a generic … I fear people don't realize that, by convention, these credentials have known defaults, and might be leaving them intact. Which could be a problem if a box happens to get provisioned with an IP that isn't NAT'ed.

Long story short, if we went this route, then other things could also be easily configured, like the DNS servers. I'd welcome a PR along these lines... thoughts?
I should add, IPv6 enable/disable should be easy to control via the same mechanism.
Thanks for your input. I think I agree in general with your approach and the underlying principles. But I keep hitting a roadblock (this is probably a case of my ruby/vagrant/networking knowledge being slightly too basic for this). I've taken a bit of time to try and add the option in the underlying robox/tpl/$provider-$distro.rb vagrant template as you've suggested, something similar to:
That way, if the environment variable isn't declared, we default to your known-sane value. If it is declared, we change to whatever is specified (it'd be worthwhile for my use-case to perhaps have a 'none' special case where no extra sed line happens). Is that what you were envisioning?

The issue I'm running into with this approach (and again, if you have any pointers, I'd appreciate it and am willing to try incorporating them!) is that this kind of inline provisioning script seems to be run after the box is provisioned with default values. This means, when I run … This is why I come back to having the 'generic' namespaces as slightly more permissive for the networking; just so I can stand the box up and provision it at all.

That being said, I'm aware we've got a huge matrix of potential config options people will be wanting to tweak, and it's a balancing act trying to make these options configurable, yet testable enough to ensure functionality. Thoughts?
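A minimal sketch of the fallback idea described above, in shell rather than the actual tpl/ Ruby template. The NS1/NS2 variable names and the 4.2.2.x defaults come from later in this thread; the scratch-file target is an assumption for illustration, since a real provisioner would edit the distro's actual resolver config.

```shell
# Pick DNS servers from environment variables, falling back to the
# box defaults (the Level 3 resolvers the boxes currently hardcode)
# when a variable is unset.
NS1="${NS1:-4.2.2.1}"
NS2="${NS2:-4.2.2.2}"

# Write the chosen servers into a resolv.conf-style scratch file.
# A real tpl/ provisioning step would target /etc/resolv.conf or the
# distro's equivalent instead of a temp file.
conf="$(mktemp)"
{
  printf 'nameserver %s\n' "$NS1"
  printf 'nameserver %s\n' "$NS2"
} > "$conf"
cat "$conf"
```

The `${VAR:-default}` expansion is what gives "use my value if I set one, otherwise keep the known-sane default" without any extra branching.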
Your script looks right. I'd go with environment variables of NS1, NS2, and NS3. If any are defined, they're configured. Presumably we'll need to adapt the basic strategy to the different distros, and having the values separate should make that easier.

As for your issue: which hypervisor are you using? Is it libvirt? I believe all of the boxes are configured, by default, to bring up a preconfigured NIC that uses DHCP. Based on the error, it feels like the box is stuck waiting for a DHCP address? In the world of libvirt, a domain is synonymous with a guest virtual machine.
@lberk can you do a little test for me. When you see the error, can you still connect to the guest console, and login using vagrant:vagrant? Once you do that, you can run a few tests. The …

Those commands should be installed, but I haven't verified it with every variant, so if they aren't, let me know and I'll add the required package to the baseline configuration for that distro.
@ladar oops, I reported the wrong error message (wrong box). For example, on generic/ubuntu1804:
I can change the ruby script to use those env var names instead. Is there a better way to include them than by adding them directly to tpl/$provider-$distro.rb? I'd prefer to have them organised similar to the scripts/ directory, rather than in the giant, monolithic base Vagrantfiles.
What you posted makes very little sense to me. All I can think of is that perhaps the additional NIC (aka eth1) is getting created by Vagrant (or some other rule), and that is what is causing problems with your DNS resolution. Specifically, your post shows you can access the box, and the box can access NS1 (aka 4.2.2.1), but that it can't resolve the DNS name …

and

You should see something like the following:
@lberk in reply to your comment... if we're speaking ideally, then users should be able to set a value like:

or

or

for no DNS servers. Of course the absence of a value, or setting:

would use the existing values. I just don't know if a provisioning script inside the bundled Vagrantfile could read/react using those settings. Ideally we should open up a ticket with Vagrant, and get them to add support for that. Short term we might be able to fake support for something like this by using an inline script like you proposed, inside the bundled Vagrantfile.

The scripts/ files get executed at build time, which wouldn't help you, right? What you want is to be able to download the box, and have it use custom NS values during the provisioning step, right? The only way to do that is by adding scripts to the tpl/ files. And the hard part, once we get it working, will be creating variants for the 5-8 different DNS config schemes used by the various distros.
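To make the "5-8 different DNS config schemes" point concrete, a dispatch like the following is roughly what a tpl/ script would need. The distro names and groupings here are illustrative assumptions, not the actual robox template logic.

```shell
# Map a distro identifier to the resolver-config scheme a provisioning
# script would have to edit. DISTRO is a hypothetical input for this
# sketch; the real templates are generated per distro.
distro="${DISTRO:-ubuntu1804}"

case "$distro" in
  ubuntu18*|ubuntu2*)    scheme="systemd-resolved"  ;;  # /etc/systemd/resolved.conf
  ubuntu16*|debian*)     scheme="ifupdown"          ;;  # /etc/network/interfaces
  rhel*|centos*|fedora*) scheme="network-scripts"   ;;  # ifcfg-* files
  alpine*|freebsd*)      scheme="resolv.conf"       ;;  # plain /etc/resolv.conf
  *)                     scheme="unknown"           ;;
esac
echo "$scheme"
```

Keeping the NS values in separate variables (as suggested above) means only the final "write the config" step differs per scheme.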
Would be happy to try and paste the output of any other commands (is there a better place to go over this stuff than GitHub?)
@lberk just to confirm: the box is provisioning properly, you're just concerned that DNS doesn't work properly inside the guest, correct? The …

The source of your troubles appears to be the …
Will confirm IP connectivity, and then try using that name server to resolve the query. Let's also confirm that the DHCP provided name servers are both correct:
and...
My guess is based on the idea that usually resolvers only search the first 3 nameservers, and ignore the rest. If that's the case, then this should be an easy fix. |
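The "first 3 nameservers" theory above can be checked directly: glibc's resolver only consults the first MAXNS (3) `nameserver` entries in resolv.conf and silently ignores the rest. The sample file below is an illustrative assumption; on a guest you would inspect /etc/resolv.conf itself.

```shell
# Build a sample resolv.conf with three hardcoded resolvers followed by
# a DHCP-provided one, then show why the fourth entry never gets used.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
nameserver 4.2.2.1
nameserver 4.2.2.2
nameserver 208.67.220.220
nameserver 192.168.121.1
EOF

total="$(grep -c '^nameserver' "$conf")"
used="$(grep '^nameserver' "$conf" | head -n 3 | wc -l)"
echo "$total entries listed, but glibc only consults the first $used"
```

If the DHCP-provided server lands in position four, as sketched here, it is effectively invisible to the resolver.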
@ladar ah awesome, thanks!
If this is something I can fix solely on our end, I would be happy to implement it if you have any suggestions.
I was correct. Your network is blocking access to outside nameservers. Because the first three entries in the list are outside your network, and thus blocked, DNS is failing. You either need to unblock access to outside DNS servers (UDP port 53), or update the guest so it doesn't use pre-configured values. I can't easily test a fix where I am, but if I were to guess, something like the following might work (for Ubuntu 18.04):
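A sketch of what such a fix might look like: drop the hardcoded global DNS entries from systemd-resolved's config so the DHCP-provided values take over. The starting values are assumptions, and the edit runs against a scratch copy for illustration; on the guest it would target /etc/systemd/resolved.conf and be followed by `systemctl restart systemd-resolved`.

```shell
# Scratch copy standing in for /etc/systemd/resolved.conf, seeded with
# hardcoded global resolvers like the ones the boxes ship with.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[Resolve]
DNS=4.2.2.1 4.2.2.2 208.67.220.220
FallbackDNS=208.67.220.220
EOF

# Blank out the global DNS= and FallbackDNS= lines so systemd-resolved
# falls back to the per-link (DHCP-provided) nameservers.
sed -i -e 's/^DNS=.*/DNS=/' -e 's/^FallbackDNS=.*/FallbackDNS=/' "$conf"
cat "$conf"
```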
As discussed above, it would be nice if there was a standardized way of setting nameservers via the Vagrantfile, but that will take some work. |
I think that would be a great idea. I personally take advantage of the generic boxes for nearly all of my Evilpot / Honeypot fork.
@lberk did adding those commands to your Vagrantfile solve your problem? |
@ladar Unfortunately not; the output of the commands in #35 (comment) was the exact same as well. Any chance there should be some changes in resolved as well?
@lberk the following seemed to work for me:
@ladar unfortunately that still doesn't seem to work for me (either running as a provisioning script to the box, or running by hand). The ping/nslookup command sequence result was the exact same as before as well. |
@ladar fwiw, this is partly why I was originally motivated to create our own boxes without the dns/network modifications. I'm still open to working on this particular issue, but could we also approach changing how we modify the base Vagrant boxes in parallel, so that different/specific dns servers could be specified at runtime instead of build time?
@lberk yes... there are several ways of making this work, but if you have the time, then please submit a PR which does this. I'm not quite sure why you are having so much trouble getting the resolver set up properly... but the key piece of data I need isn't whether …

Try running the following, and if it doesn't work, send the output:
Those commands should remove the global resolvers, and switch to using the DHCP provided values, which is what I think you want. On my system the output is:
The syslog file only gets printed if …

Also, I noticed while investigating this further that when I switched to the DHCP provided nameserver, I had to also disable DNSSEC, or lookups would fail. Disabling DNSSEC wasn't part of my original script, and I suspect this could be the source of your problem. It seems that the default …

If those commands still don't work, it might help to look at your Vagrantfile. If it isn't sensitive, can you share it with me via email?
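The command sequence itself didn't survive this capture, but a sketch of the combination described here, clearing the global resolvers and disabling DNSSEC, might look like the following. It runs against a scratch copy with assumed starting values; on the guest it would edit /etc/systemd/resolved.conf and then restart systemd-resolved.

```shell
# Scratch stand-in for /etc/systemd/resolved.conf with the hardcoded
# global resolvers the boxes ship with.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[Resolve]
DNS=4.2.2.1 4.2.2.2 208.67.220.220
EOF

# Clear the global DNS list so the DHCP-provided nameservers are used,
# and disable DNSSEC validation, which the DHCP-provided server in this
# thread was failing.
sed -i 's/^DNS=.*/DNS=/' "$conf"
grep -q '^DNSSEC=' "$conf" || echo 'DNSSEC=no' >> "$conf"
cat "$conf"
```

Disabling DNSSEC is a workaround rather than a fix, which is why the question of a permanent solution comes up further down the thread.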
@lberk were you able to test my commands, and confirm that it was actually the DNSSEC requirement that was causing issues? |
@ladar thanks for the prod. That worked! I'd be happy to share the vagrantfile, I'm just not sure which email you'd like it forwarded to?
Actually, that's not needed, the vagrantfile is completely public: https://github.com/performancecopilot/pcp/blob/master/Vagrantfile
@lberk to summarize then, the issue had nothing to do with which resolvers were being used... but rather, when using the default …

Any suggestions for a permanent fix? I don't think disabling DNSSEC validation is a good choice... If I don't hear anything, I'll close this issue until someone decides to take up the challenge.
@ladar thanks for helping diagnose it. I'll have to dig into my libvirt setup a bit more then to see if there's a more surgical/local fix I can apply. Until then, I'll close the issue, thanks again for the help! |
You might want to dig into the man page for …

It isn't immediately clear whether …
I have the same issue. The network firewall doesn't allow connecting to external nameservers, and you have to go through the provided nameservers. My vagrant (with libvirt) provides the correct nameservers via DHCP, and with the generic image I expected it to honor my settings.

Another downside is that my local dnsmasq can resolve the hostnames for all the VMs I'm running. That means I can use DNS to set up connections to other VMs. Using external nameservers breaks that feature.

As a user of the generic image I expected a very vanilla Ubuntu. I'm currently using this because Ubuntu only offers Virtualbox images. Note I tested this with the 16.04 image. The workaround I use:

```shell
sed -i 's/allow-hotplug eth0/auto eth0/ ; /^dns-nameserver/d ; /^pre-up sleep 2$/d' /etc/network/interfaces && systemctl restart networking
```

Edit: in 18.04 I have no issue, because systemd-resolved detects it can't reach the external nameservers and uses the DHCP provided ones.
@ekohl the best solution to this problem is to ask Vagrant to add the ability to configure nameservers via the Vagrantfile, and then have the guest plugin make the configuration changes. In lieu of that, there really is no perfect answer. Ideally the default nameservers should get used unless the user overrides them. Whether the DHCP provided values constitute an override is debatable. The biggest issue is that the default values don't work, by default, on some systems/platforms. I think this is primarily a …

If you read the ticket above, the problem wasn't requests being blocked, it was the proxied requests not honoring the …

I'm open to suggestions for fixing this. I'm just not sure building a separate box for every distro without default DNS server values is viable. It simply takes too long to build all the images once (48+ hours).
P.S. @ekohl I find it curious that for you, 16.04 doesn't work, but 18.04 does. If you look at the original post, that was for an 18.04 configuration. Again, I think this speaks to this issue being platform dependent. |
IMHO default nameservers are what DHCP provides. If your company uses DNS nameservers that are only reachable internally then external nameservers can't resolve those either. It can also be a privacy concern.
In my experience libvirt just works by default if you use the DHCP provided values. Also, in my case it's not my host blocking it. From my host, without any virtualization, I can't do any DNS queries to port 53 either.
I guess this comes down to the choice you have to make: do you assume broken-by-default or working-by-default? The current model is broken-by-default, but I'd argue this punishes users who have a working environment. I can respect that you have a difficult decision to make, and you'll never please all users.
This can be explained. 16.04 uses resolv.conf. By default that only reads the first 3 nameserver entries.

Note that when using systemd-resolved it may be cleaner to use …
Privacy is one of the concerns. But that's why I picked the Level 3 resolvers, with OpenDNS as a backup, as opposed to whatever DNS servers might be handed out via DHCP at the coffee shop, hotel wifi, etc. that I might be at.
Again, it seems your environment is different from the original poster's, and both are different from mine.
The goal is most certainly to work by default. These images were all originally created for testing …
One of several ideas I experimented with when investigating the original report was whether providing just 2 DNS servers would fix this issue. It didn't quite work as expected, but that was with …
I noticed that option, but wasn't quite sure how it would work in practice, and haven't had the time to investigate/experiment. In my view, the BIGGEST problem is that every combination of distro/provider/platform has a different ideal solution. Which brings me back to trying to make things work out of the box with the default resolvers, and making it easy to explicitly set nameservers, regardless of distro/provider, via the Vagrantfile.
I can appreciate the lack of time and desire for a unified config to reduce the time spent. 👍 for solving this at the Vagrant layer rather than at the image layer. |
#11 (comment) |
Hi,
As mentioned in issue #11, having a hard-coded external dns breaks our internal dns resolver. In the meantime, I've taken a bit of time to modify your robox.sh script for our use, without the hard-coded dns.
Would this be something you'd be open to merge via PR? Now that the robox namespace seems to be preferred, maybe we could utilise the 'generic' namespace for the boxes without dns changes, and the 'robox' namespace for boxes with hard-coded dns?
Thoughts? Perhaps we could even add this as a robox.sh build-time option?