Configurable network name and address range#4711
Conversation
0aa18b9 to
3667aed
Compare
|
Dang, looks like this didn't fix the network issue: Shelving this for now, will get back to it sometime. |
5fbbd2d to
138f7fa
Compare
|
@sipsma (Low priority:) this seems ready to go whenever you have a moment to review. Ran the engine tests 5 times with services re-enabled for the remote cache test, no failures. Previously it would fail somewhat reliably (~> 50%). |
13d119e to
5550090
Compare
Signed-off-by: Alex Suraci <alex@dagger.io>
note: leaving services network disabled since the actual issue still isn't fixed, but this is likely needed regardless Signed-off-by: Alex Suraci <alex@dagger.io>
Signed-off-by: Alex Suraci <alex@dagger.io>
Signed-off-by: Alex Suraci <alex@dagger.io>
Signed-off-by: Alex Suraci <alex@dagger.io>
should prevent "iptables: Resource temporarily unavailable" all of these iptables calls used --wait, but it wasn't working because they all had their own lock file. Signed-off-by: Alex Suraci <alex@dagger.io>
still stuck on Docker 20.10.23 with Nixpkgs/NixOS. :( tried upgrading locally but spooky things happened instead. Signed-off-by: Alex Suraci <alex@dagger.io>
5550090 to
7a7d48c
Compare
There was a problem hiding this comment.
nit: In most circumstances I don't believe you need to unmount in order to bind mount over something, you can just bind mount right over it and it will "stack". If we can avoid a umount call here I think that's preferable all else being equal because there's some just generally confusing+convoluted stuff around whether umount can block (and various flags like DETACH that may help but also add more complication).
There was a problem hiding this comment.
Good point - I had it this way because my original approach involved unmounting and writing to the file directly, but since we're mounting over it instead we can just let it stack. 👍
There was a problem hiding this comment.
nit: for defense in depth may as well set this to have 0600 perms and/or mount the final file with MS_RDONLY
| if err != nil { | ||
| return err | ||
| } | ||
| args := []string{"-i", iface, "-p", "udp", "-m", "udp", "--dport", "53", "-j", "ACCEPT"} |
There was a problem hiding this comment.
I'm curious why this is needed, is it just enforcing that dns packets have to be allowed on this interface? I would have assumed that was already the case pretty much all the time since blocking them would break a lot, but I'm probably missing something.
There was a problem hiding this comment.
Honestly not even sure myself. It was there in the original dnsname plugin code, confusingly tied to the dnsmasq.conf setup, so I kept it: https://github.com/dagger/dagger/pull/4711/files#diff-75e7590bb9ac9d43bb60200ec283d75ac9d28e5ade434a38ea8a6342b8f6956cL60
I can try removing it and see if anything breaks. 🤷♂️ Maybe the dnsname plugin had to run in certain places that needed this? It's been there since the original commit so there's no insightful 'blame' for it.
There was a problem hiding this comment.
(Also I totally had ChatGPT explain what this command did a while back, which it did pretty well!)
There was a problem hiding this comment.
I can try removing it and see if anything breaks. 🤷♂️ Maybe the dnsname plugin had to run in certain places that needed this? It's been there since the original commit so there's no insightful 'blame' for it.
Cool yeah may as well cut the cargo culting cord at this point I suppose. The less manual fiddling with iptables the better probably
| pidFile, err := xdg.RuntimeFile("dagger/net/" + name + "/dnsmasq.pid") | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
|
|
||
| hostsFile, err := xdg.RuntimeFile("dagger/net/" + name + "/hosts") | ||
| if err != nil { | ||
| return nil, err | ||
| } |
There was a problem hiding this comment.
nit: can dedupe the definitions of these paths InstallDnsmasq
There was a problem hiding this comment.
Done - refactored all of these paths into a separate file with docs for each.
| } | ||
| hostname := hostHash(digest) | ||
| payload.Hostname = fmt.Sprintf("%s.%s", hostname, network.DNSDomain) | ||
| payload.Hostname = hostname |
There was a problem hiding this comment.
Double checking my own understanding, the hostname no longer has .dagger.local, but when a process tries to resolve the hostname it automatically appends .dagger.local (or whatever setting the user has overridden) to the dns request because .dagger.local is marked in the resolv.conf as being the local domain? Or something like that?
There was a problem hiding this comment.
We actually just don't need to use the domain at all now; it's not even configured in the /etc/resolv.conf we set up. Now that everything is routed directly to dnsmasq with no other servers listed in resolv.conf there should never be any ambiguity.
There might be a small benefit in including it and installing our domain into the search list, because it would allow us to skip the domain searching routine if the user's original /etc/resolv.conf has them configured (which we preserve), but even that benefit only applies as long as they don't also have options ndots:N set (k8s defaults to ndots:5). So it's a bit of a wash.
The benefit of not including it, on the other hand, is that we can keep core/ decoupled from the engine configuration, which definitely feels a lot nicer.
There was a problem hiding this comment.
Oh okay interesting, I got confused when I saw we still set something in the dnsmasq template related to the domain, but that all makes sense, thanks for the explanation!
Signed-off-by: Erik Sipsma <erik@dagger.io>
sipsma
left a comment
There was a problem hiding this comment.
Fixed the issue that was causing tests to fail: #4711 (comment)
All my other comments are nits and/or questions, so LGTM!
| return fmt.Errorf("replace nameservers: %w", err) | ||
| } | ||
|
|
||
| upstreamResolv, err := touchXDGFile("dagger/net/" + name + "/resolv.conf.upstream") |
There was a problem hiding this comment.
Just realized AppArmor isn't going to like this since now we'll have dynamic resolv.conf paths.
I guess I could move these to /etc/dnsmasq.d/? https://gitlab.com/apparmor/apparmor/-/blob/master/profiles/apparmor.d/usr.sbin.dnsmasq#L43
There was a problem hiding this comment.
Went with paths specifically designated for the dnsname plugin that this PR vendors and cannibalizes. Hopefully they don't remove them, since dnsname is technically unmaintained now. All the more reason to have some sort of smoke test for this. 😬
don't actually need to umount, so might as well skip it Signed-off-by: Alex Suraci <alex@dagger.io>
is this load-bearing? only one way to find out! Signed-off-by: Alex Suraci <alex@dagger.io>
Signed-off-by: Alex Suraci <alex@dagger.io>
Signed-off-by: Alex Suraci <alex@dagger.io>
Follow-up to #4697
This might be a bit overkill, but it felt like a useful exercise just to get us away from global assumptions around the network stack.
--network-name(defaultdagger) and--network-cidr(default10.87.0.0/16).dnsmasqon boot, rather than managing its state in thednsnameCNI plugin and leaving a window of time where/etc/resolv.confpoints to an IP withoutdnsmasqlistening yet./etc/resolv.confby mounting over it, which should allow for interesting things like running a Dagger engine (or multiple?) on the same Linux host but wrapped inunshare -mso it doesn't mess with everything else's DNS. (Might try this on my Linux machine for testing purposes, just for fun.)dagger.local). I'm 95% sure we don't actually need it, especially considering some environments (k8s) configurendots:5, and it introduces a very tight coupling between the schema and the engine.