Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[20.03] nixos/network-interfaces: Assert that bridges can get an address via DHCP #84221

Merged
merged 1 commit into from Apr 10, 2020

Conversation

erictapen
Copy link
Member

@erictapen erictapen commented Apr 4, 2020

Motivation for this change

#82295 already warns users of 20.03 in the release notes about how dhcpcd changed its implicit behaviour regarding bridge interfaces. This PR proposes an assertion that blocks users from running into some unexpected network change.

This change is primarily thought to be applied before the 20.03 release, as then probably quite a lot of people will run into this problem. This is why I target the release-20.03 branch here. When merged, this should be still cherry-picked on master.

To illustrate the problem once again: This configuration will result in a bridge without an IPv4 address:

{ config, pkgs, ... }:{
  networking = {
    bridges.br0 = {
      interfaces = [ "enp2s0" "enp3s0" ];
    };
    interfaces = {
      br0.useDHCP = true;
      enp2s0.useDHCP = false;
      enp3s0.useDHCP = false;
    };
  };
}

While this one works as expected:

{ config, pkgs, ... }:{
  networking = {
    useDHCP = false;
    bridges.br0 = {
      interfaces = [ "enp2s0" "enp3s0" ];
    };
    interfaces = {
      br0.useDHCP = true;
      enp2s0.useDHCP = false;
      enp3s0.useDHCP = false;
    };
  };
}

This is because networking.useDHCP = false causes an explicit listing of enabled interfaces in the dhcpcd config, while networking.useDHCP = true made dhcpcd implicitly discover the interfaces (which fails now).

Things done

Throw an error if networking.useDHCP is enabled and a bridge interface is defined.

  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@lheckemann
Copy link
Member

+1 for more deprecation on networking.useDHCP (cc @globin) :)

@worldofpeace worldofpeace added this to the 20.03 milestone Apr 8, 2020
@worldofpeace worldofpeace self-assigned this Apr 8, 2020
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/go-no-go-meeting-nixos-20-03-markhor/6495/19

@worldofpeace
Copy link
Contributor

@erictapen Thanks for illustrating how to test this in the OP.

Copy link
Contributor

@worldofpeace worldofpeace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I built the config that was supposed to fail in the OP:

networking = {
    bridges.br0 = {
      interfaces = [ "enp2s0" "enp3s0" ];
    };
    interfaces = {
      br0.useDHCP = true;
      enp2s0.useDHCP = false;
      enp3s0.useDHCP = false;
    };
  };

and got this assertion error:

error: 
Failed assertions:
- 	  br0 is a bridge, while `networking.useDHCP` is enabled.
	  Dhcpcd doesn't give IPv4 addresses to bridges by default anymore, so you have
	  to set `networking.useDHCP = false` and then whitelist every
	  interface you need DHCP on with `networking.interfaces.<name?>.useDHCP = true`.
	
(use '--show-trace' to show detailed location information)

I built the config that should fail and it is currently building 👍

@worldofpeace
Copy link
Contributor

cc @samueldr maybe

@lheckemann
Copy link
Member

Should the assertion maybe be cfg.useDHCP -> cfg.bridges == []? Since a bridge may be created in networking.bridges without being configured in networking.interfaces.

@erictapen
Copy link
Member Author

@lheckemann Yeah I think that makes sense. I rewrote the code a few times and missed this case when I didn't fully understood the problem.

I just changed the assertion to cfg.useDHCP -> cfg.bridges == {} and changed the error message, so the user will see a list of all bridges that are configured:

error:                   
Failed assertions:                                                
-             There are the bridges [ br0 br1 ] configured while `networking.useDHCP` is enabled.
            dhcpcd doesn't give IPv4 addresses to bridges by default anymore,
            so you have to set `networking.useDHCP = false` and then whitelist
            every interface you need DHCP on with
            `networking.interfaces.<name?>.useDHCP = true`.
                                                                                                                                     
(use '--show-trace' to show detailed location information)

@worldofpeace
Copy link
Contributor

@erictapen It looks like the indentation is a little skewed. Could you fix that?

Assert that the user doesn't have a bridge configured while
networking.useDHCP is true. Due to new behaviour of dhcpcd [0], this
would result in the bridge not getting an address via DHCP, regardless
of wether it has networking.interfaces.<name?>.useDHCP set or not.

[0] https://roy.marples.name/archives/dhcpcd-discuss/0002621.html
@erictapen
Copy link
Member Author

@worldofpeace Right, I really need to get my tabs under control… It now looks like this:

error: 
Failed assertions:
- There are the bridges [ br0 ] configured while `networking.useDHCP` is enabled.
dhcpcd doesn't give IPv4 addresses to bridges by default anymore,
so you have to set `networking.useDHCP = false` and then whitelist
every interface you need DHCP on with
`networking.interfaces.<name?>.useDHCP = true`.

(use '--show-trace' to show detailed location information)

@worldofpeace worldofpeace merged commit ab018f7 into NixOS:release-20.03 Apr 10, 2020
@erictapen
Copy link
Member Author

Yay! I'm very happy this went through. Thanks for all the reviews!

@samueldr
Copy link
Member

samueldr commented Jul 2, 2020

(Extremely late)

I am sure the change is required, and good in the end, but it was extremely confusing to me. I'm not a networking person. I hate networking, the further away I can be from networking the better I am.

So to my surprise I have this thick assertion failing and telling me I should change a value to something I haven't set, and pick my interfaces where I want this to happen manually?

The error message was far from being easy to action upon, and I had to find the commit, read the PR to get a better understanding of what to do.

This scares me a bit about being locked out of my machine. The default, setting DHCP for all interfaces, is much more comforting for someone that does not grok at all networking options. Considering the mentions of "deprecating useDHCP", what does it mean?

Is there no way to get the default implicit behaviour of dhcpcd for interfaces, while requiring bridges and other "not-interfaces" to be configured?

@samueldr
Copy link
Member

samueldr commented Jul 2, 2020

Oh, it looks like I understood things wrong anyway.

I do not want useDHCP on my br0 interface I think, since my NixOS containers all have localAddress set. Or do I?

If I didn't that means that this assertion broke what was already working.


EDIT: and it looks like that, yes, this assertion is completely breaking my setup that was entirely working. DHCP was not involved with my bridge, and I'm fine with it not being involved with my bridge, but I can't simply say "DHCP all the things" and move along as I could before.

So I was pushed into enabling an option I didn't want by the confusing phrasing and context around the assertion.

@erictapen
Copy link
Member Author

erictapen commented Jul 2, 2020

When I wrote this code I didn't have in mind, that there may be bridges in networks without a DHCP server. So I agree the error message could have addressed that use case better.

Still I think this assertion makes things better.

  • You may not need DHCP on your bridge, but if there was a DHCP server running in that network, it would have leased an IP to your bridge pre-20.03 and would have stopped doing so post-20.03. It's impossible to tell wether you actually need DHCP at eval/buildtime. In my case where I needed DHCP, it locked me out of my machine.
  • networking.useDHCP is messy and should be further deprecated for edge cases like this. It relies on dhcpcds internal distinction between "interfaces" and "not-interfaces" and as we've seen this actually changes between versions.

DHCP was not involved with my bridge, and I'm fine with it not being involved with my bridge, but I can't simply say "DHCP all the things" and move along as I could before.

I agree that it would be nice to have a "DHCP all the things" NixOS option, but only if it would work on eval time. This is not possible atm, as we never know wether networking.interfaces contains all the interfaces on the running system. So we have to rely on a program that is out of our control and has to run on run time. Too unstable for DHCP imho.

Edit: @worldofpeace I just noticed that this commit wasn't forwardported to master. Could you do that? I guess improving the error message could be done in a separate PR, this one was already reviewed and approved.

@samueldr
Copy link
Member

samueldr commented Jul 2, 2020

@erictapen make the new PR, it should have gone through master anyway, even if the intent was to make sure it was going in 20.03 in time for the release.

@lheckemann
Copy link
Member

If we ever get anywhere with networkd-as-default, maybe it would make sense to ship a network definition that matches on en* and wl* and enables DHCP on all of those to function as a "DHCP all the things" and a reasonable default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants