NixOS 19.09 upgrade may change network interface naming #71086

Closed
wlhlm opened this issue Oct 13, 2019 · 14 comments · Fixed by #71456

@wlhlm wlhlm commented Oct 13, 2019

Describe the bug
NixOS 19.09 updates systemd from version 239 to version 243, which comes with a changed network device naming algorithm. As a result, hosts can come up with different interface names after upgrading to NixOS 19.09. This in turn means that previously set up network configuration may no longer apply, and the host might lose network connectivity.

To Reproduce
Steps to reproduce the behavior:

  1. Upgrade NixOS to 19.09, for example using the procedure outlined in the manual.
  2. Reboot to apply the OS upgrade.
  3. Observe the machine exhibiting unexpected network behavior, for example missing network connectivity, unreachability from the outside, etc.

Of course, whether step 3 happens depends on whether the new interface naming algorithm generates a different name for the given hardware configuration.

Expected behavior
I don't have a problem with the interface names changing in itself; it should just be explicitly considered in the upgrade procedure to NixOS 19.09. I can think of two potential solutions:

  1. Explicitly mention in the release notes that the upgrade to systemd 243 may result in interface names changing and suggest a procedure for administrators to check interface names before rebooting after the upgrade.
  2. Change default systemd configuration to keep using old interface naming algorithm. See section Additional context for more info.

Additional context
Luckily, systemd makes changes to the interface naming algorithm explicit and keeps old versions around for backwards compatibility. Previous versions are documented in systemd.net-naming-scheme(7) and can be selected with the net.naming-scheme= kernel parameter.

Workaround
If you are affected by the changed network interface name and depend on network connectivity to access a machine (such as a remote server without out-of-band management, where the only fallback is a crappy rescue image), you can select the interface naming scheme on the kernel command line by setting net.naming-scheme= in /etc/nixos/configuration.nix:

boot.kernelParams = [ "net.naming-scheme=v239" ];

or by doing a quick'n'dirty edit to the grub config at /boot/grub/grub.cfg:

# ...
menuentry "NixOS - Default" {
  # ...
  linux ... net.naming-scheme=v239
  # ...
}
# ...

v239 switches to the version of the algorithm used by the systemd version included in NixOS 19.03.
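
After rebooting with the workaround in place, it can be useful to confirm that the parameter actually reached the kernel. The following is a minimal sketch, not part of the original report: scheme_from_cmdline is a hypothetical helper name, and it only assumes that the parameter is spelled net.naming-scheme= as in the examples above. If the parameter appears more than once, every value is printed and the sketch does not try to decide which one takes effect.

```shell
# scheme_from_cmdline (hypothetical helper): print the value of every
# net.naming-scheme= argument found in a kernel command line string passed as
# the first argument. Prints nothing if the parameter is absent.
scheme_from_cmdline() (
    set -f  # disable glob expansion while the string is split into words
    for arg in $1; do
        case $arg in
            net.naming-scheme=*)
                # strip the key and keep only the scheme name, e.g. v239
                printf '%s\n' "${arg#net.naming-scheme=}"
                ;;
        esac
    done
)
```

On a running system this could be called as scheme_from_cmdline "$(cat /proc/cmdline)". An empty result would suggest that the default scheme of the installed systemd is in use.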

Metadata

# nix run nixpkgs.nix-info -c nix-info -m
 - system: `"x86_64-linux"`
 - host os: `Linux 4.19.79, NixOS, 19.09.789.7952807791d (Loris)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3`
 - channels(wlhlm): `""`
 - channels(root): `"nixos-19.09.789.7952807791d, nixpkgs-19.03.173394.147bd882fc6"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

Maintainer information:

attribute: nixos.systemd
@wlhlm wlhlm added the 0.kind: bug label Oct 13, 2019

@wlhlm wlhlm commented Oct 13, 2019

In my particular case, the name for an Intel Ethernet controller changed from enp1s0 to eno0:
journal output before the upgrade:

[...]
Aug 29 19:02:46 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
Aug 29 19:02:46 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): registered PHC clock
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:22:4d:87:b0:ff
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 eth0: Intel(R) PRO/1000 Network Connection
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 eth0: MAC: 3, PHY: 8, PBA No: FFFFFF-0FF
[...]
Aug 29 19:02:46 kernel: e1000e 0000:01:00.0 enp1s0: renamed from eth0
[...]

journal output after the upgrade:

[...]
Oct 13 14:42:08 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
Oct 13 14:42:08 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
Oct 13 14:42:08 kernel: e1000e 0000:01:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Oct 13 14:42:08 kernel: e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): registered PHC clock
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:22:4d:87:b0:ff
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eth0: Intel(R) PRO/1000 Network Connection
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eth0: MAC: 3, PHY: 8, PBA No: FFFFFF-0FF
[...]
Oct 13 14:42:09 systemd-udevd[409]: Using default interface naming scheme 'v243'.
[...]
Oct 13 14:42:09 kernel: e1000e 0000:01:00.0 eno0: renamed from eth0
Oct 13 14:42:09 systemd-udevd[409]: eth0: Process '/nix/store/ily14d68xl11cnbbkf9svwnzwsrrnzah-bash-4.4-p23/bin/sh -c 'echo 2 > /proc/sys/net/ipv6/conf/eth0/use_tempaddr'' failed with exit code 1.
[...]

Kernel version was 4.19.79 in both cases.

@markuskowa markuskowa commented Oct 13, 2019

See also #71082

@vcunat vcunat commented Oct 13, 2019

Let's merge the discussion into a single thread.

@wlhlm wlhlm commented Oct 13, 2019

> See also #71082
>
> Let's merge the discussion into a single thread.

The mentioned issue seems to be based on predictable interface names being gone entirely, which is different from this issue about predictable names being, well... unpredictable.

I don't think this should be closed.

@vcunat vcunat commented Oct 13, 2019

Oh, right, that's weird [the other issue]. I'll reopen until it's clearer. EDIT: and you posted a much better description/analysis anyway :-)

@vcunat vcunat reopened this Oct 13, 2019

@wlhlm wlhlm commented Oct 13, 2019

Restating my problem:

What I'm describing isn't a bug in NixOS per se, just that NixOS 19.09 updated from systemd 239 to systemd 243 and from time to time, systemd updates bring slight changes to the way network interfaces are named. This may affect certain hardware configurations resulting in network interface names changing, though in the majority of cases nothing will happen.

What I'm proposing is that it should be made clear in the release notes that these changes might occur, and that the notes suggest a procedure for admins to check before rebooting and adjust their network configuration. This would be most important for servers (without out-of-band management), such as the system on which I first discovered this problem.

An alternative solution I propose is to explicitly pin the default NixOS configuration to the naming algorithm from version 239. However, all this does is delay the inevitable until systemd drops the legacy version of the algorithm, though it gives more time to find a solution for a smoother transition.

@vcunat vcunat commented Oct 13, 2019

I assume we'll go the way of adding a line into the release notes? /cc @disassembler, @lheckemann, @andir.

@wlhlm wlhlm commented Oct 13, 2019

Notes on checking whether an interface name will change:

The persistent interface names are documented in systemd.net-naming-scheme(7). You can use udevadm to see the variables from which udev draws to set a "persistent" name:

$ NET_NAMING_SCHEME=v239 udevadm test-builtin net_id /sys/class/net/$IFACE
$ NET_NAMING_SCHEME=v243 udevadm test-builtin net_id /sys/class/net/$IFACE

NET_NAMING_SCHEME can be set to any version of the naming algorithm listed in the manpage mentioned above. To see how udev uses these variables, we have to check /run/current-system/sw/lib/systemd/network/99-default.link:

#  ...
[Match]
OriginalName=*

[Link]
NamePolicy=keep kernel database onboard slot path
MACAddressPolicy=persistent

The important setting here is NamePolicy: onboard, slot, and path correspond to the udev variables ID_NET_NAME_ONBOARD, ID_NET_NAME_SLOT, and ID_NET_NAME_PATH, respectively. The first policy that yields a name is used.

To give an example, here is how the interface on my affected system got changed:

$ NET_NAMING_SCHEME=v239 udevadm test-builtin net_id /sys/class/net/enp1s0
Load module index
Parsed configuration file /nix/store/6snycpaz9zrs5m7xz6dixl1nl0ngdrma-systemd-243/lib/systemd/network/99-default.link
Created link configuration context.
Using interface naming scheme 'v239'.
ID_NET_NAMING_SCHEME=v239
ID_NET_NAME_MAC=enx00224d8741dd
ID_OUI_FROM_DATABASE=MITAC INTERNATIONAL CORP.
ID_NET_NAME_PATH=enp1s0
Unload module index
Unloaded link configuration context.
$ NET_NAMING_SCHEME=v243 udevadm test-builtin net_id /sys/class/net/enp1s0
Load module index
Parsed configuration file /nix/store/6snycpaz9zrs5m7xz6dixl1nl0ngdrma-systemd-243/lib/systemd/network/99-default.link
Created link configuration context.
Using interface naming scheme 'v243'.
ID_NET_NAMING_SCHEME=v243
ID_NET_NAME_MAC=enx00224d8741dd
ID_OUI_FROM_DATABASE=MITAC INTERNATIONAL CORP.
ID_NET_NAME_ONBOARD=eno0
ID_NET_NAME_PATH=enp1s0
Unload module index
Unloaded link configuration context.

You can see that ID_NET_NAME_PATH stayed the same, but v243 added ID_NET_NAME_ONBOARD. Looking at /run/current-system/sw/lib/systemd/network/99-default.link:

NamePolicy=keep kernel database onboard slot path

we can see that onboard is listed before path, meaning ID_NET_NAME_ONBOARD is chosen before ID_NET_NAME_PATH.
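
The check described above can be scripted. What follows is only a rough sketch under a few assumptions: pick_name and compare_schemes are hypothetical helper names, only the onboard, slot and path policies are modelled (keep, kernel and database are ignored), and udevadm must honour NET_NAMING_SCHEME as in the commands shown earlier.

```shell
# pick_name (hypothetical helper): read the output of
# `udevadm test-builtin net_id` on stdin and print the first name found in the
# order onboard, slot, path, mirroring the tail of the NamePolicy line.
# Returns 1 and prints nothing if none of the three variables is present.
pick_name() {
    input=$(cat)
    for var in ID_NET_NAME_ONBOARD ID_NET_NAME_SLOT ID_NET_NAME_PATH; do
        # print the value of the first line of the form VAR=value, if any
        name=$(printf '%s\n' "$input" | sed -n "s/^${var}=//p" | head -n 1)
        if [ -n "$name" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# compare_schemes (hypothetical helper): for one interface, print the name
# predicted under two naming schemes (defaults: v239 and v243).
compare_schemes() {
    iface=$1
    old=${2:-v239}
    new=${3:-v243}
    for scheme in "$old" "$new"; do
        # stderr is merged in because pick_name only looks at ID_NET_NAME_* lines
        predicted=$(NET_NAMING_SCHEME=$scheme udevadm test-builtin net_id \
            "/sys/class/net/$iface" 2>&1 | pick_name)
        printf '%s %s %s\n' "$iface" "$scheme" "${predicted:-<none>}"
    done
}
```

Running compare_schemes enp1s0 on the affected system would be expected to report enp1s0 for v239 and eno0 for v243, in line with the udevadm output above. Two different names for the same interface are the signal to adjust the network configuration before rebooting.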

@wlhlm wlhlm commented Oct 13, 2019

The problem with the procedure outlined above is that it only works since systemd 240, meaning you'd first have to upgrade systemd from 239 in order to see whether interface names change. That in turn means you'd have to upgrade to NixOS 19.09 using nixos-rebuild switch, which is not the best idea for distro upgrades (@vcunat agrees). I'm not sure what the solution here is.

@vcunat vcunat commented Oct 14, 2019

Well, if we really cared about it, we could theoretically have one release (19.09) that forces the previous naming by default. The main problem I see with that: it's relatively late, so I'm afraid changing back may also cause issues for some people. A compromise approach could be to make it easily configurable and suggest setting the older scheme manually, even if just for one boot to do this procedure.

@michaelpj michaelpj commented Oct 14, 2019

If they keep the old scheme around indefinitely, can't we just set the old one conditional on stateVersion?

@flokli flokli commented Oct 14, 2019

@michaelpj People already might have switched their system and dealt with the changes, so applying any changes to 19.09 in that regard will change their system behaviour again.

I think the proper way to address this is to mention it more prominently in the 19.09 release notes, and to make sure major systemd changes are mentioned. I'm not talking about copying in all of their release notes, but about adding pointers to potentially more invasive changes.

@vcunat vcunat commented Oct 20, 2019

What about this formulation? #71456

worldofpeace added a commit to vcunat/nixpkgs that referenced this issue Oct 21, 2019
worldofpeace added a commit to vcunat/nixpkgs that referenced this issue Oct 21, 2019
worldofpeace added a commit that referenced this issue Oct 21, 2019

@wlhlm wlhlm commented Oct 21, 2019

> What about this formulation? #71456

I'm fine with that. Thank you.

peti added a commit that referenced this issue Oct 21, 2019