Ubuntu 23.10 (Mantic) minimized/minimal cloud images do not receive IP address during provisioning in certain environments #4451

Closed
philroche opened this issue Sep 21, 2023 · 10 comments
Labels
bug: downstream This issue was filed against the upstream bug tracker but is a downstream issue.

Comments

@philroche
Contributor

Bug report

This is a cloud-init specific bug from the cloud-images/kernel bug @ https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036968

Steps to reproduce the problem

Following a recent change from the linux-kvm kernel to the linux-generic kernel in the mantic minimized images, there is a reproducible bug where a guest VM does not have an IP address assigned as part of cloud-init provisioning.

This is easiest to reproduce when emulating arm64 on an amd64 host. The bug is a race condition, so there could exist fast enough virtualisation on fast enough hardware where this bug is not present, but in all my testing I have been able to reproduce it.

The latest mantic minimized images from http://cloud-images.ubuntu.com/minimal/daily/mantic/ force an initrd-less boot and have no initrd to fall back to.

This bug is not present in the non-minimized/base images @ http://cloud-images.ubuntu.com/mantic/ as these boot with an initrd that contains the required drivers for virtio-net.

Reproducer

wget -O "launch-qcow2-image-qemu-arm64.sh" https://people.canonical.com/~philroche/20230921-cloud-images-mantic-fail-to-provision/launch-qcow2-image-qemu-arm64.sh

chmod +x ./launch-qcow2-image-qemu-arm64.sh
wget https://people.canonical.com/~philroche/20230921-cloud-images-mantic-fail-to-provision/livecd.ubuntu-cpc.img
./launch-qcow2-image-qemu-arm64.sh --password passw0rd --image ./livecd.ubuntu-cpc.img

You will then be able to log in with user ubuntu and password passw0rd.

You can run ip a and see that there is a network interface present (separate to lo) but no IP address has been assigned.

ubuntu@cloudimg:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff

This happens because cloud-init finds no network interfaces when it tries to configure them, so it configures none. By the time boot is complete the network interface is present, but cloud-init provisioning has already finished.
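The check that loses this race can be pictured with a minimal Python sketch. It is not cloud-init's actual code: the function name `non_loopback_interfaces` and the idea of passing the sysfs path as a parameter are assumptions made here for illustration. It lists whatever is visible under /sys/class/net at the moment it runs and ignores lo, which is why an early call sees nothing and a later call sees enp0s1.

```python
import os
import tempfile


def non_loopback_interfaces(sys_class_net="/sys/class/net"):
    """Return the sorted entry names under sys_class_net, excluding lo.

    Hypothetical helper: a real implementation would also need to skip
    non-interface entries that can appear in this directory.
    """
    try:
        names = os.listdir(sys_class_net)
    except FileNotFoundError:
        return []
    return sorted(name for name in names if name != "lo")


# Simulate the two points in time with a throwaway directory tree.
fake_sysfs = tempfile.mkdtemp()
os.mkdir(os.path.join(fake_sysfs, "lo"))
early = non_loopback_interfaces(fake_sysfs)   # driver not loaded yet
os.mkdir(os.path.join(fake_sysfs, "enp0s1"))
late = non_loopback_interfaces(fake_sysfs)    # driver loaded after the check
print("early:", early, "late:", late)
```

Calling `non_loopback_interfaces()` with no argument on a booted guest reads the real /sys/class/net.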

You can verify this by running sudo cloud-init clean && sudo cloud-init init

You can then see a successfully configured network interface

ubuntu@cloudimg:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 metric 100 brd 10.0.2.255 scope global dynamic enp0s1
       valid_lft 86391sec preferred_lft 86391sec
    inet6 fec0::5054:ff:fe12:3456/64 scope site dynamic mngtmpaddr noprefixroute
       valid_lft 86393sec preferred_lft 14393sec
    inet6 fe80::5054:ff:fe12:3456/64 scope link
       valid_lft forever preferred_lft forever

Work is ongoing to include the virtio-net driver as a built-in in the mantic generic kernel, which will solve the problem for use cases that rely on virtio-net. The question of how cloud-init should handle this race still remains.

There are no plans to include the e1000 driver, so this bug is still reproducible with that driver; use the launch-qcow2-image-qemu-arm64-e1000.sh reproducer.

The bug is also reproducible with an amd64 guest on an amd64 host on older/slower hardware.

Environment details

  • Cloud-init version: 23.3.1-0ubuntu1
  • Operating System Distribution: Ubuntu 23.10
  • Cloud provider, platform or installer type: Qemu

cloud-init logs

Attached
minimal-arm64-cloud-init.tar.gz

@philroche added the bug and new labels Sep 21, 2023
@blackboxsw added the priority label and removed the new label Sep 21, 2023
@blackboxsw self-assigned this Sep 21, 2023
@blackboxsw
Collaborator

Thanks @philroche for the bug here. I can reproduce this problem with the test procedure you provided and confirm the bug.

This race in network setup represents a general problem for two scenarios:

  1. Kernels booting without an initramfs that do not have the built-in modules necessary for the hardware on which they are booting (your e1000 example here, where the qemu launch provides an e1000 --device).
  2. Kernels booting with an initramfs that doesn't provide the necessary e1000 modules to load early in boot.

In both of these cases, minimal images/kernels will be forced to rely on udev discovering NICs, loading the required modules and bringing links up in reaction to systemd-udev-trigger.service, which orders the udevadm triggers by subsystem with --prioritized-subsystem=module,block,tpmrm,net,tty,input. On emulated or slow hardware, the udevadm triggers can take 45 seconds or more.

Given that neither cloud-init-local.service nor cloud-init.service has an explicit After=systemd-udev-trigger.service, a race exists on slow hardware with late-loaded kernel modules: both cloud-init-local.service and cloud-init.service can run before the udev events have been processed and before any NICs are present in /sys/class/net.

To be honest, this same condition exists for both NetworkManager-wait-online.service and systemd-networkd-wait-online.service, as they don't know yet about the e1000 NIC. We see that systemd-networkd-wait-online.service proceeds with early boot (and unblocks cloud-init.service) despite no network devices being present except the loopback device at /sys/class/net/lo.

Sep 20 16:39:37.274227 cloudimg systemd[1]: Starting systemd-networkd-wait-online.service - Wait for Network to be Configured...
Sep 20 16:39:37.301678 cloudimg apparmor.systemd[219]: Restarting AppArmor
Sep 20 16:39:37.301678 cloudimg apparmor.systemd[219]: Reloading AppArmor profiles
Sep 20 16:39:37.386226 cloudimg systemd[1]: Starting systemd-tmpfiles-setup.service - Create Volatile Files and Directories...
Sep 20 16:39:37.478626 cloudimg systemd[1]: Finished console-setup.service - Set console font and keymap.
Sep 20 16:39:37.585475 cloudimg systemd[1]: Finished systemd-networkd-wait-online.service - Wait for Network to be Configured.
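To put a number on how quickly systemd-networkd-wait-online.service returned in the excerpt above, the two timestamps can be subtracted. This is only a small Python sketch over the logged values; the helper names `time_of_day` and `elapsed` are made up for illustration, and it assumes both lines were logged on the same day, as they are here.

```python
from datetime import datetime

# Prefixes copied from the "Starting" and "Finished" wait-online lines.
STARTED = "Sep 20 16:39:37.274227"
FINISHED = "Sep 20 16:39:37.585475"


def time_of_day(journal_prefix):
    """Parse the HH:MM:SS.ffffff field of a 'Sep 20 16:39:37.274227' prefix."""
    clock = journal_prefix.split()[2]
    return datetime.strptime(clock, "%H:%M:%S.%f")


def elapsed(start_prefix, end_prefix):
    """Elapsed time between two prefixes logged on the same day."""
    return time_of_day(end_prefix) - time_of_day(start_prefix)


waited = elapsed(STARTED, FINISHED)
print(f"wait-online ran for {waited.total_seconds():.3f}s")  # 0.311s
```

Roughly a third of a second is far shorter than the time the udevadm triggers can take on emulated hardware, which is consistent with the unit finishing before any NIC was known.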

Additionally, the systemd unit cloud-init.service generally blocks on the "bring up" of network services by declaring After=NetworkManager-wait-online.service or After=systemd-networkd-wait-online.service. But, as we see in the above journalctl logs, the fact that no devices are seen yet from the kernel leaves both systemd-networkd-wait-online.service and cloud-init.service in the same boat. Neither knows whether "new" network devices are expected on this hardware, because the kernel itself doesn't know about them yet: the proper udevadm net operations haven't yet been queued and processed.

From cloud-init's perspective, we can toss around ideas in this issue that could be workarounds for known platforms or datasources that require network config to be functional. But the proposals below feel like hacks around shortcomings in the setup and config of udev-configured network devices. Ideally, images containing kernels and/or initramfs that aim to support certain platforms/solutions out of the box should probably provide either kernel built-ins or initramfs support for the required drivers, to ensure fast and efficient boot on those hardware platforms.

Proposals/investigations for coping with late udev net subsystem additions when no devices are seen yet in /sys/class/net, ordered in reverse priority:

  1. cloud-init-local.service: call udevadm trigger --subsystem-match=net
    • CON: ruled out because systemd-udev-trigger.service is running in parallel, kicking many udevadm triggers for all subsystems, and this just adds more noise and udev operations during early boot, costing more time during boot
  2. cloud-init-local.service: run udevadm settle
    • CON: udevadm settle only waits for previously queued udev operations, so we could run so early during systemd-udev-trigger.service that we miss the net subsystem events, which will be queued after our settle returns
    • CON: some cloud-init-local datasources detected don't actually need network to function because they get config from mounted disks, seed files or environment variables. We don't want to force network discovery unnecessarily in all environments
  3. cloud-init-local.service: Add [Unit]\nAfter=systemd-udev-trigger.service\n[Service]\nExecStartPre=/bin/udevadm settle
    • CON: ruled out for general use because there are many images/kernels/initrds that already provide built-in modules or an initramfs with the required drivers/modules, and this adds unnecessary boot delay on any non-critical udev device loads in those images
  4. Allow cloud-init-local.service to disable network bringup, but require cloud-init.service, which runs after systemd-networkd-wait-online.service, to run udevadm settle --timeout XX if the datasource requires network connectivity AND zero NICs are present in /sys/class/net.
    • CON: we would be introducing a boot time cost on systems that will never have any physical/virtual NICs present
    • PRO: because cloud-init.service runs after sysinit.target, we only introduce this cost for datasource detection where network is required
    • CON: because the NoCloud datasource doesn't strictly require network to function, this rule wouldn't solve this particular bug, but it would solve the OpenStack and GCE cases where we know the datasource needs network connectivity to reach IMDS.
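
The control flow of option 4 can be sketched in Python as follows. This is an illustration of the proposal only, not code from cloud-init: `has_non_loopback_nic`, `maybe_settle` and the `datasource_needs_network` flag are hypothetical names, and `timeout_s` is a required argument because the proposal leaves the timeout value open (XX).

```python
import os
import subprocess


def has_non_loopback_nic(sys_class_net="/sys/class/net"):
    """True if anything other than lo is visible under sys_class_net."""
    try:
        return any(name != "lo" for name in os.listdir(sys_class_net))
    except FileNotFoundError:
        return False


def maybe_settle(datasource_needs_network, timeout_s,
                 sys_class_net="/sys/class/net", run=subprocess.run):
    """Run `udevadm settle` only in the narrow case described by option 4.

    Returns True if settle was invoked. `run` is injectable so the decision
    logic can be exercised without udevadm being installed.
    """
    if not datasource_needs_network:
        return False  # e.g. config comes from a seed file or mounted disk
    if has_non_loopback_nic(sys_class_net):
        return False  # at least one NIC is already visible, nothing to wait on
    # Wait for udev events that are already queued, bounded by the timeout.
    run(["udevadm", "settle", f"--timeout={timeout_s}"], check=False)
    return True
```

The early returns are what keep the boot-time cost limited to datasources that need the network on systems where no NIC is visible yet.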

@blackboxsw
Collaborator

@TheRealFalcon @holmanb @philroche @enr0n and @xnox if you have any alternative suggestions that we may pursue please raise them as we think through this. I'm leaning toward option 4 as a final check in only certain conditions, but it really feels like this might be something that should be better handled in the kernel or in systemd-networkd-wait-online.service if we know that net subsystem hasn't yet been setup by udev operations.

@enr0n

enr0n commented Sep 25, 2023

@TheRealFalcon @holmanb @philroche @enr0n and @xnox if you have any alternative suggestions that we may pursue please raise them as we think through this. I'm leaning toward option 4 as a final check in only certain conditions, but it really feels like this might be something that should be better handled in the kernel or in systemd-networkd-wait-online.service if we know that net subsystem hasn't yet been setup by udev operations.

In this case, systemd-networkd-wait-online is supposed to wait for netlink events in case an interface shows up later. It looks like we need systemd/systemd-stable@abbd24e in mantic, which was backported in v253.6, but we have v253.5.
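
The version gap mentioned here (v253.5 shipped, fix backported in v253.6) can be checked with a numeric comparison. The Python sketch below is illustrative only: `parse_version` and `has_wait_online_fix` are hypothetical helpers, they assume plain upstream version strings within the v253 stable series, and they cannot detect a distribution that cherry-picks the patch without bumping the version.

```python
def parse_version(text):
    """Turn 'v253.6' or '253.5' into a tuple of ints such as (253, 6)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def has_wait_online_fix(installed, fixed_in="253.6"):
    """Compare numerically so that 253.10 sorts after 253.6."""
    return parse_version(installed) >= parse_version(fixed_in)


for candidate in ("v253.5", "v253.6", "v253.10"):
    print(candidate, has_wait_online_fix(candidate))
```

Comparing tuples of integers rather than raw strings matters because a string comparison would place "253.10" before "253.6".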

@blackboxsw removed the priority label Sep 25, 2023
@blackboxsw removed their assignment Sep 25, 2023
@blackboxsw
Collaborator

Thanks @enr0n! I'm deprioritizing this issue here for cloud-init, as we won't need to perform this lookup and udevadm settle when no network devices are present: the upstream systemd commit for systemd/systemd#27822 performs this check prior to completing systemd-networkd-wait-online.service once it is present in distributions. This behavior will correctly block cloud-init.service anyway, allowing it to wait in systemd-networkd-wait-online in the case where no devices are present besides the loopback device.

The cases where this race remains are lower priority and don't warrant generalizing a fix in cloud-init at this time:

  • environments without systemd v253.6
    • In near term, this doesn't affect the majority of cloud-init use-cases as kernels or initramfs are present with supported drivers for most use-cases. We will document this behavior in release notes for minimized images when additional required drivers are needed and not yet loaded.
  • systems with multiple types of network interfaces, only some of which have modules loaded
    • In this scenario cloud-init will still be able to generate fallback network configuration on the only viable interfaces seen which should be enough in most platforms to access the instance metadata service to obtain necessary cloud-config on which to act to fully setup the system.

@xnox
Contributor

xnox commented Oct 2, 2023

@blackboxsw have you escalated the systemd issue back to foundations? They could have fixed it in time for the mantic release, no?

@enr0n is there a bug report that systemd is tracking for this?

@enr0n

enr0n commented Oct 2, 2023

I am now tracking https://bugs.launchpad.net/cloud-images/+bug/2036968 for Mantic.

@holmanb
Member

holmanb commented Apr 17, 2024

Outstanding concerns

  • environments without systemd v253.6

Do we have any known cloud-init use cases that have both <v253.6 and no initrd? I don't think we do. On Ubuntu we currently release back to focal which has newer than that, so I don't think that this is relevant anymore.

  • systems with multiple types of network interfaces, only some of which have modules loaded

Waiting on systemd-networkd-wait-online should continue to suffice to access the instance metadata service. Providing a network configuration for late-loaded devices is still a possible concern. That said, I don't think that we can reasonably expect to take the responsibility of waiting on interfaces away from the init system / kernel / initramfs without significantly increasing the scope of cloud-init[1].

Proposals

All of the proposed solutions depend on udevadm in some way, and they all fail to solve the problem. The unaddressed issue is that when no uevent has been queued yet by the kernel, replaying events via udevadm trigger and flushing queued uevents via udevadm settle will solve nothing. Consider this quote from man systemd-udev-settle.service(8):

There can be no guarantee that hardware is fully discovered at any specific time, because the kernel does hardware detection asynchronously, and certain buses and devices take a very long time to become ready, and also additional hardware may be plugged in at any time.

[1] I don't believe that we can know which ones to wait for without either a. platform-specific assumptions or b. relying on a (possibly incorrect) network configuration which in the broken case would require boot to hang indefinitely. We do need to make cloud-init resilient to device enumeration failures due to late device uevents, but that is tracked in a separate bug report.

@holmanb
Member

holmanb commented Apr 17, 2024

The conclusion reached in this bug was that cloud-init expects a primary interface to be available via builtin module availability or dependency on systemd-networkd-wait-online. This issue was introduced outside of cloud-init (cloud-init didn't change scope), and it was also resolved outside of cloud-init. I don't believe this to be a bug in cloud-init.

@blackboxsw I don't see any issues or action items in scope for cloud-init related to the original bug - I think that we should close it.

If there is remaining work to do related to this, we should file a new issue which describes the user-visible behavior that is broken in cloud-init. Either way, unless I missed something significant, this issue should be closed.

@holmanb added the bug: downstream label and removed the bug label Apr 17, 2024
@holmanb
Member

holmanb commented Apr 17, 2024

For future Ubuntu-specific bug reports please file reports on Launchpad.

@holmanb
Member

holmanb commented Apr 18, 2024

Since the issue raised here resulted from changes to external projects on Ubuntu, and the fix was also made in external Ubuntu projects, I don't see any action left to take for this bug report, so I'm going to close it.

If further Ubuntu-related issues arise related to this, please file a new report on Launchpad.
If I missed anything, feel free to re-open.

@holmanb closed this as not planned Apr 18, 2024