systemd-nspawn booting Ubuntu stuck on Reached target Host and Network Name Lookups. #17686

Closed
Botspot opened this issue Nov 20, 2020 · 17 comments
Labels
needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer not-our-bug

Comments

@Botspot

Botspot commented Nov 20, 2020

systemd version the issue has been seen with

It's complicated. I am running systemd 241 on my main system, but the systemd-nspawn binary is compiled from the latest version (247).

Used distribution

Host system: ARMhf Debian 10

I have downloaded an armhf Ubuntu 20.04 image. After mounting the image to /media/pi/vdesktop, I run this command:

sudo systemd-nspawn -bD /media/pi/vdesktop

Full output:

Spawning container vdesktop on /media/pi/vdesktop.
Press ^] three times within 1s to kill container.
Host and machine ids are equal (d1dc3b5f97b7a536cd51a46a5faa0e3c): refusing to link journals
systemd 245.4-4ubuntu3.2 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
Detected virtualization systemd-nspawn.
Detected architecture arm64.

Welcome to Ubuntu 20.04.1 LTS!

Set hostname to <ubuntu>.
/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
[  OK  ] Reached target Slices.
[  OK  ] Reached target Swap.
[  OK  ] Listening on initctl Compatibility Named Pipe.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
         Starting Journal Service...
         Starting Set the console keyboard layout...
         Mounting FUSE Control File System...
         Starting Remount Root and Kernel File Systems...
[  OK  ] Started Journal Service.
[  OK  ] Finished Set the console keyboard layout.
[  OK  ] Mounted FUSE Control File System.
[FAILED] Failed to start Remount Root and Kernel File Systems.
See 'systemctl status systemd-remount-fs.service' for details.
         Starting Flush Journal to Persistent Storage...
         Starting Create System Users...
[  OK  ] Finished Create System Users.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Finished Flush Journal to Persistent Storage.
[  OK  ] Finished Create Static Device Nodes in /dev.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Tell Plymouth To Write Out Runtime Data...
         Starting Commit a transient machine-id on disk...
         Starting Create Volatile Files and Directories...
[  OK  ] Started Dispatch Password …ts to Console Directory Watch.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Finished Tell Plymouth To Write Out Runtime Data.
[FAILED] Failed to start Commit a transient machine-id on disk.
See 'systemctl status systemd-machine-id-commit.service' for details.
[  OK  ] Finished Create Volatile Files and Directories.
         Starting Network Name Resolution...
[  OK  ] Reached target System Time Set.
[  OK  ] Reached target System Time Synchronized.
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Finished Update UTMP about System Boot/Shutdown.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Trigger anacron every hour.
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Started Periodic ext4 Onli…ata Check for All Filesystems.
[  OK  ] Started Refresh fwupd metadata regularly.
[  OK  ] Started Daily rotation of log files.
[  OK  ] Started Daily man-db regeneration.
[  OK  ] Started Message of the Day.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Timers.
[  OK  ] Listening on Unix socket for apport crash forwarding.
[  OK  ] Listening on Avahi mDNS/DNS-SD Stack Activation Socket.
[  OK  ] Listening on CUPS Scheduler.
[  OK  ] Listening on D-Bus System Message Bus Socket.
         Starting Socket activation for snappy daemon.
[  OK  ] Listening on UUID daemon activation socket.
[  OK  ] Listening on Socket activation for snappy daemon.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting LSB: network connection manager...
         Starting Permit User Sessions...
         Starting Rotate log files...
         Starting Daily man-db regeneration...
[  OK  ] Finished Permit User Sessions.
[  OK  ] Started LSB: network connection manager.
[  OK  ] Reached target OEM Configuration.
[  OK  ] Started D-Bus System Message Bus.
         Starting Network Manager Script Dispatcher Service...
[  OK  ] Started Network Manager Script Dispatcher Service.
         Starting WPA supplicant...
[  OK  ] Started WPA supplicant.
[  OK  ] Finished Daily man-db regeneration.
[  OK  ] Finished Rotate log files.
[  OK  ] Started Network Name Resolution.
[  OK  ] Reached target Network.
[  OK  ] Reached target Host and Network Name Lookups.

Linux kernel version used (uname -a)

Linux raspberrypi 5.4.72-v7l+ #1356 SMP Thu Oct 22 13:57:51 BST 2020 armv7l GNU/Linux

Expected behaviour you didn't see

I expected the boot process to finish successfully. I expected to be able to login.

Unexpected behaviour you saw

Instead, it gets stuck at [ OK ] Reached target Host and Network Name Lookups. in the boot process. It's stuck there forever and never proceeds further.

How to fix? Is there a systemd service I can mask in the Ubuntu image to allow the boot process to continue?

@poettering
Member

host and machine IDs are equal? that's weird.

Consider using "journalctl -M …" to have a look at the logs while the system boots up.

Can you log into the system with "machinectl … shell"?

Normally the getty for systemd running in containers is instantiated via the "systemd-getty-generator". Is that installed in your container? Does it run?

If you turn on debug logging, anything more you see? (pass systemd.log_level=debug on the nspawn cmdline)
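A minimal dry-run sketch of the three checks suggested above, assuming the container is registered as vdesktop and rooted at /media/pi/vdesktop (both taken from the report). The helper name nspawn_debug_commands is hypothetical, and it only prints the commands instead of running them, since they need root and a live container.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the diagnostic commands, does not execute them.
nspawn_debug_commands() {
    local machine="$1" rootdir="$2"
    # Boot the container with debug logging from the container's systemd.
    printf 'sudo systemd-nspawn -bD %s systemd.log_level=debug\n' "$rootdir"
    # From a second terminal on the host: follow the container's journal.
    printf 'sudo journalctl -M %s -b -f\n' "$machine"
    # Try to open a shell inside the running container.
    printf 'sudo machinectl shell %s\n' "$machine"
}

nspawn_debug_commands vdesktop /media/pi/vdesktop
```

Run the printed lines by hand, the first in one terminal and the other two in a second terminal while the container is up.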

@poettering poettering added the needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer label Nov 23, 2020
@Botspot
Author

Botspot commented Nov 23, 2020

@poettering

> host and machine IDs are equal? that's weird.

It sure is. And I'm not sure why it would say that because /etc/machine-id is certainly different between host and guest.
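One way to double-check that claim is to compare the two files directly. This is only a sketch: same_machine_id is a hypothetical helper, and the guest path assumes the image is still mounted at /media/pi/vdesktop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exits 0 only when both files hold the same
# non-empty machine ID; exits 2 if either file cannot be read.
same_machine_id() {
    local host_file="$1" guest_file="$2"
    local host_id guest_id
    # tr strips the trailing newline and any other whitespace.
    host_id="$(tr -d '[:space:]' < "$host_file")" || return 2
    guest_id="$(tr -d '[:space:]' < "$guest_file")" || return 2
    [ -n "$host_id" ] && [ "$host_id" = "$guest_id" ]
}

# Usage on the setup from the report (paths assumed):
#   same_machine_id /etc/machine-id /media/pi/vdesktop/etc/machine-id \
#       && echo "IDs are equal" || echo "IDs differ"
```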

> Consider using "journalctl -M …" to have a look at the logs while the system boots up.

Okay here's the output:

-- Logs begin at Sun 2020-10-11 15:34:50 CDT, end at Mon 2020-11-23 12:01:16 CST. --
Oct 11 15:34:50 ubuntu systemd-journald[42]: Journal started
Oct 11 15:34:50 ubuntu systemd-journald[42]: Runtime Journal (/run/log/journal/610873e5ce93a58e0a1b78335f836c64) is 8.0M, max 76.5M, 68.5M free.
Oct 11 15:34:50 ubuntu keyboard-setup.sh[47]: Couldn't get a file descriptor referring to the console
Oct 11 15:34:50 ubuntu keyboard-setup.sh[48]: Couldn't get a file descriptor referring to the console
Oct 11 15:34:50 ubuntu systemd-remount-fs[49]: mount: /: can't find LABEL=writable.
Oct 11 15:34:50 ubuntu systemd-remount-fs[46]: /bin/mount for / exited with exit status 1.
Oct 11 15:34:50 ubuntu keyboard-setup.sh[50]: setupcon: We are not on the console, the console is left unconfigured.
Oct 11 15:34:51 ubuntu systemd[1]: Finished Set the console keyboard layout.
Oct 11 15:34:51 ubuntu systemd[1]: Mounted FUSE Control File System.
Oct 11 15:34:51 ubuntu systemd[1]: systemd-remount-fs.service: Main process exited, code=exited, status=1/FAILURE
Oct 11 15:34:51 ubuntu systemd[1]: systemd-remount-fs.service: Failed with result 'exit-code'.
Oct 11 15:34:51 ubuntu systemd[1]: Failed to start Remount Root and Kernel File Systems.
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in Rebuild Hardware Database being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Starting Flush Journal to Persistent Storage...
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in Load/Save Random Seed being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Starting Create System Users...
Oct 11 15:34:51 ubuntu systemd-journald[42]: Time spent on flushing to /var/log/journal/610873e5ce93a58e0a1b78335f836c64 is 6.504ms for 17 entries.
Oct 11 15:34:51 ubuntu systemd-journald[42]: System Journal (/var/log/journal/610873e5ce93a58e0a1b78335f836c64) is 8.0M, max 531.6M, 523.6M free.
Oct 11 15:34:51 ubuntu systemd-sysusers[71]: Creating group systemd-coredump with gid 999.
Oct 11 15:34:51 ubuntu systemd-sysusers[71]: Creating user systemd-coredump (systemd Core Dumper) with uid 999 and gid 999.
Oct 11 15:34:51 ubuntu systemd[1]: Finished Flush Journal to Persistent Storage.
Oct 11 15:34:51 ubuntu systemd[1]: Finished Create System Users.
Oct 11 15:34:51 ubuntu systemd[1]: Starting Create Static Device Nodes in /dev...
Oct 11 15:34:51 ubuntu systemd[1]: Finished Create Static Device Nodes in /dev.
Oct 11 15:34:51 ubuntu systemd[1]: Reached target Local File Systems (Pre).
Oct 11 15:34:51 ubuntu systemd[1]: Reached target Local File Systems.
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in Load AppArmor profiles being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Starting Tell Plymouth To Write Out Runtime Data...
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in Store a System Token in an EFI Variable being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Starting Commit a transient machine-id on disk...
Oct 11 15:34:51 ubuntu systemd[1]: Starting Create Volatile Files and Directories...
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in udev Kernel Device Manager being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in Show Plymouth Boot Screen being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
Oct 11 15:34:51 ubuntu systemd[1]: Condition check resulted in Forward Password Requests to Plymouth Directory Watch being skipped.
Oct 11 15:34:51 ubuntu systemd[1]: Reached target Local Encrypted Volumes.
Oct 11 15:34:51 ubuntu systemd-machine-id-setup[75]: /etc/machine-id is not on a temporary file system.

> Can you log into the system with "machinectl … shell"?

Machinectl shell does not work. It's stuck at run-parts: /etc/update-motd.d/98-fsck-at-reboot exited with return code 2

Interestingly, machinectl login does work.

> Normally the getty for systemd running in containers is instantiated via the "systemd-getty-generator". Is that installed in your container? Does it run?

I am not familiar with Getty. The container I'm using is actually a fresh Ubuntu Mate image I downloaded from here.

> If you turn on debug logging, anything more you see? (pass systemd.log_level=debug on the nspawn cmdline)

Yes I see a lot more, at first. But near the point when the boot process hangs, there is no additional output.

@poettering
Member

> Machinectl shell does not work. It's stuck at run-parts: /etc/update-motd.d/98-fsck-at-reboot exited with return code 2

That appears to be a bug in Ubuntu:

https://bugs.launchpad.net/ubuntu/+source/update-notifier/+bug/1881548

> > Normally the getty for systemd running in containers is instantiated via the "systemd-getty-generator". Is that installed in your container? Does it run?
>
> I am not familiar with Getty. The container I'm using is actually a fresh Ubuntu Mate image I downloaded from here.

Maybe ask them for help?

getty is the thing that shows the login prompt.

please paste the debug log somewhere.

@Botspot
Author

Botspot commented Nov 24, 2020

ubuntubootlog.txt

@Botspot
Author

Botspot commented Nov 24, 2020

@poettering
I don't think the boot process is being held by any .service file in particular.

Sometimes it stops at "Started Hostname Service.", other times at "Started Network Name Resolution.", "Reached target OEM Configuration.", or "Started Network Manager Script Dispatcher Service."

It seems like there's some other process or service, that's started earlier on, that eventually stops the boot process.
How to find it?

  • sudo systemctl list-jobs -M vdesktop returns nothing. (It says "No jobs running.")
  • sudo systemd-analyze -M vdesktop returns "Startup finished in 35.776s (userspace)" and "oem-config.target reached after 25.194s in userspace".

@Wirone

Wirone commented Dec 1, 2020

I have a similar problem after upgrading the kernel today. Booting freezes at the same step (Reached target Host and Network Name Lookups); I have to switch terminals and start X by hand (but it starts without hardware acceleration and Wi-Fi, and runs very slowly). I spent a few hours trying to solve it, without success.

Ubuntu 20.10 with kernel 5.8.0-29.generic

@Wirone

Wirone commented Dec 2, 2020

FYI: I was able to boot normally (with video and wifi drivers) using 5.9.12 kernel installed with Mainline. Maybe someone will find it useful.

@poettering
Member

@Botspot What is "oem-config.target"?

Your system is booting into some custom target, not multi-user.target or graphical.target. That target probably doesn't pull in getty.target.
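A rough way to verify this from the host is to list what the target pulls in and look for getty.target. The helper name pulls_in_getty is hypothetical, and the machine name vdesktop is taken from the commands earlier in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "systemctl list-dependencies <target>" output
# on stdin and succeeds if getty.target appears anywhere in the tree.
pulls_in_getty() {
    grep -q 'getty\.target'
}

# Usage against the running container:
#   sudo systemctl -M vdesktop get-default
#   sudo systemctl -M vdesktop list-dependencies oem-config.target | pulls_in_getty \
#       && echo "getty.target is pulled in" || echo "no getty.target"
```

If the target indeed lacks getty.target, one thing to try is appending systemd.unit=multi-user.target after the nspawn options, since systemd inside a container reads such options from its own command line. Whether that yields a usable login on this particular image is untested.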

This is a local misconfiguration or some distro-specific change. It's not a bug in systemd, since we upstream know no "oem-config.target". If this is a target you added, please fix your configuration. If you need help for that, contact the systemd mailing list or so, which is the better place for support. If this is a target invented by your downstream distro, please contact them/their community for help.

Either way, let's close this here, since this is a downstream/local issue, not an upstream one. Hope that makes sense.

@Botspot
Author

Botspot commented Dec 2, 2020

@poettering How have we confirmed that OEM-config.target is surely the cause for the freezes?

Concerning your theory that it's a "local misconfiguration", please remember that this is a fresh Ubuntu image, downloaded straight from their download page.

@poettering
Member

> @poettering How have we confirmed that OEM-config.target is surely the cause for the freezes?

There's no freeze. The target just doesn't list the gettys, so you don't get any. The system is fully up, just without the services that give you the login prompt, that is all. It's explicitly configured that way, and has nothing to do with systemd itself.

Again, talk to your distro if you didn't add "oem-config.target". It's not an upstream feature. I don't know what it is. Ask your distro for help. Maybe @mbiebl can help?

@everflux

I experience the same problem using a network install of Ubuntu 21.04.
Never experienced this in the past, feels very much like systemd is broken in this regard.

@lethargosapatheia

Funnily enough, I experience this with packer (1.7.4 deployed on proxmox 6.4-9/ubuntu 20.04.2) :) It simply won't continue, stops every time at Network/Host and Network Name Lookups. It freezes completely. Cannot access anything else.
Came across this while searching for this issue :)

@durfman

durfman commented Feb 2, 2022

Using Packer 1.7.8 and an Ubuntu 20.04 image, we hit the same issue that lethargosapatheia described. During packer build the VM gets stuck at "Reached target Host and Network Name Lookups". Was a resolution ever found?

@lethargosapatheia

lethargosapatheia commented Feb 2, 2022

@durfman in my case the problem was related to cloud-init, which is not terribly explicit, unfortunately. I think the problem at that point was that it couldn't get network access at all or something to that effect.
One of the things I realised in the meantime was that I needed to use cloud-init from the very beginning, and no longer debian-installer.
The boot parameters also need to be correct. This is what I use:

      "boot_command": [
        "<esc><enter><f6><esc><wait> ",
        "<bs><bs><bs><bs><bs>",
        "ip={{ user `vm_ip` }}::{{ user `vm_gateway` }}:{{ user `vm_netmask` }}::::{{ user `vm_dns` }} ",
        "autoinstall ds=nocloud-net;s=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ ",
        "--- <enter>"
      ]

The variables need to be substituted with the respective values, of course.

You can find the explanation (although not particularly direct, but you can infer it) here:
https://git.kernel.org/pub/scm/libs/klibc/klibc.git/tree/usr/kinit/ipconfig/README.ipconfig
and here
https://mjmwired.net/kernel/Documentation/filesystems/nfs/nfsroot.txt
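To make the field order in the ip= line easier to follow, here is a small sketch that assembles it from named values. kernel_ip_param is a hypothetical helper and the addresses are made up; the field order is the one documented in nfsroot.txt.

```shell
#!/usr/bin/env bash
# Field order of the kernel's ip= parameter, per nfsroot.txt:
#   ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>
# Server IP, hostname, device and autoconf are left empty here, as in the
# boot_command above. kernel_ip_param is a hypothetical helper.
kernel_ip_param() {
    local client_ip="$1" gateway="$2" netmask="$3" dns="$4"
    printf 'ip=%s::%s:%s::::%s\n' "$client_ip" "$gateway" "$netmask" "$dns"
}

# Example with placeholder addresses:
kernel_ip_param 192.0.2.10 192.0.2.1 255.255.255.0 192.0.2.53
# prints: ip=192.0.2.10::192.0.2.1:255.255.255.0::::192.0.2.53
```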

[later edit:]
Or I think it might have been related to the fact that the server I was deploying with packer couldn't reach the machine I was running packer on, in order to download the files from the http server. At that time I didn't know it needed that - so that's also important, it can get stuck there.
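A quick reachability check along those lines, as a sketch: with ds=nocloud-net, cloud-init fetches user-data and meta-data relative to the s= URL, so both should be downloadable from the network the new VM sits on. nocloud_seed_urls is a hypothetical helper, and the address and port in the usage comment are placeholders for whatever Packer substitutes for .HTTPIP and .HTTPPort.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the base URL passed as "s=" to ds=nocloud-net,
# print the two files cloud-init's NoCloud datasource requests from it.
nocloud_seed_urls() {
    local base="${1%/}"   # drop one trailing slash, if any
    printf '%s/user-data\n%s/meta-data\n' "$base" "$base"
}

# Usage from a machine on the VM's network (address and port are placeholders):
#   for url in $(nocloud_seed_urls http://192.0.2.20:8080/); do
#       curl -fsS --max-time 5 -o /dev/null "$url" && echo "ok   $url" || echo "FAIL $url"
#   done
```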

@Botspot
Author

Botspot commented Feb 3, 2022

Someone else theorized that the frozen boot was caused by an initial setup service failing to run:
/usr/lib/oem-config/oem-config.service

@lethargosapatheia

lethargosapatheia commented Feb 3, 2022

@Botspot I am quite sure it depends on the context: the message doesn't point to anything in particular, it only lets you narrow down the range of possibilities. In my case, for instance, this wasn't related to oem, as this service doesn't exist on Ubuntu virtual machines.
But it's good that you've put it out there, anyway.
The main issue is that you can't tell exactly, which is quite annoying. But if you have a look at the previous services that the OS is trying to start, that can be a useful clue.

@freddo256

freddo256 commented Mar 15, 2023

I know this post is already over a year old, but I had the same problem when trying to build my packer project to Proxmox over a VPN. Like @lethargosapatheia said the server couldn't reach the host I was running Packer from. The creation of a Bastion host did the trick for me.
