systemd-tmpfiles hangs inside systemd-nspawn #18189

Closed
Andrei-Pozolotin opened this issue Jan 10, 2021 · 6 comments
Labels
bug 🐛: Programming errors, that need preferential fixing
needs-reporter-feedback ❓: There's an unanswered question, the reporter needs to answer

@Andrei-Pozolotin

systemd version the issue has been seen with

systemctl --version
systemd 247 (247.2-1-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

Used distribution

archlinux

Linux kernel version used (uname -a)

uname -a
Linux work3 5.10.5-arch1-1 #1 SMP PREEMPT Thu, 07 Jan 2021 09:50:43 +0000 x86_64 GNU/Linux

CPU architecture issue was seen on

* amd
* intel

PROBLEM

We have multiple nspawn archlinux guests running on multiple archlinux hosts, which had been working fine for years.
After a recent host update to the kernel/systemd versions listed above, the affected nspawn guests
started to hang on boot, with guest journal entries such as:

Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Reached target Remote File Systems.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Reached target Slices.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Reached target Swap.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Listening on Device-mapper event daemon FIFOs.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Listening on Process Core Dump Socket.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Listening on initctl Compatibility Named Pipe.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Listening on Journal Socket (/dev/log).
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Listening on Journal Socket.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Listening on Network Service Netlink Socket.
Jan 09 21:39:00 work3 data-node[4070]:          Mounting Huge Pages File System...
Jan 09 21:39:00 work3 data-node[4070]:          Starting Journal Service...
Jan 09 21:39:00 work3 data-node[4070]:          Mounting FUSE Control File System...
Jan 09 21:39:00 work3 data-node[4070]:          Starting Remount Root and Kernel File Systems...
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Mounted Huge Pages File System.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Mounted FUSE Control File System.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Started Remount Root and Kernel File Systems.
Jan 09 21:39:00 work3 data-node[4070]:          Starting Create Static Device Nodes in /dev...
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Started Create Static Device Nodes in /dev.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Reached target Local File Systems (Pre).
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Reached target Local File Systems.
Jan 09 21:39:00 work3 data-node[4070]:          Starting Network Service...
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Started Network Service.
Jan 09 21:39:00 work3 data-node[4070]: [  OK  ] Started Journal Service.
Jan 09 21:39:00 work3 data-node[4070]:          Starting Flush Journal to Persistent Storage...
Jan 09 21:39:01 work3 data-node[4070]: [  OK  ] Started Flush Journal to Persistent Storage.
Jan 09 21:39:01 work3 data-node[4070]:          Starting Create Volatile Files and Directories...
Jan 09 21:39:06 work3 data-node[4070]: [*     ] A start job is running for Create V…es and Directories (6s / no limit)
Jan 09 21:39:06 work3 data-node[4070]: ^M[**    ] A start job is running for Create V…es and Directories (6s / no limit)
Jan 09 21:39:07 work3 data-node[4070]: ^M[***   ] A start job is running for Create V…es and Directories (7s / no limit)
Jan 09 21:39:07 work3 data-node[4070]: ^M[ ***  ] A start job is running for Create V…es and Directories (7s / no limit)
Jan 09 21:39:08 work3 data-node[4070]: ^M[  *** ] A start job is running for Create V…es and Directories (8s / no limit)
Jan 09 21:39:08 work3 data-node[4070]: ^M[   ***] A start job is running for Create V…es and Directories (8s / no limit)
Jan 09 21:39:09 work3 data-node[4070]: ^M[    **] A start job is running for Create V…es and Directories (9s / no limit)

The systemd-tmpfiles process sits in the TASK_UNINTERRUPTIBLE state (D in the ps output below),
while the journal fills with endless "A start job is running for ..." messages:

ps aux | grep systemd-tmpfiles
root        4100  0.0  0.0  15412  6716 ?        Ds   21:39   0:00 /usr/bin/systemd-tmpfiles --create --remove --boot --exclude-prefix=/dev
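
For anyone checking whether they hit the same hang, one way to spot stuck processes is to filter ps output for the D state. This is only a minimal sketch, assuming procps ps; the helper name list_d_state is made up for illustration and is not part of systemd or procps.

```shell
# Keep the header line and every row whose STAT column starts with "D"
# (uninterruptible sleep). Expects `ps -eo pid,stat,wchan:32,args`
# style output on stdin, i.e. STAT must be the second column.
list_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Usage on a live system:
#   ps -eo pid,stat,wchan:32,args | list_d_state
```

The WCHAN column names the kernel function the task is sleeping in, when the kernel exposes it. As root, reading /proc/4100/stack (PID taken from the ps output above) may give a fuller kernel stack, depending on the kernel configuration.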

WORKAROUND

none found so far

@Andrei-Pozolotin
Author

also tracking in archlinux: https://bugs.archlinux.org/task/69268

@DaanDeMeyer
Contributor

See #18140. Are you using overlayfs? If so, this is caused by a kernel overlayfs bug.
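
A quick way to answer the overlayfs question is to look at the filesystem type of the relevant mount. The sketch below assumes the usual /proc/mounts line layout (device, mount point, type, options); the helper name fstype_at is hypothetical.

```shell
# Print the filesystem type mounted at the given mount point, reading
# /proc/mounts-format lines from stdin. If several mounts are stacked
# on the same mount point, the last one listed wins.
fstype_at() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }'
}

# Usage inside the guest (or on the host, with the guest's root mount point):
#   fstype_at / < /proc/mounts
```

An overlayfs mount shows up with the type "overlay". Where util-linux is installed, `findmnt -n -o FSTYPE /` should report the same thing.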

@DaanDeMeyer added the bug 🐛 and needs-reporter-feedback ❓ labels on Jan 10, 2021
@Andrei-Pozolotin
Author

yes, using overlayfs

@DaanDeMeyer
Contributor

Alright, then I'm closing this issue. You can follow the patch I linked in #18140 for more updates. I expect this to be fixed in a minor kernel release soonishly.

@Andrei-Pozolotin
Author

verified workaround for archlinux: system roll back to 2020-12-30
https://wiki.archlinux.org/index.php/Arch_Linux_Archive#How_to_restore_all_packages_to_a_specific_date
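
For reference, the rollback described on that wiki page amounts to pointing pacman at a dated Arch Linux Archive snapshot and then doing a full sync that allows downgrades. A rough sketch, assuming the snapshot URL layout documented on the wiki; the helper name ala_server_line is made up here, and the exact steps taken on the affected hosts are not stated in this thread.

```shell
# Print a mirrorlist "Server" line for the Arch Linux Archive snapshot
# of a given date (YYYY-MM-DD).
ala_server_line() {
    # Turn 2020-12-30 into 2020/12/30 for the archive URL path.
    date_path=$(printf '%s' "$1" | sed 's|-|/|g')
    # $repo and $arch are pacman placeholders and must stay literal,
    # hence the single-quoted format string.
    printf 'Server=https://archive.archlinux.org/repos/%s/$repo/os/$arch\n' "$date_path"
}

# Usage (as root, after backing up the existing mirrorlist):
#   ala_server_line 2020-12-30 > /etc/pacman.d/mirrorlist
#   pacman -Syyuu
```

The doubled flags in `pacman -Syyuu` force a database refresh and permit downgrades, which is what moves the installed packages back to the snapshot date.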

@Andrei-Pozolotin
Author

the issue is resolved for archlinux with kernel 5.10.15 via
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a66f82a1de028878bb158cfaac178f3a710ebdeb
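
To check whether a given host already runs a kernel at or past that release, a version comparison along these lines may help. It is a sketch only: it assumes `sort -V` (GNU coreutils version sort) is available, the helper names are hypothetical, and the 5.10.15 threshold is meaningful only for the 5.10.y series reported here.

```shell
# Strip the distribution suffix from a `uname -r` string,
# e.g. "5.10.5-arch1-1" becomes "5.10.5".
kernel_base() {
    printf '%s\n' "${1%%-*}"
}

# Succeed if version $1 is greater than or equal to version $2.
# The smaller of the two sorts first, so $1 >= $2 exactly when
# the first sorted line equals $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   if version_ge "$(kernel_base "$(uname -r)")" 5.10.15; then
#       echo "kernel is at or past 5.10.15"
#   else
#       echo "kernel is older than 5.10.15"
#   fi
```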
