Stopped (with error) /dev/dm-1 #1620

Open
rumpelsepp opened this issue Oct 20, 2015 · 101 comments

Comments

@rumpelsepp

I am using full disk encryption and an encrypted swap partition. On every shutdown I get the following error:

Okt 20 09:03:52 kronos systemd[1]: Deactivated swap /dev/dm-1.
Okt 20 09:03:52 kronos systemd[1]: Stopped (with error) /dev/dm-1.
Okt 20 09:03:54 kronos systemd[1]: Stopped /sys/devices/virtual/block/dm-1.

# /etc/crypttab
swap           /dev/sda3                         /dev/urandom            swap,discard

# /etc/crypttab.initramfs
cryptroot   /dev/sda2   none    discard

$ uname -a
Linux kronos 4.2.3-1-ARCH #1 SMP PREEMPT Sat Oct 3 18:52:50 CEST 2015 x86_64 GNU/Linux

$ systemctl --version
systemd 227
+PAM -AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

$ lsblk
NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                             8:0    0 335,4G  0 disk  
├─sda1                                          8:1    0   500M  0 part  /boot
├─sda2                                          8:2    0 319,9G  0 part  
│ └─cryptroot                                 254:0    0 319,9G  0 crypt /
└─sda3                                          8:3    0    15G  0 part  
  └─swap                                      254:1    0    15G  0 crypt [SWAP]

@poettering
Member

Could you please boot with "systemd.log_level=debug", then reboot and shut down, and provide us with a longer excerpt of the logs this generates around the issue you are trying to debug? This should give us more information about what precisely is failing there.

@poettering added the pid1 and needs-reporter-feedback labels Oct 20, 2015
@rumpelsepp
Author

Sure, I'll see whether I have a few minutes for that this afternoon.

@rumpelsepp
Author

Enough?

Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2duuid-fa4c42b1\x2dff6d\x2d4d23\x2d9b69\x2d3bbc65abbb42.swap: Changed active -> dead
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2duuid-fa4c42b1\x2dff6d\x2d4d23\x2d9b69\x2d3bbc65abbb42.swap: Job dev-disk-by\x2duuid-fa4c42b1\x2dff6d\x2d4d23\x2d9b69\x2d3bbc65abbb42.swap/stop finished, result=done
Okt 20 12:51:25 kronos systemd[1]: Deactivated swap /dev/disk/by-uuid/fa4c42b1-ff6d-4d23-9b69-3bbc65abbb42.
Okt 20 12:51:25 kronos systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=587 reply_cookie=0 error=n/a
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dPLAIN\x2dswap.swap: Changed active -> dead
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dPLAIN\x2dswap.swap: Job dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dPLAIN\x2dswap.swap/stop finished, result=done
Okt 20 12:51:25 kronos systemd[1]: Deactivated swap /dev/disk/by-id/dm-uuid-CRYPT-PLAIN-swap.
Okt 20 12:51:25 kronos systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=588 reply_cookie=0 error=n/a
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2did-dm\x2dname\x2dswap.swap: Changed active -> dead
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2did-dm\x2dname\x2dswap.swap: Job dev-disk-by\x2did-dm\x2dname\x2dswap.swap/stop finished, result=done
Okt 20 12:51:25 kronos systemd[1]: Deactivated swap /dev/disk/by-id/dm-name-swap.
Okt 20 12:51:25 kronos systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=589 reply_cookie=0 error=n/a
Okt 20 12:51:25 kronos systemd[1]: dev-dm\x2d1.swap: Changed active -> dead
Okt 20 12:51:25 kronos systemd[1]: dev-dm\x2d1.swap: Job dev-dm\x2d1.swap/stop finished, result=done
Okt 20 12:51:25 kronos systemd[1]: Deactivated swap /dev/dm-1.
Okt 20 12:51:25 kronos systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=590 reply_cookie=0 error=n/a
Okt 20 12:51:25 kronos systemd[1]: dev-dm\x2d1.device: Job dev-dm\x2d1.device/stop finished, result=failed
Okt 20 12:51:25 kronos systemd[1]: Stopped (with error) /dev/dm-1.
Okt 20 12:51:25 kronos systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=591 reply_cookie=0 error=n/a
Okt 20 12:51:25 kronos systemd[1]: dev-dm\x2d1.swap: Collecting.
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2did-dm\x2dname\x2dswap.swap: Collecting.
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dPLAIN\x2dswap.swap: Collecting.
Okt 20 12:51:25 kronos systemd[1]: dev-disk-by\x2duuid-fa4c42b1\x2dff6d\x2d4d23\x2d9b69\x2d3bbc65abbb42.swap: Collecting.

@arvidjaar
Contributor

dev-dm\x2d1.device: Job dev-dm\x2d1.device/stop finished, result=failed

Devices do not even have a stop method, so failure is pretty much expected here; but why does it attempt to stop the device in the first place? Can you make the full log available?
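
To pull lines like the one quoted above out of a long debug log, a short filter can help. This is only a minimal sketch and not part of systemd: the function name `failed_stop_jobs` is made up for illustration, and it assumes the log uses the same "Job <unit>/stop finished, result=<result>" wording as the excerpt in this thread (other systemd versions may word it differently).

```python
import re

# Matches debug lines such as:
#   dev-dm\x2d1.device: Job dev-dm\x2d1.device/stop finished, result=failed
# The wording is taken from the log excerpt in this thread (assumption:
# it may differ between systemd versions).
JOB_RE = re.compile(r"Job (?P<unit>\S+)/stop finished, result=(?P<result>\w+)")


def failed_stop_jobs(lines):
    """Return (unit, result) pairs for stop jobs whose result is not 'done'."""
    failures = []
    for line in lines:
        match = JOB_RE.search(line)
        if match and match.group("result") != "done":
            failures.append((match.group("unit"), match.group("result")))
    return failures


if __name__ == "__main__":
    sample = [
        r"systemd[1]: dev-dm\x2d1.swap: Job dev-dm\x2d1.swap/stop finished, result=done",
        r"systemd[1]: dev-dm\x2d1.device: Job dev-dm\x2d1.device/stop finished, result=failed",
    ]
    for unit, result in failed_stop_jobs(sample):
        print(unit, result)
```

Feeding it the saved output of a debug-level journal would list only the units whose stop job failed, which makes it easier to see whether anything besides .device units is affected.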

@rumpelsepp
Author

I'd prefer to send it to an email address, but yes, I can do that.

@igalic

igalic commented Mar 31, 2016

I just did a reboot with debug enabled. Here are 12k lines of glorious log (it starts from logind noticing that we're rebooting, but I can give you more if necessary): https://gist.github.com/e032493159fbc8cda780f773c261dc03

@db-src

db-src commented May 30, 2016

I get this too, for dm-0 via ecryptfs-setup-swap, on Debian unstable.

Am I correct to infer that this is harmless and the fix is to deactivate a nonsensical 'stop device' service somewhere? How would I do that? Should we file bugs somewhere else against this?

thanks

@djgera

djgera commented Oct 14, 2016

I get this "Stopped (with error) /dev/mapper/home" if the fstab line uses UUID=; there is no error if /dev/mapper/home is used instead.

@saintger

On Debian Stable, with systemd 230-7~bpo8+2, I got this error as well, on a /home in a RAID1 configuration on top of dm-crypt/LUKS.
Using only label designations in /etc/fstab and /etc/crypttab, I got the errors:

Stopped (with error) /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-2814d5a159c1237c95155b0e2efdg3e1-enc1.
Stopped (with error) /dev/dm-1.
Stopped (with error) /dev/disk/by-id/dm-name-enc1.
Stopped (with error) /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-43759e562s4df5gh2z4r8teca4dbbad9-enc0.
Stopped (with error) /dev/disk/by-uuid/dea9dd5d-a129-414d-dfe546789347.
Stopped (with error) /dev/disk/by-label/vol0.
Stopped (with error) /dev/mapper/enc1.
Stopped (with error) /sys/devices/virtual/block/dm-0.
Stopped (with error) /dev/disk/by-id/dm-name-enc0.
Stopped (with error) /dev/dm-0.
Stopped (with error) /sys/devices/virtual/block/dm-1.

Using only UUID designations in /etc/fstab and /etc/crypttab, I got the errors:

Stopped (with error) /dev/mapper/enc0.
Stopped (with error) /dev/dm-1.
Stopped (with error) /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-2814d5a159c1237c95155b0e2efdg3e1-enc1.
Stopped (with error) /dev/mapper/enc1.
Stopped (with error) /sys/devices/virtual/block/dm-1.
Stopped (with error) /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-43759e562s4df5gh2z4r8teca4dbbad9-enc0.
Stopped (with error) /dev/disk/by-id/dm-name-enc1.
Stopped (with error) /dev/dm-0.
Stopped (with error) /sys/devices/virtual/block/dm-0.
Stopped (with error) /dev/disk/by-id/dm-name-enc0.
Stopped (with error) /dev/disk/by-label/vol0.

So for me the workaround doesn't work.
However, it seems that /home is correctly unmounted afterwards, but it would be nice if someone could confirm that these error messages are harmless.

@denibertovic

Am I correct to infer that this is harmless and the fix is to deactivate a nonsensical 'stop device'
service somewhere? How would I do that? Should we file bugs somewhere else against this?

@db0451 did you find out anything regarding your question? I'm also interested in whether this is harmless or in fact something that I should look into ASAP.

@db-src

db-src commented Oct 26, 2016

@denibertovic The only info I have on this issue is this thread.

This continues to occur for me, though admittedly my system seems to work OK anyway. But it is always worrying to see red ERROR text anywhere, so if this is ultimately a non-issue and can/should be worked around, then I wish it would be.

@knghtbrd

I did not find it harmless. I found my disks would be dirty on the next boot often because they did not unmount cleanly, with errors introduced that required fixing. And this is really difficult to fix on systemd systems because even in "single user" mode, you can't really remount your root directory ro. And that's when I determined that LUKS + bcache + systemd were not, for the moment, something I should be trying to use on the same system.

@saintger

@iKarith Thanks for the info. As a workaround, I will therefore unmount manually, before shutting down (or include an unmount script somewhere).
I'll keep monitoring this issue to see if there is something I can do toward solving this issue.

@ledoge

ledoge commented Oct 28, 2016

I also encounter Stopped (with error) /dev/dm-1. on Arch Linux. I set up an encrypted swap partition with /dev/urandom as the password and mounted by its label as described here under "UUID and Label".

@camoz
Contributor

camoz commented Nov 18, 2016

I too have had this issue for a year.
Described it in the Arch Forums back then: https://bbs.archlinux.org/viewtopic.php?id=205275
For me, not using UUIDs in /etc/fstab solves it, too.

@urbie-mk2

I have the same use case as @ledoge. I have already temporarily tried using device names (sda5) in the fstab / crypttab files; the same error message appears. Here is the log for this error during shutdown.

systemd_shutdown_stopped_with_error_dm-1_version_231.txt

@mnfdev

mnfdev commented Nov 28, 2016

I also had the "Stopped (with error) /dev/dm.." problem with encrypted swap, then found that the generator doesn't add dependencies to the swap unit in /var/run/systemd/generator.

Instead of crypttab/fstab entries I just wrote simple .service and .swap units, and now it stops without error.

Examples from my local computer (HITACHI-SWAP - LVM volume, HITACHI-SWAP-OPEN - encrypted swap on HITACHI-SWAP):

/etc/systemd/system/systemd-cryptsetup@HITACHI\x2dSWAP\x2dOPEN.service:

[Unit]
Description=Setup swap on /dev/mapper/HITACHI-SWAP-OPEN
Requires=dev-mapper-HITACHI\x2dSWAP.device
After=dev-mapper-HITACHI\x2dSWAP.device
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/lib/systemd/systemd-cryptsetup attach 'HITACHI-SWAP-OPEN' '/dev/mapper/HITACHI-SWAP' '/dev/urandom' 'swap'
ExecStop=/lib/systemd/systemd-cryptsetup detach 'HITACHI-SWAP-OPEN'
ExecStartPost=/sbin/mkswap '/dev/mapper/HITACHI-SWAP-OPEN'

/etc/systemd/system/dev-mapper-HITACHI\x2dSWAP\x2dOPEN.swap:

[Unit]
Description=Start swap on /dev/mapper/HITACHI-SWAP-OPEN
Requires=systemd-cryptsetup@HITACHI\x2dSWAP\x2dOPEN.service
After=systemd-cryptsetup@HITACHI\x2dSWAP\x2dOPEN.service
Before=swap.target
DefaultDependencies=no

[Swap]
What=/dev/mapper/HITACHI-SWAP-OPEN

[Install]
WantedBy=swap.target

(added to the default swap.target with systemctl enable 'dev-mapper-HITACHI\x2dSWAP\x2dOPEN.swap')
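
The unit file names above follow systemd's path escaping, which is why `/dev/mapper/HITACHI-SWAP-OPEN` becomes `dev-mapper-HITACHI\x2dSWAP\x2dOPEN.swap`. The following is a rough sketch of that escaping, meant only to show where the `\x2d` sequences come from. The function name `escape_path` is made up for illustration, and the sketch skips details such as collapsing repeated slashes; on a real system `systemd-escape --path` is the authoritative tool.

```python
import string

# Characters that are kept as-is in an escaped unit name.
SAFE = set(string.ascii_letters + string.digits + ":_.")


def escape_path(path):
    """Approximate systemd's path escaping for unit names (simplified sketch)."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-"  # the root directory is a special case
    out = []
    for index, byte in enumerate(trimmed.encode("utf-8")):
        char = chr(byte)
        if char == "/":
            out.append("-")  # path separators become dashes
        elif char in SAFE and not (index == 0 and char == "."):
            out.append(char)
        else:
            out.append("\\x%02x" % byte)  # everything else, including "-", is hex-escaped
    return "".join(out)


if __name__ == "__main__":
    print(escape_path("/dev/mapper/HITACHI-SWAP-OPEN") + ".swap")
    print(escape_path("/dev/dm-1") + ".device")
```

Because a literal "-" in the path would be ambiguous with the separator, it is written as `\x2d`, which matches the unit names seen in the logs throughout this thread.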

@siwyd

siwyd commented Nov 30, 2016

Same issue here with an encrypted swap partition.

@RalfJung

RalfJung commented Dec 6, 2016

I am also seeing these errors on Debian testing on two systems with root and /home on an encrypted LVM. One also has swap on that LVM, one has no swap.

@poettering Why is the tag "needs-reporter-feedback" still set? Which feedback is missing? I'd be happy to provide more data if helpful.

@pdefreitas

pdefreitas commented Dec 6, 2016

same issue here with encrypted swap partition, ubuntu 16.04.

@marchom

marchom commented Dec 6, 2016

same on ubuntu 16.10

@mayosemmel

Same error on Debian 8.6.
If you need any log files, I will be happy to send them as well.

@Skyedra

Skyedra commented Dec 15, 2016

Also having this issue on ubuntu 16.10 / systemd 231 / LVM on LUKS on mdadm/Mirrored Raid 1

Can a project member please remove the "needs-reporter-feedback" tag? Not sure if this tag is preventing developers from getting started on this bug, but I believe all the necessary info has already been supplied. If not, please let us know how we can help.

@arvidjaar
Contributor

@sky-lake

Well, we do not know what info was requested ... :)

As far as I can tell, the reason for this error message is that the crypto container device Requires the corresponding cryptsetup service. The cryptsetup service is in turn configured to be stopped on shutdown. Per the Requires behavior, stopping the cryptsetup unit initiates stopping of the crypto container device. And as I mentioned before, devices do not have a "stop" method at all, which results in this error message.
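
The propagation described here can be illustrated with a toy model. This is not systemd code, just a sketch under the assumption that stopping a unit also queues a stop job for every unit that Requires it, directly or indirectly. The function name `stop_closure` and the unit names in the example are hypothetical.

```python
from collections import defaultdict, deque


def stop_closure(unit, requires):
    """Return the units that get a stop job when `unit` is stopped.

    `requires` maps a unit name to the set of units it Requires=.
    Toy model only: real systemd also considers BindsTo=, PartOf=,
    ordering dependencies and job merging.
    """
    # Invert the mapping: for each unit, who requires it?
    required_by = defaultdict(set)
    for dependent, dependencies in requires.items():
        for dependency in dependencies:
            required_by[dependency].add(dependent)

    order, seen, queue = [], {unit}, deque([unit])
    while queue:
        current = queue.popleft()
        order.append(current)
        for dependent in sorted(required_by[current]):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return order


if __name__ == "__main__":
    # Hypothetical unit names mirroring the situation described above.
    requires = {"dev-mapper-swap.device": {"systemd-cryptsetup@swap.service"}}
    print(stop_closure("systemd-cryptsetup@swap.service", requires))
```

In this model, stopping the cryptsetup service pulls in a stop job for the device unit, and since a device unit cannot actually be stopped by systemd, that job is the one reported as "Stopped (with error)".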

@jnturton

@arvidjaar, given your above diagnosis, do you know how we can fix or work around this problem?

@Vrihub

Vrihub commented Jan 15, 2017

FYI: I'm having the same problem on all my LUKS encrypted devices (the underlying devices are LVM logical volumes), on Debian stretch, systemd version 232-8.

I don't have UUIDs in /etc/fstab, so the workaround discussed above doesn't work in my case.

@keszybz removed the needs-reporter-feedback label Jul 9, 2017
@keszybz
Member

keszybz commented Jul 10, 2017

@mbiebl I tried reproducing the bug using your recipe from #1620 (comment). I used the normal stretch amd64 installer, and got the same partition layout that you did, except with sda instead of vda, but I doubt that matters. It has systemd-232-5 and shuts down cleanly.

Can you maybe upload the image that exhibits the issue somewhere where I can download it?

--

Looking at the dependency graph, systemd-cryptsetup@vda5_crypt.service has After=dev-disk-by\x5cx2duuid-dd643cf7\x5cx2d01c4\x5cx2d497e\x5cx2db7a8\x5cx2dc8004fc8d8e3.device, and RequiredBy=dev-mapper-sda5_crypt.device, but there's no Before=dev-mapper-sda5_crypt.device. So when shutting down, systemd could run the stop job for systemd-cryptsetup@.service before the stop job for the device(s) that depend on it.

I did systemctl stop systemd-cryptsetup@vda5_crypt.service and the logs show:

Jul 09 22:31:27 debianx systemd[1]: Stopped target Encrypted Volumes.
Jul 09 22:31:27 debianx systemd[1]: Stopping Cryptography Setup for sda5_crypt...
Jul 09 22:31:32 debianx systemd-cryptsetup[857]: Failed to deactivate: Device or resource busy
Jul 09 22:31:32 debianx systemd[1]: systemd-cryptsetup@sda5_crypt.service: Control process exited, code=exited status=1
Jul 09 22:31:32 debianx systemd[1]: Stopped Cryptography Setup for sda5_crypt.
Jul 09 22:31:32 debianx systemd[1]: systemd-cryptsetup@sda5_crypt.service: Unit entered failed state.
Jul 09 22:31:32 debianx systemd[1]: systemd-cryptsetup@sda5_crypt.service: Failed with result 'exit-code'.
Jul 09 22:32:57 debianx systemd[1]: sys-devices-virtual-block-dm\x2d0.device: Job sys-devices-virtual-block-dm\x2d0.device/stop timed out.
Jul 09 22:32:57 debianx systemd[1]: Timed out stopping /sys/devices/virtual/block/dm-0.
Jul 09 22:32:57 debianx systemd[1]: sys-devices-virtual-block-dm\x2d0.device: Job sys-devices-virtual-block-dm\x2d0.device/stop failed with result 'timeout'.
Jul 09 22:32:57 debianx systemd[1]: dev-disk-by\x2did-lvm\x2dpv\x2duuid\x2dn9ZuZI\x2d9tgU\x2d2Yoj\x2dirT8\x2dR4Bf\x2dSsGw\x2da6eoYP.device: Job dev-disk-by\x2did-lvm\x2dpv\x2duuid\x2dn9ZuZI\x2d9tgU\x2
Jul 09 22:32:57 debianx systemd[1]: Timed out stopping /dev/disk/by-id/lvm-pv-uuid-n9ZuZI-9tgU-2Yoj-irT8-R4Bf-SsGw-a6eoYP.
Jul 09 22:32:57 debianx systemd[1]: dev-disk-by\x2did-lvm\x2dpv\x2duuid\x2dn9ZuZI\x2d9tgU\x2d2Yoj\x2dirT8\x2dR4Bf\x2dSsGw\x2da6eoYP.device: Job dev-disk-by\x2did-lvm\x2dpv\x2duuid\x2dn9ZuZI\x2d9tgU\x2
Jul 09 22:32:57 debianx systemd[1]: dev-dm\x2d0.device: Job dev-dm\x2d0.device/stop timed out.
Jul 09 22:32:57 debianx systemd[1]: Timed out stopping /dev/dm-0.
Jul 09 22:32:57 debianx systemd[1]: dev-dm\x2d0.device: Job dev-dm\x2d0.device/stop failed with result 'timeout'.
Jul 09 22:32:57 debianx systemd[1]: dev-disk-by\x2did-dm\x2dname\x2dsda5_crypt.device: Job dev-disk-by\x2did-dm\x2dname\x2dsda5_crypt.device/stop timed out.
Jul 09 22:32:57 debianx systemd[1]: Timed out stopping /dev/disk/by-id/dm-name-sda5_crypt.
Jul 09 22:32:57 debianx systemd[1]: dev-disk-by\x2did-dm\x2dname\x2dsda5_crypt.device: Job dev-disk-by\x2did-dm\x2dname\x2dsda5_crypt.device/stop failed with result 'timeout'.
Jul 09 22:32:57 debianx systemd[1]: dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dLUKS1\x2ddd643cf701c4497eb7a8c8004fc8d8e3\x2dsda5_crypt.device: Job dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dLUKS1\x2ddd643
Jul 09 22:32:57 debianx systemd[1]: Timed out stopping /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-dd643cf701c4497eb7a8c8004fc8d8e3-sda5_crypt.
Jul 09 22:32:57 debianx systemd[1]: dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dLUKS1\x2ddd643cf701c4497eb7a8c8004fc8d8e3\x2dsda5_crypt.device: Job dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dLUKS1\x2ddd643

Essentially, everything fails to stop, because stuff is mounted and devices are used, but systemd doesn't seem to know that.

@arvidjaar
Contributor

systemd could run the stop job for systemd-cryptsetup@.service before the stop job for the device(s) that depend on it

The stop job for a device just waits for the device to "disappear", so unless timeouts are set very low or cryptsetup takes very long, it should not matter in this case.

This error is returned when the "stop" job for a device completes without a timeout but the device state at the time of completion is not "stopped" (or the device is still present in the system from systemd's point of view). One possible cause is a systemd reload, as I have demonstrated earlier. This is a bug that needs fixing. It is possible that some device event causes premature job completion, but I could not find it by code review.

All those countless "me too"s do not help. We are facing a race condition, so unless there is a clearly understood root cause with a deterministic reproducer, it is impossible to claim it has been fixed. Relative timings may change from boot to boot, not to mention from version to version.

keszybz added a commit to keszybz/systemd that referenced this issue Jul 10, 2017
This simply adds one line to the generated unit:

  [Unit]
  Before=cryptsetup.target
  BindsTo=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
  After=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
  Before=umount.target
+ Before=dev-mapper-luks\x2d8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
  BindsTo=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
  After=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
  Before=umount.target

Might fix systemd#1620 — please test!
@keszybz
Member

keszybz commented Jul 10, 2017

@arvidjaar it's likely that we're dealing with more than one bug here.

I started working on a patch to test my theory about the missing Before= dep, but now I see that won't work, because Before=*.device deps are not allowed. But I think I might be onto something: I don't see how systemd should know that systemd-cryptsetup@.service should stay around until all devices it creates are unmounted. Afaics, there's no ordering that goes .mount → luks.device → cryptsetup@.service → dev-disk.device.

@mbiebl
Contributor

mbiebl commented Jul 10, 2017

@keszybz I can no longer reproduce it

@arvidjaar
Contributor

@keszybz

it's likely that we're dealing with more than one bug here

This issue is about a quite specific error message. Users see this error message in conjunction with cryptsetup simply because cryptsetup is one of the very few cases where a device is (attempted to be) actively stopped. But cryptsetup itself does not cause this error. I completely agree that we have a rather bad situation with the proper ordering of units here, but this needs its own report that makes it obvious, not buried in this hundred-page-long one.

@mbiebl I can still easily trigger this error using 233 on openSUSE Tumbleweed. If you mean you do not see it during real shutdown - well, that's the nature of all races ... :)

@rumpelsepp
Author

rumpelsepp commented Jul 19, 2017

@keszybz I can no longer reproduce it

The problem seems to be gone now, at least on my machine. The error disappeared after some update. I am on Debian testing/sid:

$ systemctl --version
systemd 233
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN default-hierarchy=hybrid
$ apt show systemd
Package: systemd
Version: 233-10
...

@DarkCaster

I have this issue with a fresh installation of openSUSE 42.3 on my production servers.
While we are waiting for a fix, I've created a small workaround for this issue; it seems sufficient for my case. If you are brave enough you can try it too: https://github.com/DarkCaster/Systemd-Issue1620-Workaround

@jugovic

jugovic commented Oct 12, 2017

I have the same issue, using Ubuntu GNOME 17.04. Absolutely the same thing: an encrypted partition and the swap issue.

Is there any solution (REAL SOLUTION) for this problem??

I will gladly pay someone to make it... if someone would like to do so.

All the best & please help folks

@zifxify

zifxify commented Oct 14, 2017

Tried systemd 234-3~bpo9+1 and 232-25+deb9u1, same results

NAME           MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda              8:0    0 238.5G  0 disk  
└─sda1           8:1    0 235.9G  0 part  
  └─lvm        254:0    0 235.9G  0 crypt 
    ├─vg0-swap 254:1    0    16G  0 lvm   [SWAP]
    ├─vg0-root 254:2    0    25G  0 lvm   /
    ├─vg0-home 254:3    0   100G  0 lvm   /home
    ├─vg0-var  254:4    0    15G  0 lvm   /var
    └─vg0-qemu 254:5    0    60G  0 lvm

Oct 14 11:55:01 kwicero systemd[1682]: systemd-cryptsetup@lvm.service: Executing: /lib/systemd/systemd-cryptsetup detach lvm
Oct 14 11:55:01 kwicero systemd[1683]: systemd-random-seed.service: Executing: /lib/systemd/systemd-random-seed save
Oct 14 11:55:01 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:01 kwicero systemd[1685]: systemd-backlight@leds:dell::kbd_backlight.service: Executing: /lib/systemd/systemd-backlight save leds:dell::kbd_backlight
Oct 14 11:55:01 kwicero systemd-timesyncd[724]: Removed server 0.debian.pool.ntp.org.
Oct 14 11:55:01 kwicero systemd-timesyncd[724]: Removed server 1.debian.pool.ntp.org.
Oct 14 11:55:01 kwicero systemd-timesyncd[724]: Removed server 2.debian.pool.ntp.org.
Oct 14 11:55:01 kwicero systemd-timesyncd[724]: Removed server 3.debian.pool.ntp.org.
Oct 14 11:55:01 kwicero systemd[1687]: dev-mapper-vg0\x2dswap.swap: Executing: /sbin/swapoff /dev/mapper/vg0-swap
Oct 14 11:55:01 kwicero systemd[1688]: systemd-update-utmp.service: Executing: /lib/systemd/systemd-update-utmp shutdown
Oct 14 11:55:01 kwicero systemd-update-utmp[1688]: systemd-update-utmp running as pid 1688
Oct 14 11:55:01 kwicero systemd-update-utmp[1688]: systemd-update-utmp stopped as pid 1688
Oct 14 11:55:02 kwicero systemd[1701]: home-zifxify-Bestanden-3._Werk.mount: Executing: /bin/umount /home/zifxify/Bestanden/3._Werk
Oct 14 11:55:02 kwicero systemd[1702]: tmp.mount: Executing: /bin/umount /tmp
Oct 14 11:55:02 kwicero systemd[1703]: run-user-1000.mount: Executing: /bin/umount /run/user/1000
Oct 14 11:55:02 kwicero systemd[1704]: var.mount: Executing: /bin/umount /var
Oct 14 11:55:02 kwicero systemd[1709]: home.mount: Executing: /bin/umount /home
Oct 14 11:55:02 kwicero systemd[1712]: lvm2-monitor.service: Executing: /sbin/lvm vgchange --monitor n --ignoreskippedcluster
Oct 14 11:55:02 kwicero lvm[1712]:   5 logical volume(s) in volume group "vg0" unmonitored
Oct 14 11:55:02 kwicero lvmetad[392]: Failed to accept connection errno 11.
Oct 14 11:55:02 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:02 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:02 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:02 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:02 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:03 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:03 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:03 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:03 kwicero systemd-cryptsetup[1682]: device-mapper: remove ioctl on lvm failed: Device or resource busy
Oct 14 11:55:06 kwicero kernel: systemd-cryptse: 17 output lines suppressed due to ratelimiting
Oct 14 11:55:06 kwicero systemd[1]: systemd-journald.service: Received EPOLLHUP on stored fd 33 (stored), closing.
Oct 14 11:55:06 kwicero systemd[1]: Received SIGCHLD from PID 1682 (systemd-cryptse).
Oct 14 11:55:06 kwicero systemd[1]: Child 1682 (systemd-cryptse) died (code=exited, status=1/FAILURE)
Oct 14 11:55:06 kwicero systemd[1]: systemd-cryptsetup@lvm.service: Child 1682 belongs to systemd-cryptsetup@lvm.service
Oct 14 11:55:06 kwicero systemd[1]: systemd-cryptsetup@lvm.service: Control process exited, code=exited status=1
Oct 14 11:55:06 kwicero systemd[1]: systemd-cryptsetup@lvm.service: Got final SIGCHLD for state stop.
Oct 14 11:55:06 kwicero systemd[1]: systemd-cryptsetup@lvm.service: Changed stop -> failed
Oct 14 11:55:06 kwicero systemd[1]: systemd-cryptsetup@lvm.service: Job systemd-cryptsetup@lvm.service/stop finished, result=done
Oct 14 11:55:06 kwicero systemd[1]: Stopped Cryptography Setup for lvm.
Oct 14 11:55:06 kwicero systemd[1]: systemd-cryptsetup@lvm.service: Unit entered failed state.
Oct 14 11:55:06 kwicero systemd[1720]: systemd-poweroff.service: Executing: /bin/systemctl --force poweroff
Oct 14 11:55:06 kwicero systemctl[1720]: Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 object=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=PowerOff cookie=1 reply_cookie=0 error=n/a
Oct 14 11:55:06 kwicero systemctl[1720]: Got message type=method_return sender=n/a destination=n/a object=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 error=n/a
Oct 14 11:55:06 kwicero kernel: systemd-shutdow: 4636 output lines suppressed due to ratelimiting
Oct 14 11:55:06 kwicero systemd-journald[352]: Journal stopped

@vsespb

vsespb commented Nov 5, 2017

This happens on Ubuntu 16.04 (only visible in the logs!) and Debian 9 (either a 90-second delay or red lines in the logs).

Is there any workaround to minimize the chance of data loss, e.g. syncing the disks?

@wavexx

wavexx commented Nov 5, 2017 via email

@vsespb

vsespb commented Nov 6, 2017

Is there at least a way to set up a new system with LUKS with a workaround that makes it safe against data loss? For example, maybe use separate partitions for root, /var and /tmp, and remount root as read-only (or is it already remounted?) before shutdown?

@mbiebl
Contributor

mbiebl commented Nov 6, 2017

I'm not aware that this is actually causing data loss

@wavexx

wavexx commented Nov 6, 2017 via email

@mbiebl
Contributor

mbiebl commented Nov 6, 2017

@wavexx where exactly does systemd fail to unmount a filesystem or remount it ro?

@wavexx

wavexx commented Nov 6, 2017 via email

@mbiebl
Contributor

mbiebl commented Nov 6, 2017

@wavexx which log shows that the file system is not unmounted? I don't see a complete log anywhere which shows that the file system was not unmounted

@wavexx

wavexx commented Nov 7, 2017 via email

@mbiebl
Contributor

mbiebl commented Nov 12, 2017

@wavexx please follow the instructions at https://freedesktop.org/wiki/Software/systemd/Debugging
Specifically "Diagnosing Shutdown Problems". Adjust the kernel parameters (to avoid the kernel rate limiting). Keep in mind that on Debian/Ubuntu the debug.sh script location should be /lib/systemd/system-shutdown/debug.sh

@medhefgo
Contributor

I think #8472 is a duplicate of this.

@wavexx

wavexx commented Jul 10, 2018

I'm still affected by this, btw. On multiple laptops I have access to I have no issues, however the original system I reported is still not unmounting correctly on shutdown, causing data loss at every reboot. This happens only on the root mount, as there is another luks partition which is unmounted correctly. That system has been upgraded to cryptsetup 2 and the issue persists.

The system is running on a network I have no direct access to, making it impossible to capture the journal before the shutdown. On top of that, it has some services that cannot go offline for long, and I haven't had time to replicate the entire setup on another identical system for in-depth testing.

At this point I'm basically relying on the filesystem's own journal to recover.

@medhefgo
Contributor

Well, if the late shutdown logic does not catch this, then there's likely a bug in that too. There have been quite a few changes to that recently; can you give the latest version a try? Are you booting with or without an initramfs? And some full shutdown debug logs would be really helpful too.

Also, if you take a look at #8472, there's a workaround: Adding x-systemd.requires=systemd-cryptsetup@.service (adjust the service name) to the file system's fstab options should result in the proper shutdown sequence.
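
For anyone wanting to try this workaround, the edit amounts to appending one option to the fourth field of the relevant fstab line. The following is a minimal sketch of that edit, not a tested tool: the function name `add_requires_option`, the example fstab line, and the service name `systemd-cryptsetup@home.service` are all hypothetical and must be adapted to the actual mapping name on the system.

```python
def add_requires_option(fstab_line, service):
    """Append x-systemd.requires=<service> to the options field of one fstab line.

    Comment lines, blank lines and lines without an options field are
    returned unchanged. Fields are re-joined with tabs.
    """
    stripped = fstab_line.strip()
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    if len(fields) < 4:
        return fstab_line  # no options field to extend
    option = "x-systemd.requires=" + service
    options = fields[3].split(",")
    if option not in options:
        options.append(option)
    fields[3] = ",".join(options)
    return "\t".join(fields)


if __name__ == "__main__":
    # Hypothetical fstab entry and service name, for illustration only.
    line = "/dev/mapper/home /home ext4 defaults 0 2"
    print(add_requires_option(line, "systemd-cryptsetup@home.service"))
```

After changing /etc/fstab, the generated mount units are only refreshed on `systemctl daemon-reload` or at the next boot, so the effect would show up on the shutdown after that.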

@wavexx

wavexx commented Jul 10, 2018 via email

@medhefgo
Contributor

I am talking about late shutdown logs to see why they also fail to clean things up. And yours are either not there or the systemd is too old.

@StrawberryJuice

I was also encountering this issue with a freshly installed latest version of Debian 9 Stretch.
I had set up several encrypted partitions with LVM+LUKS.

Only the partition defined as swap was giving errors for me.
With the following steps it has been working for me consistently across reboots:

  1. Temporarily disable (comment out) the entry for the swap partition in /etc/fstab
  2. Check that the entry for the swap partition is indeed correct in /etc/crypttab
  3. Run update-initramfs -u -k all
  4. Enable the entry for the swap partition again
  5. Reboot the system

Once the entry for the swap partition in /etc/fstab is enabled again, running update-initramfs -u -k all should make no difference.

PS: I'm using the default systemd that comes with this installation.

$ systemctl --version
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN
$ apt-cache policy systemd
systemd:
  Installed: 232-25+deb9u9
  Candidate: 232-25+deb9u9
  Version table:
 *** 232-25+deb9u9 500
        500 http://security.debian.org/debian-security stretch/updates/main amd64 Packages
        100 /var/lib/dpkg/status

@Sieboldianus

Sieboldianus commented Mar 9, 2019

Not sure if this fits here, but I'll report it anyway. My GPT header got corrupted many times until I found out that active Samba connections to my LUKS-encrypted volume on Ubuntu 18.04 prevented secure unmounting of the volume on shutdown. I saw log entries similar to the OP's above (unfortunately, I can't find them anymore). Since I started manually shutting down Samba before locking the LUKS volume and shutting down, everything works fine, and I have not had a corrupted GPT table again (for the last 3 months).
