Stopped (with error) /dev/dm-1 #1620
Could you please boot with "systemd.log_level=debug", then reboot and shut down, and provide us with a longer excerpt of the logs this generates, around the issue you are trying to debug? This should give us more information about what precisely is failing there. |
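Before collecting logs it can help to confirm that the debug option actually reached the kernel command line. The following is a minimal sketch, assuming a Linux system where `/proc/cmdline` is readable; the function name `systemd_log_level` is made up for illustration, and it splits on whitespace only, so quoted values containing spaces are not handled.

```python
from typing import Optional


def systemd_log_level(cmdline: str) -> Optional[str]:
    """Return the value of systemd.log_level= from a kernel command line.

    Returns None when the option is absent. If the option appears more
    than once, the last occurrence is returned.
    """
    level = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "systemd.log_level" and sep:
            level = value
    return level
```

To check the running system, pass it the contents of `/proc/cmdline`, for example `systemd_log_level(open("/proc/cmdline").read())`, and compare the result with `"debug"`.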
sure, will look if I have a few minutes time for that this afternoon. |
Enough?
|
Devices do not even have a stop method, so failure is pretty much expected here; but why does it attempt to stop the device in the first place? Can you make the full log available? |
I prefer to send it to an email address, but yeah, I can do. |
i just did a reboot with debug enabled. here are 12k lines of glorious log (it starts from logind noticing that we're rebooting, but i can give you more if necessary): https://gist.github.com/e032493159fbc8cda780f773c261dc03 |
I get this too, for dm-0 via ecryptfs-setup-swap, on Debian unstable. Am I correct to infer that this is harmless and the fix is to deactivate a nonsensical 'stop device' service somewhere? How would I do that? Should we file bugs somewhere else against this? thanks |
I get this "Stopped (with error) /dev/mapper/home" if fstab line uses UUID=, otherwise no error if /dev/mapper/home is used. |
On Debian Stable, with systemd 230-7~bpo8+2, I got this error as well on a /home in RAID1 configuration on top of dm-crypt/LUKS.
Using only UUID designations in /etc/fstab and /etc/crypttab I got the errors:
So for me the workaround doesn't work. |
@db0451 did you find out anything regarding your question? I'm also interested in whether this is harmless or in fact something that I should look into ASAP. |
@denibertovic The only info I have on this issue is this thread. This continues to occur for me, though admittedly my system seems to work OK anyway. But it is always worrying to see red ERROR text anywhere, so if this is ultimately a non-issue and can/should be worked around, then I wish it would be. |
I did not find it harmless. I found my disks would be dirty on the next boot often because they did not unmount cleanly, with errors introduced that required fixing. And this is really difficult to fix on systemd systems because even in "single user" mode, you can't really remount your root directory ro. And that's when I determined that LUKS + bcache + systemd were not, for the moment, something I should be trying to use on the same system. |
@iKarith Thanks for the info. As a workaround, I will therefore unmount manually, before shutting down (or include an unmount script somewhere). |
I also encounter |
I too have had this issue for a year. |
I have the same use case as @ledoge. I already tried temporarily using device names (sda5) in the fstab / crypttab files. The same error message appears. Here is the log regarding this error during shutdown. |
I also had the "Stopped (with error) /dev/dm.." problem with encrypted swap, then found that "generator" doesn't add deps to swap unit in /var/run/systemd/generator. Instead of crypttab/fstab entries I just wrote simple .service and .swap units and now it stops w/out error. Examples from my local computer (HITACHI-SWAP - LVM volume, HITACHI-SWAP-OPEN - encrypted swap on HITACHI-SWAP): /etc/systemd/system/systemd-cryptsetup@HITACHI\x2dSWAP\x2dOPEN.service:
/etc/systemd/system/dev-mapper-HITACHI\x2dSWAP\x2dOPEN.swap:
(added to the default swap.target with |
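For readers puzzled by the `\x2d` sequences in the unit file names above: systemd derives unit names from paths by escaping them, and the real tool for this is `systemd-escape --path`. The sketch below approximates that escaping in Python so the mapping is easier to follow. The function name `systemd_escape_path` is hypothetical, and the sketch does not resolve `.` or `..` path components, so treat it as an illustration rather than a replacement for the real tool.

```python
import string


def systemd_escape_path(path: str) -> str:
    """Approximate `systemd-escape --path` for an absolute path."""
    # Drop empty components so leading, trailing, and duplicate slashes vanish.
    parts = [p for p in path.split("/") if p]
    if not parts:
        return "-"  # the root directory is encoded as a single dash
    trimmed = "/".join(parts)
    allowed = set(string.ascii_letters + string.digits + ":_.")
    out = []
    for i, byte in enumerate(trimmed.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif ch in allowed and not (i == 0 and ch == "."):
            out.append(ch)  # safe characters pass through unchanged
        else:
            out.append("\\x%02x" % byte)  # everything else becomes \xNN
    return "".join(out)
```

With this, `/dev/mapper/HITACHI-SWAP-OPEN` maps to the name used for the `.swap` unit above once the `.swap` suffix is appended, because each literal `-` (byte 0x2d) has to be escaped to keep it distinct from the dashes that stand for `/`.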
Same issue here with an encrypted swap partition. |
I am also seeing these errors on Debian testing on two systems with root and /home on an encrypted LVM. One also has swap on that LVM, one has no swap. @poettering Why is the tag "needs-reporter-feedback" still set? Which feedback is missing? I'd be happy to provide more data if helpful. |
same issue here with encrypted swap partition, ubuntu 16.04. |
same on ubuntu 16.10 |
same error on Debian 8.6 |
Also having this issue on ubuntu 16.10 / systemd 231 / LVM on LUKS on mdadm/Mirrored Raid 1 Can a project member please remove the "needs-reporter-feedback" tag? Not sure if this tag is preventing developers from getting started on this bug, but I believe all the necessary info has already been supplied. If not, please let us know how we can help. |
@sky-lake Well, we do not know what info was requested ... :) As far as I can tell, the reason for this error message is the fact, that crypto container device |
@arvidjaar, given your above diagnosis, do you know how we can fix or work around this problem? |
FYI: I'm having the same problem on all my LUKS encrypted devices (the underlying devices are LVM logical volumes), on Debian stretch, systemd version 232-8. I don't have UUIDs in /etc/fstab, so the workaround discussed above doesn't work in my case. |
@mbiebl I tried reproducing the bug using your recipe from #1620 (comment). I used the normal stretch amd64 installer, and got the same partition layout that you did, except with sda instead of vda, but I doubt that matters. It has systemd-232-5 and shuts down cleanly. Can you maybe upload the image that exhibits the issue somewhere where I can download it? -- Looking at the dependency graph, systemd-cryptsetup@vda5_crypt.service has I did
Essentially, everything fails to stop, because stuff is mounted and devices are used, but systemd doesn't seem to know that. |
stop job for device just waits for device to "disappear" so unless timeouts are set very low or cryptsetup takes very long it should not matter in this case. This error is returned when "stop" job for device completes without timeout but device state at time of completion is not "stopped" (or device is still present in system from systemd PoV). One possible cause is systemd reload as I have demonstrated earlier. This is a bug that needs fixing. It is possible that some device event may cause premature job completion, I could not find it by code review. All those countless "me too"s do not help - we are facing race condition so unless there is clearly understood root cause with deterministic reproducer it is impossible to claim it has been fixed. Relative timings may change from boot to boot, not to mention from version to version. |
This simply adds one line to the generated unit:
[Unit]
Before=cryptsetup.target
BindsTo=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
After=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
Before=umount.target
+ Before=dev-mapper-luks\x2d8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
BindsTo=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
After=dev-disk-by\x2duuid-8db85dcf\x2d6230\x2d4e88\x2d940d\x2dba176d062b31.device
Before=umount.target
Might fix systemd#1620, please test!
@arvidjaar it's likely that we're dealing with more than one bug here. I started working on patch to test my theory about missing Before= dep, but now I see that won't work, because Before=*.device deps are not allowed. But I think I might be onto something — I don't see how systemd should know that the systemd-cryptsetup@.service should stay around until all devices it creates are unmounted. Afaics, there's no ordering that goes |
@keszybz I can no longer reproduce it |
This issue is about quite specific error message. Users see this error message in conjunction with cryptsetup simply because cryptsetup is one of very few cases when device is (attempted to be) actively stopped. But cryptsetup itself does not cause this error. I completely agree that we have rather bad situation with proper ordering of units here but this needs its own report that makes it obvious, not buried in this hundred pages long one. @mbiebl I can still easily trigger this error using 233 on openSUSE Tumbleweed. If you mean you do not see it during real shutdown - well, that's the nature of all races ... :) |
Problem seems to be gone now, at least on my machine. Since some update the error disappeared. I am on debian testing/sid:
|
I have this issue with a fresh installation of openSUSE 42.3 on my production servers. |
I have the same issue, using Ubuntu Gnome 17.04 .................. Absolutely same thing. Crypted Partition and Swap issue..... Is there any solution (REAL SOLUTION) for this problem?? I will gladly pay someone to make it... if someone would like to do so. All the best & please help folks |
Tried systemd 234-3~bpo9+1 and 232-25+deb9u1, same results
|
happens in ubuntu 16.04 (only visible in logs!) and debian 9 (either delay 90 seconds or red lines in logs). is there any workaround to minimize chance of data loss ? i.e. sync disks? |
On Sun, Nov 05 2017, Victor Efimov wrote:
happens in ubuntu 16.04 (only visible in logs!) and debian 9 (either
delay 90 seconds or red lines in logs).
I discovered while testing that when using dracut instead of
initramfs-tools, the fsck output is completely hidden from the console.
Lovely.
|
Is there at least a way to set up a new system with LUKS, with a workaround for this issue that makes it safe against data loss? For example, maybe use separate partitions for root, /var and /tmp, and remount root as read-only (or is it already remounted?) before shutdown? |
I'm not aware that this is actually causing data loss |
On Mon, Nov 06 2017, Michael Biebl wrote:
I'm not aware that this is actually causing data loss
Isn't that implicit when the FS is not unmounted cleanly?
Sure, the FS is journaled and recovers quickly, but data which has not
been synced will get lost. The last journal is regularly moved away
because of this, with last entries being missing.
With recent systemd versions there's no error message anymore in the
console output. However, the problem persists: the FS is not unmounted
or remounted=ro before shutdown.
It's still a mystery to me why some systems that I've set up with
identical cryptsetup and partition layout exhibit this problem, while
others don't.
|
@wavexx where exactly does systemd fail to unmount a filesystem or remount it ro? |
On Mon, Nov 06 2017, Michael Biebl wrote:
@wavexx where exactly does systemd fail to unmount a filesystem or
remount it ro?
See my old journal/log that I've posted as well the old posts.
Aside from the fact that the error is gone from the console, the issue
still persist today.
The net result is that during boot the filesystem is found dirty just
after cryptsetup mounts it. I wrote already earlier that since the fsck
is quick and the output can be hidden by dracut as well, many people
will *not* notice that this is happening at all.
|
@wavexx which log shows that the file system is not unmounted? I don't see a complete log anywhere which shows that the file system was not unmounted |
On Mon, Nov 06 2017, Michael Biebl wrote:
@wavexx which log shows that the file system is not unmounted? I don't
see a complete log anywhere which shows that the file system was not
unmounted
The console doesn't show any error that this is happening, except the
mount error on the next boot.
I couldn't manage to save the journal on a remote system where
this is happening all the time (the system is on a different
network), and on the system itself the journal is corrupted.
Turning on debug on the console doesn't show anything useful either.
There's no dump of commands as being executed, which would be incredibly
useful.
|
@wavexx please follow the instructions at https://freedesktop.org/wiki/Software/systemd/Debugging |
I think #8472 is a duplicate of this. |
I'm still affected by this, btw. On multiple laptops I have access to I have no issues; however, the system I originally reported is still not unmounting correctly on shutdown, causing data loss at every reboot. This happens only on the root mount, as there is another LUKS partition which is unmounted correctly. That system has been upgraded to cryptsetup 2 and the issue persists. The system is running on a network I have no direct access to, making it impossible to capture the journal before the shutdown. On top of that it has some services that cannot go offline for long, and I didn't have time to replicate the entire setup on another identical system for in-depth testing. At this point I'm basically relying on the filesystem's own journal to recover. |
Well, if the late shutdown logic does not catch this, then there's likely a bug in that too. There have been quite a few changes in that recently; can you give the latest version a try? Are you booting with or without an initramfs? And some full shutdown debug logs would be really helpful too.... Also, if you take a look at #8472, there's a workaround: Adding x-systemd.requires=systemd-cryptsetup@.service (adjust the service name) to the file system's fstab options should result in the proper shutdown sequence. |
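To make the fstab workaround concrete, here is a minimal sketch that appends the `x-systemd.requires=` option to the options field (the fourth field) of a single fstab line. The helper name `add_requires_option`, the example fstab line, and the service name `systemd-cryptsetup@home.service` are all assumptions for illustration; substitute the instance name that matches your own crypttab entry.

```python
def add_requires_option(fstab_line: str, service: str) -> str:
    """Append x-systemd.requires=<service> to the options of one fstab line.

    Comment lines and lines with fewer than four fields are returned
    unchanged. Fields are re-joined with single spaces.
    """
    fields = fstab_line.split()
    if len(fields) < 4 or fields[0].startswith("#"):
        return fstab_line  # comment or malformed entry: leave it alone
    option = f"x-systemd.requires={service}"
    options = fields[3].split(",")
    if option not in options:  # keep the edit idempotent
        options.append(option)
    fields[3] = ",".join(options)
    return " ".join(fields)


# Hypothetical entry for an encrypted /home; adjust device and service name.
line = "/dev/mapper/home /home ext4 defaults 0 2"
print(add_requires_option(line, "systemd-cryptsetup@home.service"))
```

The sketch only rewrites a string; review the result before writing it to /etc/fstab, and run `systemctl daemon-reload` afterwards so the generated mount units pick up the change.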
On Tue, Jul 10 2018, Jan Janssen wrote:
Well, if the late shutdown logic does not catch this, then there's
likely a bug in that too. There's been quite some changes in that
recently; can you give the latest version a try? Are you booting with
I'm using systemd 239, with no change.
or without a initramfs? And some full shutdown debug logs would be
really helpful too....
I attached them earlier, and the shutdown sequence is still unchanged
even with the current version, so please refer to the old logs.
If you look, the issue is still the same: the crypto container is
stopped before unmounting the filesystem.
See: #1620 (comment)
|
I am talking about late shutdown logs, to see why the late shutdown logic also fails to clean things up. And yours are either not there or the systemd version is too old. |
I was also encountering this issue with a freshly installed, up-to-date Debian 9 Stretch. Only the partition defined as swap was giving errors for me.
When the entry for the swap partition in the /etc/fstab is enabled again, running Ps. I'm using the default systemd that comes with this installation
|
Not sure if this fits here, but I'll report it anyway. My GPT header got corrupted many times until I found out that active Samba connections to my luks-encrypted volume on Ubuntu 18.04 prevented secure unmounting of the volume on shutdown. I saw similar entries in the log as the OP above (unfortunately, I can't find those anymore). Since manually shutting down samba before locking the luks volume and shutting down, everything works fine and I never again (for the last 3 months) had a corrupted GPT table. |
I am using full disk encryption and an encrypted swap partition. On every shutdown I get the following Error: