dm-crypt corruption issues (?) #200

Closed
flokli opened this issue May 15, 2024 · 32 comments · Fixed by #202

Comments

@flokli
Contributor

flokli commented May 15, 2024

In the last few days I've been running into a bunch of btrfs corruption issues on my Macbook M2 Air. I initially suspected a single fluke, but it got worse.

Yesterday I entirely re-created the filesystem (luks with --allow-discards), then mkfs.btrfs with default params, and again got btrfs errors.

It seems I can rule out the internal SSD, as the same issue also happens on a (somewhat reliable and fast) external SSD (formatted with LUKS and btrfs).

Opening filesystem to check...
Checking filesystem on /dev/mapper/usb
UUID: a4d7d051-44ee-4512-bcfd-3b634526b02a
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking csums against data
mirror 1 bytenr 23302864896 csum 0xc296d77c expected csum 0xe5e91fb3
ERROR: errors found in csum tree
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 42043219968 bytes used, error(s) found
total csum bytes: 40380008
total tree bytes: 685703168
total fs tree bytes: 570392576
total extent tree bytes: 63324160
btree space waste bytes: 119804959
file data blocks allocated: 41357516800
 referenced 41357484032

This was after copying my /nix/store from the host to /mnt, and unmounting.

dmesg of the host:

[    6.192083] BTRFS: device label root devid 1 transid 41 /dev/disk/by-label/root scanned by mount (530)
[    6.192241] BTRFS info (device dm-0): first mount of filesystem 5eaac3f0-833c-4f0f-b6f5-df3eb94e4327
[    6.192248] BTRFS info (device dm-0): using crc32c (crc32c-generic) checksum algorithm
[    6.192252] BTRFS info (device dm-0): forcing free space tree for sector size 4096 with page size 16384
[    6.192254] BTRFS info (device dm-0): using free-space-tree
[    6.192254] BTRFS warning (device dm-0): read-write for sector size 4096 with page size 16384 is experimental
[    6.200521] BTRFS info (device dm-0): checking UUID tree
[    6.740028] systemd-journald[807]: Creating journal file /var/log/journal/bbe02739e577495c999bfebef448138d/system.journal on a btrfs file system, and copy-on-write is enabled. This is likely to slow down journal access substantially, please consider turning off the copy-on-write file attribute on the journal directory, using chattr +C.
[    6.817857] BTRFS info: devid 1 device path /dev/disk/by-label/root changed to /dev/dm-0 scanned by (udev-worker) (936)
[ 7896.890101] BTRFS: device fsid a4d7d051-44ee-4512-bcfd-3b634526b02a devid 1 transid 6 /dev/mapper/usb scanned by mount (17224)
[ 7896.890774] BTRFS info (device dm-2): first mount of filesystem a4d7d051-44ee-4512-bcfd-3b634526b02a
[ 7896.890802] BTRFS info (device dm-2): using crc32c (crc32c-generic) checksum algorithm
[ 7896.890811] BTRFS info (device dm-2): forcing free space tree for sector size 4096 with page size 16384
[ 7896.890816] BTRFS info (device dm-2): using free-space-tree
[ 7896.890819] BTRFS warning (device dm-2): read-write for sector size 4096 with page size 16384 is experimental
[ 7896.892854] BTRFS info (device dm-2): checking UUID tree
[ 7896.893323] BTRFS info (device dm-2): last unmount of filesystem a4d7d051-44ee-4512-bcfd-3b634526b02a
[ 7902.339423] BTRFS: device fsid a4d7d051-44ee-4512-bcfd-3b634526b02a devid 1 transid 8 /dev/mapper/usb scanned by mount (17294)
[ 7902.340267] BTRFS info (device dm-2): first mount of filesystem a4d7d051-44ee-4512-bcfd-3b634526b02a
[ 7902.340294] BTRFS info (device dm-2): using crc32c (crc32c-generic) checksum algorithm
[ 7902.340303] BTRFS info (device dm-2): forcing free space tree for sector size 4096 with page size 16384
[ 7902.340308] BTRFS info (device dm-2): using free-space-tree
[ 7902.340312] BTRFS warning (device dm-2): read-write for sector size 4096 with page size 16384 is experimental
[ 8089.623727] BTRFS warning (device dm-0): csum failed root 5 ino 978709 off 1253376 csum 0x81dd87df expected csum 0xe77556aa mirror 1
[ 8089.623738] BTRFS error (device dm-0): bdev /dev/dm-0 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 8320.442776] BTRFS warning (device dm-0): csum failed root 5 ino 807754 off 49152 csum 0x579bee4a expected csum 0x8a36f543 mirror 1
[ 8320.442794] BTRFS error (device dm-0): bdev /dev/dm-0 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[ 8320.450734] BTRFS warning (device dm-0): csum failed root 5 ino 807754 off 94208 csum 0x857c0a3b expected csum 0xc3bf7a9c mirror 1
[ 8320.450739] BTRFS error (device dm-0): bdev /dev/dm-0 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
[ 8320.462502] BTRFS warning (device dm-0): csum failed root 5 ino 807754 off 49152 csum 0x579bee4a expected csum 0x8a36f543 mirror 1
[ 8320.462507] BTRFS error (device dm-0): bdev /dev/dm-0 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
[ 8320.470859] BTRFS warning (device dm-0): csum failed root 5 ino 807754 off 49152 csum 0x579bee4a expected csum 0x8a36f543 mirror 1
[ 8320.470866] BTRFS error (device dm-0): bdev /dev/dm-0 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
[ 8336.179155] BTRFS warning (device dm-0): checksum verify failed on logical 1039695872 mirror 1 wanted 0x9e547725 found 0x87cd70c6 level 0
[ 8336.179673] BTRFS info (device dm-0): read error corrected: ino 0 off 1039695872 (dev /dev/dm-0 sector 2047040)
[ 8651.859183] BTRFS info (device dm-2): last unmount of filesystem a4d7d051-44ee-4512-bcfd-3b634526b02a
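
For triage, the checksum warnings in a log like the one above can be tallied per device and inode. This is a minimal sketch in Python, assuming the message format matches the "csum failed" lines shown here (other kernel versions may word it differently); the name summarize_csum_failures is made up for illustration and is not part of any btrfs tooling.

```python
import re
from collections import Counter

# Pattern for the btrfs data-checksum warning as it appears in the dmesg
# output above. This is an assumption about the message format, not a
# stable kernel interface.
CSUM_FAILED = re.compile(
    r"BTRFS warning \(device (?P<dev>[^)]+)\): csum failed "
    r"root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<found>0x[0-9a-f]+) expected csum (?P<expected>0x[0-9a-f]+)"
)


def summarize_csum_failures(log_text):
    """Count checksum failures per (device, root, inode) in dmesg text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CSUM_FAILED.search(line)
        if match:
            key = (match["dev"], int(match["root"]), int(match["ino"]))
            counts[key] += 1
    return counts
```

Applied to the host dmesg above, this groups the four warnings for inode 807754 on dm-0 together and reports one for inode 978709. An inode number can then be mapped back to a path with btrfs inspect-internal inode-resolve.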
@tpwrules
Owner

Similar issue here: #196

Should we roll back the kernel? If you have a relatively easy and safe way to replicate, can you try a few past kernel versions?

@flokli
Contributor Author

flokli commented May 15, 2024

I'm currently trying ZFS with its own crypto layer, so if it's really dm-crypt (only) I shouldn't be affected anymore.

If that's stable, and everything is set up again, I can do the smoketest with the external drive on various kernel versions and see if there's a pattern.

@vs49688

vs49688 commented May 16, 2024

Might be coincidental, but I hit some bad ext4 corruption yesterday on my M1, also using dm-crypt.

It was rebuilding the kernel+mesa, and the compile started failing with gcc complaining one of the kernel .c files was filled with binary content.

I noticed this in dmesg:

May 16 02:08:23 ZAIR kernel: EXT4-fs error (device dm-1): ext4_lookup:1855: inode #11943410: comm nix-daemon: iget: bad extra_isize 762 (inode size 256)

Rebooted to this, and ran a repair. I don't have the full fsck log, but it was large.

May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:46 UTC 2024] Passphrase for /dev/disk/by-uuid/2186c706-f18d-4be1-b1e6-cdfb0260843d:
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:48 UTC 2024] Verifying passphrase for /dev/disk/by-uuid/2186c706-f18d-4be1-b1e6-cdfb0260843d... - success
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:48 UTC 2024] starting device mapper and LVM...
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:48 UTC 2024] 2 logical volume(s) in volume group "vg" now active
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:48 UTC 2024] checking /dev/disk/by-uuid/07e2eada-bf28-4f4e-b0f0-1fbc05953b2a...
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:48 UTC 2024] fsck (busybox 1.36.1)
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:48 UTC 2024] [fsck.ext4 (1) -- /mnt-root/] fsck.ext4 -a /dev/disk/by-uuid/07e2eada-bf28-4f4e-b0f0-1fbc05953b2a
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:48 UTC 2024] root contains a file system with errors, check forced.
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] root: Inode 9608098 seems to contain garbage.
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] root: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] (i.e., without -a or -p options)
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] /dev/disk/by-uuid/07e2eada-bf28-4f4e-b0f0-1fbc05953b2a has unrepaired errors, please fix them manually.
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] An error occurred in stage 1 of the boot process, which must mount the
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] root filesystem on `/mnt-root' and then start stage 2.  Press one
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] of the following keys:
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] i) to launch an interactive shell
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] f) to start an interactive shell having pid 1 (needed if you want to
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] start stage 2's init manually)
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] r) to reboot immediately
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:26:51 UTC 2024] *) to ignore the error and continue
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:27:10 UTC 2024] Starting interactive shell...
May 16 02:28:42 ZAIR stage-1-init: [Wed May 15 16:28:41 UTC 2024] mounting /dev/disk/by-uuid/07e2eada-bf28-4f4e-b0f0-1fbc05953b2a on /...

@fx-chun
Contributor

fx-chun commented May 16, 2024

I've had some recent corruption issues as well rendering my primary partition unbootable; I don't have a log to provide, but I'm using an ext4 partition on LUKS. I'm not sure what it could be exactly.

@flokli
Contributor Author

flokli commented May 16, 2024

I'm currently trying ZFS with its own crypto layer, so if it's really dm-crypt (only) I shouldn't be affected anymore.

If that's stable, and everything is set up again, I can do the smoketest with the external drive on various kernel versions and see if there's a pattern.

I tried reproducing the issue from there, by copying my /nix/store to the external drive with a btrfs inside a luks volume.

I could not immediately reproduce it anymore, though that's a kernel with many more options enabled, essentially a distro kernel built with the asahi kernel sources (https://github.com/yu-re-ka/nixos-m1/tree/minimize-patches).

@devusb

devusb commented May 16, 2024

I could reproduce this using ext4 + LUKS and btrfs + LUKS -- I didn't try for long, but it seemed like btrfs without LUKS was not exhibiting this issue (as observed by multiple scrubs without checksum errors).

Wonder if this also happens on Fedora -- spent a bunch of time trying to find any mention of it but no luck -- I had managed to convince myself this was a hardware issue on my side until now :)

@mixi

mixi commented May 16, 2024

I can reproduce it with a vanilla linux v6.8.9 on a M1 Pro Macbook (j316). That opens up the possibility to bisect it.

The reproducer I am using is fio's (originally mistyped as "tio's") examples/basic-verify.fio on a freshly created dm-crypt volume, which seems to trigger the bug reliably.

@flokli
Contributor Author

flokli commented May 16, 2024

Can you post a bit more details on how to reproduce? I don't know tio and a quick search didn't turn up anything helpful in particular.

@mixi

mixi commented May 16, 2024

Sorry, that was because I typoed the name. The tool is called fio: axboe/fio.

It worked for me in two ways. For testing on the block device, I put filename=/dev/mapper/... at the end of examples/basic-verify.fio (and added loops=10 to get to roughly 10GB, which made it reproducible). For testing a mounted filesystem, I replaced that line with size=10G.
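
The two variants described above can be sketched as a small Python helper that emits a fio job file. This is only an illustration: the base option values are my recollection of what fio's examples/basic-verify.fio contains and should be checked against the copy shipped with fio; the helper name build_verify_job and the placeholder path /dev/mapper/EXAMPLE are hypothetical; and dropping loops in the filesystem variant is an assumption. Note that a write-and-verify run against a block device overwrites its contents.

```python
def build_verify_job(target=None, size=None, loops=10):
    """Return fio job-file text for a write-then-verify run.

    Pass exactly one of `target` (a block device path) or `size`
    (e.g. "10G", for a test file on a mounted filesystem).
    """
    if (target is None) == (size is None):
        raise ValueError("pass exactly one of target or size")
    lines = [
        "[write-and-verify]",
        "rw=randwrite",     # random writes
        "bs=4k",            # 4 KiB blocks
        "direct=1",         # bypass the page cache
        "ioengine=libaio",
        "iodepth=16",
        "verify=crc32c",    # checksum each block, verify on read-back
    ]
    if target is not None:
        # Block-device variant: repeat the job to cover more data.
        lines.append(f"loops={loops}")
        lines.append(f"filename={target}")
    else:
        # Filesystem variant: fio creates a file of this size in the
        # directory it is run from.
        lines.append(f"size={size}")
    return "\n".join(lines) + "\n"


# Hypothetical device name, for illustration only.
print(build_verify_job(target="/dev/mapper/EXAMPLE"))
```

Writing the returned text to a file and passing that file to fio as its argument runs the job; verification failures then show up in fio's own output.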

In the meantime my bisect also pointed me to 2632e2521769 ("arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD") as the commit responsible.
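
The bisection mentioned here is a binary search for the first commit on which the reproducer fails. A simplified sketch, assuming a linear history ordered oldest to newest whose last entry is known to be bad (real git bisect also handles merges); first_bad_commit, reproduces, and attempts are illustrative names, not git features.

```python
def first_bad_commit(commits, reproduces, attempts=3):
    """Binary-search `commits` (oldest first) for the first bad one.

    `reproduces(commit)` returns True when the bug shows up. A commit
    counts as bad if any of `attempts` runs reproduces the bug.
    Assumes commits[-1] is bad and that badness, once introduced,
    persists in all later commits.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if any(reproduces(commits[mid]) for _ in range(attempts)):
            hi = mid        # first bad commit is at mid or earlier
        else:
            lo = mid + 1    # first bad commit is after mid
    return commits[lo]
```

Repeating the test only helps when the reproducer fails intermittently. A reproducer that never triggers on a given commit will still mark that commit as good, however many attempts are made.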

@mixi

mixi commented May 16, 2024

I double checked with the proper asahi kernel. It is fixed for me with the following commits reverted:

  • aefbab8e77eb ("arm64: fpsimd: Preserve/restore kernel mode NEON at context switch") (bisect with the new reproducer) (edit: the commit does not need to be reverted; edit 2: the commit needs to be reverted), and
  • 2632e2521769 ("arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD") (for context reasons)

@knurd

knurd commented May 16, 2024

Do you want me to forward this report upstream? If yes, two short questions:

Did you do the bisection with a vanilla kernel? Is vanilla 6.9 still showing the same problem? And does a revert help there, too (I assume all of that is the case, but sometimes it's better to be sure)

[side note: I'm the Linux kernel's regression tracker; somebody pointed me here; normally I do not comment on downstream bug trackers, but I make an exception due to the data corruption aspect]

@knurd

knurd commented May 16, 2024

ahh, I see, somebody reported it upstream already: https://lore.kernel.org/all/D1B7GPIR9K1E.5JFV37G0YTIF@shadowice.org/ great, thx!

@mixi

mixi commented May 16, 2024

That was me reporting it, but thanks for the offer.

@tpwrules
Owner

tpwrules commented May 16, 2024

Thanks all for the debugging efforts. I plan to do a NixOS Apple Silicon release with a revert patch within 24-48 hours, assuming the Asahi Linux kernel branch is not updated.

@jannau

jannau commented May 17, 2024

I double checked with the proper asahi kernel. It is fixed for me with the following commits reverted:

* aefbab8e77eb ("arm64: fpsimd: Preserve/restore kernel mode NEON at context switch") (for the context), and

* 2632e2521769 ("arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD") (as found by `git bisect`)

Hej @mixi,
both commits need to be reverted on top of asahi-6.8.9-5 / v6.8? I'm a little confused since the linux-arm-kernel mail only mentions 2632e2521769 which reverts and builds cleanly.

@mixi

mixi commented May 17, 2024

You are right to be confused. Reverting 2632e2521769 alone is enough, and that is also the commit bisect pointed me to yesterday.

Apparently I reverted one commit too many by accident and guessed I did it for context reasons when writing the comment afterwards.

@jannau

jannau commented May 17, 2024

@tpwrules asahi-6.8.9-6 containing only the revert pushed to AsahiLinux/linux

@tpwrules
Owner

tpwrules commented May 18, 2024

Latest release contains the revert. @flokli please close the issue if you are satisfied with that fix.

@mixi

mixi commented May 21, 2024

Bad news: aefbab8e77eb ("arm64: fpsimd: Preserve/restore kernel mode NEON at context switch") also needs to be reverted. See https://lore.kernel.org/all/Zkw9kK0sXIgfqd01@shadowice/ for details, and a new reproducer that found the commit (the old one reproducibly sees the commit as good).

@jannau:

Apparently I reverted one commit too many by accident and guessed I did it for context reasons when writing the comment afterwards.

Correction: Apparently I reverted the right commit for the wrong reasons back then.

@ardbiesheuvel

Please try this fix, and report on the thread whether or not it works for you:
https://lore.kernel.org/all/20240522091335.335346-2-ardb+git@google.com

@flokli
Contributor Author

flokli commented May 22, 2024

Just to make sure, is this a fix to be applied on top of any reverts (and if so, which), or an attempt to fix without reverting anything else?

@ardbiesheuvel

The latter.

@flokli flokli changed the title from "btrfs corruption issues (?)" to "dm-crypt corruption issues (?)" May 22, 2024
@flokli
Contributor Author

flokli commented May 22, 2024

With asahi-6.8.9-7 (essentially reverting the other revert(s) and applying that patch) I don't seem to be running into these issues anymore. A lot of other folks on the ML thread reported the same, and the patch already got applied to arm64 (for-next/core).

I guess what's left here is bumping linux-asahi in here again, then this can be closed.

@flokli
Contributor Author

flokli commented May 22, 2024

PR up at #202

@larstiq

larstiq commented May 23, 2024

Thanks @flokli ! I used to reliably get Firefox to crash by running nix-store --verify --check-contents in the background. With #202 that's no longer happening.

@flokli
Contributor Author

flokli commented May 25, 2024

#202 has been merged (bumping the kernel to asahi-6.8.9-7, including a cherrypick), and a new release of nixos-apple-silicon has been created, so we can close the issue here.

On the upstream kernel side, I however noticed the fix only landed in the master branch so far - meaning other aarch64 machines running the mainline kernel might still run into this corruption.

@knurd is there anything else left to be done so this gets cherrypicked to linux-6.9.y, so it'll land in v6.9.2?

@flokli flokli closed this as completed May 25, 2024
@knurd

knurd commented May 25, 2024

@knurd is there anything else left to be done so this gets cherrypicked to linux-6.9.y, so it'll land in v6.9.2?

That's likely too late, as 6.9.2 is in its -rc phase already – and usually Greg does not add any patches at that point aiui. You could ask though. But it likely should go into 6.9.3 due to the "CC: stable..." tag in the commit.

@flokli
Contributor Author

flokli commented May 25, 2024

(trying here, as I don't have that ML subscribed): Hey @gregkh, any chance "arm64/fpsimd: Avoid erroneous elide of user state reload" could still end up in 6.9.2, due to its data corruption nature?

@gregkh

gregkh commented May 25, 2024

Please send stable requests to stable@vger.kernel.org, we can't take stuff from random github repos for obvious reasons.

@flokli
Contributor Author

flokli commented May 25, 2024

The commit I linked had a Cc: stable in the message. That's sufficient?

@knurd

knurd commented May 25, 2024

The commit I linked had a Cc: stable in the message. That's sufficient?

Up to Greg, but I'd say it's in everyone's best interest if you write a quick mail to the list (like with most Linux kernel lists, you don't have to be subscribed!) with Greg CCed (side note: you might ask for the patch to be included in 6.8.y, too) – that among others is also important for the paper trail in case the question "who asked for this to be included" comes up later.

@flokli
Contributor Author

flokli commented May 25, 2024

@knurd Sent out an email to stable@, both you and greg are in CC.
