Description
There are several reports of storage issues, all on this one laptop model, so I am filing a bug. The reports are here: https://discourse.nixos.org/t/nvme-drive-not-detecting-after-calameres-initiates/32108/
Live posts of me doing debugging: https://matrix.to/#/!DBFhtjpqmJNENpLDOv:nixos.org/$9zWPOBSxYIou5aEGILjwnDFgEJLuIi7JTuUUo7iQkyw?via=nixos.org&via=matrix.org&via=tchncs.de
Results of testing on my machine yesterday:
Kernels I believe are ok: 6.4.9, 6.1.44
Kernels I believe are bad: 6.5.0, 6.5.1, 6.1.51, 6.1.49(?)
Debug log of a failing boot: https://gist.github.com/lf-/58fb6bfd13e4f3d09d8e2c39b279b46a
Describe the bug
Various storage-not-detected and I/O error symptoms on the XPS 15 9560 with the internal SSD. The SSD model does not seem to matter much: as far as I can tell, some people hit this with the original SSD too.
Reproduced on this aftermarket drive (the firmware revision is the `fr` field):
$ nvme id-ctrl /dev/nvme0
<snip>
mn : WD Blue SN570 1TB
fr : 234110WD
rab : 4
ieee : 001b44
cmic : 0
mdts : 7
cntlid : 0
ver : 0x10400
rtd3r : 0x7a120
rtd3e : 0xf4240
oaes : 0x200
ctratt : 0x2
rrls : 0
cntrltype : 1
<snip>
Most relevant part of failed boot log:
Sep 04 11:56:38 localhost kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
Sep 04 11:56:38 localhost kernel: nvme nvme0: Does your device have a faulty power saving mode enabled?
Sep 04 11:56:38 localhost kernel: nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
Sep 04 11:56:38 localhost kernel: nvme 0000:04:00.0: Unable to change power state from D3cold to D0, device inaccessible
Sep 04 11:56:38 localhost kernel: nvme nvme0: Disabling device after reset failure: -19
Sep 04 11:56:38 localhost systemd-cryptsetup[169]: Device /dev/disk/by-uuid/b80aedf8-ddd4-46fa-8d09-5215d5f286b9 READ lock released.
Sep 04 11:56:38 localhost systemd-cryptsetup[169]: IO error while decrypting keyslot.
Sep 04 11:56:38 localhost systemd-cryptsetup[169]: Keyslot 0 (luks2) open failed with -5.
I have not yet tried the alleged workaround given in the message here. I might try it on NixCon hacking day when I don't need my computer to work.
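For anyone who wants to try that workaround on NixOS, a minimal sketch of how the two kernel parameters from the log message could be set, assuming a standard `configuration.nix` module. This is untested on the affected machine, and whether it actually avoids the failure is unknown:

```nix
{ ... }:
{
  # Parameters copied verbatim from the kernel's own hint in the failed boot log.
  # Untested here; treat this as an experiment, not a confirmed fix.
  boot.kernelParams = [
    # Disables NVMe autonomous power state transitions (APST).
    "nvme_core.default_ps_max_latency_us=0"
    # Disables PCIe Active State Power Management.
    "pcie_aspm=off"
  ];
}
```

After a rebuild and reboot, `cat /proc/cmdline` should show both parameters if they were applied. Both settings trade power savings for stability, so battery life will likely be worse with them enabled.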
Steps To Reproduce
- Boot one of the bad kernels listed above (e.g. 6.5.0, 6.5.1, 6.1.51)
- ???
- Suffering
Expected behavior
Disk works, system boots.
Additional context
This is a regression between nixos-unstable revisions ce5e4a6 and 3efb0f6, which I have debugged to be induced by the kernel version. The systemd and cryptsetup versions are the same in both revisions.
Notify maintainers
@TredwellGit @Ma27 @NeQuissimus @alyssais @thoughtpolice
Metadata
- system: `"x86_64-linux"`
- host os: `Linux 6.4.9, NixOS, 23.11 (Tapir), 23.11.20230902.e569908`
- multi-user?: `yes`
- sandbox: `yes`
- version: `nix-env (Nix) 2.17.0`
- nixpkgs: `/etc/nix/inputs/nixpkgs`