Fix encrypted hard drive #62

Closed
allofmex opened this issue Oct 10, 2021 · 16 comments · Fixed by #75
Labels
bug Something isn't working

Comments

@allofmex

This builds on PR #48 ("Use partitions read/write to calculate disk activity") and issue #38 ("only calculate partition access"):

hd-idle (1.16) still does not work on newer kernels if the timeout is set to >= 600 seconds and the whole device is encrypted (no partitions).
For encrypted drives, idle checks should be done at the partition level too (as has been done for unencrypted drives since #48).

As far as I understand, the encrypted partitions are the dm-0, dm-1,... entries in /proc/diskstats.

  8       0 sda 12345 123....
  8      16 sdb 12345 123....
  8      17 sdb1 12345 123....
...
 253       2 dm-2 12345 123....
 253       3 dm-3 12345 123....

So hd-idle should check the dm-x partition for activity instead of the device itself.

The only thing left to find out is which dm-? partition maps to the device.
This could be done by looping through /sys/block/dm-?: the slaves directory contains the device name. (Maybe there is an even better way.)

In this example, to idle sda, dm-2 needs to be monitored
/sys/block/dm-2/slaves/sda/
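
The lookup proposed above can be sketched roughly as follows. This is only an illustration of the idea, not hd-idle's code (hd-idle itself is written in Go); the function name `dm_devices_for` and the `sys_block` parameter are hypothetical and exist so the sketch can be tried against a test directory.

```python
import os


def dm_devices_for(disk, sys_block="/sys/block"):
    """Return the dm-* entries whose slaves directory lists `disk`.

    Hypothetical helper: `sys_block` defaults to the real sysfs path but
    can point at any directory with the same layout.
    """
    matches = []
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("dm-"):
            continue
        slaves = os.path.join(sys_block, name, "slaves")
        # e.g. /sys/block/dm-2/slaves/sda means dm-2 sits on top of sda
        if os.path.isdir(slaves) and disk in os.listdir(slaves):
            matches.append(name)
    return matches


if __name__ == "__main__":
    print(dm_devices_for("sda"))
```

On the system described above, the call would be expected to report dm-2 for sda, since /sys/block/dm-2/slaves/sda/ exists there.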


hd-idle 1.16
5.11.0-37-generic, 20.04.2-Ubuntu

@Schneck1

I think we are both working on the same problem.
I found out that the udisks2 package polls the HDDs every 10 minutes.
You can check this with:
udisksctl monitor
When you uninstall udisks2, all is fine...
In my opinion it has nothing to do with whether you are working with encrypted partitions or not.
In my case I have a VeraCrypt-encrypted disk, and it does not matter whether the encrypted partition is mounted or not.

@allofmex
Author

@Schneck1 Thanks for your hint about udisks2. It may be the reason in my case too.

> In my opinion it has nothing to do with whether you are working with encrypted partitions or not.

Right, because these 10-minute polls happen at the device (not partition) level:

# cat  /proc/diskstats |grep 'dm-2\|sda'
   8       0 sda 22438 151 12486120 494817 6908 564 582608 159997 0 347576 657739 0 0 0 0 46 2923
 253       2 dm-2 17793 0 12483784 394272 7472 0 582608 645384 0 341160 1039656 0 0 0 0 0 0
# cat  /proc/diskstats |grep 'dm-2\|sda'
   8       0 sda ->22456<- 151 12486120 ->494821<- 6908 564 582608 159997 0 347648 657743 0 0 0 0 46 2923
 253       2 dm-2 17793 0 12483784 394272 7472 0 582608 645384 0 341160 1039656 0 0 0 0 0 0

(sda polled, dm-2 (on sda) not)

If hd-idle checks only the partition (dm-2) instead of the (raw) drive (sda), we can keep those monitoring tools installed while still being able to put drives to sleep. This behavior is already implemented for normal partitions (sda1...).
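
The comparison above can be reproduced with a small script. It is a minimal sketch that assumes the reads-completed and writes-completed columns of /proc/diskstats are the ones of interest; the helper names are hypothetical, and the sample lines are the two snapshots quoted above with the `-> <-` markers removed.

```python
def parse_diskstats(text):
    """Map device name -> (reads completed, writes completed)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        # fields: major, minor, name, reads completed, reads merged,
        # sectors read, ms reading, writes completed, ...
        stats[fields[2]] = (int(fields[3]), int(fields[7]))
    return stats


def active_devices(before, after):
    """Return the devices whose counters differ between two snapshots."""
    return sorted(name for name, counters in after.items()
                  if before.get(name) != counters)


BEFORE = """\
   8       0 sda 22438 151 12486120 494817 6908 564 582608 159997 0 347576 657739 0 0 0 0 46 2923
 253       2 dm-2 17793 0 12483784 394272 7472 0 582608 645384 0 341160 1039656 0 0 0 0 0 0
"""

AFTER = """\
   8       0 sda 22456 151 12486120 494821 6908 564 582608 159997 0 347648 657743 0 0 0 0 46 2923
 253       2 dm-2 17793 0 12483784 394272 7472 0 582608 645384 0 341160 1039656 0 0 0 0 0 0
"""

if __name__ == "__main__":
    # Only sda changed between the snapshots, so this prints ['sda'].
    print(active_devices(parse_diskstats(BEFORE), parse_diskstats(AFTER)))
```

A tool that watched only dm-2 would see no change between these two snapshots and could let the drive spin down despite the poll on sda.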

@adelolmo
Owner

@allofmex
I'm a bit confused here: if udisks2 spins up the disk every 10 minutes, how is hd-idle going to be of any help?

@allofmex
Author

udisks2 does not spin the disk up; its device access just increases the diskstats counters of the device. So hd-idle does not spin the disk down, because it believes that there was an access.

As said before, you already implemented the needed feature (#48) by reading the disk stats of sda1 instead of sda. Later you reverted the change for encrypted disks (#51).

But encrypted drives do have visible partitions. All that needs to be implemented is to search for the dm-x partitions of the sda drive and to use the partition stats instead of the device stats (see the suggestion in my first post).

@adelolmo
Owner

Thank you for your explanation, now I understand the situation.
I'll try to find some time to implement a solution for this case.

@spasche
Contributor

spasche commented May 1, 2022

Hi,
I updated my server to Ubuntu 22.04 recently (kernel 5.15) and noticed that my disks weren't spinning down anymore.
Then I discovered this issue as I'm using LUKS on these devices.

What could be helpful is that there's also a /sys/class/block/BLOCK/holders directory in sysfs that contains the list of related dm-xxx devices that should be used to determine the disk IO activity.

Here's an example on my system for the /dev/sdb device.

cat /proc/diskstats
   8      16 sdb 23504371 86420 6530016638 41076992 14116 7695 145376 348653 0 26337688 41735391 0 0 0 0 7277 309746
...
 253       3 dm-3 23480523 0 6529464314 46446228 21811 0 174440 383144 0 26409700 46829372 0 0 0 0 0 0

The relation between sdb and dm-3 can be found from the holders directory:

ls /sys/class/block/sdb/holders/
dm-3
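
Reading the holders directory could look roughly like the sketch below. The function names and the `sys_class_block` parameter are hypothetical, and following holders recursively (for stacked setups such as RAID under LUKS) is an assumption of this sketch rather than something confirmed in this thread.

```python
import os


def holders(device, sys_class_block="/sys/class/block"):
    """List the entries in <device>/holders, or [] if there are none."""
    path = os.path.join(sys_class_block, device, "holders")
    return sorted(os.listdir(path)) if os.path.isdir(path) else []


def stats_devices(device, sys_class_block="/sys/class/block"):
    """Follow holders upwards until reaching devices nothing else holds.

    A device without holders is returned as-is, so plain disks keep
    being monitored under their own name.
    """
    above = holders(device, sys_class_block)
    if not above:
        return [device]
    leaves = []
    for name in above:
        for leaf in stats_devices(name, sys_class_block):
            if leaf not in leaves:
                leaves.append(leaf)
    return leaves


if __name__ == "__main__":
    print(stats_devices("sdb"))
```

On the system shown above, where /sys/class/block/sdb/holders/ contains dm-3, the lookup for sdb would be expected to end at dm-3, provided dm-3 has no holders of its own.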

@V33m

V33m commented May 16, 2022

I'm somewhat in the same boat since I want to start using LUKS. Has anyone set up LUKS and got it to work with hd-idle?

My setup is simple. The whole external drive is encrypted, so I decrypt it and then mount it to /mnt/usb in fstab. Here are the paths which I might use, as well as by uuid (/dev/disk/by-uuid/UUID)

$ ls /dev/disk/by-id/
dm-uuid-CRYPT-LUKS2-XXXXXXXXXXXXXXX-usb
dm-name-usb
$ ls /dev/disk/by-label/
USB

Adding any of them or all to the hd-idle config does not seem to spin down the drive. I get the following for all the paths by running hd-idle -t:

$ hd-idle -t /dev/disk/by-id/dm-name-usb
cannot spindown scsi disk /dev/disk/by-id/dm-name-usb:
device does not appear to be an sg device

$ hd-idle -t /dev/sda spins down the drive instantly.

Should I create two partitions for my decrypted device instead of only one?

$ lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda            8:0    0   3.5T  0 disk  
└─usb       252:1    0   3.5T  0 crypt /mnt/usb

@kamild1996

kamild1996 commented May 31, 2022

I think I'm having the same issue? I also use LUKS for my drives, but both my drives are also running Linux RAID (mirror). Timeout is set to 1800 and after this time passes, only one of the two drives spins down - it's always sdb getting spun-down by hd-idle, but not sda. I can manually spin-down sda without issues. Info:

# blkid
/dev/mmcblk0p1: UUID="aa6142cb-d988-4220-aa50-9294f560c8e4" BLOCK_SIZE="1024" TYPE="ext2" PARTUUID="e82c0000-01"
/dev/mmcblk0p2: UUID="6f950b39-b93c-4687-8139-9abd10bf9e2d" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="e82c0000-02"
/dev/mmcblk0p5: UUID="c6f46bbe-1afa-4f42-bef1-7fb938cefdc1" TYPE="swap" PARTUUID="e82c0000-05"
/dev/sda: UUID="97f8c171-2b08-930e-cc1b-4a197e4e8f4b" UUID_SUB="36345e5b-15d4-79e1-a20b-536a760d8894" LABEL="vault:SeagateRAID" TYPE="linux_raid_member"
/dev/sdb: UUID="97f8c171-2b08-930e-cc1b-4a197e4e8f4b" UUID_SUB="dbe5b188-f3b4-fc06-bf1a-f7b91e1cc617" LABEL="vault:SeagateRAID" TYPE="linux_raid_member"
/dev/md0: UUID="ed25038d-ee87-4564-8fa1-e084e08f0045" LABEL="SeagateENC" TYPE="crypto_LUKS"
/dev/mapper/md0-crypt: LABEL="SeagateFS" UUID="862e1e8e-e766-44d4-8d44-e04e506bcb81" BLOCK_SIZE="4096" TYPE="xfs"

# ls /dev/disk/by-uuid
6f950b39-b93c-4687-8139-9abd10bf9e2d  862e1e8e-e766-44d4-8d44-e04e506bcb81  aa6142cb-d988-4220-aa50-9294f560c8e4  c6f46bbe-1afa-4f42-bef1-7fb938cefdc1  ed25038d-ee87-4564-8fa1-e084e08f0045

# systemctl status hd-idle
● hd-idle.service - hd-idle - spin down idle hard disks
     Loaded: loaded (/lib/systemd/system/hd-idle.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2022-03-20 20:55:31 CET; 2 months 11 days ago
       Docs: man:hd-idle(8)
   Main PID: 1488 (hd-idle)
      Tasks: 7 (limit: 3460)
     Memory: 11.3M
        CPU: 3.458s
     CGroup: /system.slice/hd-idle.service
             └─1488 /usr/sbin/hd-idle -i 0 -a /dev/sda -i 1800 -a /dev/sdb -i 1800 -l /var/log/hd-idle.log

May 31 21:01:29 toster hd-idle[1488]: sdb spinup
May 31 21:01:29 toster hd-idle[1488]: sda spinup
May 31 21:34:29 toster hd-idle[1488]: sdb spindown

# cat /var/log/hd-idle.log
date: 2022-05-31, time: 21:01:29, disk: sdb, running: 1920, stopped: 84482
date: 2022-05-31, time: 21:01:29, disk: sda, running: 11220, stopped: 75241

As suggested by the reporter, setting the timeout below 600 does fix the issue, allowing both drives to spin down.

@V33m

V33m commented Jul 18, 2022

Hey all! I'm here to share some insights on how I'm using symlinks with LUKS where previous by-label, by-uuid etc. are not possible.

  1. Mount your LUKS drive and run the following command to see which symlinks can be used with hd-idle.
    sudo lsblk /dev/sd* -o PATH,FSSIZE,LABEL,UUID,PARTLABEL,PARTUUID,MODEL,SIZE,SERIAL,TYPE,WWN

We need to point hd-idle at the /dev/sda path, which is the external drive, so only symlinks related to that path work. Output:

PATH             FSSIZE LABEL   UUID    PARTLABEL PARTUUID MODEL     SIZE SERIAL   TYPE  WWN
/dev/sda                                                   WD101FZBX 9.1T WC0JYTR2 disk  0x91000c02912e34
/dev/mapper/usb  9.1T   USB     MY_UUID                              9.1T          crypt

  2. Run the following command to see which devices the system has identified using by-id.
    sudo ls -lv /dev/disk/by-id/

Here we see that we can use either ata-WD101FZBX or wwn-0x91000c02912e34 as the symlink. Output:

lrwxrwxrwx 1 root root  9 Jul 18 22:15 ata-WD101FZBX -> ../../sda
lrwxrwxrwx 1 root root  9 Jul 18 22:15 dm-name-usb -> ../../dm-0
lrwxrwxrwx 1 root root  9 Jul 18 22:15 dm-uuid-CRYPT-LUKS2-kuiheg87egerjg3849g-usb -> ../../dm-0
lrwxrwxrwx 1 root root  9 Jul 18 22:15 wwn-0x91000c02912e34 -> ../../sda

  3. Edit /etc/default/hd-idle to use the symlink you prefer. In my case, I went with the symlink using WWN (unique storage identifier), yet I could have chosen MODEL (device identifier) instead.

HD_IDLE_OPTS='-i 0 -c ata -s 1 -l /var/log/hd-idle.log -a /dev/disk/by-id/wwn-0x91000c02912e34 -i 600'
Or
HD_IDLE_OPTS='-i 0 -c ata -s 1 -l /var/log/hd-idle.log -a /dev/disk/by-id/ata-WD101FZBX -i 600'

@adelolmo Feel free to edit and post the information as you like in readme.md given the information might be helpful to identify possible symlinks to use with LUKS👍

@adelolmo
Owner

@V33m
Thank you for sharing your research with everyone.
I'm going to take your offer and add the steps in the readme. To be honest, I expect many issues opened asking for further explanations. I just hope you're around when those questions are asked :)

@adelolmo
Owner

Here is the documentation: https://github.com/adelolmo/hd-idle#luks-support

I would like to get feedback from some of you before closing this issue.

@spasche
Contributor

spasche commented Jul 24, 2022

I'm afraid @V33m's procedure doesn't help with the issue of disks not spinning down when using LUKS.

hd-idle will resolve the symlink and look up stats from /proc/diskstats matching the underlying device, so it's no different from running hd-idle -a /dev/sda ....

The real fix is to look up statistics from the device mapper device used by LUKS (as explained in previous comments).
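
To illustrate why the symlink choice makes no difference, here is a minimal sketch of symlink resolution. It is not hd-idle's actual code, and the helper name `diskstats_name` is hypothetical; it only shows that a by-id link pointing at ../../sda resolves to the same kernel name that appears in /proc/diskstats.

```python
import os


def diskstats_name(path):
    """Resolve symlinks and return the final path component.

    For /dev/disk/by-id/<something> -> ../../sda this yields "sda",
    i.e. the name a tool would then look for in /proc/diskstats.
    """
    return os.path.basename(os.path.realpath(path))


if __name__ == "__main__":
    # Example path taken from the by-id listing earlier in this thread.
    print(diskstats_name("/dev/disk/by-id/wwn-0x91000c02912e34"))
```

Any by-id, by-uuid or by-label link to the raw disk ends up at the same device name, so the statistics being compared are still those of the raw disk.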

@V33m

V33m commented Jul 26, 2022

Correct, my insights likely do not cover all the different scenarios here. There might be differences at the kernel level: LUKS and hd-idle are working fine for me on a 4.9 kernel setup after decreasing the timeout to 10 minutes (600 seconds) and changing symlinks, whereas I was not able to spin down the hard drives before. I'll move to kernel 5.10 (LTS) when I find time in the next weeks.

So for me, monitoring device stats (/dev/sda) works fine, yet on newer kernels, with different architectures, packages and so on, the only solution might be to use partition stats and the device mapper when using LUKS.

spasche added a commit to spasche/hd-idle that referenced this issue Jul 26, 2022
When using LUKS, activity happens on the device mapper device.
Fixes adelolmo#62.
@adelolmo
Owner

adelolmo commented Jul 28, 2022

@spasche and all

Edit: I made a mistake creating the packages for armhf, arm64 and i386. Fixing it as we speak. All good now.

Could you please try out the pre-release v1.17? https://github.com/adelolmo/hd-idle/releases/tag/v1.17
It includes the PR provided by @spasche with some adjustments.

@spasche
Contributor

spasche commented Jul 29, 2022

Thanks for the new release!

I installed it yesterday and it's working as expected. I'm now able to run smartctl regularly on the drive for monitoring and it will still spin down 🎉

@Schneck1

My problems with the 10-minute limit are also gone with 1.17, many thanks!

6 participants