Removing watch makes symlink under /dev/disk/by-uuid unreliable #23746

Closed
Geass-LL opened this issue Jun 15, 2022 · 5 comments

Comments

@Geass-LL
Contributor

Geass-LL commented Jun 15, 2022

systemd version the issue has been seen with

systemd 250 (250-4.el9)

Used distribution

CentOS Stream 9

Linux kernel version used (uname -a)

5.14.0-71.el9.aarch64

CPU architecture issue was seen on

aarch64

Expected behaviour you didn't see

The symlink under /dev/disk/by-uuid always matches the UUID reported by blkid.

Unexpected behaviour you saw

Sometimes they differ:

[root@localhost ~]# ll /dev/disk/by-uuid | grep sdb10
lrwxrwxrwx. 1 root root 11 Jun 14 11:27 d9821d60-4908-4c8f-8da5-9e7eee0d5efe -> ../../sdb10
[root@localhost ~]# blkid | grep sdb10
/dev/sdb10: UUID="88c77144-2d0b-46cc-a4e9-f26fb6cff07b" BLOCK_SIZE="1024" TYPE="ext4" PARTLABEL="logic" PARTUUID="70979746-98da-41ca-9857-a2b420ad1b06"
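
A less fragile way to make this comparison than grepping `ls` output might look like the sketch below. The helper name `uuid_link_points_to` and its optional third argument (a directory to use instead of /dev/disk/by-uuid) are hypothetical and exist only for illustration; the check itself only relies on `readlink -f`.

```shell
# Hypothetical helper: succeed if <dir>/<uuid> resolves to the same node as <dev>.
# Arguments: device node, filesystem UUID, optional by-uuid directory.
uuid_link_points_to() {
    dev=$1 uuid=$2 dir=${3:-/dev/disk/by-uuid}
    # An empty UUID means the probe found nothing, so treat it as a mismatch.
    [ -n "$uuid" ] || return 1
    [ "$(readlink -f -- "$dir/$uuid")" = "$(readlink -f -- "$dev")" ]
}

# Possible usage on a real system (not executed here). "blkid -p" probes the
# device directly instead of reading the blkid cache:
#   uuid=$(blkid -p -s UUID -o value /dev/sdb10)
#   uuid_link_points_to /dev/sdb10 "$uuid" || echo "by-uuid symlink is stale"
```

With the output above, the check would fail for 88c77144-2d0b-46cc-a4e9-f26fb6cff07b, because only the old UUID d9821d60-4908-4c8f-8da5-9e7eee0d5efe has a symlink pointing at sdb10.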

Steps to reproduce the problem

  1. Add a new block device, sdb;
  2. parted -s /dev/sdb mklabel gpt;
  3. Create 10 partitions:
for i in `seq 1 10`; do
    start=$(( $i * 10));
    end=$(( $start + 10 ));
    parted -s /dev/sdb mkpart logic ${start}M ${end}M;
    mkfs.ext4 /dev/sdb$i;
    partprobe -d /dev/sdb;
done
  4. Run partprobe in the background, and run mkfs whenever the symlink for sdb10 under /run/udev/watch has been removed:
# /root/partprobe.sh
while true; do
        if [ -f /run/break_parted ]; then
                break;
        else
                partprobe -d /dev/sdb;
                sleep 2;
        fi;
done
# /root/test.sh
while true; do
        rm -rf /run/break_parted;
        while true; do
                ls -l /run/udev/watch/ | grep b8:26 > /dev/null;
                if [ $? -ne 0 ]; then
                        mkfs.ext4 -F /dev/sdb10;
                        sleep 1;
                        res1=`blkid | grep sdb10 | awk '{print $2}' | sed 's/UUID=//g' | sed 's/"//g'`;
                        res2=`ls -l /dev/disk/by-uuid | grep sdb10 | awk '{print $9}'`;
                        echo "$res1";
                        echo "$res2";
                        if [ "$res1" != "$res2" ]; then
                                touch /run/break_parted;
                                break;
                        fi
                fi
        done
        sleep 10;
        res2=`ls -l /dev/disk/by-uuid | grep sdb10 | awk '{print $9}'`;
        if [ "$res1" != "$res2" ]; then
                break;
        else
                rm -rf /run/break_parted;
                sh /root/partprobe.sh &
        fi
done
  5. The shell script stops when the symlink differs from what blkid reports.
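
The `b8:26` pattern in test.sh appears to be udev's device ID for sdb10 (block device, major 8, minor 26). Rather than hard-coding it, the ID could be derived from the device node. This is only a sketch: `udev_devid` is a hypothetical helper name, and it assumes GNU `stat`.

```shell
# Hypothetical helper: print "b<major>:<minor>" or "c<major>:<minor>" for a device node.
udev_devid() {
    node=$1
    if [ -b "$node" ]; then
        prefix=b
    elif [ -c "$node" ]; then
        prefix=c
    else
        echo "not a device node: $node" >&2
        return 1
    fi
    # GNU stat prints the major (%t) and minor (%T) numbers in hexadecimal.
    set -- $(stat -L -c '%t %T' "$node")
    printf '%s%d:%d\n' "$prefix" "$((0x$1))" "$((0x$2))"
}

# Possible usage in test.sh (not executed here):
#   ls -l /run/udev/watch/ | grep -q "$(udev_devid /dev/sdb10)"
```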

Additional program output to the terminal or log subsystem illustrating the issue

@Geass-LL
Contributor Author

partprobe triggers uevents on all /dev/sdb* devices, which makes udevd remove their watch fds while it executes the rules.
Can we fix this? If executing the udev rules does not itself trigger a watch event (for example, when the rules only create symlinks under a directory), maybe the watch fds need not be removed?

@yuwata yuwata added the udev label Jun 15, 2022
@yuwata
Member

yuwata commented Jun 15, 2022

While a device is being processed by udevd, you should not touch it (partitioning, creating a filesystem, etc.). Also, when you do want to touch a device, it is recommended to lock it so that udevd does not try to process it. See https://systemd.io/BLOCK_DEVICE_LOCKING/ for more details.

Since v251, we provide udevadm lock and udevadm wait. Please try to use them in the script.

Note: I am not familiar with parted or partprobe, but sfdisk also provides a --lock option, which should be mostly equivalent to udevadm lock --device=/dev/sdb sfdisk ....
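
For the reproducer, the mkfs step could be wrapped roughly as follows on systemd v251 or newer. This is a sketch, not a tested fix: `with_udev_lock` and the `RUN` hook are hypothetical names introduced here. Setting `RUN=echo` prints the command instead of running it.

```shell
# Hypothetical wrapper: run a command while holding the advisory lock that
# udevd respects. Given a partition node, "udevadm lock --device=" locks the
# containing whole-disk device, as the locking specification requires.
with_udev_lock() {
    dev=$1
    shift
    ${RUN:-} udevadm lock --device="$dev" "$@"
}

# Possible usage in test.sh (not executed here):
#   with_udev_lock /dev/sdb10 mkfs.ext4 -F /dev/sdb10
# Dry run, printing the command only:
#   RUN=echo with_udev_lock /dev/sdb10 mkfs.ext4 -F /dev/sdb10
```

Once mkfs has finished, `udevadm wait /dev/disk/by-uuid/<new-uuid>` could then be used to block until udevd has created the new symlink, instead of relying on `sleep 1`.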

@yuwata
Member

yuwata commented Jun 15, 2022

On older systemd (v250 or earlier), try running udevadm settle before touching devices.
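
On v250, where udevadm lock is not available, one possible approximation is to drain the event queue with udevadm settle and take the BSD lock by hand with flock(1) from util-linux. This is a sketch under assumptions: `settle_and_flock` and the `RUN` hook are hypothetical, the 30-second timeout is arbitrary, and the caller must pass the whole-disk node (here /dev/sdb) because the lock belongs on the whole device, not on the partition.

```shell
# Hypothetical helper: settle, run the command under an exclusive BSD lock on
# the whole-disk node, then settle again so the by-uuid symlinks are updated
# before anything reads them.
settle_and_flock() {
    disk=$1
    shift
    ${RUN:-} udevadm settle --timeout=30 || return
    ${RUN:-} flock "$disk" "$@" || return
    ${RUN:-} udevadm settle --timeout=30
}

# Possible usage in test.sh (not executed here):
#   settle_and_flock /dev/sdb mkfs.ext4 -F /dev/sdb10
```

Note that settle only waits for events that are already queued; the partprobe loop running in the background can still generate new ones, so the lock is what actually keeps udevd away from the device while mkfs writes.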

@yuwata yuwata added the needs-reporter-feedback label Jun 15, 2022
@yuwata
Member

yuwata commented Jun 15, 2022

In other words, the behavior is unfortunately expected. The reproducer above introduces many races.

@Geass-LL
Contributor Author

https://systemd.io/BLOCK_DEVICE_LOCKING/ is helpful. I think the race is between mkfs and udevd. After searching e2fsprogs for LOCK_EX, I found this issue: tytso/e2fsprogs#30. It seems e2fsprogs does not use BSD file locks yet. Maybe I should wait for that.

@yuwata yuwata added not-a-bug and removed needs-reporter-feedback labels Jun 16, 2022
@yuwata yuwata closed this as completed Jun 16, 2022