Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stale devlink device units are not cleaned up while dispatching UDEV change events #24518

Closed
andatche opened this issue Aug 31, 2022 · 6 comments · Fixed by #24522
Closed

Stale devlink device units are not cleaned up while dispatching UDEV change events #24518

andatche opened this issue Aug 31, 2022 · 6 comments · Fixed by #24522
Labels
bug 🐛 Programming errors, that need preferential fixing pid1 udev
Milestone

Comments

@andatche
Copy link

andatche commented Aug 31, 2022

systemd version the issue has been seen with

239-58.el8_6.4 (appears to affect main)

Used distribution

AlmaLinux 8

Linux kernel version used

4.18.0-372.19.1.el8_6.x86_64

CPU architectures issue was seen on

x86_64

Component

systemd

Expected behaviour you didn't see

Systemd should remove stale devlinks device units while dispatching UDEV "change" events for the parent device (e.g. when renaming an LVM Logical Volume).

Unexpected behaviour you saw

Currently systemd will create new device units for any new paths appearing in devlinks in a UDEV change event, but will not remove previously-created device units for paths which are no longer present in devlinks. This leaves device units for non-existent paths hanging around until the parent device is removed.

Further, if a daemon-reload is issued before the parent device unit is removed, the stale devlinks units lose their association with the parent device (through matching syspath) during device enumeration because the paths no longer exist, leaving them hanging around for ever (until rebooting).

Although I tested with 239-58.el8_6.4 on AlmaLinux 8, after checking over src/core/device.c, this still appears to affect main; there doesn't appear to be any special handling for the "change" event case.

Steps to reproduce the problem

Create a new Logical Volume:

$ lvcreate -n test-a-1 -L1G test
  Logical volume "test-a-1" created.

The following related units are created:

dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d1.device                                          loaded active plugged /dev/disk/by-id/dm-name-test-test--a--1
dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dvtt62H07Vx8I5nL9HB9qukE8AelCGphakqbY4UX2IOgOE8Da0yTpuolAtvHg6NL6.device loaded active plugged /dev/disk/by-id/dm-uuid-LVM-vtt62H07Vx8I5nL9HB9qukE8AelCGphakqbY4UX2IOgOE8Da0yTpuolAtvHg6NL6
dev-dm\x2d0.device                                                                                             loaded active plugged /dev/dm-0
dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device                                                               loaded active plugged /dev/mapper/test-test--a--1
dev-test-test\x2da\x2d1.device                                                                                 loaded active plugged /dev/test/test-a-1
sys-devices-virtual-block-dm\x2d0.device                                                                       loaded active plugged /sys/devices/virtual/block/dm-0

Rename the previously-created LV:

$ lvrename /dev/test/test-a-1 test-a-2
  Renamed "test-a-1" to "test-a-2" in volume group "test"

The following paths no longer exist as a result:

$ ls -lh /dev/disk/by-id/dm-name-test-test--a--1 /dev/mapper/test-test--a--1 /dev/test/test-a-1
ls: cannot access '/dev/disk/by-id/dm-name-test-test--a--1': No such file or directory
ls: cannot access '/dev/mapper/test-test--a--1': No such file or directory
ls: cannot access '/dev/test/test-a-1': No such file or directory

Yet the following related device units now exist:

dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d1.device                                          loaded active plugged /dev/disk/by-id/dm-name-test-test--a--1
dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d2.device                                          loaded active plugged /dev/disk/by-id/dm-name-test-test--a--2
dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dvtt62H07Vx8I5nL9HB9qukE8AelCGphakqbY4UX2IOgOE8Da0yTpuolAtvHg6NL6.device loaded active plugged /dev/disk/by-id/dm-uuid-LVM-vtt62H07Vx8I5nL9HB9qukE8AelCGphakqbY4UX2IOgOE8Da0yTpuolAtvHg6NL6
dev-dm\x2d0.device                                                                                             loaded active plugged /dev/dm-0
dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device                                                               loaded active plugged /dev/mapper/test-test--a--1
dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d2.device                                                               loaded active plugged /dev/mapper/test-test--a--2
dev-test-test\x2da\x2d1.device                                                                                 loaded active plugged /dev/test/test-a-1
dev-test-test\x2da\x2d2.device                                                                                 loaded active plugged /dev/test/test-a-2
sys-devices-virtual-block-dm\x2d0.device                                                                       loaded active plugged /sys/devices/virtual/block/dm-0

At this point, deleting the parent device (e.g. with lvremove /dev/test/test-a-2) will still result in all the associated devlinks device units being correctly removed (including the stale ones).

However, if a daemon-reload is issued before removing the parent device, the stale devlinks devices lose their association with the parent:

$ systemctl daemon-reload
dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d1.device                                          loaded active plugged dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d1.device
dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d2.device                                          loaded active plugged /dev/disk/by-id/dm-name-test-test--a--2
dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dvtt62H07Vx8I5nL9HB9qukE8AelCGphakqbY4UX2IOgOE8Da0yTpuolAtvHg6NL6.device loaded active plugged /dev/disk/by-id/dm-uuid-LVM-vtt62H07Vx8I5nL9HB9qukE8AelCGphakqbY4UX2IOgOE8Da0yTpuolAtvHg6NL6
dev-dm\x2d0.device                                                                                             loaded active plugged /dev/dm-0
dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device                                                               loaded active plugged dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device
dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d2.device                                                               loaded active plugged /dev/mapper/test-test--a--2
dev-test-test\x2da\x2d1.device                                                                                 loaded active plugged dev-test-test\x2da\x2d1.device
dev-test-test\x2da\x2d2.device                                                                                 loaded active plugged /dev/test/test-a-2
sys-devices-virtual-block-dm\x2d0.device                                                                       loaded active plugged /sys/devices/virtual/block/dm-0

Removing the parent device then leaves those dangling units behind:

$ lvremove /dev/test/test-a-2
Do you really want to remove active logical volume test/test-a-2? [y/n]: y
  Logical volume "test-a-2" successfully removed.
dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d1.device                                  loaded active plugged dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d1.device
dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device                                                       loaded active plugged dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device
dev-test-test\x2da\x2d1.device                                                                         loaded active plugged dev-test-test\x2da\x2d1.device

The dangling units survive a daemon-reexec. Only a reboot will clean them up.

Additional program output to the terminal or log subsystem illustrating the issue

No response

@andatche andatche added the bug 🐛 Programming errors, that need preferential fixing label Aug 31, 2022
@yuwata
Copy link
Member

yuwata commented Aug 31, 2022

Sorry, v239 is too old here. Please reproduce the issue with recent release (v250 or v251) or the current git HEAD.

IIRC, the issue was already fixed.

@poettering
Copy link
Member

[Canned reply follows]

This is the upstream bug and feature request tracker of systemd. Please use this only for issues in the two most current upstream systemd versions. See this link for the list of current releases.

https://github.com/systemd/systemd/releases

For support for older versions please contact your distribution instead.

In order to keep frustration at a minimum the submission form tried to make this all very clear.

@yuwata
Copy link
Member

yuwata commented Aug 31, 2022

Hm, I dived into the git logs, but the issue may not be completely fixed.

Slightly related commits are 4a1a1ca, and 87bc687.

But anyway, please reproduce the issue with a recent release. Thank you.

@andatche
Copy link
Author

The issue is still partially reproducible on Fedora 36 with systemd 250.8-1.fc36 but might be resolved subsequently by 4a1a1ca. I'll retest against main.

With 250.8-1.fc36 the stale device units are still not removed while handling the UDEV change, but do eventually get cleaned up by a daemon-reload.

  dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d1.device                                          loaded active plugged /dev/disk/by-id/dm-name-test-test--a--1
  dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d2.device                                          loaded active plugged /dev/disk/by-id/dm-name-test-test--a--2
  dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device                                                               loaded active plugged /dev/mapper/test-test--a--1
  dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d2.device                                                               loaded active plugged /dev/mapper/test-test--a--2
  dev-test-test\x2da\x2d1.device                                                                                 loaded active plugged /dev/test/test-a-1
  dev-test-test\x2da\x2d2.device                                                                                 loaded active plugged /dev/test/test-a-2
$ systemctl daemon-reload
  dev-disk-by\x2did-dm\x2dname\x2dtest\x2dtest\x2d\x2da\x2d\x2d2.device                                          loaded active plugged /dev/disk/by-id/dm-name-test-test--a--2
  dev-mapper-test\x2dtest\x2d\x2da\x2d\x2d1.device                                                               loaded active plugged /dev/mapper/test-test--a--2
  dev-test-test\x2da\x2d2.device                                                                                 loaded active plugged /dev/test/test-a-2

@yuwata
Copy link
Member

yuwata commented Aug 31, 2022

I tested with the current git HEAD, and reproduced the issue (but cannot with lvrename or lvremove). Fix is waiting in #24522.

@yuwata yuwata added this to the v252 milestone Aug 31, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Aug 31, 2022
@andatche
Copy link
Author

Great, thanks @yuwata!

yuwata added a commit to yuwata/systemd that referenced this issue Aug 31, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Aug 31, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 1, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 1, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 1, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 1, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 1, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 1, 2022
Previously, existing device units for devlinks were not removed unless
the main device unit is removed. This makes all existing device units
for devlinks are checked if they are still required, and remove if not
necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 1, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 8, 2022
Previously, existing device units for devlinks were not removed unless
the main device unit is removed. This makes all existing device units
for devlinks are checked if they are still required, and remove if not
necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 8, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 12, 2022
Previously, existing device units for devlinks were not removed unless
the main device unit is removed. This makes all existing device units
for devlinks are checked if they are still required, and remove if not
necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 12, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 19, 2022
Previously, existing device units for devlinks were not removed unless
the main device unit is removed. This makes all existing device units
for devlinks are checked if they are still required, and remove if not
necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 19, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
Previously, existing device units for devlinks were not removed unless
the main device unit is removed. This makes all existing device units
for devlinks are checked if they are still required, and remove if not
necessary anymore.

Partially fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
This is the continuation for the previous commit.
Previously, device units created based on the SYSTEMD_ALIAS udev
property were never removed unless their main device unit is removed.
This makes the alias device units are checked if they are still
required, and remove them if not necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
This is the continuation for the previous commit.
Previously, device units created based on the SYSTEMD_ALIAS udev
property were never removed unless their main device unit is removed.
This makes the alias device units are checked if they are still
required, and remove them if not necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
Previously, existing device units for devlinks were not removed unless
the main device unit is removed. This makes all existing device units
for devlinks are checked if they are still required, and remove if not
necessary anymore.

Partially fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
This is the continuation for the previous commit.
Previously, device units created based on the SYSTEMD_ALIAS udev
property were never removed unless their main device unit is removed.
This makes the alias device units are checked if they are still
required, and remove them if not necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
Previously, existing device units for devlinks were not removed unless
the main device unit is removed. This makes all existing device units
for devlinks are checked if they are still required, and remove if not
necessary anymore.

Partially fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
This is the continuation for the previous commit.
Previously, device units created based on the SYSTEMD_ALIAS udev
property were never removed unless their main device unit is removed.
This makes the alias device units are checked if they are still
required, and remove them if not necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
Previously, existing device units for devlinks or aliases were not
removed unless the main device unit is removed. This makes all existing
device units for devlinks and aliases are checked if they are still
required, and remove if not necessary anymore.

Fixes systemd#24518.
yuwata added a commit to yuwata/systemd that referenced this issue Sep 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 Programming errors, that need preferential fixing pid1 udev
Development

Successfully merging a pull request may close this issue.

3 participants