udev: fragile handling of uevent actions breaks with kernel 4.12+ #8221

Closed
mbiebl opened this Issue Feb 19, 2018 · 36 comments

@mbiebl
Contributor

mbiebl commented Feb 19, 2018

Submission type

  • Bug report

systemd version the issue has been seen with

v237

Used distribution

Debian unstable
Filed as downstream bug report at
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890641

The following is copied from there.

Hi,

I'm filing this against udev, because that's where the fallout of this
first became obvious to me, though looking at the code there are other
places in systemd affected by it too (pretty much everywhere that the
udev_device_get_action() function is used). And given we'll release
with a kernel that breaks the assumptions this code makes, it really
should be RC - but I'm not going to niggle over severity right now.

The symptom that first got my attention was that setting the SYMLINK
key in a rule for a USB device was no longer creating that symlink in
Buster, despite working as expected in Stretch and several releases
prior.

After fixing the broken udev logging in Buster which made it impossible
to see why that was happening, it appears that the answer is Linux 4.12
added the actions "bind" and "unbind" for notification of devices being
bound to, or released from, a driver. And udev handles that poorly, so
the symlink gets created as expected for the "add" event, and then
promptly gets deleted again when the "bind" event is processed.

Bind events were added in 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 of
Linus' tree. There was a followup commit two months later (Sep 2017)
6878e7de6af726de47f9f3bec649c3f49e786586, to work around a different
symptom of the same brand of problems in udev when MODALIAS is included
in an unbind event - but this one is still a live breakage. And there
may well be others that still just haven't been noticed yet.

The code in systemd/udev (and all other users of these events too)
really should be explicit about exactly which actions are expected to
be processed in any given code path, with "default" paths not doing
anything more than reporting an unknown event was received. It can't
possibly assume safely that any unknown future event should just be
treated the same as "add" or "change" would be. And looking at the
current code it's almost impossible to tell whether some of the less
common action types really were expected to follow the path they will,
or if they are just rare enough that nobody has both seen it explode
and understood why it did so ...
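
A minimal sketch of the explicit dispatch argued for above, written in Python for brevity rather than as systemd's actual C code. The names `classify_action`, `SETUP_ACTIONS` and `TEARDOWN_ACTIONS` are hypothetical; the only point is that the default path reports the unknown action and does nothing else.

```python
import logging

log = logging.getLogger("uevent-sketch")

# Actions this hypothetical consumer explicitly knows how to handle.
SETUP_ACTIONS = {"add", "change"}
TEARDOWN_ACTIONS = {"remove"}


def classify_action(action: str) -> str:
    """Return "setup", "teardown" or "ignored" for a uevent action string."""
    if action in SETUP_ACTIONS:
        return "setup"
    if action in TEARDOWN_ACTIONS:
        return "teardown"
    # Default path: report the event, but do not guess what it means.
    log.warning("unhandled uevent action %r, ignoring", action)
    return "ignored"
```

With this shape, a new kernel action such as "bind" ends up in the "ignored" branch until someone decides deliberately where it belongs.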

This is a minimal debug log from udev that demonstrates the problem
as seen from a rule setting SYMLINK+="bitbabbler/$attr{serial}", it
was logged with 50-udev-default.rules and 80-drivers.rules disabled
(while trying to figure out if it was some bad interaction with them
at fault), but that makes no difference to the end result here, it
just eliminates the distracting noise they'd interleave in the log.

  • The device is hotplugged
Feb 17 10:16:21 buster kernel: [ 1208.116054] usb 1-2: new high-speed USB device number 9 using xhci_hcd
Feb 17 10:16:21 buster kernel: [ 1208.757285] usb 1-2: New USB device found, idVendor=0403, idProduct=7840
Feb 17 10:16:21 buster kernel: [ 1208.758344] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Feb 17 10:16:21 buster kernel: [ 1208.759314] usb 1-2: Product: White RNG
Feb 17 10:16:21 buster kernel: [ 1208.759842] usb 1-2: Manufacturer: BitBabbler
Feb 17 10:16:21 buster kernel: [ 1208.760464] usb 1-2: SerialNumber: JAXYAZ
Feb 17 10:16:21 buster systemd-udevd[262]: seq 1794 queued, 'add' 'usb'
Feb 17 10:16:21 buster systemd-udevd[262]: Validate module index
Feb 17 10:16:21 buster systemd-udevd[262]: Check if link configuration needs reloading.
Feb 17 10:16:21 buster systemd-udevd[262]: seq 1794 forked new worker [568]
  • Begin processing the "add" action for seq 1794
Feb 17 10:16:21 buster systemd-udevd[568]: seq 1794 running
Feb 17 10:16:21 buster systemd-udevd[568]: GROUP 110 /lib/udev/rules.d/60-bit-babbler.rules:27
Feb 17 10:16:21 buster systemd-udevd[568]: MODE 0660 /lib/udev/rules.d/60-bit-babbler.rules:27
Feb 17 10:16:21 buster systemd-udevd[568]: LINK 'bitbabbler/JAXYAZ' /lib/udev/rules.d/60-bit-babbler.rules:27
Feb 17 10:16:21 buster systemd-udevd[568]: RUN '/usr/bin/bbvirt attach $attr{serial} --busnum $attr{busnum} --devnum $attr{devnum}' /lib/udev/rules.d/60-bit-babbler.rules:27
Feb 17 10:16:21 buster systemd-udevd[568]: RUN '/usr/bin/setfacl -m g:bit-babbler:rw $devnode' /lib/udev/rules.d/60-bit-babbler.rules:36
Feb 17 10:16:21 buster systemd-udevd[568]: ATTR '/sys/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-2/power/control' writing 'auto' /lib/udev/rules.d/60-bit-babbler.rules:43
Feb 17 10:16:21 buster systemd-udevd[568]: ATTR '/sys/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-2/power/autosuspend_delay_ms' writing '2000' /lib/udev/rules.d/60-bit-babbler.rules:44
Feb 17 10:16:21 buster systemd-udevd[262]: seq 1795 queued, 'add' 'usb'
Feb 17 10:16:21 buster systemd-udevd[262]: seq 1796 queued, 'bind' 'usb'
Feb 17 10:16:21 buster systemd-udevd[568]: handling device node '/dev/bus/usb/001/009', devnum=c189:8, mode=0660, uid=0, gid=110
Feb 17 10:16:21 buster systemd-udevd[568]: set permissions /dev/bus/usb/001/009, 020660, uid=0, gid=110
Feb 17 10:16:21 buster systemd-udevd[568]: creating symlink '/dev/char/189:8' to '../bus/usb/001/009'
Feb 17 10:16:21 buster systemd-udevd[568]: creating link '/dev/bitbabbler/JAXYAZ' to '/dev/bus/usb/001/009'
Feb 17 10:16:21 buster systemd-udevd[568]: creating symlink '/dev/bitbabbler/JAXYAZ' to '../bus/usb/001/009'
Feb 17 10:16:21 buster bbvirt[569]: starting '/usr/bin/bbvirt attach JAXYAZ --busnum 1 --devnum 9'
Feb 17 10:16:21 buster systemd-udevd[568]: Process '/usr/bin/bbvirt attach JAXYAZ --busnum 1 --devnum 9' succeeded.
Feb 17 10:16:21 buster setfacl[570]: starting '/usr/bin/setfacl -m g:bit-babbler:rw /dev/bus/usb/001/009'
Feb 17 10:16:21 buster systemd-udevd[568]: Process '/usr/bin/setfacl -m g:bit-babbler:rw /dev/bus/usb/001/009' succeeded.
Feb 17 10:16:21 buster systemd-udevd[568]: seq 1794 processed
  • The "add" event completes being handled. At this point everything
    has worked exactly as we normally expect it to.
Feb 17 10:16:21 buster systemd-udevd[568]: seq 1795 running
Feb 17 10:16:21 buster systemd-udevd[568]: seq 1795 processed
  • Then we begin processing the "bind" action seq 1796
Feb 17 10:16:21 buster systemd-udevd[568]: seq 1796 running
Feb 17 10:16:21 buster systemd-udevd[568]: update old name, '/dev/bitbabbler/JAXYAZ' no longer belonging to '/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-2'
Feb 17 10:16:21 buster systemd-udevd[568]: no reference left, remove '/dev/bitbabbler/JAXYAZ'
Feb 17 10:16:21 buster systemd-udevd[568]: handling device node '/dev/bus/usb/001/009', devnum=c189:8, mode=0600, uid=0, gid=0
Feb 17 10:16:21 buster systemd-udevd[568]: preserve already existing symlink '/dev/char/189:8' to '../bus/usb/001/009'
Feb 17 10:16:21 buster systemd-udevd[568]: seq 1796 processed
  • And it's all gone out for a long lunch ...
Feb 17 10:16:21 buster systemd-udevd[262]: cleanup idle workers
Feb 17 10:16:21 buster systemd-udevd[568]: Unload module index
Feb 17 10:16:21 buster systemd-udevd[568]: Unloaded link configuration context.
Feb 17 10:16:21 buster systemd-udevd[262]: worker [568] exited

This is what 'udevadm monitor' reports for the events which do what is
seen above. It's not from the same test run, but aside from the USB
device enumeration changing, it's the same sequence of events from a
hotplug of the device.

KERNEL[52.409511] add      /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3 (usb)
ACTION=add
BUSNUM=001
DEVNAME=/dev/bus/usb/001/004
DEVNUM=004
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3
DEVTYPE=usb_device
MAJOR=189
MINOR=3
PRODUCT=403/7840/900
SEQNUM=1760
SUBSYSTEM=usb
TYPE=0/0/0

KERNEL[52.411157] add      /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0 (usb)
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0
DEVTYPE=usb_interface
INTERFACE=255/255/255
MODALIAS=usb:v0403p7840d0900dc00dsc00dp00icFFiscFFipFFin00
PRODUCT=403/7840/900
SEQNUM=1761
SUBSYSTEM=usb
TYPE=0/0/0

KERNEL[52.411193] bind     /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3 (usb)
ACTION=bind
BUSNUM=001
DEVNAME=/dev/bus/usb/001/004
DEVNUM=004
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3
DEVTYPE=usb_device
DRIVER=usb
MAJOR=189
MINOR=3
PRODUCT=403/7840/900
SEQNUM=1762
SUBSYSTEM=usb
TYPE=0/0/0

UDEV  [52.416928] add      /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3 (usb)
ACTION=add
BUSNUM=001
DEVLINKS=/dev/bitbabbler/JAXYAZ
DEVNAME=/dev/bus/usb/001/004
DEVNUM=004
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3
DEVTYPE=usb_device
ID_BUS=usb
ID_MODEL=White_RNG
ID_MODEL_ENC=White\x20RNG
ID_MODEL_ID=7840
ID_REVISION=0900
ID_SERIAL=BitBabbler_White_RNG_JAXYAZ
ID_SERIAL_SHORT=JAXYAZ
ID_USB_INTERFACES=:ffffff:
ID_VENDOR=BitBabbler
ID_VENDOR_ENC=BitBabbler
ID_VENDOR_FROM_DATABASE=Future Technology Devices International, Ltd
ID_VENDOR_ID=0403
MAJOR=189
MINOR=3
PRODUCT=403/7840/900
SEQNUM=1760
SUBSYSTEM=usb
TYPE=0/0/0
USEC_INITIALIZED=52411809

UDEV  [52.417938] add      /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0 (usb)
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0
DEVTYPE=usb_interface
ID_VENDOR_FROM_DATABASE=Future Technology Devices International, Ltd
INTERFACE=255/255/255
MODALIAS=usb:v0403p7840d0900dc00dsc00dp00icFFiscFFipFFin00
PRODUCT=403/7840/900
SEQNUM=1761
SUBSYSTEM=usb
TYPE=0/0/0
USEC_INITIALIZED=52417785

UDEV  [52.418633] bind     /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3 (usb)
ACTION=bind
BUSNUM=001
DEVNAME=/dev/bus/usb/001/004
DEVNUM=004
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3
DEVTYPE=usb_device
DRIVER=usb
ID_BUS=usb
ID_MODEL=White_RNG
ID_MODEL_ENC=White\x20RNG
ID_MODEL_ID=7840
ID_REVISION=0900
ID_SERIAL=BitBabbler_White_RNG_JAXYAZ
ID_SERIAL_SHORT=JAXYAZ
ID_USB_INTERFACES=:ffffff:
ID_VENDOR=BitBabbler
ID_VENDOR_ENC=BitBabbler
ID_VENDOR_FROM_DATABASE=Future Technology Devices International, Ltd
ID_VENDOR_ID=0403
MAJOR=189
MINOR=3
PRODUCT=403/7840/900
SEQNUM=1762
SUBSYSTEM=usb
TYPE=0/0/0
USEC_INITIALIZED=52411809

KERNEL[52.422682] bind     /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0 (usb)
ACTION=bind
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0
DEVTYPE=usb_interface
DRIVER=usbfs
INTERFACE=255/255/255
MODALIAS=usb:v0403p7840d0900dc00dsc00dp00icFFiscFFipFFin00
PRODUCT=403/7840/900
SEQNUM=1763
SUBSYSTEM=usb
TYPE=0/0/0

UDEV  [52.423548] bind     /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0 (usb)
ACTION=bind
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3/1-3:1.0
DEVTYPE=usb_interface
DRIVER=usbfs
ID_VENDOR_FROM_DATABASE=Future Technology Devices International, Ltd
INTERFACE=255/255/255
MODALIAS=usb:v0403p7840d0900dc00dsc00dp00icFFiscFFipFFin00
PRODUCT=403/7840/900
SEQNUM=1763
SUBSYSTEM=usb
TYPE=0/0/0
USEC_INITIALIZED=52417785

And when it is unplugged, there are likewise new "unbind" events to
deal with, along with the normal "remove" set.
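
For anyone comparing their own captures against the sequence above, here is a rough sketch that pulls the event header lines out of saved 'udevadm monitor' output. The regular expression is an assumption based only on the header format visible in this report, and `parse_headers` is a hypothetical helper name.

```python
import re

# Header lines look like:
#   KERNEL[52.409511] add      /devices/... (usb)
#   UDEV  [52.416928] add      /devices/... (usb)
HEADER = re.compile(
    r"^(?P<source>KERNEL|UDEV)\s*\[(?P<time>[0-9.]+)\]\s+"
    r"(?P<action>\S+)\s+(?P<devpath>\S+)\s+\((?P<subsystem>[^)]+)\)$"
)


def parse_headers(text):
    """Return (source, action, devpath, subsystem) for each header line."""
    events = []
    for line in text.splitlines():
        m = HEADER.match(line.strip())
        if m:
            events.append(
                (m["source"], m["action"], m["devpath"], m["subsystem"])
            )
    return events


SAMPLE = """\
KERNEL[52.409511] add      /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3 (usb)
ACTION=add
KERNEL[52.411193] bind     /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3 (usb)
ACTION=bind
UDEV  [52.418633] bind     /devices/pci0000:00/0000:00:1e.0/0000:01:01.0/0000:02:03.0/usb1/1-3 (usb)
"""

events = parse_headers(SAMPLE)
```

Property lines such as ACTION=add are skipped because they do not match the header pattern.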

And that should be enough of what I collected while getting to the
bottom of this to make the problem clear, and reasonably easy to
reproduce with any handy device. Looking at the code the problem
should be obvious enough anyway though. I was tempted to also include
a preliminary patch for it - but it really all needs careful auditing
by as many of the eyes responsible for the existing assumptions as
possible, at least in the first pass through.

The udev rules file for bit babbler looks like this:

SUBSYSTEM!="usb", GOTO="bb_end"

ACTION!="add|change", GOTO="bb_end_add"

# This is what we'd like to do.  Skip all the rules here if Vendor:Product
# is not 0403:7840 -- but that's not what these two tests will actually do
# (at least with udev versions up to 232-25 which shipped in Stretch).
# If the device we're handling an event for doesn't have the idVendor or
# idProduct attributes at all, then these tests are still false, the same
# as if they did have the values which we are testing for ...
#ATTR{idVendor}!="0403", GOTO="bb_end"
#ATTR{idProduct}!="7840", GOTO="bb_end"

# So instead, we need to explicitly test that they *are* the values we are
# looking for, and play goto leapfrog to get the control flow logic we want.
ATTR{idVendor}=="0403", ATTR{idProduct}=="7840", GOTO="bb_add"
GOTO="bb_end"

LABEL="bb_add"

# Allow users in group bit-babbler to access the device directly.
# Create a symlink to a well known name that can be used in the cgroup_device_acl
# configuration in /etc/libvirt/qemu.conf, and for other similar purposes too.
# And run the bbvirt script to see if this device was configured for hotplugging
# into a virtual machine.
GROUP="bit-babbler", MODE="0660", SYMLINK="bitbabbler/$attr{serial}", \
 RUN+="/usr/bin/bbvirt attach $attr{serial} --busnum $attr{busnum} --devnum $attr{devnum}"

# If ACLs are supported, grant users in the bit-babbler group access to the device
# with them too.  This is mainly so that if a VM is halted, the device will revert
# to normal access from the host system again.  The libvirt 'managed' mode will not
# restore the original ownership when it releases the device, it will just make it
# be root:root, stomping the GROUP we set above.
TEST=="/usr/bin/setfacl", RUN+="/usr/bin/setfacl -m g:bit-babbler:rw $devnode"

# Enable USB autosuspend.  The BitBabbler devices support suspending correctly,
# though not every controller they might be plugged into will always play nicely.
# It should be safe to enable it here, even if an upstream hub or controller
# needs it disabled.  The XHCI controllers seem to be the most troublesome, but
# mostly with older kernels.
TEST=="power/control", ATTR{power/control}="auto"
TEST=="power/autosuspend_delay_ms", ATTR{power/autosuspend_delay_ms}="2000"

LABEL="bb_end_add"


ACTION!="remove", GOTO="bb_end"

# Explicitly detach unplugged devices from the VM if they were passed through to it.
# If we don't do this, the stale <hostdev> configuration will remain, and could
# match some other completely different device that is plugged in later ...
# This is why we can't make persistent changes to the domain definition for VMs that
# aren't running when the device is plugged in, because if the host goes down without
# this rule being run, we'd never clean those up.
#
# We can't test against the attributes here, if this would match they are already gone.
ENV{ID_VENDOR_ID}=="0403", ENV{ID_MODEL_ID}=="7840", \
 RUN+="/usr/bin/bbvirt detach $env{ID_SERIAL_SHORT} --busnum $env{BUSNUM} --devnum $env{DEVNUM}"

LABEL="bb_end"
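
To make the interaction between this rules file and the "bind" event concrete, here is a deliberately simplified Python model. It is not udev's implementation: `links_from_rules` and `replay` are hypothetical names, and the model assumes only what the debug log shows, namely that a link claimed by the previous event but not by the current one is removed.

```python
def links_from_rules(action):
    """Model of the rules file above: SYMLINK is only set for add/change."""
    if action in ("add", "change"):
        return {"bitbabbler/JAXYAZ"}
    return set()


def replay(actions):
    """Apply events in order; return the links left under /dev afterwards."""
    current = set()
    for action in actions:
        if action == "remove":
            current = set()
            continue
        # The link set is recomputed from the rules for every event, so
        # anything the rules did not request this time is dropped
        # ("no reference left, remove ..." in the debug log).
        current = links_from_rules(action)
    return current
```

In this model `replay(["add"])` keeps the symlink, while `replay(["add", "bind"])` ends with no links at all, which is the symptom reported here.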

@mbiebl mbiebl added the udev label Feb 19, 2018

nikias added a commit to libimobiledevice/usbmuxd that referenced this issue Apr 18, 2018

udev: Work around systemd bug related to bind events on Linux 4.12+
Make sure that udev doesn't lose our properties when bind events come
in, as implemented in kernels 4.12+.

See systemd/systemd#8221
and systemd/systemd#7109

@thewolfwillcome

thewolfwillcome commented Apr 25, 2018

Hacked a patch together which lets udev ignore bind/unbind uevents.
This is a quick fix. It would be better if udev handled only those events it can process properly and ignored any unknown uevent.

udev_ignore_bind_unbind_uevents.patch.txt

@buczek

buczek commented Apr 26, 2018

Don't you think, it would be better, if bind, unbind, change and all future uevents would restore the device data from the database instead of reinitializing it, so that these events were no longer destructive and could be processed by udev rules, too?

#7587 #7109 related.

@thewolfwillcome

thewolfwillcome commented Apr 26, 2018

@buczek: Why not, if it fits better. My patch is, as mentioned, only a hack/quick fix to resolve the bug for me, because this bug also breaks MTP access support in kde-framework: https://bugs.kde.org/show_bug.cgi?id=387454

@arvidjaar

Contributor

arvidjaar commented Apr 27, 2018

it would be better, if bind, unbind, change and all future uevents would restore the device data from the database instead of reinitializing it

This requires auditing every udev rule to make sure it still works properly. E.g. for change events, rules may skip querying the device if they find that the information is already present. As a trivial example:

SUBSYSTEM=="input", ENV{ID_INPUT}=="", IMPORT{builtin}="input_id"
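
A small Python model of why that guard matters, under the assumption that an event starts from properties restored from the database. `apply_input_rule` and `fake_probe` are hypothetical stand-ins; `fake_probe` plays the role of the input_id builtin.

```python
def apply_input_rule(props, probe):
    # Model of the rule quoted above: only import from the builtin when
    # SUBSYSTEM is "input" and ID_INPUT is unset or empty.
    if props.get("SUBSYSTEM") == "input" and props.get("ID_INPUT", "") == "":
        props.update(probe())
    return props


def fake_probe():
    # Hypothetical result of querying the device right now.
    return {"ID_INPUT": "1", "ID_INPUT_KEYBOARD": "1"}


# Fresh event: nothing cached, so the device is queried.
fresh = apply_input_rule({"SUBSYSTEM": "input"}, fake_probe)

# Event that starts from restored database contents: the guard sees
# ID_INPUT already set and skips the query, so the old values survive.
restored = apply_input_rule(
    {"SUBSYSTEM": "input", "ID_INPUT": "1", "ID_INPUT_MOUSE": "1"},
    fake_probe,
)
```

In the second call the device is never re-queried, so whatever was stored earlier is carried forward unchanged, which is the kind of rule that would need auditing.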

@aleksander0m

aleksander0m commented May 27, 2018

Looks like this issue is fully breaking all device-level udev tags managed in ModemManager, e.g. to blacklist non-modem TTY ports... https://lists.freedesktop.org/archives/modemmanager-devel/2018-May/006417.html
This is a very very very very unfortunate issue... I cannot imagine how many different errors this problem will trigger everywhere!
As a quick way to stop all of these errors, at least until a better solution is found, I also suggest having udev ignore bind/unbind events for now.

fdo-mirror pushed a commit to freedesktop/ModemManager that referenced this issue Jun 2, 2018

udev: add tags also on bind action
When a new USB device is hotplugged, e.g. a USB<->RS232 converter that
exposes a single ttyUSB0, these udev events happen:

  add  /devices/pci0000:00/0000:00:14.0/usb2/2-1 (usb/usb-device)
  add  /devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1:1.0 (usb/usb-interface)
  add  /devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1:1.0/ttyUSB0 (usb-serial)
  add  /devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1:1.0/ttyUSB0/tty/ttyUSB0 (tty)
  bind /devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1:1.0/ttyUSB0 (usb-serial)
  bind /devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1:1.0 (usb/usb-interface)
  bind /devices/pci0000:00/0000:00:14.0/usb2/2-1 (usb/usb-device)

Our udev rules in MM only added tags in the 'add' events, and it looks
like the only ones 'persistent' after this sequence are those of the
last event happening on the specific path.

This meant that all TTY subsystem rules (e.g. ID_MM_CANDIDATE) would
be stored for later check (e.g. if ModemManager is started after these
rules have been applied), which was ok. "udevadm info -p ..." would
show these tags correctly always.

But this also meant that the 'bind' udev event happening for the USB
device didn't get any of our device-specific tags, and so we would be
missing them (e.g. ID_MM_DEVICE_MANUAL_SCAN_ONLY) if MM is started
after the last event has happened. "udevadm info -p ..." would
not show these tags.

Modify all our rules to also run at the 'bind' events.

See, for context:
  systemd/systemd#8221

fdo-mirror pushed a commit to freedesktop/ModemManager that referenced this issue Jun 13, 2018

udev: add tags also on bind action

(cherry picked from commit c07382a)

poettering added a commit to poettering/systemd that referenced this issue Nov 29, 2018

udev: treat "unbind" uevents just like "remove"
It's not clear what "unbind" events precisely mean, they appear to be
entirely undocumented and broke all kinds of userspace, but they do
exist:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1455cf8dbfd06aa7651dcfccbadb7a093944ca65

Let's treat them like "remove", since apparently they make stuff
unavailable (?). This is at least along the lines of what people
discussed on systemd#7587 and systemd#8221.

See: systemd#7587 systemd#8221

poettering added a commit to poettering/systemd that referenced this issue Nov 29, 2018

rules: consider "unbind" like "remove" and skip most rules
Not sure what the best approach is here, but given the kernel introduced
a new concept here, we have to follow up, hence let's skip most rules
for "unbind" events too, like we already skip them for "remove".

See: systemd#7587 systemd#8221

@poettering

Member

poettering commented Nov 30, 2018

Quite frankly @aleksander0m, mm should invert those rules files checks: instead of listing "positive" events in your rules file ("add", "change", …), list only the negative event of "remove". This should be more robust, and is what udev does for all its own rules files and code.

@aleksander0m

aleksander0m commented Nov 30, 2018

@poettering thanks for the hint, I'll get to do that for the next release.

@poettering

Member

poettering commented Nov 30, 2018

So, while the kernel addition is problematic, I think we can actually close this specific issue here: rules files should always use a logic where instead of listing positives, they should list negatives. i.e. create your SYMLINK= props in all cases but in "remove". This is what udev's/systemd's own rules files do, and is what should be done in 3rd party files too.

This bug is hence about mishandling of "bind" events; #7587 is a different kind of bug though, it is about mishandling of "unbind" events. I think "bind" events are easy and safe to handle, as described above. It's just key that everybody does that correctly, which is not the case right now.

I don't think we need to change anything about the handling of "bind" in udev/systemd. It sucks it got added in the kernel, and that it broke so much stuff, but it's unlikely to be reverted now, and systemd/udev is not the place to hide/suppress the event, or fake compat where the kernel broke it.

Anyway, closing this one hence. In systemd/udev/logind/… itself "bind" is treated like "change" or "add" everywhere, which means we are good on this. If 3rd party rules files need updating, I am sorry, but please blame the kernel for that, not udev, udev is literally just the messenger there.

@poettering poettering closed this Nov 30, 2018

@mbiebl

Contributor Author

mbiebl commented Nov 30, 2018

I might be missing something, but the original rules posted for bitbabbler do use negatives?

@fbuihuu

Contributor

fbuihuu commented Nov 30, 2018

@mbiebl AFAICS it uses ACTION!="add|change", GOTO="bb_end_add" where positives are used.

@mbiebl

Contributor Author

mbiebl commented Nov 30, 2018

Ok, then I don't understand what's meant here.
@poettering could you be so kind to show where the bit babbler rules file is incorrect and how it is supposed to be fixed?

@poettering

Member

poettering commented Nov 30, 2018

@mbiebl the pasted rules file does this:

ACTION!="add|change", GOTO="bb_end_add"

It lists the 'positive' events "add" and "change" (and thus misses all the others, one of them being "bind").

It should instead list negatives, i.e. "remove":

ACTION=="remove", GOTO="bb_end_add"

This is what all our own rules files do, if you look closely.
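
The difference between the two gates can be checked with a rough Python model. It assumes only that ACTION patterns are "|"-separated glob alternatives; `action_matches` and the two `reaches_body_*` helpers are hypothetical names, not udev code.

```python
from fnmatch import fnmatchcase


def action_matches(action, pattern):
    """True if action matches any '|'-separated alternative in pattern."""
    return any(fnmatchcase(action, alt) for alt in pattern.split("|"))


def reaches_body_positive(action):
    # ACTION!="add|change", GOTO="bb_end_add"  -> skip unless add/change
    return action_matches(action, "add|change")


def reaches_body_negative(action):
    # ACTION=="remove", GOTO="bb_end_add"      -> skip only on remove
    return not action_matches(action, "remove")


ACTIONS = ["add", "change", "bind", "unbind", "remove"]
positive = [a for a in ACTIONS if reaches_body_positive(a)]
negative = [a for a in ACTIONS if reaches_body_negative(a)]
```

Here `positive` contains only "add" and "change", while `negative` also lets "bind" through. Note that "unbind" passes the negative gate as well, which is the concern raised in the replies that follow.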

@aleksander0m

aleksander0m commented Nov 30, 2018

And now it should say "remove|unbind" I assume?

@poettering when you say "rules files should always use a logic where instead of listing positives, they should list negatives", is that a general suggestion always? how is that more robust, if e.g. now we also need to consider possible negatives that may be missing?

The fact that you need to consider "unbind" in your list of negative events (#10998) is the same reason we had to consider "bind" in our list of positive events (this issue #8221).

@aleksander0m

aleksander0m commented Nov 30, 2018

Don't get me wrong, I'm fine with you closing this issue, udev just follows the new kernel features. It's just been very unfortunate that we all had to change udev rules everywhere due to these new bind/unbind events.

@poettering

Member

poettering commented Nov 30, 2018

And now it should say "remove|unbind" I assume?

That's what #10998 does for everything in systemd. But it's not clear that that's the right approach. This needs more discussion, that PR is marked "dont-merge" hence. I have pinged the kernel guys now. But nobody appears to know anything about this. I have never seen any hw needing this, I don't even know what "unbind" really means, as there's no documentation about it. Do you happen to know?

@hadess

Contributor

hadess commented Nov 30, 2018

That's what #10998 does for everything in systemd. But it's not clear that that's the right approach. This needs more discussion, that PR is marked "dont-merge" hence. I have pinged the kernel guys now. But nobody appears to know anything about this. I have never seen any hw needing this, I don't even know what "unbind" really means, as there's no documentation about it. Do you happen to know?

Means a kernel driver was disconnected from the device, usually to disconnect a generic driver from a device to assign it a more specialised one. For example, for a HID driver:
https://github.com/hadess/retrode

@poettering

Member

poettering commented Nov 30, 2018

@aleksander0m See 4b06c40 which did the transition back in 2010 for all files systemd/udev upstream ships.

The commit message doesn't say much, but yeah, it's more robust to consider all events "positive" except for the ones which are known "negative".

@poettering

Member

poettering commented Nov 30, 2018

Don't get me wrong, I'm fine with you closing this issue, udev just follows the new kernel features. It's just been very unfortunate that we all had to change udev rules everywhere due to these new bind/unbind events.

Yeah, this sucks, but blame the kernel guys for that. udev is literally just the messenger there...

@poettering

Member

poettering commented Nov 30, 2018

Means a kernel driver was disconnected from the device, usually to disconnect a generic driver from a device to assign it a more specialised one. For example, for a HID driver:
https://github.com/hadess/retrode

So what precisely happens on "unbind"? do the various sysfs attr previously exported disappear?

@poettering

Member

poettering commented Nov 30, 2018

So what precisely happens on "unbind"? do the various sysfs attr previously exported disappear?

Any chance you can get me a diff of such a device's tree output in sysfs before the "unbind" and between the "unbind" and "remove"?

@poettering

Member

poettering commented Nov 30, 2018

Any chance you can get me a diff of such a device's tree output in sysfs before the "unbind" and between the "unbind" and "remove"?

btw, that's directed to @hadess as much as to @aleksander0m I guess ;-)

@hadess

Contributor

hadess commented Nov 30, 2018

So what precisely happens on "unbind"? do the various sysfs attr previously exported disappear?

All the sysfs attributes added by that driver get removed, and the driver link is also removed.

@jimis

Contributor

jimis commented Nov 30, 2018

In case it helps, here is a forgotten but relevant issue I filed a while ago: https://bugzilla.redhat.com/show_bug.cgi?id=1584876

To summarize, connecting my mobile phone to import photos broke with upgrade to F28, because:

  • ENV{ID_GPHOTO2}="1" was being set with the "add" event
  • it was being removed immediately afterwards with the "bind" event

Workaround was to apply this small patch to the udev rules:

diff /usr/lib/udev/rules.d/40-libgphoto2.rules{.orig,}
7c7
< ACTION!="add", GOTO="libgphoto2_rules_end"
---
> ACTION!="add|bind", GOTO="libgphoto2_rules_end"

@poettering

Member

poettering commented Nov 30, 2018

So what precisely happens on "unbind"? do the various sysfs attr previously exported disappear?

All the sysfs attributes added by that driver get removed, and the driver link is also removed.

Hmm, ok so this means that at the moment of the "unbind" the device becomes just a "skeleton" if you so will? It's still there, and it is still a device, but it's not really usable and has no useful properties?

Devices that know bind/unbind will come up in this skeleton state, then with the "bind" become useful, and then with "unbind" enter the skeleton state again? But of course, we never know in advance whether a device is going to issue "bind" or not, right? i.e. when a device comes up it could be fully working or it could be in "skeleton" state, but only if we ever see "bind" do we know for sure, and for devices that don't issue that we'll never know? What a shitty kernel API...

So I am pretty sure apps should treat "unbind" like "remove" and stop using the device. After all there's the guarantee that after this event the device is just in this "skeleton" state and not useful anymore.

Question remains: what to do about the rules and what about udev. udev has this special hack to reuse the last event's device db on "remove", to ensure that clients that watched a device can still match properly for the same props/tags to get notified about "remove" events. Question is what to do for "unbind" there. "unbind" is two things after all: the end of the phase where the device was good, and the beginning of the phase where the device is just this weird "skeleton". Subscribed clients want to use the props/tags from when it was good to match events so that they know when it becomes unusable to them. OTOH we don't want to store that stuff, the database should reflect the new state, not what once was: whoever enumerates the db after the event should see the correct, new "skeleton" data, not the data that once was. Hence there are two conflicting requirements, and I have no idea what to do with this.

sourcejedi added a commit to sourcejedi/systemd that referenced this issue Dec 8, 2018

udev: abort non-remove event processing if device is already removed
If the device is already removed, we will not be able to load identifying
information from the device :-).  So we will fail to calculate the correct
tag.  Abort processing before we save the incorrect data to the DB!  Also
don't broadcast the incomplete event or run any of the collected RUN
programs.

This avoids losing tags in the DB, which we need for the "remove" event.
On the remove event, we will broadcast the event with the tags (and
other properties) loaded from the DB.

This fixes the systemd part of systemd#7587 / systemd#8221.

My test case is to plug and unplug a USB printer, while running on
kernel version 4.12+.  Before this fix, the device would remain active in
`systemctl list-devices *usb*.device`. After this fix, the device unit is
removed correctly.

If there are third-party rules which add tags (are there any of these??),
they can match our behaviour by handling all unknown events the same as
"change" events.  This is *already required* on kernels 4.12+ (due to the
addition of "bind" and "unbind" events, combined with a change to
systemd-udev which allowed "bind" and "unbind" events to be executed with
all of the same features as "change" events).  If your rules skip unknown
events, you will lose symlinks and tags, for example.  Losing tags is
*not allowed for* by the libudev filter design.   (Again, I don't know if
anything really uses tags outside of the systemd project).

@dtor

Contributor

dtor commented Dec 8, 2018

So what precisely happens on "unbind"? do the various sysfs attr previously exported disappear?

All the sysfs attributes added by that driver get removed, and the driver link is also removed.

Hmm, ok so this means that at the moment of the "unbind" the device becomes just a "skeleton" if you so will? It's still there, and it is still a device, but it's not really usable and has no useful properties?

Devices that know bind/unbind will come up in this skeleton state, then with the "bind" become useful, and then with the "unbind" enter the skeleton state again? But of course, we never know in advance whether a device is going to issue "bind" or not, right? i.e. when a device comes up it could be fully working or it could be in "skeleton" state, but only if we ever see "bind" we know for sure, and for devices that don't issue that we'll never know? What a shitty kernel API...

Nice comments we have here :)

Anyway, it is indeed unfortunate that it is hard to determine whether you will get a BIND event or not. I think that if a device is on a physical bus then you should expect a BIND event, and virtual (or class) devices will not have it. So maybe checking whether the subsystem link points to a bus or a class could help?
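To make the bus-versus-class idea concrete, here is a rough sketch of the check. It assumes the usual sysfs layout where a device's `subsystem` symlink resolves to `.../bus/<name>` or `.../class/<name>`; the helper names are made up for illustration, and nothing here says the kernel guarantees a BIND event for bus devices.

```python
import os
import posixpath


def subsystem_kind(link_target: str) -> str:
    """Classify a sysfs 'subsystem' symlink target as 'bus', 'class' or 'unknown'."""
    parts = posixpath.normpath(link_target).split("/")
    # The target is expected to end in .../bus/<name> or .../class/<name>.
    if len(parts) >= 2 and parts[-2] in ("bus", "class"):
        return parts[-2]
    return "unknown"


def device_subsystem_kind(syspath: str) -> str:
    """Hypothetical helper: read the link for a real device directory under /sys."""
    return subsystem_kind(os.readlink(os.path.join(syspath, "subsystem")))


if __name__ == "__main__":
    print(subsystem_kind("../../../../bus/usb"))
    print(subsystem_kind("../../../class/input"))
```

Even if this classification works, it is only a heuristic for "might see bind/unbind", not a documented contract.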

One thing is certain - if you get unbind then consumers should consider the device in question as non-functioning.

So I am pretty sure apps should treat "unbind" like "remove" and stop using the device. After all there's the guarantee that after this event the device is just in this "skeleton" state and not useful anymore.

Depends on the app of course, but generally I agree.

Question remains: what to do about the rules and what about udev. udev has this special hack to reuse the last event's device db on "remove", to ensure that clients that watched a device can still match properly for the same props/tags to get notified about "remove" events. Question is what to do for "rebind" there. "rebind" is two things after all: the end of the phase where the device was good, and the beginning of the phase where the device is just this weird "skeleton". Subscribed clients want to use the props/tags from when it was good to match events so that they know when it becomes unusable to them. OTOH we don't want to store that stuff, the database should reflect the new state, not what once was: whoever enumerates the db after the event should see the correct, new "skeleton" data, not the data that once was. Hence there are two conflicting requirements, and I have no idea what to do with this.

Can we remove tags? For practical purposes I doubt "bind" actions would add any, but we could ask for "unbind" actions to clear them. Another option is to "tag" the tags with the action that produced the tag/attribute, and clear them when the matching event is received. I.e. clear tags that were added by "add" on "remove", and clear tags from "bind" on "unbind". Assume that "change" == "add" for tagging purposes here.
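A minimal sketch of that second option, purely as a model of the bookkeeping and not udev code: each tag is stored under the action that produced it, "change" is folded into "add", and the closing action drops the matching group. The class and method names are hypothetical.

```python
from collections import defaultdict

# Which earlier action's tags each closing action clears, per the proposal.
CLEARS = {"remove": "add", "unbind": "bind"}


class TagStore:
    """Toy model: remember which action produced each tag."""

    def __init__(self):
        self._by_action = defaultdict(set)

    def handle(self, action, new_tags=()):
        if action in CLEARS:
            # Closing events only clear; tags passed with them are ignored here.
            self._by_action.pop(CLEARS[action], None)
            return
        # "change" counts as "add" for tagging purposes.
        origin = "add" if action == "change" else action
        self._by_action[origin].update(new_tags)

    def tags(self):
        return set().union(*self._by_action.values())


if __name__ == "__main__":
    store = TagStore()
    store.handle("add", {"systemd"})
    store.handle("bind", {"driver-ready"})
    store.handle("unbind")
    print(sorted(store.tags()))
```

This model keeps the "add" tags across a bind/unbind cycle, which is the property the proposal is after.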

sourcejedi added a commit to sourcejedi/systemd that referenced this issue Dec 8, 2018

udev: abort non-remove event processing if device is already removed
If the device is already removed, we will not be able to load identifying
information from the device :-).  So we will fail to calculate the correct
tag.  Abort processing before we save the incorrect data to the DB!  Also
don't broadcast the incomplete event or run any of the collected RUN
programs.

This avoids losing tags in the DB, which we need for the "remove" event.
On the remove event, we will broadcast the event with the tags (and
other properties) loaded from the DB.

My test case is to plug and unplug a USB printer, while running on
kernel version 4.12+.  Before this fix, the device would remain active in
`systemctl list-devices *usb*.device`. After this fix, the device unit is
removed correctly.

This fixes the systemd part of systemd#7587 / systemd#8221.

Third-party rules which add tags (e.g. to create systemd device units)
can match our behaviour by handling all unknown events the same as
"change" events.  This is *already required* on kernels 4.12+ (due to the
addition of "bind" and "unbind" events, combined with a change to
systemd-udev which allowed "bind" and "unbind" events to be executed with
all of the same features as "change" events).  If your rules skip unknown
events, you will lose symlinks and tags, for example.

Note, removing tags from devices is *not allowed for* by the libudev filter design.
(I don't know if anything really consumes tags outside of the systemd project though).

@vcaputo

Member

vcaputo commented Dec 8, 2018

Nice comments we have here :)

I wouldn't be so righteous

"Anyway, if we come to an agreement on this I will look into getting systemd handle this nicely." - @dtor 2/14/2017 [1]

The frustration being expressed surrounding this breakage is authentic and arguably justified. Had you followed through on your commitment quoted above, it all could have been prevented.

Sometimes things fall through the cracks, people get busy, etc. It's understandable. But for them to then appear nearly two years later acting like they didn't drop the ball, not so much.

[1] https://lore.kernel.org/patchwork/patch/759829/#950559

@dtor

Contributor

dtor commented Dec 8, 2018

@Pointedstick

Pointedstick commented Dec 9, 2018

So what's the short version for dumbos like me? Was this fixed in udev, or are downstream changes required? Or are we still debating the fix?

@boucman

Contributor

boucman commented Dec 9, 2018

debating the fix... it's not clear at this point if the fix is on the kernel side, udev side, udev-consumers side, multiple or none of the above...

@dtor

Contributor

dtor commented Dec 9, 2018

Something like this: #11101

It will simply cause systemd to ignore the bind/unbind events and hopefully prevent it from flushing the old state. Note that while this compiles I have not run it ;)

Once we change systemd/udev to accumulate the state instead of resetting it (save for change events I guess) we can revert this patch.

@poettering

Member

poettering commented Dec 10, 2018

Anyway, it is indeed unfortunate that it is hard to determine whether you will get a BIND event or not. I think that if a device is on a physical bus then you should expect a BIND event, and virtual (or class) devices will not have it. So maybe checking whether the subsystem link points to a bus or a class could help?

We need something reliable, something we can count on to be stable kernel API. Something with fewer question marks and ideally some documentation in the kernel tree that this behaviour is official behaviour.

@poettering

Member

poettering commented Dec 10, 2018

Can we remove tags? For practical purposes I doubt "bind" actions would add any, but we could ask for "unbind" actions to clear them. Another option is to "tag" the tags with the action that produced the tag/attribute, and clear them when the matching event is received. I.e. clear tags that were added by "add" on "remove", and clear tags from "bind" on "unbind". Assume that "change" == "add" for tagging purposes here.

Tags currently never survive uevents: whenever a uevent is seen they need to be re-applied, or they'll vanish. That's kinda the problem here... So you remove them by not having TAG+= in the rules files for a specific uevent.
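To illustrate why that bites with the new events, here is a toy model (not udev's implementation) in which the stored tag set is rebuilt from scratch on every uevent. The rule sets and the tag name `mytag` are hypothetical.

```python
def tags_for_event(action, rules):
    """Tags the rules assign for this one uevent; nothing carries over."""
    return {tag for matches, tag in rules if matches(action)}


def replay(events, rules):
    """Process events in order; the stored tag set is replaced on each one."""
    db_tags = set()
    for action in events:
        db_tags = tags_for_event(action, rules)
    return db_tags


# Hypothetical rule that only tags on "add": the tag vanishes on "bind".
ADD_ONLY = [(lambda action: action == "add", "mytag")]
# Hypothetical rule that tags on everything except "remove".
NOT_REMOVE = [(lambda action: action != "remove", "mytag")]

if __name__ == "__main__":
    print(replay(["add", "bind"], ADD_ONLY))
    print(replay(["add", "bind"], NOT_REMOVE))
```

In this model a rule that matches only "add" ends up with no tags once a "bind" arrives, while a rule that matches everything but "remove" keeps them.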

@dtor

Contributor

dtor commented Dec 10, 2018

hadess added a commit to hadess/libgphoto2 that referenced this issue Dec 11, 2018

print-camera-list: Fix udev rules for linux-4.14+
Since commit 1455cf8dbfd0 ("driver core: emit uevents when
device is bound to a driver") the kernel started emitting
"bind" and "unbind" uevents which confuse the libgphoto2
udev rules.

This caused ID_GPHOTO2 and GPHOTO2_DRIVER udev properties not being set
on devices, causing them not to be visible to user-space that uses those
properties (such as gvfs' gphoto2 backend).

See systemd/systemd#8221

poettering added a commit to poettering/systemd that referenced this issue Dec 13, 2018

udev: make tags "eternal"
This tries to address the "bind"/"unbind" uevent kernel API breakage, by
changing the semantics of device tags.

Previously, tags would be applied on uevents (and the database entries
they result in) only depending on the immediate context. This means that
if one uevent causes the tag to be set and the next to be unset, this
would immediately affect what apps would see and the database entries
would contain each time. This is problematic however, as tags are a
filtering concept, and if tags vanish then clients won't hence notice
when a device stops being relevant to them since not only the tags
disappear but immediately also the uevents for it are filtered including
the one necessary for the app to notice that the device lost its tag and
hence relevance.

With this change tags become "eternal". If a tag is once
applied to a device it will stay in place forever, until the device is
removed. Tags can never be removed again. This means that an app
watching a specific set of devices by filtering for a tag is guaranteed
to not only see the events where the tag is set but also all follow-up
events where the tags might be removed again.

This change of behaviour is unfortunate, but is required due to the
kernel introducing new "bind" and "unbind" uevents that generally have
the effect that tags and properties disappear and apps hence don't
notice when a device loses relevance to them. "bind"/"unbind" events were
introduced in kernel 4.12, and are now used in more and more subsystems.
The introduction broke userspace widely, and this commit is an attempt
to provide a way for apps to deal with it.

While tags are now "eternal" a new automatic device property
CURRENT_TAGS is introduced (matching the existing TAGS property) that
always reflects the precise set of tags applied on the most recent
events. Thus, when subscribing to devices through tags, all devices that
ever had the tag put on them will be seen, and by CURRENT_TAGS it may
be checked whether the device right at the moment matches the tag
requirements.

See: systemd#7578 systemd#7018 systemd#8221
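The semantics described in that commit message can be modelled roughly as follows. This is a sketch of the idea only, with made-up class and function names; the attributes `tags` and `current_tags` stand in for the TAGS and CURRENT_TAGS properties.

```python
class Device:
    """Toy model of the 'eternal tags' semantics."""

    def __init__(self):
        self.tags = set()          # eternal: only grows until the device is removed
        self.current_tags = set()  # tags applied by the most recent event only

    def process(self, tags_from_rules):
        self.current_tags = set(tags_from_rules)
        self.tags |= self.current_tags


def matches_filter(device, tag):
    """A tag-based subscription keeps matching once the tag was ever set."""
    return tag in device.tags


if __name__ == "__main__":
    dev = Device()
    dev.process({"systemd"})  # e.g. the "add" event sets the tag
    dev.process(set())        # e.g. a later event whose rules set no tag
    print(matches_filter(dev, "systemd"), "systemd" in dev.current_tags)
```

A subscriber filtering on the tag still receives the later event, and can consult the current set to decide whether the device is relevant right now.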
@sourcejedi

Contributor

sourcejedi commented Dec 23, 2018

What a shitty kernel API

Nice comments we have here :).

I see this was left without apparent response. And the same word was later repeated in a comment on one of the other associated issues.

@poettering Please do not use that word any more. It reads as intended to provoke. It encourages antagonistic behaviour in other discussions. Both are undesirable.

I know everyone has been frustrated by the problem, and maybe also from the userspace breakage around mknod() behaviour. But I see this word as inappropriate, both times it was used. It was antagonistic towards kernel developers, after you had invited kernel developers to look at the discussions. This makes things harder for everyone. Whether or not a kernel developer happens to take offence in any one case.


I am not sure why this needed to be pointed out. I am not very practised at doing so. It made things harder for me, at least.

@rugubara

rugubara commented Dec 28, 2018

I'm afraid to be a lonely voice here. I relied on the bind event to re-configure my Elantech touchpad. The default config didn't work for me, so I fired an echo command to change the protocol when the device was bound.

After upgrading to systemd-240, this no longer works for me and I'm lost. Trying to do the same on the add event doesn't work. My rule always fires before the default rule that loads the module - i.e. I have nothing in sysfs yet to respond to my write.

I can't understand the details of the discussion. I suffer from unsolicited removal of an important feature.

Since I'm on a source-based Gentoo distrib, I reversed your PR for me. Can I raise a humble request? When you remove a function, announce it in the change log.
When you make a dirty fix like this, can you make a parameter/config file option to undo your fix w/o patching and rebuilding?

@thewolfwillcome

thewolfwillcome commented Dec 28, 2018

I'm afraid to be a lonely voice here. I relied on the bind event to re-configure my Elantech touchpad. The default config didn't work for me, so I fired an echo command to change the protocol when the device was bound.

After upgrading to systemd-240, this no longer works for me and I'm lost. Trying to do the same on the add event doesn't work. My rule always fires before the default rule that loads the module - i.e. I have nothing in sysfs yet to respond to my write.

I can't understand the details of the discussion. I suffer from unsolicited removal of an important feature.

Since I'm on a source-based Gentoo distrib, I reversed your PR for me. Can I raise a humble request? When you remove a function, announce it in the change log.
When you make a dirty fix like this, can you make a parameter/config file option to undo your fix w/o patching and rebuilding?

The udev rules are handled in lexical order, regardless of the directory in which they live. Try putting e.g. 99 as a prefix on your udev rule file name. Then it should be parsed after the udev rule which loads the kernel module.
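As a rough illustration of that ordering (a sketch, not how udev is implemented): sort the rule files by file name alone and ignore the directory. The paths below are hypothetical examples, and the sketch leaves out the case of two files with the same name in different directories.

```python
import posixpath


def processing_order(paths):
    """Order rule files by file name only, ignoring which directory they are in."""
    return sorted(paths, key=posixpath.basename)


if __name__ == "__main__":
    files = [
        "/etc/udev/rules.d/99-touchpad.rules",      # hypothetical local rule
        "/usr/lib/udev/rules.d/80-drivers.rules",   # hypothetical module-loading rule
        "/etc/udev/rules.d/10-local.rules",         # hypothetical early local rule
    ]
    for path in processing_order(files):
        print(path)
```

With a 99 prefix the local rule sorts last, so it runs after the rule file that loads the module.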
