Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

udev crashes on boot when handling bind action #11319

Closed
iv-m opened this issue Jan 3, 2019 · 5 comments
Closed

udev crashes on boot when handling bind action #11319

iv-m opened this issue Jan 3, 2019 · 5 comments
Labels
regression ⚠️ A bug in something that used to work correctly and broke through some recent commit udev

Comments

@iv-m
Copy link

iv-m commented Jan 3, 2019

systemd version the issue has been seen with

240

Used distribution

ALT Sisyphus

$ uname -a
Linux titan.localdomain 4.19.12-un-def-alt1 #1 SMP PREEMPT Wed Dec 26 12:37:35 UTC 2018 x86_64 GNU/Linux

Expected behaviour you didn't see

system boots normally

Unexpected behaviour you saw

system does not boot as udev crashes.

More details

After upgrading systemd, udev and stuff from 239 to 240, my laptop could not boot normally anymore. Rolling back to 239 helps. Below is what looking at coredumps reveals.

Backtrace:

#0  __strrchr_avx2 () at ../sysdeps/x86_64/multiarch/strrchr-avx2.S:54
#1  0x0000559068f38f81 in device_set_sysname (device=0x55906ab4bd90) at ../src/libsystemd/sd-device/sd-device.c:1005
#2  0x0000559068f3af95 in sd_device_get_sysname (device=device@entry=0x55906ab4bd90, ret=ret@entry=0x7ffc16583390) at ../src/libsystemd/sd-device/sd-device.c:1049
#3  0x0000559068f37b66 in device_append (device=device@entry=0x55906ab4bd90, key=<optimized out>, key@entry=0x7ffc165835f5 "ACTION", _major=_major@entry=0x7ffc16583418,
    _minor=_minor@entry=0x7ffc16583420, _seqnum=_seqnum@entry=0x7ffc16583428, _action=_action@entry=0x7ffc1658340c) at ../src/libsystemd/sd-device/device-private.c:333
#4  0x0000559068f37cc4 in device_new_from_nulstr (ret=ret@entry=0x7ffc16583490, nulstr=nulstr@entry=0x7ffc165835f5 "ACTION", len=len@entry=162)
    at ../src/libsystemd/sd-device/device-private.c:450
#5  0x0000559068f3536d in device_monitor_receive_device (m=m@entry=0x55906ab50be0, ret=ret@entry=0x7ffc16585610) at ../src/libsystemd/sd-device/device-monitor.c:453
#6  0x0000559068f35753 in device_monitor_event_handler (s=<optimized out>, fd=<optimized out>, revents=<optimized out>, userdata=0x55906ab50be0)
    at ../src/libsystemd/sd-device/device-monitor.c:200
#7  0x0000559068f45e90 in source_dispatch (s=s@entry=0x55906ab28e40) at ../src/libsystemd/sd-event/sd-event.c:2816
#8  0x0000559068f4753f in sd_event_dispatch (e=e@entry=0x55906ab35680) at ../src/libsystemd/sd-event/sd-event.c:3229
#9  0x0000559068f476f0 in sd_event_run (e=e@entry=0x55906ab35680, timeout=timeout@entry=18446744073709551615) at ../src/libsystemd/sd-event/sd-event.c:3286
#10 0x0000559068f4790b in sd_event_loop (e=0x55906ab35680) at ../src/libsystemd/sd-event/sd-event.c:3308
#11 0x0000559068f0ec6e in main_loop (cgroup=<optimized out>, fd_uevent=3, fd_ctrl=<optimized out>) at ../src/udev/udevd.c:1739
#12 run (argv=<optimized out>, argc=<optimized out>) at ../src/udev/udevd.c:1861
#13 main (argc=<optimized out>, argv=<optimized out>) at ../src/udev/udevd.c:1864

Pieces of nullstr from frame 4:

5: nulstr = (uint8_t *) 0x7ffc165835f5 "ACTION"
6: nulstr + 7 = (uint8_t *) 0x7ffc165835fc "bind"
8: nulstr + 12 = (uint8_t *) 0x7ffc16583601
"DEVPATH=/devices/pci0000:00/0000:00:1c.4/0000:3b:00.0/mdio_bus/r8169-3b00/r8169-3b00:00"
12: nulstr + 100 = (uint8_t *) 0x7ffc16583659 "SUBSYSTEM=mdio_bus"
18: nulstr + 119 = (uint8_t *) 0x7ffc1658366c "DEVTYPE=PHY"
23: nulstr + 131 = (uint8_t *) 0x7ffc16583678 "DRIVER=Generic PHY"
27: nulstr + 150 = (uint8_t *) 0x7ffc1658368b "SEQNUM=3181"

It looks like a bind action from ethernet controller:

$ lspci  | grep ^3b
3b:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411
PCI Express Gigabit Ethernet Controller (rev 15)
@iv-m
Copy link
Author

iv-m commented Jan 3, 2019

Here's what apparently happens:

device_new_from_nulstr allocates empty device structure:
https://github.com/systemd/systemd/blob/v240/src/libsystemd/sd-device/device-private.c#L434

device_append tries to log that it ignores actions 'bind' and 'unbind':
https://github.com/systemd/systemd/blob/v240/src/libsystemd/sd-device/device-private.c#L333

internally, the logging macro tries to get the system name of the device:
https://github.com/systemd/systemd/blob/v240/src/libsystemd/sd-device/device-util.h#L41

which gets to device_set_sysname, where it crashes:
https://github.com/systemd/systemd/blob/v240/src/libsystemd/sd-device/sd-device.c#L1005

The problem is that device->devpath is initially NULL, when the device structure is allocated, and is still NULL when the control reaches device_set_sysname; it is not set or checked anywhere.

I think device_set_sysname should check if devpath is null, and return -EINVAL if it's not. I'll validate if that helps.

@yuwata yuwata added udev regression ⚠️ A bug in something that used to work correctly and broke through some recent commit has-pr ✨ labels Jan 3, 2019
@iv-m
Copy link
Author

iv-m commented Jan 3, 2019

I'll validate if that helps.

Yup, it boots now. I'll file a pull request.

@yuwata
Copy link
Member

yuwata commented Jan 3, 2019

I hope #11244 fixes this, especially 18fee12.

@iv-m
Copy link
Author

iv-m commented Jan 3, 2019

I hope #11244 fixes this, especially 18fee12.

I think it does. I've made a very similar change (I've used assert_return, but that should not matter), and works now.

@yuwata
Copy link
Member

yuwata commented Jan 3, 2019

#11244 was merged. Let's close this.

@yuwata yuwata closed this as completed Jan 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression ⚠️ A bug in something that used to work correctly and broke through some recent commit udev
Development

No branches or pull requests

2 participants