Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression Since 2.7.6: 100% CPU Usage due to an uevent loop when MMP (Multiple Mount Protection) is Enabled #946

Closed
dypsilon opened this issue Dec 31, 2021 · 16 comments · Fixed by #949

Comments

@dypsilon
Copy link

dypsilon commented Dec 31, 2021

Hello udisks maintainers! Thank you for all the hard work maintaing this beautiful package and developing new features.

UPDATE: see this comment for a workaround #946 (comment)

Upon logging into an iSCSI target with iscsiadm --mode node --targetname [REDACTED] --login I observe that my network is maxed and the CPU usage is very high. udevadm monitor is showing a constant stream of change events.

KERNEL[7333.233077] change   /devices/platform/host0/session16/target0:0:0/0:0:0:1/block/sda (block)
UDEV  [7333.450942] change   /devices/platform/host0/session16/target0:0:0/0:0:0:1/block/sda (block)

After a lot of digging and with some luck, I was able to narrow the issue down to udisks2. The spamming stops once I log out from the target or stop the udisks systemd service. Here is some information about my system:

Linux 5.15.11-200.fc35.x86_64 #1 SMP Wed Dec 22 15:41:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Fedora release 35 (Thirty Five)
udisks2                  x86_64               2.9.4-1.fc35
libudisks2              x86_64               2.9.4-1.fc35

I am using the KDE spin of fc35. Also reproduced the same behavior on Manjaro XFCE.

Downgrading to 2.7.5 resolves the issue, so I suppose it was introduced in 2.7.6. Is there some better workaround or some config I have to apply to fix this issue?

@dypsilon dypsilon changed the title Regression Since 2.7.5: High CPU Usage + Network Usage with iSCSI targets Regression Since 2.7.6: High CPU Usage + Network Usage with iSCSI targets Dec 31, 2021
@tbzatek
Copy link
Member

tbzatek commented Dec 31, 2021

Interesting issue. UDisks itself has an iscsi module based on downstream libiscsi patch within iscsi-initiator-utils (the open-iscsi project you're using anyway). This comes with a testsuite, running on various Fedora releases. So this surprises me a bit.

Can you check udisksctl monitor and journalctl for any suspicious udisksd messages? This looks like an uevent loop, something is constantly trigerring a change. If you log out from the desktop, does it make a difference? When you're logged on the target, is there any complex layout of the block device present in the system? Could you also post udisksctl dump and lsblk outputs?

@dypsilon
Copy link
Author

dypsilon commented Jan 1, 2022

Thanks for a quick response. Here is the output from the diagnosis tools. I redacted the parts that looked sensitive. Please feel free to request additional info if something is missing:

udiskctl monitor

06:34:49.697: Added /org/freedesktop/UDisks2/drives/LIO_ORG_[REDACTED]
  org.freedesktop.UDisks2.Drive:
    CanPowerOff:                false
    Configuration:              {}
    ConnectionBus:              
    Ejectable:                  false
    Id:                         LIO-ORG-[REDACTED]
    Media:                      
    MediaAvailable:             true
    MediaChangeDetected:        true
    MediaCompatibility:         
    MediaRemovable:             false
    Model:                      [REDACTED]
    Optical:                    false
    OpticalBlank:               false
    OpticalNumAudioTracks:      0
    OpticalNumDataTracks:       0
    OpticalNumSessions:         0
    OpticalNumTracks:           0
    Removable:                  false
    Revision:                   4.0
    RotationRate:               -1
    Seat:                       seat0
    Serial:                     [REDACTED]
    SiblingId:                  
    Size:                       511093309440
    SortKey:                    01hotplug/1641015289695950
    TimeDetected:               1641015289695950
    TimeMediaDetected:          1641015289695950
    Vendor:                     LIO-ORG
    WWN:                        [REDACTED]
06:34:49.732: Added /org/freedesktop/UDisks2/block_devices/sda
  org.freedesktop.UDisks2.Block:
    Configuration:              []
    CryptoBackingDevice:        '/'
    Device:                     /dev/sda
    DeviceNumber:               2048
    Drive:                      '/org/freedesktop/UDisks2/drives/LIO_ORG_[REDACTED]'
    HintAuto:                   false
    HintIconName:               
    HintIgnore:                 false
    HintName:                   
    HintPartitionable:          true
    HintSymbolicIconName:       
    HintSystem:                 true
    Id:                         by-id-scsi-[REDACTED]
    IdLabel:                    [REDACTED]
    IdType:                     ext4
    IdUUID:                     [REDACTED]
    IdUsage:                    filesystem
    IdVersion:                  1.0
    MDRaid:                     '/'
    MDRaidMember:               '/'
    PreferredDevice:            /dev/sda
    ReadOnly:                   false
    Size:                       511093309440
    Symlinks:                   /dev/disk/by-id/scsi-[REDACTED]
                                /dev/disk/by-id/wwn-[REDACTED]
                                /dev/disk/by-label/[REDACTED]
                                /dev/disk/by-path/[REDACTED]
                                /dev/disk/by-uuid/[REDACTED]
    UserspaceMountOptions:      
  org.freedesktop.UDisks2.Filesystem:
    MountPoints:        
    Size:               511093309440

journalctl

Jan 01 06:37:13 p14s kernel: scsi host0: iSCSI Initiator over TCP/IP
Jan 01 06:37:13 p14s audit[1597]: AVC avc:  denied  { sys_ptrace } for  pid=1597 comm="iscsid" capability=19  scontext=system_u:system_r:iscsid_t:s0 tcontext=system_u:system_r:iscsid_t:s0 tclass=cap_userns permissive=0
Jan 01 06:37:13 p14s iscsid[1597]: iscsid: Connection2:0 to [target: [REDACTED], portal: [REDACTED]] through [iface: default] is operational now
Jan 01 06:37:13 p14s kernel: scsi 0:0:0:1: Direct-Access     LIO-ORG  [REDACTED]           4.0  PQ: 0 ANSI: 5
Jan 01 06:37:13 p14s kernel: sd 0:0:0:1: Attached scsi generic sg0 type 0
Jan 01 06:37:13 p14s kernel: sd 0:0:0:1: [sda] 998229120 512-byte logical blocks: (511 GB/476 GiB)
Jan 01 06:37:13 p14s kernel: sd 0:0:0:1: [sda] Write Protect is off
Jan 01 06:37:13 p14s kernel: sd 0:0:0:1: [sda] Mode Sense: 43 00 10 08
Jan 01 06:37:13 p14s kernel: sd 0:0:0:1: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 01 06:37:13 p14s kernel: sd 0:0:0:1: [sda] Optimal transfer size 65535 logical blocks > dev_max (1024 logical blocks)
Jan 01 06:37:14 p14s kernel: sd 0:0:0:1: [sda] Attached SCSI disk

udiskctl dump

/org/freedesktop/UDisks2/drives/LIO_ORG_[REDACTED]:
  org.freedesktop.UDisks2.Drive:
    CanPowerOff:                false
    Configuration:              {}
    ConnectionBus:              
    Ejectable:                  false
    Id:                         LIO-ORG-[REDACTED]
    Media:                      
    MediaAvailable:             true
    MediaChangeDetected:        true
    MediaCompatibility:         
    MediaRemovable:             false
    Model:                      [REDACTED]
    Optical:                    false
    OpticalBlank:               false
    OpticalNumAudioTracks:      0
    OpticalNumDataTracks:       0
    OpticalNumSessions:         0
    OpticalNumTracks:           0
    Removable:                  false
    Revision:                   4.0
    RotationRate:               -1
    Seat:                       seat0
    Serial:                     [REDACTED]
    SiblingId:                  
    Size:                       511093309440
    SortKey:                    01hotplug/1641015434313501
    TimeDetected:               1641015434313501
    TimeMediaDetected:          1641015434313501
    Vendor:                     LIO-ORG
    WWN:                        [REDACTED]

lsblk -O

NAME KNAME PATH      MAJ:MIN FSAVAIL FSSIZE FSTYPE FSUSED FSUSE% FSROOTS FSVER MOUNTPOINT MOUNTPOINTS PTUUID PTTYPE PARTTYPE PARTTYPENAME PARTLABEL PARTUUID PARTFLAGS  RA RO RM HOTPLUG SIZE STATE   OWNER GROUP MODE       ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE TYPE  DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO WSAME RAND PKNAME    HCTL       TRAN   SUBSYSTEMS  REV VENDOR   ZONED DAX
sda  sda   /dev/sda    8:0                  ext4                         1.0                                                                                           128  0  0       0 476G running root  disk  brw-rw----         0    512      0     512     512    1 bfq       226 disk         0        0B       0B         0   32M    1           0:0:0:1    iscsi  block:scsi 4.0  LIO-ORG  none    0

@dypsilon
Copy link
Author

dypsilon commented Jan 1, 2022

Here are some details about the target machine:

uname -a

Linux 5.10.27-v8-iscsi+ #1 SMP PREEMPT Sun Apr 18 15:28:32 CEST 2021 aarch64 GNU/Linux
/backstores/block/[REDACTED]> info
dev: /dev/disk/by-label/[REDACTED]
name: [REDACTED]
plugin: block
readonly: False
write_back: False
wwn: [REDACTED]

/backstores/block/[REDACTED]> status
Status for /backstores/block/[REDACTED]: /dev/disk/by-label/[REDACTED] (0 bytes) write-thru activated

/backstores/b...ault_tg_pt_gp> status
Status for /backstores/block/[REDACTED]/alua/default_tg_pt_gp: ALUA state: Active/optimized

/backstores/b...ault_tg_pt_gp> info
alua_access_state: 0
alua_access_status: 0
alua_access_type: 3
alua_support_active_nonoptimized: 1
alua_support_active_optimized: 1
alua_support_offline: 1
alua_support_standby: 1
alua_support_transitioning: 1
alua_support_unavailable: 1
alua_write_metadata: 0
implicit_trans_secs: 0
name: default_tg_pt_gp
nonop_delay_msecs: 100
preferred: 0
tg_pt_gp_id: 0
trans_delay_msecs: 0

The disk is partitioned with a normal ext4 fs. There is something special: the disk is also accessible on the target system via a loopback interface for backups. I tried removing it from the loopback pool but it didn't solve the issue.

@dypsilon
Copy link
Author

dypsilon commented Jan 1, 2022

Logging out from the desktop session didn't change anything. As long as the systemd service is running, iptraf is showing a lot of traffic on the network (as much as the channel allows). I tried to access the disk in a virtual machine. It works without any issues on a minimal fedora without any desktop, but there is also no udisks2 installed. As mentioned the same issue is observed on Manjaro 21.2.0 XFCE.

I tried to connect the target machine to a different router. Didn't change the behaviour.

@tbzatek
Copy link
Member

tbzatek commented Jan 1, 2022

Thanks for the outputs. I don't see anything unusual, just tried to reproduce your scenario here, with an unpartitioned ext4-formatted iscsi volume and everything works as expected. Target is my Gentoo host with targetcli, initiator is F35 with a development udisks version (that is very close to 2.9.4 release anyway). No extra traffic observed, tried mounting via udisks.

So you're saying that you see constant flow of change events in udevadm monitor but only single record from udiskctl monitor?

I'm kinda out of ideas here, the next step would be debugging udisksd. You may try attaching strace to the running daemon and check if you see any writes to /sys/block/sda/uevent (or grep for /uevent). If you even see data string like UDISKSSERIAL=, it's clear it's coming from us.

@tbzatek
Copy link
Member

tbzatek commented Jan 1, 2022

Another possibility is attaching gdb and setting breakpoint for trigger_uevent. Getting backtrace would reveal the source of the problem.

@dypsilon
Copy link
Author

dypsilon commented Jan 1, 2022

I went a different path and placed the drive into a different machine with ubuntu server (it has udisks2 installed). Immediately after the boot I noticed 100% CPU usage caused by an uevent loop, which was happening only for two partitions out of five. Those are the exact partitions I was using with iSCSI. To avoid multiple mounts on those partitions, which is possible with iSCSI, but leads to data corruption, I enabled MMP (Multiple Mount Protection) in the past.

This is exactly what is driving udev crazy, for some reason. Disabling this feature on those partitions resolved the issue! So I am pretty certain this is the root cause (renaming this github issue).

Here are the commands I had to execute to resolve:

sudo tune2fs -O ^mmp /dev/sda5
sudo tune2fs -O ^mmp /dev/sda6

To check whether MMP is enabled, one could use:

e2mmpstatus /dev/sda5

So the workaround for me was disabling MMP for now. But I am not keen to keep it disabled for a long time because it could lead to fs corruption and data loss if the partition is mounted by two iSCSI initiators at the same time by accident.

PS: funny enough, when Googling for "mmp ext4", this is the first search result I get: https://unix.stackexchange.com/questions/663211/systemd-udev-high-cpu-via-systemd-udevd-because-of-mmp-ext4-feature

@dypsilon dypsilon changed the title Regression Since 2.7.6: High CPU Usage + Network Usage with iSCSI targets Regression Since 2.7.6: 100% CPU Usage when MMP (Multiple Mount Protection) is Enabled Jan 1, 2022
@dypsilon dypsilon changed the title Regression Since 2.7.6: 100% CPU Usage when MMP (Multiple Mount Protection) is Enabled Regression Since 2.7.6: 100% CPU Usage due to an uevent loop when MMP (Multiple Mount Protection) is Enabled Jan 1, 2022
@tbzatek
Copy link
Member

tbzatek commented Jan 1, 2022

Oh, nice, good shot! The real problem is that dumpe2fs that is called to get extended filesystem information is generating extra uevent when MMP is enabled. When it's not, no extra uevent is generated. And that extra uevent gets picked up by udisks eventually and figuring out that something has changed on the device it spawns dumpe2fs to get fresh information.

Reproduced on a simple local block device, no need for iscsi.

Anyway, I can only imagine a workaround would need to be placed in udisks. However we've had issues with calling the filesystem introspection tools in the past and I would personally would like to get rid of such calls completely:

static guint64
get_filesystem_size (UDisksLinuxBlockObject *object)
{
guint64 size = 0;
UDisksLinuxDevice *device;
gchar *dev;
const gchar *type;
GError *error = NULL;
device = udisks_linux_block_object_get_device (object);
dev = udisks_linux_block_object_get_device_file (object);
type = g_udev_device_get_property (device->udev_device, "ID_FS_TYPE");
if (g_strcmp0 (type, "ext2") == 0)
{
BDFSExt2Info *info = bd_fs_ext2_get_info (dev, &error);
if (info)
{
size = info->block_size * info->block_count;
bd_fs_ext2_info_free (info);
}
}
else if (g_strcmp0 (type, "ext3") == 0)
{
BDFSExt3Info *info = bd_fs_ext3_get_info (dev, &error);
if (info)
{
size = info->block_size * info->block_count;
bd_fs_ext3_info_free (info);
}
}
else if (g_strcmp0 (type, "ext4") == 0)
{
BDFSExt4Info *info = bd_fs_ext4_get_info (dev, &error);
if (info)
{
size = info->block_size * info->block_count;
bd_fs_ext4_info_free (info);
}
}
else if (g_strcmp0 (type, "xfs") == 0)
{
BDFSXfsInfo *info = bd_fs_xfs_get_info (dev, &error);
if (info)
{
size = info->block_size * info->block_count;
bd_fs_xfs_info_free (info);
}
}
g_free (dev);
g_object_unref (device);
g_clear_error (&error);
return size;
}

@dypsilon
Copy link
Author

dypsilon commented Jan 1, 2022

Ah, great that you were able to narrow it down to the specific code lines! Maybe it helps somehow that this issue did not exist in 2.7.5? Apparently something was done differently back then, not sure though whether the same code can be used still. 2.9.4 probably diverged quite a bit from 2.7.5.

@tbzatek
Copy link
Member

tbzatek commented Jan 2, 2022

It was added on purpose, however nobody expected such consequences back then: #477

@tbzatek
Copy link
Member

tbzatek commented Jan 2, 2022

Some more issues caused by the mentioned change: #911, #871

@vojtechtrefny
Copy link
Member

Definitely time to change this to a function instead of property (or make the property invalid until a function is called to refresh it?).
But I also think this should be reported as a kernel bug -- dumpe2fs shouldn't cause a change event emit.

@tbzatek
Copy link
Member

tbzatek commented Jan 2, 2022

I'm generally trying to keep the public API as stable as possible but I'm out of ideas here. A while back I tried to hack the GDBus machinery to hook interface property retrieval and make it on demand but failed horribly. Support for this would need to be added to GDBusInterfaceSkeleton first.

Sure this can be made a function instead, marking the property as deprecated (and perhaps only valid when the filesystem is mounted, retrieved by other means). @mvollmer ?

As for the dumpe2fs issue - isn't it exactly the way the protection is supposed to work? I.e. providing a shared lock bit in the filesystem superblock? Any write (fd close) to the device would then trigger an uevent. This is a userspace access on a non-mounted filesystem.

@tbzatek
Copy link
Member

tbzatek commented Jan 2, 2022

Oh? https://developer-old.gnome.org/gio/stable/GDBusInterfaceSkeleton.html#g-dbus-interface-skeleton-get-vtable

Nope, returns static memory, any write would segfault. However!...

@vojtechtrefny
Copy link
Member

As for the dumpe2fs issue - isn't it exactly the way the protection is supposed to work? I.e. providing a shared lock bit in the filesystem superblock? Any write (fd close) to the device would then trigger an uevent. This is a userspace access on a non-mounted filesystem.

Yes, but this is read-only access. And we are not mounting the device for dumpe2fs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants