We have encountered a problem: no systemctl command can be executed.
Errors similar to the following are reported:
Jul 13 19:37:17 e99g07484.et2 dbus[2155]: [system] Activating via systemd: service name='org.freedesktop.PolicyKit1' unit='polkit.service'
Jul 13 19:37:17 e99g07484.et2 dbus[2155]: [system] Activation via systemd failed for unit 'polkit.service': Argument list too long
Jul 13 19:37:17 e99g07484.et2 dbus[2155]: [system] Activating via systemd: service name='org.freedesktop.PolicyKit1' unit='polkit.service'
Jul 13 19:37:17 e99g07484.et2 dbus[2155]: [system] Activation via systemd failed for unit 'polkit.service': Argument list too long
I collected a core dump for analysis and found that n_entries in m->units had reached 131072.
I went on to parse the unit details and found that most of the units (130,000+) are mount units.
The following GDB command traverses the linked list:
$ cat .gdbinit
define dump_mount_list
set $_node = (Unit *)$arg0
set $_num = 0
while ($_node)
printf "addr: %p, mount->id: %s, source_path: %s\n", $_node, $_node->id, $_node->source_path
set $_node = $_node->units_by_type_next
set $_num = $_num + 1
end
printf "num is %d\n", $_num
end
enum UnitType {
UNIT_SERVICE = 0,
UNIT_SOCKET,
UNIT_BUSNAME,
UNIT_TARGET,
UNIT_SNAPSHOT,
UNIT_DEVICE,
UNIT_MOUNT,
UNIT_AUTOMOUNT,
(gdb) p m->units_by_type[6]
$1 = (Unit *) 0x5608a73630f0
More than 130,000 mount units are printed:
addr: 0x5608a73630f0, mount->id: home-t4-pouch-containers-5b1dce60939b18d5661d9b6d498c65d08178121f7b95c1481920379acb45dcec-rootfs.mount, source_path: /proc/self/mountinfo
addr: 0x5608a73829f0, mount->id: home-t4-pouch-containerd-state-io.containerd.runtime.v1.linux-default-48b9c8cefd5c953bbf3303e8b4ea7b04a777ccfc789e05e5adf5ebddb834b958-rootfs.mount, source_path: /proc/self/mountinfo
addr: 0x5608a739c240, mount->id: home-t4-pouch-containers-48b9c8cefd5c953bbf3303e8b4ea7b04a777ccfc789e05e5adf5ebddb834b958-rootfs.mount, source_path: /proc/self/mountinfo
addr: 0x5608a7390680, mount->id: home-t4-pouch-containerd-state-io.containerd.runtime.v1.linux-default-8356eb46281e7fbe2c5da86d1a62eb4f93658cb7e3a4c4c854b656921649e1a4-rootfs.mount, source_path: /proc/self/mountinfo
addr: 0x5608a7377040, mount->id: home-t4-pouch-containers-8356eb46281e7fbe2c5da86d1a62eb4f93658cb7e3a4c4c854b656921649e1a4-rootfs.mount, source_path: /proc/self/mountinfo
addr: 0x5608a7267b40, mount->id: home-t4-pouch-containerd-state-io.containerd.runtime.v1.linux-default-6b3ce6d5a5f2126b6c0df9ac3663f8a4e3fc553e8952aa7665f622f662f7f154-rootfs.mount, source_path: /proc/self/mountinfo
addr: 0x5608a732e180, mount->id: home-t4-pouch-containers-6b3ce6d5a5f2126b6c0df9ac3663f8a4e3fc553e8952aa7665f622f662f7f154-rootfs.mount, source_path: /proc/self/mountinfo
……
After analyzing the code, I found the following possible bug:
static int mount_dispatch_io(sd_event_source *source, int fd, uint32_t revents, void *userdata) {
        ……
        /* Adds the entries from /proc/self/mountinfo to m->units */
        r = mount_load_proc_self_mountinfo(m, true);
        if (r < 0) {
                /* Reset flags, just in case, for later calls */
                LIST_FOREACH(units_by_type, u, m->units_by_type[UNIT_MOUNT]) {
                        Mount *mount = MOUNT(u);
                        mount->is_mounted = mount->just_mounted = mount->just_changed = false;
                }
                /* Returning here means m->units only grows and never shrinks */
                return 0;
        }
        manager_dispatch_load_queue(m);
        LIST_FOREACH(units_by_type, u, m->units_by_type[UNIT_MOUNT]) {
                … /* The code here cleans up stale entries in m->units */
        }
        …
}
I also constructed a test case to reproduce the bug.
A. Construct a path longer than 256 characters (so that mount_load_proc_self_mountinfo() returns an error code):
B. Mount some directories, then unmount them.
C. Finally, GDB analysis shows that these mount points are still in m->units.
Similar bugs:
https://access.redhat.com/solutions/4620671
systemd/systemd#15221
kubernetes/kubernetes#57345