shutdown -h h:m broken, shuts down the system immediately #1120

Closed
mbiebl opened this Issue Sep 2, 2015 · 21 comments

Comments

Projects
None yet
7 participants
Contributor

mbiebl commented Sep 2, 2015

Version: 225
Bug-Debian: http://bugs.debian.org/797763

After upgrading systemd from 224-2 to 225-1 shutdown at a scheduled time (shutdown h:m) shuts down the system immediately, not at
the scheduled time.
Using systemd 224-2 it works correctly, shutting down at the scheduled time according to /var/run/systemd/shutdown/scheduled.

A git bisect shows 9ef1502 as the faulty commit and reverting that indeed fixes the issue.

Contributor

mbiebl commented Sep 2, 2015

@jsynacek this is probably for you

Contributor

jsynacek commented Sep 3, 2015

Works for me on Fedora Rawhide:

# shutdown -h 3:4
Shutdown scheduled for Fri 2015-09-04 03:04:00 CEST, use 'shutdown -c' to cancel.
# 

Is the action blocked by AppArmor? Is there anything interesting in the logs? Maybe polkit is blocking the action?

Contributor

mbiebl commented Sep 3, 2015

AppArmor is not enabled. As for polkit: I issued that command as root, so I don't think it should have any effect.

Contributor

mbiebl commented Sep 3, 2015

root@pluto:~# shutdown -h 3:4
Failed to set wall message, ignoring: Message did not receive a reply (timeout by message bus)
Failed to call ScheduleShutdown in logind, proceeding with immediate shutdown: Message did not receive a reply (timeout by message bus)

Why does a failed wall message trigger an immediate shutdown? That sounds wrong.

Contributor

mbiebl commented Sep 3, 2015

fwiw, this is from a minimal chroot, no desktop session or polkit agent running.

Contributor

mbiebl commented Sep 4, 2015

I'm also seeing this in the journal, when that happens:

Sep 04 04:06:43 pluto dbus[60]: [system] Activating via systemd: service name='org.freedesktop.login1' unit='dbus-org.freedesktop.login1.service'
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Main process exited, code=killed, status=11/SEGV
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Unit entered failed state.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Failed with result 'signal'.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Service has no hold-off time, scheduling restart.
Sep 04 04:06:43 pluto systemd[1]: Stopped Login Service.
Sep 04 04:06:43 pluto systemd[1]: Starting Login Service...
Sep 04 04:06:43 pluto dbus[60]: [system] Successfully activated service 'org.freedesktop.login1'
Sep 04 04:06:43 pluto systemd-logind[78]: New seat seat0.
Sep 04 04:06:43 pluto systemd-logind[78]: New session 14 of user root.
Sep 04 04:06:43 pluto systemd[1]: Started Login Service.
Sep 04 04:06:43 pluto systemd-logind[78]: Assertion 's' failed at ../src/basic/util.c:1520, function cescape(). Aborting.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Main process exited, code=killed, status=6/ABRT
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Unit entered failed state.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Failed with result 'signal'.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Service has no hold-off time, scheduling restart.
Contributor

mbiebl commented Sep 4, 2015

and the backtrace

(gdb) bt full
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
No locals.
#1  0x00007fef396ef9de in __GI___strdup (s=0x55f224000000 <error: Cannot access memory at address 0x55f224000000>) at strdup.c:41
        len = <optimized out>
        new = <optimized out>
#2  0x000055f22309508b in free_and_strdup (p=0x55f224c491b8, s=0x55f224000000 <error: Cannot access memory at address 0x55f224000000>) at ../src/basic/util.c:6023
        t = <optimized out>
        __PRETTY_FUNCTION__ = "free_and_strdup"
#3  0x000055f2230cf431 in method_set_wall_message.lto_priv.426 (message=0x55f224c515a0, userdata=0x55f224c49060, error=0x7fffde8adf30) at ../src/login/logind-dbus.c:2319
        r = <optimized out>
        m = <optimized out>
        wall_message = 0x55f224000000 <error: Cannot access memory at address 0x55f224000000>
        enable_wall_messages = true
        __PRETTY_FUNCTION__ = "method_set_wall_message"
        __func__ = "method_set_wall_message"
#4  0x000055f22308b830 in method_callbacks_run (found_object=0x7fffde8ae090, require_fallback=false, c=0x55f224c4edf0, m=0x55f224c515a0, bus=0x55f224c4b3e0) at ../src/libsystemd/sd-bus/bus-objects.c:410
        slot = 0x55f224c4d490
        u = 0x55f224c49060
        error = {name = 0x0, message = 0x0, _need_free = 0}
        signature = 0x55f224c51350 "sb"
        r = <optimized out>
#5  object_find_and_run.lto_priv.203 (bus=0x55f224c4b3e0, m=0x55f224c515a0, p=0x55f224c4bd48 "/org/freedesktop/login1", require_fallback=false, found_object=0x7fffde8ae090) at ../src/libsystemd/sd-bus/bus-objects.c:1247
        vtable_key = {path = 0x55f224c4bd48 "/org/freedesktop/login1", interface = 0x55f224c4bd80 "org.freedesktop.login1.Manager", member = 0x55f224c4bd68 "SetWallMessage", parent = 0x0, last_iteration = 0,
          vtable = 0x7fef39a11620 <main_arena>}
        r = <optimized out>
        __PRETTY_FUNCTION__ = "object_find_and_run"
#6  0x000055f2230b996b in bus_process_object (m=0x55f224c515a0, bus=<optimized out>) at ../src/libsystemd/sd-bus/bus-objects.c:1363
        prefix = 0x7fffde8adfd0 "\340\263\304$\362U"
        r = <optimized out>
        pl = <optimized out>
        found_object = true
#7  process_message (m=0x55f224c515a0, bus=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:2543
        r = <optimized out>
#8  process_running (priority=0, ret=0x0, hint_priority=false, bus=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:2585
        m = 0x55f224c515a0
        r = <optimized out>
#9  bus_process_internal.constprop.98 (bus=<optimized out>, ret=0x0, priority=0, hint_priority=false) at ../src/libsystemd/sd-bus/sd-bus.c:2774
        _dont_destroy_bus = 0x55f224c4b3e0
#10 0x000055f2230e85c1 in sd_bus_process (ret=0x0, bus=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:2793
No locals.
#11 io_callback (s=<optimized out>, fd=<optimized out>, revents=<optimized out>, userdata=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:3063
        bus = <optimized out>
        __PRETTY_FUNCTION__ = "io_callback"
#12 0x000055f2230dc3f3 in source_dispatch.lto_priv.367 (s=0x55f224c51c10) at ../src/libsystemd/sd-event/sd-event.c:2116
        r = <optimized out>
        __PRETTY_FUNCTION__ = "source_dispatch"
        __func__ = "source_dispatch"
#13 0x000055f22308df1d in sd_event_dispatch (e=0x55f224c4a260) at ../src/libsystemd/sd-event/sd-event.c:2472
        p = <optimized out>
        r = <optimized out>
#14 sd_event_run (timeout=18446744073709551615, e=0x55f224c4a260) at ../src/libsystemd/sd-event/sd-event.c:2500
        r = <optimized out>
#15 manager_run (m=0x55f224c49060) at ../src/login/logind.c:1097
        r = <optimized out>
#16 main (argc=<optimized out>, argv=<optimized out>) at ../src/login/logind.c:1159
        m = <optimized out>
        r = <optimized out>
        __func__ = "main"
Contributor

mbiebl commented Sep 4, 2015

@jsynacek I'm able to reproduce the problem in a fedora rawhide VM with systemd v225 as well.

Contributor

didrocks commented Sep 4, 2015

Confirming as well on Ubuntu with systemd v225 (225-1ubuntu1) where systemd-logind segfault when you shutdown at any giving time as soon as the wall message prints.

Contributor

jsynacek commented Sep 4, 2015

I can reproduce this as well, the important part is that there is no polkit installed.

Owner

poettering commented Sep 4, 2015

Does #1141 fix the issue? (specifically 2cf088b) If not, could you run logind in valgrind, trigger the issue and attach the output here?

Contributor

jsynacek commented Sep 4, 2015

@poettering I applied both patches, but they didn't help.

Assertion 's' failed at src/basic/util.c:1520, function cescape(). Aborting.
==63==
==63== Process terminating with default action of signal 6 (SIGABRT)
==63== at 0x58D5518: raise (in /usr/lib64/libc-2.22.90.so)
==63== by 0x58D7119: abort (in /usr/lib64/libc-2.22.90.so)
==63== by 0x137DB1: log_assert_failed (log.c:773)
==63== by 0x11EEEE: cescape (util.c:1520)
==63== by 0x147FFF: update_schedule_file (logind-dbus.c:1799)
==63== by 0x147FFF: method_schedule_shutdown.lto_priv.410 (logind-dbus.c:1961)
==63== by 0x11369D: method_callbacks_run (bus-objects.c:410)
==63== by 0x11369D: object_find_and_run.lto_priv.199 (bus-objects.c:1247)
==63== by 0x12E833: bus_process_object (bus-objects.c:1363)
==63== by 0x12E833: process_message (sd-bus.c:2543)
==63== by 0x12E833: process_running (sd-bus.c:2585)
==63== by 0x12E833: bus_process_internal.constprop.99 (sd-bus.c:2774)
==63== by 0x1315D0: UnknownInlinedFun (sd-bus.c:2793)
==63== by 0x1315D0: io_callback (sd-bus.c:3063)
==63== by 0x16B042: source_dispatch.lto_priv.358 (sd-event.c:2116)
==63== by 0x11656E: sd_event_dispatch (sd-event.c:2472)
==63== by 0x11656E: sd_event_run (sd-event.c:2500)
==63== by 0x11656E: manager_run (logind.c:1097)
==63== by 0x11656E: main (logind.c:1159)
==63==

I don't know why, but, at least on my machine, after SetWallMessage is called over dbus, the method_set_wall_message() handler in logind-dbus.c calls "sd_bus_message_read(message, "sb", &wall_message, &enable_wall_messages);", which for some reason passes but leaves wall_message undefined. Any idea why that might happen?

@poettering poettering added this to the v226 milestone Sep 5, 2015

Owner

dvdhrm commented Sep 5, 2015

This should fix it:

diff --git a/src/login/logind-dbus.c b/src/login/logind-dbus.c
index 7cc68d4..beae75e 100644
--- a/src/login/logind-dbus.c
+++ b/src/login/logind-dbus.c
@@ -1796,9 +1796,11 @@ static int update_schedule_file(Manager *m) {
         if (r < 0)
                 return log_error_errno(r, "Failed to create shutdown subdirectory: %m");

-        t = cescape(m->wall_message);
-        if (!t)
-                return log_oom();
+        if (!isempty(m->wall_message)) {
+                t = cescape(m->wall_message);
+                if (!t)
+                        return log_oom();
+        }

         r = fopen_temporary("/run/systemd/shutdown/scheduled", &f, &temp_path);
         if (r < 0)
Owner

dvdhrm commented Sep 5, 2015

Btw., this is a regression from:

commit 5744f59a3ee883ef3a78214bd5236157acdc35ba
Author: Lennart Poettering <lennart@poettering.net>
Date:   Fri Sep 4 10:34:47 2015 +0200

    logind: treat an empty wall message like a NULL one

dvdhrm added a commit to dvdhrm/systemd that referenced this issue Sep 5, 2015

login: fix NULL-deref on wall_message
We treat an empty wall-message equal to a NULL wall-message since:

        commit 5744f59
        Author: Lennart Poettering <lennart@poettering.net>
        Date:   Fri Sep 4 10:34:47 2015 +0200

            logind: treat an empty wall message like a NULL one

Fix the shutdown scheduler to not deref a NULL pointer, but properly
check for an empty wall-message.

Fixes: #1120
Contributor

mbiebl commented Sep 5, 2015

@dvdhrm not sure if 5744f59 is really the culprit. v225 does not even have that commit and I'm running v225 here.

Owner

dvdhrm commented Sep 5, 2015

@mbiebl, 5744f59 made the bug just more apparent. The code in question existed before and I think it was never correct.

Contributor

mbiebl commented Sep 5, 2015

@dvdhrm confirmed fixed with those patches applied, thanks! Closing the bug report.

@mbiebl mbiebl closed this Sep 5, 2015

Contributor

jsynacek commented Sep 7, 2015

@dvdhrm Thanks for fixing this!

@huguesb huguesb referenced this issue in coreos/bugs Dec 2, 2015

Closed

Shutdown issues in CoreOS 835.8.0 #1009

pbx0 added a commit to pbx0/systemd that referenced this issue Dec 3, 2015

login: fix NULL-deref on wall_message
We treat an empty wall-message equal to a NULL wall-message since:

        commit 5744f59
        Author: Lennart Poettering <lennart@poettering.net>
        Date:   Fri Sep 4 10:34:47 2015 +0200

            logind: treat an empty wall message like a NULL one

Fix the shutdown scheduler to not deref a NULL pointer, but properly
check for an empty wall-message.

Fixes: #1120

Still broken or reverted:
Failed to call ScheduleShutdown in logind, proceeding with immediate shutdown: Action org.freedesktop.login1.poweroff is not registered

Not sure how this is possible.

Owner

zonque commented Feb 1, 2016

This is fixed with commit 0671d7f.

This will be part of the next release. In the meantime, could you try the current git head, or ask your distribution to cherry-pick the commit, please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment