Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shutdown -h h:m broken, shuts down the system immediately #1120

Closed
mbiebl opened this issue Sep 2, 2015 · 21 comments
Closed

shutdown -h h:m broken, shuts down the system immediately #1120

mbiebl opened this issue Sep 2, 2015 · 21 comments
Labels
login needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer regression ⚠️ A bug in something that used to work correctly and broke through some recent commit release-critical
Milestone

Comments

@mbiebl
Copy link
Contributor

mbiebl commented Sep 2, 2015

Version: 225
Bug-Debian: http://bugs.debian.org/797763

After upgrading systemd from 224-2 to 225-1 shutdown at a scheduled time (shutdown h:m) shuts down the system immediately, not at
the scheduled time.
Using systemd 224-2 it works correctly, shutting down at the scheduled time according to /var/run/systemd/shutdown/scheduled.

A git bisect shows 9ef1502 as the faulty commit and reverting that indeed fixes the issue.

@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 2, 2015

@jsynacek this is probably for you

@jsynacek
Copy link
Contributor

jsynacek commented Sep 3, 2015

Works for me on Fedora Rawhide:

# shutdown -h 3:4
Shutdown scheduled for Fri 2015-09-04 03:04:00 CEST, use 'shutdown -c' to cancel.
# 

Is the action blocked by AppArmor? Is there anything interesting in the logs? Maybe polkit is blocking the action?

@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 3, 2015

AppArmor is not enabled. As for polkit: I issued that command as root, so I don't think it should have any effect.

@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 3, 2015

root@pluto:~# shutdown -h 3:4
Failed to set wall message, ignoring: Message did not receive a reply (timeout by message bus)
Failed to call ScheduleShutdown in logind, proceeding with immediate shutdown: Message did not receive a reply (timeout by message bus)

Why does a failed wall message trigger an immediate shutdown? That sounds wrong.

@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 3, 2015

fwiw, this is from a minimal chroot, no desktop session or polkit agent running.

@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 4, 2015

I'm also seeing this in the journal, when that happens:

Sep 04 04:06:43 pluto dbus[60]: [system] Activating via systemd: service name='org.freedesktop.login1' unit='dbus-org.freedesktop.login1.service'
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Main process exited, code=killed, status=11/SEGV
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Unit entered failed state.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Failed with result 'signal'.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Service has no hold-off time, scheduling restart.
Sep 04 04:06:43 pluto systemd[1]: Stopped Login Service.
Sep 04 04:06:43 pluto systemd[1]: Starting Login Service...
Sep 04 04:06:43 pluto dbus[60]: [system] Successfully activated service 'org.freedesktop.login1'
Sep 04 04:06:43 pluto systemd-logind[78]: New seat seat0.
Sep 04 04:06:43 pluto systemd-logind[78]: New session 14 of user root.
Sep 04 04:06:43 pluto systemd[1]: Started Login Service.
Sep 04 04:06:43 pluto systemd-logind[78]: Assertion 's' failed at ../src/basic/util.c:1520, function cescape(). Aborting.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Main process exited, code=killed, status=6/ABRT
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Unit entered failed state.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Failed with result 'signal'.
Sep 04 04:06:43 pluto systemd[1]: systemd-logind.service: Service has no hold-off time, scheduling restart.

@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 4, 2015

and the backtrace

(gdb) bt full
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
No locals.
#1  0x00007fef396ef9de in __GI___strdup (s=0x55f224000000 <error: Cannot access memory at address 0x55f224000000>) at strdup.c:41
        len = <optimized out>
        new = <optimized out>
#2  0x000055f22309508b in free_and_strdup (p=0x55f224c491b8, s=0x55f224000000 <error: Cannot access memory at address 0x55f224000000>) at ../src/basic/util.c:6023
        t = <optimized out>
        __PRETTY_FUNCTION__ = "free_and_strdup"
#3  0x000055f2230cf431 in method_set_wall_message.lto_priv.426 (message=0x55f224c515a0, userdata=0x55f224c49060, error=0x7fffde8adf30) at ../src/login/logind-dbus.c:2319
        r = <optimized out>
        m = <optimized out>
        wall_message = 0x55f224000000 <error: Cannot access memory at address 0x55f224000000>
        enable_wall_messages = true
        __PRETTY_FUNCTION__ = "method_set_wall_message"
        __func__ = "method_set_wall_message"
#4  0x000055f22308b830 in method_callbacks_run (found_object=0x7fffde8ae090, require_fallback=false, c=0x55f224c4edf0, m=0x55f224c515a0, bus=0x55f224c4b3e0) at ../src/libsystemd/sd-bus/bus-objects.c:410
        slot = 0x55f224c4d490
        u = 0x55f224c49060
        error = {name = 0x0, message = 0x0, _need_free = 0}
        signature = 0x55f224c51350 "sb"
        r = <optimized out>
#5  object_find_and_run.lto_priv.203 (bus=0x55f224c4b3e0, m=0x55f224c515a0, p=0x55f224c4bd48 "/org/freedesktop/login1", require_fallback=false, found_object=0x7fffde8ae090) at ../src/libsystemd/sd-bus/bus-objects.c:1247
        vtable_key = {path = 0x55f224c4bd48 "/org/freedesktop/login1", interface = 0x55f224c4bd80 "org.freedesktop.login1.Manager", member = 0x55f224c4bd68 "SetWallMessage", parent = 0x0, last_iteration = 0,
          vtable = 0x7fef39a11620 <main_arena>}
        r = <optimized out>
        __PRETTY_FUNCTION__ = "object_find_and_run"
#6  0x000055f2230b996b in bus_process_object (m=0x55f224c515a0, bus=<optimized out>) at ../src/libsystemd/sd-bus/bus-objects.c:1363
        prefix = 0x7fffde8adfd0 "\340\263\304$\362U"
        r = <optimized out>
        pl = <optimized out>
        found_object = true
#7  process_message (m=0x55f224c515a0, bus=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:2543
        r = <optimized out>
#8  process_running (priority=0, ret=0x0, hint_priority=false, bus=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:2585
        m = 0x55f224c515a0
        r = <optimized out>
#9  bus_process_internal.constprop.98 (bus=<optimized out>, ret=0x0, priority=0, hint_priority=false) at ../src/libsystemd/sd-bus/sd-bus.c:2774
        _dont_destroy_bus = 0x55f224c4b3e0
#10 0x000055f2230e85c1 in sd_bus_process (ret=0x0, bus=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:2793
No locals.
#11 io_callback (s=<optimized out>, fd=<optimized out>, revents=<optimized out>, userdata=<optimized out>) at ../src/libsystemd/sd-bus/sd-bus.c:3063
        bus = <optimized out>
        __PRETTY_FUNCTION__ = "io_callback"
#12 0x000055f2230dc3f3 in source_dispatch.lto_priv.367 (s=0x55f224c51c10) at ../src/libsystemd/sd-event/sd-event.c:2116
        r = <optimized out>
        __PRETTY_FUNCTION__ = "source_dispatch"
        __func__ = "source_dispatch"
#13 0x000055f22308df1d in sd_event_dispatch (e=0x55f224c4a260) at ../src/libsystemd/sd-event/sd-event.c:2472
        p = <optimized out>
        r = <optimized out>
#14 sd_event_run (timeout=18446744073709551615, e=0x55f224c4a260) at ../src/libsystemd/sd-event/sd-event.c:2500
        r = <optimized out>
#15 manager_run (m=0x55f224c49060) at ../src/login/logind.c:1097
        r = <optimized out>
#16 main (argc=<optimized out>, argv=<optimized out>) at ../src/login/logind.c:1159
        m = <optimized out>
        r = <optimized out>
        __func__ = "main"

@mbiebl mbiebl added login regression ⚠️ A bug in something that used to work correctly and broke through some recent commit labels Sep 4, 2015
@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 4, 2015

@jsynacek I'm able to reproduce the problem in a fedora rawhide VM with systemd v225 as well.

@didrocks
Copy link
Contributor

didrocks commented Sep 4, 2015

Confirming as well on Ubuntu with systemd v225 (225-1ubuntu1) where systemd-logind segfault when you shutdown at any giving time as soon as the wall message prints.

@jsynacek
Copy link
Contributor

jsynacek commented Sep 4, 2015

I can reproduce this as well, the important part is that there is no polkit installed.

@poettering
Copy link
Member

Does #1141 fix the issue? (specifically 2cf088b) If not, could you run logind in valgrind, trigger the issue and attach the output here?

@poettering poettering added the needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer label Sep 4, 2015
@jsynacek
Copy link
Contributor

jsynacek commented Sep 4, 2015

@poettering I applied both patches, but they didn't help.

Assertion 's' failed at src/basic/util.c:1520, function cescape(). Aborting.
==63==
==63== Process terminating with default action of signal 6 (SIGABRT)
==63== at 0x58D5518: raise (in /usr/lib64/libc-2.22.90.so)
==63== by 0x58D7119: abort (in /usr/lib64/libc-2.22.90.so)
==63== by 0x137DB1: log_assert_failed (log.c:773)
==63== by 0x11EEEE: cescape (util.c:1520)
==63== by 0x147FFF: update_schedule_file (logind-dbus.c:1799)
==63== by 0x147FFF: method_schedule_shutdown.lto_priv.410 (logind-dbus.c:1961)
==63== by 0x11369D: method_callbacks_run (bus-objects.c:410)
==63== by 0x11369D: object_find_and_run.lto_priv.199 (bus-objects.c:1247)
==63== by 0x12E833: bus_process_object (bus-objects.c:1363)
==63== by 0x12E833: process_message (sd-bus.c:2543)
==63== by 0x12E833: process_running (sd-bus.c:2585)
==63== by 0x12E833: bus_process_internal.constprop.99 (sd-bus.c:2774)
==63== by 0x1315D0: UnknownInlinedFun (sd-bus.c:2793)
==63== by 0x1315D0: io_callback (sd-bus.c:3063)
==63== by 0x16B042: source_dispatch.lto_priv.358 (sd-event.c:2116)
==63== by 0x11656E: sd_event_dispatch (sd-event.c:2472)
==63== by 0x11656E: sd_event_run (sd-event.c:2500)
==63== by 0x11656E: manager_run (logind.c:1097)
==63== by 0x11656E: main (logind.c:1159)
==63==

I don't know why, but, at least on my machine, after SetWallMessage is called over dbus, the method_set_wall_message() handler in logind-dbus.c calls "sd_bus_message_read(message, "sb", &wall_message, &enable_wall_messages);", which for some reason passes but leaves wall_message undefined. Any idea why that might happen?

@poettering poettering added this to the v226 milestone Sep 5, 2015
@dvdhrm
Copy link
Contributor

dvdhrm commented Sep 5, 2015

This should fix it:

diff --git a/src/login/logind-dbus.c b/src/login/logind-dbus.c
index 7cc68d4..beae75e 100644
--- a/src/login/logind-dbus.c
+++ b/src/login/logind-dbus.c
@@ -1796,9 +1796,11 @@ static int update_schedule_file(Manager *m) {
         if (r < 0)
                 return log_error_errno(r, "Failed to create shutdown subdirectory: %m");

-        t = cescape(m->wall_message);
-        if (!t)
-                return log_oom();
+        if (!isempty(m->wall_message)) {
+                t = cescape(m->wall_message);
+                if (!t)
+                        return log_oom();
+        }

         r = fopen_temporary("/run/systemd/shutdown/scheduled", &f, &temp_path);
         if (r < 0)

@dvdhrm
Copy link
Contributor

dvdhrm commented Sep 5, 2015

Btw., this is a regression from:

commit 5744f59a3ee883ef3a78214bd5236157acdc35ba
Author: Lennart Poettering <lennart@poettering.net>
Date:   Fri Sep 4 10:34:47 2015 +0200

    logind: treat an empty wall message like a NULL one

dvdhrm pushed a commit to dvdhrm/systemd that referenced this issue Sep 5, 2015
We treat an empty wall-message equal to a NULL wall-message since:

        commit 5744f59
        Author: Lennart Poettering <lennart@poettering.net>
        Date:   Fri Sep 4 10:34:47 2015 +0200

            logind: treat an empty wall message like a NULL one

Fix the shutdown scheduler to not deref a NULL pointer, but properly
check for an empty wall-message.

Fixes: systemd#1120
@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 5, 2015

@dvdhrm not sure if 5744f59 is really the culprit. v225 does not even have that commit and I'm running v225 here.

@dvdhrm
Copy link
Contributor

dvdhrm commented Sep 5, 2015

@mbiebl, 5744f59 made the bug just more apparent. The code in question existed before and I think it was never correct.

@mbiebl
Copy link
Contributor Author

mbiebl commented Sep 5, 2015

@dvdhrm confirmed fixed with those patches applied, thanks! Closing the bug report.

@mbiebl mbiebl closed this as completed Sep 5, 2015
@jsynacek
Copy link
Contributor

jsynacek commented Sep 7, 2015

@dvdhrm Thanks for fixing this!

peebs pushed a commit to peebs/systemd that referenced this issue Dec 3, 2015
We treat an empty wall-message equal to a NULL wall-message since:

        commit 5744f59
        Author: Lennart Poettering <lennart@poettering.net>
        Date:   Fri Sep 4 10:34:47 2015 +0200

            logind: treat an empty wall message like a NULL one

Fix the shutdown scheduler to not deref a NULL pointer, but properly
check for an empty wall-message.

Fixes: systemd#1120
@quequotion
Copy link

Still broken or reverted:
Failed to call ScheduleShutdown in logind, proceeding with immediate shutdown: Action org.freedesktop.login1.poweroff is not registered

Not sure how this is possible.

@zonque
Copy link
Member

zonque commented Feb 1, 2016

This is fixed with commit 0671d7f.

This will be part of the next release. In the meantime, could you try the current git head, or ask your distribution to cherry-pick the commit, please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
login needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer regression ⚠️ A bug in something that used to work correctly and broke through some recent commit release-critical
Development

No branches or pull requests

8 participants
@zonque @quequotion @dvdhrm @jsynacek @didrocks @poettering @mbiebl and others