Kill user session scope by default #3005

Merged
merged 10 commits into from Apr 21, 2016

Projects

None yet

9 participants

@keszybz
Member
keszybz commented Apr 10, 2016

No description provided.

@arvidjaar arvidjaar and 1 other commented on an outdated diff Apr 10, 2016
man/systemd-run.xml
+
+ <example>
+ <title>Start <command>screen</command> as a user service</title>
+
+ <programlisting>$ systemd-run --scope --user screen -d -m
+Running scope as unit run-r14b0047ab6df45bfb45e7786cc839e76.scope.
+
+$ screen -ls
+There is a screen on:
+ 492..laptop (Detached)
+1 Socket in /var/run/screen/S-fatima.
+</programlisting>
+
+ <para>To allow the screen to persist after the user logs out,
+ even if the session scope is terminated, enable lingering:</para>
+
@arvidjaar
arvidjaar Apr 10, 2016 Contributor

I think this is confusing. systemd-run already enables started screen to persist after logout when lingering is disabled; did you mean to say "Alternative to systemd-run is to enable lingering"?

@keszybz
keszybz Apr 10, 2016 Member

If lingering is disabled, user@.service will be killed when the user logs out from the last session. What about the following wording:

By default, the screen which is now running as a scope unit underneath the user's systemd instance will be terminated when the user logs out of their last session. To prevent that and allow screen to persist, enable lingering:

@arvidjaar
arvidjaar Apr 11, 2016 Contributor

Yes, I misunderstood what linger does and your new version is better. But even that is not enough; e.g. when running something like systemd-run --scope --user /some/program & this program will be sent SIGHUP by shell when exiting, one still needs nohup to prevent it. May be systemd-run should do it itself; but until it does man page better mentions it.

@martinpitt
Contributor

Not a fan of killing user sessions by default, I must say. It's a bad default on personal computers and servers where you often want to use screen. It's a good default for shared computers of course, but locking them down appropriately already takes some manual configuration, and those computers should be administered by a reasonably knowledged admin anyway.

@arvidjaar
Contributor

Well, as examples in this PR show you already can start long running screen if needed pretty easily. Personally I think that this should be rather exception than rule; after all background jobs are normally killed when you exit shell unless you use nohup. Similar logic here.

For personal computers I do not logout at all, I use suspend/resume.

@keszybz
Member
keszybz commented Apr 10, 2016

Yeah, I don't think the concern is justified here. In general we clean up stuff when the owning entity goes away (a process, a service, a scope, etc) as much as possible. We didn't do this for session scopes 'cause of legacy reasons, but with systemd-run we have much nicer alternatives.

In particular, for my gnome session, if I log out, without KillUserProcesses=yes I get some processes which are obviously mistakes. Even if I log in again, I'm much better starting those again cleanly.

@arvidjaar
Contributor

@keszybz

Yeah, I don't think the concern is justified here.

Hmm ... I likely misunderstood what lingering does. Now if as you mentioned this requires lingering enabled the concern is justified. Users cannot enable it by itself, which means by default there will be no way to start long running processes. Unless you also enable linger for all users by default.

@keszybz
Member
keszybz commented Apr 11, 2016

Users cannot enable it by itself, which means by default there will be no way to start long running processes.

Yes. Admin privileges are required.

I think we should allow modify the default polkit policy to allow users to enable lingering by default for themselves. So you'd be able to start long running processes after a manual step, but the cleanup would be there by default. After all, we allow logged in users to reboot the machine after authentication, and leaving a running processes seems like less of an issue. But that's subject for another commit.

@poettering poettering commented on the diff Apr 11, 2016
src/login/logind.c
+ config_item_perf_lookup, logind_gperf_lookup,
+ false, m);
+}
+
+static int manager_dispatch_reload_signal(sd_event_source *s, const struct signalfd_siginfo *si, void *userdata) {
+ Manager *m = userdata;
+ int r;
+
+ r = manager_parse_config_file(m);
+ if (r < 0)
+ log_warning_errno(r, "Failed to parse config file: %m");
+ else
+ log_info("Config file reloaded.");
+
+ return 0;
+}
@poettering
poettering Apr 11, 2016 Member

Not sure how I feel about this. After all this only adds in what's set in logind.conf now, but does not reset what it not set there. Which means, if you have a setting in logind.conf, then comment it and reload logind you won't get the setting reset, and will get a different result than you'd get if instead you had restarted logind.

THis is particularly problematic as KillOnlyUsers= and similar settings take lists of things, and on each SIGHUP we'd keep adding to this list...

To fix this properly, we'd need a manager_reset_config() call which resets all settings to the defaults, and is invoked before anager_parse_config_file() here...

@poettering poettering commented on an outdated diff Apr 11, 2016
src/login/logind.c
@@ -88,6 +88,8 @@ static Manager *manager_new(void) {
if (!m->kill_exclude_users)
goto fail;
+ m->kill_user_processes = true;
+
@poettering
poettering Apr 11, 2016 Member

Not sure how I feel about this one. It's quite a change, we so far didn't dare to make.

I am actually OK with making this change upstream, but I figure we need to make this a "noisy" change, with a larger addition to NEWS, and a clear message to distros what's at stake here, how they can revert to old behaviour, and what our recommended way to handle screen/tmux is.

Moreover, this probably should be something that gets a "configure" switch for selecting the default, where we default to on.

(And on the Fedora side I figure this needs a feature page...)

@poettering
Member

BTW, Lingering does more than just allow the user to leave processes around after logging out. It also means that the user@.service instance of users where this is turned on will be started at boot rather than at first login... This means the recommendation to use this for screen/tmux is a bit skewed, though probably still valid... However this should be mentioned in the man page change I think.

@poettering
Member

I commented on a few things. I don't like the logind.conf change, that needs to be changed, see my comments.

I am fine with turning on the kill user sessions stuff, but see my comments.

@keszybz
Member
keszybz commented Apr 11, 2016

You're right, reloading of logind.conf needs more work. I'll fix that.

I think that KillUserProcesses=yes is a better default and we should make the change at some point. I think that with the left-over processes people are seeing with the recent dbus changes now is a good time.
I'll add the configure switch and NEWS entry and file a F25 feature page. Thanks for the review!

@poettering
Member

And yes, I also agree with changing the PK policy to allow local users to turn on lingering.

@arvidjaar
Contributor

allow local users to turn on lingering

So far the primary use case for me was screen in remote login. Does it mean that users logged in remotely won't have any way to do it?

Lingering does more than just allow the user to leave processes around after logging out. It also means that the user@.service instance of users where this is turned on will be started at boot rather than at first login...

I understand that starting thousands of user processes unconditionally is probably bad. But what about starting it on demand - systemd-run --scope starts user@.service if it is not running; this user service terminates when last systemd-run --scope process is completed (unless lingering is enabled of course).

@keszybz
Member
keszybz commented Apr 13, 2016

Updated.

The policy is changed to allow any user to change their linger setting.

@arvidjaar
Contributor

@keszybz

It also makes KillOnlyUsers symmetrical to KillExcludeUsers.

It was already symmetrical. KillExcludeUsers filters out and KillOnlyUsers filters in. None of them is relevant if killing user processes is disabled globally. I think current behavior is more logical. KillUserProcesses enables or disables it globally and two other options allow to fine tune list of users if it is enabled.

@keszybz
Member
keszybz commented Apr 13, 2016

I had the following case in mind: you have KUP=off, and you want to enable killing for a single user. Without the patch you have to change two settings. With the patch just one. It just seems slightly simpler. But the change is cosmetic, so if everybody thinks it doesn't make things better, I'll be happy to drop it.

@poettering poettering referenced this pull request Apr 15, 2016
Closed

logout leaves running user processes #2975

0 of 1 task complete
@thx1111
thx1111 commented Apr 15, 2016

But the change is cosmetic

I have to chime in here. I say it is inappropriate to dismiss this sort of thing as "cosmetic", where, in fact, it is the very kind of thing that sets the "culture" and User Experience of systemd. "Simplicity", "Predictability", "Discoverability" - these are very important elements of the, shall we say, Administrative Experience. These are NOT simply "cosmetic" issues. I believe it is important to ask ourselves, not so much "how can we make this to work", but rather "how would some naive new user expect this to work?"

To generalize on this idea, naively, I expect there to exist a simple and predictable "initial user state", a "guaranteed user state" of the machine. And I expect that that state can be regenerated by some means other than powering-down and rebooting the machine, which is to suggest that "logging-out" and then "logging-in" would be that means. And IΒ expect that the mechanism for changing the behavior or character of that "guaranteed user state" would be simply and easily discoverable, and not some Easter Egg hunt. Which is to say that configuration settings should be in "one place", and not be spread-out over some many files, or involve multiple kinds of configuration mechanisms.

Just saying, as guidelines...

@keszybz
Member
keszybz commented Apr 15, 2016

Oh, man. The "it" that I considered cosmetic was whether KillOnlyUsers= has any effect when KillUserProcesses=false. The second one was default until now, so it's very unlikely that anyone set the first, why would they? So yeah, it's a change of semantics, but one that no one is likely to be in the position to notice. And anyway, this has nothing to do with "guaranteed user state" or anything like that.

@thx1111
thx1111 commented Apr 15, 2016 edited

Hmm - are there more variables than degrees of freedom?

KillUserProcesses=yes -> KillOnlyUsers= has no meaning.
KillUserProcesses=no -> KillExcludeUsers= has no meaning.

As opposed to using only KillOnlyUsers= and KillExcludeUsers= with the possible value "all", which becomes contradictory if both configurations are set to "all".

A less semantically awkward approach is to have a KillUserProcessesDefault=yes/no and KillUserExceptions=. Then, there are no meaningless or contradictory conditions.

@keszybz
Member
keszybz commented Apr 15, 2016

If I was doing this from scratch, I'd use just one config setting, with "*", "+user", "-user" mixed in any order. But that ship has sailed, and for backwards compat I think we should stay with KillUserProcesses.

@keszybz
Member
keszybz commented Apr 15, 2016

Force pushed with a tweak to the man page text.

@thx1111
thx1111 commented Apr 16, 2016

for backwards compat

In systemd?! Ha! Do what you think is best, but you don't have to resort to invalid justifications. I appreciate that there is little return in re-doing it all, and there are more important issues to address.

I am unclear, though, about how all these configurations interact. Will a user be able to set both "linger" and "not linger"? Will their user setting be able to override the system KillOnlyUsers= and KillExcludeUsers= settings? And will their "linger" setting be able to override the KillUserProcesses= default? Do some users get to choose and some not?

@arvidjaar
Contributor

@thx1111

And will their "linger" setting be able to override the KillUserProcesses= default?

These two settings are orthogonal to each other.

@keszybz

allow any user to request lingering

That sounds like an overkill. Going in line with other settings, this is difference between "persistent" and "run-time". We want every user to be able to start long-running process. This is implemented most easily by plugging into lingering concept. But then it should be enabled only until reboot, i.e. become run-time setting.

@keszybz
Member
keszybz commented Apr 16, 2016

But then it should be enabled only until reboot, i.e. become run-time setting.

It's something that I thought about too. It would mean that users are allowed to start processes as long as they are able to log in. But it would be a fairly significant change to lingering, with three states: on, off, runtime.

@thx1111
thx1111 commented Apr 16, 2016

These two settings are orthogonal to each other.

So if the administrator sets KillUserProcesses=no and KillOnlyUsers=<some_user>, then <some_user> would still be able to start a long-running process by configuring "linger"? There is no hierarchy of authority? And KillExcludeUsers=<some_user> is simply redundant, since "linger" can be set instead?

Is "linger" a per process setting?

@thx1111
thx1111 commented Apr 18, 2016

BTW, even with KillUserProcesses=yes, logging-out does not clean-up user files in /tmp. So there is still no definite "initial user state" without rebooting.

@arvidjaar
Contributor

@thx1111

logging-out does not clean-up user files in /tmp

How do you determine which files can be removed?

@thx1111
thx1111 commented Apr 18, 2016

Presumably, all files owned by the user being logged-out in the last instance.

But yes, I suppose that that is more involved if the user is leaving a long-running process. Still, if there is no long-running user process and the user log-out is the last instance, then what logged-out-user owned /tmp/ files are being used by the system or by some other logged-in user? And why would such a thing happen? I think that would suggest that some such process is "broken", using someone else's /tmp/ file.

The potential that leaves me uncomfortable here, is suppose that there is created, for instance, some kind of /tmp/ cache file or authorization file, and suppose, unfortunately, some application using those files has a bug which is triggered by the state of that cache file or that authorization file. The user is going to have a "mysterious" application failure. The application "works" when the machine is re-booted, but then begins to "fail", perhaps after they log-out and log-in. And linux starts to feel like windows. And people begin to superstitiously re-boot their machines, to regain a working state. Not good...

Of course, if systemd could guarantee that no one would ever write such a buggy application, then that use-case would not need to be addressed. ;)

@poettering poettering and 1 other commented on an outdated diff Apr 20, 2016
src/login/logind.c
@@ -92,6 +92,8 @@ static Manager *manager_new(void) {
if (!m->devices || !m->seats || !m->sessions || !m->users || !m->inhibitors || !m->buttons || !m->user_units || !m->session_units)
goto fail;
+ m->kill_user_processes = true;
+
@poettering
poettering Apr 20, 2016 Member

This should be in the new manager_reset_config() function, right?

@keszybz
keszybz Apr 21, 2016 Member

Rebase fail. Will rebase the rebase again.

@poettering
Member

looks all good to me, except the misplaced setting, which really needs to be in manager_reset_conf().

@poettering poettering commented on the diff Apr 20, 2016
src/login/logind.c
@@ -66,11 +56,29 @@ static Manager *manager_new(void) {
@poettering
poettering Apr 20, 2016 Member

hmm, m->power_key_ignore_inhibited must be rest to false explicitl here now, too. Same for suspend_key_ignore_inhibted, and hibernate_key_ignore_inhibited.

keszybz added some commits Apr 9, 2016
@keszybz keszybz core: prefix selinux messages with "selinux: "
SELinux outputs semi-random messages like "Unknown permission start for class
system", and the user has to dig into message metadata to find out where
they are comming from. Add a prefix to give a hint.
c8a806f
@keszybz keszybz logind: reload config on SIGHUP
v2:
- fix setting of kill_user_processes and
  *_ignore_inhibited settings
a2ed707
@keszybz keszybz man: reformat examples using <example> 4eac7f5
@keszybz keszybz man: expand description of lingering and KillUserProcesses setting
The description in the man page was wrong, KillUserProcesses does
not kill all processes of the user. Describe what the setting
does, and also add links between the relavant sections of the
manual.

Also, add an extensive example which shows how to launch screen
in the background.
65eb37f
@keszybz keszybz logind: flip KillUserProcesses to on by default
This ensures that users sessions are properly cleaned up after.
The admin can still enable or disable linger for specific users to allow
them to run processes after they log out. Doing that through the user
session is much cleaner and provides better control.

dbus daemon can now be run in the user session (with --enable-user-session,
added in 1.10.2), and most distributions opted to pick this configuration.
In the normal case it makes a lot of sense to kill remaining processes.
The exception is stuff like screen and tmux. But it's easy enough to
work around, a simple example was added to the man page in previous
commit. In the long run those services should integrate with the systemd
users session on their own.

https://bugs.freedesktop.org/show_bug.cgi?id=94508
systemd#2900
97e5530
@keszybz keszybz build-sys: add --without-kill-user-processes configure option 95365a5
@keszybz keszybz logind: allow any user to request lingering
We enable lingering for anyone who wants this. It is still disabled by
default to avoid keeping long-running processes accidentally.
Admins might want to customize this policy on multi-user sites.
152199f
@keszybz keszybz loginctl: show linger status in user-status
zbyszek (1002)
           Since: Tue 2016-04-12 23:11:46 EDT; 23min ago
           State: active
        Sessions: *3
          Linger: yes
            Unit: user-1002.slice
                  β”œβ”€user@1002.service
                  β”‚ └─init.scope
                  β”‚   β”œβ”€38 /usr/lib/systemd/systemd --user
                  β”‚   └─39 (sd-pam)
                  └─session-3.scope
                    β”œβ”€   31 login -- zbyszek
                    β”œβ”€   44 -bash
                    β”œβ”€15076 loginctl user-status zbyszek
                    └─15077 less
26e00f0
@keszybz keszybz shared/logs-show: fix memleak in add_matches_for_unit
42fbdf4
@keszybz keszybz logind: make KillOnlyUsers override KillUserProcesses
Instead of KillOnlyUsers being a filter for KillUserProcesses, it can now be
used to specify users to kill, independently of the KillUserProcesses
setting. Having the settings orthogonal seems to make more sense. It also
makes KillOnlyUsers symmetrical to KillExcludeUsers.
921f831
@keszybz
Member
keszybz commented Apr 21, 2016

Thank you for the review. Force-pushed.

@poettering poettering merged commit 52b9b66 into systemd:master Apr 21, 2016

3 checks passed

ubuntu-amd64 autopkgtest finished (success)
Details
ubuntu-i386 autopkgtest finished (success)
Details
ubuntu-s390x autopkgtest finished (success)
Details
@arvidjaar
Contributor

I still believe commit 921f831 was mistake and change for the sake of change.

@keszybz keszybz deleted the keszybz:kill-user-proceses branch Apr 21, 2016
@small1
small1 commented on 97e5530 May 29, 2016

If this gets the default in upcoming systemd releases i will probably switch from using systemd on ALL my servers. And probably on my desktops as well.

I don't want to have to turn that stupid thing off on the servers i maintain. This add extra maintenance as most of the users i have leaves processes in the background.

Member

@small1 please vent your frustrations elsewhere, this is a code commenting tool. As long you don't find any actual code issues please don't comment here.

@StenSoft
StenSoft commented May 31, 2016 edited

Can I suggest to set the default to kill user's processes with SIGHUP instead of SIGTERM? It would solve the D-Bus/Gnome issue and yet nohup et al. would continue to work.

@mbiebl
Contributor
mbiebl commented Jun 1, 2016

While this would not terminate screen itself, the processes inside the screen session most likely will not handle SIGHUP.

Another suggestion is from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394#100

How about only sending the HUP signal to processes that don't have a
pseudo-TTY assigned? That would kill all those daemons, leave anything
running inside a screen/tmux/terminal untouched?
@poettering
Member

we actually kill user login sessions with SIGTERM first, and SIGHUP second, by using the SendSIGHUP=yes option for login scope units.

@StenSoft
StenSoft commented Jun 1, 2016

@mbiebl Good point. It may be better to check whether it has a controlling terminal (7th field in /proc/[pid]/stat is non-zero) instead of any TTY. For sessions with a controlling terminal, the process of session termination is already handled by the kernel.

@poettering Well, that still sends SIGTERM and requests the process to terminate. The point was to not request termination of processes that opt to keep running which is traditionally done by ignoring SIGHUP (I don't think any process should ignore SIGTERM).

@amirse
amirse commented Jul 21, 2016 edited

I reached this thread after something a bit unintuitive happened to me:
I was adding my desktop user to the Wireshark group so I could capture. So, obviously, I need to logout and login back to refresh my processes' group list. I logout, relogin on gdm. Then, just because I'm used to - I don't run Wireshark from the Gnome interface, but rather open a gnome-terminal and run it there. No go. I think "Wait what? I just opened this gnome-terminal under my GUI session. Why isn't it inheriting my changed group membership?". Then, I look at the process list and see my nice GUI application is actually under the systemd process tree. This is rather unintuitive, isnt it? I think it's reasonable to assume that when you relogin on GUI, everything from that point on, which is opened under that new GUI session, will be "new". Just my two cents as an end-user who stumbled onto this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment