wrappers: allow user mode systemd daemons #5822

Closed
wants to merge 17 commits into base: master

6 participants
@jhenstridge
Contributor

jhenstridge commented Sep 12, 2018

This patch extends snap daemon support to services that run under a user-mode systemd instance, selected via a daemon-mode: user parameter.

The main differences are:

  1. Units are installed to /etc/systemd/user instead of /etc/systemd/system.
  2. User daemons cannot depend on system units (and, conversely, system units cannot depend on user units).
  3. Sockets associated with user daemons are limited to user-writable locations (e.g. $SNAP_USER_DATA).
  4. snapd cannot enable/disable user daemons during upgrades, as it cannot communicate with the user instance(s) of systemd.

My eventual goal is to build D-Bus activation on top of this for both user session and system daemons.

I've been trying to write a spread test for this, but so far I've failed. My initial thought was to try and manually run /lib/systemd/systemd --user as the test user and then check how the daemon runs under that instance. Unfortunately that fails, and it looks like the only supported way of starting user instances is via pam_systemd + logind. It might be possible with the logind CreateSession method, but I haven't yet worked out what arguments to pass to get it to work.

@zyga

zyga approved these changes Sep 12, 2018

Contributor

zyga left a comment

Looks very reasonable! I only asked a question about how the new user services differ from system services with respect to dependencies on the mount unit.

+1 - please get another review from @chipaca

} else {
wrapperData.ServicesTarget = systemd.ServicesTarget
wrapperData.PrerequisiteTarget = systemd.PrerequisiteTarget
wrapperData.MountUnit = filepath.Base(systemd.MountUnitPath(appInfo.Snap.MountDir()))

@zyga

zyga Sep 12, 2018

Contributor

Why is the mount unit not referenced for user services? Can user services depend on system services somehow?

@jhenstridge

jhenstridge Sep 12, 2018

Author Contributor

Because it is a system unit, so belongs to a different systemd instance. It looks like there are synthesised mount units in the user systemd instance, but they end up with names based on the mount point:

$ systemctl --user list-units | grep skype
snap-skype-54.mount     loaded active     mounted   /snap/skype/54

Presumably these units would have different names on e.g. Fedora where we mount snaps under /var/lib/snapd/snap.
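To illustrate why the unit name follows the mount point, here is a minimal Go sketch of the naming rule. It is a simplified approximation of systemd's path escaping, not the code snapd or systemd actually uses, and `escapePath` and `mountUnitName` are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath is a simplified take on systemd's path escaping: it trims
// surrounding slashes, turns "/" into "-", and hex-escapes every other byte
// outside [A-Za-z0-9:_.]. Corner cases such as a leading "." or repeated
// slashes are deliberately ignored in this sketch.
func escapePath(path string) string {
	trimmed := strings.Trim(path, "/")
	if trimmed == "" {
		return "-" // the root directory is named "-"
	}
	var b strings.Builder
	for i := 0; i < len(trimmed); i++ {
		c := trimmed[i]
		switch {
		case c == '/':
			b.WriteByte('-')
		case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9',
			c == ':', c == '_', c == '.':
			b.WriteByte(c)
		default:
			// Everything else, including a literal "-", becomes \xXX.
			fmt.Fprintf(&b, `\x%02x`, c)
		}
	}
	return b.String()
}

// mountUnitName derives the mount unit name for a mount point.
func mountUnitName(mountDir string) string {
	return escapePath(mountDir) + ".mount"
}

func main() {
	fmt.Println(mountUnitName("/snap/skype/54"))
	fmt.Println(mountUnitName("/var/lib/snapd/snap/skype/54"))
}
```

Under this rule the same snap revision maps to a different unit name depending on where the distribution mounts snaps, which is the concern raised above.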

@jhenstridge

jhenstridge Sep 12, 2018

Author Contributor

I couldn't quickly find a ref in the systemd docs, but the Arch wiki says:

systemd --user runs as a separate process from the systemd --system process. User units can not reference or depend on system units.

@jhenstridge

Contributor Author

jhenstridge commented Sep 12, 2018

And it looks like I was incorrect about enabling/disabling user services. It seems I want:

systemctl --global enable user-service-name

@jhenstridge jhenstridge force-pushed the jhenstridge:user-daemons branch 2 times, most recently from 740f111 to 831eb09 Sep 13, 2018

@jhenstridge

Contributor Author

jhenstridge commented Sep 13, 2018

So I worked out that I can get the user systemd instance started in the spread test by activating user@12345.service in the system instance. Two simple spread tests added showing a user daemon started with the default target, and user daemon sockets being created in the expected locations.
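As a rough sketch of the approach described in this comment, the Go snippet below builds the name of the templated `user@UID.service` unit and the corresponding `systemctl start` command. The helper names are hypothetical, and the command is only constructed, not run, so the example stays safe to execute anywhere.

```go
package main

import (
	"fmt"
	"os/exec"
)

// userInstanceUnit returns the name of the templated system unit that runs
// the user-mode systemd instance for the given uid.
func userInstanceUnit(uid int) string {
	return fmt.Sprintf("user@%d.service", uid)
}

// startUserInstanceCmd builds (but does not run) the systemctl invocation
// that asks the system instance to start that unit.
func startUserInstanceCmd(uid int) *exec.Cmd {
	return exec.Command("systemctl", "start", userInstanceUnit(uid))
}

func main() {
	cmd := startUserInstanceCmd(12345)
	fmt.Println(cmd.Args)
}
```

A real test would call `cmd.Run()` as root and check the error; that step is left out here on purpose.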

@jhenstridge jhenstridge force-pushed the jhenstridge:user-daemons branch 4 times, most recently from 371434b to 9d88f75 Sep 13, 2018

@jhenstridge

Contributor Author

jhenstridge commented Sep 14, 2018

I've disabled the new spread tests on a few systems:

  1. Ubuntu Core: installing a snap with user services fails because /etc/systemd/user is not in the writable-paths list. If the core snap is updated the test will probably work, but core systems don't usually have a user session.

  2. Ubuntu 14.04: it looks like the user@.service method of starting the user instance of systemd was added in version 205, while that distro release ships version 204.

  3. Amazon Linux 2: this one also fails to load user@.service in the tests. I would have thought it would have a new enough systemd, but haven't investigated further since it is unlikely people will run a user session on this distro.

@mvo5
Collaborator

mvo5 left a comment

Thanks for working on this. This looks really nice, I added some questions inline.

@@ -85,7 +85,7 @@ func (b *Backend) Setup(snapInfo *snap.Info, confinement interfaces.ConfinementO
}
// Ensure the service is running right now and on reboots
for _, service := range changed {
- if err := systemd.Enable(service); err != nil {
+ if err := systemd.Enable(service, false); err != nil {

@mvo5

mvo5 Sep 17, 2018

Collaborator

I would prefer a flag argument here. Reading: systemd.Enable(service, true) does not convey much information. Something like systemd.Enable(service, systemd.UserInstance) or systemd.Enable(service, 0) is a bit more explicit.
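One way the suggestion could look is sketched below: a small typed constant replaces the boolean, so call sites say which systemd instance they mean. The type and constant names are illustrative assumptions, not snapd's API, and `enableArgs` is a hypothetical helper that only builds a systemctl argument list; the `--global` form for user units is the one mentioned earlier in this thread.

```go
package main

import "fmt"

// InstanceMode says which systemd instance an operation targets.
// Hypothetical type for this sketch.
type InstanceMode int

const (
	SystemMode InstanceMode = iota
	UserMode
)

// enableArgs returns the systemctl arguments for enabling a unit in the
// given mode. User units are enabled for all users via --global.
func enableArgs(service string, mode InstanceMode) []string {
	switch mode {
	case SystemMode:
		return []string{"enable", service}
	case UserMode:
		return []string{"--global", "enable", service}
	default:
		panic(fmt.Sprintf("unknown InstanceMode %d", mode))
	}
}

func main() {
	fmt.Println(enableArgs("snap.foo.bar.service", SystemMode))
	fmt.Println(enableArgs("snap.foo.bar.service", UserMode))
}
```

Compared with `Enable(service, true)`, a call such as `enableArgs(name, UserMode)` documents itself, and the compiler rejects a stray boolean.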

@@ -1427,6 +1427,7 @@ apps:
  description: svc one
  stop-timeout: 25s
  daemon: forking
+ daemon-mode: system

@mvo5

mvo5 Sep 17, 2018

Collaborator

Do we need a test that checks that without daemon-mode set we default to "system"?

@jdstrand

jdstrand Nov 29, 2018

Contributor

Yes, please.
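A minimal sketch of the defaulting behaviour being asked about, assuming the only valid values are "system" and "user" and that an unset value means "system". `resolveDaemonMode` is a hypothetical helper, not a function from the patch.

```go
package main

import "fmt"

// resolveDaemonMode applies the default discussed above: an empty
// daemon-mode is treated as "system"; unknown values are rejected.
func resolveDaemonMode(raw string) (string, error) {
	switch raw {
	case "":
		return "system", nil
	case "system", "user":
		return raw, nil
	default:
		return "", fmt.Errorf("invalid daemon-mode %q", raw)
	}
}

func main() {
	for _, raw := range []string{"", "system", "user", "bogus"} {
		mode, err := resolveDaemonMode(raw)
		fmt.Printf("%q -> %q, err=%v\n", raw, mode, err)
	}
}
```

A unit test for the default then only needs to feed in a snap.yaml without the key and assert that the resolved mode is "system".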

echo "And the user mode systemd instance is started"
systemctl start user@"$(id -u test)".service
echo "Its sockets are created in the test user's directories"

@mvo5

mvo5 Sep 17, 2018

Collaborator

Should we do a real end-to-end test here, i.e. have something connect to the socket and verify that the snap actually replies?

@jhenstridge

jhenstridge Sep 17, 2018

Author Contributor

I was looking for a similar test for system daemons, and all I could find was tests/main/install-socket-activation, which isn't a full end-to-end test either. I agree that it would be a good idea to extend the test to do this.

@jhenstridge

jhenstridge Sep 18, 2018

Author Contributor

I've updated the test to use a Python daemon implementation of the service that responds differently to the different sockets. This could be reused to test socket activation of system services.
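The shape of such an end-to-end check can be sketched in Go, although the test in the PR uses a Python daemon. In this self-contained example the server creates its own Unix socket in a scratch directory, whereas under socket activation systemd would own the socket and hand it to the service; all function names are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// serveOnce accepts a single connection and replies with a greeting that
// names the socket, so a client can tell which socket it reached.
func serveOnce(l net.Listener, name string) {
	conn, err := l.Accept()
	if err != nil {
		return
	}
	defer conn.Close()
	fmt.Fprintf(conn, "hello from %s\n", name)
}

// probe connects to the socket at path and returns the first reply line.
func probe(path string) (string, error) {
	conn, err := net.Dial("unix", path)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	return bufio.NewReader(conn).ReadString('\n')
}

// roundTrip stands in for the whole test: create a socket in a scratch
// directory, serve one reply, and read it back as a client.
func roundTrip(name string) (string, error) {
	dir, err := os.MkdirTemp("", "user-daemon-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, name+".sock")
	l, err := net.Listen("unix", path)
	if err != nil {
		return "", err
	}
	defer l.Close()

	go serveOnce(l, name)
	return probe(path)
}

func main() {
	reply, err := roundTrip("sock1")
	fmt.Printf("%q %v\n", reply, err)
}
```

Because the reply includes the socket name, a test with several sockets can confirm that each one reaches the service and is answered distinctly.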

systemctl start user@"$(id -u test)".service
echo "We can see the service running"
as_user systemctl --user status snap.test-snapd-user-service.test-snapd-user-service | MATCH "running"

@mvo5

mvo5 Sep 17, 2018

Collaborator

(nitpick) I think we should check for Active: active just for good measure to ensure that the thing is still running.
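The stricter check can be sketched as a small parser over `systemctl status` output. This is an illustration only: `isActiveRunning` is a hypothetical helper, and it assumes the usual "Active: active (running)" wording of the status line.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// isActiveRunning scans status output for the "Active:" line and reports
// whether it reads "active (running)". Matching the whole field avoids
// false positives from the word "running" appearing elsewhere.
func isActiveRunning(status string) bool {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "Active:") {
			return strings.HasPrefix(line, "Active: active (running)")
		}
	}
	return false
}

func main() {
	// Made-up sample output for the sketch.
	sample := "example.service - Example\n   Loaded: loaded\n   Active: active (running) since Mon\n"
	fmt.Println(isActiveRunning(sample))
}
```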

@@ -274,6 +290,10 @@ func StopServices(apps []*snap.AppInfo, reason snap.ServiceStopReason, inter int
if !app.IsService() || !osutil.FileExists(app.ServiceFile()) {
continue
}
// We can't stop services that run under a user mode systemd
if app.IsUserService() {

@mvo5

mvo5 Sep 17, 2018

Collaborator

The fact that we don't stop services is different than for system services. Can this break user services?

@jhenstridge

jhenstridge Sep 17, 2018

Author Contributor

It's more the fact that snapd can't talk to the user mode systemd instance (or multiple instances if you happen to have a multi-seat system).

As for what could break, it'd be the same things that can break for snap applications running over an upgrade or uninstall.
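The behaviour under discussion, skipping user services when stopping, can be sketched as a simple filter. The `app` struct and `stoppableServices` are hypothetical stand-ins for the snap metadata and the loop in the patch; nothing here talks to systemd.

```go
package main

import "fmt"

// app is a hypothetical stand-in for the snap app metadata; only the
// fields this sketch needs are modelled.
type app struct {
	ServiceName string
	UserService bool
}

// stoppableServices returns the services that can be stopped through the
// system instance of systemd, skipping those run by a user instance.
func stoppableServices(apps []app) []string {
	var names []string
	for _, a := range apps {
		if a.UserService {
			continue // no channel to the user's systemd instance
		}
		names = append(names, a.ServiceName)
	}
	return names
}

func main() {
	apps := []app{
		{ServiceName: "snap.demo.daemon.service"},
		{ServiceName: "snap.demo.agent.service", UserService: true},
	}
	fmt.Println(stoppableServices(apps))
}
```

Any user service filtered out this way keeps running across a refresh or removal, which is the gap the following comments focus on.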

@pedronis

pedronis Jan 9, 2019

Contributor

This is problematic. We either need to find a way to stop them, or we need enough infrastructure to inform the snap that it is about to be updated so that it can do the right thing with them.

@jhenstridge also does this mean that "snap start/stop" etc don't work for these services?

@jhenstridge

jhenstridge Jan 10, 2019

Author Contributor

Correct: "snap start/stop" will not do anything because snapd can not talk to the user session systemd instance.

So we're essentially in the same position as existing long running snap applications (e.g. a messaging app started via xdg autostart that runs minimised and is only visible through the notifications it posts).

We are in a better position than the status quo though, in that if snapd grows a user session agent to handle upgrades, there is a defined method of restarting these daemons (unlike a long running GUI application).

@@ -377,9 +397,13 @@ func genServiceFile(appInfo *snap.AppInfo) []byte {
serviceTemplate := `[Unit]
# Auto-generated, DO NOT EDIT
Description=Service for snap application {{.App.Snap.InstanceName}}.{{.App.Name}}
{{- if .MountUnit }}
Requires={{.MountUnit}}

@mvo5

mvo5 Sep 17, 2018

Collaborator

Why does a user systemd service not need the mount unit? We added the Requires={{.MountUnit}} so that the service does not start before the snap is mounted. Are user services run after all the mounts? If so, a comment would be nice so that our future selves will remember this.

@jhenstridge

jhenstridge Sep 17, 2018

Author Contributor

I don't think there is any way a user unit can depend on or trigger a system unit to start. The user mode systemd generates transient units for existing mount points, but it wouldn't have any way to cause the mount to occur as a prerequisite. We're pretty much in the same position as a snap app here.
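The conditional in the unit template can be demonstrated with a cut-down `text/template` sketch. The template below is a shortened, hypothetical version of the one in the diff (only the `[Unit]` header and the `Requires=` line), and `unitData` and `renderUnit` are names invented for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// unitData carries the values the sketch template needs. MountUnit is left
// empty for user services, which drops the Requires= line.
type unitData struct {
	Description string
	MountUnit   string
}

const unitTemplate = `[Unit]
Description={{.Description}}
{{- if .MountUnit }}
Requires={{.MountUnit}}
{{- end }}
`

// renderUnit executes the template and returns the generated unit text.
func renderUnit(d unitData) (string, error) {
	tmpl, err := template.New("unit").Parse(unitTemplate)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, d); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	system, _ := renderUnit(unitData{Description: "system svc", MountUnit: "snap-demo-1.mount"})
	user, _ := renderUnit(unitData{Description: "user svc"})
	fmt.Print(system)
	fmt.Print(user)
}
```

An empty string is falsy in Go templates, so leaving `MountUnit` unset is enough to omit the dependency for user services.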

@jhenstridge jhenstridge force-pushed the jhenstridge:user-daemons branch 4 times, most recently from 48be212 to 6a7a002 Sep 18, 2018

@niemeyer
Contributor

niemeyer left a comment

No changes required right now. Just marking it as I want to have a careful dive on this one, given that it's a major new area being opened.

@jhenstridge jhenstridge force-pushed the jhenstridge:user-daemons branch 2 times, most recently from 8df294d to 58465fe Nov 22, 2018

@jhenstridge

Contributor Author

jhenstridge commented Nov 23, 2018

I've updated the branch to work with current master. I'm not running the snap-user-service tests on CentOS 7 for the same reason as Amazon Linux 2: it ships an old version of systemd where the current method of starting a user session (activating the user@$uid.service unit) doesn't work.

case systemd.UserMode:
return dirs.SnapUserServicesDir
default:
panic("unknown systemd.InstanceMode")

@jdstrand

jdstrand Nov 29, 2018

Contributor

Should this (and other places below) be systemd.DaemonMode?

@jhenstridge

jhenstridge Nov 30, 2018

Author Contributor

I called the type InstanceMode, which is why I included that in the message. I'm open to changing it though. There are two ways it is used at the moment:

  1. which instance of systemd are we talking to?
  2. what mode does this service run in?

# Amazon Linux 2 gives error "Unit user@12345.service not loaded."
- -amazon-linux-2-*
# Centos 7 gives error "Unit user@12345.service not loaded."
- -centos-7-*

@jdstrand

jdstrand Nov 29, 2018

Contributor

What is the user experience for someone installing a snap with a user service on these distributions (or others where user services aren't in use yet)?

@jhenstridge

jhenstridge Nov 30, 2018

Author Contributor

The problem is that the method of launching the user mode systemd has changed over the course of releases.

With current versions of systemd, the supported method is to ask the pid 1 systemd instance to start the templated user@.service service. With older releases, my understanding is that systemd --user was invoked directly by logind.

As written, the test relies on the new method of launching the user instance, since trying to execute it directly fails with complaints about cgroups. I haven't looked into what would be necessary to get the test fixture running on these old systemd versions, and wasn't sure how important it was.

My primary motive for this feature is for use in conjunction with dbus service activation. For that particular case, I plan to use the same method of supporting distros without a user instance of systemd as most services do currently: include both an Exec and SystemdService line in the dbus service activation file.
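The fallback described here relies on the D-Bus activation file carrying both keys. A minimal Go sketch that renders such a file follows; the bus name, executable path, and unit name are placeholders, and `dbusServiceFile` is a hypothetical helper rather than anything in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// dbusServiceFile renders a D-Bus service activation file that lists both
// an Exec line (used when the bus launches the service itself) and a
// SystemdService line (used when activation is delegated to systemd).
func dbusServiceFile(busName, execPath, systemdUnit string) string {
	var b strings.Builder
	b.WriteString("[D-BUS Service]\n")
	fmt.Fprintf(&b, "Name=%s\n", busName)
	fmt.Fprintf(&b, "Exec=%s\n", execPath)
	fmt.Fprintf(&b, "SystemdService=%s\n", systemdUnit)
	return b.String()
}

func main() {
	// All three values are placeholders for the sketch.
	fmt.Print(dbusServiceFile("com.example.Demo", "/usr/bin/demo-daemon", "demo.service"))
}
```

On a system without a user instance of systemd, the bus daemon falls back to the Exec line, which is why both are included.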

For other types of user services I don't have a good answer. It is basically the same situation as running a snap that uses XDG Autostart on a desktop that doesn't support it.

@@ -114,24 +114,26 @@ func StartServices(apps []*snap.AppInfo, inter interacter) (err error) {
if err == nil {
return
}
if e := stopService(sysd, app, inter); e != nil {
inter.Notify(fmt.Sprintf("While trying to stop previously started service %q: %v", app.ServiceName(), e))
if app.ServiceMode() == systemd.SystemMode {

@jdstrand

jdstrand Nov 29, 2018

Contributor

I think you are doing the right thing here (ie, not mixing in user services with the snap start|stop|restart commands), but it makes me wonder if snap start|stop|restart should grow a --user option or similar for these...

@jhenstridge

jhenstridge Nov 30, 2018

Author Contributor

It's not something that snapd can control though: in general snapd running as uid 0 is not going to be able to communicate with the systemd instance running as the user.

At the moment we don't do anything about snap apps running as the user during an upgrade, so I don't think it is that big a stretch to do the same for user services.

@jdstrand

jdstrand Nov 30, 2018

Contributor

Note that this is something that (iirc), @chipaca is looking into. It would be possible for snapd to communicate with userd to do something sensible on refresh. Also note that my question is not meant to block this PR; just putting it out there for a potential followup.

@jdstrand
Contributor

jdstrand left a comment

I looked at this PR from a security point of view (but had a few other questions inline). In general, this PR LGTM in that regard. One thing that came to mind is that, unlike system services, user services are likely to be coupled with typical desktop interfaces (e.g. X11, desktop-legacy; either intentionally or through devs taking a shotgun approach with their plugs), which could open up interesting (from a security POV) interactions between snaps. @jhenstridge, curious on your thoughts about adding review-tools check to warn if specifying 'daemon-mode: user' for a command that 'plugs: [ x11 ]' (or similar)?

@jhenstridge

Contributor Author

jhenstridge commented Nov 30, 2018

> @jhenstridge, curious on your thoughts about adding review-tools check to warn if specifying 'daemon-mode: user' for a command that 'plugs: [ x11 ]' (or similar)?

I don't think it would be out of the ordinary for session D-Bus daemons (for instance) to want to talk to the display, so I don't think it is something review-tools should block.

While this does provide a new way for a snap author to have their code executed, is it that different to e.g. the way we allow apps to launch via XDG autostart?

@jdstrand

Contributor

jdstrand commented Nov 30, 2018

> @jhenstridge, curious on your thoughts about adding review-tools check to warn if specifying 'daemon-mode: user' for a command that 'plugs: [ x11 ]' (or similar)?
>
> ...
>
> While this does provide a new way for a snap author to have their code executed, is it that different to e.g. the way we allow apps to launch via XDG autostart?

With autostart the user should be prompted; with this PR, the user session service is just there, potentially long-running in the background (e.g. this makes something like a persistent keylogger easier). Again, X is deeply flawed from a security POV, so I'm not saying we should block this on that (users will always need to trust the publisher when a snap plugs x11, desktop-legacy, etc., and this PR doesn't change that). I just wanted your thoughts, and you answered my question wrt dbus services wanting to talk to the display.

@pedronis pedronis self-requested a review Dec 20, 2018

@jhenstridge jhenstridge force-pushed the jhenstridge:user-daemons branch from 58465fe to b659e12 Jan 2, 2019

@pedronis

Contributor

pedronis commented Jan 23, 2019

I'm closing this until there is a proposal (on the forum) of how to stop these services at refresh like we stop any other services, and what snap start/stop etc should do with them

@pedronis pedronis closed this Jan 23, 2019

@jhenstridge

Contributor Author

jhenstridge commented Jan 24, 2019

Closing this pull request like this is not constructive. This feature hasn't disappeared from our list of priorities, so could you please reopen it?

In discussions with @zyga I was under the impression that this wasn't considered a problem as these processes effectively fall into the same class as other long running user processes. Closing the PR after 2 weeks of radio silence is not helpful.
