RFC: systemdspawner without `systemd-run` #100

behrmann · 2022-11-25T13:26:09Z

This is a reimplementation of what I first tried with #40 and I'm opening it as a draft for now, since it's still missing pieces that I need to document.

This spawner is called StaticSystemdSpawner, since it tries to push as much logic as possible out of the spawner and into static unit files, which are currently named jupyterhub-singleuser-{USERNAME}@.service. The spawner does only two things. First, it starts those units with an instance name that is either the name of a named server or default, i.e. starting a plain instance on JupyterHub will result in the singleuser server running as jupyterhub-singleuser-{USERNAME}@default.service. Second, it writes out secrets for the single user server, which are passed into the service via the credentials mechanism that exists in systemd 249 (available in Debian bullseye; I've only tested it with systemd 251, though).
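A minimal sketch of the naming scheme described above. The helper names `singleuser_unit_name` and `start_command` are hypothetical, and the sketch assumes user and server names that need no systemd escaping; it only builds the unit name and the `systemctl` command line and does not run anything.

```python
import re
from typing import List

# Hypothetical helpers; only the unit naming scheme comes from the description above.
_SAFE = re.compile(r"[A-Za-z0-9_]+")


def singleuser_unit_name(username: str, server_name: str = "") -> str:
    """Return the unit name for a user's server; unnamed servers use 'default'."""
    instance = server_name or "default"
    for part in (username, instance):
        # Assumption: reject anything that would need systemd escaping.
        if not _SAFE.fullmatch(part):
            raise ValueError(f"unsupported characters in {part!r}")
    return f"jupyterhub-singleuser-{username}@{instance}.service"


def start_command(username: str, server_name: str = "") -> List[str]:
    """Build the argv for starting the unit (to be run as the hub user)."""
    return ["systemctl", "start", singleuser_unit_name(username, server_name)]
```

Keeping the command construction separate from its execution makes the naming logic easy to check without a running systemd.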

Where these units come from is currently left open. I pregenerate them for all users that are supposed to be able to use JupyterHub and put them into /run/systemd/system. Next I will explore whether missing units could be generated on demand by another systemd service that the spawner starts.

The services I generate look something like this (simplified, because I have a few hacks on top to make JupyterLab plugins work, which want to write to the venv but can't):

[Unit]
Description=Jupyterhub Single-User Server for user {username} with name '%I'
Wants=jupyterhub.service jupyterhub-httpproxy.service

[Service]
Type=simple
User={username}

Environment="PATH=/opt/jupyter/hubenv/bin:/usr/bin:/usr/local/bin:/bin"

LoadCredential=envfile:/var/lib/jupyterhub/spawnerconf/{username}/%i
ExecStart=/opt/jupyter/jupyter-singleuser-wrapper

WorkingDirectory=/home/{username}

Slice=jupyterhub-singleuser-{username}.slice
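The spawner side of the LoadCredential= line could be sketched roughly as follows. The function name `write_envfile` and the `base_dir` parameter are assumptions for illustration; only the directory layout (spawnerconf/{username}/{instance}) is taken from the unit above. Values are shell-quoted on the assumption that the file is later sourced by a shell.

```python
import os
import shlex
from pathlib import Path


def write_envfile(base_dir: str, username: str, instance: str, env: dict) -> Path:
    """Write KEY=value lines for one server into base_dir/username/instance."""
    user_dir = Path(base_dir) / username
    user_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    target = user_dir / instance
    lines = "".join(
        f"{key}={shlex.quote(str(value))}\n" for key, value in sorted(env.items())
    )
    # Open with mode 0600 and enforce it, so the secrets are only readable by the owner.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        os.fchmod(handle.fileno(), 0o600)
        handle.write(lines)
    return target
```

With the unit above, `base_dir` would be /var/lib/jupyterhub/spawnerconf and `instance` the value systemd substitutes for %i.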

Generating these service units outside of the hub, and with privileges different from those of the user the hub runs as, has the advantage that the hub has no control over what the environment for a user looks like. Nothing, e.g. sandboxing options, needs to be implemented in the spawner; it can be put into the units themselves. It also makes it very easy to vary these options per user.

The slice named in Slice= does not need to exist as a unit file; its settings can come from systemd's prefix-based drop-in directories. This might for example look like this:

# /etc/systemd/system/jupyterhub-singleuser-.slice.d/10-defaults.conf
[Unit]
Description=Slice for %j
StopWhenUnneeded=yes

[Slice]
TasksMax=33%
CPUQuota=400%
MemoryHigh=8G
MemoryMax=9G

Sandboxing defaults can be set up similarly:

# /etc/systemd/system/jupyterhub-singleuser-.service.d/40-hardening.conf
[Service]
PrivateTmp=yes
PrivateDevices=yes
ProtectSystem=strict
ProtectHome=read-only
ProtectKernelTunables=yes
ProtectControlGroups=yes
ReadWritePaths=/home/%j
NoNewPrivileges=yes

The wrapper in the ExecStart= is quite simple: all it does is source the credentials file $CREDENTIALS_DIRECTORY/envfile and export all its variables. This is needed because credentials cannot be used as environment files (secrets should never be environment variables, since they get inherited down the process tree). Patching the singleuser server to read secrets from a file instead of environment variables is something I've yet to do.
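The actual wrapper is not shown in this thread and, since it sources the file, is presumably a shell script. As a rough equivalent, here is a Python sketch under stated assumptions: the file contains plain KEY=value lines (no variable expansion), and the single-user server is started through a `jupyterhub-singleuser` executable on PATH. The names `parse_envfile` and `main` are made up for illustration.

```python
import os
import shlex


def parse_envfile(text: str) -> dict:
    """Parse shell-style KEY=value lines (quotes and # comments allowed)."""
    env = {}
    for line in text.splitlines():
        for token in shlex.split(line, comments=True):
            key, sep, value = token.partition("=")
            # Tokens without "=" (such as a leading "export") are ignored.
            if sep and key.isidentifier():
                env[key] = value
    return env


def main() -> None:
    # systemd sets CREDENTIALS_DIRECTORY for units that use LoadCredential=.
    path = os.path.join(os.environ["CREDENTIALS_DIRECTORY"], "envfile")
    with open(path, encoding="utf-8") as handle:
        os.environ.update(parse_envfile(handle.read()))
    # Assumption: replace this process with the single-user server found on PATH.
    os.execvp("jupyterhub-singleuser", ["jupyterhub-singleuser"])
```

A deployed wrapper would end with a call to `main()`. Replacing the process with `execvp` keeps the server as the unit's main process instead of a child of the wrapper.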

All in all, this allows the hub itself to run as an unprivileged user with some strong sandboxing. Using the credentials mechanism, the configuration directory can also be kept free of any cleartext secrets.

[Unit]
Description=Jupyterhub
After=network-online.target postgresql.service
Wants=postgresql.service
PartOf=jupyterhub.target

[Service]
User=jupyter
Environment="PATH=/opt/jupyter/hubenv/bin:/usr/bin:/usr/local/bin:/bin"
ConfigurationDirectory=jupyterhub
RuntimeDirectory=jupyterhub
StateDirectory=jupyterhub

LoadCredentialEncrypted=jupyter_configproxy_auth_token.cred
LoadCredentialEncrypted=jupyter_cookie_secret.cred
LoadCredentialEncrypted=jupyter_db_password.cred
LoadCredentialEncrypted=jupyter_ldap_password.cred

ExecStart=/opt/jupyter/hubenv/bin/jupyterhub --config ${CONFIGURATION_DIRECTORY}/jupyterhub_config.py

PrivateTmp=yes
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
WorkingDirectory=/var/lib/jupyterhub

[Install]
WantedBy=multi-user.target

and (when running the proxy separately) it can be run stateless as a dynamic user with similar sandboxing.

To be able to start units, I use polkit:

polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.systemd1.manage-units") {
        polkit.log("action=" + action);
        polkit.log("subject=" + subject);
        // Only single-user units: jupyterhub-singleuser-<user>@<instance>.service
        if (action.lookup("unit").match(/^jupyterhub-singleuser-[a-zA-Z]?\w*@[a-zA-Z0-9_\\]+\.service$/)) {
            var verb = action.lookup("verb");
            // The hub user may only start, stop and reset-failed these units.
            if ((verb == "start" || verb == "stop" || verb == "reset-failed") && subject.user == "jupyter") {
                return polkit.Result.YES;
            }
        }
    }
});

The same would be possible with sudo rules, but I haven't added that, since I don't use it. polkit has the advantage that it allows for NoNewPrivileges=yes on the hub.

Besides automatically generating the singleuser server units on first use, I still need to explore:

allowing for multiple user units jupyterhub-singleuser-foo-username@.service and jupyterhub-singleuser-bar-username@.service, chosen via the spawner options form, to allow for different setups, e.g. lab, classic notebook, collaborative or different base images (disk images or maybe OCI images) via RootImage=.
running on multiple hosts, by using the systemctl switch -H to manage units on different hosts, for which I need to have a look at the SSH spawner.

My question: Does it make sense to integrate this here, or should I pursue a friendly fork? I would totally understand if the focus is too different.

Frank-Steiner · 2022-11-25T14:18:04Z

From a user perspective I'd like to second this approach. Running jupyter as non-root is essential from a security perspective, but the sudo spawner lacks all the benefits of the systemd spawner. Combining the advantages of both as shown here sounds like a great idea.

The idea is to delegate all logic to the service manager and use pre-existing unit files. If you want to use sandboxing or resource control, implement it at the unit level, e.g. via (namespaced) drop-ins. This way this logic doesn't need to be recreated in the spawner.

Renaming templates will probably not be a common occurrence, and if it happens one can just stop all affected services, remove their units and daemon-reload. JupyterHub does not need to know the implementation detail of what the units are named. Having the unit name resurrected from some database field only leads to hard-to-debug problems when state becomes split between what's in the filesystem and what's in the database, which is too much logic in the spawner. It should just start and stop units.

behrmann · 2022-12-02T16:14:56Z

I've updated the PR a bit:

The handling of the default unnamed server and the named servers has been split up, because it's impossible to have only servers with a name (the default_server_name works a bit differently than I assumed).
I added a hook to start a unit that generates unit files. This is optional, but makes the usability a bit nicer. Since the hub can still only pass a single argument to the unit (the user), this can be heavily sandboxed as well.
I removed all the state tracking, since I found no use for it, but only after pulling out my hair for a bit, having renamed the units without nuking the database. The state tracking probably had a real use at some point, but I want to remove as much logic as possible from the spawner and keep it as thin a wrapper as possible.
I added some scaffolding for internal SSL, which is needed to go on to support this for multiple hosts akin to SSHSpawner. I haven't tested the internal SSL yet, because I still need to figure out how to make this work in my setup with hub and proxy running in different processes. As a corollary of this work the singleuser services are now passed a whole directory as a credential instead of a single file, since we will have to pass in all the certificates.
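How the certificates end up in that directory is not shown in this thread. As a rough illustration, JupyterHub's `Spawner.move_certs` hook receives a dict with the keys keyfile, certfile and cafile and returns the new paths; a helper it could delegate to might look like the sketch below. The name `stage_certs` and the `dest_dir` parameter are hypothetical.

```python
import os
import shutil
from pathlib import Path


def stage_certs(paths: dict, dest_dir: str) -> dict:
    """Copy keyfile, certfile and cafile into dest_dir and return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(mode=0o700, parents=True, exist_ok=True)
    staged = {}
    for kind in ("keyfile", "certfile", "cafile"):
        target = dest / Path(paths[kind]).name
        shutil.copyfile(paths[kind], target)
        # Keep the copies readable by the owner only.
        os.chmod(target, 0o600)
        staged[kind] = str(target)
    return staged
```

In this design `dest_dir` would be the per-server directory that the single-user unit loads as a credential, so certificates and the environment file travel the same way.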

This made it necessary to update the polkit rule a bit:

polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.systemd1.manage-units") {
        polkit.log("action=" + action);
        polkit.log("subject=" + subject);
        // Default server, named servers, and the optional unit generator.
        if (action.lookup("unit").match(/^jupyterhub-singleuser-[a-z][a-z0-9]{0,30}\.service$/) ||
            action.lookup("unit").match(/^jupyterhub-singleuser-[a-z][a-z0-9]{0,30}@[a-zA-Z0-9_\\]+\.service$/) ||
            action.lookup("unit").match(/^jupyterhub-unitgenerator@[a-z][a-z0-9]{0,30}\.service$/)) {
            var verb = action.lookup("verb");
            // The hub user may only start, stop and reset-failed these units.
            if ((verb == "start" || verb == "stop" || verb == "reset-failed") && subject.user == "jupyter") {
                return polkit.Result.YES;
            }
        }
    }
});

When not autogenerating units, this can be made a bit stricter by leaving out the line matching jupyterhub-unitgenerator@.service.
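To see which unit names a rule like this admits, the three patterns can be tried out outside polkit. The following is a sketch in Python rather than polkit's JavaScript; `is_allowed` is a hypothetical helper, and the dot before `service` is escaped here so that it only matches a literal dot.

```python
import re

# Python translations of the three unit patterns; fullmatch() plays the
# role of the ^...$ anchors in the JavaScript regular expressions.
USER = r"[a-z][a-z0-9]{0,30}"
ALLOWED_UNITS = [
    re.compile(rf"jupyterhub-singleuser-{USER}\.service"),
    re.compile(rf"jupyterhub-singleuser-{USER}@[a-zA-Z0-9_\\]+\.service"),
    re.compile(rf"jupyterhub-unitgenerator@{USER}\.service"),
]


def is_allowed(unit: str, verb: str, user: str) -> bool:
    """Mirror the rule: the hub user may start/stop/reset-failed matching units."""
    if user != "jupyter" or verb not in {"start", "stop", "reset-failed"}:
        return False
    return any(pattern.fullmatch(unit) for pattern in ALLOWED_UNITS)
```

For example, `is_allowed("jupyterhub-singleuser-alice@gpu_1.service", "start", "jupyter")` is accepted, while any unit outside the three name families, any other verb, or any other user is rejected.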

The unit to generate singleuser units could look something like this:

# jupyterhub-unitgenerator@.service
[Unit]
Description=Generate singleuser JupyterHub services for user %i

[Service]
Type=oneshot
User=root
ExecStart=/path/to/script/that/generates/units %i

PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/run/systemd/system
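The generator script itself is not part of this thread. A minimal sketch of what it might do, assuming the simplified single-user unit from the first comment as the template and the user name pattern from the polkit rule as validation, could look like this. `UNIT_TEMPLATE` and `render_unit` are hypothetical names, and writing the file is left to the caller.

```python
import re
from string import Template
from typing import Tuple

# Simplified template based on the single-user unit shown earlier in this thread.
UNIT_TEMPLATE = Template("""\
[Unit]
Description=Jupyterhub Single-User Server for user $username with name '%I'

[Service]
Type=simple
User=$username
LoadCredential=envfile:/var/lib/jupyterhub/spawnerconf/$username/%i
ExecStart=/opt/jupyter/jupyter-singleuser-wrapper
WorkingDirectory=/home/$username
Slice=jupyterhub-singleuser-$username.slice
""")


def render_unit(username: str) -> Tuple[str, str]:
    """Return (file name, unit text) for one user; the caller writes the file."""
    # The script runs as root, so only accept names the polkit rule would also accept.
    if not re.fullmatch(r"[a-z][a-z0-9]{0,30}", username):
        raise ValueError(f"refusing to generate a unit for {username!r}")
    name = f"jupyterhub-singleuser-{username}@.service"
    return name, UNIT_TEMPLATE.substitute(username=username)
```

The script would then write the text to /run/systemd/system under the returned name; systemd only picks up new unit files after a daemon-reload.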

On the todolist:

make internal SSL work
use internal SSL to have the singleuser servers be spawned on different machines
document everything and provide example services and scripts

behrmann · 2022-12-02T16:40:30Z

One thing that doesn't work: custom error messages via exceptions. Since these are shown in the progress screen, only a nondescript internal server error will be shown if the spawner fails before the redirect happens.

yuvipanda · 2022-12-02T16:45:39Z

Thank you so much for your work on this, @behrmann. I'm generally in favor of (and very happy about!) this extra mode existing! I don't have a clear answer yet on whether it should live in this repo, or be in a separate repo. I currently don't have review cycles until January unfortunately :( I can spend some time reviewing the code, and helping figure out if it should be in this repo, or be split out onto its own (and heavily advertised from here) (unless someone else gets to it first). Thank you, and I appreciate your patience

behrmann · 2022-12-02T17:33:28Z

Yeah, no hurries, @yuvipanda. This is still in flux until I get all my ducks in a row (well, I am using this in production). I'm mostly keeping this here, because people showed interest in something like this.

With what's around here now, I don't think this can be properly reviewed, because it needs some serious scaffolding through supporting units, which I haven't added, because mine are a bit specialised to my department and wouldn't be of general use. Generally, I don't think this will ever end up as a turnkey solution; it will need some integration work to be usable, simply because everything is outside of the spawner itself. I'm aiming for an easy building block that fits into rather vanilla Linux systems (i.e. for people who don't want to deploy some Kubernetes or who don't want to pay its performance penalty). I'm currently deploying it with an Ansible playbook, which is pretty small and mostly copies static files around, so it's not a lot of work, but still.

Also, the systems this targets might not be what everybody has. It should run out of the box on Fedora and Arch, but it needs a proper polkit package (unavailable in any released stock Debian and, I think, Ubuntu, but available in Debian from the experimental repo and slated for inclusion in Debian bookworm, which will release in about half a year) and a sufficiently new systemd (v247, available in Debian bullseye, should be enough, but I've only tested it on v251), which should be available on CentOS and its derivatives, but might not be there out of the box.

With Christmas approaching, I'm not sure I'll finish up everything before then, so January should be more than early enough.

behrmann · 2022-12-02T17:34:05Z

I opened jupyterhub/jupyterhub#4244 for the exception issue

yuvipanda · 2024-06-21T05:00:07Z

Hi @behrmann :) Thank you very much for this wonderful piece of work!

Having had time to think about this some more (hah!), I think now that if this were to be, it should be a separate spawner of its own (StaticSystemdSpawner). I think if it gets mature enough, we can replace the existing systemdspawner with it, and transition TLJH or similar towards that. However, I think progress towards such a goal is best done by treating it as its own first class package, and allowing it to evolve separately (and at speed) - rather than as a PR here, where progress is dependent on being able to get this merged.

So if you're still interested in moving this forward, I'd suggest publishing StaticSystemdSpawner to PyPI, and advertising it a bit more on discourse.jupyter.org! Once it reaches a point where you feel it can replace SystemdSpawner, we can have that conversation then and try to make a major release with those changes merged in.

With that, I'm going to close this PR. I love the concept, and I do agree that switching this out reduces attack surface. I think it has a better chance of progressing this way, and I'd love to see it progress!

Frank-Steiner · 2024-06-21T08:01:00Z

@behrmann, I'd be happy to test. We are just realizing that we would need to move all the user kernel processes into a separate cgroup slice, which is not possible with sudospawner. Only systemd-spawner is able to do that, but it needs the non-root option...

behrmann · 2024-06-23T15:31:25Z

@yuvipanda Thanks for the encouragement. Will split this out into its own thing once I'm back from vacation.

behrmann added 3 commits November 7, 2022 13:30

Add systemctl wrapper to start a regular unit

a4c4260

Add wrappers for systemd-escape

65b81b9

Both for filling in template names into units as well as the plain escape. These wrappers are non-async, since we want to use them while instantiating a class.
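The wrappers from this commit are not reproduced in the thread. A minimal sketch of what synchronous wrappers around the `systemd-escape` command line tool might look like, with the hypothetical names `escape_command` and `escape`:

```python
import subprocess
from typing import List, Optional


def escape_command(name: str, template: Optional[str] = None) -> List[str]:
    """Build the systemd-escape argv; with a template the result is a full unit name."""
    argv = ["systemd-escape"]
    if template is not None:
        argv.append(f"--template={template}")
    # "--" keeps names that start with a dash from being parsed as options.
    argv += ["--", name]
    return argv


def escape(name: str, template: Optional[str] = None) -> str:
    """Run systemd-escape synchronously, so it can be called from __init__."""
    result = subprocess.run(
        escape_command(name, template), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()
```

Calling `escape("my server", template="jupyterhub-singleuser-alice@.service")` would ask systemd-escape to fill the escaped server name into the template; a blocking `subprocess.run` is used because a coroutine cannot be awaited while a class is being instantiated.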

Add a systemctl wrapper to check if a unit exists

7c85ff1

behrmann mentioned this pull request Nov 25, 2022

Add possibility to use systemdspawner as non-root #99

Open

behrmann added 3 commits December 2, 2022 16:20

Add move_certs method to support internal ssl

04cc460

behrmann force-pushed the staticspawner branch from 6a491ab to f9db5b4 Compare December 2, 2022 15:46

This was referenced May 31, 2023

Rely on systemd-run's --working-directory, and refactor for readability #124

Merged

Transition towards passing through config to Systemd spawned services without providing dedicated config? #128

Open

Debilski mentioned this pull request Apr 25, 2024

Improved Pelita server mode ASPP/pelita#777

Merged

yuvipanda closed this Jun 21, 2024
